html - Unable to extract all spans with matching class or id


This is probably a stupid question. I'm trying to write a simple scraper to grab the listings from this website: https://online.ncat.nsw.gov.au/hearing/hearinglist.aspx?locationcode=2000

Well, eventually it will run for each locationcode, but this is an example page.

I want to extract both the <span> headings and the table data for each date.

The general form of the data is:
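Running over several location codes could be sketched as below. This only builds the URLs; `listing_url` and `LOCATION_CODES` are hypothetical names, and 2000 is the only code that appears in this post, so any other codes would have to be looked up on the site.

```python
from urllib.parse import urlencode

BASE_URL = "https://online.ncat.nsw.gov.au/hearing/hearinglist.aspx"

def listing_url(location_code):
    """Build the hearing-list URL for one location code (hypothetical helper)."""
    return BASE_URL + "?" + urlencode({"locationcode": location_code})

# Assumption: 2000 is the only code known from the post; extend as needed.
LOCATION_CODES = [2000]

urls = [listing_url(code) for code in LOCATION_CODES]
```

Each URL in `urls` can then be passed to `requests.get` in the same way as the single example page.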

<span id="lblsubheader1242017" class="clsgriditem">1:15 pm wednesday, 12 apr 2017 @ room 15.6 level 15, 66 goulburn st </span>
<hr />
<table id="dg1242017">
    <tr class="clsgriditem">
        <td width="15%">rt 17/11111</td>
        <td width="30%">name of party</td>
        <td width="55%">name of party</td>
    </tr>
    ...
</table>
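Given that general form, one way to keep each heading together with its own table is to walk from each span to the next sibling table. This is only a sketch against the fragment above, not the live page: it assumes the ids really are lowercase as shown here (XPath matching is case-sensitive), that the table is a following sibling of its span, and `parse_listings` is a made-up name.

```python
from lxml import html

# Sample fragment copied from the general form above (one row only),
# wrapped in a <div> so it parses as a single element.
SAMPLE = """
<div>
<span id="lblsubheader1242017" class="clsgriditem">1:15 pm wednesday, 12 apr 2017 @ room 15.6 level 15, 66 goulburn st </span>
<hr />
<table id="dg1242017">
    <tr class="clsgriditem">
        <td width="15%">rt 17/11111</td>
        <td width="30%">name of party</td>
        <td width="55%">name of party</td>
    </tr>
</table>
</div>
"""

def parse_listings(source):
    """Return a list of (header_text, rows) pairs, one per header span."""
    tree = html.fromstring(source)
    listings = []
    for span in tree.xpath('//span[starts-with(@id, "lblsubheader")]'):
        # text_content() collects the text even if it is nested in child tags
        header = span.text_content().strip()
        rows = []
        # Assumption: the table for a header is its next sibling <table>
        for table in span.xpath('following-sibling::table[1]'):
            for tr in table.xpath('.//tr'):
                rows.append([td.text_content().strip() for td in tr.xpath('./td')])
        listings.append((header, rows))
    return listings

listings = parse_listings(SAMPLE)
for header, rows in listings:
    print(header, rows)
```

On the real page the same function would take `page.content` instead of `SAMPLE`, provided the assumptions about ids and sibling layout hold there.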

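It's rough, but I can grab the table data pretty easily with code of the form:

import requests
from lxml import html

# Fetch the page, parse it, and collect the text of every table cell
page = requests.get('https://online.ncat.nsw.gov.au/hearing/hearinglist.aspx?locationcode=2000')
tree = html.fromstring(page.content)
events = tree.xpath('//table//td/text()')

But when I try to grab the spans outside the table, so that I can have the location and date information, with something like: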

days = tree.xpath('//span[starts-with(@id,"lbl")]/text()') 

or

days = tree.xpath('//span[@class="clsgriditem"]/text()')

I get the following 2 results:

days:  ['there no matters listed in sydney today', 'there no matters listed in sydney today'] 

These refer to the 2 spans about 2/3 of the way down the page:

<span id="lbl1442017" style="font-weight:bold;">sydney: friday, 14 apr 2017</span><br /><br />
<span id="lblerror1442017" class="clsgriditem">there no matters listed in sydney today</span><br /><br /><br />
<span id="lbl1742017" style="font-weight:bold;">sydney: monday, 17 apr 2017</span><br /><br />
<span id="lblerror1742017" class="clsgriditem">there no matters listed in sydney today</span>

Could someone explain to me what I'm doing wrong?

Why are the other spans being skipped?

You can use the code below to get every piece of text content inside <span class="clsgriditem">:

days = tree.xpath('//span[@class="clsgriditem"]//text()') 

But I have no idea why //span[@class="clsgriditem"]/text() is not working, as it should be applicable as well...
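One possible explanation, which I have not verified against the live page: in XPath, /text() only selects text nodes that are direct children of the span, while //text() selects text nodes at any depth. If the heading text on the real page is wrapped in a child element, /text() would skip it. The markup below is made up to show the difference; the nested <b> is an assumption, not something taken from the site.

```python
from lxml import html

# Hypothetical markup: the first span keeps its text inside a nested <b>,
# the second span holds its text directly.
SAMPLE = """
<div>
<span id="lblsubheader1242017" class="clsgriditem"><b>1:15 pm wednesday, 12 apr 2017</b></span>
<span id="lblerror1442017" class="clsgriditem">there no matters listed in sydney today</span>
</div>
"""

tree = html.fromstring(SAMPLE)

# Direct child text nodes only: the nested heading is missed
direct = tree.xpath('//span[@class="clsgriditem"]/text()')

# Text nodes at any depth below the span: both spans are found
nested = tree.xpath('//span[@class="clsgriditem"]//text()')

# One string per span, regardless of nesting
per_span = [s.text_content().strip()
            for s in tree.xpath('//span[@class="clsgriditem"]')]

print(direct)
print(nested)
print(per_span)
```

If a span can contain several nested pieces, `text_content()` gives one joined string per span, whereas //text() returns the pieces as separate list items.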

