What's wrong with this HTML DOM PHP code?
I'm trying to write code that prints the contents of the elements with itemprop="price" from each link, but it doesn't work and I can't figure out why. The code:
<?php
error_reporting(0);
ini_set('display_errors', 0);

$doc = new DOMDocument();
$allscan = array(
    'http://www.mobile54.co.il/30786',
    'http://www.mobile54.co.il/35873',
    'http://www.mobile54.co.il/34722'
);
$alllinks = array();

$html = file_get_contents($allscan[0]);
$doc->loadHTML($html);
$href = $doc->getElementsByTagName('a');

for ($j = 0; $j < count($allscan); $j++) {
    $html = file_get_contents($allscan[$j]);
    $doc->loadHTML($html);
    $href = $doc->getElementsByTagName('a');
    for ($i = 0; $i < $href->length; $i++) {
        $link = $href->item($i)->getAttribute("href");
        $lin = preg_replace('/\s+/', '', 'http://www.mobile54.co.il' . $link . "<br />");
        if (strpos($link, 'items/') && !strpos($link, '#techdetailsaname')) {
            if (!in_array($lin, $alllinks)) {
                $alllinks[] = $lin;
            }
        }
    }
}

for ($i = 0; $i < count($alllinks); $i++) {
    echo $alllinks[$i];
}

for ($i = 0; $i < count($alllinks); $i++) {
    $lin = "$alllinks[$i]";
    $html = file_get_contents($lin);
    $doc->loadHTML('<?xml encoding="utf-8"?>' . $html);
    $span = $doc->getElementsByTagName('span');
    for ($j = 0; $j < $span->length; $j++) {
        $attr = $span->item($j)->getAttribute('itemprop');
        if ($attr == "price") {
            echo $span->item($j)->textContent . "<br />";
        }
    }
}
?>

When I paste "someurl" in place of $lin it works, but the other way it doesn't. I've also tried $html = file_get_contents($alllinks[$i]); and that didn't work either. I don't know why.
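I think the problem is that you've appended <br /> to the end of each URL for some reason, so file_get_contents() is asked to fetch an address that ends in a tag. But there are a lot of opportunities to improve the code with the use of XPath. (Note that you can pass a URL directly to the DOMDocument object.)
First, pull the <a> elements matching your attribute values. Then take those URLs, search them for elements matching the itemprop attribute, and get the text content of them.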
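To see the effect of the appended tag in isolation, here is a minimal sketch that needs no network access. The relative link '/items/12345' is a made-up value standing in for an href scraped from the page; the rest mirrors the string handling in the question.

```php
<?php
// Hypothetical relative link, standing in for an href taken from the page.
$link = '/items/12345';

// What the question's code stores: the regex strips whitespace,
// but the <br /> tag itself stays glued to the end of the URL.
$broken = preg_replace('/\s+/', '', 'http://www.mobile54.co.il' . $link . "<br />");

// Keep the stored URL clean and add markup only when printing.
$clean = 'http://www.mobile54.co.il' . trim($link);

echo $broken, "\n";        // http://www.mobile54.co.il/items/12345<br/>
echo $clean, "<br />\n";   // the tag is part of the output, not of the URL
```

With the URL stored cleanly, the same value can be passed to file_get_contents() or loadHTMLFile() and echoed with whatever markup is wanted.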
<?php
$url = "http://www.mobile54.co.il/30786";
$prices = [];
$hrefs = [];
$combined = [];

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
$xpath = new DOMXPath($dom);

// <a> elements with an href containing items/ but not #techdetailsaname
$nodes = $xpath->query("//a[contains(@href, 'items/') and not(contains(@href, '#techdetailsaname'))]/@href");
foreach ($nodes as $node) {
    $hrefs[] = trim($node->value);
}

// now we have a list of URLs
foreach ($hrefs as $k => &$href) {
    $href = "http://www.mobile54.co.il$href";
    $dom->loadHTMLFile($href);
    $xpath = new DOMXPath($dom);
    // first element with an itemprop of price
    $nodes = $xpath->query("//*[@itemprop='price']");
    $prices[$k] = $nodes->item(0)->textContent;
}
unset($href); // drop the reference left over from the loop

// now we have $hrefs and $prices, combine them:
foreach ($hrefs as $k => $v) {
    $combined[$k] = [$hrefs[$k], $prices[$k]];
}
print_r($combined);
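The two XPath queries can also be tried offline before pointing them at the live site. The sketch below runs them against small inline HTML strings; the markup, the link values and the price are invented for illustration and are only assumed to resemble the real pages. It also guards against a page that has no itemprop="price" element, in which case item(0) returns null.

```php
<?php
// Hypothetical markup standing in for a listing page and a product page.
$listing = '<html><body>
  <a href="/items/111">Phone A</a>
  <a href="/items/111#techdetailsaname">Specs</a>
  <a href="/about">About</a>
</body></html>';
$product = '<html><body><span itemprop="price"> 1,299 </span></body></html>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them

// Step 1: collect hrefs containing items/ but not the #techdetailsaname fragment.
$dom = new DOMDocument;
$dom->loadHTML($listing);
$xpath = new DOMXPath($dom);

$hrefs = [];
$query = "//a[contains(@href, 'items/') and not(contains(@href, '#techdetailsaname'))]/@href";
foreach ($xpath->query($query) as $attr) {
    $hrefs[] = trim($attr->value);
}

// Step 2: read the first itemprop="price" element, if the page has one.
$dom->loadHTML($product);
$xpath = new DOMXPath($dom);
$node = $xpath->query("//*[@itemprop='price']")->item(0);
$price = $node !== null ? trim($node->textContent) : null;

print_r($hrefs);   // only /items/111 passes the filter
var_dump($price);  // string(5) "1,299"
```

Once the queries return what is expected here, swapping loadHTML() for loadHTMLFile() with the real URLs should be the only change needed.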