PHP Scraping Library: Page Scraper

jopen, 10 years ago

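Page Scraper is an easy-to-use PHP library for extracting data from web pages in just a few lines of code. It can scrape data from any website using either XPath expressions or CSS selectors. Example: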

$page = new Page('https://news.ycombinator.com');
$builder = new PageBuilder($page);
$builder->setDataConfig(array(
    'side_links' => array('css' => '.title .comhead'),  // use CSS Selector
    'titles'     => '//td[@class="title"]//a/text()',   // use XPath
    'links'      => '//td[@class="title"]//a/@href',    // use XPath
));
$director = new PageBuilderDirector($builder);
$director->buildPage();
$data = $page->getData();
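
To see what the two XPath expressions in the configuration above select, the sketch below evaluates them with PHP's built-in DOMDocument and DOMXPath classes. It does not show how Page Scraper works internally, and the inline HTML is a made-up sample rather than the real Hacker News markup.

```php
<?php
// Hypothetical sample markup; the real page is considerably more complex.
$html = <<<HTML
<html><body><table>
  <tr><td class="title"><a href="https://example.com/a">First story</a></td></tr>
  <tr><td class="title"><a href="https://example.com/b">Second story</a></td></tr>
</table></body></html>
HTML;

// Collect libxml parse warnings internally instead of printing them,
// since real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Same expression as the 'titles' entry: text nodes inside the links.
$titles = [];
foreach ($xpath->query('//td[@class="title"]//a/text()') as $node) {
    $titles[] = $node->nodeValue;
}

// Same expression as the 'links' entry: the href attribute nodes.
$links = [];
foreach ($xpath->query('//td[@class="title"]//a/@href') as $node) {
    $links[] = $node->nodeValue;
}

print_r($titles);
print_r($links);
```

Each query returns a list of nodes in document order, so the entries of $titles and $links line up by index as long as every link has both text and an href.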

Project homepage: http://www.open-open.com/lib/view/home/1418132786573