Beautiful Soup, an HTML Parser for Python

jopen, 13 years ago
     <p>The <strong>Beautiful Soup</strong> library is a remarkably forgiving "sloppy parser" for the HTML found on real-world web pages, which may or may not be valid.</p>
     <p>Example:</p>
     <pre class="brush:python; toolbar: true; auto-links: false;"># Beautiful Soup 3 on Python 2 (note the module name and the print statement)
from BeautifulSoup import BeautifulSoup

# Malformed input: none of the tags are closed
html = "<html><p>Para 1<p>Para 2<blockquote>Quote 1<blockquote>Quote 2"
soup = BeautifulSoup(html)

# prettify() returns the repaired tree as indented markup
print soup.prettify()
# <html>
#  <p>
#   Para 1
#  </p>
#  <p>
#   Para 2
#   <blockquote>
#    Quote 1
#    <blockquote>
#     Quote 2
#    </blockquote>
#   </blockquote>
#  </p>
# </html></pre>
     <p><strong>Project homepage:</strong> <a href="http://www.open-open.com/lib/view/home/1324370854280" target="_blank">http://www.open-open.com/lib/view/home/1324370854280</a></p>
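     <p>As a rough illustration of what "tolerant" parsing means, the sketch below feeds the same malformed snippet to <code>html.parser.HTMLParser</code> from the Python 3 standard library. This is not Beautiful Soup and it does not build or repair a tree; the <code>EventCollector</code> class is a hypothetical helper written for this example. It only shows that a lenient parser reports the unclosed tags as ordinary start tags instead of raising an error.</p>

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: record start tags, end tags, and text in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text nodes
        if data.strip():
            self.events.append(("text", data.strip()))


# Same malformed input as the example above: no closing tags at all
html = "<html><p>Para 1<p>Para 2<blockquote>Quote 1<blockquote>Quote 2"

collector = EventCollector()
collector.feed(html)
collector.close()  # flush any text still buffered by the parser

for kind, value in collector.events:
    print(kind, value)
```

     <p>No end-tag events are recorded for this input, so deciding where each element should close is left to the caller. Inferring that nesting is the part Beautiful Soup handles, as the <code>prettify()</code> output above shows.</p>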