html5lib: an HTML parsing library

jopen, 13 years ago
html5lib is a library for parsing HTML documents, available for Ruby and Python. It supports HTML5 and aims for maximum compatibility with desktop browsers.
<p>Main features include:</p>
<ul>
 <li><a id="0.11_Release_Features">Parses both valid and invalid HTML documents into a tree</a></li>
 <li><a id="0.11_Release_Features">Supports <strong>minidom</strong>, <strong>ElementTree</strong> (including <strong>cElementTree</strong> and <strong>lxml.etree</strong>), <strong>BeautifulSoup</strong> and custom <strong>simpletree</strong> output formats </a></li>
 <li><a id="0.11_Release_Features"><strong>DOM</strong> to <strong>SAX</strong> converter </a></li>
 <li><a id="0.11_Release_Features">Reports parse errors </a></li>
 <li><a id="0.11_Release_Features">Character encoding detection </a></li>
 <li><a id="0.11_Release_Features">XML mode for working with ill-formed XML, e.g. feeds </a></li>
 <li><a id="0.11_Release_Features">Filtering and serializing of trees </a></li>
 <li><a id="0.11_Release_Features">HTML+CSS sanitizer </a></li>
 <li><a id="0.11_Release_Features">An extensive unit test suite </a></li>
 <li><a id="0.11_Release_Features">Faster than previous versions <br /> </a></li>
</ul>
<p><strong>Project homepage:</strong> <a href="http://www.open-open.com/lib/view/home/1324371284280" target="_blank">http://www.open-open.com/lib/view/home/1324371284280</a></p>
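To make the first few features concrete, here is a minimal Python sketch of parsing an invalid document into an ElementTree and collecting the reported parse errors. It assumes a recent Python release of html5lib (1.x, installed with `pip install html5lib`); the feature list above describes the 0.11 release, whose API may differ in places. The helper name `parse_fragmentary_page` and the sample markup are made up for illustration and are not part of the library.

```python
try:
    import html5lib  # third-party package, not in the standard library
except ImportError:
    html5lib = None  # lets the sketch load even where html5lib is absent


def parse_fragmentary_page(markup):
    """Parse possibly invalid HTML; return (root <html> element, parse errors).

    Hypothetical helper for illustration only.
    """
    if html5lib is None:
        raise RuntimeError("html5lib is not installed")
    parser = html5lib.HTMLParser(
        tree=html5lib.getTreeBuilder("etree"),  # build an xml.etree.ElementTree tree
        namespaceHTMLElements=False,            # plain tag names such as "p", no XHTML namespace
    )
    root = parser.parse(markup)
    # parser.errors holds one entry per parse error found in the input
    return root, parser.errors


if html5lib is not None:
    # No doctype and unclosed <p> and <b>: invalid, but still parsed into a full tree.
    root, errors = parse_fragmentary_page("<title>Demo</title><p>Hello <b>world")
    print(root.find("head/title").text)
    print("".join(root.find("body/p").itertext()))
    print(len(errors), "parse errors reported")
```

The parser fills in the missing `html`, `head` and `body` elements much as a browser would, which is why the lookups by path succeed even though the input never wrote those tags.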