PyCharlockHolmes: A Character Encoding Detection Library for Python

n6xb, 9 years ago

PyCharlockHolmes is a character encoding detection library for Python, developed by Douban. It is built on ICU and libmagic and was inspired by the Ruby library Charlock Holmes.

Dependencies

  1. icu
  2. file(libmagic)

Gentoo

emerge -av dev-libs/icu
emerge -av sys-apps/file

Ubuntu

apt-get install libicu-dev
apt-get install libmagic-dev

Brew

brew install icu4c
brew install libmagic
export ICUI18N="/usr/local/Cellar/icu4c/xx"  # Replace "xx" with your installed icu4c version
export MAGIC="/usr/local/Cellar/libmagic/xx"  # Replace "xx" with your installed libmagic version
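Before building, it can help to confirm that both native libraries are visible on the system. The following is a minimal sketch, not part of PyCharlockHolmes: it uses the standard library's `ctypes.util.find_library`, and the short names `icui18n` and `magic` are assumptions about how ICU's i18n library and libmagic are usually named. The helper `find_native_deps` is a hypothetical name introduced for illustration.

```python
from ctypes.util import find_library

# Assumed short library names: ICU's i18n component and libmagic.
REQUIRED_LIBS = ("icui18n", "magic")


def find_native_deps(names=REQUIRED_LIBS):
    """Return {name: located library or None} for each requested name."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, path in find_native_deps().items():
        status = path if path else "NOT FOUND"
        print("{0}: {1}".format(name, status))
```

A "NOT FOUND" result is only a hint. Homebrew installs icu4c outside the default search path, so the library may be reported as missing even when the `ICUI18N` export above points the build at the right directory.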

Install

python setup.py build
python setup.py install

Usage

from charlockholmes import detect

# Read the file as raw bytes so the detector sees the original encoding
with open('test.txt', 'rb') as f:
    content = f.read()

print(detect(content))
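The Usage snippet only prints whatever `detect` returns; the source does not show the shape of that result. As a rough sketch, assuming the result is a dict with an `encoding` key (an assumption to verify against the real output), the detected encoding could be used to decode the bytes into text. The helper `decode_with_detection` and the stand-in `fake_detect` are hypothetical names introduced here so the example runs without the library installed.

```python
def decode_with_detection(data, detect_fn, fallback="utf-8"):
    """Decode bytes using the encoding reported by detect_fn.

    detect_fn is ASSUMED to return a dict with an 'encoding' key;
    check the real return value of charlockholmes.detect first.
    """
    result = detect_fn(data) or {}
    encoding = result.get("encoding") or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: fall back, replacing bad bytes
        return data.decode(fallback, errors="replace")


# Stand-in detector for demonstration only. With the library installed,
# pass charlockholmes.detect instead.
def fake_detect(data):
    return {"encoding": "GB18030"}


text = decode_with_detection(u"编码".encode("gb18030"), fake_detect)
```

The detector is passed in as a parameter so the decoding logic can be exercised without the native dependencies. The `LookupError` branch is there because a detector's encoding names are not guaranteed to match Python codec names.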

Project homepage: http://www.open-open.com/lib/view/home/1428388203901