From 6caa8f6580b30b4342ed2c1e5154b2594c6a7147 Mon Sep 17 00:00:00 2001
From: Carbo Kuo
Date: Fri, 8 Nov 2013 07:02:50 +0800
Subject: [PATCH] Add README

---
 README.md | 96 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..2bd2bda
--- /dev/null
+++ b/README.md
@@ -0,0 +1,96 @@
+# uchardet
+
+[uchardet](https://github.com/BYVoid/uchardet) is a C language binding of the original C++ implementation of the universal charset detection library by Mozilla.
+
+uchardet is an encoding detector library: it takes a sequence of bytes in an unknown character encoding, without any additional information, and attempts to determine the encoding of the text.
+
+The original code of universalchardet is available at http://lxr.mozilla.org/seamonkey/source/extensions/universalchardet/
+
+Techniques used by universalchardet are described at http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html
+
+## Supported Encodings
+
+ * Unicode
+   * UTF-8
+   * UTF-16BE / UTF-16LE
+   * UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-3412 / X-ISO-10646-UCS-4-2143
+ * Chinese
+   * ISO-2022-CN
+   * BIG5
+   * EUC-TW
+   * GB18030
+   * HZ-GB-2312
+ * Japanese
+   * ISO-2022-JP
+   * SHIFT_JIS
+   * EUC-JP
+ * Korean
+   * ISO-2022-KR
+   * EUC-KR
+ * Cyrillic
+   * ISO-8859-5
+   * KOI8-R
+   * WINDOWS-1251
+   * MACCYRILLIC
+   * IBM866
+   * IBM855
+ * Greek
+   * ISO-8859-7
+   * WINDOWS-1253
+ * Hebrew
+   * ISO-8859-8
+   * WINDOWS-1255
+ * Others
+   * WINDOWS-1252
+
+## Installation
+
+### Ubuntu/Debian
+
+    apt-get install uchardet libuchardet-dev
+
+### Mac
+
+    brew install uchardet
+
+### Build from source
+
+    cmake .
+    make
+    make install
+
+## Usage
+
+### Command Line
+
+```
+uchardet Command Line Tool
+Version 0.0.1
+
+Author: BYVoid
+Bug Report: http://code.google.com/p/uchardet/issues/entry
+
+Usage:
+ uchardet [Options] [File]
+
+Options:
+ -v, --version         Print version and build information.
+ -h, --help            Print this help.
+```
+### Library
+
+See [uchardet.h](https://github.com/BYVoid/uchardet/blob/master/src/uchardet.h)
+
+## Related Projects
+
+ * [python-chardet](http://chardet.feedparser.org/) Python port
+ * [ruby-rchardet](http://rubyforge.org/projects/chardet/) Ruby port
+ * [juniversalchardet](http://code.google.com/p/juniversalchardet/) Java port of universalchardet
+ * [jchardet](http://jchardet.sourceforge.net/) Java port of chardet
+ * [nuniversalchardet](http://code.google.com/p/nuniversalchardet/) C# port of universalchardet
+ * [nchardet](http://www.conceptdevelopment.net/Localization/NCharDet/) C# port of chardet
+ * [uchardet-enhanced](https://bitbucket.org/medoc/uchardet-enhanced) A fork of Mozilla universalchardet
+
+## License
+
+[Mozilla Public License Version 1.1](http://www.mozilla.org/MPL/1.1/)