uchardet/test/fr/utf-16be.txt
Jehan 7fa0fefef8 Add UTF-16 and UTF-32 test files in French, with BOM.
Unfortunately uchardet currently seems unable to detect UTF-16/32
text without a BOM.
2015-11-26 02:45:00 +01:00


UTF-16 (16-bit Unicode Transformation Format) is a character encoding capable of
encoding all 1,112,064 valid code points of Unicode. The encoding is
variable-length, as code points are encoded with one or two 16-bit code units.
(See also Comparison of Unicode encodings for a comparison of UTF-8, UTF-16, and UTF-32.)
UTF-16 developed from an earlier fixed-width 16-bit encoding known as UCS-2 (for
2-byte Universal Character Set) once it became clear that a fixed-width 2-byte
encoding could not encode enough characters to be truly universal.
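
The "one or two 16-bit code units" rule can be illustrated with a minimal sketch. The function names `utf16_code_units` and `encode_utf16be` are hypothetical, chosen for this example only, and are not part of uchardet. The sketch assumes big-endian byte order and an optional leading byte order mark (U+FEFF), matching how this test file is described.

```python
def utf16_code_units(code_point: int) -> list[int]:
    """Return the one or two 16-bit code units that encode a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Basic Multilingual Plane: a single code unit, as in UCS-2.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


def encode_utf16be(text: str, with_bom: bool = True) -> bytes:
    """Encode text as big-endian UTF-16, optionally prefixed with a BOM."""
    units = [0xFEFF] if with_bom else []
    for ch in text:
        units.extend(utf16_code_units(ord(ch)))
    return b"".join(unit.to_bytes(2, "big") for unit in units)
```

For instance, `utf16_code_units(0x1F600)` yields the surrogate pair `[0xD83D, 0xDE00]`, while `utf16_code_units(0xE9)` yields the single unit `[0xE9]`. Python's built-in `str.encode("utf-16-be")` produces the same bytes as `encode_utf16be(text, with_bom=False)`.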