uchardet

coffee/uchardet

mirror of https://gitlab.freedesktop.org/uchardet/uchardet.git synced 2025-12-24 12:44:46 +08:00

Commit Graph

Author	SHA1	Message	Date
Jehan	04f9309932	tests: update ISO-8859-15 French test file. The previous technical text about charsets themselves was not relevant for identifying a language. In particular, the special characters that differ between ISO-8859-1 and ISO-8859-15 were used by themselves, outside of any character sequence context. Therefore, without language understanding, they could just as well have represented the ISO-8859-15 letters or the ISO-8859-1 symbols at the corresponding codepoints. Replacing with text from this Wikipedia page: https://fr.wikipedia.org/wiki/Œuf_(cuisine) This uses some of these same characters (in particular 'œ') but in contextual character sequences, making it relevant for our algorithm.	2015-11-30 00:19:15 +01:00
Jehan	50588ba375	Add an ISO-8859-15 test file for French.	2015-11-28 02:18:57 +01:00
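
The ambiguity described in commit 04f9309932 can be illustrated with a minimal Python sketch. It is not uchardet's detection code; the helper name `decode_both` and the sample byte strings are hypothetical and exist only for this illustration. It relies on Python's standard `iso-8859-1` and `iso-8859-15` codecs to show that the byte 0xBD is valid in both charsets, so only the surrounding letters hint at which one was intended.

```python
# Illustration only: this is NOT uchardet's algorithm.
# ISO-8859-1 and ISO-8859-15 differ at just eight byte values.
DIFFERING_BYTES = bytes([0xA4, 0xA6, 0xA8, 0xB4, 0xB8, 0xBC, 0xBD, 0xBE])


def decode_both(data):
    """Hypothetical helper: decode the same bytes under both charsets."""
    return data.decode("iso-8859-1"), data.decode("iso-8859-15")


# A differing byte on its own: both decodings are equally plausible.
isolated = bytes([0xBD])

# The same byte inside a French word: only one decoding reads as French.
in_context = "œuf".encode("iso-8859-15")

if __name__ == "__main__":
    for byte in DIFFERING_BYTES:
        latin1, latin9 = decode_both(bytes([byte]))
        print(f"0x{byte:02X}: ISO-8859-1 {latin1!r} / ISO-8859-15 {latin9!r}")
    print(decode_both(isolated))
    print(decode_both(in_context))
```

Decoding `isolated` yields a symbol under one charset and a letter under the other, with nothing to prefer either. Decoding `in_context` yields a plausible French word only under ISO-8859-15, which is the kind of sequence-level evidence the updated test file is meant to provide.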

2 Commits