6 Commits

Author SHA1 Message Date
Jehan
bafccfcea8 Add Windows-1251 test files.
Texts taken from Bulgarian Wikipedia page about Windows-1251:
https://bg.wikipedia.org/wiki/Windows-1251
... and Russian Wikipedia page about Windows-1251:
https://ru.wikipedia.org/wiki/Windows-1251
The Bulgarian file is detected correctly, but the Russian file is
detected as "MAC-CYRILLIC", which is wrong and should be fixed.
2015-11-17 19:09:37 +01:00
Jehan
8216f7b395 Add an ISO-2022-KR test file.
Text taken from the Korean Wikipedia page about ISO-2022-KR:
https://ko.wikipedia.org/wiki/ISO/IEC_2022
2015-11-17 18:23:46 +01:00
Jehan
9172b763d1 Add TIS-620 to README (Thai language) and a test file.
Test text based on Thai Wikipedia page about the TIS-620 encoding:
https://th.wikipedia.org/wiki/TIS-620
2015-11-17 17:39:45 +01:00
Jehan
362e36d1ed Add EUC-KR test file.
Contains text taken from the Korean Wikipedia page on EUC-KR:
https://ko.wikipedia.org/wiki/EUC-KR
I added it as a mock subtitle file because, as the original Mozilla
paper says: "The input text may contain extraneous noises which have no
relation to its encoding, e.g. HTML tags, non-native words".
Therefore I feel it is important to have slightly noisy test files where
possible, in order to test how well our algorithm resists noise.
2015-11-17 16:36:17 +01:00
byvoid
eaab1d7868 Set permissions.
2011-07-11 18:08:26 +08:00
BYVoid
86b4739e5a Add test cases.
2011-07-11 14:57:31 +08:00