5 Commits

Author SHA1 Message Date
Jehan
0d152ff430 src, test: fix the new Johab prober and add a test.
This prober comes from MR !1 on the main branch, though it was too
aggressive then and could not get merged. On the improved API branch, it
doesn't detect other tests as Johab anymore.

Also fixing it to work with the new API.

Finally adding a Johab/ko unit test.
2022-12-14 00:23:13 +01:00
Jehan
2a559e7b52 README, test: update README and rename EUC-KR test to UHC.
2016-09-19 01:44:32 +02:00
Jehan
fe7bf3e994 test: update UTF-16 and UTF-32 tests after label changing.
2015-12-04 19:46:51 +01:00
Jehan
c8532f63a8 Adding UTF-8 file for Korean.
Text taken from Korean Wikipedia:
https://ko.wikipedia.org/wiki/UTF-8
2015-11-18 02:36:33 +01:00
Jehan
0efcdfa546 Reorganize test files in language subdirectories.
I realize that the language a text has been written in is very
important, since it completely changes the character distribution.
Our test files should take this into account, and we should create
several test files in different languages for encodings used in
various languages.
2015-11-17 21:12:39 +01:00