uchardet

mirror of https://gitlab.freedesktop.org/uchardet/uchardet.git synced 2026-07-30 16:26:27 +08:00


Jehan 6d31689632 test: adding 2 tests for Hebrew/IBM862 recognition.	2022-12-16 23:35:17 +01:00

This is the same text, taken from this Wikipedia page, which was today's featured page on Wikipedia in Hebrew: https://he.wikipedia.org/wiki/שתי מסכתות על ממשל מדיני

I put it in 2 variants, since IBM862 can be used in logical and visual variants. The visual variant is just a matter of reversing the order of the letters on each line (the lines themselves stay in their original order), so that's what I did.

Note that the English title quoted in the text should probably not have been reversed, but it doesn't matter much: these letters are outside the Hebrew alphabet and would trigger a bad sequence score whichever their order. So I didn't bother fixing them.
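The per-line reversal described in the commit message above can be sketched in C as follows. This is only an illustration, not necessarily how the test files were produced. It assumes a single-byte encoding such as IBM862 (so reversing bytes is the same as reversing letters) and LF line endings. The function names are hypothetical and are not part of uchardet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse the bytes in buf[start, end) in place. */
static void reverse_span(char *buf, size_t start, size_t end)
{
    while (start + 1 < end) {
        char tmp = buf[start];
        buf[start] = buf[end - 1];
        buf[end - 1] = tmp;
        start++;
        end--;
    }
}

/*
 * Reverse every '\n'-delimited line of buf in place.
 * The newline bytes keep their positions, so the lines stay in their
 * original order and only the bytes inside each line are reversed.
 * Assumption: one byte per letter (true for IBM862, not for UTF-8).
 */
static void reverse_each_line(char *buf, size_t len)
{
    size_t line_start = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i <= len; i++) {
        if (i == len || buf[i] == '\n') {
            reverse_span(buf, line_start, i);
            line_start = i + 1;
        }
    }
}

/* Convenience wrapper for NUL-terminated strings (hypothetical helper). */
static void reverse_each_line_str(char *s)
{
    assert(s != NULL);
    reverse_each_line(s, strlen(s));
}
```

Applied to the bytes of a logical-order file, `reverse_each_line` yields a visual-order variant in this simple sense. As the commit message notes, embedded Latin text is reversed as well. With CRLF line endings the `\r` would end up at the start of each line, so such input would need separate handling.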
ar	tests: add test files for Arabic.	2015-12-13 18:42:59 +01:00
bg	Reorganize test files in language subdirectories.	2015-11-17 21:12:39 +01:00
cs	test: adding test files for Czech.	2016-09-21 03:44:22 +02:00
da	script, src, test: add IBM865 support for Danish.	2022-11-30 19:57:52 +01:00
de	test: 4 new tests for UTF-8.	2022-12-14 00:23:13 +01:00
el	Add Greek test files.	2015-11-18 02:57:09 +01:00
en	test: finally add English/UTF-8 test file.	2022-12-14 21:45:29 +01:00
eo	test: 4 new tests for UTF-8.	2022-12-14 00:23:13 +01:00
es	tests: test files for Spanish.	2015-12-12 18:55:43 +01:00
et	LangModels: Estonian models created.	2016-09-27 00:14:29 +02:00
fi	LangModels: add Finnish support.	2016-09-21 18:27:39 +02:00
fr	test: update UTF-16 and UTF-32 tests after label changing.	2015-12-04 19:46:51 +01:00
ga	LangModels: added support for Irish Gaelic.	2016-09-27 00:49:05 +02:00
he	test: adding 2 tests for Hebrew/IBM862 recognition.	2022-12-16 23:35:17 +01:00
hi	src: add Hindi/UTF-8 support.	2022-12-14 00:23:13 +01:00
hr	LangModels: new Croatian models.	2016-09-26 01:32:49 +02:00
hu	test: 4 new tests for UTF-8.	2022-12-14 00:23:13 +01:00
it	LangModels: add Italian support.	2016-09-21 18:52:09 +02:00
ja	Add UTF-16 test files without BOM...	2015-11-28 19:50:18 +01:00
ko	src, test: fix the new Johab prober and add a test.	2022-12-14 00:23:13 +01:00
lt	LangModels: add support for Latvian | Lithuanian / ISO-8859-4 | ISO-8859-10.	2016-09-21 00:27:16 +02:00
lv	LangModels: add support for Latvian | Lithuanian / ISO-8859-4 | ISO-8859-10.	2016-09-21 00:27:16 +02:00
mt	test: update the Maltese / ISO-8859-3 test file.	2022-11-29 14:59:17 +01:00
no	Add tests for Norwegian.	2022-11-30 19:09:21 +01:00
pl	LangModels: add Polish support.	2016-09-21 17:30:15 +02:00
pt	LangModels: add support for Portuguese / ISO-8859-1.	2016-09-21 00:01:07 +02:00
ro	LangModels: Romanian support added.	2016-09-28 19:57:50 +02:00
ru	Add some Russian test files.	2015-11-27 18:17:20 +01:00
sk	LangModels: add support for Slovak.	2016-09-21 13:42:20 +02:00
sl	LangModels: add Slovene support.	2016-09-28 22:13:17 +02:00
sv	LangModels: add Swedish support.	2016-09-28 22:42:13 +02:00
th	LangModels: add ISO-8859-11 and regenerate TIS-620 Thai models.	2015-12-04 03:14:52 +01:00
tr	test: 4 new tests for UTF-8.	2022-12-14 00:23:13 +01:00
vi	LangModels: add VISCII encoding support and retrain Vietnamese model.	2016-02-13 03:51:18 +01:00
zh	Adding some more test files for Russian and Chinese.	2015-11-18 19:27:38 +01:00
CMakeLists.txt	test: add ability to have several tests per charsets.	2022-12-16 23:10:34 +01:00
uchardet-tests.c	src, test: rename s/uchardet_get_candidates/uchardet_get_n_candidates/.	2022-12-14 00:24:53 +01:00