uchardet

mirror of https://gitlab.freedesktop.org/uchardet/uchardet.git synced 2025-12-06 16:56:40 +08:00


Jehan 2a04e57c8f test: update the Maltese / ISO-8859-3 test file. Taken from the page: https://mt.wikipedia.org/wiki/Lingwa_Maltija The old test was fine but had some French words in it, which lowered the confidence for Maltese. Technically this should not be a huge issue in the end: if there are enough actual Maltese words, the stats should still weigh in favor of Maltese likeness (which they mostly did anyway). But since I am making some other changes, this was just not enough. In particular, I was changing some of the UTF-8 confidence logic and the file ended up detected as UTF-8 (even though it has illegal sequences and cannot be! Cf. #9). So the real long-term solution is to actually fix our UTF-8 detector, which I'll do at some point, but for the time being, let's have definite, unquestionable Maltese in there to simplify testing at this early stage of the uchardet rewrite.		2022-11-29 14:59:17 +01:00
ar	tests: add test files for Arabic.	2015-12-13 18:42:59 +01:00
bg	Reorganize test files in language subdirectories.	2015-11-17 21:12:39 +01:00
cs	test: adding test files for Czech.	2016-09-21 03:44:22 +02:00
da	LangModels: add Danish support (Windows-1252, ISO-8859-1 and ISO-8859-15).	2016-02-19 19:10:41 +01:00
de	LangModels: adding German models for ISO-8859-1 and Windows-1252.	2015-12-03 23:58:41 +01:00
el	Add Greek test files.	2015-11-18 02:57:09 +01:00
en	Add an ASCII test file for English...	2015-11-28 17:49:13 +01:00
eo	LangModels: add Esperanto ISO-8859-3 language model.	2015-12-04 01:35:56 +01:00
es	tests: test files for Spanish.	2015-12-12 18:55:43 +01:00
et	LangModels: Estonian models created.	2016-09-27 00:14:29 +02:00
fi	LangModels: add Finnish support.	2016-09-21 18:27:39 +02:00
fr	test: update UTF-16 and UTF-32 tests after label changing.	2015-12-04 19:46:51 +01:00
ga	LangModels: added support for Irish Gaelic.	2016-09-27 00:49:05 +02:00
he	Add Hebrew test files.	2015-11-18 03:16:18 +01:00
hr	LangModels: new Croatian models.	2016-09-26 01:32:49 +02:00
hu	tests: update Windows-1250 test file for Hungarian.	2015-12-12 18:12:08 +01:00
it	LangModels: add Italian support.	2016-09-21 18:52:09 +02:00
ja	Add UTF-16 test files without BOM...	2015-11-28 19:50:18 +01:00
ko	README, test: update README and rename EUC-KR test to UHC.	2016-09-19 01:44:32 +02:00
lt	LangModels: add support for Latvian | Lithuanian / ISO-8859-4 | ISO-8859-10.	2016-09-21 00:27:16 +02:00
lv	LangModels: add support for Latvian | Lithuanian / ISO-8859-4 | ISO-8859-10.	2016-09-21 00:27:16 +02:00
mt	test: update the Maltese / ISO-8859-3 test file.	2022-11-29 14:59:17 +01:00
pl	LangModels: add Polish support.	2016-09-21 17:30:15 +02:00
pt	LangModels: add support for Portuguese / ISO-8859-1.	2016-09-21 00:01:07 +02:00
ro	LangModels: Romanian support added.	2016-09-28 19:57:50 +02:00
ru	Add some Russian test files.	2015-11-27 18:17:20 +01:00
sk	LangModels: add support for Slovak.	2016-09-21 13:42:20 +02:00
sl	LangModels: add Slovene support.	2016-09-28 22:13:17 +02:00
sv	LangModels: add Swedish support.	2016-09-28 22:42:13 +02:00
th	LangModels: add ISO-8859-11 and regenerate TIS-620 Thai models.	2015-12-04 03:14:52 +01:00
tr	LangModels: adding Turkish models for ISO-8859-3 and ISO-8859-9.	2015-12-04 02:35:09 +01:00
vi	LangModels: add VISCII encoding support and retrain Vietnamese model.	2016-02-13 03:51:18 +01:00
zh	Adding some more test files for Russian and Chinese.	2015-11-18 19:27:38 +01:00
CMakeLists.txt	Request C++11 standard project-wise and make it a strong requirement.	2017-05-28 15:43:44 +02:00
uchardet-tests.c	Update uchardet-tests.c	2022-11-29 13:57:31 +00:00
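
The latest commit message above notes that the ISO-8859-3 Maltese file "has illegal sequence and cannot be" UTF-8. As a rough illustration of that point, here is a minimal sketch of a strict UTF-8 well-formedness check in C. The names `is_valid_utf8` and `utf8_self_check` are hypothetical helpers written for this example; they are not part of uchardet and do not reflect how its UTF-8 prober works. The sample bytes assume the word "Għawdex", where "ħ" is byte 0xB1 in ISO-8859-3 and the pair 0xC4 0xA7 in UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not uchardet API): returns 1 if buf[0..len) is
 * well-formed UTF-8 in the sense of RFC 3629, 0 otherwise. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* number of continuation bytes expected */
        unsigned long min;  /* smallest code point allowed at this length */
        unsigned long cp;   /* decoded code point */
        size_t k;

        if (c < 0x80) {                      /* plain ASCII */
            i++;
            continue;
        } else if (c >= 0xC2 && c <= 0xDF) { /* 2-byte lead */
            need = 1; min = 0x80;    cp = c & 0x1F;
        } else if (c >= 0xE0 && c <= 0xEF) { /* 3-byte lead */
            need = 2; min = 0x800;   cp = c & 0x0F;
        } else if (c >= 0xF0 && c <= 0xF4) { /* 4-byte lead */
            need = 3; min = 0x10000; cp = c & 0x07;
        } else {
            /* Stray continuation byte (0x80..0xBF) or a byte that can
             * never start a sequence (0xC0, 0xC1, 0xF5..0xFF). */
            return 0;
        }

        if (len - i - 1 < need)
            return 0;                        /* truncated sequence */

        for (k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;                    /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)                        /* overlong encoding */
            return 0;
        if (cp > 0x10FFFF)                   /* beyond Unicode range */
            return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF)    /* UTF-16 surrogate */
            return 0;

        i += need + 1;
    }
    return 1;
}

/* "Għawdex" in both encodings (assumed sample, not taken from the test file). */
void utf8_self_check(void)
{
    static const unsigned char latin3[] = { 'G', 0xB1, 'a', 'w', 'd', 'e', 'x' };
    static const unsigned char utf8[]   = { 'G', 0xC4, 0xA7, 'a', 'w', 'd', 'e', 'x' };

    assert(!is_valid_utf8(latin3, sizeof latin3)); /* lone 0xB1 is illegal */
    assert(is_valid_utf8(utf8, sizeof utf8));
}
```

In ISO-8859-3 text, a byte such as 0xB1 typically appears on its own between ASCII letters, which a strict validator rejects immediately. A statistical detector could use such a check to rule out UTF-8 before comparing confidence scores, though whether uchardet should do so is a design decision for its maintainers.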