8 Commits

Author SHA1 Message Date
Jehan
15afc5c593 test: add a Hungarian Windows-1250 test but skip it for now.
Text from: https://hu.wikipedia.org/wiki/Magyar_nyelv
2015-12-03 21:18:55 +01:00
Jehan
683255278d Re-enable Hungarian language models.
Now that we have at least one model for ISO-8859-1, the risk of
detecting all ISO-8859-1 texts as ISO-8859-2 is lessened.
2015-12-02 22:24:36 +01:00
Jehan
f4f9fc3f28 test: re-enable Windows-1251 test for Russian.
Commit 4f1c3ff actually fixed it!
2015-12-02 21:53:27 +01:00
Jehan
a8e9de307b Add UTF-16 test files without BOM...
... and disable the tests for these for now, since uchardet is not yet
able to detect UTF-16 without a BOM.
2015-11-28 19:50:18 +01:00
Jehan
005fd98086 Add initial support for French with ISO-8859-1 and ISO-8859-15.
Mostly generated with a script from Wikipedia data (only the typical
positive ratio is slightly modified).
This is a first test before adding my generating script to the main tree.
2015-11-28 02:14:39 +01:00
Jehan
5dcff7b241 Hide away tests known to fail.
Some charsets are simply not supported (ex: fr:iso-8859-1), some are
temporarily deactivated (ex: hu:iso-8859-2) and some are wrongly
detected as closely related charsets.
These were broken (or not efficient) from the start, and there is no
need to pollute the `make test` output with these, which may make us
miss actual regressions when they occur. So let's hide these away for
now until we can improve the situation.
2015-11-18 20:02:58 +01:00
Jehan
4b38e68aa2 CMake tests: separate the lang and charset with colon...
... rather than a hyphen. It makes it easier to read.
2015-11-18 19:42:35 +01:00
Jehan
eb727d3aca Add automatic testing against every test file.
2015-11-18 18:18:27 +01:00
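
Commit 4b38e68aa2 switches the test names to a `lang:charset` form. The sketch below illustrates why the colon is likely easier to work with than a hyphen: charset names such as `iso-8859-2` already contain hyphens. The function `split_test_name` is hypothetical and is not part of the uchardet tree; it only assumes that names look like the `hu:iso-8859-2` examples quoted in the messages above.

```python
def split_test_name(name: str) -> tuple[str, str]:
    """Split a 'lang:charset' test name into (lang, charset).

    Hypothetical helper for illustration only; not taken from uchardet.
    """
    lang, sep, charset = name.partition(":")
    if not sep or not lang or not charset:
        raise ValueError(f"expected 'lang:charset', got {name!r}")
    return lang, charset


if __name__ == "__main__":
    # The colon gives a single, unambiguous boundary.
    print(split_test_name("hu:iso-8859-2"))  # ('hu', 'iso-8859-2')

    # With a hyphen separator, the language/charset boundary is ambiguous
    # because the charset name itself contains hyphens.
    print("hu-iso-8859-2".split("-"))  # ['hu', 'iso', '8859', '2']
```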