Jehan
9c3c37517c
LangModels: add Arabic support.
Models constructed for ISO-8859-6 and Windows-1256.
2015-12-13 18:42:16 +01:00
Jehan
ffabb65712
LangModels: adding Spanish support.
With 3 charsets: ISO-8859-1, ISO-8859-15 and Windows-1252.
2015-12-12 18:54:35 +01:00
Jehan
fb3c47a073
LangModels: add ISO-8859-11 and regenerate TIS-620 Thai models.
ISO-8859-11 is identical to TIS-620 except that it adds the
non-breaking space character.
As a result, our detection will return TIS-620 unless the text
contains a non-breaking space.
2015-12-04 03:14:52 +01:00
Jehan
5ee1c3ee39
LangModels: adding Turkish models for ISO-8859-3 and ISO-8859-9.
2015-12-04 02:35:09 +01:00
Jehan
f0e122b506
LangModels: add Esperanto ISO-8859-3 language model.
2015-12-04 01:35:56 +01:00
Jehan
aa587a64bd
LangModels: adding German models for ISO-8859-1 and Windows-1252.
2015-12-03 23:58:41 +01:00
Jehan
0270b1e856
Adding French Windows-1252 support.
2015-12-03 21:22:30 +01:00
Jehan
683255278d
Re-enable Hungarian language models.
Now that we have at least one model for ISO-8859-1, the risk of
detecting all ISO-8859-1 texts as ISO-8859-2 is lessened.
2015-12-02 22:24:36 +01:00
Jehan
005fd98086
Add initial support for French with ISO-8859-1 and ISO-8859-15.
Mostly generated with a script from Wikipedia data (only the typical
positive ratio is slightly modified).
This is a first test before adding my generating script to the main tree.
2015-11-28 02:14:39 +01:00
BYVoid
84284eccf4
Update code from upstream.
2011-07-11 14:42:50 +08:00
BYVoid
3601900164
Initial release.
2011-07-10 15:04:42 +08:00