Jehan
9c3c37517c
LangModels: add Arabic support.
Models constructed for ISO-8859-6 and Windows-1256.
2015-12-13 18:42:16 +01:00
Jehan
ad2f7212e2
LangModels: retraining Greek models with my training script.
This fixes our Greek/Windows-1253 test.
2015-12-13 18:02:11 +01:00
Jehan
6b2722885a
BuildLangModel: forgot to add charset/language files.
2015-12-12 18:18:08 +01:00
Jehan
fb3c47a073
LangModels: add ISO-8859-11 and regenerate TIS-620 Thai models.
ISO-8859-11 is identical to TIS-620 except that it adds the
non-breaking space character.
As a result, our detection will return TIS-620 unless the text
contains a non-breaking space.
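The rule described above can be sketched as follows. This is only an illustration, not the detector's actual code: the function name `pick_thai_charset` is hypothetical, and it assumes the input has already been recognized as Thai text in one of these two charsets. The only fact it relies on is that byte 0xA0 is the non-breaking space in ISO-8859-11 and is unassigned in TIS-620.

```python
# Byte 0xA0: non-breaking space in ISO-8859-11, unassigned in TIS-620.
NBSP = 0xA0


def pick_thai_charset(data: bytes) -> str:
    """Hypothetical tie-breaker between the two Thai charsets.

    Assumes `data` is already known to be Thai text in one of them.
    Returns ISO-8859-11 only when the distinguishing byte is present.
    """
    return "ISO-8859-11" if NBSP in data else "TIS-620"


if __name__ == "__main__":
    # Thai bytes without a non-breaking space, then with one inserted.
    print(pick_thai_charset(b"\xca\xc7\xd1\xca\xb4\xd5"))
    print(pick_thai_charset(b"\xca\xc7\xd1\xca\xa0\xb4\xd5"))
```

Defaulting to TIS-620 matches the behavior stated in the commit message: ISO-8859-11 is reported only in the exceptional case.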
2015-12-04 03:14:52 +01:00
Jehan
ffcd85f709
script: forgot to commit ISO-8859-9 and Turkish files.
2015-12-04 02:40:54 +01:00
Jehan
f0e122b506
LangModels: add Esperanto ISO-8859-3 language model.
2015-12-04 01:35:56 +01:00
Jehan
0270b1e856
Adding French Windows-1252 support.
2015-12-03 21:22:30 +01:00
Jehan
0314f98ece
BuildLangModel.py: work-in-progress script to build language models.
2015-11-29 01:30:04 +01:00