4 Commits

Author SHA1 Message Date
Jehan
e6e51d9fe8 src: all language models now rebuilt after the fix. 2022-12-15 14:31:55 +01:00
Jehan
6bb1b3e101 scripts: all language models rebuilt with the new ratio data. 2022-12-14 20:16:44 +01:00
Jehan
eb8308d50a src, script: regenerate all existing language models.
Now making sure that we have a generic language model working with UTF-8
for all 26 supported models which had single-byte encoding support until
now.
2022-12-14 00:23:13 +01:00
Jehan
3401ac70d0 LangModels: add Polish support.
With the following encodings: ISO-8859-2, ISO-8859-13, ISO-8859-16,
Windows-1250, IBM852, MAC-CENTRALEUROPE.
Test text from https://pl.wikipedia.org/wiki/Zofia_Holszańska
2016-09-21 17:30:15 +02:00