1 Commits

Author SHA1 Message Date
Jehan
0be80a21db script, src: update Norwegian model with the new language features.
As I just rebased my branch about new language detection API, I needed
to re-generate Norwegian language models. Unfortunately it doesn't
detect UTF-8 Norwegian text, though not far off (it detects it as second
candidate with high 91% confidence; beaten by Danish UTF-8 with 94%
confidence unfortunately!).

Note that I also update the alphabet list for Norwegian as there were
too many letters in there (according to Wikipedia at least), so even
when training a model, we had some missing characters in the training
set.
2022-12-14 00:24:53 +01:00