3 Commits

Author SHA1 Message Date
Jehan
5a949265d5 src: new API to get the detected language.
This doesn't work for all probers yet, in particular not for the most
generic probers (such as UTF-8) or WINDOWS-1252. These will return NULL.
It's still a good first step.

Right now, it returns the 2-character language code from ISO 639-1. A
using project could easily get the English language name from the
XML/json files provided by the iso-codes project. This project will also
allow to easily localize the language name in other languages through
gettext (this is what we do in GIMP for instance). I don't add any
dependency though and leave it to downstream projects to implement this.

I was also wondering if we want to support region information for cases
when it would make sense. I especially wondered about it for Chinese
encodings as some of them seem quite specific to a region (according to
Wikipedia at least). For the time being though, these just return "zh".
We'll see later if it makes sense to be more accurate (maybe depending
on reports?).
2022-12-14 00:23:13 +01:00
Jehan
44a50c30ee Issue #8: no newline at end of file.
Not sure if it is in the C++ standard, or was, but apparently some
compilers may complain when files don't end with a newline (though
neither GCC nor Clang as our CI and my local builds are fine).

So here are all our generated source which didn't have such ending
newline (hopefully I forgot none). I just loaded them in my vim editor,
and resaved them. This was enough to add an ending newline.
2020-04-22 22:53:25 +02:00
Jehan
aa587a64bd LangModels: adding German models for ISO-8859-1 and Windows-1252. 2015-12-03 23:58:41 +01:00