uchardet/script/BuildLangModelLogs/LangGreekModel.log
Jehan 5c3a2e8037 src, script: regenerate all existing language models.
Now making sure that we have a generic language model working with UTF-8
for all 26 supported models which had single-byte encoding support until
now.
2021-03-17 02:07:17 +01:00

= Logs of language model for Greek (el) =
- Generated by BuildLangModel.py
- Started: 2021-03-16 18:54:42.415198
- Maximum depth: 4
- Max number of pages: 100
== Parsed pages ==
Πύλη:Κύρια (revision 7950664)
16 Μαρτίου (revision 8737120)
1797 (revision 8019834)
1839 (revision 8019704)
1900 (revision 7952521)
1901 (revision 7905277)
1935 (revision 8290828)
Mars 2020 (revision 8718725)
Perseverance (ρόβερ) (revision 8718754)
The Economist (revision 8341010)
Wiki (revision 8595867)
Wikimedia (revision 8518678)
Άρθουρ Έβανς (revision 8502931)
Άρθρουρ Γουέλσλεϋ, Δούκας του Ουέλλινγκτον (revision 8423158)
Αγγλική γλώσσα (revision 8702613)
Αδόλφος Χίτλερ (revision 8722090)
Αντισφαίριση (revision 8557812)
Αρειανό ελικόπτερο Ingenuity (revision 8718783)
Αυστραλιανό Όπεν (revision 8078988)
Βέρμαχτ (revision 8711795)
Βραβεία Νόμπελ Λογοτεχνίας (revision 8519145)
Γαλλία (revision 8680274)
Γενικός Διευθυντής του Παγκόσμιου Οργανισμού Εμπορίου (revision 8694448)
Γερμανία (revision 8724575)
Εγκυκλοπαίδεια (revision 8687200)
Ελεύθερο περιεχόμενο (revision 8707719)
Ελληνική Βικιπαίδεια (revision 8731090)
Κνωσός (revision 8697910)
Κρήτη (revision 8735869)
Λονδίνο (revision 8666776)
Ναόμι Οσάκα (revision 8736512)
Νγκόζι Οκόντζο-Ιουεάλα (revision 8716446)
Νόβακ Τζόκοβιτς (revision 8735633)
Ουίλιαμ Μπάντινγκ (revision 8298356)
Παγκόσμιος Οργανισμός Εμπορίου (revision 8694448)
Πατριάρχης Σερβίας Πορφύριος (revision 8716966)
Σερβική Ορθόδοξη Εκκλησία (revision 8703081)
Συλί Προυντόμ (revision 8736464)
Συνθήκη των Βερσαλλιών (revision 7991516)
10 Μαρτίου (revision 8726574)
1185 (revision 8532989)
1190 (revision 8729267)
11 Μαρτίου (revision 8730381)
1244 (revision 7906151)
12 Μαρτίου (revision 8730152)
13 Μαρτίου (revision 8544014)
1405 (revision 7906083)
1410 (revision 7906088)
1465 (revision 7905889)
1473 (revision 8687951)
1478 (revision 7905905)
14 Μαρτίου (revision 8096796)
15 Μαρτίου (revision 8734431)
1670 (revision 8120689)
1751 (revision 8019900)
1782 (revision 8019823)
1789 (revision 8019786)
1792 (revision 8019828)
1794 (revision 8019829)
17 Μαρτίου (revision 8233521)
1802 (revision 8019791)
1812 (revision 8019794)
1815 (revision 8728979)
1859 (revision 8019719)
1872 (revision 8019620)
1888 (revision 8678352)
1892 (revision 8019578)
1894 (revision 8019646)
1898 (revision 7905275)
18 Μαρτίου (revision 8666328)
1906 (revision 8019564)
1908 (revision 8110859)
1911 (revision 8234911)
1912 (revision 7905254)
1919 (revision 8188234)
1920 (revision 8689556)
1921 (revision 8019599)
1923 (revision 8640393)
1924 (revision 8019604)
1925 (revision 8424340)
1926 (revision 8019613)
1927 (revision 7905236)
1930 (revision 8019616)
1937 (revision 7905218)
1939 (revision 8731642)
1940 (revision 8503734)
1944 (revision 8556801)
1945 (revision 8699418)
1948 (revision 8707830)
1953 (revision 8660010)
1955 (revision 8733996)
1956 (revision 8637553)
1957 (revision 8582051)
1959 (revision 8621124)
1964 (revision 8701289)
1966 (revision 8596642)
1967 (revision 8657263)
1968 (revision 8640882)
1969 (revision 8709383)
1970 (revision 8645926)
== End of Parsed pages ==
- Wikipedia parsing ended at: 2021-03-16 18:58:31.004638
62 characters appeared 801479 times.
First 47 characters:
[ 0] Char α: 8.791371951105393 %
[ 1] Char ο: 8.656870610458913 %
[ 2] Char τ: 7.436002690026814 %
[ 3] Char ι: 6.335661944979219 %
[ 4] Char ν: 5.906455440504367 %
[ 5] Char ε: 5.323907426145913 %
[ 6] Char ρ: 5.098698780629311 %
[ 7] Char ς: 4.129740142910793 %
[ 8] Char κ: 4.033542987402041 %
[ 9] Char σ: 3.9103956560309125 %
[10] Char υ: 3.7128858023728633 %
[11] Char η: 3.4742020689250745 %
[12] Char λ: 3.4385180397739674 %
[13] Char π: 3.329220104332116 %
[14] Char μ: 3.3050148537890576 %
[15] Char ί: 2.7370648513560556 %
[16] Char ό: 2.185958708837038 %
[17] Char γ: 2.095251403966916 %
[18] Char ά: 1.8429678132552443 %
[19] Char έ: 1.6417148796163092 %
[20] Char δ: 1.4553094965682194 %
[21] Char β: 1.2000314418718394 %
[22] Char ω: 1.121801070271336 %
[23] Char ή: 1.0494348573075527 %
[24] Char χ: 0.9217958299593626 %
[25] Char ύ: 0.8777522555176118 %
[26] Char φ: 0.8600350102747546 %
[27] Char θ: 0.7800578680165045 %
[28] Char ώ: 0.617732966178777 %
[29] Char ζ: 0.4195992658572464 %
[30] Char e: 0.30456194111137036 %
[31] Char ξ: 0.28696946520120925 %
[32] Char i: 0.25203405204627943 %
[33] Char a: 0.23631311612656103 %
[34] Char n: 0.21647479222786872 %
[35] Char r: 0.1978841616561382 %
[36] Char o: 0.18915030836740576 %
[37] Char s: 0.17779629909205355 %
[38] Char t: 0.16269920983581604 %
[39] Char l: 0.14585534992183202 %
[40] Char d: 0.11665932607092637 %
[41] Char c: 0.10468147013209328 %
[42] Char h: 0.09257884486056403 %
[43] Char u: 0.08409453023722394 %
[44] Char m: 0.08247252891217362 %
[45] Char ΐ: 0.07161759696760614 %
[46] Char ψ: 0.06774974765402461 %
The first 47 characters have an accumulated ratio of 0.9947858895866266.
1390 sequences found.
First 512 (typical positive ratio): 0.9624941725288916
Next 512 (512-1024): 0.00617732966178777
Rest: 0.0016086054433421051
- Processing end: 2021-03-16 18:58:31.125842
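
The accumulated ratio reported above can be cross-checked by summing the per-character percentages. The following is a minimal sketch, not part of BuildLangModel.py: the regular expression, the function names `parse_char_frequencies` and `accumulated_ratio`, and the three-line sample are assumptions made for illustration, based only on the line format visible in this log.

```python
import re

# Matches frequency lines such as "[ 0] Char α: 8.791371951105393 %".
# The pattern is inferred from this log's layout (an assumption).
CHAR_LINE = re.compile(r"^\[\s*(\d+)\]\s+Char (.): ([0-9.]+) %$")


def parse_char_frequencies(log_text):
    """Return a list of (rank, char, percent) tuples found in the log text."""
    rows = []
    for line in log_text.splitlines():
        match = CHAR_LINE.match(line.strip())
        if match:
            rows.append(
                (int(match.group(1)), match.group(2), float(match.group(3)))
            )
    return rows


def accumulated_ratio(rows):
    """Sum the percentages and convert the total to a ratio in [0, 1]."""
    return sum(percent for _, _, percent in rows) / 100.0


# Hypothetical sample: the first three frequency lines of this log.
sample = """\
[ 0] Char α: 8.791371951105393 %
[ 1] Char ο: 8.656870610458913 %
[ 2] Char τ: 7.436002690026814 %
"""

rows = parse_char_frequencies(sample)
print(len(rows), accumulated_ratio(rows))
```

Feeding the full log text to `parse_char_frequencies` instead of the sample should give a sum close to the accumulated ratio the log states for the first 47 characters, up to floating-point rounding.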