uchardet/script/BuildLangModelLogs/LangMalteseModel.log

= Logs of language model for Maltese (mt) =
- Generated by BuildLangModel.py
- Started: 2022-12-14 18:08:52.337231
- Maximum depth: 4
- Max number of pages: 200
== Parsed pages ==
Unjoni Ewropea (revision 279041)
Kummissjoni Ewropea (revision 258115)
2007 (revision 258027)
Repubblika Ċeka (revision 279325)
Renju Unit (revision 282249)
Bulgarija (revision 266495)
Finlandja (revision 282145)
Danimarka (revision 280266)
Transnistrija (revision 266548)
Belġju (revision 276022)
Stati Uniti tal-Amerika (revision 280264)
Rumanija (revision 266525)
Iżlanda (revision 280133)
Turkija (revision 270987)
Praga (revision 281545)
Santa Luċija (revision 281476)
Ażerbajġan (revision 283345)
Lista ta' pajjiżi skont l-erja (revision 260621)
Awstrija (revision 273952)
Lista ta' pajjiżi skont id-densità ta' popolazzjoni (revision 272026)
Repubblika tal-Irlanda (revision 280123)
Georgia (revision 279271)
Ġermanja (revision 279831)
Litwanja (revision 281573)
Estonja (revision 274160)
Albanija (revision 272682)
San Marino (revision 279324)
Italja (revision 277251)
Abkażja (revision 266550)
Bangladexx (revision 281383)
Sri Lanka (revision 281445)
Burundi (revision 262969)
Belt tal-Vatikan (revision 282714)
Stati Uniti (revision 280264)
Spanja (revision 274001)
Parlament Ewropew (revision 255748)
Slovakkja (revision 266528)
Kosovo (revision 277587)
Lesoto (revision 281408)
Serbja (revision 266527)
Russja (revision 266526)
Ġamajka (revision 281389)
Malta (revision 280087)
Uganda (revision 281453)
Bożnija-Ħerzegovina (revision 266494)
Portugall (revision 279398)
Kroazja (revision 279781)
Norveġja (revision 279820)
Ċekja (revision 279325)
Salzburg (stat) (revision 281285)
Belarussja (revision 270102)
Montenegro (revision 279276)
Tuvalu (revision 281452)
El Salvador (revision 253460)
Filippini (revision 266237)
Kamerun (revision 281402)
Repubblika Dominikana (revision 272017)
Tajlandja (revision 279268)
Żvezja (revision 282136)
Belt kapitali (revision 274120)
Kambodja (revision 266273)
Żambja (revision 281455)
Commonwealth tan-Nazzjonijiet (revision 262942)
Lista ta' kodiċi telefoniċi (revision 257699)
Ungerija (revision 268808)
Armenja (revision 278995)
Ċina (revision 266233)
Kittieb (revision 277970)
Repubblika tal-Maċedonja (revision 281602)
Gżejjer Faroe (revision 262423)
Lussemburgu (revision 279759)
Tajwan (revision 279695)
Beliże (revision 281377)
Riga (revision 276342)
Indoneżja (revision 279426)
Żimbabwe (revision 269714)
Netherlands (revision 281883)
Liechtenstein (revision 279758)
Maldive (revision 281413)
Monaco (revision 281097)
Copenhagen (revision 264454)
Baħrejn (revision 279277)
Kanada (revision 281405)
Liżbona (revision 273944)
Żvizzera (revision 268804)
Gżejjer Aran (revision 243256)
Afganistan (revision 277361)
Andorra (revision 278993)
Ukrajna (revision 274996)
Graz (revision 276048)
Iżlam (revision 280916)
Seychelles (revision 281440)
Lingwa uffiċjali (revision 251833)
Nagorno-Karabakh (revision 274274)
Joe Biden (revision 271580)
Repubblika tal-Maċedonja ta' Fuq (revision 281602)
Kuwajt (revision 279782)
Senegal (revision 269704)
== End of Parsed pages ==
- Wikipedia parsing ended at: 2022-12-14 18:10:59.208295
51 characters appeared 410264 times.
Most Frequent characters:
[ 0] Char a: 12.611879180235167 %
[ 1] Char i: 11.737807850554764 %
[ 2] Char l: 8.053838503987675 %
[ 3] Char t: 7.739650566464521 %
[ 4] Char e: 6.431712263323129 %
[ 5] Char n: 6.16895462433945 %
[ 6] Char r: 5.748006161886005 %
[ 7] Char u: 4.319413840843945 %
[ 8] Char o: 3.771473977731412 %
[ 9] Char j: 3.663982216328998 %
[10] Char m: 3.6181580640758146 %
[11] Char s: 3.4828793167326406 %
[12] Char k: 2.807460561979603 %
[13] Char d: 2.365793732815943 %
[14] Char b: 2.051605795292787 %
[15] Char p: 2.0045629155860616 %
[16] Char f: 1.9814070939687616 %
[17] Char g: 1.5919017998167033 %
[18] Char ħ: 1.552902521303356 %
[19] Char w: 1.355468673829534 %
[20] Char z: 1.2309147280775306 %
[21] Char h: 1.031287171187333 %
[22] Char ż: 1.005937640153657 %
[23] Char ġ: 0.7556110211961078 %
[24] Char v: 0.6895559932141255 %
[25] Char ċ: 0.663231480217616 %
[26] Char x: 0.5952264883099663 %
[27] Char q: 0.5294152058186924 %
[28] Char c: 0.22058966909112182 %
[29] Char à: 0.07824230251740343 %
[30] Char y: 0.06971121034260866 %
The first 31 characters have an accumulated ratio of 0.9992858257122245.
The first 3 characters have an accumulated ratio of 0.32403525534777605.
All characters whose order is over 22 have an accumulated ratio of 0.03601583370707642.
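The accumulated ratios reported above can be cross-checked against the per-character percentages in the table. The following is a minimal sketch of such a check, not the code BuildLangModel.py itself runs; the names `percent_by_order` and `accumulated_ratio` are hypothetical, and the values are copied from the table (orders 0 to 2 and 23 to 30 only).

```python
import math

# Percentages copied from the "Most Frequent characters" table.
# Only the orders needed for the two checks are included.
percent_by_order = {
    0: 12.611879180235167,   # a
    1: 11.737807850554764,   # i
    2: 8.053838503987675,    # l
    23: 0.7556110211961078,  # ġ
    24: 0.6895559932141255,  # v
    25: 0.663231480217616,   # ċ
    26: 0.5952264883099663,  # x
    27: 0.5294152058186924,  # q
    28: 0.22058966909112182, # c
    29: 0.07824230251740343, # à
    30: 0.06971121034260866, # y
}

def accumulated_ratio(orders):
    """Sum the percentages of the given orders and return a 0..1 ratio."""
    return math.fsum(percent_by_order[o] for o in orders) / 100.0

# "The first 3 characters": orders 0, 1, 2.
first3 = accumulated_ratio(range(0, 3))

# "Order is over 22": orders 23..30, i.e. the listed characters only.
over22 = accumulated_ratio(range(23, 31))

print(first3, over22)
```

Note that the second figure is reproduced by summing only the listed characters of order 23 to 30; the 20 characters beyond the first 31 are not part of that sum.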
936 sequences found.
First 512 (typical positive ratio): 0.9950079702120929
Next 197 (709-512): 0.003994232243462514
Rest: 0.000997797544444623
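The three sequence ratios above partition all 936 sequences, so they should add up to 1 within floating-point error. A minimal sketch of that check follows; the variable names are hypothetical and the values are copied from the log lines above.

```python
import math

total_sequences = 936  # "936 sequences found."
first_count = 512      # "First 512"
next_count = 709 - 512 # "Next 197 (709-512)"
rest_count = total_sequences - first_count - next_count

first_ratio = 0.9950079702120929
next_ratio = 0.003994232243462514
rest_ratio = 0.000997797544444623

# The three buckets should cover every sequence occurrence.
ratio_sum = math.fsum([first_ratio, next_ratio, rest_ratio])

print(next_count, rest_count, ratio_sum)
```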
- Processing end: 2022-12-14 18:10:59.258906