uchardet/script/BuildLangModelLogs/LangCroatianModel.log
Jehan eb8308d50a src, script: regenerate all existing language models.
Now making sure that we have a generic language model working with UTF-8
for all 26 supported models which had single-byte encoding support until
now.
2022-12-14 00:23:13 +01:00


= Logs of language model for Croatian (hr) =
- Generated by BuildLangModel.py
- Started: 2021-03-16 19:09:36.740256
- Maximum depth: 4
- Max number of pages: 100
== Parsed pages ==
Fizika čvrstog stanja (revision 5777686)
Agregatno stanje (revision 5764830)
Alnico (revision 3915185)
Aluminij (revision 5755266)
Amorfna tvar (revision 5392804)
Antimon (revision 5435171)
Antoine Henri Becquerel (revision 5556977)
Apsolutna nula (revision 5482633)
Arsen (revision 5752189)
Arthur Holly Compton (revision 5313150)
Atom (revision 5730600)
Atomska jezgra (revision 5731544)
Bell Labs (revision 4769518)
Bor (element) (revision 5549612)
Brian Josephson (revision 5446101)
Cink (revision 5556719)
Comptonov učinak (revision 5313303)
Coulombov zakon (revision 5436283)
Dijamant (revision 5775412)
Dimenzija (revision 5379791)
Dinastija Han (revision 5772176)
Dislokacija (revision 5431109)
EV (revision 5430610)
Eksponencijalna funkcija (revision 5523460)
Električna struja (revision 5653050)
Električna vodljivost (revision 5376333)
Električni izolator (revision 5258197)
Električni luk (revision 5437134)
Električni naboj (revision 5774260)
Električni otpor (revision 4904596)
Električni vodič (revision 5334900)
Električno polje (revision 5247154)
Elektrolit (revision 4858367)
Elektromagnetsko zračenje (revision 5760956)
Elektron (revision 5774256)
Elektronika (revision 5556766)
Elektronska konfiguracija (revision 4949752)
Elektronski mikroskop (revision 5439229)
Elektrotehnika (revision 5254565)
Energetika (revision 4908587)
Energija (revision 5767106)
Fermi-Diracova statistika (revision 3934172)
Feromagnetizam (revision 5392729)
Fizika (revision 5777684)
Fizika kondenzirane tvari (revision 5455580)
Fizikalna veličina (revision 5497656)
Fosfor (revision 5556869)
Fotodioda (revision 5235215)
Fotoelektrični učinak (revision 5632628)
Foton (revision 5635311)
Fotonaponski sustavi (revision 5430012)
Francuski jezik (revision 5771033)
Galij (revision 5437600)
Genitiv (revision 5767472)
Germanij (revision 5437677)
Helij (revision 5556716)
Henri (revision 3922500)
Indij (revision 5439698)
Integrirani krug (revision 5500904)
Ion (revision 5750157)
Ioniziranje (revision 5318213)
John Bardeen (revision 5182165)
Kadmij (revision 5440736)
Kelvin (revision 5240179)
Keramika (revision 5655772)
Kinetička energija (revision 5753997)
Klasična mehanika (revision 5656259)
Kompas (revision 5750313)
Kondenzacija (revision 5492249)
Kondenzirana tvar (revision 5455580)
Konstrukcija (revision 4680450)
Kovalentna veza (revision 5751506)
Kristal (revision 5455704)
Kristalna rešetka (revision 5562348)
Kristalografija (revision 4105956)
Krutine (revision 5196995)
Kubični kristalni sustav (revision 5610803)
Kubični metar (revision 5082862)
Kvantna mehanika (revision 5777687)
Latinski jezik (revision 5663325)
Luminiscencija (revision 5052601)
Magnet (revision 5743549)
Magnetizam (revision 5728489)
Magnetska permeabilnost (revision 4675996)
Magnetska vodljivost (revision 4899860)
Magnetski moment (revision 5489691)
Magnetsko polje (revision 5671905)
Materijal (revision 5748275)
Mehanika (revision 5777691)
Metal (revision 5505185)
Metan (revision 5611051)
Metar (revision 5325605)
Mjerna veličina (revision 5497656)
Molekula (revision 5773190)
Molekule (revision 5773190)
Napon (revision 5556720)
Niskotemperaturna fizika (revision 4657522)
Njemački jezik (revision 5710175)
Optika (revision 5316843)
== End of Parsed pages ==
- Wikipedia parsing ended at: 2021-03-16 19:18:55.485669
49 characters appeared 643453 times.
First 31 characters:
[ 0] Char a: 10.677081309746011 %
[ 1] Char i: 9.900023777960474 %
[ 2] Char e: 9.741037806957152 %
[ 3] Char o: 8.583843730622128 %
[ 4] Char n: 6.852404138297591 %
[ 5] Char t: 5.517885533209108 %
[ 6] Char r: 5.292383437484944 %
[ 7] Char j: 5.03952891664193 %
[ 8] Char s: 4.730104607484929 %
[ 9] Char k: 4.032773178460587 %
[10] Char l: 3.9395262746463224 %
[11] Char m: 3.8557594727198414 %
[12] Char u: 3.7656207990327184 %
[13] Char v: 3.0636270248176634 %
[14] Char p: 2.654583940085756 %
[15] Char d: 2.6340696212466175 %
[16] Char z: 1.8657151338170777 %
[17] Char g: 1.5614194043698606 %
[18] Char č: 1.1537750231951673 %
[19] Char b: 1.1304632972416013 %
[20] Char c: 1.081042438220041 %
[21] Char h: 0.7697531909867543 %
[22] Char f: 0.4845730768214617 %
[23] Char š: 0.4174353060751912 %
[24] Char ž: 0.365217039939203 %
[25] Char ć: 0.35123000436706336 %
[26] Char đ: 0.22596833024323454 %
[27] Char y: 0.14857340007739495 %
[28] Char w: 0.06558365568269944 %
[29] Char x: 0.04988709354063157 %
[30] Char q: 0.030149832233278887 %
The first 31 characters have an accumulated ratio of 0.9998103979622444.
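The accumulated ratio reported above can be cross-checked by summing the per-character percentages listed in this log and dividing by 100. The following is a minimal sketch under that assumption only; it is not taken from BuildLangModel.py, and the names `CHAR_LINE` and `accumulated_ratio` are hypothetical.

```python
import re

# Matches log lines such as "[ 0] Char a: 10.677081309746011 %".
# The pattern is an assumption based on the line layout seen in this log.
CHAR_LINE = re.compile(r"^\[\s*(\d+)\]\s+Char\s+(\S):\s+([0-9.]+)\s*%\s*$")


def accumulated_ratio(log_lines):
    """Sum the per-character percentages and return them as a ratio in [0, 1]."""
    total_percent = 0.0
    for line in log_lines:
        match = CHAR_LINE.match(line.strip())
        if match:
            # Group 3 holds the percentage; other lines are ignored.
            total_percent += float(match.group(3))
    return total_percent / 100.0


# First three entries copied from the frequency list above.
sample = [
    "[ 0] Char a: 10.677081309746011 %",
    "[ 1] Char i: 9.900023777960474 %",
    "[ 2] Char e: 9.741037806957152 %",
]
print(accumulated_ratio(sample))
```

Feeding all 31 character lines through the same function should give a value that agrees with the logged ratio up to floating-point rounding.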
725 sequences found.
First 512 (typical positive ratio): 0.9990568119867879
Next 512 (512-1024): 0.00365217039939203
Rest: -4.0440741033709315e-17
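As an illustration of how a "first N sequences" ratio of this kind could be computed, the sketch below counts pairs of adjacent characters and reports the share covered by the most frequent ones. It assumes that "sequences" means adjacent pairs of frequent characters, which this log does not itself state; the function name, the sample text, and the alphabet are hypothetical, and this is not the code BuildLangModel.py runs.

```python
from collections import Counter


def top_sequence_ratio(text, alphabet, top_n):
    """Share of all counted character pairs covered by the top_n most frequent pairs."""
    # Count only pairs where both characters belong to the chosen alphabet.
    pairs = Counter(
        first + second
        for first, second in zip(text, text[1:])
        if first in alphabet and second in alphabet
    )
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in pairs.most_common(top_n))
    return covered / total


# Hypothetical input: a short lowercase sample and the Croatian letters.
sample_text = "atomska jezgra i električni naboj"
croatian_letters = set("abcčćdđefghijklmnoprsštuvzž")
print(top_sequence_ratio(sample_text, croatian_letters, 5))
```

With a real corpus, `top_n=512` would correspond to the "First 512" figure under the assumption stated above.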
- Processing end: 2021-03-16 19:18:56.030353