uchardet/script/BuildLangModelLogs/LangFinnishModel.log

= Logs of language model for Finnish (fi) =
- Generated by BuildLangModel.py
- Started: 2022-12-14 23:51:17.009255
- Maximum depth: 4
- Max number of pages: 200
== Parsed pages ==
Yhdistynyt kuningaskunta (revision 21066772)
Sherlock Holmesin seikkailut (televisiosarja) (revision 19345728)
Englannin sisällissota (revision 20681585)
Damien Hirst (revision 20254144)
Coldplay (revision 20996509)
Puola (revision 21098204)
Vanguard-luokka (sukellusvene) (revision 20477212)
Unkari (revision 21093822)
Antigua ja Barbuda (revision 20834245)
Eurotunneli (revision 20871791)
Urdu (revision 21069477)
Gibraltar (revision 21007055)
Tuvalu (revision 20860615)
Fix You (revision 21005448)
Manchester (revision 20895719)
Blur (revision 20771440)
Jimmy Carter (revision 20860817)
Arcade Fire (revision 21107055)
Väli-Amerikka (revision 20603598)
Charles Ramirez (revision 20660516)
Kioa (revision 20880316)
UEFA (revision 20678496)
23. lokakuuta (revision 20918625)
Sadrin kieli (revision 20941171)
Antigua ja Barbudan lippu (revision 20650267)
Kuusi Napoleonia (revision 18401994)
Lontoon yliopisto (revision 18248559)
Torpedoputki (revision 19291207)
Tinapilli (revision 20621327)
Englannin kieli (revision 20829497)
Rolling Stone (revision 20937647)
Luettelo valtioista väkiluvun mukaan (revision 21110123)
Varjojen maat (revision 19455482)
Arthur Conan Doyle (revision 20650922)
Tuvalu mo te Atua (revision 20825014)
Vatikaani (revision 21017236)
Trinidad ja Tobago (revision 21082777)
The Pretenders (revision 20700307)
Mauritius (revision 21031101)
Hindustani (revision 16713728)
Napoleon I (revision 20998435)
Bristol (revision 20657021)
Astute-luokka (revision 17775821)
Bronisław Komorowski (revision 20657167)
Antiikintutkimus (revision 20815035)
Latvia (revision 21109727)
Luterilaisuus (revision 21108515)
Margaret Thatcher (revision 20827653)
Liikevaihto (revision 19244199)
Kaarle II (Englanti) (revision 21022937)
Lahna (revision 20718556)
Pariisi (revision 21098083)
Ohio-luokka (revision 20916657)
Säätyläiset (revision 20954041)
Karaatti (revision 20827014)
Kaarle I (Englanti) (revision 21028904)
Englanti (revision 21068035)
Parlophone Records (revision 20332700)
Postpositio (revision 19287247)
Yhdysvaltain Neitsytsaaret (revision 20804420)
Sherlock Holmes (revision 21038050)
Iso-Britannia (revision 21066772)
Arabialainen kirjaimisto (revision 20475019)
Charles Saatchi (revision 14917996)
Gibraltarin pääministeri (revision 16808956)
Newcastle (revision 21050270)
Väestötiheys (revision 20092734)
Internet (revision 21025514)
Marin tasavalta (revision 20896113)
Teollisuus (revision 20956826)
Doverinsalmi (revision 20369550)
Boston (revision 20992854)
Israelin kansalliskirjasto (revision 20854961)
Gibtelecom (revision 21007077)
Sindhi (revision 20948345)
Unkarin sosialistinen työväenpuolue (revision 18743747)
Kookosmaito (revision 21094033)
Arabian kieli (revision 21060827)
BRIT Awards (revision 19302405)
Gotthardin pohjatunneli (revision 21061923)
Mecsek (revision 20921164)
British Airways (revision 20984077)
Gawarin kieli (revision 13035766)
ITV (revision 20415578)
Megawatti (revision 20639645)
Bangladesh (revision 21101257)
HMS Victorious (S29) (revision 20088762)
Kivennuoliainen (revision 20945527)
Tuvalun dollari (revision 16801336)
Irena Szewińska (revision 19753210)
Ilja Leonard Pfeijffer (revision 21048419)
Neitsyt Maria (revision 20896050)
BBC Two (revision 20819614)
Tanganjika (revision 20446073)
Khowarin kieli (revision 19310691)
Saint Lucia (revision 21065825)
Bundelin kieli (revision 14167989)
Lontoo (revision 20946337)
Sinhali (revision 19311081)
Johnstonin atolli (revision 18905507)
Guinnessin ennätyskirja (revision 20839808)
Montserrat (revision 21048411)
Eurostar International (revision 20678739)
Jamaika (revision 21055658)
HMS Vanguard (S28) (revision 20088459)
Yanito (revision 20355121)
1965 (revision 20952728)
BBC (revision 20873802)
1984 (revision 21076882)
Ravintola (revision 20579600)
Vähittäiskauppa (revision 21059296)
Krzysztof Komeda (revision 17942536)
François Mitterrand (revision 20343193)
Lublin (revision 19195589)
Pitkäperjantai (revision 20423940)
Johannes Paavali II (revision 21066870)
Karibia (revision 20786667)
7. kesäkuuta (revision 20953482)
Deutsche Bahn (revision 21040025)
Gibraltar Chronicle (revision 21007056)
Alankomaat (revision 21066782)
Englannin kuningaskunta (revision 20703315)
Grand Hotel Europa (revision 20256757)
Julkisen palvelun yleisradiotoiminta (revision 20950803)
Gaston Browne (revision 20836659)
Monaco (revision 20905943)
Tokaji (revision 20197418)
Csongrád (lääni) (revision 19494157)
Nato (revision 21049954)
Venetsian biennaali (revision 20900561)
Yleisradiotoiminta (revision 20950803)
Britannia (revision 21066772)
Malawi (Commonwealth realm) (revision 20446067)
Platina (revision 20315754)
Permin Komi (revision 20926038)
UGM-27 Polaris (revision 20627059)
Yhdysvaltain dollari (revision 21093482)
Samarkand (revision 20861839)
Uganda (Commonwealth realm) (revision 20446074)
Ympäristönsuojelu (revision 20650048)
Guadeloupe (revision 20300349)
Rehtori (revision 20935388)
Joseph Bell (revision 20958309)
Fejér (revision 15207333)
OFC Nations Cup (revision 15982936)
Metsäsuomalaiset (revision 21027788)
Nicaragua (revision 21069170)
Westminsterin palatsi (revision 20640014)
Sosiaalihistoria (revision 20334015)
Gabriel (revision 21006459)
Pikkuviha (revision 20526971)
Virtual International Authority File (revision 21019677)
Uusi-Guinea (revision 20516634)
Kuuba (revision 21030857)
Sanya Richards-Ross (revision 20944016)
Säätiö (revision 20613246)
Dover (Englanti) (revision 20827636)
Musiikki (revision 20907775)
Suomen kieli (revision 21076647)
Murre (revision 20584718)
Ensimmäinen maailmansota (revision 21038559)
Edam (juusto) (revision 19473248)
Henrik VIII (Englanti) (revision 21014997)
Christopher Wren (revision 20136409)
Mongolia (revision 20916298)
Al-Andalus (revision 21099577)
Vesa-Pekka Rannikko (revision 20633828)
Devanagari (revision 20666572)
1940 (revision 20989523)
Kookospähkinä (revision 21028626)
SBB-CFF-FFS (revision 20940583)
Kaupunkivaltio (revision 21025690)
Murmanskin alue (revision 21049065)
Jean-Claude Juncker (revision 20875578)
Tim Berners-Lee (revision 20522743)
Armenia (revision 20987696)
Békés (lääni) (revision 19786715)
== End of Parsed pages ==
- Wikipedia parsing ended at: 2022-12-14 23:54:09.799129
76 characters appeared 1536977 times.
Most Frequent characters:
[ 0] Char a: 12.614697552403193 %
[ 1] Char i: 11.0243029010844 %
[ 2] Char t: 8.821992781934929 %
[ 3] Char n: 8.82095177741762 %
[ 4] Char e: 7.688859364844107 %
[ 5] Char s: 7.632775246474085 %
[ 6] Char l: 6.030083729294583 %
[ 7] Char o: 5.542438175717659 %
[ 8] Char u: 5.201379070734305 %
[ 9] Char k: 4.7905726630912495 %
[10] Char r: 3.1331633459706945 %
[11] Char m: 3.0228168671359428 %
[12] Char ä: 3.0008256467077907 %
[13] Char v: 2.2698452872098933 %
[14] Char j: 1.9956056596813094 %
[15] Char p: 1.7456995127448232 %
[16] Char h: 1.7247492968339802 %
[17] Char y: 1.5948189205173533 %
[18] Char d: 1.110296380492356 %
[19] Char g: 0.48133446369073835 %
[20] Char b: 0.4582371759629455 %
[21] Char ö: 0.40117711585794713 %
[22] Char c: 0.3493220783394937 %
[23] Char f: 0.20696471059749108 %
[24] Char w: 0.14886364597518376 %
[25] Char z: 0.06128914095656604 %
[26] Char x: 0.028497498661333255 %
[27] Char é: 0.01450900045999387 %
[28] Char q: 0.013858372636675761 %
The first 29 characters have an accumulated ratio of 0.9992992738342864.
The first 4 characters have an accumulated ratio of 0.4128194501284014.
All characters whose order is over 17 have an accumulated ratio of 0.03274349583630724.
1146 sequences found.
First 417 (typical positive ratio): 0.9950442901604022
Next 226 (643-417): 0.003959181230548281
Rest: 0.0009965286090495296
- Processing end: 2022-12-14 23:54:09.877879
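
The accumulated ratios in this log can be cross-checked against the per-character percentages. The sketch below is only a sanity check on the logged numbers, not the code used by BuildLangModel.py; the names `PERCENTAGES` and `accumulated_ratio` are hypothetical. It assumes that "order is over 17" refers to ranks 18 to 28 of the listed characters, which is the reading that matches the logged value.

```python
# Percentages copied from the "Most Frequent characters" table, ranks 0..28.
PERCENTAGES = [
    12.614697552403193, 11.0243029010844, 8.821992781934929,
    8.82095177741762, 7.688859364844107, 7.632775246474085,
    6.030083729294583, 5.542438175717659, 5.201379070734305,
    4.7905726630912495, 3.1331633459706945, 3.0228168671359428,
    3.0008256467077907, 2.2698452872098933, 1.9956056596813094,
    1.7456995127448232, 1.7247492968339802, 1.5948189205173533,
    1.110296380492356, 0.48133446369073835, 0.4582371759629455,
    0.40117711585794713, 0.3493220783394937, 0.20696471059749108,
    0.14886364597518376, 0.06128914095656604, 0.028497498661333255,
    0.01450900045999387, 0.013858372636675761,
]


def accumulated_ratio(percentages, start, stop):
    """Sum percentages[start:stop] and convert from percent to a ratio."""
    return sum(percentages[start:stop]) / 100.0


# Ratios as reported in the log, recomputed from the table.
first_4 = accumulated_ratio(PERCENTAGES, 0, 4)
first_29 = accumulated_ratio(PERCENTAGES, 0, 29)
over_17 = accumulated_ratio(PERCENTAGES, 18, 29)  # assumed reading: ranks 18..28

# The three sequence ratios (first 417, next 226, rest) should cover everything.
sequence_total = (
    0.9950442901604022 + 0.003959181230548281 + 0.0009965286090495296
)

print("first 4 characters:", first_4)
print("first 29 characters:", first_29)
print("order over 17:", over_17)
print("sequence ratios total:", sequence_total)
```

Each recomputed value should agree with the corresponding log line up to floating-point rounding, and the three sequence ratios should add up to 1.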