mirror of
https://gitlab.freedesktop.org/uchardet/uchardet.git
synced 2025-12-06 16:56:40 +08:00
= Logs of language model for Finnish (fi) =

- Generated by BuildLangModel.py
- Started: 2022-12-14 23:51:17.009255
- Maximum depth: 4
- Max number of pages: 200

== Parsed pages ==

Yhdistynyt kuningaskunta (revision 21066772)
Sherlock Holmesin seikkailut (televisiosarja) (revision 19345728)
Englannin sisällissota (revision 20681585)
Damien Hirst (revision 20254144)
Coldplay (revision 20996509)
Puola (revision 21098204)
Vanguard-luokka (sukellusvene) (revision 20477212)
Unkari (revision 21093822)
Antigua ja Barbuda (revision 20834245)
Eurotunneli (revision 20871791)
Urdu (revision 21069477)
Gibraltar (revision 21007055)
Tuvalu (revision 20860615)
Fix You (revision 21005448)
Manchester (revision 20895719)
Blur (revision 20771440)
Jimmy Carter (revision 20860817)
Arcade Fire (revision 21107055)
Väli-Amerikka (revision 20603598)
Charles Ramirez (revision 20660516)
Kioa (revision 20880316)
UEFA (revision 20678496)
23. lokakuuta (revision 20918625)
Sadrin kieli (revision 20941171)
Antigua ja Barbudan lippu (revision 20650267)
Kuusi Napoleonia (revision 18401994)
Lontoon yliopisto (revision 18248559)
Torpedoputki (revision 19291207)
Tinapilli (revision 20621327)
Englannin kieli (revision 20829497)
Rolling Stone (revision 20937647)
Luettelo valtioista väkiluvun mukaan (revision 21110123)
Varjojen maat (revision 19455482)
Arthur Conan Doyle (revision 20650922)
Tuvalu mo te Atua (revision 20825014)
Vatikaani (revision 21017236)
Trinidad ja Tobago (revision 21082777)
The Pretenders (revision 20700307)
Mauritius (revision 21031101)
Hindustani (revision 16713728)
Napoleon I (revision 20998435)
Bristol (revision 20657021)
Astute-luokka (revision 17775821)
Bronisław Komorowski (revision 20657167)
Antiikintutkimus (revision 20815035)
Latvia (revision 21109727)
Luterilaisuus (revision 21108515)
Margaret Thatcher (revision 20827653)
Liikevaihto (revision 19244199)
Kaarle II (Englanti) (revision 21022937)
Lahna (revision 20718556)
Pariisi (revision 21098083)
Ohio-luokka (revision 20916657)
Säätyläiset (revision 20954041)
Karaatti (revision 20827014)
Kaarle I (Englanti) (revision 21028904)
Englanti (revision 21068035)
Parlophone Records (revision 20332700)
Postpositio (revision 19287247)
Yhdysvaltain Neitsytsaaret (revision 20804420)
Sherlock Holmes (revision 21038050)
Iso-Britannia (revision 21066772)
Arabialainen kirjaimisto (revision 20475019)
Charles Saatchi (revision 14917996)
Gibraltarin pääministeri (revision 16808956)
Newcastle (revision 21050270)
Väestötiheys (revision 20092734)
Internet (revision 21025514)
Marin tasavalta (revision 20896113)
Teollisuus (revision 20956826)
Doverinsalmi (revision 20369550)
Boston (revision 20992854)
Israelin kansalliskirjasto (revision 20854961)
Gibtelecom (revision 21007077)
Sindhi (revision 20948345)
Unkarin sosialistinen työväenpuolue (revision 18743747)
Kookosmaito (revision 21094033)
Arabian kieli (revision 21060827)
BRIT Awards (revision 19302405)
Gotthardin pohjatunneli (revision 21061923)
Mecsek (revision 20921164)
British Airways (revision 20984077)
Gawarin kieli (revision 13035766)
ITV (revision 20415578)
Megawatti (revision 20639645)
Bangladesh (revision 21101257)
HMS Victorious (S29) (revision 20088762)
Kivennuoliainen (revision 20945527)
Tuvalun dollari (revision 16801336)
Irena Szewińska (revision 19753210)
Ilja Leonard Pfeijffer (revision 21048419)
Neitsyt Maria (revision 20896050)
BBC Two (revision 20819614)
Tanganjika (revision 20446073)
Khowarin kieli (revision 19310691)
Saint Lucia (revision 21065825)
Bundelin kieli (revision 14167989)
Lontoo (revision 20946337)
Sinhali (revision 19311081)
Johnstonin atolli (revision 18905507)
Guinnessin ennätyskirja (revision 20839808)
Montserrat (revision 21048411)
Eurostar International (revision 20678739)
Jamaika (revision 21055658)
HMS Vanguard (S28) (revision 20088459)
Yanito (revision 20355121)
1965 (revision 20952728)
BBC (revision 20873802)
1984 (revision 21076882)
Ravintola (revision 20579600)
Vähittäiskauppa (revision 21059296)
Krzysztof Komeda (revision 17942536)
François Mitterrand (revision 20343193)
Lublin (revision 19195589)
Pitkäperjantai (revision 20423940)
Johannes Paavali II (revision 21066870)
Karibia (revision 20786667)
7. kesäkuuta (revision 20953482)
Deutsche Bahn (revision 21040025)
Gibraltar Chronicle (revision 21007056)
Alankomaat (revision 21066782)
Englannin kuningaskunta (revision 20703315)
Grand Hotel Europa (revision 20256757)
Julkisen palvelun yleisradiotoiminta (revision 20950803)
Gaston Browne (revision 20836659)
Monaco (revision 20905943)
Tokaji (revision 20197418)
Csongrád (lääni) (revision 19494157)
Nato (revision 21049954)
Venetsian biennaali (revision 20900561)
Yleisradiotoiminta (revision 20950803)
Britannia (revision 21066772)
Malawi (Commonwealth realm) (revision 20446067)
Platina (revision 20315754)
Permin Komi (revision 20926038)
UGM-27 Polaris (revision 20627059)
Yhdysvaltain dollari (revision 21093482)
Samarkand (revision 20861839)
Uganda (Commonwealth realm) (revision 20446074)
Ympäristönsuojelu (revision 20650048)
Guadeloupe (revision 20300349)
Rehtori (revision 20935388)
Joseph Bell (revision 20958309)
Fejér (revision 15207333)
OFC Nations Cup (revision 15982936)
Metsäsuomalaiset (revision 21027788)
Nicaragua (revision 21069170)
Westminsterin palatsi (revision 20640014)
Sosiaalihistoria (revision 20334015)
Gabriel (revision 21006459)
Pikkuviha (revision 20526971)
Virtual International Authority File (revision 21019677)
Uusi-Guinea (revision 20516634)
Kuuba (revision 21030857)
Sanya Richards-Ross (revision 20944016)
Säätiö (revision 20613246)
Dover (Englanti) (revision 20827636)
Musiikki (revision 20907775)
Suomen kieli (revision 21076647)
Murre (revision 20584718)
Ensimmäinen maailmansota (revision 21038559)
Edam (juusto) (revision 19473248)
Henrik VIII (Englanti) (revision 21014997)
Christopher Wren (revision 20136409)
Mongolia (revision 20916298)
Al-Andalus (revision 21099577)
Vesa-Pekka Rannikko (revision 20633828)
Devanagari (revision 20666572)
1940 (revision 20989523)
Kookospähkinä (revision 21028626)
SBB-CFF-FFS (revision 20940583)
Kaupunkivaltio (revision 21025690)
Murmanskin alue (revision 21049065)
Jean-Claude Juncker (revision 20875578)
Tim Berners-Lee (revision 20522743)
Armenia (revision 20987696)
Békés (lääni) (revision 19786715)

== End of Parsed pages ==

- Wikipedia parsing ended at: 2022-12-14 23:54:09.799129

76 characters appeared 1536977 times.

Most Frequent characters:
[ 0] Char a: 12.614697552403193 %
[ 1] Char i: 11.0243029010844 %
[ 2] Char t: 8.821992781934929 %
[ 3] Char n: 8.82095177741762 %
[ 4] Char e: 7.688859364844107 %
[ 5] Char s: 7.632775246474085 %
[ 6] Char l: 6.030083729294583 %
[ 7] Char o: 5.542438175717659 %
[ 8] Char u: 5.201379070734305 %
[ 9] Char k: 4.7905726630912495 %
[10] Char r: 3.1331633459706945 %
[11] Char m: 3.0228168671359428 %
[12] Char ä: 3.0008256467077907 %
[13] Char v: 2.2698452872098933 %
[14] Char j: 1.9956056596813094 %
[15] Char p: 1.7456995127448232 %
[16] Char h: 1.7247492968339802 %
[17] Char y: 1.5948189205173533 %
[18] Char d: 1.110296380492356 %
[19] Char g: 0.48133446369073835 %
[20] Char b: 0.4582371759629455 %
[21] Char ö: 0.40117711585794713 %
[22] Char c: 0.3493220783394937 %
[23] Char f: 0.20696471059749108 %
[24] Char w: 0.14886364597518376 %
[25] Char z: 0.06128914095656604 %
[26] Char x: 0.028497498661333255 %
[27] Char é: 0.01450900045999387 %
[28] Char q: 0.013858372636675761 %

The first 29 characters have an accumulated ratio of 0.9992992738342864.
The first 4 characters have an accumulated ratio of 0.4128194501284014.
All characters whose order is over 17 have an accumulated ratio of 0.03274349583630724.

1146 sequences found.

First 417 (typical positive ratio): 0.9950442901604022
Next 226 (643-417): 0.003959181230548281
Rest: 0.0009965286090495296

- Processing end: 2022-12-14 23:54:09.877879