uchardet/script/BuildLangModelLogs/LangIrishModel.log

= Logs of language model for Irish (ga) =
- Generated by BuildLangModel.py
- Started: 2022-12-14 18:05:56.518022
- Maximum depth: 4
- Max number of pages: 200
== Parsed pages ==
Tracy Caldwell Dyson (revision 972597)
Tointeálaí spáis (revision 1049998)
Spásaire (revision 1066024)
Ceimiceoir (revision 1069325)
SAM (revision 1117044)
Fisiceoir (revision 1070391)
Stáit Aontaithe Mheiriceá (revision 1117044)
Rúisis (revision 1106700)
Ceimic (revision 1118628)
IMDb (revision 1120126)
14 Lúnasa (revision 1096367)
Príomhchathair (revision 1112302)
Dlí Raoult (revision 1000192)
Meath radaighníomhach (revision 1119072)
Comhghuaillithe (revision 1116753)
1929 (revision 1100654)
Vermont (revision 1058965)
Fyodor Dostoyevsky (revision 1118623)
Albert Einstein (revision 1112165)
Pápa Pius VII (revision 1012016)
10ú haois (revision 739954)
2005 (revision 1095195)
Madonna (revision 1114144)
1802 (revision 1120813)
1986 (revision 1116382)
An tAontas Sóivéadach (revision 1012309)
Benjamin Franklin (revision 998375)
Antoine Laurent Lavoisier (revision 1101121)
An Sciath (revision 1107011)
An Laidin (revision 1114200)
Géarchéim airgeadais 2007-2008 (revision 1107877)
Teirmidinimic (revision 814554)
Guglielmo Marconi (revision 1063391)
Ernst Mach (revision 1027300)
Fréamh an Eolais (revision 1024048)
1790 (revision 1095417)
Ionsaí ar Pearl Harbor (revision 1109393)
Cúmánaigh (revision 1111797)
Cogadh Réabhlóideach Mheiriceá (revision 1106269)
Thomas Edison (revision 977995)
Meicsiceo (revision 1105054)
Clement Coughlan (revision 1027709)
Robert Millikan (revision 995498)
Intleacht (revision 1118184)
Eamhnú (revision 685516)
Inneall (revision 656989)
Marie Curie (revision 972274)
Daniel Gabriel Fahrenheit (revision 992356)
Fisic (revision 1025283)
Gníomhaireacht Spáis na hEorpa (revision 1118858)
1956 (revision 1120816)
Bratach Stáit Aontaithe Mheiriceá (revision 885909)
Ór (revision 1034373)
Walter Scott (revision 973708)
Radaighníomhaíocht (revision 1119072)
Spásárthach (revision 758622)
Conradh na Náisiún (revision 1108801)
Niels Bohr (revision 1101167)
Oklahoma (revision 980194)
Liosta Institiúidí Pleanála Teanga ar fud an Domhain (revision 652223)
21 Lúnasa (revision 1096218)
Giúdachas (revision 1057667)
Stáisiún spáis (revision 823620)
Missouri (revision 1109999)
1958 (revision 1095248)
Titanic (scannán 1997) (revision 1073341)
An tAigéan Ciúin (revision 1110855)
3 Lúnasa (revision 1096165)
Henri Becquerel (revision 1056324)
Tony Scannell (revision 971451)
13 Lúnasa (revision 1106170)
Antoine Lavoisier (revision 1101121)
Discovery (revision 1070352)
28 Lúnasa (revision 1096170)
29 Lúnasa (revision 1094422)
1981 (revision 1100770)
Bunreacht na Stát Aontaithe (revision 1089293)
Tír (revision 1048967)
11 Lúnasa (revision 1095733)
An Téalainn (revision 997128)
An Ríocht Aontaithe (revision 1119694)
9 Lúnasa (revision 1096168)
Conradh Versailles (revision 1085221)
Pennsylvania (revision 1058964)
Harold Macmillan (revision 1030157)
19ú haois (revision 1083522)
Creideamh (revision 1049197)
1996 (revision 1100768)
Nicearagua (revision 1106117)
An Bheilg (revision 1112792)
Sean-Ghréigis (revision 1054688)
Gaeilge (revision 1113730)
Síocháin (revision 1039465)
1945 (revision 1120818)
Réabhlóid na Feabhra (revision 1068209)
Aigéad hidreaclórach (revision 1076385)
Poblacht na hÉireann (revision 1118646)
25 Lúnasa (revision 1089322)
Gnó (revision 1025310)
27 Lúnasa (revision 1096378)
8 Nollaig (revision 1120310)
Pápa Pius XI (revision 1059455)
Stair (revision 996477)
Contae Chorcaí (revision 1108805)
Veirgil (revision 973705)
Gabriel Fahrenheit (revision 992356)
Lárionad Taighde na dTeangacha Dúchasacha (An Fhionlainn) (revision 832954)
931 (revision 700763)
Dia (revision 1113800)
Lucht dóiteáin (revision 659094)
Aigéan (revision 1106787)
Breitheamh (revision 1067256)
An Bheilís (revision 975457)
Teocht (revision 1094881)
Benoît Fourneyron (revision 1034033)
S. Scott Bullock (revision 1072412)
1830 (revision 1095380)
Raidió (revision 1110288)
Georgia (stát S.A.M.) (revision 999135)
Aoine (revision 1051861)
Vicipéid (revision 1117521)
1780í (revision 1047267)
7 Lúnasa (revision 1096108)
1920í (revision 1047088)
An Fhionlainn (revision 1100573)
1942 (revision 1095266)
Poblacht (revision 828806)
Coilíneachas (revision 897710)
Caitliceachas (revision 1115927)
An Ghinéiv (revision 1059159)
Impireacht na Rúise (revision 1116927)
Ráta úis (revision 1110345)
Ciara Conway (revision 1022483)
1930í (revision 740221)
Dlí (revision 1025297)
Breatnais (revision 1120118)
1971 (revision 1120763)
John Lydon (revision 979043)
An India (revision 1119349)
1995 (revision 1120752)
An Seiseamhán (revision 1107012)
Fionlainnis (revision 1113969)
Leathré (revision 1094786)
Adam Mickiewicz (revision 1059506)
Morelos (revision 1008223)
An Chríostaíocht (revision 1111524)
Chadwick Boseman (revision 1026926)
Querétaro (revision 982615)
1913 (revision 1095297)
Cumadóir (revision 797877)
Idaho (revision 986069)
Panthéon (revision 1101065)
Hélène Carrère d'Encausse (revision 1097053)
An Domhan (revision 1070377)
Iriseoir (revision 1109397)
Pearl Harbor (revision 1116761)
8 Lúnasa (revision 1100776)
5 Feabhra (revision 1096085)
Aisteoir (revision 1058523)
Enid Blyton (revision 1020472)
Feithicil armúrtha troda (revision 685853)
== End of Parsed pages ==
- Wikipedia parsing ended at: 2022-12-14 18:08:23.899888
50 characters appeared 510350 times.
Most Frequent characters:
[ 0] Char a: 15.64201038502988 %
[ 1] Char i: 10.403056725776427 %
[ 2] Char n: 8.308023905163124 %
[ 3] Char h: 7.609875575585383 %
[ 4] Char r: 6.313118448123836 %
[ 5] Char e: 6.009797198001372 %
[ 6] Char s: 5.175859704124621 %
[ 7] Char t: 4.867052023121388 %
[ 8] Char c: 4.800039188792005 %
[ 9] Char o: 4.058391300088174 %
[10] Char l: 4.055256196727735 %
[11] Char d: 3.1166846281963356 %
[12] Char g: 2.863329087880866 %
[13] Char m: 2.718722445380621 %
[14] Char u: 2.1199177035367884 %
[15] Char b: 1.9925541295189575 %
[16] Char á: 1.8597041246203587 %
[17] Char í: 1.8483393749387673 %
[18] Char é: 1.3028313902223965 %
[19] Char f: 1.1746840403644558 %
[20] Char ó: 0.9724698736161459 %
[21] Char ú: 0.954834917213677 %
[22] Char p: 0.8300186146762025 %
[23] Char k: 0.22847065739198588 %
[24] Char v: 0.2212207308709709 %
[25] Char y: 0.21612618791025767 %
[26] Char w: 0.11227588909571862 %
[27] Char j: 0.0944449887332223 %
[28] Char z: 0.05310081316743412 %
[29] Char x: 0.028215930243950228 %
[30] Char q: 0.017830900362496325 %
The first 31 characters have an accumulated ratio of 0.9996825707847552.
The first 3 characters have an accumulated ratio of 0.3435309101596943.
All characters whose order is over 19 have an accumulated ratio of 0.03729009503282062.
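The accumulated ratios reported above can be cross-checked against the per-character percentages in the table. The following is a minimal sketch, not the logic of BuildLangModel.py itself: the helper name `accumulated_ratio` and the variable names are hypothetical, and the numbers are copied from the table (orders 0 to 2 and orders 20 to 30).

```python
# Percentages copied from the "Most Frequent characters" table in this log.
# Orders 0..2 (a, i, n):
top3_percentages = [15.64201038502988, 10.403056725776427, 8.308023905163124]

# Orders 20..30 (ó, ú, p, k, v, y, w, j, z, x, q), i.e. "order is over 19":
tail_percentages = [
    0.9724698736161459, 0.954834917213677, 0.8300186146762025,
    0.22847065739198588, 0.2212207308709709, 0.21612618791025767,
    0.11227588909571862, 0.0944449887332223, 0.05310081316743412,
    0.028215930243950228, 0.017830900362496325,
]


def accumulated_ratio(percentages):
    """Sum per-character percentages and convert the total to a ratio in [0, 1]."""
    return sum(percentages) / 100.0


top3_ratio = accumulated_ratio(top3_percentages)
tail_ratio = accumulated_ratio(tail_percentages)

# Compare with the values printed in the log, allowing for float rounding.
print(abs(top3_ratio - 0.3435309101596943) < 1e-9)
print(abs(tail_ratio - 0.03729009503282062) < 1e-9)
```

Both comparisons should hold within floating-point tolerance, which suggests the logged ratios are plain sums of the table percentages.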
853 sequences found.
First 461 (typical positive ratio): 0.995039617055503
Next 163 (624-461): 0.003960483178947816
Rest: 0.0009998997655491504
- Processing end: 2022-12-14 18:08:23.948646
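As a sanity check on the sequence statistics above, the three reported buckets (first 461, next 163, rest) should cover all 853 sequences, and their ratios should sum to roughly 1. This is a minimal sketch using only the numbers printed in this log; the variable names are hypothetical and it does not reproduce how BuildLangModel.py derives the buckets.

```python
# Values copied from the sequence statistics in this log.
total_sequences = 853
first_count = 461   # "First 461 (typical positive ratio)"
next_count = 163    # "Next 163 (624-461)"

first_ratio = 0.995039617055503
next_ratio = 0.003960483178947816
rest_ratio = 0.0009998997655491504

# The "Next" bucket ends at index 624, so the rest is whatever remains.
covered = first_count + next_count
rest_count = total_sequences - covered

# The three ratios should partition the observed sequence occurrences.
ratio_sum = first_ratio + next_ratio + rest_ratio

print(covered, rest_count)
print(abs(ratio_sum - 1.0) < 1e-9)
```

The first two counts add up to the 624 shown in the log, leaving 229 sequences in the "Rest" bucket.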