mirror of
https://gitlab.freedesktop.org/uchardet/uchardet.git
synced 2025-12-06 16:56:40 +08:00
256 lines
9.5 KiB
Plaintext
= Logs of language model for English (en) =

- Generated by BuildLangModel.py
- Started: 2022-12-14 17:54:31.274511
- Maximum depth: 4
- Max number of pages: 200

== Parsed pages ==

Marmot (revision 1116705550)
Oxford English Dictionary (revision 1126934642)
Forrest's rock squirrel (revision 1121471379)
Johann Friedrich Blumenbach (revision 1124282696)
Black-tailed prairie dog (revision 1120101763)
Herodotus (revision 1126195293)
Gray-collared chipmunk (revision 1121473607)
Callospermophilus (revision 1015470924)
Siberia (revision 1125951683)
Asia Minor ground squirrel (revision 1121357197)
France (revision 1127134794)
Marmoset (revision 1110976265)
Eskimos (revision 1126440133)
Last Glacial Period (revision 1127412073)
Tropical ground squirrel (revision 1121471157)
Columbian ground squirrel (revision 1124139650)
Merriam's chipmunk (revision 1121301344)
Race in Singapore (revision 1118674650)
Buller's chipmunk (revision 1121473516)
Marca's marmoset (revision 1110797645)
Yellow ground squirrel (revision 1121469509)
Antelope squirrel (revision 1089053714)
Nail (anatomy) (revision 1123634076)
PMID (identifier) (revision 1125133244)
Siberian chipmunk (revision 1121472776)
List of ancient Greek playwrights (revision 1122231123)
Roy Harris (linguist) (revision 1087505131)
Wordnik (revision 1097144364)
Perseus Project (revision 1112478286)
Laüs (revision 1096210239)
Texas antelope squirrel (revision 1121470154)
AAVE (revision 1125269600)
Callimico (revision 1125614017)
Otospermophilus (revision 1093268410)
Sozopol (revision 1117454721)
Magnifying glass (revision 1126820579)
Oscar Peschel (revision 1086657308)
Ethnogenesis (revision 1123304848)
Hoary marmot (revision 1121363017)
Ictidomys (revision 1095023307)
Erechtheion (revision 1123163760)
Eastern Siberia (revision 1125951683)
Interim Register of Marine and Nonmarine Genera (revision 1093112130)
Spermophilus ralli (revision 1121469745)
Espíritu Santo antelope squirrel (revision 1121470113)
Oak (revision 1126949583)
Tamarin (revision 1120713888)
Alexandria (revision 1126013507)
Red Terror (revision 1125612607)
Göttingen (revision 1126739277)
1968 Winter Olympics (revision 1121518304)
Black-capped marmot (revision 1121471697)
Swift fox (revision 1123284998)
Anaximenes of Miletus (revision 1124180264)
Leninism (revision 1127169744)
July Revolution (revision 1118106788)
Don E. Wilson (revision 1101839421)
White-tailed prairie dog (revision 1121472368)
Concise Oxford English Dictionary (revision 1127006175)
Mountain plover (revision 1118572990)
Prisons in Russia (revision 1122751238)
Race in Brazil (revision 1124705196)
Pyrrhus of Epirus (revision 1126501014)
Turkey (revision 1127301063)
World Wide Web (revision 1119819405)
John F. Richards (revision 1091216340)
Mammal (revision 1127041212)
Schneider's marmoset (revision 1110797810)
Colorado chipmunk (revision 1121299151)
Aeschylus (revision 1117125663)
Uinta ground squirrel (revision 1121367964)
Long-eared chipmunk (revision 1121298477)
St. Bartholomew's Day massacre (revision 1120328739)
Long-tailed marmot (revision 1122462086)
Flag of Russia (revision 1126574449)
Johann Christian Polycarp Erxleben (revision 1123183621)
William Chester Minor (revision 1101752889)
Antarctic (revision 1124702474)
Anu Garg (revision 1082253113)
Commonwealth of Independent States (revision 1127257304)
Potential enlargement of the European Union (revision 1127421947)
Common marmoset (revision 1110796571)
Thirteen-lined ground squirrel (revision 1127159966)
Mill Hill (revision 1127109155)
California ground squirrel (revision 1121359049)
Round-tailed ground squirrel (revision 1121470819)
Golden lion tamarin (revision 1126498136)
Polyandry in nature (revision 1125188329)
October Revolution (revision 1126331974)
Spruce-fir forests (revision 1105538592)
Piute ground squirrel (revision 1121468740)
Flemish art (revision 1029307027)
Received Pronunciation (revision 1126918653)
Groundhog (revision 1126432353)
Taxonomy (biology) (revision 1126106840)
Menzbier's marmot (revision 1121471953)
Group of Seven (revision 1127242480)
Hopi chipmunk (revision 1121297258)
Global warming potential (revision 1124849235)
Monogenism (revision 1117320065)
Daurian ground squirrel (revision 1121469422)
Gray marmot (revision 1122462225)
Nuclear weapon (revision 1124805455)
Wordhunt (revision 1107342769)
Palmer's chipmunk (revision 1121473732)
Vendian (revision 1124816391)
Alberta (revision 1127108924)
European ground squirrel (revision 1121469378)
Gaullism (revision 1127187176)
Diccionario de la lengua española (revision 1122640046)
Franklin's ground squirrel (revision 1121361872)
Xerospermophilus (revision 1095542738)
Alípio de Miranda-Ribeiro (revision 1118371411)
Prairie dog town (revision 1125350300)
Charles Caldwell (physician) (revision 1124873172)
Timeline of French history (revision 1124169293)
Lodgepole chipmunk (revision 1121296771)
Geoffroy's tamarin (revision 1120714387)
Selinunte (revision 1127066618)
LibriVox (revision 1123211025)
Nominal GDP (revision 1126535284)
Caspian race (revision 1096592610)
Eastern chipmunk (revision 1120765340)
Natural history (revision 1125550833)
Encyclopedia of Life (revision 1123215390)
The Passing of the Great Race (revision 1125593849)
Least Concern (revision 1114094351)
New International Encyclopedia (revision 1122804470)
Pandosia (Lucania) (revision 1095669257)
Falcon (revision 1127211661)
Least chipmunk (revision 1120765536)
Etymology (revision 1126207231)
Caucasus Mountains (revision 1112109454)
Cascade golden-mantled ground squirrel (revision 1121471310)
Ontario (revision 1126506635)
The Australian National Dictionary (revision 1117901185)
Cliff chipmunk (revision 1121473647)
Emile Berliner (revision 1124364621)
Henry II, Holy Roman Emperor (revision 1125851826)
Teat (revision 1113585530)
Deforestation of the Amazon rainforest (revision 1127204853)
Russian Far East (revision 1125785102)
FOSS (revision 1123158325)
Google Books (revision 1123968126)
The American Heritage Dictionary of the English Language (revision 1125678949)
Family (biology) (revision 1115174458)
Shared decision-making (revision 1105303711)
1968 Summer Paralympics (revision 1110266388)
California chipmunk (revision 1121299691)
Chromista (revision 1104191406)
List of heads of state of the Soviet Union (revision 1113841810)
Scrub plane (revision 895337184)
Dog (revision 1125590248)
Platypus (revision 1126576847)
Palo Alto, California (revision 1124234189)
Geoffrey Nunberg (revision 1101933840)
Soviet submarine K-431 (revision 1123390956)
African-American (revision 1127390519)
Varnish (revision 1104965513)
Barcode of Life Data System (revision 1090221883)
Lesson's saddle-back tamarin (revision 1120721208)
Molina's hog-nosed skunk (revision 1119016570)
Race Life of the Aryan Peoples (revision 1072651965)
Joan of Arc (revision 1127287571)
Talcott Williams (revision 1124147325)
Supreme Court of the Soviet Union (revision 1029671037)
DNA sequence (revision 1126520849)
Human science (revision 1125850795)
Luxembourgish phonology (revision 1073415697)
Karl Pearson (revision 1126931733)
Conservation status (revision 1126423906)
Sierra Madre ground squirrel (revision 1121471267)
Gray-footed chipmunk (revision 1121473564)
ISNI (identifier) (revision 1116919527)
New Mexico (revision 1127238323)
Archipelago (revision 1116401445)
2022 SCO summit (revision 1120811549)
The New York Times (revision 1127291077)
Partition of the Ottoman Empire (revision 1126544087)
Idaho (revision 1127080022)
ISBN (identifier) (revision 1124259962)
Spermophilus brevicauda (revision 1010428942)
Taurus ground squirrel (revision 1121469893)
Tajikistani somoni (revision 1120621502)
9/11 Commission Report (revision 1123122065)
Spermophilus (revision 1089055218)
Gatun Lake (revision 1124617227)
Canadian Oxford Dictionary (revision 1021304609)
Alexandre Dumas, père (revision 1127252593)
Crusafontia (revision 1045515255)
Mercosur (revision 1125969034)
Missionary (revision 1124709979)
Materialism (revision 1126420363)
Primate (revision 1127035196)
Agesilaus II (revision 1122309607)
Roosmalens' dwarf marmoset (revision 1110797884)
VIAF (identifier) (revision 1122669300)
Agostinho Neto (revision 1113402334)
Soviet Empire (revision 1124525265)
Rivers in Russia (revision 1120330182)

== End of Parsed pages ==

- Wikipedia parsing ended at: 2022-12-14 17:58:33.140597

59 characters appeared 2649878 times.

Most Frequent characters:
[ 0] Char e: 11.955833438369616 %
[ 1] Char a: 8.65764386134003 %
[ 2] Char t: 8.565752838432562 %
[ 3] Char i: 7.92813857845531 %
[ 4] Char n: 7.520383957299166 %
[ 5] Char o: 7.336677386657047 %
[ 6] Char s: 6.807558687607505 %
[ 7] Char r: 6.7380083158545405 %
[ 8] Char h: 4.371823910383799 %
[ 9] Char l: 4.20785409743392 %
[10] Char d: 3.74643662840327 %
[11] Char c: 3.560050689126065 %
[12] Char u: 2.780052515625248 %
[13] Char m: 2.632498552763561 %
[14] Char p: 2.203950521495707 %
[15] Char f: 2.1648921195617308 %
[16] Char g: 1.993563477261972 %
[17] Char y: 1.5471278300359488 %
[18] Char b: 1.5069750380960933 %
[19] Char w: 1.383950506400672 %
[20] Char v: 1.0527654480696849 %
[21] Char k: 0.604707084628047 %
[22] Char x: 0.25389848136404775 %
[23] Char j: 0.14932762942293948 %
[24] Char z: 0.14351604111585514 %
[25] Char q: 0.12242073031286722 %

The first 26 characters have an accumulated ratio of 0.9993580836551719.
The first 4 characters have an accumulated ratio of 0.3710736871659752.
All characters whose order is over 18 have an accumulated ratio of 0.037105859213141135.
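
The three accumulated ratios can be reproduced from the per-character percentages listed in this log. The following is a minimal sketch for checking the arithmetic, not the code used by BuildLangModel.py; the names `percentages` and `accumulated_ratio` are chosen here for illustration. It assumes that "order is over 18" means orders 19 through 25 of the 26 listed characters, which is what the logged value corresponds to numerically.

```python
# Percentages copied from the "Most Frequent characters" list, orders 0..25.
percentages = [
    11.955833438369616,   # e
    8.65764386134003,     # a
    8.565752838432562,    # t
    7.92813857845531,     # i
    7.520383957299166,    # n
    7.336677386657047,    # o
    6.807558687607505,    # s
    6.7380083158545405,   # r
    4.371823910383799,    # h
    4.20785409743392,     # l
    3.74643662840327,     # d
    3.560050689126065,    # c
    2.780052515625248,    # u
    2.632498552763561,    # m
    2.203950521495707,    # p
    2.1648921195617308,   # f
    1.993563477261972,    # g
    1.5471278300359488,   # y
    1.5069750380960933,   # b
    1.383950506400672,    # w
    1.0527654480696849,   # v
    0.604707084628047,    # k
    0.25389848136404775,  # x
    0.14932762942293948,  # j
    0.14351604111585514,  # z
    0.12242073031286722,  # q
]


def accumulated_ratio(values, start, stop):
    """Sum the percentages for orders start..stop-1 and convert to a ratio."""
    return sum(values[start:stop]) / 100.0


first_26 = accumulated_ratio(percentages, 0, 26)
first_4 = accumulated_ratio(percentages, 0, 4)
# Assumption: "order over 18" covers orders 19..25 of the listed characters.
over_18 = accumulated_ratio(percentages, 19, 26)

print("first 26:", first_26)
print("first 4: ", first_4)
print("over 18: ", over_18)
```

Up to floating-point rounding, the three printed values should agree with the ratios logged above.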

1047 sequences found.

First 377 (typical positive ratio): 0.9950075198967843
Next 160 (537-377): 0.003999516176216855
Rest: 0.00099296392699888

- Processing end: 2022-12-14 17:58:33.195443
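
As a consistency check on the sequence statistics reported in this log, the three ratios should together account for all observed sequence occurrences, so their sum should be very close to 1. The sketch below only redoes that arithmetic with values copied from the log; the variable names are illustrative and not taken from BuildLangModel.py. The count for "Rest" is an assumption: it is taken to be every sequence beyond the first 537.

```python
# Values copied from the sequence statistics in this log.
total_sequences = 1047
first_count = 377
next_count = 160

# Assumption: "Rest" covers every sequence beyond the first 377 + 160 = 537.
rest_count = total_sequences - (first_count + next_count)

ratios = {
    "first": 0.9950075198967843,
    "next": 0.003999516176216855,
    "rest": 0.00099296392699888,
}

# The three buckets should partition all sequence occurrences.
total_ratio = sum(ratios.values())

print("sequences in 'Rest' (assumed):", rest_count)
print("sum of ratios:", total_ratio)
```

The most frequent 377 of the 1047 sequences cover about 99.5 % of all occurrences, which is why the log labels that bucket the typical positive ratio.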