mirror of
https://gitlab.freedesktop.org/uchardet/uchardet.git
synced 2025-12-06 08:46:40 +08:00
252 lines
9.4 KiB
Plaintext
= Logs of language model for English (en) =

- Generated by BuildLangModel.py
- Started: 2022-12-14 20:20:53.218193
- Maximum depth: 4
- Max number of pages: 200

== Parsed pages ==

Marmot (revision 1116705550)
Barcode of Life Data System (revision 1090221883)
Palmer's chipmunk (revision 1121473732)
Jacopo Ligozzi (revision 1104222073)
Olympic Peninsula (revision 1123430023)
INaturalist (revision 1122751314)
Mammal Species of the World (revision 1127351948)
Berry (revision 1112801626)
Rock squirrel (revision 1121470993)
Natural reservoir (revision 1110806364)
Onomatopoeia (revision 1120663626)
Mohave ground squirrel (revision 1121470764)
Townsend's chipmunk (revision 1121473824)
Madrid (revision 1126851882)
Otospermophilus (revision 1093268410)
Plant hormone (revision 1116921032)
Cuckoo (revision 1126465747)
Daurian ground squirrel (revision 1121469422)
Elwha River (revision 1121691243)
All rights reserved (revision 1125321157)
Long-tailed ground squirrel (revision 1121468895)
CDFG (revision 1122725741)
Don Martin (cartoonist) (revision 1116900902)
Palindromic (revision 1121604941)
EMBnet (revision 1018817077)
Ferdinando II de' Medici, Grand Duke of Tuscany (revision 1125579637)
Cloister (revision 1120569425)
Asymptomatic (revision 1111685734)
Grand Duke (revision 1126227666)
Eucalyptus oil (revision 1123039166)
Seattle (revision 1127044692)
Xerospermophilus (revision 1095542738)
Red-cheeked ground squirrel (revision 1121469468)
Roy Crane (revision 1073477180)
Round-tailed ground squirrel (revision 1121470819)
Asia Minor ground squirrel (revision 1121357197)
Ictidomys parvidens (revision 1121470382)
Hopi chipmunk (revision 1121297258)
Anecdata.org (revision 1099498174)
Himalayan marmot (revision 1113552191)
Storage organ (revision 1087238870)
Phage therapy (revision 1115823876)
Pacific County, Washington (revision 1115141058)
Agostino Carracci (revision 1118965396)
Share-alike (revision 1124025423)
Fragaria chiloensis (revision 1117621684)
Pacific Northwest (revision 1125120564)
Eastern chipmunk (revision 1120765340)
Yakima County, Washington (revision 1117226237)
United States congressional delegations from Washington (revision 1113282930)
Hygiene (revision 1121837793)
Synonym (taxonomy) (revision 1115465643)
Washington's congressional districts (revision 1126665844)
Culling (revision 1124588069)
Citizen scientists (revision 1126971493)
Accademia delle Arti del Disegno (revision 1117591379)
Lacey, Washington (revision 1118158829)
Berberis thunbergii (revision 1098470800)
James Joyce (revision 1127091935)
Interim Register of Marine and Nonmarine Genera (revision 1093112130)
Marker-assisted selection (revision 1101841526)
Blood-borne disease (revision 1104089084)
Research in Computational Molecular Biology (revision 1098389228)
Eggplant (revision 1127383368)
Purr (revision 1125642484)
Blastomycosis (revision 1125999120)
NatureServe (revision 1122446327)
Xerinae (revision 1093432948)
Baja California rock squirrel (revision 1121471079)
Lodgepole chipmunk (revision 1121296771)
Honey (revision 1127398567)
Bouba/kiki effect (revision 1127022127)
Ferdinando I de' Medici, Grand Duke of Tuscany (revision 1125114864)
Medici (revision 1123423946)
Bristlecone pine (revision 1108725770)
Morphology (biology) (revision 1126240066)
Albanian language (revision 1127442244)
Taurus ground squirrel (revision 1121469893)
World Environment Day (revision 1119598477)
The New York Times (revision 1127291077)
Rat Genome Database (revision 1121949622)
Geobotanical prospecting (revision 992549326)
Pre-exposure prophylaxis (revision 1121706582)
Least chipmunk (revision 1120765536)
EcoHealth Alliance (revision 1124297887)
InterPro (revision 1123732177)
Gunnison's prairie dog (revision 1121472300)
EMBOSS (revision 1108898594)
Black-capped marmot (revision 1121471697)
Speckled ground squirrel (revision 1121469813)
National Gallery of Art (revision 1124058120)
Ground squirrel (revision 1106618817)
Texas antelope squirrel (revision 1121470154)
Skamania County, Washington (revision 1115141102)
Zebrafish Information Network (revision 1084187264)
Merriam's chipmunk (revision 1121301344)
Stamen (revision 1107327988)
Plant stem (revision 1125685714)
Uinta chipmunk (revision 1121367930)
Public Lab (revision 1123308321)
Sierra Madre ground squirrel (revision 1121471267)
Scripps Research (revision 1120793534)
Morbillivirus (revision 1123109002)
Conservation status (revision 1126423906)
Korean language (revision 1127097954)
Flatiron Institute (revision 1114126605)
Espíritu Santo antelope squirrel (revision 1121470113)
Pietre dure (revision 1124553077)
List of biological databases (revision 1116920095)
Needle sharing (revision 1066293994)
ISCB Africa ASBCB Conference on Bioinformatics (revision 1003545343)
Northern Idaho ground squirrel (revision 1123076448)
Animal track (revision 1112366053)
HMMER (revision 1090926305)
RERO (identifier) (revision 1068185782)
Catalogue of Life (revision 1118132647)
Francesco I de' Medici, Grand Duke of Tuscany (revision 1123286810)
Whip-poor-will (revision 1120975767)
Doi (identifier) (revision 1127429235)
Wildlife Conservation Society (revision 1125787985)
Panamint chipmunk (revision 1121299808)
Bioblitz (revision 1113263878)
Habitat loss (revision 1117935852)
Sciuromorpha (revision 1107286064)
Yellow-bellied marmot (revision 1121472145)
Allen's chipmunk (revision 1121299548)
Hood Canal (revision 1124856006)
Computer vision (revision 1126383414)
Vibrio cholerae (revision 1123125512)
Phulwara oil (revision 1039287034)
Neah Bay, Washington (revision 1117347476)
Chelan County, Washington (revision 1115437018)
Columbia River (revision 1121152264)
Philippine Genome Center (revision 1086509191)
Thirteen-lined ground squirrel (revision 1127159966)
Cat massage (revision 1120597363)
Swiss French (revision 1126844735)
Probabilistic risk analysis (revision 1118087495)
Kingdom (biology) (revision 1126766133)
Norfloxacin (revision 1126442196)
Tropical ground squirrel (revision 1121471157)
Cannabis culture (revision 1123260879)
Fontarrón (revision 962928722)
Heuristic algorithm (revision 1124780994)
Spotted ground squirrel (revision 1122239672)
Hand washing (revision 1126772691)
Human skin (revision 1125889832)
Slovenia (revision 1127365628)
Australia Bioinformatics Resource (revision 1023592097)
Utah prairie dog (revision 1125084849)
Research center (revision 1122565049)
Australian Wildlife Conservancy (revision 1126004200)
Catholicism (revision 1126878543)
White-tailed prairie dog (revision 1121472368)
Rabbit (revision 1125928365)
Cathedral (revision 1117971650)
Columbia Plateau (revision 1111592488)
Pablo de Olavide University (revision 1100528254)
Plant habit (revision 1101707375)
Anti-fascism (revision 1126769811)
Coral-billed ground-cuckoo (revision 1119603104)
Alpine marmot (revision 1121471662)
Homozygous (revision 1125746174)
COVID-19 vaccination in the Republic of Ireland (revision 1125658338)
Music of North Korea (revision 1109275365)
Eastern Washington (revision 1111432324)
Tarbagan marmot (revision 1121488248)
VIAF (identifier) (revision 1122669300)
Duke of Florence (revision 1010655117)
Accademia della Crusca (revision 1118884925)
Mobile robot (revision 1125548051)
Hyperlocal (revision 1116240164)
Oregon Trail (revision 1124389602)
Cane rat (revision 1089272788)
Federal Way, Washington (revision 1122923555)
Rubens (revision 1121190866)
Pala d'Oro (revision 1072202795)
Archduke Rainer of Austria (1895–1930) (revision 1081133439)
Bioinformatics (revision 1125974897)
Renal tubular acidosis (revision 1105330876)
Brain morphometry (revision 1053832132)
Ethnologue (revision 1127241433)
OregonLive.com (revision 1114379550)
Yangban (revision 1121415587)
Belize Inlet (revision 982557553)
Canebrake Ecological Reserve (revision 1121247294)
Glycogen (revision 1110998630)
Richardson's ground squirrel (revision 1122297225)
Cluster analysis (revision 1116924542)
Genomics (revision 1126520756)
Spermophilus brevicauda (revision 1010428942)
Endosperm (revision 1112721337)
Relational database (revision 1116718100)
Snow (revision 1126528822)
Roadless area conservation (revision 1103267389)
Minneapolis–Saint Paul (revision 1124710168)

== End of Parsed pages ==
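
Each entry in the parsed-pages list has the shape "Title (revision N)", and some titles carry their own parentheses, for example "Don Martin (cartoonist)". The following is a minimal sketch of how such entries could be read back from this log. It is not code from BuildLangModel.py; the names `ENTRY_RE` and `parse_entries` are illustrative assumptions.

```python
import re

# Matches "<title> (revision <id>)". The title group is greedy and the
# pattern is anchored at the end of the line, so parentheses that belong
# to the title itself stay inside the title.
ENTRY_RE = re.compile(r"^(?P<title>.+) \(revision (?P<rev>\d+)\)$")


def parse_entries(lines):
    """Return (title, revision) pairs for lines shaped like page entries."""
    entries = []
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match:
            entries.append((match.group("title"), int(match.group("rev"))))
    return entries


# Two entries copied from the log, plus a heading that must be skipped.
sample = [
    "Marmot (revision 1116705550)",
    "Don Martin (cartoonist) (revision 1116900902)",
    "== End of Parsed pages ==",
]
entries = parse_entries(sample)
```

Lines that are not entries (headings, blank lines, statistics) simply fail to match and are ignored.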

- Wikipedia parsing ended at: 2022-12-14 20:24:17.046830

59 characters appeared 2235074 times.

Most Frequent characters:
[ 0] Char e: 11.901753588471792 %
[ 1] Char a: 8.660205433914046 %
[ 2] Char t: 8.534616750944265 %
[ 3] Char i: 7.941079355761822 %
[ 4] Char n: 7.5567520359504865 %
[ 5] Char o: 7.4230651871034254 %
[ 6] Char s: 6.903216627279455 %
[ 7] Char r: 6.589625220462499 %
[ 8] Char l: 4.254847937920624 %
[ 9] Char h: 4.180219536355396 %
[10] Char c: 3.813967680712137 %
[11] Char d: 3.744797711395685 %
[12] Char u: 2.734361367901018 %
[13] Char m: 2.5771853638850435 %
[14] Char p: 2.266099466952772 %
[15] Char f: 2.170576902599198 %
[16] Char g: 1.9969361193410151 %
[17] Char b: 1.540888579080603 %
[18] Char y: 1.515833480233764 %
[19] Char w: 1.324385680295149 %
[20] Char v: 1.0713739231900152 %
[21] Char k: 0.5591761167639192 %
[22] Char x: 0.22384046344774267 %
[23] Char j: 0.18035197044930057 %
[24] Char z: 0.16464779242208535 %
[25] Char q: 0.12464911676302441 %

The first 26 characters have an accumulated ratio of 0.9995445340959629.
The first 5 characters have an accumulated ratio of 0.4459440716504241.
All characters whose order is over 18 have an accumulated ratio of 0.036484250633312364.
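
The three accumulated ratios can be re-derived from the percentages listed in the table. The following is a minimal sketch of that arithmetic, not code from BuildLangModel.py; the names `PERCENT` and `accumulated_ratios` are illustrative assumptions. Summing the listed rows also suggests that the "order is over 18" figure covers orders 19 to 25, since that sum reproduces the logged value.

```python
from itertools import accumulate

# Percentages copied from the table above, order 0 (e) through 25 (q).
PERCENT = [
    11.901753588471792, 8.660205433914046, 8.534616750944265,
    7.941079355761822, 7.5567520359504865, 7.4230651871034254,
    6.903216627279455, 6.589625220462499, 4.254847937920624,
    4.180219536355396, 3.813967680712137, 3.744797711395685,
    2.734361367901018, 2.5771853638850435, 2.266099466952772,
    2.170576902599198, 1.9969361193410151, 1.540888579080603,
    1.515833480233764, 1.324385680295149, 1.0713739231900152,
    0.5591761167639192, 0.22384046344774267, 0.18035197044930057,
    0.16464779242208535, 0.12464911676302441,
]


def accumulated_ratios(percentages):
    """Running totals of the percentages, converted to ratios in [0, 1]."""
    return [total / 100.0 for total in accumulate(percentages)]


running = accumulated_ratios(PERCENT)
first_26 = running[25]  # all listed rows
first_5 = running[4]    # orders 0 to 4
# Orders 19 to 25: the listed rows ranked after order 18.
over_18 = sum(PERCENT[19:]) / 100.0

print(f"first 26: {first_26:.6f}")
print(f"first 5:  {first_5:.6f}")
print(f"over 18:  {over_18:.6f}")
```

The 26 listed characters account for almost all occurrences; the remaining 33 of the 59 characters share the leftover fraction of roughly 0.0005.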

972 sequences found.

First 373 (typical positive ratio): 0.9950190506759622
Next 160 (533-373): 0.003986976910237083
Rest: 0.0009939724138007255

- Processing end: 2022-12-14 20:24:17.102402
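
The three timestamps in this log ("Started", "Wikipedia parsing ended at", "Processing end") give the duration of each phase. A minimal sketch of the arithmetic, using only the values printed in the log; the variable names are illustrative.

```python
from datetime import datetime

# Timestamp layout used by the log lines, including microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-12-14 20:20:53.218193", FMT)
parsing_ended = datetime.strptime("2022-12-14 20:24:17.046830", FMT)
processing_ended = datetime.strptime("2022-12-14 20:24:17.102402", FMT)

# Time spent fetching and parsing pages, then building the statistics.
crawl_seconds = (parsing_ended - started).total_seconds()
model_seconds = (processing_ended - parsing_ended).total_seconds()

print(f"parsing: {crawl_seconds:.3f} s, processing: {model_seconds:.3f} s")
```

Nearly all of the run, a little under three and a half minutes, is spent on page parsing; the statistics step takes a fraction of a second.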