uchardet/script/BuildLangModelLogs/LangPortugueseModel.log

= Logs of language model for Portuguese (pt) =
- Generated by BuildLangModel.py
- Started: 2022-12-15 00:13:40.621625
- Maximum depth: 4
- Max number of pages: 200
== Parsed pages ==
Papagaio-das-mascarenhas (revision 61083234)
François-Nicolas Martinet (revision 43679514)
Tanygnathus (revision 63727477)
Alfred Newton (revision 63772066)
Aves (revision 64642129)
Julian Hume (revision 41876605)
Ilha da Reunião (revision 64417746)
Tirosina (revision 64330501)
Pedro Mascarenhas (c. 1484-1555) (revision 64281128)
Herpetologista (revision 60800107)
Praslin (revision 60991639)
August von Pelzeln (revision 62048504)
INaturalist (revision 62752196)
Peixe (revision 64431170)
Oswald Heer (revision 64579144)
Biólogo (revision 61723910)
Coral (revision 62383136)
Royal Society (revision 63227627)
Johann Natterer (revision 63305664)
Família (biologia) (revision 61575111)
UNESCO (revision 64651554)
Paridade do poder de compra (revision 64194230)
Lorena (França) (revision 57151319)
Área (revision 63988916)
Ecologia (revision 64022144)
Masiakasaurus (revision 64018705)
1984 (revision 64860161)
Animalia (revision 64303459)
Biblioteca Nacional da Austrália (revision 63908354)
Malaca Portuguesa (revision 64517772)
México (revision 64868116)
PubMed Identifier (revision 64178664)
Owen Willans Richardson (revision 58168602)
William Burnside (revision 62739863)
Endémico (revision 64450772)
Amendoim (revision 64423017)
Cisteína (revision 64443908)
Réptil (revision 64240956)
Omnívoro (revision 64303184)
Psittaciformes (revision 63932960)
Joel Serrão (revision 62566046)
Áustria (revision 64777663)
Seicheles (revision 64635903)
Chordata (revision 64103327)
Anfíbio (revision 64657407)
Johann Georg Wagler (revision 61847261)
Feniletilamina (revision 64766772)
Aminoácido essencial (revision 62163188)
Ictiologia (revision 64184350)
Georg von Frauenfeld (revision 62413353)
Sistema de acasalamento (revision 64465607)
Oxford University Press (revision 63009975)
Coordenadas geográficas (revision 64874098)
Digital object identifier (revision 63209667)
John Desmond Bernal (revision 60419838)
John Edward Marr (revision 62745345)
Encefalina (revision 64330411)
Conquiliologia (revision 56999872)
Quilómetro quadrado (revision 64134927)
História da Itália (revision 63544997)
Identificação automatizada de espécies (revision 60520809)
Gil Eanes (revision 64787644)
Registro CAS (revision 62829292)
Ronald Ross (revision 63575195)
Biologia regenerativa (revision 56549505)
Santo Eustáquio (Países Baixos) (revision 63516356)
Cabo da Boa Esperança (revision 64850246)
Edward Mellanby (revision 59542666)
Geografia da França (revision 63700063)
Condorraptor (revision 64060396)
Fudge (revision 64331291)
Rapator (revision 64107459)
Viena (revision 64653743)
1973 (revision 64252513)
Classe (biologia) (revision 63495321)
História natural (revision 60797583)
Francisco de Mascarenhas (revision 64533486)
Henry John Carter (revision 64088767)
Garcia de Noronha (revision 61943288)
Essuatíni (revision 64541626)
Etologia (revision 63703415)
1825 (revision 64231448)
Pitohui (revision 55136936)
Doador de electrões (revision 49471221)
Francisco Xavier Soares da Veiga (revision 42160927)
Tanygnathus megalorynchos (revision 63460044)
Engenharia biológica (revision 64476460)
Biologia forense (revision 59861252)
Califórnia (revision 64085181)
Cupuaçu (revision 64791967)
Classificação científica (revision 63914619)
Ilha Europa (revision 58458237)
Pedro Álvares Cabral (revision 64766295)
1882 (revision 60523806)
Arquipélago (revision 64873918)
Tristão Teixeira (revision 63759821)
Ornitologia (revision 63950590)
Maiote (revision 63509604)
Manuel Duarte Leitão (revision 62776308)
Biodiversidade (revision 64635148)
Dynamoterror (revision 64149681)
Columbiformes (revision 61584181)
Los Angeles (revision 64907059)
Ilha de Linosa (revision 55210386)
Inteligência artificial (revision 64867398)
Megaraptora (revision 64096312)
Árabes (revision 64244377)
Gráfico semi-log (revision 53359355)
Densidade populacional (revision 64809653)
Garcia de Sá (revision 58468727)
Ferdinand von Hochstetter (revision 63490806)
Römpp Lexikon Chemie (revision 58796446)
Chocolate quente (revision 64330451)
Histologia (revision 61422516)
Henry Dale (revision 58667524)
Estêvão da Gama (c. 1470) (revision 64693733)
Espécie (revision 64553712)
Reino (biologia) (revision 62163157)
África Austral (revision 61960381)
Metro (revision 64654584)
Sudão (revision 64456425)
Fenol (revision 64404823)
Lista de disciplinas da biologia (revision 61981999)
Ornitólogo (revision 63950590)
Porfiriato (revision 64906132)
853 a.C. (revision 63744132)
Pedro de Noronha (revision 64269853)
International Standard Name Identifier (revision 64790504)
Padrão-ouro (revision 64448730)
Corpo (anatomia) (revision 64637457)
Produtividade (ecologia) (revision 63242479)
Bioinformática (revision 63353600)
Sterculioideae (revision 59214802)
Áries (revision 64192868)
Filipe II de Espanha (revision 64462191)
Biologia (revision 64766112)
Bioestatística (revision 64552825)
Hospital (revision 63940681)
Cecil Edgar Tilley (revision 62726767)
Ascendência (revision 58302798)
Ostafrikasaurus (revision 64071145)
Carl Edward Hellmayr (revision 62499688)
África do Sul (revision 64803180)
Lamberto Dini (revision 61581701)
Hipótese Gaia (revision 63036733)
Alberto do Canto (revision 64484219)
Real (moeda portuguesa) (revision 64536085)
Biomedicina (revision 64851943)
Evolução (revision 64809463)
Magnoliophyta (revision 64552676)
Protostomia (revision 64698835)
John Joly (revision 62745300)
Base Virtual Internacional de Autoridade (revision 61190425)
Joseph Barcroft (revision 53561143)
Diogo Cão (revision 64617588)
Hadeano (revision 64828835)
Gabriel Soares de Sousa (revision 64695192)
Partido socialista francês (revision 64269594)
António Pais de Sande (revision 52559072)
Cranganor (revision 61413974)
Bovinos (revision 63721509)
1880 (revision 58173615)
Celêntero (revision 58975856)
Língua grega antiga (revision 64775316)
Herpetologia (revision 60800107)
Luís Mascarenhas, 2.º conde de Alva (revision 64555516)
Custo marginal (revision 62175678)
Crocodilo-de-água-salgada (revision 64007088)
Geólogo (revision 64608075)
Washington, D.C. (revision 64536061)
Buenos Aires (revision 64811726)
Lisboa (revision 64898256)
Chocolate (revision 64868103)
Eukaryota (revision 64256026)
Eixo terrestre (revision 64647487)
Namíbia (revision 64658419)
Lagos (Algarve) (revision 64828967)
Igreja Copta (revision 64842664)
Alexis Carrel (revision 63190094)
Temnospondyli (revision 64390652)
Gestão hospitalar (revision 63982120)
Joaquim Augusto Mouzinho de Albuquerque (revision 64891429)
Pantyrannosauria (revision 64848070)
Aspirina (revision 64769177)
Cracklers de chocolate (revision 64330464)
Ilhas Crozet (revision 63024289)
Sparidae (revision 59205456)
Suliformes (revision 61403162)
Miguel Corte Real (revision 64244782)
== End of Parsed pages ==
- Wikipedia parsing ended at: 2022-12-15 00:16:37.497792
52 characters appeared 1621926 times.
Most Frequent characters:
[ 0] Char a: 11.982174279221123 %
[ 1] Char e: 11.377091186650933 %
[ 2] Char o: 10.194793104001047 %
[ 3] Char s: 8.025890207074799 %
[ 4] Char i: 7.1634587521255595 %
[ 5] Char r: 6.492281398781449 %
[ 6] Char d: 5.492112463823873 %
[ 7] Char n: 5.366706002616642 %
[ 8] Char t: 4.890543711611997 %
[ 9] Char m: 4.4280071963825725 %
[10] Char c: 4.01473310126356 %
[11] Char u: 3.616749469457916 %
[12] Char l: 3.2010091705786823 %
[13] Char p: 2.7590038016530967 %
[14] Char g: 1.3863764438081638 %
[15] Char v: 1.197341925587234 %
[16] Char f: 1.1109014837914923 %
[17] Char b: 1.0721820847560246 %
[18] Char h: 0.8071884907202919 %
[19] Char ã: 0.7111298542596888 %
[20] Char ç: 0.6203735558835607 %
[21] Char q: 0.6127899793208814 %
[22] Char é: 0.6002739952377606 %
[23] Char í: 0.41370568077705144 %
[24] Char á: 0.40106638650591947 %
[25] Char x: 0.345946732464983 %
[26] Char z: 0.3207914541107301 %
[27] Char ó: 0.27467344379459974 %
[28] Char j: 0.20426332644029382 %
[29] Char ê: 0.182190802786317 %
[30] Char õ: 0.15703552443206412 %
[31] Char y: 0.1389705818884462 %
[32] Char ú: 0.09833987493880732 %
[33] Char k: 0.0737394924306041 %
[34] Char w: 0.07330790677256546 %
[35] Char â: 0.07207480489245502 %
[36] Char à: 0.0646761936117924 %
[37] Char ô: 0.04346684127389289 %
The first 38 characters have an accumulated ratio of 0.9998736070572886.
The first 4 characters have an accumulated ratio of 0.415799487769479.
All characters whose order is over 21 have an accumulated ratio of 0.03464523042358283.
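The accumulated ratios above can be reproduced from the listed percentages. The following is a minimal sketch, assuming the ratio is simply the sum of the per-character percentages divided by 100. The helper name accumulated_ratio is hypothetical and is not taken from BuildLangModel.py. Only the first four values are copied here.

```python
# Percentages of the four most frequent characters, copied from the log above.
top4_percent = [
    11.982174279221123,  # a
    11.377091186650933,  # e
    10.194793104001047,  # o
    8.025890207074799,   # s
]


def accumulated_ratio(percentages):
    """Sum per-character percentages and convert to a 0..1 ratio (assumed definition)."""
    return sum(percentages) / 100.0


ratio_top4 = accumulated_ratio(top4_percent)
print(ratio_top4)  # compare with the logged value of about 0.4158
```

Applying the same sum to entries [22] through [37] should give a value close to the logged 0.0346 for characters whose order is over 21, which suggests that figure covers only the 38 listed characters.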
1068 sequences found.
First 514 (typical positive ratio): 0.9950108744191293
Next 183 (697-514): 0.003990843236141739
Rest: 0.0009982823447289846
- Processing end: 2022-12-15 00:16:37.626796
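The sequence figures in the log can be sanity-checked with simple arithmetic. The sketch below assumes that "Rest" refers to the sequences left over after the first two buckets, out of the 1068 found, and that the three ratios are shares of all observed sequence occurrences. The variable names are illustrative only.

```python
# Values copied from the log above.
total_sequences = 1068
first_count, next_count = 514, 183
first_ratio = 0.9950108744191293
next_ratio = 0.003990843236141739
rest_ratio = 0.0009982823447289846

# Number of sequences not covered by the first two buckets (assumed meaning of "Rest").
rest_count = total_sequences - (first_count + next_count)

# If the three buckets partition all occurrences, their ratios should add up to about 1.
ratio_sum = first_ratio + next_ratio + rest_ratio

print(rest_count)
print(ratio_sum)
```

Under these assumptions, the 514 most frequent sequences account for roughly 99.5 % of all observed sequence occurrences.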