Mirror of https://gitlab.freedesktop.org/uchardet/uchardet.git (synced 2025-12-07 01:06:40 +08:00)
Now making sure that we have a generic language model working with UTF-8 for all 26 supported models which had single-byte encoding support until now.
167 lines
5.7 KiB
Plaintext
= Logs of language model for Portuguese (pt) =

- Generated by BuildLangModel.py
- Started: 2021-03-16 19:54:55.771448
- Maximum depth: 4
- Max number of pages: 100

== Parsed pages ==

Papagaio-das-mascarenhas (revision 58875640)
Albinismo (revision 60544601)
Alfred Newton (revision 55613591)
Alphonse Milne-Edwards (revision 55360216)
Animalia (revision 59086849)
Asa (revision 59016280)
August von Pelzeln (revision 55658828)
Aves (revision 59780941)
Bico (revision 59270926)
BirdLife International (revision 60296296)
Carl Wilhelm Hahn (revision 58280895)
Carlos Lineu (revision 60424490)
Carolus Linnaeus (revision 60424490)
Cauda (revision 56806253)
Charles Lucien Bonaparte (revision 52587707)
Chordata (revision 60632448)
Cladograma (revision 55578666)
Classe (biologia) (revision 56051821)
Classificação científica (revision 59003514)
Coleção Leverian (revision 49939876)
Comores (revision 60033304)
Coracopsinae (revision 36946101)
Coracopsis nigra (revision 49364496)
Coracopsis vasa (revision 55904306)
Cylindraspis indica (revision 55039606)
Cúlmen (revision 59270926)
Digital object identifier (revision 59704276)
EBird (revision 54789725)
Eclectus roratus (revision 60346158)
Edward Newton (revision 52355291)
Enciclopédia da Vida (revision 53360339)
Endemismo (revision 59148596)
Epíteto específico (revision 58254455)
Espécie (revision 60480387)
Esquilo-vermelho (revision 59084882)
Estado de conservação (revision 60507425)
Extinção (revision 60618960)
Família (biologia) (revision 58605859)
Filo (revision 58307920)
Fossilworks Paleobiology Database (revision 60618977)
França (revision 60657760)
François-Nicolas Martinet (revision 43679514)
François Levaillant (revision 49358726)
Fredrik Hasselqvist (revision 52281786)
Fregilupus varius (revision 54591191)
Fumigação (revision 50600995)
George Robert Gray (revision 60662109)
Georges-Louis Leclerc, conde de Buffon (revision 53113664)
Global Biodiversity Information Facility (revision 59909217)
Género (biologia) (revision 60485207)
Hermann Schlegel (revision 58280671)
Herpetologista (revision 57406279)
Histoire Naturelle (revision 50957493)
Holótipo (revision 55228464)
INaturalist (revision 54028036)
ITIS (revision 59095296)
IUCN (revision 58907792)
Ilha da Reunião (revision 60519224)
Ilha vulcânica (revision 59932533)
Ilhas Mascarenhas (revision 60149877)
Ilhas Molucas (revision 58541748)
International Standard Book Number (revision 59096583)
Jacques Barraband (revision 45007769)
Jean Feuilley (revision 43140791)
Johann Georg Wagler (revision 58641840)
John Gerrard Keulemans (revision 49649801)
Julian Hume (revision 41876605)
Leiolopisma (revision 49675967)
Lionel Walter Rothschild (revision 60408276)
Lista Vermelha da IUCN (revision 59379270)
Lista Vermelha da União Internacional para a Conservação da Natureza e dos Recursos Naturais (revision 58907792)
Lista Vermelha de Espécies Ameaçadas da IUCN (revision 59379270)
Lista de aves extintas (revision 56678269)
Londres (revision 60339639)
Língua inglesa (revision 60421609)
Madagascar (revision 60519261)
Mascarenotus grucheti (revision 43145662)
Mathurin Jacques Brisson (revision 51922685)
Maurício (revision 60625767)
Maximiliano I José da Baviera (revision 58499194)
Melanina (revision 59475698)
Museu Nacional de História Natural (França) (revision 59928766)
National Center for Biotechnology Information (revision 59213569)
Naturhistorisches Museum (revision 51807264)
Nesoenas duboisi (revision 57384381)
Nome científico (revision 60480452)
Nomenclatura binomial (revision 60480452)
Nycticorax duboisi (revision 57384378)
Nível do mar (revision 59494064)
Ordem (biologia) (revision 56361837)
Otto Finsch (revision 52466524)
Papagaio (revision 60655174)
Papagaio-cinzento (revision 59484957)
Papagaio-cinzento-de-maurício (revision 58875653)
Pedro Mascarenhas (c. 1484-1555) (revision 49518171)
Periquito-de-maurício (revision 54615644)
Periquito-de-reunião (revision 54615645)
Peter Mundy (revision 58162914)
Piton des Neiges (revision 57212555)
Pleistoceno (revision 59637437)
Plumagem (revision 56296594)

== End of Parsed pages ==

- Wikipedia parsing ended at: 2021-03-16 19:59:19.802576

51 characters appeared 713201 times.

First 38 characters:
[ 0] Char a: 11.984419539512704 %
[ 1] Char e: 11.434925077222271 %
[ 2] Char o: 9.885712442915812 %
[ 3] Char s: 8.280835276450818 %
[ 4] Char i: 7.116787553578866 %
[ 5] Char r: 6.403664605069258 %
[ 6] Char n: 5.615948379208667 %
[ 7] Char d: 5.256442433479482 %
[ 8] Char t: 4.736673111787561 %
[ 9] Char m: 4.516118177063689 %
[10] Char c: 3.973213722358774 %
[11] Char u: 3.7191478979978996 %
[12] Char l: 3.1644655573954608 %
[13] Char p: 2.783647246708852 %
[14] Char g: 1.3397345208433526 %
[15] Char v: 1.3255730151808536 %
[16] Char f: 1.1414734415683656 %
[17] Char b: 0.9920064610116923 %
[18] Char h: 0.868759297869745 %
[19] Char ã: 0.7190118914583687 %
[20] Char é: 0.6653103402827534 %
[21] Char ç: 0.6455403175261952 %
[22] Char q: 0.5922594051326344 %
[23] Char í: 0.41138472884923044 %
[24] Char x: 0.3736674513916834 %
[25] Char á: 0.3452042271393338 %
[26] Char z: 0.3241722880366124 %
[27] Char ó: 0.2204147217965202 %
[28] Char ê: 0.204150022223749 %
[29] Char j: 0.2023272541681798 %
[30] Char õ: 0.17863126944578034 %
[31] Char y: 0.13222079049244184 %
[32] Char ú: 0.08819393130407838 %
[33] Char â: 0.08300605299207375 %
[34] Char w: 0.08174413664591049 %
[35] Char k: 0.07445306442363374 %
[36] Char à: 0.06688156634665403 %
[37] Char ô: 0.034492380128463083 %

The first 38 characters have an accumulated ratio of 0.9998261359700841.

929 sequences found.

First 512 (typical positive ratio): 0.9952990712503466
Next 512 (512-1024): 0.0008819393130407837
Rest: -7.806255641895632e-18

- Processing end: 2021-03-16 19:59:19.891534
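
The character statistics in this log (a per-character percentage, ranked by frequency, followed by the accumulated ratio of the top-ranked characters) can be illustrated with a minimal Python sketch. It is not taken from BuildLangModel.py: the helper names `char_frequencies` and `accumulated_ratio`, the `sample` text, and the choice to count every alphabetic character after lowercasing are all assumptions made for illustration, and the real script may filter or normalize characters differently.

```python
from collections import Counter


def char_frequencies(text):
    """Return (char, ratio) pairs for alphabetic characters, most frequent first.

    Hypothetical helper: lowercases the text and keeps only characters for
    which str.isalpha() is true (an assumption, not the script's actual rule).
    """
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    # most_common() orders entries by descending count.
    return [(char, count / total) for char, count in counts.most_common()]


def accumulated_ratio(freqs, n):
    """Sum the ratios of the n most frequent characters."""
    return sum(ratio for _, ratio in freqs[:n])


# Made-up sample text, used only to exercise the helpers.
sample = "A extinção da espécie ocorreu na ilha"
freqs = char_frequencies(sample)

# Print the top entries in a layout similar to the log above.
for rank, (char, ratio) in enumerate(freqs[:5]):
    print(f"[{rank:2}] Char {char}: {ratio * 100} %")

print(f"The first 5 characters have an accumulated ratio of {accumulated_ratio(freqs, 5)}.")
```

Under these assumptions, an accumulated ratio close to 1 for the first 38 characters, as reported above, would mean that the remaining 13 of the 51 observed characters account for only a tiny share of the corpus.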