Mirror of https://gitlab.freedesktop.org/uchardet/uchardet.git (synced 2025-12-08 01:36:41 +08:00)
I was missing some characters, especially in the Slovak alphabet. Conversely, the Slovene alphabet does not use 4 letters of the basic ASCII alphabet.
151 lines
4.5 KiB
Plaintext
= Logs of language model for Slovene (sl) =

- Generated by BuildLangModel.py
- Started: 2021-03-21 12:30:22.611188
- Maximum depth: 4
- Max number of pages: 100

== Parsed pages ==

Ljubljana (revision 5468628)
1689 (revision 4230028)
1918 (revision 5249637)
1926 (revision 5456617)
1929 (revision 5444196)
1930. (revision 5118014)
2011 (revision 5469547)
25. junij (revision 5447338)
A1 (radio) (revision 5360678)
ACH Volley (revision 5089458)
AKC Metelkova mesto (revision 5323280)
Abecedarium (revision 5092193)
Academia operosorum Labacensis (revision 5228146)
Adam Bohorič (revision 5414191)
Ajdovščina (revision 5423173)
Albert Kosmač (revision 5368699)
Albin Belar (revision 5197298)
Aleksander Bajt (revision 4917916)
Aleksandrija (revision 5405515)
Aleš Kunaver (revision 5029295)
Alojzij Šuštar (revision 5442498)
Alpe (revision 5464842)
Amsterdam (revision 5359727)
Anastasius Grün (revision 5070788)
Andorra la Vella (revision 5390252)
Andrej Fleischmann (revision 4930149)
Andrej Smole (revision 5467820)
Angela Vode (revision 5466809)
Anica Cevc (revision 5414746)
Anja Bukovec (revision 5041799)
Anton Aleksander Auersperg (revision 5070788)
Anton Alojzij Wolf (revision 5361749)
Anton Bitenc (revision 5463597)
Anton Bonaventura Jeglič (revision 5414522)
Anton Cerar (revision 5376771)
Anton Codelli (izumitelj) (revision 5161385)
Anton Foerster (revision 5270593)
Anton Gvajc (revision 5035801)
Anton Lajovic (revision 4867406)
Anton Melik (revision 5272303)
Anton Ocvirk (revision 5470942)
Anton Peterlin (revision 4979305)
Anton Stres (revision 5464457)
Anton Tomaž Linhart (revision 5413399)
Anton Verovšek (revision 5412417)
Anton Vodnik (revision 5180239)
Anton Šivic (revision 5410565)
Antwerpen (revision 5375367)
Arena Stožice (revision 5462141)
Argentinski park, Ljubljana (revision 5398130)
Argonavti (revision 5425545)
Arne Hodalič (revision 5417283)
Art nouveau (revision 5371096)
Ateizem (revision 5427207)
Atene (revision 5360039)
Ati Soss (revision 5463553)
Atila (revision 5425308)
Avgusta Danilova (revision 4788392)
Avstro-Ogrska (revision 5431606)
Avtobusna postaja Ljubljana (revision 4479008)
Avtocesta A1 (revision 5292269)
Avtocesta A2 (revision 5387166)
Aškerčeva cesta, Ljubljana (revision 4578067)
BTC (revision 5450525)
Bajer (potok) (revision 5147457)
Bakrorez (revision 5375208)
Bangkok (revision 5378204)
Barje (revision 5180470)
Barok (revision 5463042)
Bejrut (revision 5356724)
Benetke (revision 5424094)
Beograd (revision 5448139)
Berlin (revision 5435344)
Bern (revision 5466493)
Biblija (revision 5404188)
Bicike(lj) (revision 5468628)
Bine Rogelj (revision 5086972)
Biodiverziteta (revision 5352270)
Bizoviški potok (revision 5305268)
Bled (revision 5469179)
Bleiweisova cesta, Ljubljana (revision 5184903)
Bogo Grafenauer (revision 5311308)
Bogota (revision 5363243)
Bojan Adamič (revision 5409135)
Bojan Čop (revision 5247252)
Bojan Štih (revision 5305724)
Boris Kobe (revision 5296972)
Boris Sket (revision 5413264)
Borut Lesjak (revision 5273043)
Botanični vrt, Ljubljana (revision 5142111)
Botanični vrt Ljubljana (revision 5142111)
Botanični vrt Univerze v Ljubljani (revision 5142111)
Bovec (revision 5330651)
Boštjan Putrih (revision 5124433)
Boštjan Žekš (revision 5415317)
Božena Ravnihar (revision 5415042)
Božo Vodušek (revision 5122962)
Božo Škerlj (revision 5268384)

== End of Parsed pages ==

- Wikipedia parsing ended at: 2021-03-21 12:38:56.631283

57 characters appeared 519434 times.

Most Frequent characters:
[ 0] Char e: 10.223242991409881 %
[ 1] Char a: 10.130257164529084 %
[ 2] Char i: 8.972265966417293 %
[ 3] Char o: 8.507144314773388 %
[ 4] Char n: 7.334329289187846 %
[ 5] Char r: 5.438226993227244 %
[ 6] Char s: 5.162157271183634 %
[ 7] Char l: 5.052614961669818 %
[ 8] Char t: 4.829679997843807 %
[ 9] Char j: 4.445223069725894 %
[10] Char v: 4.3826549667522725 %
[11] Char k: 3.543664835185991 %
[12] Char d: 3.1351432520782234 %
[13] Char p: 2.8430945991213514 %
[14] Char m: 2.7564618411578756 %
[15] Char u: 2.3069340859473964 %
[16] Char z: 2.0064146744340956 %
[17] Char b: 1.937300985303234 %
[18] Char g: 1.6027060223242993 %
[19] Char h: 1.1235306121663196 %
[20] Char č: 1.0794441642249064 %
[21] Char c: 1.048256371358055 %
[22] Char š: 0.9687467512715764 %
[23] Char ž: 0.5263421339380941 %
[24] Char f: 0.41391206582549467 %

The first 25 characters have an accumulated ratio of 0.9976974938105708.
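
As a cross-check, the accumulated ratio reported above appears to be the sum of the 25 listed percentages divided by 100. The following is a minimal Python sketch with the values copied from this log; the variable names are illustrative and the code is not taken from BuildLangModel.py.

```python
import math

# Percentages copied from the "Most Frequent characters" list in this log,
# in rank order [0]..[24] (e, a, i, o, n, r, s, l, t, j, v, k, d, p, m,
# u, z, b, g, h, č, c, š, ž, f).
percentages = [
    10.223242991409881, 10.130257164529084, 8.972265966417293,
    8.507144314773388, 7.334329289187846, 5.438226993227244,
    5.162157271183634, 5.052614961669818, 4.829679997843807,
    4.445223069725894, 4.3826549667522725, 3.543664835185991,
    3.1351432520782234, 2.8430945991213514, 2.7564618411578756,
    2.3069340859473964, 2.0064146744340956, 1.937300985303234,
    1.6027060223242993, 1.1235306121663196, 1.0794441642249064,
    1.048256371358055, 0.9687467512715764, 0.5263421339380941,
    0.41391206582549467,
]

# Convert the summed percentages to a ratio in [0, 1].
accumulated_ratio = sum(percentages) / 100

# Compare with the value reported in the log, allowing for float rounding.
reported_ratio = 0.9976974938105708
assert math.isclose(accumulated_ratio, reported_ratio, rel_tol=1e-9)
```

Under this reading, the remaining 32 of the 57 characters together account for roughly 0.23 % of all occurrences.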
880 sequences found.

First 449 (typical positive ratio): 0.9950499684040537
Next 172 (621-449): 0.003957684836286113
Rest: 0.000992346759660201

- Processing end: 2021-03-21 12:38:56.993560
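
The sequence figures above can be read as a three-way split of the 880 sequences whose ratios add up to 1. The sketch below checks that reading with the numbers from this log; treating "Rest" as the sequences beyond rank 621 is an assumption, and the variable names are illustrative rather than taken from BuildLangModel.py.

```python
import math

total_sequences = 880   # "880 sequences found."
first_count = 449       # "First 449"
second_boundary = 621   # upper rank in "Next 172 (621-449)"

# Size of each bucket, assuming "Rest" covers everything past rank 621.
next_count = second_boundary - first_count
rest_count = total_sequences - second_boundary

# Ratios copied from the log, in the order First / Next / Rest.
ratios = [0.9950499684040537, 0.003957684836286113, 0.000992346759660201]

# The three buckets should cover every sequence and the ratios should sum to 1.
assert first_count + next_count + rest_count == total_sequences
assert math.isclose(sum(ratios), 1.0, rel_tol=1e-9)
```

With these numbers the middle bucket holds 172 sequences, matching the log, and the rest bucket holds 259.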