mirror of
https://gitlab.freedesktop.org/uchardet/uchardet.git
synced 2025-12-06 08:46:40 +08:00
script, src: regenerate Russian models and add UTF-8/Russian support.
This fixes the broken Russian test in Windows-1251 which once again gets a much better score with Russian. Also this adds UTF-8 support. Same as Bulgarian, I wonder why I had not regenerated this earlier. The new UTF-8 test comes from the 'Сурки' page of Wikipedia in Russian. Note that now this broke the test zh:gb18030 (the score for KOI8-R / ru (0.766388) beats GB18030 / zh (0.700000)). I think I'll have to look a bit closer at our GB18030 dedicated prober.
This commit is contained in:
parent
60dcec8a82
commit
41d309e8a2
@ -156,6 +156,7 @@ uchardet started as a C language binding of the original C++ implementation of t
|
||||
* Windows-1250
|
||||
* IBM852
|
||||
* Russian
|
||||
* UTF-8
|
||||
* ISO-8859-5
|
||||
* KOI8-R
|
||||
* WINDOWS-1251
|
||||
|
||||
270
script/BuildLangModelLogs/LangRussianModel.log
Normal file
270
script/BuildLangModelLogs/LangRussianModel.log
Normal file
@ -0,0 +1,270 @@
|
||||
= Logs of language model for Russian (ru) =
|
||||
|
||||
- Generated by BuildLangModel.py
|
||||
- Started: 2022-12-17 19:53:30.416132
|
||||
- Maximum depth: 4
|
||||
- Max number of pages: 200
|
||||
|
||||
== Parsed pages ==
|
||||
|
||||
Пулмен (рабочий посёлок) (revision 127314030)
|
||||
Водонапорная башня (revision 123368499)
|
||||
Обама, Барак (revision 127312814)
|
||||
Историзм (искусство) (revision 125199154)
|
||||
Насосная станция (revision 126671775)
|
||||
Школьный округ (revision 118138873)
|
||||
Конденсат (revision 97819205)
|
||||
1880-е годы (revision 124959394)
|
||||
Линкольн, Роберт Тодд (revision 126851305)
|
||||
Габарит подвижного состава (revision 127265050)
|
||||
Межвоенный период (revision 123201828)
|
||||
Гражданская война в США (revision 127311614)
|
||||
История евреев в США (revision 123703208)
|
||||
Англо-занзибарская война (revision 127263956)
|
||||
Линкольн, Джесси Харлан (revision 87795509)
|
||||
Бенкен, Герман (revision 120809711)
|
||||
УралГАХУ (revision 126489964)
|
||||
Великобритания (revision 127175319)
|
||||
Фленсбург (revision 126961771)
|
||||
Мещанство (revision 127304945)
|
||||
Прохоров, Александр Михайлович (revision 127233579)
|
||||
VIAF (revision 122626337)
|
||||
Национальная библиотека Чешской Республики (revision 124152023)
|
||||
Регулирующая арматура (revision 116046805)
|
||||
Раннее Средневековье (revision 126932807)
|
||||
Европейская интеграция (revision 125721443)
|
||||
Бойл, Уиллард (revision 120835257)
|
||||
Бут, Эдвин (revision 126437526)
|
||||
Московский трамвай (revision 127184149)
|
||||
Лондонский метрополитен (revision 126810923)
|
||||
F-18 (revision 127113399)
|
||||
Сацумско-британская война (revision 124671983)
|
||||
Луизианская покупка (revision 123941200)
|
||||
Община (Германия) (revision 125007479)
|
||||
Запорная арматура (revision 121220496)
|
||||
Новая Англия (revision 125214368)
|
||||
Берни Сандерс (revision 126983575)
|
||||
Бак (резервуар) (revision 126670363)
|
||||
Хемингуэй, Эрнест (revision 126959711)
|
||||
2021 год (revision 127125948)
|
||||
1951 год (revision 126285688)
|
||||
Жидкость (revision 127133343)
|
||||
Большая советская энциклопедия (revision 127144085)
|
||||
Россия (revision 127297047)
|
||||
CSS Virginia (revision 121318647)
|
||||
Школа реки Гудзон (revision 123627995)
|
||||
Водозаборные сооружения (revision 123836554)
|
||||
Ривера, Диего (revision 125976771)
|
||||
Квантовая физика (revision 126896053)
|
||||
Рочестер (Нью-Йорк) (revision 126016553)
|
||||
Конденсация (теплотехника) (revision 123837631)
|
||||
Средиземноморская Антанта (revision 125156636)
|
||||
Историография (revision 121180824)
|
||||
Гбови, Лейма (revision 124860814)
|
||||
Премудрый пискарь (revision 121359555)
|
||||
Люнебургская водонапорная башня (revision 117681965)
|
||||
XVIII век (revision 126913825)
|
||||
Сислей, Альфред (revision 127063100)
|
||||
Средние века (revision 127154753)
|
||||
Энциклопедический словарь Брокгауза и Ефрона (revision 125357601)
|
||||
Нефтепровод (revision 123810227)
|
||||
Нефть (revision 126997759)
|
||||
Вентиляция (revision 126675588)
|
||||
Цилиндр (revision 126783664)
|
||||
Английский язык (revision 127275941)
|
||||
Бензин (revision 126966322)
|
||||
Министр по делам ветеранов США (revision 124072400)
|
||||
Первобытное общество (revision 127057340)
|
||||
Пикассо, Пабло (revision 126869217)
|
||||
Рисунок в разрезе (revision 121960314)
|
||||
Междупутье (revision 125745955)
|
||||
Битва при Форт-Генри (revision 123999672)
|
||||
Канал (водный) (revision 123736265)
|
||||
Белорусская народная республика (revision 126958885)
|
||||
25 апреля (revision 127246597)
|
||||
Насос (revision 126768788)
|
||||
Теннесси (revision 124804069)
|
||||
Локомотив (revision 127032264)
|
||||
Габарит погрузки (revision 123372556)
|
||||
Вебби (revision 121964659)
|
||||
Алегзандрия (Виргиния) (revision 126338837)
|
||||
Война Фаррапус (revision 125765352)
|
||||
Образование в США (revision 126788195)
|
||||
Пресс-конференция (revision 127075029)
|
||||
Рио-де-Жанейро (revision 127002708)
|
||||
Габарит приближения строений (revision 117538368)
|
||||
Международный идентификатор стандартных наименований (revision 120216410)
|
||||
Мопассан, Ги де (revision 127086462)
|
||||
История Европейского союза (revision 123952687)
|
||||
Прусский социализм (revision 127165836)
|
||||
Библиотека Александрина (revision 126093192)
|
||||
Тэйкан-дзукури (revision 124877986)
|
||||
1883 год (revision 125476166)
|
||||
Конфликт на Китайско-Восточной железной дороге (revision 122499702)
|
||||
Энергетический уровень (revision 119322956)
|
||||
Алюминий (revision 126861293)
|
||||
Санкт-петербургский трамвай (revision 127306763)
|
||||
Национальная библиотека Франции (revision 127015965)
|
||||
12 мая (revision 127207333)
|
||||
Граммофон (revision 126498827)
|
||||
Маккьяйоли (revision 126836176)
|
||||
Канализационная установка (revision 123736401)
|
||||
Газ (revision 126950046)
|
||||
Луизиана (revision 127312945)
|
||||
Память Парижской Коммуны (revision 126960401)
|
||||
Сталь (revision 127216605)
|
||||
Семья Барака Обамы (revision 124529726)
|
||||
Поверхностный насос (revision 121146223)
|
||||
Каразин, Николай Николаевич (revision 127097562)
|
||||
Кирпичная готика (revision 125337841)
|
||||
The Century Magazine (revision 127098805)
|
||||
Контрольный номер Библиотеки Конгресса (revision 113360170)
|
||||
Русско-персидская война (1804—1813) (revision 126999654)
|
||||
Берн (revision 122913269)
|
||||
Поздняя античность (revision 127266287)
|
||||
Гарвардский университет (revision 127033732)
|
||||
Бои на Халхин-Голе (revision 126542980)
|
||||
Алый знак доблести (фильм, 1951) (revision 120728355)
|
||||
Водопровод (revision 127182411)
|
||||
Пар (revision 126003244)
|
||||
1971 год (revision 127068279)
|
||||
Искусство Древнего Египта (revision 125737336)
|
||||
Пенсильванский университет (Индиана) (revision 123963620)
|
||||
Национальная библиотека Израиля (revision 126108080)
|
||||
1884 год (revision 125476122)
|
||||
Проезд снаружи поездов (revision 127239100)
|
||||
Норвегия (revision 126986958)
|
||||
Барбур, Джеймс (revision 126851158)
|
||||
Французская интервенция в Испанию (revision 119666106)
|
||||
Англия (revision 127268120)
|
||||
Галлатин, Альберт (revision 127160198)
|
||||
Калифорния (revision 127027363)
|
||||
Роял, Кеннет Клайборн (revision 110605693)
|
||||
США (revision 126887888)
|
||||
Федеральная архитектура (revision 116000492)
|
||||
Конденсат Бозе — Эйнштейна (revision 125188375)
|
||||
Колонна (revision 126876842)
|
||||
1907 год (revision 127134918)
|
||||
13 сентября (revision 125587404)
|
||||
Генрих Лев (revision 126407574)
|
||||
Этрусское искусство (revision 123158050)
|
||||
Амальрик, Андрей Алексеевич (revision 126033545)
|
||||
9 декабря (revision 127201233)
|
||||
Селищи (22712000298) (revision 124521248)
|
||||
1798 год (revision 125783094)
|
||||
Мюледорф (Берн) (revision 121861015)
|
||||
Большая игра (revision 126891168)
|
||||
Битва (revision 124395796)
|
||||
Война не-персе (revision 127189710)
|
||||
Президентские выборы в США (2020) (revision 126639368)
|
||||
Площадь Карла Фаберже (revision 123223942)
|
||||
Банкрофт, Джордж (revision 126851184)
|
||||
Кобаяси, Макото (revision 121939251)
|
||||
Газойль (revision 123647640)
|
||||
Ватиканская апостольская библиотека (revision 124986491)
|
||||
Общественная собственность (revision 125722109)
|
||||
Славная революция (revision 122270271)
|
||||
Золя (revision 127092383)
|
||||
Офицер (revision 126230098)
|
||||
Метастабильное состояние (revision 118552209)
|
||||
Лыжные гонки (revision 124233040)
|
||||
Средиземное море (revision 126980465)
|
||||
Защитная арматура (revision 124665168)
|
||||
Президент Турции (revision 123861767)
|
||||
Макдональд, Артур (revision 123992590)
|
||||
Песок (revision 126799930)
|
||||
Сублимация (физика) (revision 127108939)
|
||||
Новицкий, Василий Фёдорович (revision 126350745)
|
||||
Список султанов Занзибара (revision 94020222)
|
||||
Туман (revision 124866163)
|
||||
2005 год (revision 127291761)
|
||||
Исламская Республика Афганистан (revision 126605442)
|
||||
Викисловарь (revision 126840626)
|
||||
22 января (revision 126465130)
|
||||
Российская национальная библиотека (revision 126055277)
|
||||
Наука в США (revision 124150312)
|
||||
Екатеринбургский завод (revision 125779202)
|
||||
Океания (revision 125374219)
|
||||
Нидершерли (revision 116230829)
|
||||
Война за австрийское наследство (revision 126874381)
|
||||
Доминиканская Республика (revision 127046641)
|
||||
Военный паровоз (revision 124117506)
|
||||
Подземные воды (revision 126705165)
|
||||
5 сентября (revision 126628763)
|
||||
Кафка, Франц (revision 127130321)
|
||||
Двухванная печь (revision 123510834)
|
||||
Чертаново Южное (revision 122081039)
|
||||
|
||||
== End of Parsed pages ==
|
||||
|
||||
- Wikipedia parsing ended at: 2022-12-17 19:57:30.506110
|
||||
|
||||
63 characters appeared 2343890 times.
|
||||
|
||||
Most Frequent characters:
|
||||
[ 0] Char о: 10.136567842347551 %
|
||||
[ 1] Char и: 8.217151828797427 %
|
||||
[ 2] Char а: 7.941797609956098 %
|
||||
[ 3] Char е: 7.781337861418411 %
|
||||
[ 4] Char н: 6.689093771465385 %
|
||||
[ 5] Char с: 5.755304216494801 %
|
||||
[ 6] Char р: 5.58695160609073 %
|
||||
[ 7] Char т: 5.486136294791991 %
|
||||
[ 8] Char в: 4.621547939536412 %
|
||||
[ 9] Char л: 4.156039745892512 %
|
||||
[10] Char к: 3.458694733967891 %
|
||||
[11] Char м: 2.899666793236884 %
|
||||
[12] Char д: 2.856064064439884 %
|
||||
[13] Char п: 2.69799350652121 %
|
||||
[14] Char у: 2.0648579924825823 %
|
||||
[15] Char я: 1.9596482770095867 %
|
||||
[16] Char г: 1.812798382176638 %
|
||||
[17] Char ы: 1.7729500957809452 %
|
||||
[18] Char б: 1.5043794717328884 %
|
||||
[19] Char з: 1.4936707780655236 %
|
||||
[20] Char й: 1.4190938994577391 %
|
||||
[21] Char ь: 1.2650764327677493 %
|
||||
[22] Char ч: 1.0549983147673312 %
|
||||
[23] Char х: 1.0016255029032932 %
|
||||
[24] Char ж: 0.7652236239755279 %
|
||||
[25] Char ц: 0.5965297006258826 %
|
||||
[26] Char ю: 0.5917513193878552 %
|
||||
[27] Char ш: 0.5520310253467526 %
|
||||
[28] Char ф: 0.4393977533075358 %
|
||||
[29] Char щ: 0.3068403380704726 %
|
||||
[30] Char э: 0.3063710327703092 %
|
||||
[31] Char i: 0.25978181569954223 %
|
||||
[32] Char ё: 0.24984107615971737 %
|
||||
[33] Char e: 0.2357619171548153 %
|
||||
[34] Char a: 0.21839762104876934 %
|
||||
[35] Char n: 0.18004257878996027 %
|
||||
[36] Char r: 0.1703151598411188 %
|
||||
[37] Char t: 0.16216631326555428 %
|
||||
[38] Char s: 0.15969179441014725 %
|
||||
[39] Char o: 0.1568759626091668 %
|
||||
[40] Char l: 0.1263711180985456 %
|
||||
[41] Char c: 0.09795681537956133 %
|
||||
[42] Char d: 0.08571221345711616 %
|
||||
[43] Char h: 0.07956858043679524 %
|
||||
[44] Char m: 0.07009714619713382 %
|
||||
[45] Char u: 0.0688598867694303 %
|
||||
[46] Char x: 0.05725524661993524 %
|
||||
[47] Char p: 0.05644462837419845 %
|
||||
[48] Char b: 0.05482339188272487 %
|
||||
[49] Char g: 0.051111613599614324 %
|
||||
[50] Char f: 0.05038632359027087 %
|
||||
[51] Char y: 0.04923439239896071 %
|
||||
[52] Char v: 0.0470158582527337 %
|
||||
[53] Char ъ: 0.03617917223077875 %
|
||||
|
||||
The first 54 characters have an accumulated ratio of 0.9991548238185242.
|
||||
The first 5 characters have an accumulated ratio of 0.40765948913984873.
|
||||
All characters whose order is over 29 have an accumulated ratio of 0.030302616590369.
|
||||
|
||||
1554 sequences found.
|
||||
|
||||
First 819 (typical positive ratio): 0.9950050289366638
|
||||
Next 260 (1079-819): 0.003999322715788067
|
||||
Rest: 0.0009956483475481726
|
||||
|
||||
- Processing end: 2022-12-17 19:57:30.653466
|
||||
75
script/charsets/ibm855.py
Normal file
75
script/charsets/ibm855.py
Normal file
@ -0,0 +1,75 @@
|
||||
#!/usr/bin/python
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# ##### BEGIN LICENSE BLOCK #####
|
||||
# Version: MPL 1.1/GPL 2.0/LGPL 2.1
|
||||
#
|
||||
# The contents of this file are subject to the Mozilla Public License Version
|
||||
# 1.1 (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
# http://www.mozilla.org/MPL/
|
||||
#
|
||||
# Software distributed under the License is distributed on an "AS IS" basis,
|
||||
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
|
||||
# for the specific language governing rights and limitations under the
|
||||
# License.
|
||||
#
|
||||
# The Original Code is Mozilla Universal charset detector code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 2001
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Jehan <jehan@girinstud.io>
|
||||
#
|
||||
# Alternatively, the contents of this file may be used under the terms of
|
||||
# either the GNU General Public License Version 2 or later (the "GPL"), or
|
||||
# the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
|
||||
# in which case the provisions of the GPL or the LGPL are applicable instead
|
||||
# of those above. If you wish to allow use of your version of this file only
|
||||
# under the terms of either the GPL or the LGPL, and not to allow others to
|
||||
# use your version of this file under the terms of the MPL, indicate your
|
||||
# decision by deleting the provisions above and replace them with the notice
|
||||
# and other provisions required by the GPL or the LGPL. If you do not delete
|
||||
# the provisions above, a recipient may use your version of this file under
|
||||
# the terms of any one of the MPL, the GPL or the LGPL.
|
||||
#
|
||||
# ##### END LICENSE BLOCK #####
|
||||
|
||||
from codepoints import *
|
||||
|
||||
name = 'IBM855'
|
||||
aliases = ['CP855', 'OEM 855', 'MS-DOS Cyrillic']
|
||||
|
||||
language = \
|
||||
{
|
||||
# Wikipedia tells us: At one time it was widely used in Serbia, Macedonia
|
||||
# and Bulgaria, but it never caught on in Russia, where Code page 866 was more
|
||||
# common. This code page is not used much.
|
||||
'complete': [ 'sr', 'mk', 'bg', 'ru' ],
|
||||
'incomplete': []
|
||||
}
|
||||
|
||||
# X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF #
|
||||
charmap = \
|
||||
[
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, # 0X
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, # 1X
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # 2X
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, # 3X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 4X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,SYM, # 5X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 6X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,CTR, # 7X
|
||||
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 8X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 9X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM, # AX
|
||||
SYM,SYM,SYM,SYM,SYM,LET,LET,LET,LET,SYM,SYM,SYM,SYM,LET,LET,SYM, # BX
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,LET,LET,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # CX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,LET,LET,SYM, # DX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM, # EX
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM, # FX
|
||||
]
|
||||
72
script/charsets/ibm866.py
Normal file
72
script/charsets/ibm866.py
Normal file
@ -0,0 +1,72 @@
|
||||
#!/usr/bin/python
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# ##### BEGIN LICENSE BLOCK #####
|
||||
# Version: MPL 1.1/GPL 2.0/LGPL 2.1
|
||||
#
|
||||
# The contents of this file are subject to the Mozilla Public License Version
|
||||
# 1.1 (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
# http://www.mozilla.org/MPL/
|
||||
#
|
||||
# Software distributed under the License is distributed on an "AS IS" basis,
|
||||
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
|
||||
# for the specific language governing rights and limitations under the
|
||||
# License.
|
||||
#
|
||||
# The Original Code is Mozilla Universal charset detector code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 2001
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Jehan <jehan@girinstud.io>
|
||||
#
|
||||
# Alternatively, the contents of this file may be used under the terms of
|
||||
# either the GNU General Public License Version 2 or later (the "GPL"), or
|
||||
# the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
|
||||
# in which case the provisions of the GPL or the LGPL are applicable instead
|
||||
# of those above. If you wish to allow use of your version of this file only
|
||||
# under the terms of either the GPL or the LGPL, and not to allow others to
|
||||
# use your version of this file under the terms of the MPL, indicate your
|
||||
# decision by deleting the provisions above and replace them with the notice
|
||||
# and other provisions required by the GPL or the LGPL. If you do not delete
|
||||
# the provisions above, a recipient may use your version of this file under
|
||||
# the terms of any one of the MPL, the GPL or the LGPL.
|
||||
#
|
||||
# ##### END LICENSE BLOCK #####
|
||||
|
||||
from codepoints import *
|
||||
|
||||
name = 'IBM866'
|
||||
aliases = ['CP866', 'DOS Cyrillic Russian']
|
||||
|
||||
language = \
|
||||
{
|
||||
'complete': [ 'bg', 'ru' ],
|
||||
'incomplete': [ 'uk', 'be' ]
|
||||
}
|
||||
|
||||
# X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF #
|
||||
charmap = \
|
||||
[
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, # 0X
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, # 1X
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # 2X
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, # 3X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 4X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,SYM, # 5X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 6X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,CTR, # 7X
|
||||
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 8X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 9X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # AX
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # BX
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # CX
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # DX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # EX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # FX
|
||||
]
|
||||
74
script/charsets/koi8-r.py
Normal file
74
script/charsets/koi8-r.py
Normal file
@ -0,0 +1,74 @@
|
||||
#!/usr/bin/python
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# ##### BEGIN LICENSE BLOCK #####
|
||||
# Version: MPL 1.1/GPL 2.0/LGPL 2.1
|
||||
#
|
||||
# The contents of this file are subject to the Mozilla Public License Version
|
||||
# 1.1 (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
# http://www.mozilla.org/MPL/
|
||||
#
|
||||
# Software distributed under the License is distributed on an "AS IS" basis,
|
||||
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
|
||||
# for the specific language governing rights and limitations under the
|
||||
# License.
|
||||
#
|
||||
# The Original Code is Mozilla Universal charset detector code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 2001
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Jehan <jehan@girinstud.io>
|
||||
#
|
||||
# Alternatively, the contents of this file may be used under the terms of
|
||||
# either the GNU General Public License Version 2 or later (the "GPL"), or
|
||||
# the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
|
||||
# in which case the provisions of the GPL or the LGPL are applicable instead
|
||||
# of those above. If you wish to allow use of your version of this file only
|
||||
# under the terms of either the GPL or the LGPL, and not to allow others to
|
||||
# use your version of this file under the terms of the MPL, indicate your
|
||||
# decision by deleting the provisions above and replace them with the notice
|
||||
# and other provisions required by the GPL or the LGPL. If you do not delete
|
||||
# the provisions above, a recipient may use your version of this file under
|
||||
# the terms of any one of the MPL, the GPL or the LGPL.
|
||||
#
|
||||
# ##### END LICENSE BLOCK #####
|
||||
|
||||
from codepoints import *
|
||||
|
||||
name = 'KOI8-R'
|
||||
aliases = ['csKOI8R']
|
||||
|
||||
language = \
|
||||
{
|
||||
# KOI8-R is an 8-bit character encoding, designed to cover Russian, which
|
||||
# uses a Cyrillic alphabet. It also happens to cover Bulgarian, but has not
|
||||
# been used for that purpose since CP1251 was accepted.
|
||||
'complete': [ 'ru', 'bg' ],
|
||||
'incomplete': []
|
||||
}
|
||||
|
||||
# X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF #
|
||||
charmap = \
|
||||
[
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, # 0X
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, # 1X
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # 2X
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, # 3X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 4X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,SYM, # 5X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 6X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,CTR, # 7X
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # 8X
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # 9X
|
||||
SYM,SYM,SYM,LET,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # AX
|
||||
SYM,SYM,SYM,LET,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # BX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # CX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # DX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # EX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # FX
|
||||
]
|
||||
72
script/charsets/mac-cyrillic.py
Normal file
72
script/charsets/mac-cyrillic.py
Normal file
@ -0,0 +1,72 @@
|
||||
#!/usr/bin/python
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# ##### BEGIN LICENSE BLOCK #####
|
||||
# Version: MPL 1.1/GPL 2.0/LGPL 2.1
|
||||
#
|
||||
# The contents of this file are subject to the Mozilla Public License Version
|
||||
# 1.1 (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
# http://www.mozilla.org/MPL/
|
||||
#
|
||||
# Software distributed under the License is distributed on an "AS IS" basis,
|
||||
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
|
||||
# for the specific language governing rights and limitations under the
|
||||
# License.
|
||||
#
|
||||
# The Original Code is Mozilla Universal charset detector code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 2001
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Jehan <jehan@girinstud.io>
|
||||
#
|
||||
# Alternatively, the contents of this file may be used under the terms of
|
||||
# either the GNU General Public License Version 2 or later (the "GPL"), or
|
||||
# the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
|
||||
# in which case the provisions of the GPL or the LGPL are applicable instead
|
||||
# of those above. If you wish to allow use of your version of this file only
|
||||
# under the terms of either the GPL or the LGPL, and not to allow others to
|
||||
# use your version of this file under the terms of the MPL, indicate your
|
||||
# decision by deleting the provisions above and replace them with the notice
|
||||
# and other provisions required by the GPL or the LGPL. If you do not delete
|
||||
# the provisions above, a recipient may use your version of this file under
|
||||
# the terms of any one of the MPL, the GPL or the LGPL.
|
||||
#
|
||||
# ##### END LICENSE BLOCK #####
|
||||
|
||||
from codepoints import *
|
||||
|
||||
name = 'MAC-CYRILLIC'
|
||||
aliases = ['x-mac-cyrillic' ]
|
||||
|
||||
language = \
|
||||
{
|
||||
'complete': [ 'bg', 'ru' ],
|
||||
'incomplete': [ 'uk', 'be' ]
|
||||
}
|
||||
|
||||
# X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF #
|
||||
charmap = \
|
||||
[
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, # 0X
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, # 1X
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, # 2X
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, # 3X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 4X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,SYM, # 5X
|
||||
SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 6X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM,SYM,SYM,SYM,CTR, # 7X
|
||||
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 8X
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # 9X
|
||||
SYM,SYM,LET,SYM,SYM,SYM,SYM,LET,SYM,SYM,SYM,LET,LET,SYM,LET,LET, # AX
|
||||
SYM,SYM,SYM,SYM,LET,SYM,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # BX
|
||||
LET,LET,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,LET,LET,LET,LET,LET, # CX
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,LET,LET,LET,LET,SYM,LET,LET,LET, # DX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET, # EX
|
||||
LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,LET,SYM, # FX
|
||||
]
|
||||
58
script/langs/ru.py
Normal file
58
script/langs/ru.py
Normal file
@ -0,0 +1,58 @@
|
||||
#!/bin/python3
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# ##### BEGIN LICENSE BLOCK #####
|
||||
# Version: MPL 1.1/GPL 2.0/LGPL 2.1
|
||||
#
|
||||
# The contents of this file are subject to the Mozilla Public License Version
|
||||
# 1.1 (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at
|
||||
# http://www.mozilla.org/MPL/
|
||||
#
|
||||
# Software distributed under the License is distributed on an "AS IS" basis,
|
||||
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
|
||||
# for the specific language governing rights and limitations under the
|
||||
# License.
|
||||
#
|
||||
# The Original Code is Mozilla Universal charset detector code.
|
||||
#
|
||||
# The Initial Developer of the Original Code is
|
||||
# Netscape Communications Corporation.
|
||||
# Portions created by the Initial Developer are Copyright (C) 2001
|
||||
# the Initial Developer. All Rights Reserved.
|
||||
#
|
||||
# Contributor(s):
|
||||
# Jehan <jehan@girinstud.io>
|
||||
#
|
||||
# Alternatively, the contents of this file may be used under the terms of
|
||||
# either the GNU General Public License Version 2 or later (the "GPL"), or
|
||||
# the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
|
||||
# in which case the provisions of the GPL or the LGPL are applicable instead
|
||||
# of those above. If you wish to allow use of your version of this file only
|
||||
# under the terms of either the GPL or the LGPL, and not to allow others to
|
||||
# use your version of this file under the terms of the MPL, indicate your
|
||||
# decision by deleting the provisions above and replace them with the notice
|
||||
# and other provisions required by the GPL or the LGPL. If you do not delete
|
||||
# the provisions above, a recipient may use your version of this file under
|
||||
# the terms of any one of the MPL, the GPL or the LGPL.
|
||||
#
|
||||
# ##### END LICENSE BLOCK #####
|
||||
|
||||
import re
|
||||
|
||||
## Mandatory Properties ##
|
||||
|
||||
name = 'Russian'
|
||||
code = 'ru'
|
||||
use_ascii = False
|
||||
charsets = [ 'WINDOWS-1251', 'ISO-8859-5', 'KOI8-R', 'IBM855', 'IBM866', 'MAC-CYRILLIC' ]
|
||||
|
||||
## Optional Properties ##
|
||||
|
||||
# Alphabet characters.
|
||||
alphabet = 'абвгдеёжзийклмнопрстуфхцчшщъыьэюя'
|
||||
# A starred page which was rewarded on the main page when I created
|
||||
# the data.
|
||||
start_pages = ['Пулмен (рабочий посёлок)']
|
||||
wikipedia_code = code
|
||||
case_mapping = True
|
||||
@ -36,332 +36,374 @@
|
||||
* ***** END LICENSE BLOCK ***** */
|
||||
|
||||
#include "../nsSBCharSetProber.h"
|
||||
#include "../nsLanguageDetector.h"
|
||||
|
||||
/********* Language model for: Russian *********/
|
||||
|
||||
/**
|
||||
* Generated by BuildLangModel.py
|
||||
* On: 2022-12-17 19:57:30.506433
|
||||
**/
|
||||
|
||||
//KOI8-R language model
|
||||
//Character Mapping Table:
|
||||
static const unsigned char KOI8R_CharToOrderMap[] =
|
||||
/* Character Mapping Table:
|
||||
* ILL: illegal character.
|
||||
* CTR: control character specific to the charset.
|
||||
* RET: carriage/return.
|
||||
* SYM: symbol (punctuation) that does not belong to word.
|
||||
* NUM: 0 - 9.
|
||||
*
|
||||
* Other characters are ordered by probabilities
|
||||
* (0 is the most common character in the language).
|
||||
*
|
||||
* Orders are generic to a language. So the codepoint with order X in
|
||||
* CHARSET1 maps to the same character as the codepoint with the same
|
||||
* order X in CHARSET2 for the same language.
|
||||
* As such, it is possible to get missing order. For instance the
|
||||
* ligature of 'o' and 'e' exists in ISO-8859-15 but not in ISO-8859-1
|
||||
* even though they are both used for French. Same for the euro sign.
|
||||
*/
|
||||
static const unsigned char Windows_1251_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, //00
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, //10
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, //20
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, //30
|
||||
SYM,142,143,144,145,146,147,148,149,150,151,152, 74,153, 75,154, //40
|
||||
155,156,157,158,159,160,161,162,163,164,165,SYM,SYM,SYM,SYM,SYM, //50
|
||||
SYM, 71,172, 66,173, 65,174, 76,175, 64,176,177, 77, 72,178, 69, //60
|
||||
67,179, 78, 73,180,181, 79,182,183,184,185,SYM,SYM,SYM,SYM,SYM, //70
|
||||
191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206, //80
|
||||
207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222, //90
|
||||
223,224,225, 68,226,227,228,229,230,231,232,233,234,235,236,237, //a0
|
||||
238,239,240,241,242,243,244,245,246,247,248,249,250,251,NUM,SYM, //b0
|
||||
27, 3, 21, 28, 13, 2, 39, 19, 26, 4, 23, 11, 8, 12, 5, 1, //c0
|
||||
15, 16, 9, 7, 6, 14, 24, 10, 17, 18, 20, 25, 30, 29, 22, 54, //d0
|
||||
59, 37, 44, 58, 41, 48, 53, 46, 55, 42, 60, 36, 49, 38, 31, 34, //e0
|
||||
35, 43, 45, 32, 40, 52, 56, 33, 61, 62, 51, 57, 47, 63, 50, 70, //f0
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, /* 0X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 1X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 2X */
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, /* 3X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 4X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,SYM, /* 5X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 6X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,CTR, /* 7X */
|
||||
63, 64,SYM, 65,SYM,SYM,SYM,SYM,SYM,SYM, 66,SYM, 67, 68, 69, 70, /* 8X */
|
||||
71,SYM,SYM,SYM,SYM,SYM,SYM,SYM,ILL,SYM, 72,SYM, 73, 74, 75, 76, /* 9X */
|
||||
SYM, 60, 60, 77,SYM, 62,SYM,SYM, 32,SYM, 78,SYM,SYM,SYM,SYM, 61, /* AX */
|
||||
SYM,SYM, 59, 59, 62,SYM,SYM,SYM, 32,SYM, 79,SYM, 80, 81, 82, 61, /* BX */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* CX */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26, 15, /* DX */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* EX */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26, 15, /* FX */
|
||||
};
|
||||
/*X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF */
|
||||
|
||||
static const unsigned char Iso_8859_5_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, /* 0X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 1X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 2X */
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, /* 3X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 4X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,SYM, /* 5X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 6X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,CTR, /* 7X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 8X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 9X */
|
||||
SYM, 32, 83, 84, 85, 86, 59, 61, 87, 88, 89, 90, 91,SYM, 60, 92, /* AX */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* BX */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26, 15, /* CX */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* DX */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26, 15, /* EX */
|
||||
SYM, 32, 93, 94, 95, 96, 59, 61, 97, 98, 99,100,101,SYM, 60,102, /* FX */
|
||||
};
|
||||
/*X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF */
|
||||
|
||||
static const unsigned char Koi8_R_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, /* 0X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 1X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 2X */
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, /* 3X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 4X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,SYM, /* 5X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 6X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,CTR, /* 7X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 8X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 9X */
|
||||
SYM,SYM,SYM, 32,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* AX */
|
||||
SYM,SYM,SYM, 32,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* BX */
|
||||
26, 2, 18, 25, 12, 3, 28, 16, 23, 1, 20, 10, 9, 11, 4, 0, /* CX */
|
||||
13, 15, 6, 5, 7, 14, 24, 8, 21, 17, 19, 27, 30, 29, 22, 53, /* DX */
|
||||
26, 2, 18, 25, 12, 3, 28, 16, 23, 1, 20, 10, 9, 11, 4, 0, /* EX */
|
||||
13, 15, 6, 5, 7, 14, 24, 8, 21, 17, 19, 27, 30, 29, 22, 53, /* FX */
|
||||
};
|
||||
/*X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF */
|
||||
|
||||
static const unsigned char Ibm855_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, /* 0X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 1X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 2X */
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, /* 3X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 4X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,SYM, /* 5X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 6X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,CTR, /* 7X */
|
||||
103,104,105,106, 32, 32,107,108,109,110, 59, 59, 61, 61,111,112, /* 8X */
|
||||
113,114,115,116,117,118,119,120, 60, 60,121,122, 26, 26, 53, 53, /* 9X */
|
||||
2, 2, 18, 18, 25, 25, 12, 12, 3, 3, 28, 28, 16, 16,SYM,SYM, /* AX */
|
||||
SYM,SYM,SYM,SYM,SYM, 23, 23, 1, 1,SYM,SYM,SYM,SYM, 20, 20,SYM, /* BX */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM, 10, 10,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* CX */
|
||||
9, 9, 11, 11, 4, 4, 0, 0, 13,SYM,SYM,SYM,SYM, 13, 15,SYM, /* DX */
|
||||
15, 6, 6, 5, 5, 7, 7, 14, 14, 24, 24, 8, 8, 21, 21,SYM, /* EX */
|
||||
SYM, 17, 17, 19, 19, 27, 27, 30, 30, 29, 29, 22, 22,SYM,SYM,SYM, /* FX */
|
||||
};
|
||||
/*X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF */
|
||||
|
||||
static const unsigned char Ibm866_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, /* 0X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 1X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 2X */
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, /* 3X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 4X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,SYM, /* 5X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 6X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,CTR, /* 7X */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* 8X */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26, 15, /* 9X */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* AX */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* BX */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* CX */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* DX */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26, 15, /* EX */
|
||||
32, 32,123,124, 61, 61, 60, 60,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* FX */
|
||||
};
|
||||
/*X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF */
|
||||
|
||||
static const unsigned char Mac_Cyrillic_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, /* 0X */
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, /* 1X */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, /* 2X */
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, /* 3X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 4X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,SYM, /* 5X */
|
||||
SYM, 34, 48, 41, 42, 33, 50, 49, 43, 31, 56, 54, 40, 44, 35, 39, /* 6X */
|
||||
47, 58, 36, 38, 37, 45, 52, 55, 46, 51, 57,SYM,SYM,SYM,SYM,CTR, /* 7X */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* 8X */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26, 15, /* 9X */
|
||||
SYM,SYM, 62,SYM,SYM,SYM,SYM, 59,SYM,SYM,SYM,125,126,SYM,127,128, /* AX */
|
||||
SYM,SYM,SYM,SYM, 59,SYM, 62,129,130,131, 61, 61,132,133,134,135, /* BX */
|
||||
136,137,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,138,139,140,141,142, /* CX */
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, 60, 60,143,144,SYM, 32, 32, 15, /* DX */
|
||||
2, 18, 8, 16, 12, 3, 24, 19, 1, 20, 10, 9, 11, 4, 0, 13, /* EX */
|
||||
6, 5, 7, 14, 28, 23, 25, 22, 27, 29, 53, 17, 21, 30, 26,SYM, /* FX */
|
||||
};
|
||||
/*X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 XA XB XC XD XE XF */
|
||||
|
||||
static const int Unicode_Char_size = 108;
|
||||
static const unsigned int Unicode_CharOrder[] =
|
||||
{
|
||||
65, 34, 66, 48, 67, 41, 68, 42, 69, 33, 70, 50, 71, 49, 72, 43,
|
||||
73, 31, 76, 40, 77, 44, 78, 35, 79, 39, 80, 47, 82, 36, 83, 38,
|
||||
84, 37, 85, 45, 86, 52, 88, 46, 89, 51, 97, 34, 98, 48, 99, 41,
|
||||
100, 42, 101, 33, 102, 50, 103, 49, 104, 43, 105, 31, 108, 40, 109, 44,
|
||||
110, 35, 111, 39, 112, 47, 114, 36, 115, 38, 116, 37, 117, 45, 118, 52,
|
||||
120, 46, 121, 51, 1025, 32, 1040, 2, 1041, 18, 1042, 8, 1043, 16,1044, 12,
|
||||
1045, 3, 1046, 24, 1047, 19, 1048, 1, 1049, 20, 1050, 10, 1051, 9,1052, 11,
|
||||
1053, 4, 1054, 0, 1055, 13, 1056, 6, 1057, 5, 1058, 7, 1059, 14,1060, 28,
|
||||
1061, 23, 1062, 25, 1063, 22, 1064, 27, 1065, 29, 1066, 53, 1067, 17,1068, 21,
|
||||
1069, 30, 1070, 26, 1071, 15, 1072, 2, 1073, 18, 1074, 8, 1075, 16,1076, 12,
|
||||
1077, 3, 1078, 24, 1079, 19, 1080, 1, 1081, 20, 1082, 10, 1083, 9,1084, 11,
|
||||
1085, 4, 1086, 0, 1087, 13, 1088, 6, 1089, 5, 1090, 7, 1091, 14,1092, 28,
|
||||
1093, 23, 1094, 25, 1095, 22, 1096, 27, 1097, 29, 1098, 53, 1099, 17,1100, 21,
|
||||
1101, 30, 1102, 26, 1103, 15, 1105, 32,
|
||||
};
|
||||
|
||||
static const unsigned char win1251_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, //00
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, //10
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, //20
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, //30
|
||||
SYM,142,143,144,145,146,147,148,149,150,151,152, 74,153, 75,154, //40
|
||||
155,156,157,158,159,160,161,162,163,164,165,SYM,SYM,SYM,SYM,SYM, //50
|
||||
SYM, 71,172, 66,173, 65,174, 76,175, 64,176,177, 77, 72,178, 69, //60
|
||||
67,179, 78, 73,180,181, 79,182,183,184,185,SYM,SYM,SYM,SYM,SYM, //70
|
||||
191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,
|
||||
207,208,209,210,211,212,213,214,ILL,216,217,218,219,220,221,222,
|
||||
223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,
|
||||
239,240,241,242,243,244,245,246, 68,247,248,249,250,251,NUM,SYM,
|
||||
37, 44, 33, 46, 41, 48, 56, 51, 42, 60, 36, 49, 38, 31, 34, 35,
|
||||
45, 32, 40, 52, 53, 55, 58, 50, 57, 63, 70, 62, 61, 47, 59, 43,
|
||||
3, 21, 10, 19, 13, 2, 24, 20, 4, 23, 11, 8, 12, 5, 1, 15,
|
||||
9, 7, 6, 14, 39, 26, 28, 22, 25, 29, 54, 18, 17, 30, 27, 16,
|
||||
};
|
||||
|
||||
static const unsigned char latin5_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, //00
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, //10
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, //20
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, //30
|
||||
SYM,142,143,144,145,146,147,148,149,150,151,152, 74,153, 75,154, //40
|
||||
155,156,157,158,159,160,161,162,163,164,165,SYM,SYM,SYM,SYM,SYM, //50
|
||||
SYM, 71,172, 66,173, 65,174, 76,175, 64,176,177, 77, 72,178, 69, //60
|
||||
67,179, 78, 73,180,181, 79,182,183,184,185,SYM,SYM,SYM,SYM,SYM, //70
|
||||
191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,
|
||||
207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,
|
||||
223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,
|
||||
37, 44, 33, 46, 41, 48, 56, 51, 42, 60, 36, 49, 38, 31, 34, 35,
|
||||
45, 32, 40, 52, 53, 55, 58, 50, 57, 63, 70, 62, 61, 47, 59, 43,
|
||||
3, 21, 10, 19, 13, 2, 24, 20, 4, 23, 11, 8, 12, 5, 1, 15,
|
||||
9, 7, 6, 14, 39, 26, 28, 22, 25, 29, 54, 18, 17, 30, 27, 16,
|
||||
239, 68,240,241,242,243,244,245,246,247,248,249,250,251,NUM,CTR,
|
||||
};
|
||||
|
||||
static const unsigned char macCyrillic_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, //00
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, //10
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, //20
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, //30
|
||||
SYM,142,143,144,145,146,147,148,149,150,151,152, 74,153, 75,154, //40
|
||||
155,156,157,158,159,160,161,162,163,164,165,SYM,SYM,SYM,SYM,SYM, //50
|
||||
SYM, 71,172, 66,173, 65,174, 76,175, 64,176,177, 77, 72,178, 69, //60
|
||||
67,179, 78, 73,180,181, 79,182,183,184,185,SYM,SYM,SYM,SYM,SYM, //70
|
||||
37, 44, 33, 46, 41, 48, 56, 51, 42, 60, 36, 49, 38, 31, 34, 35,
|
||||
45, 32, 40, 52, 53, 55, 58, 50, 57, 63, 70, 62, 61, 47, 59, 43,
|
||||
191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,
|
||||
207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,
|
||||
223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,
|
||||
239,240,241,242,243,244,245,246,247,248,249,250,251,NUM, 68, 16,
|
||||
3, 21, 10, 19, 13, 2, 24, 20, 4, 23, 11, 8, 12, 5, 1, 15,
|
||||
9, 7, 6, 14, 39, 26, 28, 22, 25, 29, 54, 18, 17, 30, 27,CTR,
|
||||
};
|
||||
|
||||
static const unsigned char IBM855_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, //00
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, //10
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, //20
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, //30
|
||||
SYM,142,143,144,145,146,147,148,149,150,151,152, 74,153, 75,154, //40
|
||||
155,156,157,158,159,160,161,162,163,164,165,SYM,SYM,SYM,SYM,SYM, //50
|
||||
SYM, 71,172, 66,173, 65,174, 76,175, 64,176,177, 77, 72,178, 69, //60
|
||||
67,179, 78, 73,180,181, 79,182,183,184,185,SYM,SYM,SYM,SYM,SYM, //70
|
||||
191,192,193,194, 68,195,196,197,198,199,200,201,202,203,204,205,
|
||||
206,207,208,209,210,211,212,213,214,215,216,217, 27, 59, 54, 70,
|
||||
3, 37, 21, 44, 28, 58, 13, 41, 2, 48, 39, 53, 19, 46,218,219,
|
||||
220,221,222,223,224, 26, 55, 4, 42,225,226,227,228, 23, 60,229,
|
||||
230,231,232,233,234,235, 11, 36,236,237,238,239,240,241,242,243,
|
||||
8, 49, 12, 38, 5, 31, 1, 34, 15,244,245,246,247, 35, 16,248,
|
||||
43, 9, 45, 7, 32, 6, 40, 14, 52, 24, 56, 10, 33, 17, 61,249,
|
||||
250, 18, 62, 20, 51, 25, 57, 30, 47, 29, 63, 22, 50,251,NUM,CTR,
|
||||
};
|
||||
|
||||
static const unsigned char IBM866_CharToOrderMap[] =
|
||||
{
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,RET,CTR,CTR,RET,CTR,CTR, //00
|
||||
CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR,CTR, //10
|
||||
SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM,SYM, //20
|
||||
NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,NUM,SYM,SYM,SYM,SYM,SYM,SYM, //30
|
||||
SYM,142,143,144,145,146,147,148,149,150,151,152, 74,153, 75,154, //40
|
||||
155,156,157,158,159,160,161,162,163,164,165,SYM,SYM,SYM,SYM,SYM, //50
|
||||
SYM, 71,172, 66,173, 65,174, 76,175, 64,176,177, 77, 72,178, 69, //60
|
||||
67,179, 78, 73,180,181, 79,182,183,184,185,SYM,SYM,SYM,SYM,SYM, //70
|
||||
37, 44, 33, 46, 41, 48, 56, 51, 42, 60, 36, 49, 38, 31, 34, 35,
|
||||
45, 32, 40, 52, 53, 55, 58, 50, 57, 63, 70, 62, 61, 47, 59, 43,
|
||||
3, 21, 10, 19, 13, 2, 24, 20, 4, 23, 11, 8, 12, 5, 1, 15,
|
||||
191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,
|
||||
207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,
|
||||
223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,
|
||||
9, 7, 6, 14, 39, 26, 28, 22, 25, 29, 54, 18, 17, 30, 27, 16,
|
||||
239, 68,240,241,242,243,244,245,246,247,248,249,250,251,NUM,CTR,
|
||||
};
|
||||
|
||||
//Model Table:
|
||||
//total sequences: 100%
|
||||
//first 512 sequences: 97.6601%
|
||||
//first 1024 sequences: 2.3389%
|
||||
//rest sequences: 0.1237%
|
||||
//negative sequences: 0.0009%
|
||||
/* Model Table:
|
||||
* Total considered sequences: 1554 / 2916
|
||||
* - Positive sequences: first 819 (0.9950050289366638)
|
||||
* - Probable sequences: next 260 (1079-819) (0.003999322715788067)
|
||||
* - Neutral sequences: last 1837 (0.0009956483475481726)
|
||||
* - Negative sequences: 1362 (off-ratio)
|
||||
* Negative sequences: TODO
|
||||
*/
|
||||
static const PRUint8 RussianLangModel[] =
|
||||
{
|
||||
0,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,1,1,3,3,3,3,1,3,3,3,2,3,2,3,3,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,0,3,2,2,2,2,2,0,0,2,
|
||||
3,3,3,2,3,3,3,3,3,3,3,3,3,3,2,3,3,0,0,3,3,3,3,3,3,3,3,3,2,3,2,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,2,2,3,3,3,3,3,3,3,3,3,2,3,3,0,0,3,3,3,3,3,3,3,3,2,3,3,1,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,2,3,2,3,3,3,3,3,3,3,3,3,3,3,3,3,0,0,3,3,3,3,3,3,3,3,3,3,3,2,1,
|
||||
0,0,0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,3,3,0,0,3,3,3,3,3,3,3,3,3,3,3,2,1,
|
||||
0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,2,2,2,3,1,3,3,1,3,3,3,3,2,2,3,0,2,2,2,3,3,2,1,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,2,3,3,3,3,3,2,2,3,2,3,3,3,2,1,2,2,0,1,2,2,2,2,2,2,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,3,0,2,2,3,3,2,1,2,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,2,3,3,1,2,3,2,2,3,2,3,3,3,3,2,2,3,0,3,2,2,3,1,1,1,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,2,2,3,3,3,3,3,2,3,3,3,3,2,2,2,0,3,3,3,2,2,2,2,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,2,3,2,3,3,3,3,3,3,2,3,2,2,0,1,3,2,1,2,2,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,2,1,1,3,0,1,1,1,1,2,1,1,0,2,2,2,1,2,0,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,2,3,3,2,2,2,2,1,3,2,3,2,3,2,1,2,2,0,1,1,2,1,2,1,2,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,2,2,3,2,3,3,3,2,2,2,2,0,2,2,2,2,3,1,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,
|
||||
3,2,3,2,2,3,3,3,3,3,3,3,3,3,1,3,2,0,0,3,3,3,3,2,3,3,3,3,2,3,2,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
2,3,3,3,3,3,2,2,3,3,0,2,1,0,3,2,3,2,3,0,0,1,2,0,0,1,0,1,2,1,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,0,3,0,2,3,3,3,3,2,3,3,3,3,1,2,2,0,0,2,3,2,2,2,3,2,3,2,2,3,0,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,2,3,0,2,3,2,3,0,1,2,3,3,2,0,2,3,0,0,2,3,2,2,0,1,3,1,3,2,2,1,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,1,3,0,2,3,3,3,3,3,3,3,3,2,1,3,2,0,0,2,2,3,3,3,2,3,3,0,2,2,0,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,2,2,3,3,2,2,2,3,3,0,0,1,1,1,1,1,2,0,0,1,1,1,1,0,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,2,2,3,3,3,3,3,3,3,0,3,2,3,3,2,3,2,0,2,1,0,1,1,0,1,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,2,3,3,3,2,2,2,2,3,1,3,2,3,1,1,2,1,0,2,2,2,2,1,3,1,0,
|
||||
0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,
|
||||
2,2,3,3,3,3,3,1,2,2,1,3,1,0,3,0,0,3,0,0,0,1,1,0,1,2,1,0,0,0,0,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,2,2,1,1,3,3,3,2,2,1,2,2,3,1,1,2,0,0,2,2,1,3,0,0,2,1,1,2,1,1,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,2,3,3,3,3,1,2,2,2,1,2,1,3,3,1,1,2,1,2,1,2,2,0,2,0,0,1,1,0,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
2,3,3,3,3,3,2,1,3,2,2,3,2,0,3,2,0,3,0,1,0,1,1,0,0,1,1,1,1,0,1,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,2,3,3,3,2,2,2,3,3,1,2,1,2,1,0,1,0,1,1,0,1,0,0,2,1,1,1,0,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,
|
||||
3,1,1,2,1,2,3,3,2,2,1,2,2,3,0,2,1,0,0,2,2,3,2,1,2,2,2,2,2,3,1,0,
|
||||
0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,1,1,0,1,1,2,2,1,1,3,0,0,1,3,1,1,1,0,0,0,1,0,1,1,0,0,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
2,1,3,3,3,2,0,0,0,2,1,0,1,0,2,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
2,0,1,0,0,2,3,2,2,2,1,2,2,2,1,2,1,0,0,1,1,1,0,2,0,1,1,1,0,0,1,1,
|
||||
1,0,0,0,0,0,1,2,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
|
||||
2,3,3,3,3,0,0,0,0,1,0,0,0,0,3,0,1,2,1,0,0,0,0,0,0,0,1,1,0,0,1,1,
|
||||
1,0,1,0,1,2,0,0,1,1,2,1,0,1,1,1,1,0,1,1,1,1,0,1,0,0,1,0,0,1,1,0,
|
||||
2,2,3,2,2,2,3,1,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,0,1,0,1,1,1,0,2,1,
|
||||
1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,0,1,1,1,0,1,1,0,
|
||||
3,3,3,2,2,2,2,3,2,2,1,1,2,2,2,2,1,1,3,1,2,1,2,0,0,1,1,0,1,0,2,1,
|
||||
1,1,1,1,1,2,1,0,1,1,1,1,0,1,0,0,1,1,0,0,1,0,1,0,0,1,0,0,0,1,1,0,
|
||||
2,0,0,1,0,3,2,2,2,2,1,2,1,2,1,2,0,0,0,2,1,2,2,1,1,2,2,0,1,1,0,2,
|
||||
1,1,1,1,1,0,1,1,1,2,1,1,1,2,1,0,1,2,1,1,1,1,0,1,1,1,0,0,1,0,0,1,
|
||||
1,3,2,2,2,1,1,1,2,3,0,0,0,0,2,0,2,2,1,0,0,0,0,0,0,1,0,0,0,0,1,1,
|
||||
1,0,1,1,0,1,0,1,1,0,1,1,0,2,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,1,1,0,
|
||||
2,3,2,3,2,1,2,2,2,2,1,0,0,0,2,0,0,1,1,0,0,0,0,0,0,0,1,1,0,0,2,1,
|
||||
1,1,2,1,0,2,0,0,1,0,1,0,0,1,0,0,1,1,0,1,1,0,0,0,0,0,1,0,0,0,0,0,
|
||||
3,0,0,1,0,2,2,2,3,2,2,2,2,2,2,2,0,0,0,2,1,2,1,1,1,2,2,0,0,0,1,2,
|
||||
1,1,1,1,1,0,1,2,1,1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,1,1,1,0,0,1,
|
||||
2,3,2,3,3,2,0,1,1,1,0,0,1,0,2,0,1,1,3,1,0,0,0,0,0,0,0,1,0,0,2,1,
|
||||
1,1,1,1,1,1,1,0,1,0,1,1,1,1,0,1,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,0,
|
||||
2,3,3,3,3,1,2,2,2,2,0,1,1,0,2,1,1,1,2,1,0,1,1,0,0,1,0,1,0,0,2,0,
|
||||
0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
2,3,3,3,2,0,0,1,1,2,2,1,0,0,2,0,1,1,3,0,0,1,0,0,0,0,0,1,0,1,2,1,
|
||||
1,1,2,0,1,1,1,0,1,0,1,1,0,1,0,1,1,1,1,0,1,0,0,0,0,0,0,1,0,1,1,0,
|
||||
1,3,2,3,2,1,0,0,2,2,2,0,1,0,2,0,1,1,1,0,1,0,0,0,3,0,1,1,0,0,2,1,
|
||||
1,1,1,0,1,1,0,0,0,0,1,1,0,1,0,0,2,1,1,0,1,0,0,0,1,0,1,0,0,1,1,0,
|
||||
3,1,2,1,1,2,2,2,2,2,2,1,2,2,1,1,0,0,0,2,2,2,0,0,0,1,2,1,0,1,0,1,
|
||||
2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,2,1,1,1,0,1,0,1,1,0,1,1,1,0,0,1,
|
||||
3,0,0,0,0,2,0,1,1,1,1,1,1,1,0,1,0,0,0,1,1,1,0,1,0,1,1,0,0,1,0,1,
|
||||
1,1,0,0,1,0,0,0,1,0,1,1,0,0,1,0,1,0,1,0,0,0,0,1,0,0,0,1,0,0,0,1,
|
||||
1,3,3,2,2,0,0,0,2,2,0,0,0,1,2,0,1,1,2,0,0,0,0,0,0,0,0,1,0,0,2,1,
|
||||
0,1,1,0,0,1,1,0,0,0,1,1,0,1,1,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,
|
||||
2,3,2,3,2,0,0,0,0,1,1,0,0,0,2,0,2,0,2,0,0,0,0,0,1,0,0,1,0,0,1,1,
|
||||
1,1,2,0,1,2,1,0,1,1,2,1,1,1,1,1,2,1,1,0,1,0,0,1,1,1,1,1,0,1,1,0,
|
||||
1,3,2,2,2,1,0,0,2,2,1,0,1,2,2,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,1,1,
|
||||
0,0,1,1,0,1,1,0,0,1,1,0,1,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,0,0,1,0,2,3,1,2,2,2,2,2,2,1,1,0,0,0,1,0,1,0,2,1,1,1,0,0,0,0,1,
|
||||
1,1,0,1,1,0,1,1,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,
|
||||
2,0,2,0,0,1,0,3,2,1,2,1,2,2,0,1,0,0,0,2,1,0,0,2,1,1,1,1,0,2,0,2,
|
||||
2,1,1,1,1,1,1,1,1,1,1,1,1,2,1,0,1,1,1,1,0,0,0,1,1,1,1,0,1,0,0,1,
|
||||
1,2,2,2,2,1,0,0,1,0,0,0,0,0,2,0,1,1,1,1,0,0,0,0,1,0,1,2,0,0,2,0,
|
||||
1,0,1,1,1,2,1,0,1,0,1,1,0,0,1,0,1,1,1,0,1,0,0,0,1,0,0,1,0,1,1,0,
|
||||
2,1,2,2,2,0,3,0,1,1,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
0,0,0,1,1,1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,
|
||||
1,2,2,3,2,2,0,0,1,1,2,0,1,2,1,0,1,0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,
|
||||
0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,1,1,0,0,1,0,0,0,0,0,0,0,0,1,1,0,
|
||||
2,2,1,1,2,1,2,2,2,2,2,1,2,2,0,1,0,0,0,1,2,2,2,1,2,1,1,1,1,1,2,1,
|
||||
1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,0,1,1,1,0,0,0,0,1,1,1,0,1,1,0,0,1,
|
||||
1,2,2,2,2,0,1,0,2,2,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,2,0,
|
||||
0,0,1,0,0,1,0,0,0,0,1,0,1,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,2,2,2,2,0,0,0,2,2,2,0,1,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,
|
||||
0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,2,2,2,2,0,0,0,0,1,0,0,1,1,2,0,0,0,0,1,0,1,0,0,1,0,0,2,0,0,0,1,
|
||||
0,0,1,0,0,1,0,0,0,1,1,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,2,2,2,1,1,2,0,2,1,1,1,1,0,2,2,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1,
|
||||
0,0,1,0,1,1,0,0,0,0,1,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,0,2,1,2,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,
|
||||
0,0,1,0,1,1,0,0,0,0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,
|
||||
1,0,0,0,0,2,0,1,2,1,0,1,1,1,0,1,0,0,0,1,0,1,0,0,1,0,1,0,0,0,0,1,
|
||||
0,0,0,0,0,1,0,0,1,1,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
2,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
1,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0,
|
||||
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
1,1,1,0,1,0,1,0,0,1,1,1,1,0,0,0,1,0,0,0,0,1,0,0,0,1,0,1,0,0,0,0,
|
||||
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
1,1,0,1,1,0,1,0,1,0,0,0,0,1,1,0,1,1,0,0,0,0,0,1,0,1,1,0,1,0,0,0,
|
||||
0,1,1,1,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,0,3,3,3,0,3,3,3,3,3,
|
||||
3,3,3,3,0,3,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,0,3,3,3,0,3,3,3,3,3,
|
||||
3,3,3,2,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,0,3,3,3,0,3,3,3,3,3,
|
||||
3,3,3,3,1,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,0,3,3,3,0,3,3,3,3,3,
|
||||
3,3,3,2,0,3,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,2,3,2,3,2,3,3,3,3,3,3,0,3,3,3,3,3,3,
|
||||
3,3,3,2,0,3,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,1,0,3,3,3,3,3,3,
|
||||
3,3,1,3,0,3,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,3,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,0,3,3,3,3,3,3,
|
||||
3,3,2,3,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,3,3,1,0,3,3,2,1,3,3,
|
||||
1,3,2,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,3,0,3,2,3,2,2,1,
|
||||
3,1,3,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
3,3,3,3,3,3,3,3,2,3,3,3,3,2,3,3,3,3,2,2,0,3,2,3,3,1,3,
|
||||
1,2,2,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,2,2,3,1,3,1,1,3,0,2,1,1,3,3,1,
|
||||
2,2,0,2,0,2,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,2,2,3,3,3,3,2,3,3,3,2,3,3,2,0,3,2,1,1,2,2,
|
||||
1,2,1,3,1,3,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,3,0,3,3,2,3,3,3,
|
||||
3,1,0,3,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,
|
||||
3,3,3,3,3,3,3,3,1,3,3,1,2,3,3,3,1,3,3,1,0,3,2,1,0,3,1,
|
||||
2,2,1,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,3,3,0,3,3,2,0,3,3,3,3,3,
|
||||
3,3,3,3,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,
|
||||
1,1,1,3,3,3,3,3,3,3,3,3,3,3,1,3,3,0,3,3,3,0,3,3,3,3,3,
|
||||
2,1,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,3,2,3,1,3,1,3,1,2,2,0,1,2,1,1,1,2,
|
||||
2,1,0,2,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,2,1,3,3,3,3,3,3,3,3,3,3,3,1,2,3,0,3,3,3,0,3,3,3,2,0,
|
||||
3,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,2,3,3,3,3,2,1,3,3,1,3,3,2,0,2,2,3,3,2,3,
|
||||
3,1,3,2,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,
|
||||
3,3,3,3,3,3,3,1,3,3,3,3,3,1,3,3,3,3,3,2,0,3,2,0,3,2,2,
|
||||
2,0,0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
3,2,2,2,3,3,3,3,2,3,3,3,3,2,1,2,2,0,3,2,0,0,3,1,1,3,1,
|
||||
3,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,0,3,3,3,2,3,3,1,3,3,3,3,0,3,3,0,3,3,0,0,3,1,1,3,3,
|
||||
3,3,2,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
2,3,3,3,3,1,3,3,2,3,3,2,0,1,3,0,0,1,0,0,0,3,1,0,3,0,1,
|
||||
3,1,0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,3,3,3,3,3,2,3,2,1,3,0,1,1,1,1,0,2,1,2,0,1,1,
|
||||
1,1,0,2,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
3,3,3,3,3,3,1,1,2,1,3,1,3,2,3,0,1,1,3,1,0,3,2,0,1,2,1,
|
||||
1,0,0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
|
||||
3,3,3,3,1,1,1,1,3,2,3,1,2,1,3,1,1,3,2,2,0,1,0,1,0,1,1,
|
||||
1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,2,2,1,3,3,3,3,2,3,2,3,3,2,1,1,3,0,3,3,1,0,3,2,3,3,2,
|
||||
1,1,3,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,3,2,3,3,3,3,3,2,1,2,3,0,1,0,1,0,0,3,1,1,0,0,1,
|
||||
1,1,0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
3,3,3,3,2,3,3,3,0,3,3,2,1,1,3,1,3,2,1,1,0,3,0,1,0,1,2,
|
||||
1,3,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,3,3,3,3,0,1,0,0,0,0,0,0,0,3,0,0,0,0,0,0,3,0,0,0,0,0,
|
||||
0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
1,1,1,1,3,3,3,3,3,3,3,3,3,3,1,2,2,0,2,2,3,0,1,1,1,2,1,
|
||||
2,3,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,3,3,3,3,3,3,3,3,1,3,2,3,2,3,3,2,1,3,0,
|
||||
0,0,0,1,3,3,3,3,3,3,2,3,2,2,1,1,3,0,2,3,2,0,1,3,2,1,0,
|
||||
0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,1,0,0,1,1,1,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,3,3,3,3,2,3,3,3,2,3,3,2,2,2,3,2,2,3,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,
|
||||
0,0,0,0,3,0,2,2,3,3,3,3,2,3,3,3,2,3,3,1,3,2,3,3,3,2,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,3,2,3,3,3,2,3,3,1,1,2,1,1,2,3,2,2,1,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,3,3,3,3,3,3,3,3,2,3,3,0,2,2,3,2,3,2,0,
|
||||
0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,1,3,3,3,3,2,2,1,3,1,3,0,1,1,1,2,3,1,0,
|
||||
0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,2,1,3,3,3,2,3,1,3,2,2,0,3,3,1,2,2,1,0,
|
||||
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,2,0,2,2,3,3,3,3,3,3,3,3,2,3,3,2,3,3,2,3,2,3,0,
|
||||
1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,1,1,3,2,3,3,2,2,1,2,3,0,1,2,1,2,2,1,0,
|
||||
1,0,0,1,0,1,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,2,2,3,2,3,3,2,1,3,1,2,0,1,1,1,1,2,1,0,
|
||||
0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,1,2,1,2,3,2,1,2,1,1,2,0,1,1,2,1,2,1,0,
|
||||
0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,2,2,3,2,3,2,1,1,1,1,3,0,1,1,1,1,2,0,0,
|
||||
0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,2,1,1,2,3,1,2,1,1,3,3,0,3,3,2,1,2,1,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,2,0,3,2,3,3,3,3,1,3,2,2,1,3,0,1,2,3,2,2,1,1,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,1,1,0,1,1,1,1,0,1,0,1,0,0,3,1,1,0,1,1,3,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,1,3,2,2,3,3,1,1,3,1,3,0,2,1,0,1,1,1,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,3,3,1,1,3,3,2,1,1,1,2,1,1,2,0,1,2,0,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,2,3,1,2,3,3,1,1,3,1,2,0,1,1,1,0,1,1,0,
|
||||
0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,1,3,2,1,3,2,1,1,0,1,2,1,0,1,1,2,1,0,0,
|
||||
0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,1,0,2,2,2,1,1,3,2,3,2,1,1,1,1,0,1,3,1,1,1,1,0,
|
||||
0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
0,0,0,0,3,0,3,3,1,1,0,1,2,0,1,1,0,1,1,1,0,1,0,1,1,0,0,
|
||||
0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,1,
|
||||
0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
|
||||
};
|
||||
|
||||
|
||||
const SequenceModel Koi8rRussianModel =
|
||||
const SequenceModel Windows_1251RussianModel =
|
||||
{
|
||||
KOI8R_CharToOrderMap,
|
||||
Windows_1251_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
64,
|
||||
(float)0.976601,
|
||||
PR_FALSE,
|
||||
"KOI8-R",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const SequenceModel Win1251RussianModel =
|
||||
{
|
||||
win1251_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
64,
|
||||
(float)0.976601,
|
||||
54,
|
||||
(float)0.9990043516524518,
|
||||
PR_FALSE,
|
||||
"WINDOWS-1251",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const SequenceModel Latin5RussianModel =
|
||||
const SequenceModel Iso_8859_5RussianModel =
|
||||
{
|
||||
latin5_CharToOrderMap,
|
||||
Iso_8859_5_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
64,
|
||||
(float)0.976601,
|
||||
54,
|
||||
(float)0.9990043516524518,
|
||||
PR_FALSE,
|
||||
"ISO-8859-5",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const SequenceModel MacCyrillicRussianModel =
|
||||
const SequenceModel Koi8_RRussianModel =
|
||||
{
|
||||
macCyrillic_CharToOrderMap,
|
||||
Koi8_R_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
64,
|
||||
(float)0.976601,
|
||||
54,
|
||||
(float)0.9990043516524518,
|
||||
PR_FALSE,
|
||||
"MAC-CYRILLIC",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const SequenceModel Ibm866RussianModel =
|
||||
{
|
||||
IBM866_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
64,
|
||||
(float)0.976601,
|
||||
PR_FALSE,
|
||||
"IBM866",
|
||||
"KOI8-R",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const SequenceModel Ibm855RussianModel =
|
||||
{
|
||||
IBM855_CharToOrderMap,
|
||||
Ibm855_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
64,
|
||||
(float)0.976601,
|
||||
54,
|
||||
(float)0.9990043516524518,
|
||||
PR_FALSE,
|
||||
"IBM855",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const SequenceModel Ibm866RussianModel =
|
||||
{
|
||||
Ibm866_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
54,
|
||||
(float)0.9990043516524518,
|
||||
PR_FALSE,
|
||||
"IBM866",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const SequenceModel Mac_CyrillicRussianModel =
|
||||
{
|
||||
Mac_Cyrillic_CharToOrderMap,
|
||||
RussianLangModel,
|
||||
54,
|
||||
(float)0.9990043516524518,
|
||||
PR_FALSE,
|
||||
"MAC-CYRILLIC",
|
||||
"ru"
|
||||
};
|
||||
|
||||
const LanguageModel RussianModel =
|
||||
{
|
||||
"ru",
|
||||
Unicode_CharOrder,
|
||||
108,
|
||||
RussianLangModel,
|
||||
54,
|
||||
5,
|
||||
(float)0.40765948913984873,
|
||||
29,
|
||||
(float)0.030302616590369,
|
||||
};
|
||||
|
||||
@ -150,6 +150,7 @@ extern const LanguageModel NorwegianModel;
|
||||
extern const LanguageModel PolishModel;
|
||||
extern const LanguageModel PortugueseModel;
|
||||
extern const LanguageModel RomanianModel;
|
||||
extern const LanguageModel RussianModel;
|
||||
extern const LanguageModel SlovakModel;
|
||||
extern const LanguageModel SloveneModel;
|
||||
extern const LanguageModel SpanishModel;
|
||||
|
||||
@ -117,6 +117,7 @@ nsMBCSGroupProber::nsMBCSGroupProber(PRUint32 aLanguageFilter)
|
||||
langDetectors[i][j++] = new nsLanguageDetector(&PolishModel);
|
||||
langDetectors[i][j++] = new nsLanguageDetector(&PortugueseModel);
|
||||
langDetectors[i][j++] = new nsLanguageDetector(&RomanianModel);
|
||||
langDetectors[i][j++] = new nsLanguageDetector(&RussianModel);
|
||||
langDetectors[i][j++] = new nsLanguageDetector(&SlovakModel);
|
||||
langDetectors[i][j++] = new nsLanguageDetector(&SloveneModel);
|
||||
langDetectors[i][j++] = new nsLanguageDetector(&SpanishModel);
|
||||
|
||||
@ -49,7 +49,7 @@
|
||||
#include "nsEUCTWProber.h"
|
||||
|
||||
#define NUM_OF_PROBERS 8
|
||||
#define NUM_OF_LANGUAGES 34
|
||||
#define NUM_OF_LANGUAGES 35
|
||||
|
||||
class nsMBCSGroupProber: public nsCharSetProber {
|
||||
public:
|
||||
|
||||
@ -50,10 +50,10 @@ nsSBCSGroupProber::nsSBCSGroupProber()
|
||||
PRUint32 heb_prober_idx;
|
||||
PRUint32 n = 0;
|
||||
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Win1251RussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Koi8rRussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Latin5RussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&MacCyrillicRussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Windows_1251RussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Koi8_RRussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Iso_8859_5RussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Mac_CyrillicRussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Ibm866RussianModel);
|
||||
mProbers[n++] = new nsSingleByteCharSetProber(&Ibm855RussianModel);
|
||||
|
||||
|
||||
@ -134,10 +134,10 @@ protected:
|
||||
extern const SequenceModel Windows_1256ArabicModel;
|
||||
extern const SequenceModel Iso_8859_6ArabicModel;
|
||||
|
||||
extern const SequenceModel Koi8rRussianModel;
|
||||
extern const SequenceModel Win1251RussianModel;
|
||||
extern const SequenceModel Latin5RussianModel;
|
||||
extern const SequenceModel MacCyrillicRussianModel;
|
||||
extern const SequenceModel Koi8_RRussianModel;
|
||||
extern const SequenceModel Windows_1251RussianModel;
|
||||
extern const SequenceModel Iso_8859_5RussianModel;
|
||||
extern const SequenceModel Mac_CyrillicRussianModel;
|
||||
extern const SequenceModel Ibm866RussianModel;
|
||||
extern const SequenceModel Ibm855RussianModel;
|
||||
|
||||
|
||||
1
test/ru/utf-8.txt
Normal file
1
test/ru/utf-8.txt
Normal file
@ -0,0 +1 @@
|
||||
Сурки образуют отчётливо выраженную группу из 14 или 15 видов (статус лесостепного сурка как отдельного вида является предметом обсуждения), в рамках семейства беличьих. Это относительно крупные, весом в несколько килограммов, животные, обитающие в открытых ландшафтах, в сооружаемых самостоятельно норах. Прародина сурков — Северная Америка, откуда они распространились через Берингию в Азию, и дальше — в Европу. В Евразии большинство исследователей выделяет 8 видов сурков: три или четыре вида, объединяемых в группу bobak (степной сурок, лесостепной сурок, серый сурок и монгольский сурок), населяющие широкую полосу степей и гор от Украины на западе до северо-западного Китая на востоке, единственный чисто европейский вид — альпийский сурок, три вида гор Центральной Азии — сурок Мензбира, длиннохвостый, или красный сурок и гималайский сурок и обособленный северо-восточный вид — черношапочный сурок. Разные виды сурков обособились в различных географических зонах и отличаются друг от друга особенностями поведения, но сохранили внешнее сходство и необходимость впадать в зимнюю спячку. Все сурки травоядны, селятся в норах, имеют тёплый мех и почти все живут колониями. Различаются равнинные сурки (байбаки) и сурки горные, живущие в суровых условиях альпийских гор, куда летнее тепло приходит поздно, а зима является рано. Сурки встречают свистом восход солнца[3]. Сурки иногда храпят[3].
|
||||
Loading…
x
Reference in New Issue
Block a user