Commit 5e8d670

1 parent ec74369, commit 5e8d670


2 files changed: +236 -4 lines changed


‎contrib/unaccent/generate_unaccent_rules.py

Lines changed: 15 additions & 4 deletions
@@ -29,6 +29,15 @@
 import sys
 import xml.etree.ElementTree as ET
 
+# The ranges of Unicode characters that we consider to be "plain letters".
+# For now we are being conservative by including only Latin and Greek.  This
+# could be extended in future based on feedback from people with relevant
+# language knowledge.
+PLAIN_LETTER_RANGES = ((ord('a'), ord('z')), # Latin lower case
+                       (ord('A'), ord('Z')), # Latin upper case
+                       (0x03b1, 0x03c9),     # GREEK SMALL LETTER ALPHA, GREEK SMALL LETTER OMEGA
+                       (0x0391, 0x03a9))     # GREEK CAPITAL LETTER ALPHA, GREEK CAPITAL LETTER OMEGA
+
 def print_record(codepoint, letter):
     print (unichr(codepoint) + "\t" + letter).encode("UTF-8")
 
@@ -39,9 +48,11 @@ def __init__(self, id, general_category, combining_ids):
         self.combining_ids = combining_ids
 
 def is_plain_letter(codepoint):
-    """Return true if codepoint represents a plain ASCII letter."""
-    return (codepoint.id >= ord('a') and codepoint.id <= ord('z')) or \
-            (codepoint.id >= ord('A') and codepoint.id <= ord('Z'))
+    """Return true if codepoint represents a "plain letter"."""
+    for begin, end in PLAIN_LETTER_RANGES:
+        if codepoint.id >= begin and codepoint.id <= end:
+            return True
+    return False
 
 def is_mark(codepoint):
     """Returns true for diacritical marks (combining codepoints)."""
@@ -184,7 +195,7 @@ def main(args):
                len(codepoint.combining_ids) > 1:
                 if is_letter_with_marks(codepoint, table):
                     charactersSet.add((codepoint.id,
-                                 chr(get_plain_letter(codepoint, table).id)))
+                                 unichr(get_plain_letter(codepoint, table).id)))
                 elif args.noLigaturesExpansion is False and is_ligature(codepoint, table):
                     charactersSet.add((codepoint.id,
                                        "".join(unichr(combining_codepoint.id)

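The range check introduced above can be tried in isolation. The following is a minimal Python 3 sketch, not the script's own code: it assumes a plain integer code point instead of the script's codepoint object with an `.id` attribute, and it reuses the four ranges from the diff unchanged.

```python
# Minimal sketch of the range-based "plain letter" test (Python 3).
# Assumption: the argument is an integer code point, not the script's
# codepoint object; the ranges are copied from the diff above.
PLAIN_LETTER_RANGES = (
    (ord("a"), ord("z")),  # Latin lower case
    (ord("A"), ord("Z")),  # Latin upper case
    (0x03B1, 0x03C9),      # GREEK SMALL LETTER ALPHA .. OMEGA
    (0x0391, 0x03A9),      # GREEK CAPITAL LETTER ALPHA .. OMEGA
)


def is_plain_letter(codepoint_id):
    """Return True if the code point falls inside any plain-letter range."""
    return any(begin <= codepoint_id <= end
               for begin, end in PLAIN_LETTER_RANGES)


if __name__ == "__main__":
    # Code points chosen for illustration: a Latin letter, a Greek base
    # letter, and an accented Greek letter that lies outside the ranges.
    for cp in (ord("q"), 0x03B1, 0x03AC):
        print("U+%04X" % cp, is_plain_letter(cp))
```

The last hunk of the diff is related: in Python 2, `chr()` only accepts values from 0 to 255, so once Greek base letters can be returned by `get_plain_letter`, the script needs `unichr()`. In Python 3, `chr()` covers the full Unicode range, which is why the sketch needs no such distinction.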
‎contrib/unaccent/unaccent.rules

Lines changed: 221 additions & 0 deletions
@@ -399,6 +399,26 @@
 ʦ	ts
 ʪ	ls
 ʫ	lz
+Ά	Α
+Έ	Ε
+Ή	Η
+Ί	Ι
+Ό	Ο
+Ύ	Υ
+Ώ	Ω
+ΐ	ι
+Ϊ	Ι
+Ϋ	Υ
+ά	α
+έ	ε
+ή	η
+ί	ι
+ΰ	υ
+ϊ	ι
+ϋ	υ
+ό	ο
+ύ	υ
+ώ	ω
 Ё	Е
 ё	е
 ᴀ	A
@@ -709,6 +729,207 @@
 ỽ	v
 Ỿ	Y
 ỿ	y
+ἀ	α
+ἁ	α
+ἂ	α
+ἃ	α
+ἄ	α
+ἅ	α
+ἆ	α
+ἇ	α
+Ἀ	Α
+Ἁ	Α
+Ἂ	Α
+Ἃ	Α
+Ἄ	Α
+Ἅ	Α
+Ἆ	Α
+Ἇ	Α
+ἐ	ε
+ἑ	ε
+ἒ	ε
+ἓ	ε
+ἔ	ε
+ἕ	ε
+Ἐ	Ε
+Ἑ	Ε
+Ἒ	Ε
+Ἓ	Ε
+Ἔ	Ε
+Ἕ	Ε
+ἠ	η
+ἡ	η
+ἢ	η
+ἣ	η
+ἤ	η
+ἥ	η
+ἦ	η
+ἧ	η
+Ἠ	Η
+Ἡ	Η
+Ἢ	Η
+Ἣ	Η
+Ἤ	Η
+Ἥ	Η
+Ἦ	Η
+Ἧ	Η
+ἰ	ι
+ἱ	ι
+ἲ	ι
+ἳ	ι
+ἴ	ι
+ἵ	ι
+ἶ	ι
+ἷ	ι
+Ἰ	Ι
+Ἱ	Ι
+Ἲ	Ι
+Ἳ	Ι
+Ἴ	Ι
+Ἵ	Ι
+Ἶ	Ι
+Ἷ	Ι
+ὀ	ο
+ὁ	ο
+ὂ	ο
+ὃ	ο
+ὄ	ο
+ὅ	ο
+Ὀ	Ο
+Ὁ	Ο
+Ὂ	Ο
+Ὃ	Ο
+Ὄ	Ο
+Ὅ	Ο
+ὐ	υ
+ὑ	υ
+ὒ	υ
+ὓ	υ
+ὔ	υ
+ὕ	υ
+ὖ	υ
+ὗ	υ
+Ὑ	Υ
+Ὓ	Υ
+Ὕ	Υ
+Ὗ	Υ
+ὠ	ω
+ὡ	ω
+ὢ	ω
+ὣ	ω
+ὤ	ω
+ὥ	ω
+ὦ	ω
+ὧ	ω
+Ὠ	Ω
+Ὡ	Ω
+Ὢ	Ω
+Ὣ	Ω
+Ὤ	Ω
+Ὥ	Ω
+Ὦ	Ω
+Ὧ	Ω
+ὰ	α
+ὲ	ε
+ὴ	η
+ὶ	ι
+ὸ	ο
+ὺ	υ
+ὼ	ω
+ᾀ	α
+ᾁ	α
+ᾂ	α
+ᾃ	α
+ᾄ	α
+ᾅ	α
+ᾆ	α
+ᾇ	α
+ᾈ	Α
+ᾉ	Α
+ᾊ	Α
+ᾋ	Α
+ᾌ	Α
+ᾍ	Α
+ᾎ	Α
+ᾏ	Α
+ᾐ	η
+ᾑ	η
+ᾒ	η
+ᾓ	η
+ᾔ	η
+ᾕ	η
+ᾖ	η
+ᾗ	η
+ᾘ	Η
+ᾙ	Η
+ᾚ	Η
+ᾛ	Η
+ᾜ	Η
+ᾝ	Η
+ᾞ	Η
+ᾟ	Η
+ᾠ	ω
+ᾡ	ω
+ᾢ	ω
+ᾣ	ω
+ᾤ	ω
+ᾥ	ω
+ᾦ	ω
+ᾧ	ω
+ᾨ	Ω
+ᾩ	Ω
+ᾪ	Ω
+ᾫ	Ω
+ᾬ	Ω
+ᾭ	Ω
+ᾮ	Ω
+ᾯ	Ω
+ᾰ	α
+ᾱ	α
+ᾲ	α
+ᾳ	α
+ᾴ	α
+ᾶ	α
+ᾷ	α
+Ᾰ	Α
+Ᾱ	Α
+Ὰ	Α
+ᾼ	Α
+ῂ	η
+ῃ	η
+ῄ	η
+ῆ	η
+ῇ	η
+Ὲ	Ε
+Ὴ	Η
+ῌ	Η
+ῐ	ι
+ῑ	ι
+ῒ	ι
+ῖ	ι
+ῗ	ι
+Ῐ	Ι
+Ῑ	Ι
+Ὶ	Ι
+ῠ	υ
+ῡ	υ
+ῢ	υ
+ῤ	ρ
+ῥ	ρ
+ῦ	υ
+ῧ	υ
+Ῠ	Υ
+Ῡ	Υ
+Ὺ	Υ
+Ῥ	Ρ
+ῲ	ω
+ῳ	ω
+ῴ	ω
+ῶ	ω
+ῷ	ω
+Ὸ	Ο
+Ὼ	Ω
+ῼ	Ω
 ‐	-
 ‑	-
 ‒	-

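The new Greek rules pair each accented character with its base letter. The following Python 3 sketch shows one way to arrive at the same kind of pairing; it is an illustration under stated assumptions, not the method used by `generate_unaccent_rules.py`. It swaps in the standard library's `unicodedata` module for the script's own parsing of Unicode data, and the helper names `plain_letter_for` and `rule_line` are hypothetical.

```python
import unicodedata

# Ranges copied from the commit's PLAIN_LETTER_RANGES.
PLAIN_LETTER_RANGES = (
    (ord("a"), ord("z")),
    (ord("A"), ord("Z")),
    (0x03B1, 0x03C9),
    (0x0391, 0x03A9),
)


def is_plain_letter(ch):
    """Return True if the single character ch is a Latin or Greek base letter."""
    return any(begin <= ord(ch) <= end for begin, end in PLAIN_LETTER_RANGES)


def plain_letter_for(ch):
    """Hypothetical helper: return the base letter of ch, or None.

    ch qualifies only if its canonical decomposition (NFD) is a plain
    letter followed by one or more combining marks.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    if len(decomposed) < 2:
        return None  # nothing to strip
    base, marks = decomposed[0], decomposed[1:]
    if not is_plain_letter(base):
        return None
    # Unicode general categories Mn, Mc and Me all start with "M".
    if not all(unicodedata.category(m).startswith("M") for m in marks):
        return None
    return base


def rule_line(ch):
    """Hypothetical helper: format a rule as accented, tab, plain."""
    base = plain_letter_for(ch)
    return None if base is None else ch + "\t" + base


if __name__ == "__main__":
    # U+1F00, U+0390 and U+1FFC all appear among the rules added above.
    for ch in ("\u1f00", "\u0390", "\u1ffc"):
        print(rule_line(ch))
```

This sketch normalizes fully, so it may accept characters that the real script skips or handles differently; treat its output as a cross-check for individual rules rather than as a replacement generator.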
0 commit comments
