National characters transcription module.

zzzsochi/trans

This module translates national characters into similar-sounding Latin characters (transliteration). At the moment, Czech, Greek, Latvian, Polish, Turkish, Russian, Ukrainian, Kazakh and Farsi alphabets are supported (it covers 99% of needs).

Python 3:

>>> from trans import trans
>>> trans('Привет, Мир!')
'Privet, Mir!'

Python 2:

>>> import trans
>>> u'Привет, Мир!'.encode('trans')
u'Privet, Mir!'
>>> trans.trans(u'Привет, Мир!')
u'Privet, Mir!'

>>> 'Hello World!'.encode('trans')
Traceback (most recent call last):
    ...
TypeError: trans codec support only unicode string, <type 'str'> given.

>>> s = u'''\
... -- Раскудрить твою через коромысло в бога душу мать
...             триста тысяч раз едрену вошь тебе в крыло
...             и кактус в глотку! -- взревел разъяренный Никодим.
... -- Аминь, -- робко добавил из склепа папа Пий.
...                 (c) Г. Л. Олди, "Сказки дедушки вампира".'''
>>>
>>> print s.encode('trans')
-- Raskudrit tvoyu cherez koromyslo v boga dushu mat
            trista tysyach raz edrenu vosh tebe v krylo
            i kaktus v glotku! -- vzrevel razyarennyy Nikodim.
-- Amin, -- robko dobavil iz sklepa papa Piy.
                (c) G. L. Oldi, "Skazki dedushki vampira".

Use the table "slug", leaving only the Latin characters, digits and underscores:

>>> print u'1 2 3 4 5 \n6 7 8 9 0'.encode('trans')
1 2 3 4 5
6 7 8 9 0
>>> print u'1 2 3 4 5 \n6 7 8 9 0'.encode('trans/slug')
1_2_3_4_5__6_7_8_9_0
>>> s.encode('trans/slug')[-42:-1]
u'_c__G__L__Oldi___Skazki_dedushki_vampira_'
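The filtering effect of the "slug" table can be approximated in plain Python 3. This is only a sketch of the visible behaviour, assuming the text has already been transliterated to Latin; `slugify_latin` is a hypothetical helper and not part of the trans API, and the library itself may implement the table differently.

```python
import re


def slugify_latin(text):
    """Replace every character except ASCII letters, digits and '_' with '_'."""
    return re.sub(r'[^A-Za-z0-9_]', '_', text)


# Both the space and the newline between 5 and 6 become underscores.
print(slugify_latin('1 2 3 4 5 \n6 7 8 9 0'))
# Punctuation and quotes are replaced one-for-one, so the length is unchanged.
print(slugify_latin('(c) G. L. Oldi, "Skazki dedushki vampira"'))
```

Each disallowed character maps to exactly one underscore, which is why runs of punctuation and spaces show up as runs of underscores in the output.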

Table "id" is deprecated and has been renamed to "slug". The old name is still available, but not recommended.

>>> u'1 2 3 4 5 6 7 8 9 0'.encode('trans/my')
Traceback (most recent call last):
    ...
ValueError: Table "my" not found in tables!
>>> trans.tables['my'] = {u'1': u'A', u'2': u'B'}
>>> u'1 2 3 4 5 6 7 8 9 0'.encode('trans/my')
u'A_B________________'

A table can consist of two parts: a map of diphthongs and a map of characters. Diphthongs are processed first, by simple substring replacement. Then each character of the resulting string is replaced according to the map of characters. If a character is absent from the map of characters, the key None is checked. If the key None is not present either, the default character u'_' is used.

>>> diphthongs = {u'11': u'AA', u'22': u'BB'}
>>> characters = {u'a': u'z', u'b': u'y', u'c': u'x', None: u'-',
...               u'A': u'A', u'B': u'B'}  # See below...
>>> trans.tables['test'] = (diphthongs, characters)
>>> u'11abc22cbaCC'.encode('trans/test')
u'AAzyxBBxyz--'

The characters produced by diphthong processing are themselves processed by the map of characters, so they must appear in that map to survive:

>>> diphthongs = {u'11': u'AA', u'22': u'BB'}
>>> characters = {u'a': u'z', u'b': u'y', u'c': u'x', None: u'-'}
>>> trans.tables['test'] = (diphthongs, characters)
>>> u'11abc22cbaCC'.encode('trans/test')
u'--zyx--xyz--'

These two tables are equivalent:

>>> characters = {u'a': u'z', u'b': u'y', u'c': u'x', None: u'-'}
>>> trans.tables['t1'] = characters
>>> trans.tables['t2'] = ({}, characters)
>>> u'11abc22cbaCC'.encode('trans/t1') == u'11abc22cbaCC'.encode('trans/t2')
True
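The table rules described above can be sketched as a small standalone Python 3 function. This is a hypothetical re-implementation written from the description, not the module's actual code: `TABLES` and `apply_table` are illustrative names, and the order in which several diphthongs are applied is an assumption (plain dictionary order).

```python
# Hypothetical stand-in for the module's table registry.
TABLES = {}


def apply_table(text, name):
    """Transliterate `text` with the table registered under `name`."""
    if name not in TABLES:
        raise ValueError('Table "%s" not found in tables!' % name)

    table = TABLES[name]
    # A bare dict is treated as a character map with no diphthongs.
    if isinstance(table, dict):
        diphthongs, characters = {}, table
    else:
        diphthongs, characters = table

    # Stage 1: plain substring replacement of every diphthong.
    for source, target in diphthongs.items():
        text = text.replace(source, target)

    # Stage 2: per-character lookup; fall back to the None key, then to '_'.
    fallback = characters.get(None, '_')
    return ''.join(characters.get(ch, fallback) for ch in text)


diphthongs = {'11': 'AA', '22': 'BB'}
characters = {'a': 'z', 'b': 'y', 'c': 'x', None: '-'}

TABLES['test'] = (diphthongs, characters)
TABLES['t1'] = characters
TABLES['t2'] = ({}, characters)
TABLES['my'] = {'1': 'A', '2': 'B'}

# 'AA' and 'BB' from stage 1 are not in the character map, so they become '-'.
print(apply_table('11abc22cbaCC', 'test'))
# A bare dict and a tuple with empty diphthongs behave the same.
print(apply_table('11abc22cbaCC', 't1') == apply_table('11abc22cbaCC', 't2'))
# No None key in 'my', so unknown characters fall back to '_'.
print(apply_table('1 2 3 4 5 6 7 8 9 0', 'my'))
```

Running stage 2 over the output of stage 1 is what makes the diphthong results disappear unless the character map also lists them, as in the first 'test' table shown earlier.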

2.1 2016-09-19

  • Add Farsi alphabet (thx rodgar-nvkz)
  • Use pytest
  • Some code style refactoring

2.0 2013-04-01

  • Python 3 support
  • Class Trans for creating separate table namespaces

1.5 2012-09-12

  • Add support for the Kazakh alphabet.

1.4 2011-11-29

  • Change license to BSD.

1.3 2010-05-18

  • Table "id" renamed to "slug". Old name also available.
  • Some speed optimizations (thx to AndyLegkiy <andy.legkiy at gmail.com>).

1.2 2010-01-10

  • First public release.
  • Translate documentation to English.
