Phonetic algorithms for Russian, English, Swedish, Estonian and Finnish, based on Soundex and Metaphone.
The package implements both the transformation of phonemes into a letter-number sequence and a distance engine for comparing phonetic sequences (based on Levenshtein and Hamming distances).
Furthermore, both Russian phonetic algorithms support preprocessing for specific phoneme cases.
- Install this package via pip:

      pip install fonetika
- Import the Soundex algorithm.
The package offers several options: you can cut the resulting sequence (as in the original Soundex) or also encode vowels.

      from fonetika.soundex import RussianSoundex

      soundex = RussianSoundex(delete_first_letter=True)
      soundex.transform('ёлочка')
      # J070530

      soundex = RussianSoundex(delete_first_letter=True, code_vowels=True)
      soundex.transform('ёлочка')
      # JA7A53A

The library is designed to be extensible: the RussianSoundex class inherits from the base Soundex class (the original algorithm for English). To extend the algorithm, inherit your own class from Soundex and override its methods.
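The inherit-and-override pattern described above can be illustrated with a small self-contained sketch. It does not use fonetika: `SimpleSoundex`, `DigraphSoundex`, the `preprocess` hook and the `cut_result` flag are hypothetical names chosen for this example, and the methods you would actually override are those defined by fonetika's own Soundex class.

```python
class SimpleSoundex:
    """Simplified English Soundex; a stand-in, not fonetika's Soundex class."""

    # Consonant groups of the classic Soundex table.
    _TABLE = {
        **dict.fromkeys("bfpv", "1"),
        **dict.fromkeys("cgjkqsxz", "2"),
        **dict.fromkeys("dt", "3"),
        "l": "4",
        **dict.fromkeys("mn", "5"),
        "r": "6",
    }

    def __init__(self, cut_result=True):
        # cut_result is a hypothetical flag: pad/truncate to 4 characters.
        self.cut_result = cut_result

    def preprocess(self, word):
        """Hook for language-specific rules; the base class only lowercases."""
        return word.lower()

    def transform(self, word):
        word = self.preprocess(word)
        if not word:
            return ""
        first = word[0].upper()
        digits = []
        prev = self._TABLE.get(word[0], "")
        for ch in word[1:]:
            code = self._TABLE.get(ch, "")
            if code and code != prev:
                digits.append(code)
            if ch not in "hw":  # h and w do not separate equal codes
                prev = code
        result = first + "".join(digits)
        if self.cut_result:
            result = (result + "000")[:4]
        return result


class DigraphSoundex(SimpleSoundex):
    """Hypothetical extension: rewrite 'ph' as 'f' before coding."""

    def preprocess(self, word):
        return super().preprocess(word).replace("ph", "f")


print(SimpleSoundex().transform("Robert"))    # R163
print(DigraphSoundex().transform("Philip"))   # F410
```

Only the preprocessing hook is overridden in the subclass; the coding loop is inherited unchanged, which is the kind of reuse the class hierarchy is meant to allow.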
- Import the Soundex distance to compare strings.

      from fonetika.soundex import RussianSoundex
      from fonetika.distance import PhoneticsInnerLanguageDistance

      soundex = RussianSoundex(delete_first_letter=True)
      phon_distance = PhoneticsInnerLanguageDistance(soundex)
      phon_distance.distance('ёлочка', 'йолочка')
      # 0

- You can also calculate the distance between words from two languages. This is useful when working with languages of the same family.

      from fonetika.distance import PhoneticsBetweenLanguagesDistance

      # FinnishMetaphone and EstonianMetaphone must be imported as well.
      m1 = FinnishMetaphone(reduce_word=False)
      m2 = EstonianMetaphone(reduce_word=False)
      phon_distance = PhoneticsBetweenLanguagesDistance(m1, m2)
      phon_distance.distance('yö', 'öö')
      # 1
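The distance engine is described as being based on Levenshtein and Hamming distances. As a rough illustration of what these two metrics compute on phonetic codes, here is a minimal standalone sketch. The functions `hamming` and `levenshtein` are written for this example and are not fonetika's API; how the package combines or weights the two metrics internally is not shown here.

```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (x != y),  # substitution or match
            ))
        previous = current
    return previous[-1]


# The two codes shown earlier for 'ёлочка' (without and with coded vowels)
# differ in three positions.
print(hamming("J070530", "JA7A53A"))      # 3
print(levenshtein("J070530", "J070530"))  # 0
```

Hamming distance only applies to codes of equal length, such as cut Soundex codes, whereas Levenshtein distance also handles codes of different lengths.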