
koskenni/pytwolc


Experimental two-level rule compilation using Python HFST. For more information, see https://github.com/hfst/python

Rule compiler: twol.py

The Python program twol.py is a rule compiler and tester for rules of the simplified two-level model; see https://pytwolc.readthedocs.io/en/latest/formalism.html for more information on the rule formalism and the compiler. The HFST package can be installed using the command:

$ python3 -m pip install hfst

The program twol.py uses and depends on the 'tatsu' Python parser generator by Juancarlo Añez; see http://tatsu.readthedocs.io/en/stable/index.html for detailed documentation. You can install TatSu from the net using the command:

$ python3 -m pip install tatsu

The program is prepared to handle input in Unicode, including user-perceived graphemes that are composed of two or more Unicode characters (so-called code points). In order to recognize such graphemes, an additional package has to be installed:

$ python3 -m pip install grapheme
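To illustrate why code points and user-perceived graphemes differ, here is a minimal sketch that uses only the standard library. The helper simple_clusters is a hypothetical name introduced for this illustration; it only attaches combining marks to the preceding base character, which is a simplification of the full Unicode grapheme cluster rules that the grapheme package is meant to handle. It is not the code that twol.py itself uses.

```python
import unicodedata


def simple_clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplified illustration only: real grapheme segmentation has more
    rules than this (hypothetical helper, not part of pytwolc).
    """
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# 'a' followed by U+0308 COMBINING DIAERESIS is displayed as one grapheme
word = "ka\u0308si"
print(len(word))                   # number of code points
print(len(simple_clusters(word)))  # number of user-perceived graphemes
```

The two counts differ because the decomposed letter occupies two code points but is perceived as a single grapheme.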

The compiler needs two files: (1) examples as an FST and (2) a rule file. The human-readable examples must be converted into an FST using the twexamp.py program.

The compiler is normally executed as follows:

$ python3 twol.py examples.fst rules.twolc

One can get more information by using the --help parameter. More documentation on twol.py can be found at https://pytwolc.readthedocs.io/en/latest/compiletest.html

Converting examples from pair string format into an FST: twexamp.py

The module twexamp.py handles various tasks for the compiler during the compilation process. It is also needed for converting human-readable examples into an FST so that it is not necessary to recompile them at every step of testing rules. A recompilation is only needed when the examples are changed. To convert examples from the pair string format into an FST, you can run, for example:

$ python3 twexamp.py examples.pstr examples.fst
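The idea that the examples only need to be reconverted when they change can be sketched as a small helper script. This is an illustration under assumptions, not part of pytwolc: the function names needs_rebuild and workflow_commands are hypothetical, and the file names are the ones used in the commands above. The sketch only builds and prints the command lines; it does not run them.

```python
import os


def needs_rebuild(source, target):
    """Return True if target is missing or older than source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)


def workflow_commands(pstr, fst, rules):
    """Build the command lines for one test round (hypothetical helper).

    The conversion step is included only when the FST is out of date.
    """
    commands = []
    if needs_rebuild(pstr, fst):
        commands.append(["python3", "twexamp.py", pstr, fst])
    commands.append(["python3", "twol.py", fst, rules])
    return commands


for cmd in workflow_commands("examples.pstr", "examples.fst", "rules.twolc"):
    print(" ".join(cmd))
```

Each printed command could then be passed to subprocess.run, or the same dependency could be expressed as a Makefile rule.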

Morphophonemic representations

The sequence of programs parad2words.py, words2zerofilled.py, zerofilled2raw.py and raw2named.py is intended for determining the underlying or morphophonemic representations for word stems. It starts from a table of word forms or paradigms where morphs are separated from each other, e.g. by a period (.). See https://pytwolc.readthedocs.io/en/latest/morphophon.html for more information on their use. Each program is run from the command line, and one can get detailed information on the parameters by running the command with a --help argument, e.g.

$ python3 words2zerofilled.py --help
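As a small illustration of word forms whose morphs are separated by a period, the following sketch splits such forms into morphs. The function split_morphs and the sample word forms are made up for this example; the actual table layout expected by the programs is described in the documentation linked above.

```python
def split_morphs(word_form, separator="."):
    """Split a word form into its morphs (hypothetical helper)."""
    return word_form.split(separator)


# Made-up sample forms, not taken from the pytwolc example data
forms = ["kala", "kala.n", "kalo.i.ssa"]
for form in forms:
    print(form, "->", split_morphs(form))
```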

Some of the programs in this sequence need the package orderedset, which one can get from the net by

$ python3 -m pip install orderedset

In particular, the zero-filling program needs the same package for handling combined graphemes that twol.py uses:

$ python3 -m pip install grapheme

There is a Makefile in the subdirectory parad and examples which may help in testing and using the programs.

Discovering raw rules: twdiscov.py

This program builds tentative or raw rules out of a set of examples. The examples must be given one example per line as a space-separated list of symbol pairs. See https://pytwolc.readthedocs.io/en/latest/twdiscov.html for more information.
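The input layout described above, one example per line with space-separated symbol pairs, can be read with a few lines of Python. The sketch below is not taken from twdiscov.py. It assumes that a pair is written as input:output and that a token without a colon stands for an identical pair; both are assumptions to check against the linked documentation. The function name parse_example and the sample line are hypothetical.

```python
def parse_example(line):
    """Turn one example line into a list of (input, output) symbol pairs.

    Assumes 'input:output' tokens; a token without ':' is treated as an
    identical pair (assumption, see the twdiscov documentation).
    """
    pairs = []
    for token in line.split():
        upper, sep, lower = token.partition(":")
        pairs.append((upper, lower if sep else upper))
    return pairs


# Made-up sample line for illustration
print(parse_example("k {ao}:o l"))
```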
