Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

License

Unknown and 2 other licenses found

Licenses found

Unknown
LICENSE
Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT
NotificationsYou must be signed in to change notification settings

garvys-org/rustfst

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

License: MIT/Apache-2.0MaintenanceGithub tag

Rust

rustc >= 1.51.0Native Linux test statusDocumentation

Python

PyPI versionPyPI download monthPyPI pyversions

Rust implementation of Weighted Finite States Transducers.

Rustfst is a library for constructing, combining, optimizing, and searching weightedfinite-state transducers (FSTs). Weighted finite-state transducers are automata whereeach transition has an input label, an output label, and a weight.The more familiar finite-state acceptor is represented as a transducerwith each transition's input and output label equal. Finite-state acceptorsare used to represent sets of strings (specifically, regular or rational sets);finite-state transducers are used to represent binary relations between pairs ofstrings (specifically, rational transductions). The weights can be used to representthe cost of taking a particular transition.

FSTs have key applications in speech recognition and synthesis, machine translation,optical character recognition, pattern matching, string processing, machine learning,information extraction and retrieval among others. Often a weighted transducer is used torepresent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can beoptimized by determinization and minimization, models can be applied to hypothesis sets(also represented as automata) or cascaded by finite-state composition, and the bestresults can be selected by shortest-path algorithms.

fst

Overview

For a basicexample see the section below.

Some simple and commonly encountered types of FSTs can be easilycreated with the macro [fst] or the functionsacceptor andtransducer.

For more complex cases you will likely start with theVectorFst type, which will be importedin the [prelude] along with most everything else you need.VectorFst<TropicalWeight> correspondsdirectly to the OpenFSTStdVectorFst, and can be used to loadits files usingread orread_text.

Because "iteration" over an FST can mean many different things,there are a variety of different iterators. To iterate over stateIDs you may usestates_iter, while toiterate over transitions out of a state, you may useget_trs. Since it is common toiterate over both, this can be done usingfst_iter orfst_into_iter. Itis also very common to iterate over paths accepted by an FST,which can be done withpaths_iter, and as a conveniencefor generating text,string_paths_iter.Alternately, in the case of a linear FST, you may retrieve theonly possible path withdecode_linear_fst.

Note that iterating over paths is not the same thing as findingtheshortest path or paths, which is done withshortest_path (for a single path)orshortest_path_with_config(for N-shortest paths).

For the complete list of algorithms, see the [algorithms] module.

You may now be wondering, especially if you have previously usedsuch linguist-friendly tools aspyfoma, "what if I just wanttotransduce some text???" The unfriendly answer is thatrustfst is a somewhat lower-level library, designed forimplementing things like speech recognizers. The somewhat morehelpful answer is that you would do this by constructing anacceptor for your input, which you willcompose with atransducer, thenproject the result to itsoutput, and finallyiterate over the paths inthe resulting FST.

References

Implementation heavily inspired from Mehryar Mohri's, Cyril Allauzen's and Michael Riley's work :

The API closely resembles that of OpenFST, with somesimplifications and changes to make it more idiomatic in Rust, notablythe use ofTr instead ofArc. SeeDifferences fromOpenFST for more information.

Example

use anyhow::Result;use rustfst::prelude::*;use rustfst::algorithms::determinize::{DeterminizeType, determinize};use rustfst::algorithms::rm_epsilon::rm_epsilon;fnmain() ->Result<()>{// Creates a empty wFSTletmut fst =VectorFst::<TropicalWeight>::new();// Add some stateslet s0 = fst.add_state();let s1 = fst.add_state();let s2 = fst.add_state();// Set s0 as the start state    fst.set_start(s0)?;// Add a transition from s0 to s1    fst.add_tr(s0,Tr::new(3,5,10.0, s1))?;// Add a transition from s0 to s2    fst.add_tr(s0,Tr::new(5,7,18.0, s2))?;// Set s1 and s2 as final states    fst.set_final(s1,31.0)?;    fst.set_final(s2,45.0)?;// Iter over all the paths in the wFSTfor pin fst.paths_iter(){println!("{:?}", p);}// A lot of operations are available to modify/optimize the FST.// Here are a few examples :// - Remove useless states.connect(&mut fst)?;// - Optimize the FST by merging states with the same behaviour.minimize(&mut fst)?;// - Copy all the input labels in the output.project(&mut fst,ProjectType::ProjectInput);// - Remove epsilon transitions.rm_epsilon(&mut fst)?;// - Compute an equivalent FST but deterministic.    fst =determinize(&fst)?;Ok(())}

Differences from OpenFST

Here is a non-exhaustive list of ways in which Rustfst's APIdiffers from OpenFST:

  • The default epsilon symbol is<eps> and not<epsilon>.
  • Functions and methods follow Rust naming conventions,e.g.add_state rather thanAddState, but are otherwise mostlyequivalent, except that:
  • Transitions are calledTr and notArc, becauseArc has arather different and well-established meaning in Rust, and rustfstuses it (std::sync::Arc, that is) to reference-count symboltables. All associated functions also usetr.
  • Final states are not indicated by a final weight ofzero. Youcan test for finality usingis_final, andfinal_weight returns an [Option]. Thisrequires some care when converting OpenFST code.
  • Transitions can be accessed directly as a slice rather than requiringan iterator.
  • Semiring operations are expressed as plain old methods ratherthan strange C++ things. So writew1.plus(w2) rather thanPlus(w1, w2), for instance.
  • Weights have in-place operations for ⊕(plus_assign) and ⊗(times_assign).
  • Most of the type aliases (which would be trait aliases in Rust) suchasStdArc,StdFst, and so forth, are missing, but type inferenceallows us to avoid explicit type arguments in most cases, such aswhen calling [Tr::new], for instance.
  • State IDs are unsigned, with [NO_STATE_ID] used for a missing value.They are also 32 bits by default (presumably, 4 billion statesis enough for most applications). This means you must take care tocast them to [usize] when using them as indices, and vice-versa,preferably checking for overflows
  • Symbol IDs are also unsigned and 32-bits, with [NO_LABEL] usedfor a missing value.
  • Floating-point weights are not generic, so are always single-precision.

Benchmark with OpenFST

I did a benchmark some time ago on almost every linear fst algorithm and compared the results withOpenFst. You can find the results here :

Spoiler alert:Rustfst is faster on all those algorithms 😅

Documentation

The documentation of the last released version is available here :https://docs.rs/rustfst

Release process

  1. Use the scriptupdate_version.sh to update the version of every package.
  2. Push
  3. Push a new tag with the prefixrustfst-v

Example :

./update_version.sh 0.9.1-alpha.6git commit -am"Release 0.9.1-alpha.6"git pushgit tag -a rustfst-v0.9.1-alpha.6 -m"Release rustfst 0.9.1-alpha.6"  git push --tags

Optionally, if this is a major release, create a GitHub release in the UI.

Projects contained in this repository

This repository contains two main projects:

  • rustfst is the Rust re-implementation.
    • Crate available on crates.iohere
    • Documentation available on docs.rshere
  • rustfst-python is the python binding ofrustfst.
    • Package available on Pypihere
    • Documentation available on Github Pageshere

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

About

Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

Topics

Resources

License

Unknown and 2 other licenses found

Licenses found

Unknown
LICENSE
Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Contributors18

Languages


[8]ページ先頭

©2009-2025 Movatter.jp