A PyTorch Implementation of End-to-End Models for Speech-to-Text
awni/speech
Speech is an open-source package to build end-to-end models for automatic speech recognition. Sequence-to-sequence models with attention, Connectionist Temporal Classification and the RNN Sequence Transducer are currently supported.
The goal of this software is to facilitate research in end-to-end models for speech recognition. The models are implemented in PyTorch.
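As a rough illustration of what Connectionist Temporal Classification output looks like, the sketch below shows the standard CTC collapsing rule: merge consecutive repeated labels, then drop the blank symbol. It is not taken from this package's code. The function name `ctc_collapse` and the choice of index 0 for the blank are assumptions made for this example.

```python
from itertools import groupby

# Assumption for this sketch: label index 0 is the CTC blank symbol.
BLANK = 0


def ctc_collapse(frame_labels, blank=BLANK):
    """Map a per-frame label sequence to an output sequence.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove every blank label.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


if __name__ == "__main__":
    # The blank between the two runs of 3 keeps them as separate outputs.
    frames = [0, 3, 3, 0, 3, 5, 5, 0]
    print(ctc_collapse(frames))  # [3, 3, 5]
```

The blank is what allows a genuinely repeated label to survive the merge step, which is why it is removed only after repeats are merged.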
The software has only been tested in Python 3.6.
We will not be providing backward compatibility for Python 2.7.
We recommend creating a virtual environment and installing the Python requirements there.
virtualenv <path_to_your_env>
source <path_to_your_env>/bin/activate
pip install -r requirements.txt
Then follow the installation instructions for a version of PyTorch which works for your machine.
After all the Python requirements are installed, from the top level directory, run:
make
The build process requires CMake as well as Make.
After that, source setup.sh from the repo root.
source setup.sh
Consider adding this to your bashrc.
You can verify the install was successful by running the tests from the tests directory.
cd tests
pytest
To train a model, run:
python train.py <path_to_config>
After the model is done training, you can evaluate it with:
python eval.py <path_to_model> <path_to_data_json>
To see the available options for each script use -h:
python train.py -h
python eval.py -h
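If you want to script a train-then-evaluate run, one option is a small Python wrapper around the two commands above. This is a minimal sketch, not part of the package. The helper names (`train_command`, `eval_command`, `run`) and the example paths are hypothetical, and it uses only the positional arguments documented above.

```python
import subprocess
import sys


def train_command(config_path):
    """Build the documented call: python train.py <path_to_config>."""
    return [sys.executable, "train.py", config_path]


def eval_command(model_path, data_json):
    """Build the documented call: python eval.py <path_to_model> <path_to_data_json>."""
    return [sys.executable, "eval.py", model_path, data_json]


def run(command, dry_run=True):
    """Print the command, or execute it from the repo root when dry_run is False."""
    if dry_run:
        print(" ".join(command))
        return 0
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(command, check=True).returncode


if __name__ == "__main__":
    # Hypothetical placeholder paths; substitute your own config, model and data.
    run(train_command("path/to/config.json"))
    run(eval_command("path/to/model", "path/to/data.json"))
```

With the default `dry_run=True` the wrapper only prints the commands, which is a cheap way to check paths before starting a long training job.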
For examples of model configurations and datasets, visit the examples directory. Each example dataset should have instructions and/or scripts for downloading and preparing the data. There should also be one or more model configurations available. The results for each configuration will be documented in each example's corresponding README.md.