🎙 Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks
pannous/tensorflow-speech-recognition
Speech recognition using Google's TensorFlow deep learning framework, sequence-to-sequence neural networks.
Replaces caffe-speech-recognition, see there for some background.
This (relatively) old project is NO LONGER UP TO DATE.
The TensorFlow 1.0 API it was written against is no longer compatible with current TensorFlow releases, and the theory is no longer state of the art either.
We highly recommend you check out and use whisper.
Update 2020: Mozilla released DeepSpeech.
They achieve good error rates. Free Speech is in good hands; go there if you are an end user. For now, this project is only maintained for educational purposes.
Create a decent standalone speech recognizer for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (100GB here and 21GB here on openslr.org, synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.); we just need a simple yet powerful model. It's only a question of time...
Sample spectrogram, Karen uttering 'zero' at 160 words per minute.
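For readers new to spectrograms, here is a minimal sketch of how such an image could be computed from a raw waveform with NumPy. It is not the preprocessing used by this project: the function name `spectrogram`, the frame length, the hop size, and the synthetic 1 kHz test tone are all assumptions made for illustration.

```python
import numpy as np


def spectrogram(wave, frame_len=512, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative defaults, not values from this project.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(wave) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [wave[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One real FFT per frame; keep only the magnitude.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins and columns are time frames.
    return np.log(magnitude + 1e-10).T


# Hypothetical input: one second of a 1 kHz sine tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
```

Each column of `spec` describes one short time slice, so a spoken word shows up as a pattern of energy that changes from left to right. The resulting array can be fed to a classifier directly or displayed with any image plotting tool.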
    git clone https://github.com/pannous/tensorflow-speech-recognition
    cd tensorflow-speech-recognition
    git clone https://github.com/pannous/layer.git
    git clone https://github.com/pannous/tensorpeers.git

Requirements: portaudio from http://www.portaudio.com/

    git clone https://git.assembla.com/portaudio.git
    ./configure --prefix=/path/to/your/local
    make
    make install
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/your/local/lib
    export LIBRARY_PATH=$LIBRARY_PATH:/path/to/your/local/lib
    export CPATH=$CPATH:/path/to/your/local/include
    source ~/.bashrc
    pip install pyaudio

Toy examples:

    ./number_classifier_tflearn.py
    ./speaker_classifier_tflearn.py

Some less trivial architectures:

    ./densenet_layer.py

Later:

    ./train.sh
    ./record.py
Update: Nervana demonstrated that it is possible for 'independents' to build speech recognizers that are state of the art.
- Watch video: https://www.youtube.com/watch?v=u9FPqkuoEJ8
- Understand and correct the corresponding code: lstm-tflearn.py
- Data augmentation: create on-the-fly modulation of the data: increase the speech frequency, add background noise, alter the pitch, etc.
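The data augmentation task above could be approached along the following lines. This is a minimal NumPy sketch, not code from this repository: the helper names (`add_noise`, `change_speed`, `augment`, `augmented_batches`), the white-noise model, and the parameter ranges are all assumptions. Note that plain resampling changes tempo and pitch together; shifting the pitch independently would need a dedicated algorithm such as a phase vocoder.

```python
import numpy as np


def add_noise(wave, snr_db, rng):
    """Mix in white noise at the given signal-to-noise ratio (in dB)."""
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise


def change_speed(wave, factor):
    """Resample by linear interpolation; factor > 1 shortens the clip.

    Played back at the original sample rate, this raises tempo and pitch.
    """
    n_out = int(round(len(wave) / factor))
    old_idx = np.arange(len(wave))
    new_idx = np.linspace(0, len(wave) - 1, n_out)
    return np.interp(new_idx, old_idx, wave)


def augment(wave, rng):
    """Apply a random speed change followed by random background noise."""
    factor = rng.uniform(0.9, 1.1)   # assumed range: +/- 10 % speed
    snr_db = rng.uniform(10, 30)     # assumed range: 10 to 30 dB SNR
    return add_noise(change_speed(wave, factor), snr_db, rng)


def augmented_batches(waves, batch_size, seed=0):
    """Yield an endless stream of freshly augmented batches (on the fly)."""
    rng = np.random.default_rng(seed)
    while True:
        picks = rng.integers(0, len(waves), size=batch_size)
        yield [augment(waves[i], rng) for i in picks]
```

Because the generator draws new random parameters for every clip, the model rarely sees exactly the same waveform twice, and no augmented copies need to be stored on disk. Clips in a batch end up with slightly different lengths, so they would still need padding or cropping before being stacked into a tensor.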
Extensions to current tensorflow which are probably needed:
- WarpCTC on the GPU, see issue
- Incremental collaborative snapshots ('P2P learning')!
- Modular graphs/models + persistence
Even though this project is far from finished, we hope it gives you some starting points.
Looking for a tensorflow collaboration / consultant / deep learning contractor? Reach out to info@pannous.com