georgesterpu/pyVSR

Python toolkit for Visual Speech Recognition

License

GPL-3.0 license


Folders and files

.github
pyVSR
.gitignore
LICENSE
README.md
environment.yml
example_aam.py
example_dct.py
example_hmm.py
example_landmarks.py


pyVSR

Python toolkit for Visual Speech Recognition

About

pyVSR is a Python toolkit aimed at running Visual Speech Recognition (VSR) experiments in a traditional framework (e.g. handcrafted visual features, Hidden Markov Models for pattern recognition).

The main goal of pyVSR is to easily reproduce VSR experiments in order to have a baseline result on most publicly available audio-visual datasets.

What can you do with pyVSR:

1. Fetch a filtered list of files from a dataset

currently supported:
- TCD-TIMIT
  - speaker-dependent protocol (Gillen)
  - speaker-independent protocol (Gillen)
  - single person
- OuluVS2
  - speaker-independent protocol (Saitoh)
  - single person
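
As a rough illustration of what a speaker-independent selection involves, the sketch below splits a list of video paths so that the test speakers never appear in the training set. It is not pyVSR's API: the function name `split_by_speaker`, the assumption that the speaker ID is one of the directory components of each path, and the sample paths are all hypothetical.

```python
from pathlib import PurePosixPath


def split_by_speaker(paths, test_speakers):
    """Split paths into (train, test) lists by speaker ID.

    Assumption (hypothetical layout): the speaker ID appears as one of
    the directory components of each path.
    """
    test_speakers = set(test_speakers)
    train, test = [], []
    for path in paths:
        # Directory components only, without the file name itself.
        directories = set(PurePosixPath(path).parts[:-1])
        if directories & test_speakers:
            test.append(path)
        else:
            train.append(path)
    return train, test


# Hypothetical file layout, for illustration only.
files = [
    "data/spk01/clip_a.mp4",
    "data/spk02/clip_a.mp4",
    "data/spk03/clip_b.mp4",
]
train_files, test_files = split_by_speaker(files, test_speakers={"spk02"})
```

The actual speaker lists for each protocol come from the dataset publications, so they are deliberately not reproduced here.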

2. Extract visual features:

- Discrete Cosine Transform (DCT)
  - Automatic ROI extraction (grayscale, RGB, DCT)
  - Face alignment (from 5 stable landmarks)
  - Configurable window size
  - Fourth order accurate derivatives
  - Sample rate interpolation
  - Storage in HDF5 format
- Active Appearance Models (AAM)
  - Do NOT require manually annotated landmarks
  - Face, lips, and chin models supported
  - Parameters obtainable either through fitting or projection
  - Implementation based on Menpo
- Point cloud of facial landmarks
  - OpenFace wrapper
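
To make "fourth order accurate derivatives" concrete, here is a minimal, dependency-free sketch of the standard five-point central difference applied to one feature coefficient tracked over time. The function name `fourth_order_derivative` and the edge-replication used at the boundaries are assumptions made for this illustration; pyVSR's own implementation and boundary handling may differ.

```python
def fourth_order_derivative(values, step=1.0):
    """Approximate the first derivative of a uniformly sampled sequence.

    Interior points use the five-point central stencil
        f'(t) ~ (f(t-2h) - 8 f(t-h) + 8 f(t+h) - f(t+2h)) / (12 h),
    whose truncation error is O(h^4). The two samples nearest each edge
    are computed on an edge-replicated copy of the sequence, a
    simplification chosen for this sketch that lowers accuracy there.
    """
    n = len(values)
    if n == 0:
        return []
    # Repeat the first and last samples twice so the stencil always fits.
    padded = [values[0]] * 2 + list(values) + [values[-1]] * 2
    out = []
    for i in range(2, n + 2):
        out.append(
            (padded[i - 2] - 8 * padded[i - 1] + 8 * padded[i + 1] - padded[i + 2])
            / (12 * step)
        )
    return out


# One feature coefficient sampled over 7 frames: f(t) = t**3.
track = [float(t) ** 3 for t in range(7)]
deltas = fourth_order_derivative(track)
```

The stencil is exact for polynomials up to degree four, so the interior values of `deltas` match the analytic derivative `3 * t**2` of the cubic used above.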

3. Train Hidden Markov Models (HMMs)

- easy HTK wrapper for Python
- optional bigram language model
- multi-threaded support (for both training and decoding, at full CPU power)
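
The general pattern behind a multi-threaded wrapper around command-line tools can be sketched as follows. This is not pyVSR's wrapper: `run_commands` is a hypothetical helper, and the commands are stand-ins that call the Python interpreter so the example runs without HTK installed.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_commands(commands, max_workers=None):
    """Run each command (a list of arguments) in its own subprocess.

    Threads are sufficient here because each worker only waits on an
    external process. Exit codes are returned in the order of `commands`.
    """
    def run_one(cmd):
        return subprocess.run(cmd, capture_output=True, text=True).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


# Stand-in commands; in practice each entry would be an HTK tool
# invocation with its own arguments.
commands = [[sys.executable, "-c", f"print({i})"] for i in range(4)]
exit_codes = run_commands(commands, max_workers=2)
```

A non-zero entry in `exit_codes` identifies which command failed, which helps when many jobs run at once.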

4. Extend the support for additional features

pyVSR has a simple, modular, object-oriented architecture

Examples

Please refer to the attached examples.

pyVSR was re-designed to simplify its usage on multiple datasets.

Users can provide their own dictionaries of (input, output) pairs for all of pyVSR's functionalities.
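
One possible way to build such a dictionary is sketched below, mapping each input video to the file where its features would be stored. The helper `build_io_pairs`, the sample paths, and the `.h5` extension are assumptions for illustration; the example scripts show the exact structure each pyVSR component expects.

```python
from pathlib import PurePosixPath


def build_io_pairs(video_paths, output_dir, extension=".h5"):
    """Map each input video path to an output feature file path.

    The output name reuses the input file's stem, so inputs that share
    a file name in different folders would collide.
    """
    pairs = {}
    for video in video_paths:
        stem = PurePosixPath(video).stem
        pairs[video] = str(PurePosixPath(output_dir) / (stem + extension))
    return pairs


# Hypothetical paths, for illustration only.
pairs = build_io_pairs(
    ["videos/spk01_clip_a.mp4", "videos/spk01_clip_b.mp4"],
    output_dir="features/dct",
)
```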

Installing pyVSR

The recommended way is to create an empty conda environment and install the following dependencies:

conda install -c menpo menpo menpofit menpodetect menpowidgets
conda install -c menpo pango harfbuzz
conda install h5py
conda install natsort
conda install scipy

Alternatively, you can use the environment.yml file:

conda env create -f environment.yml

It is the user's responsibility to compile OpenFace and HTK.
Please refer to the documentation upstream:
- OpenFace
- HTK 3.5

Add the HTK binaries to the system path (e.g. /usr/local/bin/) or to ./pyVSR/bins/htk/
Add the OpenFace binaries to ./pyVSR/bins/openface/
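
A quick way to check whether a tool is visible after installation is sketched below. The helper `locate_tool` is hypothetical and not part of pyVSR; it looks in the given local folders first and then on the system path. `HVite` is used only as an example of an HTK command-line tool.

```python
import os
import shutil


def locate_tool(name, extra_dirs=()):
    """Return the full path of an executable, or None if it is not found.

    `extra_dirs` are searched first, then the directories on PATH.
    """
    search_path = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    return shutil.which(name, path=search_path)


# The directory below is the local location named in the notes above.
hvite = locate_tool("HVite", extra_dirs=["./pyVSR/bins/htk/"])
print("HVite found at:", hvite)
```

A result of `None` means the binary is neither in the local folder nor on the system path.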

pyVSR was initially developed on a system running Manjaro Linux, frequently updated from the testing repositories. We also successfully tested it on Windows systems.

If you are not interested in using the AAM module, you can skip installing a large number of Python packages. In that case, we recommend running the example scripts and installing whichever dependencies turn out to be missing (opencv, dlib, numpy).

How to cite

If you use this work, please cite it as:

George Sterpu and Naomi Harte. Towards lipreading sentences using active appearance models. In AVSP, Stockholm, Sweden, August 2017.


Contact

We are always happy to hear from you:

George Sterpu sterpug [at] tcd.ie
Naomi Harte nharte [at] tcd.ie


Releases (2)

HTK 3.4.1 Windows binaries (latest), Nov 18, 2017
+ 1 more release
