NVIDIA/tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

License

BSD-3-Clause license

Folders and files

filelists
text
waveglow @ 5bc2a53
.gitmodules
Dockerfile
LICENSE
README.md
audio_processing.py
data_utils.py
demo.wav
distributed.py
hparams.py
inference.ipynb
layers.py
logger.py
loss_function.py
loss_scaler.py
model.py
multiproc.py
plotting_utils.py
requirements.txt
stft.py
tensorboard.png
train.py
utils.py

Tacotron 2 (without wavenet)

PyTorch implementation of Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions.

This implementation includes distributed and automatic mixed precision support and uses the LJSpeech dataset.

Distributed and Automatic Mixed Precision support relies on NVIDIA's Apex and AMP.

Visit our website for audio samples using our published Tacotron 2 and WaveGlow models.

Pre-requisites

NVIDIA GPU + CUDA cuDNN

Setup

1. Download and extract the LJ Speech dataset
2. Clone this repo: git clone https://github.com/NVIDIA/tacotron2.git
3. CD into this repo: cd tacotron2
4. Initialize submodule: git submodule init; git submodule update
5. Update .wav paths: sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' filelists/*.txt
   - Alternatively, set load_mel_from_disk=True in hparams.py and update mel-spectrogram paths
6. Install PyTorch 1.0
7. Install Apex
8. Install python requirements or build docker image
   - Install python requirements: pip install -r requirements.txt
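
The .wav path update above can also be sketched in Python, which may help on systems without GNU sed. This is a minimal illustration, not part of the repository: the function name `update_wav_paths` is hypothetical, and the example lines assume a "path|transcript" filelist layout.

```python
def update_wav_paths(lines, dataset_wav_dir, placeholder="DUMMY"):
    """Replace every occurrence of the placeholder in each filelist line.

    Mirrors the global substitution done by: sed 's,DUMMY,<dir>,g'
    """
    return [line.replace(placeholder, dataset_wav_dir) for line in lines]


# Hypothetical filelist entries, assumed to be in "path|transcript" form.
example_lines = [
    "DUMMY/LJ001-0001.wav|first transcript",
    "DUMMY/LJ001-0002.wav|second transcript",
]

for line in update_wav_paths(example_lines, "ljs_dataset_folder/wavs"):
    print(line)
```

To apply this to real files, one would read each file under filelists/, pass its lines through the function, and write the result back.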

Training

1. python train.py --output_directory=outdir --log_directory=logdir
2. (OPTIONAL) tensorboard --logdir=outdir/logdir

Training using a pre-trained model

Training using a pre-trained model can lead to faster convergence.
By default, the dataset-dependent text embedding layers are ignored.

1. Download our published Tacotron 2 model
2. python train.py --output_directory=outdir --log_directory=logdir -c tacotron2_statedict.pt --warm_start
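
The idea of ignoring dataset-dependent layers during a warm start can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: plain dicts stand in for model state dicts, the layer names are hypothetical, and the helper `warm_start_state` is invented for this example.

```python
def warm_start_state(model_state, checkpoint_state, ignore_layers):
    """Merge checkpoint weights into the model state, skipping ignored layers.

    Layers listed in ignore_layers keep the model's own (freshly
    initialized) values; all other layers come from the checkpoint.
    """
    kept = {name: weights
            for name, weights in checkpoint_state.items()
            if name not in ignore_layers}
    merged = dict(model_state)  # start from the new model's weights
    merged.update(kept)         # overwrite with the checkpoint weights we keep
    return merged


# Hypothetical layer names; lists stand in for tensors.
new_model = {"embedding.weight": [0.0, 0.0], "decoder.weight": [0.0]}
checkpoint = {"embedding.weight": [1.0, 2.0], "decoder.weight": [3.0]}

merged = warm_start_state(new_model, checkpoint, ignore_layers=["embedding.weight"])
print(merged)
```

Skipping the text embedding is useful when the new dataset uses a different symbol set, because the embedding table's shape and meaning depend on that set.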

Multi-GPU (distributed) and Automatic Mixed Precision Training

python -m multiproc train.py --output_directory=outdir --log_directory=logdir --hparams=distributed_run=True,fp16_run=True
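
The --hparams flag above takes comma-separated name=value overrides. The sketch below shows how such a string could be turned into a dictionary; it is a hypothetical parser written for illustration, and the repository's own parsing may differ in both behavior and supported value types.

```python
def _coerce(raw):
    """Convert a raw string to bool, int, or float where possible."""
    if raw in ("True", "False"):
        return raw == "True"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # fall back to the plain string


def parse_overrides(spec):
    """Parse 'a=1,b=True' into {'a': 1, 'b': True} (illustrative only)."""
    overrides = {}
    for item in spec.split(","):
        if not item.strip():
            continue  # tolerate empty segments such as a trailing comma
        name, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {item!r}")
        overrides[name.strip()] = _coerce(raw.strip())
    return overrides


print(parse_overrides("distributed_run=True,fp16_run=True"))
```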

Inference demo

1. Download our published Tacotron 2 model
2. Download our published WaveGlow model
3. jupyter notebook --ip=127.0.0.1 --port=31337
4. Load inference.ipynb

N.b. When performing Mel-Spectrogram to Audio synthesis, make sure Tacotron 2 and the Mel decoder were trained on the same mel-spectrogram representation.
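
One way to guard against a mismatched representation is to compare the mel-spectrogram settings of both models before synthesis. The sketch below is an assumption-laden illustration: the helper `mel_mismatches` is hypothetical, the parameter names are assumed to be the ones that define the representation, and the values are made up for the example.

```python
# Parameter names assumed to define the mel-spectrogram representation.
MEL_KEYS = (
    "sampling_rate", "filter_length", "hop_length", "win_length",
    "n_mel_channels", "mel_fmin", "mel_fmax",
)


def mel_mismatches(tacotron_cfg, decoder_cfg, keys=MEL_KEYS):
    """Return {name: (tacotron_value, decoder_value)} for settings that differ."""
    return {
        key: (tacotron_cfg.get(key), decoder_cfg.get(key))
        for key in keys
        if tacotron_cfg.get(key) != decoder_cfg.get(key)
    }


# Illustrative values only; read the real ones from each model's configuration.
tacotron_cfg = {"sampling_rate": 22050, "hop_length": 256, "n_mel_channels": 80}
decoder_cfg = {"sampling_rate": 22050, "hop_length": 200, "n_mel_channels": 80}

print(mel_mismatches(tacotron_cfg, decoder_cfg))
```

An empty result means the compared settings agree; any entry identifies a parameter to reconcile before pairing the two models.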

Related repos

WaveGlow: Faster than real time Flow-based Generative Network for Speech Synthesis.

nv-wavenet: Faster than real time WaveNet.

Acknowledgements

This implementation uses code from the following repos: Keith Ito, Prem Seetharaman, as described in our code.

We are inspired by Ryuichi Yamamoto's Tacotron PyTorch implementation.

We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang and Zongheng Yang.
