EEND (End-to-End Neural Diarization) is a neural-network-based speaker diarization method.
- BLSTM EEND (INTERSPEECH 2019)
- Self-attentive EEND (ASRU 2019)
An extension of EEND to a variable number of speakers is also provided in this repository:
- Self-attentive EEND with encoder-decoder based attractors (INTERSPEECH 2020)
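All of the models above are trained with a permutation-free objective [1, 2, 3]: the network predicts frame-wise speech activity for every speaker jointly, and the binary cross-entropy is evaluated under the best permutation of the speaker labels. Below is a minimal NumPy sketch of that loss; the shapes and names are illustrative, not the repository's API.

```python
import itertools
import numpy as np

def pit_bce_loss(pred, label, eps=1e-8):
    """Permutation-free binary cross-entropy.

    pred:  (T, S) frame-wise speech-activity probabilities in (0, 1)
    label: (T, S) frame-wise 0/1 speaker-activity labels
    Returns the BCE of the best speaker permutation.
    """
    T, S = label.shape

    def bce(p, y):
        # mean binary cross-entropy over frames and speakers
        return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

    # Evaluate every permutation of the predicted speaker columns
    # and keep the minimum loss (permutation-invariant training).
    losses = [bce(pred[:, list(perm)], label)
              for perm in itertools.permutations(range(S))]
    return min(losses)

# Toy example: 2 speakers, 4 frames; predictions are close to the labels
# but with the speaker columns swapped.
label = np.array([[1, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
pred = np.clip(label[:, ::-1] * 0.9 + 0.05, 0.05, 0.95)
print(pit_bce_loss(pred, label))  # small, because the swap is permuted away
```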
Requirements:
- NVIDIA CUDA GPU
- CUDA Toolkit (8.0 <= version <= 10.1)
Install Kaldi and the Python environment:
```
cd tools
make
```
- This command builds Kaldi at `tools/kaldi`.
  - If you want to use a pre-built Kaldi:
    ```
    cd tools
    make KALDI=<existing_kaldi_root>
    ```
    This option makes a symlink at `tools/kaldi`.
- This command extracts miniconda3 at `tools/miniconda3` and creates a conda environment named 'eend'.
- Then, it installs Chainer and CuPy into the 'eend' environment.
  - By default, CUDA in `/usr/local/cuda/` is used.
  - If you need to specify your CUDA path:
    ```
    cd tools
    make CUDA_PATH=/your/path/to/cuda-8.0
    ```
    This command installs cupy-cudaXX according to your CUDA version.
    See https://docs-cupy.chainer.org/en/stable/install.html#install-cupy
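After the build finishes, a quick sanity check run inside the 'eend' environment confirms that Chainer and CuPy were installed against a working CUDA setup. This is an optional sketch, not part of the recipes:

```python
# Minimal sanity check for the 'eend' conda environment (run inside it).
import chainer
import cupy

print("chainer", chainer.__version__)
print("cupy   ", cupy.__version__)

# chainer.backends.cuda.available is True when CuPy and the CUDA
# toolkit were picked up correctly.
print("CUDA available:", chainer.backends.cuda.available)
print("GPU count     :", cupy.cuda.runtime.getDeviceCount())

# Allocate a small array on the GPU as an end-to-end check.
x = cupy.arange(10) ** 2
print(x.sum())  # 285
```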
Test recipe (mini_librispeech):
- Modify `egs/mini_librispeech/v1/cmd.sh` according to your job scheduler.
  - If you use your local machine, use "run.pl".
  - If you use Grid Engine, use "queue.pl".
  - If you use SLURM, use "slurm.pl".
  - For more information about cmd.sh, see http://kaldi-asr.org/doc/queue.html.
Data preparation:
```
cd egs/mini_librispeech/v1
./run_prepare_shared.sh
```
Run training, inference, and scoring:
```
./run.sh
```
- If you use encoder-decoder based attractors [3], modify `run.sh` to use `config/eda/{train,infer}.yaml` (a conceptual sketch of the attractor computation follows this list).
- See `RESULT.md` and compare with your result.
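Conceptually, the encoder-decoder attractor (EDA) module [3] turns a sequence of frame embeddings into a variable number of speaker attractors: an LSTM encoder summarizes the embeddings, an LSTM decoder fed with zero vectors emits one candidate attractor per step, and a sigmoid head on each attractor estimates whether that speaker exists. The sketch below uses a single-layer LSTM with random weights purely to illustrate shapes and data flow; it is not the repository's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # embedding / attractor dimension

def lstm_cell(x, h, c, W):
    """One step of a standard LSTM cell with weight dict W."""
    z = W["Wx"] @ x + W["Wh"] @ h + W["b"]          # (4*D,)
    i, f, g, o = np.split(z, 4)
    i, f, o = 1 / (1 + np.exp(-i)), 1 / (1 + np.exp(-f)), 1 / (1 + np.exp(-o))
    c = f * c + i * np.tanh(g)
    return o * np.tanh(c), c

def make_weights():
    return {"Wx": rng.normal(0, 0.1, (4 * D, D)),
            "Wh": rng.normal(0, 0.1, (4 * D, D)),
            "b": np.zeros(4 * D)}

enc_W, dec_W = make_weights(), make_weights()
w_exist, b_exist = rng.normal(0, 0.1, D), 0.0       # attractor-existence head

emb = rng.normal(size=(200, D))                     # (T, D) frame embeddings

# Encoder: consume the embedding sequence into a summary state.
h = c = np.zeros(D)
for e in emb:
    h, c = lstm_cell(e, h, c, enc_W)

# Decoder: feed zero vectors; each step yields one candidate attractor
# plus the probability that this speaker actually exists.
attractors, exist_probs = [], []
for _ in range(4):                                  # up to 4 candidate speakers
    h, c = lstm_cell(np.zeros(D), h, c, dec_W)
    attractors.append(h)
    exist_probs.append(1 / (1 + np.exp(-(w_exist @ h + b_exist))))

A = np.stack(attractors)                            # (4, D) candidate attractors
# In EEND-EDA, decoding stops once the existence probability drops below
# a threshold; here we just print the probabilities.
print(np.round(exist_probs, 3))

# Frame-wise speech activity for each candidate speaker.
activity = 1 / (1 + np.exp(-(emb @ A.T)))           # (T, 4)
print(activity.shape)
```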
CALLHOME experiments:
- Modify `egs/callhome/v1/cmd.sh` according to your job scheduler.
  - If you use your local machine, use "run.pl".
  - If you use Grid Engine, use "queue.pl".
  - If you use SLURM, use "slurm.pl".
  - For more information about cmd.sh, see http://kaldi-asr.org/doc/queue.html.
- Modify `egs/callhome/v1/run_prepare_shared.sh` according to the storage paths of your corpora.
Data preparation:
```
cd egs/callhome/v1
./run_prepare_shared.sh
# If you want to conduct 1-4 speaker experiments, run below.
# You also have to set paths to your corpora properly.
./run_prepare_shared_eda.sh
```
Run training, inference, and scoring:
```
# Self-attentive model trained on 2-speaker mixtures
./run.sh
# BLSTM-based model trained on 2-speaker mixtures
local/run_blstm.sh
# Self-attentive model with encoder-decoder attractors trained on 1-4-speaker mixtures
./run_eda.sh
```
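Each run script ends by converting the network's frame-wise speaker-activity posteriors into diarization segments before scoring. In the papers this is done by thresholding the posteriors and smoothing them with a median filter; the sketch below illustrates that post-processing with assumed parameter values (0.5 threshold, 11-frame median filter, 10 ms frame shift), which may differ from the recipes' configs.

```python
import numpy as np
from scipy.signal import medfilt

def posteriors_to_segments(post, threshold=0.5, median=11, frame_shift=0.01):
    """Turn (T, S) speaker-activity posteriors into (speaker, start_s, end_s) segments.

    threshold, median and frame_shift are illustrative values, not the
    recipes' defaults.
    """
    segments = []
    for spk in range(post.shape[1]):
        # Binarize, then smooth with a median filter to remove spurious flips.
        active = medfilt((post[:, spk] > threshold).astype(float), median)
        # Find contiguous runs of active frames.
        edges = np.diff(np.concatenate(([0.0], active, [0.0])))
        starts = np.where(edges == 1)[0]
        ends = np.where(edges == -1)[0]
        for s, e in zip(starts, ends):
            segments.append((spk, s * frame_shift, e * frame_shift))
    return segments

# Toy posteriors: speaker 0 active in frames 10-49, speaker 1 in 40-89.
post = np.zeros((100, 2))
post[10:50, 0] = 0.9
post[40:90, 1] = 0.8
print(posteriors_to_segments(post))
# -> speaker 0 from ~0.1 s to ~0.5 s, speaker 1 from ~0.4 s to ~0.9 s
```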
References:
[1] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Permutation-free Objectives," Proc. INTERSPEECH, pp. 4300-4304, 2019
[2] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Self-attention," Proc. ASRU, pp. 296-303, 2019
[3] Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Kenji Nagamatsu, "End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors," Proc. INTERSPEECH, 2020
Citation:
```
@inproceedings{Fujita2019Interspeech,
  author={Yusuke Fujita and Naoyuki Kanda and Shota Horiguchi and Kenji Nagamatsu and Shinji Watanabe},
  title={{End-to-End Neural Speaker Diarization with Permutation-free Objectives}},
  booktitle={Interspeech},
  pages={4300--4304},
  year=2019
}
```