reazonspeech-espnet-v2

reazonspeech-espnet-v2 is an automatic speech recognition (ASR) model trained on the ReazonSpeech v2.0 corpus.

Model Architecture

The general architecture is the same as reazonspeech-espnet-v1.

  • Conformer-Transducer model with 118.85M parameters.

  • We trained this model for 33 epochs using the Adam optimizer. The maximum learning rate was 0.02, with 15000 warmup steps.

  • The training audio files were sampled at 16 kHz. Make sure that your input audio files have the same sampling rate.
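
A quick way to confirm that an input file matches the 16 kHz training rate is to read the rate from the WAV header before transcription. The following is a minimal sketch, not part of the reazonspeech library: the helper names `wav_sample_rate` and `has_expected_rate` are hypothetical, and it relies on Python's standard `wave` module, which reads uncompressed PCM WAV files only.

```python
import wave

# 16 kHz, the sampling rate of the training audio stated in the model card.
EXPECTED_RATE = 16000


def wav_sample_rate(path):
    """Return the sample rate in Hz recorded in a PCM WAV file header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate()


def has_expected_rate(path, expected=EXPECTED_RATE):
    """Return True if the file's sample rate equals the expected rate."""
    return wav_sample_rate(path) == expected
```

If `has_expected_rate("speech.wav")` returns False, resample the file to 16 kHz with an audio tool of your choice before passing it to the model.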

Usage

We recommend using this model through our reazonspeech library.

    from reazonspeech.espnet.asr import load_model, transcribe, audio_from_path

    audio = audio_from_path("speech.wav")
    model = load_model()
    ret = transcribe(model, audio)
    print(ret.text)
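
When transcribing several files, it may help to load the model once and reuse it. The sketch below is an illustration only: `transcribe_files` is a hypothetical helper, not part of reazonspeech. It takes the three library functions as arguments and assumes only what the usage example shows, namely that `load_model()` returns a model, `audio_from_path(path)` returns an audio object, and the result of `transcribe(model, audio)` has a `.text` attribute.

```python
def transcribe_files(paths, load_model, audio_from_path, transcribe):
    """Transcribe each path with one shared model; return {path: text}.

    The three callables are expected to behave like the functions of the
    same names in reazonspeech.espnet.asr (an assumption of this sketch).
    """
    model = load_model()  # load once, reuse for every file
    results = {}
    for path in paths:
        audio = audio_from_path(str(path))
        results[str(path)] = transcribe(model, audio).text
    return results
```

With the library installed, a call might look like `transcribe_files(["a.wav", "b.wav"], load_model, audio_from_path, transcribe)` after importing the three functions from `reazonspeech.espnet.asr`.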

License

Apache License 2.0
