Omni SenseVoice: High-Speed Speech Recognition with Word Timestamps 🗣️🎯
lifeiteng/OmniSenseVoice
Built on SenseVoice, Omni SenseVoice is optimized for lightning-fast inference and precise timestamps, giving you a smarter, faster way to handle audio transcription!
pip3 install OmniSenseVoice
omnisense transcribe [OPTIONS] AUDIO_PATH
Key Options:
- `--language`: Automatically detect the language or specify one (`auto`, `zh`, `en`, `yue`, `ja`, `ko`).
- `--textnorm`: Choose whether to apply inverse text normalization (`withitn` for inverse-normalized output or `woitn` for raw output).
- `--device-id`: Run on a specific GPU (default: `-1` for CPU).
- `--quantize`: Use a quantized model for faster processing.
- `--help`: Display detailed help information.
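If you want to drive the CLI from a script, the options above can be assembled into a command line. The following is a minimal sketch, not part of OmniSenseVoice: `build_transcribe_command` is a hypothetical helper, its default values are illustrative choices rather than the CLI's documented defaults (only `--device-id -1` is stated above), and it assumes `--quantize` is a boolean flag that takes no value.

```python
import shlex
from typing import List

# Values listed for --language and --textnorm in the options above.
LANGUAGES = {"auto", "zh", "en", "yue", "ja", "ko"}
TEXTNORMS = {"withitn", "woitn"}


def build_transcribe_command(
    audio_path: str,
    language: str = "auto",    # helper default, chosen for illustration
    textnorm: str = "woitn",   # helper default, chosen for illustration
    device_id: int = -1,       # -1 means CPU, per the option list
    quantize: bool = False,    # assumption: --quantize is a bare flag
) -> List[str]:
    """Return the argument list for `omnisense transcribe` (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if textnorm not in TEXTNORMS:
        raise ValueError(f"unsupported textnorm: {textnorm!r}")

    cmd = [
        "omnisense", "transcribe",
        "--language", language,
        "--textnorm", textnorm,
        "--device-id", str(device_id),
    ]
    if quantize:
        cmd.append("--quantize")
    cmd.append(audio_path)
    return cmd


if __name__ == "__main__":
    # "sample.wav" is a placeholder path.
    cmd = build_transcribe_command("sample.wav", language="en", device_id=0)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list (rather than one string) avoids shell quoting problems when the audio path contains spaces.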
omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
Optimize | test set | GPU | WER ⬇️ | RTF ⬇️ | Speed Up 🔥 |
---|---|---|---|---|---|
onnx | dev-clean[:100] | NVIDIA L4 GPU | 4.47% | 0.1200 | 1x |
torch | dev-clean[:100] | NVIDIA L4 GPU | 5.02% | 0.0022 | 50x |
onnx (fix cudnn) | dev-clean[all] | NVIDIA L4 GPU | 5.60% | 0.0027 | 50x |
torch | dev-clean[all] | NVIDIA L4 GPU | 6.39% | 0.0019 | 50x |
fix cudnn: `cudnn_conv_algo_search: DEFAULT`
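The `fix cudnn` setting reads like an ONNX Runtime CUDA execution provider option. How OmniSenseVoice applies it internally is not shown here, so the following is only a sketch of how such an option is typically passed when creating an ONNX Runtime session. `cuda_providers` is a hypothetical helper and `model.onnx` is a placeholder path.

```python
from typing import Any, Dict, List, Tuple, Union

Provider = Union[str, Tuple[str, Dict[str, Any]]]


def cuda_providers(device_id: int = 0) -> List[Provider]:
    """Provider list with cuDNN convolution algorithm search set to DEFAULT.

    ONNX Runtime accepts EXHAUSTIVE, HEURISTIC, or DEFAULT for
    `cudnn_conv_algo_search`; the note above uses DEFAULT.
    """
    return [
        (
            "CUDAExecutionProvider",
            {"device_id": device_id, "cudnn_conv_algo_search": "DEFAULT"},
        ),
        # Fallback if the CUDA provider is unavailable.
        "CPUExecutionProvider",
    ]


if __name__ == "__main__":
    print(cuda_providers(0))
    # With onnxruntime-gpu installed, the list would be used like this:
    #   import onnxruntime as ort
    #   session = ort.InferenceSession("model.onnx", providers=cuda_providers(0))
```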
- With Omni SenseVoice, experience up to 50x faster processing without sacrificing accuracy.
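For readers unfamiliar with the table's columns, here is a small sketch of the standard definitions: RTF (real-time factor) is processing time divided by audio duration, and WER (word error rate) is word-level edit distance divided by the number of reference words. The benchmark's exact scoring, such as any text normalization before WER, is not described here, so these functions are illustrative rather than a copy of the benchmark code. The speed-up calculation assumes the `onnx` row is the 1x baseline.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF: seconds of compute per second of audio (lower is faster)."""
    return processing_seconds / audio_seconds


def speed_up(baseline_rtf: float, rtf: float) -> float:
    """How many times faster `rtf` is than `baseline_rtf`."""
    return baseline_rtf / rtf


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER via word-level Levenshtein distance (no text normalization)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # RTF values from the first two table rows (onnx vs. torch, dev-clean[:100]).
    print(f"speed-up: {speed_up(0.1200, 0.0022):.1f}x")
    print(f"WER: {word_error_rate('the cat sat', 'the cat sit'):.2%}")
```

Using the first two rows, 0.1200 / 0.0022 is roughly 54.5, which is consistent with the rounded 50x figure in the table.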
# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts
lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
  -s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
  benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
Set up pre-commit hooks:
pip install pre-commit==3.6.0
pre-commit install
Submit your awesome improvements through a PR. 😊