Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯

lifeiteng/OmniSenseVoice

The Ultimate Speech Recognition Solution

Built on SenseVoice, Omni SenseVoice is optimized for lightning-fast inference and precise timestamps, giving you a smarter, faster way to handle audio transcription!

Install

pip3 install OmniSenseVoice

Usage

omnisense transcribe [OPTIONS] AUDIO_PATH

Key Options:

  • --language: Language of the audio; use auto to detect it automatically, or specify one of zh, en, yue, ja, ko.
  • --textnorm: Choose whether to apply inverse text normalization (withitn for inverse-normalized output or woitn for raw output).
  • --device-id: Run on a specific GPU (default: -1 for CPU).
  • --quantize: Use a quantized model for faster processing.
  • --help: Display detailed help information.
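The options above can be combined into a single call. The following is a minimal sketch, not part of the project: the `transcribe` wrapper function, its defaults, the `DRY_RUN` variable, and the file name `sample.wav` are all hypothetical, and passing each option value as a separate space-delimited argument is an assumption. By default it only prints the command so it can be inspected before anything runs.

```shell
#!/bin/sh
# Hypothetical wrapper around `omnisense transcribe` (not part of OmniSenseVoice).
# Only options listed under "Key Options" are used; the defaults below are this
# sketch's own choices, not necessarily the CLI's defaults.
transcribe() {
    audio_path=$1
    language=${2:-auto}     # one of: auto, zh, en, yue, ja, ko
    textnorm=${3:-woitn}    # withitn or woitn
    device_id=${4:--1}      # -1 selects the CPU, 0 and up select a GPU

    # Assemble the command as positional parameters so quoting is preserved.
    set -- omnisense transcribe \
        --language "$language" \
        --textnorm "$textnorm" \
        --device-id "$device_id" \
        "$audio_path"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"           # print the command instead of running it
    else
        "$@"                # run it (requires OmniSenseVoice to be installed)
    fi
}

# Print the command for English audio with inverse text normalization on GPU 0.
transcribe sample.wav en withitn 0
```

Setting `DRY_RUN=0` in the environment makes the wrapper execute the command. `--quantize` is left out because the text above does not say whether it takes a value.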

Benchmark

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

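| Optimize       | test set        | GPU           | WER ⬇️ | RTF ⬇️ | Speed Up 🔥 |
|----------------|-----------------|---------------|--------|--------|-------------|
| onnx           | dev-clean[:100] | NVIDIA L4 GPU | 4.47%  | 0.1200 | 1x          |
| torch          | dev-clean[:100] | NVIDIA L4 GPU | 5.02%  | 0.0022 | 50x         |
| onnx fix cudnn | dev-clean[all]  | NVIDIA L4 GPU | 5.60%  | 0.0027 | 50x         |
| torch          | dev-clean[all]  | NVIDIA L4 GPU | 6.39%  | 0.0019 | 50x         |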
  • fix cudnn: cudnn_conv_algo_search: DEFAULT
  • With Omni SenseVoice, experience up to 50x faster processing with only a modest change in WER (4.47% to 5.02% on dev-clean[:100]).
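To relate the RTF and Speed Up figures, here is a small arithmetic sketch using the two dev-clean[:100] results (RTF 0.1200 for onnx, 0.0022 for torch). It rests on two assumptions that the benchmark does not state: that RTF is processing time divided by audio duration, and that Speed Up is the baseline RTF divided by the optimized RTF. The variable names are illustrative only.

```shell
#!/bin/sh
# RTF values copied from the benchmark results (dev-clean[:100], NVIDIA L4 GPU).
rtf_onnx=0.1200
rtf_torch=0.0022

# Assumption: speed-up = baseline RTF / optimized RTF.
speedup=$(awk -v a="$rtf_onnx" -v b="$rtf_torch" 'BEGIN { printf "%.1f", a / b }')

# Assumption: RTF = processing time / audio duration,
# so processing one hour (3600 s) of audio takes RTF * 3600 seconds.
secs_onnx=$(awk -v r="$rtf_onnx" 'BEGIN { printf "%.1f", r * 3600 }')
secs_torch=$(awk -v r="$rtf_torch" 'BEGIN { printf "%.1f", r * 3600 }')

echo "speed-up: ${speedup}x"
echo "1 h of audio: onnx ${secs_onnx}s, torch ${secs_torch}s"
```

Under these assumptions the ratio comes out at about 54.5, in line with the rounded 50x reported, and an hour of audio takes roughly 432 seconds at RTF 0.1200 versus about 8 seconds at RTF 0.0022.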
# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts

lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
    -s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
    benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

Contributing 🙌

Step 1: Code Formatting

Set up pre-commit hooks:

pip install pre-commit==3.6.0
pre-commit install

Step 2: Pull Request

Submit your awesome improvements through a PR. 😊
