lifeiteng/OmniSenseVoice


Omni SenseVoice: High-Speed Speech Recognition with Word Timestamps 🗣️🎯

License

Apache-2.0 license


Omni SenseVoice 🚀

The Ultimate Speech Recognition Solution

Built on SenseVoice, Omni SenseVoice is optimized for fast inference and precise timestamps, giving you a smarter, faster way to handle audio transcription!

Install

pip3 install OmniSenseVoice

Usage

omnisense transcribe [OPTIONS] AUDIO_PATH

Key Options:

--language: Automatically detect the language or specify (auto, zh, en, yue, ja, ko).
--textnorm: Choose whether to apply inverse text normalization (withitn for inverse-normalized output or woitn for raw output).
--device-id: Run on a specific GPU (default: -1 for CPU).
--quantize: Use a quantized model for faster processing.
--help: Display detailed help information.
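
To call the CLI from a script, the options above can be assembled into an argument list. The following is a minimal sketch, not part of the package: `build_transcribe_command` and `transcribe` are hypothetical helper names, the default values are illustrative choices, and `--quantize` is assumed to be a boolean switch that takes no value.

```python
import shutil
import subprocess
from typing import List, Optional

# Values taken from the option descriptions above.
LANGUAGES = {"auto", "zh", "en", "yue", "ja", "ko"}
TEXTNORMS = {"withitn", "woitn"}


def build_transcribe_command(
    audio_path: str,
    language: str = "auto",
    textnorm: str = "woitn",
    device_id: int = -1,
    quantize: bool = False,
) -> List[str]:
    """Assemble an `omnisense transcribe` argument list (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if textnorm not in TEXTNORMS:
        raise ValueError(f"unsupported textnorm: {textnorm!r}")
    cmd = [
        "omnisense", "transcribe",
        "--language", language,
        "--textnorm", textnorm,
        "--device-id", str(device_id),  # -1 selects the CPU
    ]
    if quantize:
        cmd.append("--quantize")  # assumption: flag without a value
    cmd.append(audio_path)
    return cmd


def transcribe(audio_path: str, **options) -> Optional[str]:
    """Run the CLI if it is installed; return its stdout, else None."""
    if shutil.which("omnisense") is None:
        return None
    result = subprocess.run(
        build_transcribe_command(audio_path, **options),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Validating `language` and `textnorm` before spawning the process surfaces typos early instead of relying on the CLI's own error message.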

Benchmark

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

Optimize	test set	GPU	WER ⬇️	RTF ⬇️	Speed Up 🔥
onnx	dev-clean[:100]	NVIDIA L4 GPU	4.47%	0.1200	1x
torch	dev-clean[:100]	NVIDIA L4 GPU	5.02%	0.0022	50x
onnx `fix cudnn`	dev-clean[all]	NVIDIA L4 GPU	5.60%	0.0027	50x
torch	dev-clean[all]	NVIDIA L4 GPU	6.39%	0.0019	50x

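fix cudnn: cudnn_conv_algo_search: DEFAULT
With Omni SenseVoice, experience up to 50x faster processing (RTF 0.1200 down to 0.0022 on dev-clean[:100]), with WER staying within about one percentage point between the onnx and torch runs on each test set.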
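
For readers unfamiliar with the metric, RTF (real-time factor) is processing time divided by audio duration, and the speed-up is the ratio of two RTFs. The sketch below recomputes the ratios from the RTF values in the table. Using the first onnx row as the baseline for every row is an assumption, since that baseline was only measured on dev-clean[:100]; the function names are illustrative.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the audio decoded."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speed_up(baseline_rtf: float, rtf: float) -> float:
    """How many times faster a run is than the baseline run."""
    return baseline_rtf / rtf


# RTF values copied from the benchmark table above.
BASELINE_RTF = 0.1200  # onnx, dev-clean[:100]
RUNS = {
    "torch, dev-clean[:100]": 0.0022,
    "onnx (fix cudnn), dev-clean[all]": 0.0027,
    "torch, dev-clean[all]": 0.0019,
}

for name, rtf in RUNS.items():
    print(f"{name}: {speed_up(BASELINE_RTF, rtf):.1f}x")
```

The computed ratios come out at roughly 44x to 63x, so the "50x" in the table reads as a rounded figure.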

# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts
lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
    -s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
    benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
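
Before running the benchmark it can help to check how much audio the cuts manifest covers, since that total is the denominator of the RTF. This is a rough sketch that reads the JSONL file directly rather than going through Lhotse. It assumes each line is a JSON object with a top-level `duration` field in seconds; `total_audio_seconds` is a hypothetical helper name.

```python
import gzip
import json
from pathlib import Path


def total_audio_seconds(manifest_path: str) -> float:
    """Sum the `duration` field over all entries of a JSONL manifest.

    Assumption: one JSON object per line, each with a top-level
    `duration` in seconds. Handles both .jsonl and .jsonl.gz files.
    """
    path = Path(manifest_path)
    opener = gzip.open if path.suffix == ".gz" else open
    total = 0.0
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                total += float(json.loads(line)["duration"])
    return total


if __name__ == "__main__":
    manifest = "benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl"
    if Path(manifest).exists():
        print(f"{total_audio_seconds(manifest) / 3600:.2f} hours of audio")
```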

Contributing 🙌

Step 1: Code Formatting

Set up pre-commit hooks:

pip install pre-commit==3.6.0
pre-commit install

Step 2: Pull Request

Submit your awesome improvements through a PR. 😊

Releases

6 releases. Latest: Improve device setting (Mar 7, 2025).

