speech

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything

speech image-editing caption data-generation 3d-whole-body-pose-estimation open-vocabulary-detection open-vocabulary-segmentation automatic-labeling-system

Updated Sep 5, 2024
Jupyter Notebook

kaldi-asr/kaldi

Star 15.3k

kaldi-asr/kaldi is the official location of the Kaldi project.

shell c-plus-plus cuda speech speech-recognition speech-to-text kaldi speaker-verification speaker-id

Updated Sep 22, 2025
Shell

AIGC-Audio/AudioGPT

Star 10.2k

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audio music speech sound gpt talking-head

Updated Jul 6, 2024
Python

mozilla/TTS

Star 10.1k

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

python text-to-speech deep-learning speech pytorch tts vocoder tacotron tensorflow2 tacotron2 melgan speaker-encoder dataset-analysis glow-tts multiband-melgan gantts

Updated Nov 9, 2023
Jupyter Notebook

modelscope/modelscope

Star 8.7k

ModelScope: bring the notion of Model-as-a-Service to life.

python nlp science machine-learning deep-learning cv speech multi-modal

Updated Feb 19, 2026
Python

netease-youdao/EmotiVoice

Star 8.4k

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

python text-to-speech ai deep-learning style prompt speech emotion pytorch tts speech-synthesis multi-speaker emotivoice

Updated Aug 13, 2024
Python

snakers4/silero-vad

Star 8.2k

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

voice-commands speech pytorch voice-recognition vad voice-control speech-processing voice-detection voice-activity-detection onnx onnxruntime onnx-runtime

Updated Feb 12, 2026
Python

PaddlePaddle/models

Star 6.9k

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

nlp natural-language-processing computer-vision deep-learning neural-network models cv speech recommendation paddlepaddle

Updated Jan 15, 2025
Python

TalAter/annyang

Star 6.7k

💬 Speech recognition for your site

voice speech speech-recognition speech-to-text

Updated Aug 7, 2024
JavaScript

OpenBMB/VoxCPM

Star 5.9k

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

audio python text-to-speech speech pytorch tts speech-synthesis deeplearning voice-cloning tts-model minicpm

Updated Feb 11, 2026
Python

snakers4/silero-models

Star 5.8k

Silero Models: pre-trained text-to-speech models made embarrassingly simple

text-to-speech speech pytorch tts speech-synthesis colab armenian russian speech-to-text ukrainian pretrained-models georgian belarus kyrgyz uzbek kazakh azerbaijani tajik tts-models torch-hub

Updated Feb 3, 2026
Jupyter Notebook

MahmoudAshraf97/whisper-diarization

Star 5.4k

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

speech speech-recognition speech-to-text whisper asr speaker-diarization

Updated Nov 26, 2025
Jupyter Notebook

huggingface/speech-to-speech

Star 4.5k

Speech To Speech: an effort for an open-sourced and modular GPT-4o

python machine-learning ai speech speech-synthesis assistant speech-to-text language-model speech-translation

Updated Feb 20, 2026
Python

fixie-ai/ultravox

Star 4.4k

A fast multimodal LLM for real-time voice

ai speech slm llm

Updated Dec 12, 2025
Python

cactus-compute/cactus

Star 4.3k

Low-latency AI inference engine for mobile devices & wearables

android ios mobile framework ai speech edge transformer smartphone whisper llm llms llamacpp llm-inference

Updated Feb 20, 2026
C
