speech-processing

Here are 728 public repositories matching this topic...

Repositories by language: All 728, Python 324, Jupyter Notebook 128, MATLAB 36, JavaScript 29, C++ 23, C 19, HTML 18, Java 17, Swift 7, C# 6

speechbrain/speechbrain

Stars: 10.9k

A PyTorch-based Speech Toolkit

audio deep-learning transformers pytorch voice-recognition speech-recognition speech-to-text language-model speaker-recognition speaker-verification speech-processing audio-processing asr speaker-diarization speechrecognition speech-separation speech-enhancement spoken-language-understanding huggingface speech-toolkit

Updated Dec 15, 2025
Python

pyannote/pyannote-audio

Stars: 8.9k

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

pytorch pretrained-models speaker-recognition speaker-verification speech-processing speaker-diarization voice-activity-detection speech-activity-detection speaker-change-detection speaker-embedding overlapped-speech-detection

Updated Dec 13, 2025
Jupyter Notebook

snakers4/silero-vad

Stars: 7.7k

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

voice-commands speech pytorch voice-recognition vad voice-control speech-processing voice-detection voice-activity-detection onnx onnxruntime onnx-runtime

Updated Dec 10, 2025
Python

pliang279/awesome-multimodal-ml

Stars: 6.8k

Reading list for research topics in multimodal machine learning

machine-learning natural-language-processing reinforcement-learning computer-vision deep-learning robotics healthcare reading-list representation-learning speech-processing multimodal-learning

Updated Aug 20, 2024

microsoft/torchscale

Stars: 3.1k

Foundation Architecture for (M)LLMs

machine-learning natural-language-processing translation computer-vision transformer speech-processing multimodal pretrained-language-model

Updated Apr 11, 2024
Python

linto-ai/whisper-timestamped

Stars: 2.7k

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

python machine-learning deep-learning speech transformers python3 pytorch speech-recognition speech-to-text attention-mechanism whisper speech-processing asr speaker-diarization attention-model attention-is-all-you-need attention-seq2seq attention-visualization attention-network multilingual-models

Updated Sep 9, 2025
Python

r9y9/wavenet_vocoder

Stars: 2.4k

WaveNet vocoder

python speech pytorch speech-synthesis wavenet speech-processing wavenet-vocoder neural-vocoder

Updated Jul 29, 2023
Python

resemble-ai/resemble-enhance

Stars: 2.1k

AI powered speech denoising and enhancement

speech-processing denoise speech-enhancement speech-denoising

Updated Dec 3, 2024
Python

DigitalPhonetics/IMS-Toucan

Stars: 2.1k

Controllable and fast Text-to-Speech for over 7000 languages!

text-to-speech deep-learning toolkit speech pytorch tts speech-synthesis speech-processing

Updated Jun 30, 2025
Python

r9y9/deepvoice3_pytorch

Stars: 2k

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

python machine-learning end-to-end pytorch tts speech-synthesis speech-processing multi-speaker

Updated Dec 19, 2023
Python

wq2012/awesome-diarization

Stars: 1.8k

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

machine-learning awesome deep-learning speech-recognition awesome-list speech-processing speaker-diarization

Updated Jul 22, 2025

TEN-framework/ten-vad

Stars: 1.8k

Voice Activity Detector (VAD): low-latency, high-performance, and lightweight

audio real-time voice-commands speech voice-recognition vad automatic-speech-recognition speech-processing conversational-ai voice-activity-detection voice-agent silero-vad

Updated Dec 15, 2025
C

coqui-ai/open-speech-corpora

Stars: 1.4k

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

text-to-speech tts speech-synthesis voice-recognition speech-recognition speech-to-text stt speech-processing voice-activity-detection speech-separation speech-emotion-recognition voice-cloning

Updated Jun 6, 2024

haoheliu/voicefixer

Stars: 1.3k

General Speech Restoration

speech tts speech-synthesis super-resolution speech-processing vocoder speech-analysis denoise mel speech-enhancement dereverberation declipping

Updated Feb 17, 2025
Python

mravanelli/SincNet

Stars: 1.2k

SincNet is a neural architecture for efficiently processing raw audio samples.

audio python deep-learning signal-processing waveform cnn pytorch artificial-intelligence speech-recognition neural-networks convolutional-neural-networks digital-signal-processing filtering speaker-recognition speaker-verification speech-processing audio-processing asr timit speaker-identification

Updated Apr 28, 2021
Python

ictnlp/StreamSpeech

Stars: 1.2k

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

text-to-speech translation machine-translation voice speech tts speech-synthesis speech-recognition speech-to-text all-in-one speech-processing audio-processing asr streaming-audio seamless speech-enhancement speech-translation non-autoregressive text-to-audio simultaneous-translation

Updated Jun 29, 2025
Python

midas-research/audino

Stars: 1.1k

Open source audio annotation tool for humans

python machine-learning datasets speech-processing audio-processing annotation-tool audio-annotation

Updated Dec 14, 2025
TypeScript

X-LANCE/SLAM-LLM

Stars: 939

A Framework for Speech, Language, Audio, Music Processing with Large Language Model

speech-processing audio-processing peft music-processing large-language-model multimodal-large-language-models

Updated Oct 24, 2025
Python

nyrahealth/CrisperWhisper

Stars: 880

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

audio recognition detection speech speech-recognition filler transcription whisper speech-processing asr timestamps verbatim

Updated Jun 3, 2025
Python

Ryuk17/SpeechAlgorithms

Stars: 838

You can find the speech algorithms you want here

speech-processing

Updated Jul 26, 2025
C
