speaker-embedding

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

audio macos swift ios real-time avfoundation nvidia vad automatic-speech-recognition speech-to-text ane speaker-recognition asr speaker-diarization voice-activity-detection coreml speaker-identification speaker-embedding parakeet

Updated Feb 20, 2026
Swift

yistLin/dvector

Star 286

Speaker embedding (d-vector) trained with GE2E loss

pytorch speaker-verification speaker-embedding ge2e torchscript speaker-encoder dvector

Updated Jan 8, 2024
Python

Walleclipse/Deep_Speaker-speaker_recognition_system

Star 253

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

keras speech speaker-recognition triplet-loss speaker-embedding

Updated Apr 27, 2020
Python

Chris10M/Lip2Speech

Star 93

A pipeline that reads lips and generates speech for the content read, i.e., lip-to-speech synthesis.

real-time deep-learning pytorch speech-synthesis lip-reading speaker-embedding lipreading liptospeech

Updated Jul 23, 2025
Python

yuyq96/D-TDNN

Star 90

PyTorch implementation of Densely Connected Time Delay Neural Network

speech speaker-recognition speaker-verification speaker-diarization time-delay-neural-network speaker-embedding speaker-adaptation temporal-convolutional-network d-tdnn

Updated May 4, 2023
Python

gokhaneraslan/chatterbox-finetuning

Star 73

Fine-tuning toolkit for Chatterbox TTS and Chatterbox TURBO models. Supports 23 languages with smart vocabulary extension. Features offline preprocessing, automatic VAD trimming, and voice cloning. Train custom TTS models on your own dataset in LJSpeech and file-based formats.

multilingual text-to-speech pytorch tts speech-synthesis transformer turkish-language audio-processing chatterbox voice-synthesis fine-tuning voice-training speaker-embedding voice-cloning gpt2 chatterbox-tts chatterbox-multilingual chatterbox-tts-turbo

Updated Feb 20, 2026
Python

juanmc2005/SpeakerEmbeddingLossComparison

Star 61

Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP 2020

pytorch metric-learning speaker-verification end-to-end-machine-learning speaker-embedding x-vector sincnet additive-angular-margin-loss

Updated Oct 7, 2020
Jupyter Notebook

ranchlai/awesome-speaker-embedding

Star 52

A curated list of speaker-embedding, speaker-verification, and speaker-identification resources.

speaker-recognition speaker-verification speaker-embedding

Updated Aug 12, 2021

maxhollmann/voxceleb-luigi

Star 43

Luigi pipeline to download VoxCeleb(2) audio from YouTube and extract speaker segments

luigi speaker-recognition speaker-verification voxceleb speaker-embedding

Updated Mar 29, 2021
Python

swshon/voxceleb-ivector

Star 43

VoxCeleb1 i-vector-based speaker recognition system

kaldi speaker-recognition speaker-verification i-vector speaker-identification voxceleb voxceleb1 speaker-embedding

Updated May 22, 2018
Perl

Picovoice/eagle

Star 41

On-device speaker recognition engine powered by deep learning

speaker-recognition speaker-identification speaker-embedding

Updated Feb 13, 2026
Python

PiotrTa/Huawei-Challenge-Speaker-Identification

Star 36

Trained speaker-embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.

voice-recognition speaker-recognition speaker-verification speech-processing voice-activity-detection speaker-identification speaker-embedding

Updated Oct 4, 2019
Jupyter Notebook

iPRoBe-lab/1D-Triplet-CNN

Star 32

PyTorch implementation of the 1D-Triplet-CNN neural network model described in "Fusing MFCC and LPC Features using 1D Triplet CNN for Speaker Recognition in Severely Degraded Audio Signals" by A. Chowdhury and A. Ross.

deep-learning convolutional-neural-networks speaker-recognition speaker-verification speaker-embedding

Updated Jan 13, 2020
Python

PlayVoice/VI-Speaker

Star 30

Speaker embedding for VI-SVC and VI-SVS, and also for VITS; use it in place of the ID to implement voice cloning.

speaker-identification speaker-embedding vits voice-clone

Updated Sep 16, 2022
Python

bunyaminergen/awesome-speech-dataset

Star 26

Awesome Speech Dataset, including download links and a brief explanation for each resource. These datasets provide diverse and high-quality speech data covering various domains such as conversational, academic, political, and more.

text-to-speech speech speech-recognition speech-to-text speaker-recognition speaker-verification speech-processing speaker-diarization speech-analysis speech-enhancement speech-emotion-recognition speaker-identification speaker-embedding