asr
Here are 1,442 public repositories matching this topic...
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Updated Oct 21, 2025 - Python
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Updated Dec 17, 2025 - Python
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Updated Dec 8, 2025 - Jupyter Notebook
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Updated Oct 20, 2025 - Python
A PyTorch-based Speech Toolkit
Updated Dec 15, 2025 - Python
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Ascend NPU, x86_64 servers, a websocket server/client, and 12 programming languages.
Updated Dec 17, 2025 - C++
Multilingual Voice Understanding Model
Updated Aug 15, 2025 - Python
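🤖 wukong-robot is a simple, flexible, and elegant Chinese-language voice dialogue robot / smart speaker project. It supports multi-turn conversation with ChatGPT and may be the first open-source smart speaker project to support brain-computer interaction.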
Updated Oct 25, 2024 - Python
This is a Python API that lets you get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.
Updated Oct 13, 2025 - Python
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Updated Nov 26, 2025 - Jupyter Notebook
Production First and Production Ready End-to-End Speech Recognition Toolkit
Updated Dec 16, 2025 - Python
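Streamer-Sales (销冠, "Sales Champion"): an LLM for livestream sales hosts 🛒🎁 that presents a product, based on its given features, in a way meant to spark the viewer's desire to buy. 🚀⭐ Includes a detailed data generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a front end built on the Vue ecosystem 🍍, a back end built with FastAPI 🗝️, and packaging and deployment with Docker-compose 🐋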
Updated Mar 8, 2025 - Python
OpenAI Whisper ASR Webservice API
Updated Nov 23, 2025 - Python
faster_whisper GUI with PySide6
Updated Dec 8, 2024 - Python
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
Updated Nov 7, 2025
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Updated Sep 9, 2025 - Python
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Updated Mar 11, 2024 - C++
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
Updated Mar 14, 2022 - Python