text-to-audio
Here are 68 public repositories matching this topic...
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
- Updated
May 27, 2025 - Python
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
- Updated
Oct 30, 2025 - Python
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
- Updated
Sep 24, 2025 - Python
HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation.
- Updated
Sep 28, 2025 - Python
A webui for different audio related Neural Networks
- Updated
May 19, 2025 - Python
A family of diffusion models for text-to-audio generation.
- Updated
Jul 29, 2025 - Python
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
- Updated
Jun 29, 2025 - Python
[NeurIPS 2025] PyTorch implementation of ThinkSound, a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
- Updated
Sep 19, 2025 - Python
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
- Updated
Jul 29, 2025 - Jupyter Notebook
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
- Updated
May 22, 2024 - Python
OpenMusic: SOTA Text-to-music (TTM) Generation
- Updated
Jun 26, 2025 - Python
Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch
- Updated
Jan 17, 2023 - Python
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
- Updated
Apr 4, 2025 - HTML
Mustango: Toward Controllable Text-to-Music Generation
- Updated
Jun 2, 2025 - Python
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
- Updated
Oct 8, 2025 - Python
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
- Updated
Sep 21, 2025 - Jupyter Notebook
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
- Updated
Mar 25, 2024 - Jupyter Notebook
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
- Updated
Dec 13, 2021 - Python
Subtitle to audio: generates audio from any subtitle file using Coqui-ai TTS and synchronizes the audio timing with the subtitle timestamps.
- Updated
Dec 14, 2023 - Python
PyTorch implementation of SoundCTM
- Updated
Mar 31, 2025 - Python