audio-datasets
Here are 21 public repositories matching this topic...
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
- Updated Jun 6, 2024
A collection of datasets for the purpose of emotion recognition/detection in speech.
- Updated Sep 30, 2024 - HTML
Open-source audio datasets.
- Updated Sep 7, 2023
A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]
- Updated Apr 29, 2025 - Python
Python library for handling audio datasets.
- Updated Jul 6, 2023 - Python
A library built for easier audio self-supervised training and downstream task evaluation.
- Updated Sep 25, 2025 - Python
This package aims to simplify downloading the AudioSet dataset.
- Updated Jul 17, 2025 - Python
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework
- Updated May 22, 2023 - Python
A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.
- Updated Nov 19, 2023 - Python
KATube is a tool that automates the creation of datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a list of YouTube playlists or channels, KATube generates a dataset of audio and text.
- Updated Jul 27, 2024 - Python
GitHub Repository for the Survey Paper on Audio-Language Datasets for Scenes and Events
- Updated Feb 7, 2025 - Jupyter Notebook
Download speech datasets (English and non-English) for Automatic Speech Recognition
- Updated Jan 22, 2023 - Jupyter Notebook
[v.1.0] Lingualibre Languages Gallery in VueJS.
- Updated Aug 5, 2024 - CSS
Synthetic and real datasets of waterflow sounds for the repo 'Neural-Texture-Sound-Synthesis-with-physically-driven-continuous-controls'.
- Updated Aug 30, 2023
Top dataset for voice conversion models.
- Updated Oct 28, 2023
This repository contains the resources our team used over the course of the CLEF competition.
- Updated May 27, 2022 - Jupyter Notebook
Playback Helper Website for Recording Stem-Separated Datasets
- Updated Jan 19, 2026 - JavaScript