zabir-nabil/torch-speech-dataloader

License

MIT license

torch-speech-dataloader

A ready-to-use PyTorch dataloader for audio classification, speech classification, speaker recognition, etc., with in-GPU augmentations.

- PyTorch speech dataloader with 5 (or fewer) lines of code: `get_torch_speech_dataloader_from_config(config)`
- Batch augmentation on the GPU, powered by torch-audiomentations
- RIR augmentation with any set of IR file(s) [cpu]
- MUSAN-like augmentation with any set of source files. Customizable. [cpu]
- Written in one night, may contain bugs!

Install

pip install -U git+https://github.com/zabir-nabil/torch-speech-dataloader.git@main

Use

    import torch

    from torch_speech_dataloader import get_torch_speech_dataloader, get_torch_speech_dataloader_from_config
    from torch_speech_dataloader.augmentation_utils import placeholder_gpu_augmentation

    config_1 = {
        "filenames": ["../test.wav"] * 5 + ["../test_hindi.wav"] * 5,
        "speech_labels": ["test"] * 5 + ["test2"] * 5,
        "batch_size": 3,
        "num_workers": 5,
        "device": torch.device("cuda:1"),
        "sanity_check_path": "../sanity_test",
        "sanity_check_samples": 2,
        "batch_audio_augmentation": placeholder_gpu_augmentation,
        "rirs_reverb": {"apply": True},
        "musan_augmentation": {"apply": True, "mix_multiples_max_count": -1, "musan_max_len": 1.0},
        "verbose": 0,
    }

    dummy_tsdl = get_torch_speech_dataloader_from_config(config_1)

    # Each batch yields the audio tensor and its labels.
    for d, l in dummy_tsdl.get_batch():
        print(d.shape)
        print(l)
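The `filenames` and `speech_labels` entries of the config must line up one-to-one. Below is a minimal sketch of one way to build them, assuming a hypothetical layout in which each label has its own sub-folder of wav files (`data_root/<label>/*.wav`). The helper name `collect_files_and_labels` and the folder layout are illustrative only and are not part of the library.

```python
from pathlib import Path


def collect_files_and_labels(data_root, pattern="*.wav"):
    """Return parallel lists (filenames, speech_labels) from data_root/<label>/<file>.

    Hypothetical helper: the folder-per-label layout is an assumption,
    not something torch-speech-dataloader requires.
    """
    filenames, speech_labels = [], []
    # Sort so the order is reproducible across runs and platforms.
    for label_dir in sorted(p for p in Path(data_root).iterdir() if p.is_dir()):
        for wav_path in sorted(label_dir.glob(pattern)):
            filenames.append(str(wav_path))
            speech_labels.append(label_dir.name)
    return filenames, speech_labels
```

The two returned lists can then be placed under the `filenames` and `speech_labels` keys of the config dictionary.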

Others

`config` parameters

- `filenames`: A list of file paths to the audio / speech files (usually wav).
- `speech_labels`: The labels corresponding to `filenames`, one per audio file.
- `batch_size`: Batch size of the dataloader.
- `num_workers`: Number of dataloader workers.
- `device`: torch device [default: cpu].
- `sanity_check_path`: To inspect the generated audio, specify a path where sample augmented audio files will be saved.
- `sanity_check_samples`: Number of sample audio files to store in the sanity check folder.
- `batch_audio_augmentation`: Any transform (compose) / augmentation that takes a tensor of dimension `[B x C x N]`. It usually runs on the GPU batch if a GPU device is specified, otherwise on the CPU batch.
- `rirs_reverb`:
  - `apply`: This augmentation is applied to each audio individually only if `apply` is true.
  - `reverb_source_files_path`: A list of IR file paths.
- `musan_augmentation`:
  - `apply`: This augmentation is applied to each audio individually only if `apply` is true.
  - `musan_config`: `{ "music": ([list of music file paths], range_for_num_music_files_to_use, range_for_noise_snr), "speech": ([list of speech file paths], range_for_num_speech_files_to_use, range_for_noise_snr), }` [example: `augmentation_utils.placeholder_musan_config`]
  - `mix_multiples_max_count`: Multiple noise types can be mixed (music + noise + ...); this is the maximum number of noise types to mix.
  - `musan_max_len`: `<= 0`: crop the MUSAN noise to the same length as the input audio; `> 0`: maximum length of the cropped MUSAN noise (in seconds).
- `audio_augmentation`: List of funcs that can be applied to a single audio with shape `[N,]`.
- `features`: Feature extraction, `[N,]` -> `[T,F]`.
- `feature_augmentation`: List of funcs that can be applied to a single feature with shape `[T,F]`.
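
The `musan_config` structure described above maps each noise type to a tuple of (file list, range for the number of files to use, range for the noise SNR). A minimal sketch of assembling such a dictionary from folders of noise files follows. The helper `build_musan_config`, the `(min, max)` tuple form of the ranges, and any folder names or numeric values passed to it are assumptions for illustration; `augmentation_utils.placeholder_musan_config` remains the reference for the exact format the library expects.

```python
from pathlib import Path


def build_musan_config(noise_dirs, num_files_ranges, snr_ranges, pattern="*.wav"):
    """Build {noise_type: (file_paths, num_files_range, snr_range)}.

    Hypothetical helper. The (min, max) tuple form of the ranges is an
    assumption; compare with augmentation_utils.placeholder_musan_config.
    """
    config = {}
    for noise_type, folder in noise_dirs.items():
        # rglob also picks up files in nested sub-folders.
        files = sorted(str(p) for p in Path(folder).rglob(pattern))
        if not files:
            raise ValueError(f"no files matching {pattern!r} under {folder!r}")
        config[noise_type] = (files, num_files_ranges[noise_type], snr_ranges[noise_type])
    return config
```

If the format matches, the resulting dictionary would go under `config["musan_augmentation"]["musan_config"]`. Failing early on an empty folder is a deliberate choice here, so that a wrong path is caught before training starts.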
