Data manipulation and transformation for audio signal processing, powered by PyTorch


classicvalues/audio

 
 


The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. Because all computations in torchaudio are expressed as PyTorch operations, it is easy to use and feels like a natural extension of PyTorch.

Dependencies

  • PyTorch (See below for the compatible versions)
  • [optional] vesis84/kaldi-io-for-python commit cb46cb1f44318a5d04d4941cf39084c5b021241e or above

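The following are the corresponding torchaudio versions and supported Python versions.

|                          | torch            | torchaudio     | python        |
|--------------------------|------------------|----------------|---------------|
| Development              | master / nightly | main / nightly | >=3.7, <=3.10 |
| Latest versioned release | 1.12.0           | 0.12.0         | >=3.7, <=3.10 |
| LTS                      | 1.8.2            | 0.8.2          | >=3.6, <=3.9  |

Previous versions

| torch  | torchaudio | python               |
|--------|------------|----------------------|
| 1.11.0 | 0.11.0     | >=3.7, <=3.9         |
| 1.10.0 | 0.10.0     | >=3.6, <=3.9         |
| 1.9.1  | 0.9.1      | >=3.6, <=3.9         |
| 1.9.0  | 0.9.0      | >=3.6, <=3.9         |
| 1.8.2  | 0.8.2      | >=3.6, <=3.9         |
| 1.8.0  | 0.8.0      | >=3.6, <=3.9         |
| 1.7.1  | 0.7.2      | >=3.6, <=3.9         |
| 1.7.0  | 0.7.0      | >=3.6, <=3.8         |
| 1.6.0  | 0.6.0      | >=3.6, <=3.8         |
| 1.5.0  | 0.5.0      | >=3.5, <=3.8         |
| 1.4.0  | 0.4.0      | ==2.7, >=3.5, <=3.8  |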
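As a rough illustration of how the version tables can be used, the sketch below encodes the versioned rows in a dictionary and looks up the torchaudio release that pairs with a given torch release. The names `COMPATIBILITY` and `matching_torchaudio` are hypothetical helpers, not part of torchaudio, and the 1.4.0 row is left out because its Python 2.7 support does not fit a simple min/max range.

```python
# Version pairs copied from the compatibility tables above.
# Each entry: torch version -> (torchaudio version, min Python, max Python).
COMPATIBILITY = {
    "1.12.0": ("0.12.0", (3, 7), (3, 10)),
    "1.11.0": ("0.11.0", (3, 7), (3, 9)),
    "1.10.0": ("0.10.0", (3, 6), (3, 9)),
    "1.9.1": ("0.9.1", (3, 6), (3, 9)),
    "1.9.0": ("0.9.0", (3, 6), (3, 9)),
    "1.8.2": ("0.8.2", (3, 6), (3, 9)),
    "1.8.0": ("0.8.0", (3, 6), (3, 9)),
    "1.7.1": ("0.7.2", (3, 6), (3, 9)),
    "1.7.0": ("0.7.0", (3, 6), (3, 8)),
    "1.6.0": ("0.6.0", (3, 6), (3, 8)),
    "1.5.0": ("0.5.0", (3, 5), (3, 8)),
}


def matching_torchaudio(torch_version, python_version):
    """Return the torchaudio version paired with torch_version.

    Returns None if the torch version is not in the table or the
    Python version falls outside the supported range.
    """
    base = torch_version.split("+")[0]  # drop local build tags such as "+cu113"
    entry = COMPATIBILITY.get(base)
    if entry is None:
        return None
    audio_version, py_min, py_max = entry
    if py_min <= tuple(python_version[:2]) <= py_max:
        return audio_version
    return None


print(matching_torchaudio("1.12.0", (3, 9)))  # 0.12.0
```

Passing `sys.version_info` as `python_version` also works, since only the first two fields are compared.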

Installation

Binary Distributions

torchaudio has binary distributions for PyPI (pip) and Anaconda (conda).

Please refer to https://pytorch.org/get-started/locally/ for the details.

Note: Starting with 0.10, torchaudio has CPU-only and CUDA-enabled binary distributions, each of which requires a matching PyTorch version.

Note: LTS versions are distributed through a different channel than the other versioned releases. Please refer to the above page for details.

Note: This software was compiled against an unmodified copy of FFmpeg (licensed under the LGPLv2.1), with the specific rpath removed so as to enable the use of system libraries. The LGPL source can be downloaded here.

From Source

On non-Windows platforms, the build process builds libsox and the codecs that torchaudio needs to link to. It fetches and builds libmad, lame, flac, vorbis, opus, and libsox before building the extension. This process requires cmake and pkg-config. libsox-based features can be disabled with BUILD_SOX=0. The build process also builds the RNN transducer loss and the CTC beam search decoder. These can be disabled by setting the environment variables BUILD_RNNT=0 and BUILD_CTC_DECODER=0, respectively.

    # Linux
    python setup.py install

    # OSX
    CC=clang CXX=clang++ python setup.py install

    # Windows
    # We need to use the MSVC x64 toolset for compilation, with Visual Studio's vcvarsall.bat or directly with vcvars64.bat.
    # These batch files are under Visual Studio's installation folder, under 'VC\Auxiliary\Build\'.
    # More information available at:
    #   https://docs.microsoft.com/en-us/cpp/build/how-to-enable-a-64-bit-visual-cpp-toolset-on-the-command-line?view=msvc-160#use-vcvarsallbat-to-set-a-64-bit-hosted-build-architecture
    call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat" x64 && set BUILD_SOX=0 && python setup.py install

    # or
    call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat" && set BUILD_SOX=0 && python setup.py install

This is known to work on Linux distributions such as Ubuntu and CentOS 7, and on macOS. If you try this on a new system and find a solution to make it work, feel free to share it by opening an issue.
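The build switches described in this section are plain environment variables, so they can also be prepared from Python before launching the build. The following is a minimal sketch under that assumption; `source_build_env` is a hypothetical helper, and only the `=0` (disable) form documented above is used.

```python
import os


def source_build_env(build_sox=True, build_rnnt=True,
                     build_ctc_decoder=True, base_env=None):
    """Return a copy of the environment with the chosen build switches set.

    A switch is only written when the feature is disabled, because the
    text above documents only the "=0" form.
    """
    env = dict(os.environ if base_env is None else base_env)
    flags = {
        "BUILD_SOX": build_sox,
        "BUILD_RNNT": build_rnnt,
        "BUILD_CTC_DECODER": build_ctc_decoder,
    }
    for name, enabled in flags.items():
        if not enabled:
            env[name] = "0"
    return env


# Start from an empty environment here so the result is easy to inspect.
print(source_build_env(build_sox=False, base_env={}))  # {'BUILD_SOX': '0'}

# To launch the build with these switches (not executed in this sketch):
#   import subprocess, sys
#   subprocess.run([sys.executable, "setup.py", "install"],
#                  env=source_build_env(build_sox=False), check=True)
```

Copying the environment instead of mutating `os.environ` keeps the switches local to the build subprocess.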

Quick Usage

    import torchaudio

    waveform, sample_rate = torchaudio.load('foo.wav')  # load tensor from file
    torchaudio.save('foo_save.wav', waveform, sample_rate)  # save tensor to file
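The snippet above assumes that `foo.wav` already exists. If no audio file is at hand, one way to create a small test file is with Python's standard `wave` module, as sketched below. The helper name `write_test_tone` and its defaults (16 kHz, one second, 440 Hz) are arbitrary choices for illustration, not anything torchaudio prescribes.

```python
import math
import struct
import wave


def write_test_tone(path="foo.wav", sample_rate=16000, seconds=1.0, freq_hz=440.0):
    """Write a mono 16-bit PCM sine tone and return the number of frames written."""
    num_frames = int(sample_rate * seconds)
    samples = (
        int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(num_frames)
    )
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    return num_frames


frames_written = write_test_tone()

# Read the header back as a sanity check.
with wave.open("foo.wav", "rb") as check:
    info = (check.getnchannels(), check.getframerate(), check.getnframes())
print(info)  # (1, 16000, 16000)
```

After this runs, `torchaudio.load('foo.wav')` has a file to read.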

Backend Dispatch

By default on OSX and Linux, torchaudio uses SoX as a backend to load and save files. The backend can be changed to SoundFile using the following. See SoundFile for installation instructions.

    import torchaudio

    torchaudio.set_audio_backend("soundfile")  # switch backend
    waveform, sample_rate = torchaudio.load('foo.wav')  # load tensor from file, as usual
    torchaudio.save('foo_save.wav', waveform, sample_rate)  # save tensor to file, as usual

Note

  • SoundFile currently does not support mp3.
  • "soundfile" backend is not supported by TorchScript.
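Because SoundFile does not support mp3, code that switches backends may want to choose one per file. The sketch below shows one possible selection rule in plain Python. `pick_backend` is a hypothetical helper, and the name `"sox_io"` for the SoX backend is an assumption not stated in this README. Check the backend names your torchaudio version reports.

```python
import os


def pick_backend(path, available_backends):
    """Pick a backend name for `path` from `available_backends`, in order.

    Follows the note above: SoundFile does not support mp3, so mp3 files
    are never routed to "soundfile". Returns None if nothing fits.
    """
    extension = os.path.splitext(path)[1].lower()
    for name in available_backends:
        if name == "soundfile" and extension == ".mp3":
            continue
        return name
    return None


# "sox_io" is an assumed backend name, used here only for illustration.
print(pick_backend("foo.mp3", ["soundfile", "sox_io"]))  # sox_io
print(pick_backend("foo.wav", ["soundfile", "sox_io"]))  # soundfile
```

The returned name could then be passed to `torchaudio.set_audio_backend(...)` before loading the file.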

API Reference

API Reference is located here: http://pytorch.org/audio/main/

Contributing Guidelines

Please refer to CONTRIBUTING.md

Citation

If you find this package useful, please cite as:

    @article{yang2021torchaudio,
      title={TorchAudio: Building Blocks for Audio and Speech Processing},
      author={Yao-Yuan Yang and Moto Hira and Zhaoheng Ni and Anjali Chourdia and Artyom Astafurov and Caroline Chen and Ching-Feng Yeh and Christian Puhrsch and David Pollack and Dmitriy Genzel and Donny Greenberg and Edward Z. Yang and Jason Lian and Jay Mahadeokar and Jeff Hwang and Ji Chen and Peter Goldsborough and Prabhat Roy and Sean Narenthiran and Shinji Watanabe and Soumith Chintala and Vincent Quenneville-Bélair and Yangyang Shi},
      journal={arXiv preprint arXiv:2110.15018},
      year={2021}
    }

Disclaimer on Datasets

This is a utility library that downloads and prepares public datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have license to use the dataset. It is your responsibility to determine whether you have permission to use the dataset under the dataset's license.

If you're a dataset owner and wish to update any part of it (description, citation, etc.), or do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thanks for your contribution to the ML community!
