kapre: Keras Audio Preprocessors


Keras Audio Preprocessors - compute STFT, ISTFT, Melspectrogram, and others on GPU in real time.

Tested on Python 3.8+, with type hints for a better development experience.

Why Kapre?

vs. Pre-computation

  • You can optimize DSP parameters
  • Your model deployment becomes much simpler and more consistent.
  • Your code and model have fewer dependencies.

vs. Your own implementation

  • Quick and easy!
  • Consistent with 1D/2D TensorFlow batch shapes
  • Data format agnostic (channels_first and channels_last)
  • Less error prone - Kapre layers are tested against Librosa (stft, decibel, etc.) - which is (trust me) trickier than you think.
  • Kapre layers have some extended APIs from the default tf.signal implementation, such as:
    • A perfectly invertible STFT and InverseSTFT pair (see the sketch after this list)
    • Mel-spectrogram with more options
  • Reproducibility - Kapre is available on pip with versioning
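
As a minimal sketch of the round trip implied by the invertible pair: the layer names follow the one-shot example below, while the parameter values (n_fft, hop_length, the 1-second mono input shape) are illustrative only.

import numpy as np
from tensorflow.keras.models import Sequential
from kapre import STFT, InverseSTFT

# Illustrative values only: a 1-second mono signal at 22050 Hz.
sr = 22050
model = Sequential()
model.add(STFT(n_fft=1024, hop_length=256, input_shape=(sr, 1)))  # complex STFT
model.add(InverseSTFT(n_fft=1024, hop_length=256))                # back to a waveform

waveform = np.random.uniform(-1.0, 1.0, (1, sr, 1)).astype('float32')
reconstructed = model.predict(waveform)
# reconstructed approximates the input; its length can be slightly shorter
# because the last partial frame is dropped when the signal is not padded at the end.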

Workflow with Kapre

  1. Preprocess your audio dataset. Resample the audio to the right sampling rate and store the audio signals (waveforms), as sketched after this list.
  2. In your ML model, add a Kapre layer, e.g. kapre.time_frequency.STFT(), as the first layer of the model.
  3. The data loader simply loads audio signals and feeds them into the model.
  4. In your hyperparameter search, include DSP parameters like n_fft to boost performance.
  5. When deploying the final model, all you need to remember is the sampling rate of the signal. No dependency or preprocessing!
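
A minimal sketch of step 1, assuming librosa for loading and resampling (any resampler works); TARGET_SR and the .npy storage format are illustrative choices, not requirements of Kapre.

import numpy as np
import librosa  # used here only for loading/resampling; not required by Kapre

TARGET_SR = 22050  # illustrative; pick the rate your deployed model will assume

def preprocess(in_path, out_path):
    # librosa.load resamples to TARGET_SR and returns a mono float32 waveform
    waveform, _ = librosa.load(in_path, sr=TARGET_SR, mono=True)
    np.save(out_path, waveform.astype('float32'))  # store the raw waveform, not a spectrogram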

Installation

pip install kapre

Development

Kapre includes comprehensive type hints for better IDE support and development experience.

Type Checking

Run type checking with our included script:

python scripts/check_types.py

Or use your preferred type checker:

# With mypy
pip install mypy
mypy kapre/

# With pyright
pip install pyright
pyright kapre/

Development Setup

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run type checking
python scripts/check_types.py

# Format code
black kapre/ tests/

# Lint code
flake8 kapre/ tests/

API Documentation

Please refer to the Kapre API documentation at https://kapre.readthedocs.io

One-shot example

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, GlobalAveragePooling2D, Dense, Softmax
from kapre import STFT, Magnitude, MagnitudeToDecibel
from kapre.composed import get_melspectrogram_layer, get_log_frequency_spectrogram_layer

# 6 channels (!), maybe 1-sec audio signal, for an example.
input_shape = (44100, 6)
sr = 44100

model = Sequential()
# A STFT layer
model.add(STFT(n_fft=2048, win_length=2018, hop_length=1024,
               window_name=None, pad_end=False,
               input_data_format='channels_last', output_data_format='channels_last',
               input_shape=input_shape))
model.add(Magnitude())
model.add(MagnitudeToDecibel())  # these three layers can be replaced with get_stft_magnitude_layer()
# Alternatively, you may want to use a melspectrogram layer
# melgram_layer = get_melspectrogram_layer()
# or log-frequency layer
# log_stft_layer = get_log_frequency_spectrogram_layer()

# add more layers as you want
model.add(Conv2D(32, (3, 3), strides=(2, 2)))
model.add(BatchNormalization())
model.add(ReLU())
model.add(GlobalAveragePooling2D())
model.add(Dense(10))
model.add(Softmax())

# Compile the model
model.compile('adam', 'categorical_crossentropy')  # if single-label classification

# train it with raw audio sample inputs
# for example, you may have functions that load your data as below.
x = load_x()  # e.g., x.shape = (10000, 44100, 6) with channels_last
y = load_y()  # e.g., y.shape = (10000, 10) if it's 10-class classification

# then..
model.fit(x, y)
# Done!
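
Below is a sketch of the composed alternative mentioned in the comments above. The argument values are illustrative; see the API documentation for the full signature of get_melspectrogram_layer.

from kapre.composed import get_melspectrogram_layer

# A mel-spectrogram front end replacing STFT -> Magnitude -> mel filterbank (-> decibel).
melgram_layer = get_melspectrogram_layer(input_shape=input_shape,
                                         n_fft=2048,
                                         hop_length=1024,
                                         sample_rate=sr,
                                         n_mels=128,
                                         return_decibel=True,
                                         input_data_format='channels_last',
                                         output_data_format='channels_last')
model_mel = Sequential()
model_mel.add(melgram_layer)
# ... then add Conv2D / pooling / Dense layers as in the model above.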

Tflite compatibility

The STFT layer is not tflite compatible (due to tf.signal.stft). To create a tflite-compatible model, first train using the normal kapre layers, then create a new model replacing STFT and Magnitude with STFTTflite and MagnitudeTflite. Tflite-compatible layers are restricted to a batch size of 1, which prevents their use during training.

# assumes you have run the one-shot example above.
from kapre import STFTTflite, MagnitudeTflite

model_tflite = Sequential()
model_tflite.add(STFTTflite(n_fft=2048, win_length=2018, hop_length=1024,
                            window_name=None, pad_end=False,
                            input_data_format='channels_last', output_data_format='channels_last',
                            input_shape=input_shape))
model_tflite.add(MagnitudeTflite())
model_tflite.add(MagnitudeToDecibel())
model_tflite.add(Conv2D(32, (3, 3), strides=(2, 2)))
model_tflite.add(BatchNormalization())
model_tflite.add(ReLU())
model_tflite.add(GlobalAveragePooling2D())
model_tflite.add(Dense(10))
model_tflite.add(Softmax())

# load the trained weights into the tflite compatible model.
model_tflite.set_weights(model.get_weights())
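
Once the weights are copied over, the conversion itself uses the standard TensorFlow Lite converter. This step is plain TensorFlow, not part of Kapre, and converter options may vary with your TF version.

import tensorflow as tf

# Convert the tflite-compatible model; additional converter flags may be
# needed depending on your TensorFlow version and target runtime.
converter = tf.lite.TFLiteConverter.from_keras_model(model_tflite)
tflite_model = converter.convert()

with open('kapre_model.tflite', 'wb') as f:
    f.write(tflite_model)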

Citation

Please cite this paper if you use Kapre for your work.

@inproceedings{choi2017kapre,
  title={Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras},
  author={Choi, Keunwoo and Joo, Deokjin and Kim, Juho},
  booktitle={Machine Learning for Music Discovery Workshop at 34th International Conference on Machine Learning},
  year={2017},
  organization={ICML}
}
