keunwoochoi/kapre


kapre: Keras Audio Preprocessors

License

MIT license


Kapre

Keras Audio Preprocessors - compute STFT, ISTFT, Melspectrogram, and others on GPU in real time.

Tested on Python 3.8+, with type hints for a better development experience.

Why Kapre?

vs. Pre-computation

You can optimize DSP parameters
Your model deployment becomes much simpler and more consistent.
Your code and model have fewer dependencies.

vs. Your own implementation

Quick and easy!
Consistent with 1D/2D TensorFlow batch shapes
Data format agnostic (channels_first and channels_last)
Less error prone - Kapre layers are tested against Librosa (stft, decibel, etc.), which is (trust me) trickier than you think.
Kapre layers have some extended APIs beyond the default tf.signal implementation, such as:
- A perfectly invertible STFT and InverseSTFT pair
- Mel-spectrogram with more options
Reproducibility - Kapre is available on pip with versioning
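
To make the data-format point concrete, here is a minimal sketch of how a batch shape changes between the two layouts. It uses plain tuples so it runs without TensorFlow. The helper name `convert_batch_shape` is an illustrative assumption and not part of Kapre; the sketch assumes the usual Keras convention where the channel axis is either directly after the batch axis (channels_first) or the last axis (channels_last).

```python
def convert_batch_shape(shape, src, dst):
    """Reorder a batch shape between 'channels_first' and 'channels_last'.

    Works for 1D audio, e.g. (batch, ch, time) <-> (batch, time, ch),
    and for 2D spectrograms, e.g. (batch, ch, time, freq) <-> (batch, time, freq, ch).
    """
    valid = ("channels_first", "channels_last")
    if src not in valid or dst not in valid:
        raise ValueError(f"data format must be one of {valid}")
    shape = tuple(shape)
    if src == dst:
        return shape
    if src == "channels_first":
        # Move the channel axis (index 1) to the end.
        return (shape[0],) + shape[2:] + (shape[1],)
    # Move the channel axis (last index) to position 1.
    return (shape[0], shape[-1]) + shape[1:-1]


# A hypothetical batch of 10000 six-channel, one-second clips at 44.1 kHz.
audio_cf = (10000, 6, 44100)
audio_cl = convert_batch_shape(audio_cf, "channels_first", "channels_last")
print(audio_cl)
```

The same permutation applied to an actual array (for example with a transpose) would give data matching a layer configured with input_data_format='channels_last'.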

Workflow with Kapre

Preprocess your audio dataset. Resample the audio to the right sampling rate and store the audio signals (waveforms).
In your ML model, add a Kapre layer, e.g. kapre.time_frequency.STFT(), as the first layer of the model.
The data loader simply loads audio signals and feeds them into the model.
In your hyperparameter search, include DSP parameters like n_fft to boost the performance.
When deploying the final model, all you need to remember is the sampling rate of the signal. No dependency or preprocessing!
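
The workflow above treats DSP settings as ordinary hyperparameters. A minimal sketch of what such a search space might look like, using only the standard library: the candidate values and the helper name `dsp_search_space` are illustrative assumptions, not part of Kapre. For each candidate it also reports the frequency-bin spacing (`sr / n_fft`) and the hop duration (`hop_length / sr`) that the setting implies.

```python
from itertools import product


def dsp_search_space(sr, n_ffts, hop_ratios):
    """Enumerate candidate STFT settings for a hyperparameter search.

    hop_ratios are fractions of n_fft (0.5 means a hop of half the FFT size).
    """
    configs = []
    for n_fft, ratio in product(n_ffts, hop_ratios):
        hop_length = int(n_fft * ratio)
        configs.append(
            {
                "n_fft": n_fft,
                "hop_length": hop_length,
                "bin_spacing_hz": sr / n_fft,    # frequency resolution
                "hop_seconds": hop_length / sr,  # time resolution
            }
        )
    return configs


space = dsp_search_space(sr=44100, n_ffts=[512, 1024, 2048], hop_ratios=[0.25, 0.5])
for cfg in space:
    print(cfg)
```

Each dictionary's `n_fft` and `hop_length` could then be passed to the STFT layer when building a model for that trial.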

Installation

pip install kapre

Development

Kapre includes comprehensive type hints for better IDE support and development experience.

Type Checking

Run type checking with our included script:

python scripts/check_types.py

Or use your preferred type checker:

    # With mypy
    pip install mypy
    mypy kapre/

    # With pyright
    pip install pyright
    pyright kapre/

Development Setup

    # Install development dependencies
    pip install -e ".[dev]"

    # Run tests
    pytest tests/

    # Run type checking
    python scripts/check_types.py

    # Format code
    black kapre/ tests/

    # Lint code
    flake8 kapre/ tests/

API Documentation

Please refer to the Kapre API documentation at https://kapre.readthedocs.io

One-shot example

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, GlobalAveragePooling2D, Dense, Softmax
    from kapre import STFT, Magnitude, MagnitudeToDecibel
    from kapre.composed import get_melspectrogram_layer, get_log_frequency_spectrogram_layer

    # 6 channels (!), maybe 1-sec audio signal, for an example.
    input_shape = (44100, 6)
    sr = 44100
    model = Sequential()

    # A STFT layer
    model.add(STFT(n_fft=2048, win_length=2018, hop_length=1024,
                   window_name=None, pad_end=False,
                   input_data_format='channels_last', output_data_format='channels_last',
                   input_shape=input_shape))
    model.add(Magnitude())
    model.add(MagnitudeToDecibel())  # these three layers can be replaced with get_stft_magnitude_layer()

    # Alternatively, you may want to use a melspectrogram layer
    # melgram_layer = get_melspectrogram_layer()
    # or log-frequency layer
    # log_stft_layer = get_log_frequency_spectrogram_layer()

    # add more layers as you want
    model.add(Conv2D(32, (3, 3), strides=(2, 2)))
    model.add(BatchNormalization())
    model.add(ReLU())
    model.add(GlobalAveragePooling2D())
    model.add(Dense(10))
    model.add(Softmax())

    # Compile the model
    model.compile('adam', 'categorical_crossentropy')  # if single-label classification

    # train it with raw audio sample inputs
    # for example, you may have functions that load your data as below.
    x = load_x()  # e.g., x.shape = (10000, 44100, 6), channels_last to match input_shape
    y = load_y()  # e.g., y.shape = (10000, 10) if it's 10-class classification

    # then..
    model.fit(x, y)
    # Done!
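
When stacking layers after the STFT, it helps to know the spectrogram size in advance. The sketch below estimates it with plain arithmetic and no TensorFlow. It assumes tf.signal.stft-style framing without padding: frames of `win_length` samples taken every `hop_length` samples, each transformed with an FFT of size `n_fft`. The helper name `stft_output_shape` is hypothetical, and the result should be checked against the actual layer output.

```python
def stft_output_shape(n_samples, n_fft, win_length, hop_length):
    """Return (n_frames, n_freq) for an STFT without begin/end padding.

    Assumption: frames of win_length samples are taken every hop_length
    samples, and each frame is transformed with an FFT of size n_fft.
    """
    if n_samples < win_length:
        n_frames = 0  # not even one full frame fits
    else:
        n_frames = 1 + (n_samples - win_length) // hop_length
    n_freq = n_fft // 2 + 1  # one-sided spectrum of a real-valued signal
    return n_frames, n_freq


# Parameters from the one-shot example: 1 second at 44.1 kHz, 6 channels.
n_frames, n_freq = stft_output_shape(44100, n_fft=2048, win_length=2018, hop_length=1024)
print((n_frames, n_freq, 6))  # (42, 1025, 6) under the assumptions above
```

With output_data_format='channels_last', this would correspond to a per-example shape of (time, frequency, channel).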

See the Jupyter notebook in the examples folder.

Tflite compatibility

The STFT layer is not tflite compatible (due to tf.signal.stft). To create a tflite compatible model, first train using the normal kapre layers, then create a new model replacing STFT and Magnitude with STFTTflite and MagnitudeTflite. Tflite compatible layers are restricted to a batch size of 1, which prevents their use during training.

    # assumes you have run the one-shot example above.
    from kapre import STFTTflite, MagnitudeTflite

    model_tflite = Sequential()

    model_tflite.add(STFTTflite(n_fft=2048, win_length=2018, hop_length=1024,
                                window_name=None, pad_end=False,
                                input_data_format='channels_last', output_data_format='channels_last',
                                input_shape=input_shape))
    model_tflite.add(MagnitudeTflite())
    model_tflite.add(MagnitudeToDecibel())
    model_tflite.add(Conv2D(32, (3, 3), strides=(2, 2)))
    model_tflite.add(BatchNormalization())
    model_tflite.add(ReLU())
    model_tflite.add(GlobalAveragePooling2D())
    model_tflite.add(Dense(10))
    model_tflite.add(Softmax())

    # load the trained weights into the tflite compatible model.
    model_tflite.set_weights(model.get_weights())
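
The replacement rule described in this section can be written down as a simple mapping. The sketch below operates on layer class names only (plain strings), so it runs without TensorFlow or Kapre. The helper `tflite_layer_plan` is hypothetical and not part of Kapre; it only illustrates which layers are swapped and which stay the same, so that the weight lists of the two models still line up.

```python
# Layers that need a tflite compatible counterpart, per the text above.
TFLITE_REPLACEMENTS = {
    "STFT": "STFTTflite",
    "Magnitude": "MagnitudeTflite",
}


def tflite_layer_plan(layer_names):
    """Map training-time layer names to their tflite compatible equivalents.

    Layers without an entry in TFLITE_REPLACEMENTS are kept unchanged.
    """
    return [TFLITE_REPLACEMENTS.get(name, name) for name in layer_names]


training_layers = [
    "STFT", "Magnitude", "MagnitudeToDecibel", "Conv2D", "BatchNormalization",
    "ReLU", "GlobalAveragePooling2D", "Dense", "Softmax",
]
inference_layers = tflite_layer_plan(training_layers)
print(inference_layers)
```

Because the plan keeps the number and order of layers unchanged, copying weights from the trained model to the tflite compatible one with set_weights, as done above, stays a one-to-one transfer.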

Citation

Please cite this paper if you use Kapre for your work.

    @inproceedings{choi2017kapre,
      title={Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras},
      author={Choi, Keunwoo and Joo, Deokjin and Kim, Juho},
      booktitle={Machine Learning for Music Discovery Workshop at 34th International Conference on Machine Learning},
      year={2017},
      organization={ICML}
    }

Releases

Kapre-0.4.0 (latest), Oct 13, 2025. 11 releases in total.
