Zaf-Julia


Zafar's Audio Functions in Julia for audio signal analysis.

Files:

  • zaf.jl: Julia module with the audio functions.
  • examples.ipynb: Jupyter notebook with some examples.
  • audio_file.wav: audio file used for the examples.

See also:

  • Zaf-Matlab: Zafar's Audio Functions in Matlab for audio signal analysis.
  • Zaf-Python: Zafar's Audio Functions in Python for audio signal analysis.

zaf.jl

This Julia module implements a number of functions for audio signal analysis.

Simply copy the file zaf.jl into your working directory, run include("./zaf.jl"); using .zaf, and you are good to go. Make sure to have the following packages installed (via Pkg.add("name_of_the_package")); a minimal setup session is sketched after the list:

  • WAV: Julia package to read and write the WAV audio file format.
  • FFTW: Julia bindings to the FFTW library for fast Fourier transforms (FFTs), as well as functionality useful for signal processing.
  • Plots: powerful convenience for visualization in Julia.
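
For example, a minimal setup session might look like this (a sketch, assuming zaf.jl sits in the current directory and the three packages above are not yet installed):

using Pkg

# Install the required packages once
Pkg.add("WAV")
Pkg.add("FFTW")
Pkg.add("Plots")

# Load the module from the working directory and bring it into scope
include("./zaf.jl")
using .zaf

# Quick sanity check: an 8-point periodic Hamming window
println(zaf.hamming(8, "periodic"))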

Functions:

  • stft - Compute the short-time Fourier transform (STFT).
  • istft - Compute the inverse STFT.
  • melfilterbank - Compute the mel filterbank.
  • melspectrogram - Compute the mel spectrogram using a mel filterbank.
  • mfcc - Compute the mel-frequency cepstral coefficients (MFCCs) using a mel filterbank.
  • cqtkernel - Compute the constant-Q transform (CQT) kernel.
  • cqtspectrogram - Compute the CQT spectrogram using a CQT kernel.
  • cqtchromagram - Compute the CQT chromagram using a CQT kernel.
  • dct - Compute the discrete cosine transform (DCT) using the fast Fourier transform (FFT).
  • dst - Compute the discrete sine transform (DST) using the FFT.
  • mdct - Compute the modified discrete cosine transform (MDCT) using the FFT.
  • imdct - Compute the inverse MDCT using the FFT.

Other:

  • hamming - Compute the Hamming window.
  • sigplot - Plot a signal in seconds.
  • specshow - Display a spectrogram in dB, seconds, and Hz.
  • melspecshow - Display a mel spectrogram in dB, seconds, and Hz.
  • mfccshow - Display MFCCs in seconds.
  • cqtspecshow - Display a CQT spectrogram in dB, seconds, and Hz.
  • cqtchromshow - Display a CQT chromagram in seconds.

stft

Compute the short-time Fourier transform (STFT).

audio_stft = zaf.stft(audio_signal, window_function, step_length)

Inputs:
    audio_signal: audio signal (number_samples,)
    window_function: window function (window_length,)
    step_length: step length in samples
Output:
    audio_stft: audio STFT (window_length, number_frames)

Example: Compute and display the spectrogram from an audio file.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Set the window duration in seconds (audio is stationary around 40 milliseconds)
window_duration = 0.04

# Derive the window length in samples (use powers of 2 for faster FFT and constant overlap-add (COLA))
window_length = nextpow(2, ceil(Int, window_duration*sampling_frequency))

# Compute the window function (periodic Hamming window for COLA)
window_function = zaf.hamming(window_length, "periodic")

# Set the step length in samples (half of the window length for COLA)
step_length = convert(Int, window_length/2)

# Compute the STFT
audio_stft = zaf.stft(audio_signal, window_function, step_length)

# Derive the magnitude spectrogram (without the DC component and the mirrored frequencies)
audio_spectrogram = abs.(audio_stft[2:convert(Int, window_length/2)+1, :])

# Display the spectrogram in dB, seconds, and Hz
number_samples = length(audio_signal)
xtick_step = 1
ytick_step = 1000
plot_object = zaf.specshow(audio_spectrogram, number_samples, sampling_frequency, xtick_step, ytick_step)
heatmap!(title = "Spectrogram (dB)", size = (990, 600))

istft

Compute the inverse short-time Fourier transform (STFT).

audio_signal = zaf.istft(audio_stft, window_function, step_length)

Inputs:
    audio_stft: audio STFT (window_length, number_frames)
    window_function: window function (window_length,)
    step_length: step length in samples
Output:
    audio_signal: audio signal (number_samples,)

Example: Estimate the center and the sides from a stereo audio file.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Plots

# Read the (stereo) audio signal with its sampling frequency in Hz
audio_signal, sampling_frequency = wavread("audio_file.wav")

# Set the parameters for the STFT
window_length = nextpow(2, ceil(Int, 0.04*sampling_frequency))
window_function = zaf.hamming(window_length, "periodic")
step_length = convert(Int, window_length/2)

# Compute the STFTs for the left and right channels
audio_stft1 = zaf.stft(audio_signal[:,1], window_function, step_length)
audio_stft2 = zaf.stft(audio_signal[:,2], window_function, step_length)

# Derive the magnitude spectrograms (with DC component) for the left and right channels
audio_spectrogram1 = abs.(audio_stft1[1:convert(Int, window_length/2)+1, :])
audio_spectrogram2 = abs.(audio_stft2[1:convert(Int, window_length/2)+1, :])

# Estimate the time-frequency masks for the left and right channels for the center
center_mask1 = min.(audio_spectrogram1, audio_spectrogram2)./audio_spectrogram1
center_mask2 = min.(audio_spectrogram1, audio_spectrogram2)./audio_spectrogram2

# Derive the STFTs for the left and right channels for the center (with mirrored frequencies)
center_stft1 = [center_mask1; center_mask1[convert(Int, window_length/2):-1:2,:]] .* audio_stft1
center_stft2 = [center_mask2; center_mask2[convert(Int, window_length/2):-1:2,:]] .* audio_stft2

# Synthesize the signals for the left and right channels for the center
center_signal1 = zaf.istft(center_stft1, window_function, step_length)
center_signal2 = zaf.istft(center_stft2, window_function, step_length)

# Derive the final stereo center and sides signals
center_signal = [center_signal1 center_signal2]
center_signal = center_signal[1:size(audio_signal, 1), :]
sides_signal = audio_signal-center_signal

# Write the center and sides signals
wavwrite(center_signal, "center_signal.wav", Fs=sampling_frequency)
wavwrite(sides_signal, "sides_signal.wav", Fs=sampling_frequency)

# Display the original, center, and sides signals in seconds
xtick_step = 1
plot_object1 = zaf.sigplot(audio_signal, sampling_frequency, xtick_step)
plot!(ylims = (-1, 1), title = "Original signal")
plot_object2 = zaf.sigplot(center_signal, sampling_frequency, xtick_step)
plot!(ylims = (-1, 1), title = "Center signal")
plot_object3 = zaf.sigplot(sides_signal, sampling_frequency, xtick_step)
plot!(ylims = (-1, 1), title = "Sides signal")
plot(plot_object1, plot_object2, plot_object3, layout = (3, 1), size = (990, 600))
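
If you just want to check that the STFT/ISTFT pair reconstructs a signal under the COLA parameters used above, a minimal round-trip sketch (assuming the same audio_file.wav; any small length mismatch comes from zero-padding) is:

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics

# Read the audio signal as a vector, averaging it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = vec(mean(audio_signal, dims=2))

# Use a periodic Hamming window with half-window steps so that COLA holds
window_length = nextpow(2, ceil(Int, 0.04*sampling_frequency))
window_function = zaf.hamming(window_length, "periodic")
step_length = convert(Int, window_length/2)

# Go to the time-frequency domain and back
audio_stft = zaf.stft(audio_signal, window_function, step_length)
audio_signal2 = zaf.istft(audio_stft, window_function, step_length)

# Compare the signals over their common length; the difference should be tiny
number_samples = min(length(audio_signal), length(audio_signal2))
println(maximum(abs.(audio_signal2[1:number_samples] - audio_signal[1:number_samples])))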

melfilterbank

Compute the mel filterbank.

mel_filterbank = zaf.melfilterbank(sampling_frequency, window_length, number_mels)

Inputs:
    sampling_frequency: sampling frequency in Hz
    window_length: window length for the Fourier analysis in samples
    number_mels: number of mel filters
Output:
    mel_filterbank: mel filterbank (sparse) (number_mels, number_frequencies)

Example: Compute and display the mel filterbank.

# Load the needed modules
include("./zaf.jl")
using .zaf
using Plots

# Compute the mel filterbank using some parameters
sampling_frequency = 44100
window_length = nextpow(2, ceil(Int, 0.04*sampling_frequency))
number_mels = 128
mel_filterbank = zaf.melfilterbank(sampling_frequency, window_length, number_mels)

# Display the mel filterbank
heatmap(Array(mel_filterbank), fillcolor = :jet, legend = false, fmt = :png, size = (990, 300),
    title = "Mel filterbank", xlabel = "Frequency index", ylabel = "Mel index")
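
For reference, mel filters are spaced uniformly on the mel scale rather than in Hz; a commonly used Hz-to-mel mapping is m = 2595 log10(1 + f/700) (the HTK-style formula, shown here as an illustration and not necessarily the exact constants used inside zaf.melfilterbank):

# HTK-style mel <-> Hz conversions (illustrative; zaf.melfilterbank may differ in details)
hertz2mel(frequency) = 2595 * log10(1 + frequency / 700)
mel2hertz(mel) = 700 * (10^(mel / 2595) - 1)

# Example: 128 filters between 0 Hz and Nyquist need 130 band edges, equally spaced in mels
sampling_frequency = 44100
number_mels = 128
mel_edges = range(hertz2mel(0), stop = hertz2mel(sampling_frequency / 2), length = number_mels + 2)
hertz_edges = mel2hertz.(mel_edges)
println(round.(hertz_edges[1:5], digits = 1))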

melspectrogram

Compute the mel spectrogram using a mel filterbank.

mel_spectrogram = zaf.melspectrogram(audio_signal, window_function, step_length, mel_filterbank)

Inputs:
    audio_signal: audio signal (number_samples,)
    window_function: window function (window_length,)
    step_length: step length in samples
    mel_filterbank: mel filterbank (number_mels, number_frequencies)
Output:
    mel_spectrogram: mel spectrogram (number_mels, number_times)

Example: Compute and display the mel spectrogram.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Set the parameters for the Fourier analysis
window_length = nextpow(2, ceil(Int, 0.04*sampling_frequency))
window_function = zaf.hamming(window_length, "periodic")
step_length = convert(Int, window_length/2)

# Compute the mel filterbank
number_mels = 128
mel_filterbank = zaf.melfilterbank(sampling_frequency, window_length, number_mels)

# Compute the mel spectrogram using the filterbank
mel_spectrogram = zaf.melspectrogram(audio_signal, window_function, step_length, mel_filterbank)

# Display the mel spectrogram in dB, seconds, and Hz
number_samples = length(audio_signal)
xtick_step = 1
plot_object = zaf.melspecshow(mel_spectrogram, number_samples, sampling_frequency, window_length, xtick_step)
heatmap!(title = "Mel spectrogram (dB)", size = (990, 600))

mfcc

Compute the mel-frequency cepstral coefficients (MFCCs) using a mel filterbank.

audio_mfcc = zaf.mfcc(audio_signal, window_function, step_length, mel_filterbank, number_coefficients)

Inputs:
    audio_signal: audio signal (number_samples,)
    window_function: window function (window_length,)
    step_length: step length in samples
    mel_filterbank: mel filterbank (number_mels, number_frequencies)
    number_coefficients: number of coefficients (without the 0th coefficient)
Output:
    audio_mfcc: audio MFCCs (number_coefficients, number_times)

Example: Compute and display the MFCCs, delta MFCCs, and delta-delta MFCCs.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Set the parameters for the Fourier analysis
window_length = nextpow(2, ceil(Int, 0.04*sampling_frequency))
window_function = zaf.hamming(window_length, "periodic")
step_length = convert(Int, window_length/2)

# Compute the mel filterbank
number_mels = 40
mel_filterbank = zaf.melfilterbank(sampling_frequency, window_length, number_mels)

# Compute the MFCCs using the filterbank
number_coefficients = 20
audio_mfcc = zaf.mfcc(audio_signal, window_function, step_length, mel_filterbank, number_coefficients)

# Compute the delta and delta-delta MFCCs
audio_dmfcc = diff(audio_mfcc, dims=2)
audio_ddmfcc = diff(audio_dmfcc, dims=2)

# Display the MFCCs, delta MFCCs, and delta-delta MFCCs in seconds
xtick_step = 1
number_samples = length(audio_signal)
plot_object1 = zaf.mfccshow(audio_mfcc, number_samples, sampling_frequency, xtick_step); plot!(title = "MFCCs")
plot_object2 = zaf.mfccshow(audio_dmfcc, number_samples, sampling_frequency, xtick_step); plot!(title = "Delta MFCCs")
plot_object3 = zaf.mfccshow(audio_ddmfcc, number_samples, sampling_frequency, xtick_step); plot!(title = "Delta-delta MFCCs")
plot(plot_object1, plot_object2, plot_object3, layout = (3, 1), size = (990, 600))

cqtkernel

Compute the constant-Q transform (CQT) kernel.

cqt_kernel = zaf.cqtkernel(sampling_frequency, octave_resolution, minimum_frequency, maximum_frequency)

Inputs:
    sampling_frequency: sampling frequency in Hz
    octave_resolution: number of frequency channels per octave
    minimum_frequency: minimum frequency in Hz
    maximum_frequency: maximum frequency in Hz
Output:
    cqt_kernel: CQT kernel (sparse) (number_frequencies, fft_length)

Example: Compute and display the CQT kernel.

# Load the needed modules
include("./zaf.jl")
using .zaf
using Plots

# Set the parameters for the CQT kernel
sampling_frequency = 44100
octave_resolution = 24
minimum_frequency = 55
maximum_frequency = sampling_frequency/2

# Compute the CQT kernel
cqt_kernel = zaf.cqtkernel(sampling_frequency, octave_resolution, minimum_frequency, maximum_frequency)

# Display the magnitude CQT kernel
heatmap(abs.(Array(cqt_kernel)), fillcolor = :jet, legend = false, fmt = :png, size = (990, 300),
    title = "Magnitude CQT kernel", xlabel = "FFT index", ylabel = "CQT index")

cqtspectrogram

Compute the constant-Q transform (CQT) spectrogram using a CQT kernel.

cqt_spectrogram = zaf.cqtspectrogram(audio_signal, sampling_frequency, time_resolution, cqt_kernel)

Inputs:
    audio_signal: audio signal (number_samples,)
    sampling_frequency: sampling frequency in Hz
    time_resolution: number of time frames per second
    cqt_kernel: CQT kernel (number_frequencies, fft_length)
Output:
    cqt_spectrogram: CQT spectrogram (number_frequencies, number_times)

Example: Compute and display the CQT spectrogram.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Compute the CQT kernel
octave_resolution = 24
minimum_frequency = 55
maximum_frequency = 3520
cqt_kernel = zaf.cqtkernel(sampling_frequency, octave_resolution, minimum_frequency, maximum_frequency)

# Compute the CQT spectrogram using the kernel
time_resolution = 25
cqt_spectrogram = zaf.cqtspectrogram(audio_signal, sampling_frequency, time_resolution, cqt_kernel)

# Display the CQT spectrogram in dB, seconds, and Hz
xtick_step = 1
plot_object = zaf.cqtspecshow(cqt_spectrogram, time_resolution, octave_resolution, minimum_frequency, xtick_step)
heatmap!(title = "CQT spectrogram (dB)", size = (990, 600))

cqtchromagram

Compute the constant-Q transform (CQT) chromagram using a CQT kernel.

cqt_chromagram = zaf.cqtchromagram(audio_signal, sampling_frequency, time_resolution, octave_resolution, cqt_kernel)

Inputs:
    audio_signal: audio signal (number_samples,)
    sampling_frequency: sampling frequency in Hz
    time_resolution: number of time frames per second
    octave_resolution: number of frequency channels per octave
    cqt_kernel: CQT kernel (number_frequencies, fft_length)
Output:
    cqt_chromagram: CQT chromagram (number_chromas, number_times)

Example: Compute and display the CQT chromagram.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Compute the CQT kernel
octave_resolution = 24
minimum_frequency = 55
maximum_frequency = 3520
cqt_kernel = zaf.cqtkernel(sampling_frequency, octave_resolution, minimum_frequency, maximum_frequency)

# Compute the CQT chromagram using the kernel
time_resolution = 25
cqt_chromagram = zaf.cqtchromagram(audio_signal, sampling_frequency, time_resolution, octave_resolution, cqt_kernel)

# Display the CQT chromagram in seconds
xtick_step = 1
plot_object = zaf.cqtchromshow(cqt_chromagram, time_resolution, xtick_step)
heatmap!(title = "CQT chromagram", size = (990, 300))
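
Conceptually, the chromagram folds the CQT bins across octaves: bins that are whole octaves apart (i.e., octave_resolution rows apart) are summed into the same chroma, which is why the output has number_chromas = octave_resolution rows. A minimal sketch of that folding (illustrative only; zaf.cqtchromagram already does this for you and may differ in details):

# Fold a CQT spectrogram into a chromagram by summing bins that are whole octaves apart
# (conceptual sketch; assumes the lowest CQT bin maps to the first chroma)
function fold_chromagram(cqt_spectrogram, octave_resolution)
    number_frequencies, number_times = size(cqt_spectrogram)
    chromagram = zeros(octave_resolution, number_times)
    for frequency_index = 1:number_frequencies
        chroma_index = mod(frequency_index - 1, octave_resolution) + 1
        chromagram[chroma_index, :] = chromagram[chroma_index, :] + cqt_spectrogram[frequency_index, :]
    end
    return chromagram
end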

dct

Compute the discrete cosine transform (DCT) using the fast Fourier transform (FFT).

audio_dct = zaf.dct(audio_signal, dct_type)

Inputs:
    audio_signal: audio signal (window_length,)
    dct_type: DCT type (1, 2, 3, or 4)
Output:
    audio_dct: audio DCT (number_frequencies,)

Example: Compute the 4 different DCTs and compare them to FFTW's DCTs.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using FFTW
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Get an audio segment for a given window length
window_length = 1024
audio_segment = audio_signal[1:window_length]

# Compute the DCT-I, II, III, and IV
audio_dct1 = zaf.dct(audio_segment, 1)
audio_dct2 = zaf.dct(audio_segment, 2)
audio_dct3 = zaf.dct(audio_segment, 3)
audio_dct4 = zaf.dct(audio_segment, 4)

# Compute FFTW's DCT-I, II, III, and IV (orthogonalized)
audio_segment2 = copy(audio_segment)
audio_segment2[[1, end]] = audio_segment2[[1, end]]*sqrt(2)
fftw_dct1 = FFTW.r2r(audio_segment2, FFTW.REDFT00)
fftw_dct1[[1, window_length]] = fftw_dct1[[1, window_length]]/sqrt(2)
fftw_dct1 = fftw_dct1/2 * sqrt(2/(window_length - 1))
fftw_dct2 = FFTW.r2r(audio_segment, FFTW.REDFT10)
fftw_dct2[1] = fftw_dct2[1]/sqrt(2)
fftw_dct2 = fftw_dct2/2 * sqrt(2/window_length)
audio_segment2 = copy(audio_segment)
audio_segment2[1] = audio_segment2[1]*sqrt(2)
fftw_dct3 = FFTW.r2r(audio_segment2, FFTW.REDFT01)
fftw_dct3 = fftw_dct3/2 * sqrt(2/window_length)
fftw_dct4 = FFTW.r2r(audio_segment, FFTW.REDFT11)
fftw_dct4 = fftw_dct4/2 * sqrt(2/window_length)

# Plot the DCT-I, II, III, and IV, FFTW's versions, and their differences (using yformatter because of precision issue in ticks)
dct1_plot = plot(audio_dct1, title = "DCT-I")
dct2_plot = plot(audio_dct2, title = "DCT-II")
dct3_plot = plot(audio_dct3, title = "DCT-III")
dct4_plot = plot(audio_dct4, title = "DCT-IV")
dct1_plot2 = plot(fftw_dct1, title = "FFTW's DCT-I")
dct2_plot2 = plot(fftw_dct2, title = "FFTW's DCT-II")
dct3_plot2 = plot(fftw_dct3, title = "FFTW's DCT-III")
dct4_plot2 = plot(fftw_dct4, title = "FFTW's DCT-IV")
diff1_plot = plot(audio_dct1-fftw_dct1, title = "DCT-I - FFTW's DCT-I")
diff2_plot = plot(audio_dct2-fftw_dct2, title = "DCT-II - FFTW's DCT-II", yformatter = y->string(convert(Int, round(y/1e-16)),"x10⁻¹⁶"))
diff3_plot = plot(audio_dct3-fftw_dct3, title = "DCT-III - FFTW's DCT-III", yformatter = y->string(convert(Int, round(y/1e-16)),"x10⁻¹⁶"))
diff4_plot = plot(audio_dct4-fftw_dct4, title = "DCT-IV - FFTW's DCT-IV", yformatter = y->string(convert(Int, round(y/1e-16)),"x10⁻¹⁶"))
plot(dct1_plot, dct2_plot, dct3_plot, dct4_plot, dct1_plot2, dct2_plot2, dct3_plot2, dct4_plot2,
    diff1_plot, diff2_plot, diff3_plot, diff4_plot, xlims = (0, window_length), legend = false, titlefont = 10,
    layout = (3,4), size = (990, 600), fmt = :png)
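
To illustrate the FFT-based approach, here is one classic way to obtain the orthogonal DCT-II from a 2N-point FFT of the mirrored segment (a sketch of the general idea, not necessarily the exact factorization used inside zaf.dct):

using FFTW

# Orthogonal DCT-II of a length-N vector via a 2N-point FFT of the mirrored sequence
function dct2_via_fft(audio_segment)
    window_length = length(audio_segment)
    # Mirroring the segment turns the complex exponentials of the DFT into cosines
    fourier_transform = fft([audio_segment; reverse(audio_segment)])
    # Undo the half-sample phase shift and keep the first N bins
    audio_dct = [real(exp(-im*pi*k/(2*window_length))*fourier_transform[k+1])/2 for k = 0:window_length-1]
    # Apply the orthogonal normalization
    audio_dct = audio_dct*sqrt(2/window_length)
    audio_dct[1] = audio_dct[1]/sqrt(2)
    return audio_dct
end

# With the audio_segment from the example above, this should agree with zaf.dct(audio_segment, 2)
# up to numerical precision:
# println(maximum(abs.(dct2_via_fft(audio_segment) - zaf.dct(audio_segment, 2))))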

dst

Compute the discrete sine transform (DST) using the fast Fourier transform (FFT).

audio_dst = zaf.dst(audio_signal, dst_type)

Inputs:
    audio_signal: audio signal (window_length,)
    dst_type: DST type (1, 2, 3, or 4)
Output:
    audio_dst: audio DST (number_frequencies,)

Example: Compute the 4 different DSTs and compare their respective inverses with the original audio.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Get an audio segment for a given window length
window_length = 1024
audio_segment = audio_signal[1:window_length]

# Compute the DST-I, II, III, and IV
audio_dst1 = zaf.dst(audio_segment, 1)
audio_dst2 = zaf.dst(audio_segment, 2)
audio_dst3 = zaf.dst(audio_segment, 3)
audio_dst4 = zaf.dst(audio_segment, 4)

# Compute their respective inverses, i.e., the DST-I, III, II, and IV
audio_idst1 = zaf.dst(audio_dst1, 1)
audio_idst2 = zaf.dst(audio_dst2, 3)
audio_idst3 = zaf.dst(audio_dst3, 2)
audio_idst4 = zaf.dst(audio_dst4, 4)

# Plot the DST-I, II, III, and IV, their respective inverses, and their differences with the original audio segment
dst1_plot = plot(audio_dst1, title = "DST-I")
dst2_plot = plot(audio_dst2, title = "DST-II")
dst3_plot = plot(audio_dst3, title = "DST-III")
dst4_plot = plot(audio_dst4, title = "DST-IV")
idst1_plot = plot(audio_idst1, title = "Inverse DST-I (DST-I)")
idst2_plot = plot(audio_idst2, title = "Inverse DST-II (DST-III)")
idst3_plot = plot(audio_idst3, title = "Inverse DST-III (DST-II)")
idst4_plot = plot(audio_idst4, title = "Inverse DST-IV (DST-IV)")
diff1_plot = plot(audio_idst1-audio_segment, title = "Inverse DST-I - audio segment")
diff2_plot = plot(audio_idst2-audio_segment, title = "Inverse DST-II - audio segment")
diff3_plot = plot(audio_idst3-audio_segment, title = "Inverse DST-III - audio segment")
diff4_plot = plot(audio_idst4-audio_segment, title = "Inverse DST-IV - audio segment")
plot(dst1_plot, dst2_plot, dst3_plot, dst4_plot, idst1_plot, idst2_plot, idst3_plot, idst4_plot,
    diff1_plot, diff2_plot, diff3_plot, diff4_plot, xlims = (0, window_length), legend = false, titlefont = 10,
    layout = (3,4), size = (990, 600), fmt = :png)

mdct

Compute the modified discrete cosine transform (MDCT) using the fast Fourier transform (FFT).

audio_mdct = zaf.mdct(audio_signal, window_function)

Inputs:
    audio_signal: audio signal (number_samples,)
    window_function: window function (window_length,)
Output:
    audio_mdct: audio MDCT (number_frequencies, number_times)

Example: Compute and display the MDCT as used in the AC-3 audio coding format.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Add and use the DSP package to get the kaiser window function
using Pkg
Pkg.add("DSP")
using DSP

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Compute the Kaiser-Bessel-derived (KBD) window as used in the AC-3 audio coding format
window_length = 512
alpha_value = 5
window_function = kaiser(convert(Int, window_length/2)+1, alpha_value*pi)
window_function2 = cumsum(window_function[1:convert(Int, window_length/2)])
window_function = sqrt.([window_function2; window_function2[convert(Int, window_length/2):-1:1]]./sum(window_function))

# Compute the MDCT
audio_mdct = zaf.mdct(audio_signal, window_function)

# Display the MDCT in dB, seconds, and Hz
xtick_step = 1
ytick_step = 1000
plot_object = zaf.specshow(abs.(audio_mdct), length(audio_signal), sampling_frequency, xtick_step, ytick_step)
heatmap!(title = "MDCT (dB)", size = (990, 600))

imdct

Compute the inverse modified discrete cosine transform (MDCT) using the fast Fourier transform (FFT).

audio_signal = zaf.imdct(audio_mdct, window_function)

Inputs:
    audio_mdct: audio MDCT (number_frequencies, number_times)
    window_function: window function (window_length,)
Output:
    audio_signal: audio signal (number_samples,)

Example: Verify that the MDCT is perfectly invertible.

# Load the needed modules
include("./zaf.jl")
using .zaf
using WAV
using Statistics
using Plots

# Read the audio signal with its sampling frequency in Hz, and average it over its channels
audio_signal, sampling_frequency = wavread("audio_file.wav")
audio_signal = mean(audio_signal, dims=2)

# Compute the MDCT with a slope function as used in the Vorbis audio coding format
window_length = 2048
window_function = sin.(pi/2*(sin.(pi/window_length*(0.5:window_length-0.5)).^2))
audio_mdct = zaf.mdct(audio_signal, window_function)

# Compute the inverse MDCT
audio_signal2 = zaf.imdct(audio_mdct, window_function)
audio_signal2 = audio_signal2[1:length(audio_signal)]

# Compute the differences between the original signal and the resynthesized one
audio_differences = audio_signal-audio_signal2
y_max = maximum(abs.(audio_differences))

# Display the original and resynthesized signals, and their differences in seconds
xtick_step = 1
plot_object1 = zaf.sigplot(audio_signal, sampling_frequency, xtick_step)
plot!(ylims = (-1, 1), title = "Original signal")
plot_object2 = zaf.sigplot(audio_signal2, sampling_frequency, xtick_step)
plot!(ylims = (-1, 1), title = "Resynthesized signal")
plot_object3 = zaf.sigplot(audio_differences, sampling_frequency, xtick_step)
plot!(ylims = (-y_max, y_max), title = "Original - resynthesized signal")
plot(plot_object1, plot_object2, plot_object3, layout = (3, 1), size = (990, 600))

examples.ipynb

This Jupyter notebook shows some examples for the different functions of the Julia module zaf.

See the Jupyter notebook viewer.

audio_file.wav

23 second audio excerpt from the song Que Pena Tanto Faz performed by Tamy.

Author

Zafar Rafii

