Voice Activity Detection in R using the "webrtc" toolkit

License

MPL-2.0, Unknown licenses found

audio.vadwebrtc

This repository contains an R package which is an Rcpp wrapper around the webrtc Voice Activity Detection module.

The main goal of the package is to remove non-speech audio segments before running an automatic transcription with audio.whisper, in order to avoid transcription hallucinations. It contains:

- functions to detect the location of voice in audio using a Gaussian Mixture Model implemented in webrtc
- functions to extract the parts of the audio where there is voice / silence into a new audio file
- functionality to rewrite the timepoints of transcribed sentences after specific non-voiced sections have been removed, so that the timepoints of a transcription made on audio without silences align with the original audio signal
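To make the last point concrete, here is a minimal base R sketch of the idea behind realigning timepoints. It is not the package's own implementation: the helper `to_original_time()` and the hand-built `segments` data frame are hypothetical, with columns named like the `vad_segments` table shown in the Example section. The helper maps a time measured in audio from which the non-voiced sections were cut back to the corresponding time in the original recording.

```r
# Hypothetical segments, shaped like a VAD result (start/end in seconds)
segments <- data.frame(
  start     = c(0.00, 0.09, 3.31, 3.72, 6.79),
  end       = c(0.08, 3.30, 3.71, 6.78, 6.99),
  has_voice = c(FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Hypothetical helper: map times in the voiced-only audio back to the original
to_original_time <- function(t, segments) {
  stopifnot(all(t >= 0))
  voiced   <- segments[segments$has_voice, ]
  duration <- voiced$end - voiced$start
  # Where each voiced segment starts in the concatenated voiced-only audio
  new_start <- cumsum(c(0, head(duration, -1)))
  # Index of the voiced segment that each time falls into
  i <- findInterval(t, new_start)
  voiced$start[i] + (t - new_start[i])
}

to_original_time(c(1, 4), segments)
```

With these segments, second 1 of the voiced-only audio maps to about 1.09 seconds in the original, and second 4 maps to about 4.51 seconds. Times beyond the total voiced duration are simply extrapolated past the last voiced segment, which a real implementation would probably want to guard against.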

Installation

The package is currently not on CRAN.
For the development version of this package: remotes::install_github("bnosac/audio.vadwebrtc")

Look at the documentation of the functions: help(package = "audio.vadwebrtc")

Example

Get an audio file with 16-bit mono PCM samples (pcm_s16le codec) and a sampling rate of 8 kHz, 16 kHz or 32 kHz.
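If a recording is in another format, it can be converted first. The sketch below assumes the av package (also used further down for plotting) and its `av_audio_convert()` function. The wrapper name `to_vad_wav()` and the input file name are hypothetical, and the sketch relies on the assumption that the WAV output of the underlying FFmpeg library defaults to the pcm_s16le codec.

```r
# Hypothetical wrapper around av::av_audio_convert() (requires the av package)
to_vad_wav <- function(input, output = tempfile(fileext = ".wav"),
                       sample_rate = 16000) {
  # Only these sampling rates are listed as supported above
  stopifnot(sample_rate %in% c(8000, 16000, 32000))
  # Convert to a mono WAV file at the requested sampling rate
  av::av_audio_convert(input, output = output, format = "wav",
                       channels = 1, sample_rate = sample_rate)
  output
}

# Usage, with a hypothetical input file:
# wav <- to_vad_wav("interview.mp3")
# vad <- VAD(wav, mode = "normal")
```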

    library(audio.vadwebrtc)
    file <- system.file(package = "audio.vadwebrtc", "extdata", "test_wav.wav")
    vad  <- VAD(file, mode = "normal")
    vad

    Voice Activity Detection
      - file: D:/Jan/R/win-library/4.1/audio.vadwebrtc/extdata/test_wav.wav
      - sample rate: 16000
      - VAD type: webrtc-gmm, VAD mode: normal, VAD by milliseconds: 10, VAD frame_length: 160
        - Percent of audio containing a voiced signal: 90.2%
        - Seconds voiced: 6.3
        - Seconds unvoiced: 0.7

    vad$vad_segments

     vad_segment start  end has_voice
               1  0.00 0.08     FALSE
               2  0.09 3.30      TRUE
               3  3.31 3.71     FALSE
               4  3.72 6.78      TRUE
               5  6.79 6.99     FALSE
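The segment table is an ordinary data frame, so the summary figures can be recomputed from it. A small sketch in base R, using a hand-copied version of the printed table rather than a live `vad` object (the variable names are illustrative only):

```r
# Segment table copied by hand from the printed vad$vad_segments output
segments <- data.frame(
  vad_segment = 1:5,
  start       = c(0.00, 0.09, 3.31, 3.72, 6.79),
  end         = c(0.08, 3.30, 3.71, 6.78, 6.99),
  has_voice   = c(FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Duration of each segment in seconds, split into voiced and unvoiced totals
duration <- segments$end - segments$start
voiced   <- sum(duration[segments$has_voice])
unvoiced <- sum(duration[!segments$has_voice])

round(voiced, 1)
round(unvoiced, 1)
round(100 * voiced / (voiced + unvoiced), 1)

# Keep only the voiced segments, e.g. to pass on to a transcription step
subset(segments, has_voice)
```

The three rounded values come out as 6.3, 0.7 and 90.2, which agrees with the seconds voiced, seconds unvoiced and percentage in the printed summary.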

Example of a simple plot of these audio and voice segments

    library(av)
    x <- read_audio_bin(file)
    plot(seq_along(x) / 16000, x, type = "l", xlab = "Seconds", ylab = "Signal")
    abline(v = vad$vad_segments$start, col = "red", lwd = 2)
    abline(v = vad$vad_segments$end, col = "blue", lwd = 2)

Or show it interactively with the R package wavesurfer:

    library(audio.vadwebrtc)
    library(wavesurfer)
    library(shiny)
    file <- system.file(package = "audio.vadwebrtc", "extdata", "test_wav.wav")
    vad  <- VAD(file, mode = "lowbitrate")
    # One annotation per segment, labelled as voiced or silent
    anno <- data.frame(audio_id = vad$file,
                       region_id = vad$vad_segments$vad_segment,
                       start = vad$vad_segments$start,
                       end = vad$vad_segments$end,
                       label = ifelse(vad$vad_segments$has_voice, "Voiced", "Silent"))
    # Only highlight the silent regions
    anno <- subset(anno, label %in% "Silent")

    # Serve the folder containing the wav file under the "wav" URL prefix
    wavs_folder <- system.file(package = "audio.vadwebrtc", "extdata")
    shiny::addResourcePath("wav", wavs_folder)
    ui <- fluidPage(
      wavesurferOutput("my_ws", height = "128px"),
      tags$p("Press spacebar to toggle play/pause.")
    )
    server <- function(input, output, session) {
      output$my_ws <- renderWavesurfer({
        wavesurfer(audio = paste0("wav/", "test_wav.wav"), annotations = anno) %>%
          ws_set_wave_color('#5511aa') %>%
          ws_cursor()
      })
    }
    shinyApp(ui = ui, server = server)

Support in text mining

Need support in text mining? Contact BNOSAC: http://www.bnosac.be

Releases

0.2 (latest), May 6, 2024, plus 1 earlier release
