General tools for voice analysis.

filipezabala/voice

The voice package is being developed as an easy-to-use set of tools for audio analysis in R. It provides a free and user-friendly toolkit that enables researchers to extract, tag, and analyze voice data efficiently. It supports the extraction of audio features, the enrichment of structured datasets with audio summaries, and the automatic identification of spoken segments, and it introduces novel features of its own. It also allows audio analysis based on music theory, associating frequencies with musical notes arranged in a score via the gm package.

The package has been tested extensively since 2019, including:

  • Real-world applications: Dozens of uses, e.g. sex prediction from voice features and speaker diarization in audiobooks.
  • Validation: Successful tests on open datasets and LibriVox recordings.

If you want to contribute, report bugs or request new features, use the 'Issues' tab on GitHub.

0. Basic installation

# Development version from GitHub
install.packages(c('devtools', 'tidyverse'))
devtools::install_github('filipezabala/voice')
# Stable version from CRAN
install.packages('voice')

If you wish to perform a full installation, proceed to Section 4.

0.1 For Windows Users

If you're compiling R packages from source, you may need to install RTools, a collection of Windows-specific build tools for R.

0.2 For macOS Users

If you're compiling packages, ensure you have Xcode Command Line Tools installed. You may also need the macOS tools.

# Install Xcode Command Line Tools on macOS
xcode-select --install

More details may be found at https://filipezabala.com/voicegnette/.

1. Extract features

1.1 Load packages and audio files

# packages
library(voice)
library(tidyverse)
# get paths to audio files
wavDir <- list.files(system.file('extdata', package = 'wrassp'),
                     pattern = glob2rx('*.wav'), full.names = TRUE)

1.2 Extract features

# minimal usage
M <- voice::extract_features(wavDir)
glimpse(M)

2. Tag

# creating Extended synthetic data
E <- dplyr::tibble(subject_id = c(1,1,1,2,2,2,3,3,3), wav_path = wavDir)
E
# minimal usage
voice::tag(E)
# canonical data
voice::tag(E, groupBy = 'subject_id')
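As a rough illustration of the idea behind tagging, a structured dataset can be enriched with per-group summaries of audio features. The sketch below uses only base R and made-up numbers. The column f0 and the helper summarise_by_group() are hypothetical, and this is not how voice::tag() is implemented.

```r
# Hypothetical frame-level features: one row per analysis frame
frames <- data.frame(
  subject_id = c(1, 1, 1, 2, 2, 2),
  f0         = c(110, 120, 130, 200, 210, 220)  # made-up values in Hz
)

# Hypothetical helper: one summary row per group (mean and standard deviation)
summarise_by_group <- function(x, group, value) {
  means <- aggregate(x[[value]], by = list(x[[group]]), FUN = mean)
  sds   <- aggregate(x[[value]], by = list(x[[group]]), FUN = sd)
  out <- data.frame(means[[1]], means[[2]], sds[[2]])
  names(out) <- c(group, paste0(value, '_mean'), paste0(value, '_sd'))
  out
}

# Structured dataset to enrich, one row per subject
subjects <- data.frame(subject_id = c(1, 2), label = c('a', 'b'))

# Join the summaries back onto the dataset by the grouping column
tagged <- merge(subjects, summarise_by_group(frames, 'subject_id', 'f0'),
                by = 'subject_id')
print(tagged)
```

Grouping by a column such as subject_id yields one enriched row per subject rather than one per audio file, which is the distinction the groupBy argument above points at.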

3. Visualization

3.1 Get audio

url0 <- 'https://github.com/filipezabala/voiceAudios/raw/refs/heads/main/wav/doremi.wav'
download.file(url0, paste0(tempdir(), '/doremi.wav'), mode = 'wb')

You may use the command voice::embed_audio(url0) if you wish to show a play button when compiling an .Rmd file. See https://github.com/mccarthy-m-g/embedr for more details about embed_audio() and related functions.

3.2 Media data

M <- voice::extract_features(tempdir())
summary(M)

3.3 Plot

voice::piano_plot(M, 0) # f0
voice::piano_plot(M, 0:1) # f0 + f1

3.4 Assign notes

(f0_spn <- voice::assign_notes(M, fmt = 0, min_points = 22, min_percentile = .85)) # f0
(f1_spn <- voice::assign_notes(M, fmt = 1, min_points = 22, min_percentile = .85)) # f1
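To make the link between frequencies and musical notes concrete, here is a minimal base R sketch that maps a frequency in Hz to the nearest note in scientific pitch notation. It assumes equal temperament with A4 = 440 Hz. The helper hz_to_spn() is hypothetical and is not the algorithm used by voice::assign_notes(), whose min_points and min_percentile arguments it does not model.

```r
# Hypothetical helper: map frequencies in Hz to the nearest note in
# scientific pitch notation, assuming equal temperament and A4 = 440 Hz
hz_to_spn <- function(freq_hz, a4 = 440) {
  note_names <- c('C', 'C#', 'D', 'D#', 'E', 'F',
                  'F#', 'G', 'G#', 'A', 'A#', 'B')
  # MIDI note number: A4 is 69, and each semitone is a factor of 2^(1/12)
  midi <- round(69 + 12 * log2(freq_hz / a4))
  # MIDI 60 is C4, so the octave is the integer quotient by 12, minus 1
  octave <- midi %/% 12 - 1
  out <- paste0(note_names[midi %% 12 + 1], octave)
  # Missing frequencies (e.g. unvoiced frames) stay missing
  out[is.na(midi)] <- NA_character_
  out
}

hz_to_spn(c(261.63, 440, 523.25))
# [1] "C4" "A4" "C5"
```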

3.5 Sheet music

Requires MuseScore and gm.

3.5.1 Notes sequence of f0

library(gm)
line_0 <- gm::Line(as.character(f0_spn))
m0 <- gm::Music() +
  gm::Meter(4, 4) +
  line_0
gm::show(m0, to = c('score', 'audio'))

3.5.2 Notes sequences of f0 and f1

line_0 <- gm::Line(as.character(f0_spn))
line_1 <- gm::Line(as.character(f1_spn))
m1 <- gm::Music() +
  gm::Meter(4, 4) +
  line_0 + line_1
gm::show(m1, to = c('score', 'audio'))

4. Advanced installation

The Python-based functions diarize and extract_features (the latter only when inferring the f0_praat and fmt_praat features) require a configured Python environment.

4.1 Ubuntu

The following steps are used to fully configure voice on Ubuntu 24.04 LTS (Noble Numbat). Reports of inconsistencies are welcome.

4.1.1. Curl

Command line tool and library for transferring data with URLs.

# installing dependencies
sudo apt-get update
sudo apt-get install -y libssl-dev autoconf libtool make
# installing curl
sudo apt install curl
# verify installation
curl --version

4.1.2. ffmpeg

ffmpeg is a cross-platform solution to record, convert and stream audio and video.

sudo apt-get update
sudo apt-get install ffmpeg

4.1.3. Audio drivers and extra packages

sudo apt-get update
sudo apt-get install portaudio19-dev libasound2-dev libfontconfig1-dev libmagick++-dev libxml2-dev libharfbuzz-dev libfribidi-dev libgdal-dev cmake cmake-doc ninja-build

4.1.4. MuseScore

MuseScore is open source notation software.

sudo add-apt-repository ppa:mscore-ubuntu/mscore-stable
sudo apt-get update
sudo apt-get install musescore

4.1.5. R

R is a free software environment for statistical computing and graphics. To find out your Ubuntu distribution, run lsb_release -a at the terminal.

# add the CRAN repository (use your release codename, e.g. noble for Ubuntu 24.04)
sudo sh -c 'echo "deb https://cloud.r-project.org/bin/linux/ubuntu noble-cran40/" >> /etc/apt/sources.list.d/cran.list'
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo add-apt-repository ppa:c2d4u.team/c2d4u4.0+
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install r-base r-base-dev

4.1.6. RStudio

RStudio is an Integrated Development Environment (IDE) for R. Check for updates on the RStudio download page.

sudo apt-get update
sudo apt-get install gdebi-core
wget https://download1.rstudio.org/electron/jammy/amd64/rstudio-2025.05.0-496-amd64.deb
sudo gdebi rstudio-2025.05.0-496-amd64.deb

4.1.9. R packages

"Packages are the fundamental units of reproducible R code." (Hadley Wickham and Jennifer Bryan). The installation may take several minutes. At the terminal, run:

sudo R

Running R as superuser, paste the following, one line at a time:

packs <- c('audio', 'devtools', 'reticulate', 'R.utils', 'seewave', 'tidyverse', 'tuneR', 'wrassp')
install.packages(packs, dependencies = TRUE)
update.packages(ask = FALSE)
devtools::install_github('egenn/music')
devtools::install_github('flujoo/gm')

To configure the gm package, open the .Renviron file:

usethis::edit_r_environ()

Add the line MUSESCORE_PATH=/usr/bin/mscore to the /root/.Renviron file. To save and exit in vi, type :wq. Then restart the R/RStudio session.
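After restarting, a quick check from R can confirm that the variable was picked up. This is a minimal sketch using base R only. The helper check_musescore_path() is hypothetical and is not part of gm or voice.

```r
# Hypothetical helper: report whether MUSESCORE_PATH is set and points to a file
check_musescore_path <- function() {
  path <- Sys.getenv('MUSESCORE_PATH', unset = '')
  if (!nzchar(path)) {
    message('MUSESCORE_PATH is not set; check the .Renviron file.')
    return(FALSE)
  }
  if (!file.exists(path)) {
    message('MUSESCORE_PATH is set to ', path, ', but no such file exists.')
    return(FALSE)
  }
  TRUE
}

check_musescore_path()
```

If the function returns FALSE, the message indicates whether the variable is missing or the path is wrong.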

4.1.10. Miniconda

Miniconda is a free minimal installer for conda, an open source package, dependency and environment management system for any language (Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN and more) that runs on Windows, macOS and Linux.
Follow the instructions at https://docs.conda.io/en/latest/miniconda.html.

At the terminal:

cd ~/Downloads/
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-Linux-x86_64.sh

Answer the installer prompts as follows:

  • Do you accept the license terms? [yes|no] Answer yes.
  • Miniconda3 will now be installed into this location: /home/user/miniconda3. Press ENTER.
  • You can undo this by running conda init --reverse $SHELL? Answer yes.
  • Do you wish the installer to initialize Miniconda3 by running conda init? Answer yes.

Close and reopen terminal.

conda update -n base -c defaults conda

The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED: ... Proceed ([y]/n)? y

conda create -n pyvoice python=3.12

The following (NEW) packages will be downloaded/INSTALLED: ... Proceed ([y]/n)? y

conda activate pyvoice
pip install -r https://raw.githubusercontent.com/filipezabala/voice/master/requirements.txt

4.2 macOS

The following steps are used to fully configure voice on macOS Sonoma. Reports of inconsistencies are welcome.

4.2.1. Homebrew

Install Homebrew, 'The Missing Package Manager for macOS (or Linux)', and remember to run brew doctor from time to time. At the terminal (press command + space and type 'terminal'), run:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

4.2.2. wget

GNU Wget is a free software package for retrieving files using HTTP, HTTPS, FTP and FTPS, the most widely used Internet protocols. It is a non-interactive command-line tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.

brew install wget

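4.2.3. Python

Python is a programming language that helps integrate systems. The post linked from the original page recommended installing Python 3.8 and 3.9 and keeping the installation consistent; the command below installs Python 3.12, the same version used for the pyvoice environment in this guide.

brew install python@3.12
python3 --version
pip3 --version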

4.2.4. ffmpeg

ffmpeg is a cross-platform solution to record, convert and stream audio and video. The installation may take several minutes.

brew install ffmpeg

4.2.5. XQuartz

The XQuartz project is an open-source effort to develop a version of the X.Org X Window System that runs on macOS.

4.2.6. MacPorts

Follow the instructions from https://guide.macports.org/chunked/installing.macports.html.

4.2.7. tcllib

sudo port selfupdate && sudo port upgrade tcllib
sudo port install tcllib

4.2.8. MuseScore

MuseScore is open source notation software.

4.2.9. R

R is a free software environment for statistical computing and graphics.

4.2.10. RStudio

RStudio is an Integrated Development Environment (IDE) for R.

4.2.11. R packages

"Packages are the fundamental units of reproducible R code." (Hadley Wickham and Jennifer Bryan). Press command + space, type 'terminal', and run:

sudo R

Running R as superuser, paste the following, one line at a time:

packs <- c('audio', 'devtools', 'reticulate', 'R.utils', 'seewave', 'tidyverse', 'tuneR', 'wrassp')
install.packages(packs, dependencies = TRUE)
update.packages(ask = FALSE)
devtools::install_github('egenn/music')
devtools::install_github('flujoo/gm')

4.2.12. Miniconda

Miniconda is a free minimal installer for conda, an open source package, dependency and environment management system for any language (Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN and more) that runs on Windows, macOS and Linux.

For Intel-based Macs (x86_64), use:

cd ~/Downloads
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-MacOSX-x86_64.sh

For Apple Silicon Macs (M1 and later, arm64), use:

cd ~/Downloads
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-MacOSX-arm64.sh

Answer the installer prompts as follows:

  • In order to continue the installation process, please review the license agreement. Please, press ENTER to continue. Press ENTER.
  • You can undo this by running conda init --reverse $SHELL? Answer yes.

Close and reopen terminal.

export PATH="$HOME/miniconda3/bin:$PATH"
conda update -n base -c defaults conda

The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED: ... Proceed ([y]/n)? y

conda create -n pyvoice python=3.12

The following (NEW) packages will be downloaded/INSTALLED: ... Proceed ([y]/n)? y

Close and reopen terminal.

conda activate base
conda activate pyvoice
pip install -r https://raw.githubusercontent.com/filipezabala/voice/master/requirements.txt
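Once the pyvoice environment exists (on either Ubuntu or macOS), it can be useful to confirm from R that it is visible. The sketch below assumes the Python side is reached through the reticulate package, which the package lists above install. The helper has_conda_env() is hypothetical and is not part of voice.

```r
# Hypothetical helper: TRUE if reticulate can see a conda environment
# with the given name, FALSE otherwise (including when reticulate is missing)
has_conda_env <- function(env_name = 'pyvoice') {
  if (!requireNamespace('reticulate', quietly = TRUE)) {
    return(FALSE)
  }
  # conda_list() raises an error when no conda installation is found
  envs <- tryCatch(reticulate::conda_list(), error = function(e) NULL)
  !is.null(envs) && env_name %in% envs$name
}

if (has_conda_env('pyvoice')) {
  # Bind this R session to the environment and show the Python in use
  reticulate::use_condaenv('pyvoice', required = TRUE)
  print(reticulate::py_config())
}
```

A FALSE result usually means either that conda is not on the PATH seen by R or that the environment was created under a different name.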

5. Diarize

# download
url0 <- 'https://github.com/filipezabala/voiceAudios/raw/main/wav/sherlock0.wav'
wavDir <- normalizePath(tempdir())
download.file(url0, paste0(wavDir, '/sherlock0.wav'), mode = 'wb')

Diarization can be performed to detect speaker segments (i.e., 'who spoke when').

# diarize
voice::diarize(fromWav = wavDir, toRttm = wavDir, token = 'YOUR_TOKEN')

The voice::diarize() function creates Rich Transcription Time Marked (RTTM) [1] files: space-delimited text files containing one turn per line, in a format defined by NIST (National Institute of Standards and Technology). The RTTM files can be read using voice::read_rttm().

# read_rttm
(rttm <- voice::read_rttm(wavDir))
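For readers unfamiliar with the format, the sketch below parses two made-up RTTM lines with base R. The ten field names follow the NIST layout (type, file, chnl, tbeg, tdur, ortho, stype, name, conf, slat). The helper parse_rttm() and the sample lines are hypothetical, and the columns returned by voice::read_rttm() may differ.

```r
# Two made-up RTTM lines: type, file, channel, onset (s), duration (s),
# then placeholder fields and the speaker name in the eighth column
rttm_lines <- c(
  'SPEAKER sherlock0 1 0.50 2.25 <NA> <NA> SPEAKER_00 <NA> <NA>',
  'SPEAKER sherlock0 1 3.00 1.50 <NA> <NA> SPEAKER_01 <NA> <NA>'
)

# Hypothetical helper: parse RTTM text into a data frame of turns
parse_rttm <- function(lines) {
  turns <- read.table(text = lines, sep = ' ', stringsAsFactors = FALSE,
                      na.strings = '<NA>',
                      col.names = c('type', 'file', 'chnl', 'tbeg', 'tdur',
                                    'ortho', 'stype', 'name', 'conf', 'slat'))
  turns$tend <- turns$tbeg + turns$tdur  # end of each turn in seconds
  turns[, c('file', 'name', 'tbeg', 'tdur', 'tend')]
}

parse_rttm(rttm_lines)
```

Each row then answers 'who spoke when': the speaker label, the onset, and the end of the turn in seconds.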

Finally, the audio waves can be automatically segmented.

# split audio wave
voice::splitw(fromWav = wavDir, fromRttm = wavDir, to = wavDir)
dir(wavDir, pattern = '\\.[Ww][Aa][Vv]$')
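Conceptually, segmenting a wave from diarization output means turning each turn's onset and duration into a range of samples. The following base R sketch shows that arithmetic under the assumption of a known sampling rate and 1-based sample indices. The helper turn_to_samples() is hypothetical and does not describe the internals of voice::splitw().

```r
# Hypothetical helper: sample index range covering a turn, assuming a known
# sampling rate and 1-based sample indices as used by R vectors
turn_to_samples <- function(tbeg, tdur, sample_rate) {
  first <- floor(tbeg * sample_rate) + 1
  last  <- floor((tbeg + tdur) * sample_rate)
  data.frame(first = first, last = last)
}

# A turn starting at 0.5 s and lasting 2 s in 16 kHz audio
s <- turn_to_samples(tbeg = 0.5, tdur = 2, sample_rate = 16000)
print(s)
```

In this example the range runs from sample 8001 to sample 40000, that is 32000 samples, or exactly 2 seconds at 16 kHz.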

Footnotes

  1. See Appendix C at https://www.nist.gov/system/files/documents/itl/iad/mig/KWS15-evalplan-v05.pdf.
