mottensmann/NocMigR
NocMigR package (deprecated: see NocMigR2)
This package is in a very preliminary state and provides some workflows for processing large sound files (e.g., NocMig, NFC, AudioMoth), with a main emphasis on automating the detection of events (i.e., extracting calls with time-stamps) that can be easily reviewed in Audacity. Note: Following recent changes to the data privacy policy and ownership of Audacity, I strongly suggest sticking to version 3.0.2!
All major computation steps are carried out by sophisticated libraries called in the background, including:
R packages
python packages
To install the package, use …

    devtools::install_github("mottensmann/NocMigR")

Load the package once installed …

    library(NocMigR)

The package contains an example file captured using an AudioMoth recorder. To reduce file size, a segment of five minutes was resampled at 44.1 kHz and saved as a 128 kbps mp3 file. In addition to a lot of noise, there is a short segment of interest (scale call of a Eurasian Pygmy Owl, Glaucidium passerinum).

    ## get path to test_audio.mp3
    path <- system.file("extdata", "20211220_064253.mp3", package = "NocMigR")
    ## create temp folder
    dir.create("example")
    #> Warning in dir.create("example"): 'example' already exists
    ## copy to test_folder
    file.copy(path, "example")
    ## convert to wav
    bioacoustics::mp3_to_wav("example/20211220_064253.mp3", delete = T)
    file.rename(from = "example/20211220_064253.wav", to = "example/20211220_064253.WAV")

Plot a spectrogram to see that there is a lot of noise and a few spikes reflecting actual signals …

    ## read audio
    audio <- tuneR::readWave("example/20211220_064253.WAV")
    ## plot spectrum
    bioacoustics::spectro(audio, FFT_size = 2048, flim = c(0, 5000))

Naming files using a string that combines the recording date and starting time (YYYYMMDD_HHMMSS) is convenient for archiving and analysing audio files (e.g., the default of AudioMoth). Some (most?) of the popular field recorders (e.g., Olympus LS, Tascam DR or Sony PCM) use different, rather uninformative naming schemes (date and number at best), but the information needed to construct a proper date_time string is embedded in the metadata of the recording (accessible using file.info(), provided the internal clock of the recorder was set correctly!). For instance, long recording sessions using an Olympus LS-3 create multiple files, all of which share the same creation and modification times (those of the first recording). By contrast, the Sony PCM-D100 saves files individually (i.e., all have unique ctimes and mtimes). Presets to rename files are available for both types described here.

    ## Simulate = T allows to see what would happen without altering files
    rename_recording(path = "example", format = "WAV", recorder = "Olympus LS-3", simulate = T)
    #>                                                            old.name  seconds
    #> example/20211220_064253.WAV                     20211220_064253.WAV 300.0686
    #> example/20211220_064253_extracted.WAV 20211220_064253_extracted.WAV  10.6300
    #> example/merged_events.WAV                         merged_events.WAV  10.6300
    #>                                                      time            new.name
    #> example/20211220_064253.WAV           2023-09-15 21:43:47 20230915_214347.WAV
    #> example/20211220_064253_extracted.WAV 2023-09-15 21:48:47 20230915_214847.WAV
    #> example/merged_events.WAV             2023-09-15 21:48:58 20230915_214858.WAV

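As a rough illustration of how a YYYYMMDD_HHMMSS name can be derived from file metadata, the sketch below formats the modification time returned by file.info(). The helper new_name_from_mtime() is hypothetical and not part of NocMigR; the recorder presets of rename_recording() may work differently (the Olympus output above suggests that recording durations are taken into account as well).

```r
## Sketch only: derive a YYYYMMDD_HHMMSS file name from the modification time.
## new_name_from_mtime() is a hypothetical helper, not a NocMigR function.
new_name_from_mtime <- function(file, tz = "UTC") {
  mtime <- file.info(file)$mtime   # POSIXct modification time
  ext <- tools::file_ext(file)     # keep the original extension
  paste0(format(mtime, "%Y%m%d_%H%M%S", tz = tz), ".", ext)
}

## demo on a temporary file with a known modification time
tmp <- tempfile(fileext = ".WAV")
file.create(tmp)
Sys.setFileTime(tmp, as.POSIXct("2021-12-20 06:42:53", tz = "UTC"))
new_name <- new_name_from_mtime(tmp)
print(new_name)
```

The time zone is fixed to UTC here only to keep the demo reproducible; for real recordings the time zone of the recorder clock is the relevant one.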
The function split_wave() splits long audio recordings into smaller chunks for processing with bioacoustics::threshold_detection. To keep the time information, files are written with the corresponding starting time. The task is performed using a python script queried using reticulate.

    ## split in segments
    split_wave(file = "20211220_064253.WAV", # which file
               path = "example",             # where to find it
               segment = 30,                 # cut in 30 sec segments
               downsample = 32000)           # resample at 32000
    #>
    #> Downsampling of 20211220_064253.WAV to 32000 Hz... done
    #> Split ...
    ## show files
    list.files("example/split/")
    #>  [1] "20211220_064253.WAV" "20211220_064323.WAV" "20211220_064353.WAV"
    #>  [4] "20211220_064423.WAV" "20211220_064453.WAV" "20211220_064523.WAV"
    #>  [7] "20211220_064553.WAV" "20211220_064623.WAV" "20211220_064653.WAV"
    #> [10] "20211220_064723.WAV" "20211220_064753.WAV"
    ## delete folder
    unlink("example/split", recursive = TRUE)

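The naming of the segments can be reproduced in a few lines of base R. This is only a sketch of the idea (parse the start time from the file name, then add multiples of the segment length); segment_names() is a hypothetical helper and not the python routine the package calls. The duration of 300.0686 seconds is the value reported for the demo file above.

```r
## Sketch only: file names for consecutive segments of one recording.
## segment_names() is a hypothetical helper, not part of NocMigR.
segment_names <- function(file, duration, segment) {
  stem <- tools::file_path_sans_ext(basename(file))
  ext <- tools::file_ext(file)
  ## start of the recording, parsed from the YYYYMMDD_HHMMSS file name
  start <- as.POSIXct(stem, format = "%Y%m%d_%H%M%S", tz = "UTC")
  ## offsets (in seconds) at which each segment begins
  n <- ceiling(duration / segment)
  offsets <- (seq_len(n) - 1) * segment
  paste0(format(start + offsets, "%Y%m%d_%H%M%S"), ".", ext)
}

segs <- segment_names("example/20211220_064253.WAV",
                      duration = 300.0686, segment = 30)
print(segs)
```

Because the demo file is slightly longer than 300 seconds, the last of the eleven segments is very short.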
The function find_events() is a wrapper around bioacoustics::threshold_detection() that aims at extracting calls based on the signal-to-noise ratio and some target-specific assumptions about approximate call frequencies and durations. Check ?bioacoustics::threshold_detection() for details. Note that only some of the parameters defined in bioacoustics::threshold_detection() are used right now. For long recordings (i.e., several hours) it makes sense to run on segments as created before to avoid memory issues. Here we use the demo sound file as it is.

    ## run detection threshold algorithm
    TD <- find_events(wav.file = "example/20211220_064253.WAV",
                      threshold = 8, # Signal-to-noise ratio in dB
                      min_dur = 20,  # min length in ms
                      max_dur = 300, # max length in ms
                      LPF = 5000,    # low-pass filter at 5 kHz
                      HPF = 1000)    # high-pass filter at 1 kHz
    ## Review events
    head(TD$data$event_data[, c("filename", "starting_time", "duration", "freq_max_amp")])
    #>              filename starting_time  duration freq_max_amp
    #> 1 20211220_064253.WAV  00:00:46.576 168.34467     1477.762
    #> 2 20211220_064253.WAV  00:00:47.045 190.11338     1646.544
    #> 3 20211220_064253.WAV  00:00:47.887 116.82540     1790.127
    #> 4 20211220_064253.WAV  00:00:48.277 150.92971     1827.046
    #> 5 20211220_064253.WAV  00:00:48.774  91.42857     1964.311
    #> 6 20211220_064253.WAV  00:00:49.332  21.04308     2264.046
    ## display spectrogram based on approximate location of first six events
    audio <- tuneR::readWave("example/20211220_064253.WAV", from = 46, to = 50, units = "seconds")
    bioacoustics::spectro(audio, FFT_size = 2048, flim = c(0, 5000))

In addition to the output shown above, a file with labels for reviewing events in Audacity is created (wrapping seewave::write.audacity()).
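For orientation, an Audacity label track is a plain tab-separated text file with one row per label: start time, end time (both in seconds) and the label text. The sketch below writes such a file with base R instead of seewave::write.audacity(); the times and label names are illustrative values, not output of the package.

```r
## Sketch only: write an Audacity label file with base R.
## The times and label texts below are illustrative values.
labels <- data.frame(from = c(45.576, 152.434),
                     to = c(50.381, 156.354),
                     label = c("event 1", "event 2"))

label_file <- tempfile(fileext = ".txt")
write.table(labels, label_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

## each line reads: start <tab> end <tab> label
label_lines <- readLines(label_file)
print(label_lines)
```

A file like this can be loaded in Audacity as a label track alongside the recording.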
The function extract_events() refines the output of find_events() by first adding a buffer (default: 1 second on both sides of the event) and subsequently merging overlapping selections to tidy the output. Additionally, it allows filtering based on expected frequencies (i.e., it checks that the maximum amplitude frequency is within the frequency band defined by HPF:LPF).

    ## extract events based on object TD
    df <- extract_events(threshold_detection = TD,
                         path = "example",
                         format = "WAV",
                         LPF = 4000,
                         HPF = 1000,
                         buffer = 1)
    #>
    #> Existing files '_extracted.WAV will be overwritten!
    #> 6 selections overlapped

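The three steps described above (frequency filter, buffer, merging of overlapping selections) can be sketched in base R as follows. This is an illustration of the general technique under my own assumptions, not the code used by extract_events(); merge_intervals() is a hypothetical helper, and the input values are the six events listed in the find_events() output.

```r
## Sketch only: filter by frequency band, add a buffer, merge overlaps.
## merge_intervals() is a hypothetical helper, not part of NocMigR.
merge_intervals <- function(from, to) {
  if (length(from) == 0) return(data.frame(from = numeric(0), to = numeric(0)))
  ord <- order(from)
  from <- from[ord]
  to <- to[ord]
  ## a new group starts when an interval begins after all previous ones ended
  running_end <- cummax(to)
  new_group <- c(TRUE, from[-1] > head(running_end, -1))
  group <- cumsum(new_group)
  data.frame(from = as.numeric(tapply(from, group, min)),
             to = as.numeric(tapply(to, group, max)))
}

## events from the detection output (start in s, duration in ms, frequency in Hz)
start <- c(46.576, 47.045, 47.887, 48.277, 48.774, 49.332)
duration <- c(168.34467, 190.11338, 116.82540, 150.92971, 91.42857, 21.04308)
freq_max_amp <- c(1477.762, 1646.544, 1790.127, 1827.046, 1964.311, 2264.046)

HPF <- 1000
LPF <- 4000
buffer <- 1

## keep events whose peak frequency lies within HPF:LPF
keep <- freq_max_amp >= HPF & freq_max_amp <= LPF
## add the buffer on both sides (never before the start of the file)
from <- pmax(start[keep] - buffer, 0)
to <- start[keep] + duration[keep] / 1000 + buffer

merged <- merge_intervals(from, to)
print(merged)
```

With a buffer of one second all six selections overlap, so they collapse into a single selection covering the whole call sequence.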
Display refined events …

    ## display spectrogram based on first six events
    audio <- tuneR::readWave("example/20211220_064253.WAV", from = df$from, to = df$to, units = "seconds")
    bioacoustics::spectro(audio, FFT_size = 2048, flim = c(0, 5000))

The function merge_events() takes the output of the previous operation and concatenates audio signals as well as labels into files called merged.events.wav and merged.events.txt, respectively. This option comes in handy if there are many input files in the working directory.

    merge_events(path = "example")
    #>
    #> Existing files merged_events.WAV will be overwritten!

The function batch_process() processes all files within a directory and runs the steps shown above.

    batch_process(path = "example",
                  format = "WAV",
                  segment = NULL,
                  downsample = NULL,
                  SNR = 8,
                  target = data.frame(min_dur = 20,  # min length in ms
                                      max_dur = 300, # max length in ms
                                      LPF = 5000,    # low-pass filter at 5 kHz
                                      HPF = 1000),   # high-pass filter at 1 kHz
                  rename = FALSE)
    #> Start processing: 2023-09-15 22:06:40 [Input audio 5 minutes @ 44100 Hz ]
    #> Search for events ...
    #> Warning in find_events(wav.file = x, overwrite = TRUE, threshold = SNR, : NAs
    #> introduced by coercion
    #> Warning in find_events(wav.file = x, overwrite = TRUE, threshold = SNR, : NAs
    #> introduced by coercion
    #> done
    #> Extract events ...
    #>
    #> Existing files '_extracted.WAV will be overwritten!
    #> 8 selections overlapped
    #> In total 1 events detected
    #> Merge events and write audio example/merged_events.WAV
    #>
    #> Existing files merged_events.WAV will be overwritten!
    #> Finished processing: 2023-09-15 22:06:42
    #> Run time: 1.77 seconds
    #>              filename    from        to       starting_time   event
    #> 1 20211220_064253.WAV  45.576  47.62258 2021-12-20 06:43:39  46.576
    #> 2 20211220_064253.WAV  46.045  48.09204 2021-12-20 06:43:40  47.045
    #> 3 20211220_064253.WAV  46.887  49.32528 2021-12-20 06:43:40  47.887
    #> 4 20211220_064253.WAV  47.774  49.82277 2021-12-20 06:43:41  48.774
    #> 5 20211220_064253.WAV  48.332  50.38133 2021-12-20 06:43:42  49.332
    #> 6 20211220_064253.WAV 152.434 156.35420 2021-12-20 06:45:26 153.434

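The starting_time column in the table above is consistent with adding the event offset (in seconds) to the recording start encoded in the file name. A minimal sketch of that arithmetic, assuming UTC purely for reproducibility (the package may handle time zones differently):

```r
## Sketch only: absolute event times from file name and offset in seconds.
## The time zone is an assumption made for this demo.
file_start <- as.POSIXct("20211220_064253", format = "%Y%m%d_%H%M%S", tz = "UTC")
event_offset <- c(46.576, 47.045, 153.434)   # seconds since start of recording

event_time <- format(file_start + event_offset, "%Y-%m-%d %H:%M:%S")
print(event_time)
```

Formatting without fractional seconds drops the millisecond part, which is why events 46.576 and 47.045 end up one second apart in the table.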
Recording | Sample.rate | Downsampled | Channels | Run.time |
---|---|---|---|---|
60 h | 96000 Hz | 44100 Hz | Mono | 2.02 h |
60 h | 96000 Hz | 44100 Hz | Mono | 1.76 h |
11.91 h | 96000 Hz | 44100 Hz | Stereo | 1.39 h |
10.6 h | 96000 Hz | 44100 Hz | Mono | 1.3 h |
2.73 h | 96000 Hz | 44100 Hz | Mono | 4.88 min |
Run times for all steps, notebook ~ Intel i5-4210M, 2 cores ~ 8 GB RAM
Recording | Sample.rate | Downsampled | Channels | Run.time |
---|---|---|---|---|
7.5 h | 96000 Hz | 44100 Hz | Mono | 14.52 min |
Run times for event detection only, notebook ~ Intel i5-4210M, 2 cores ~ 8 GB RAM
Update:
With adequate computational power there is no need to split even larger wave files into segments of one hour. This way, the event detection process is much faster (steps 3:6), usually less than four minutes for an entire NocMig night!
Recording | Sample.rate | Downsampled | Channels | Run.time |
---|---|---|---|---|
114.99 h | 48000 Hz | 44100 Hz | Mono | 26.79 min |
115 h AudioMoth recording, notebook ~ AMD RYZEN 7, 16 cores ~ 24 GB RAM
Retrieve weather data via Bright Sky (de Maeyer 2020) and compose a string describing a NocMig session from dusk to dawn for a given location. Note that the comment follows suggestions by HGON (Schütze et al. 2022).

    ## example for Bielefeld
    ## -----------------------------------------------------------------------------
    NocMig_meta(date = Sys.Date() - 2, lat = 52.032, lon = 8.517)
    #> Teilliste 1: 13.9-14.9.2023, 20:23-06:25, trocken, 12°C, ESE, 2 km/h
    #> Teilliste 2: 13.9-14.9.2023, 20:23-06:25, trocken, 9°C, ESE, 3 km/h

Recently I started to play with BirdNET. First trials suggest that only a few calls of interest are missed, and the majority are correctly labelled using the BirdNET_GLOBAL_6K_V2.4 model. Currently, it is rather difficult to run BirdNET through RStudio on a Windows computer, and hence a few lines of python code are pasted into a Linux (Ubuntu) command line.
- Set up list of target species

    ## Creates a species list by subsetting from the full model
    ## -----------------------------------------------------------------
    BirdNET_species.list(names = c("Glaucidium passerinum", "Bubo bubo"),
                         scientific = T,
                         out = "example/species.txt")
    #> # A tibble: 2 × 2
    #>   scientific_name       englisch_name
    #>   <chr>                 <chr>
    #> 1 Bubo bubo             Eurasian Eagle-Owl
    #> 2 Glaucidium passerinum Eurasian Pygmy-Owl

Run analyze.py using a command line program (e.g., Ubuntu on Windows). See details in the documentation of BirdNET.

    ## run BirdNET-Analyzer in a bash shell
    ## --------------------
    python3 analyze.py --i /example --o /example --slist /example/species.txt --rtype 'audacity' --threads 1 --locale 'de'

The function BirdNET (see ?BirdNET for details) does the following:
- Reshape Audacity labels created by analyze.py to include the event time
- Write records to an xlsx file (BirdNET.xlsx) as a template to simplify inspection and verification:

    df <- BirdNET(path = "example/")
    #> Created example//BirdNET.xlsx
    df[["Records"]]
    #>           Taxon                  T1                  T2 Score Verification
    #> 1 Sperlingskauz 2021-12-20 06:43:38 2021-12-20 06:43:41 0.516           NA
    #> 2 Sperlingskauz 2021-12-20 06:45:29 2021-12-20 06:45:32 0.378           NA
    #> 3 Sperlingskauz 2021-12-20 06:45:35 2021-12-20 06:45:38 0.126           NA
    #>   Correction Quality Comment                  T0
    #> 1         NA      NA      NA 2021-12-20 06:42:53
    #> 2         NA      NA      NA 2021-12-20 06:42:53
    #> 3         NA      NA      NA 2021-12-20 06:42:53
    #>                                          File
    #> 1 example/20211220_064253.BirdNET.results.txt
    #> 2 example/20211220_064253.BirdNET.results.txt
    #> 3 example/20211220_064253.BirdNET.results.txt
    ## records per species and day
    df[["Records.dd"]]
    #> # A tibble: 1 × 3
    #> # Groups:   species [1]
    #>   species       date           n
    #>   <chr>         <date>     <int>
    #> 1 Sperlingskauz 2021-12-20     3
    ## records per species and hour
    df[["Records.hh"]]
    #> # A tibble: 1 × 3
    #> # Groups:   species [1]
    #>   species        hour     n
    #>   <fct>         <int> <int>
    #> 1 Sperlingskauz     6     3

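The per-day and per-hour summaries shown above are plain counts and could be derived from the Records table along the following lines. This is a base R sketch using the three records from the output; it is not the code behind BirdNET(), which returns tibbles and may be implemented differently.

```r
## Sketch only: count records per species and day, and per species and hour.
## Values are taken from the Records table; UTC is an assumption of the demo.
records <- data.frame(
  Taxon = "Sperlingskauz",
  T1 = as.POSIXct(c("2021-12-20 06:43:38",
                    "2021-12-20 06:45:29",
                    "2021-12-20 06:45:35"), tz = "UTC")
)
records$date <- as.Date(records$T1, tz = "UTC")
records$hour <- as.integer(format(records$T1, "%H"))
records$n <- 1L   # one row per record, summed below

per_day <- aggregate(n ~ Taxon + date, data = records, FUN = sum)
per_hour <- aggregate(n ~ Taxon + hour, data = records, FUN = sum)
print(per_day)
print(per_hour)
```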
Extract detections and export them as wav files. To make records easier to verify, files are named 'Species_Date_Time.WAV' (see below).

    ## extract events
    BirdNET_extract(path = "example/", hyperlink = F) ## If T: create hyperlink as excel formula
    ## show files
    list.files("example/extracted/Sperlingskauz/")
    #> [1] "Sperlingskauz_20211220_064338.WAV" "Sperlingskauz_20211220_064529.WAV"
    #> [3] "Sperlingskauz_20211220_064535.WAV"

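The 'Species_Date_Time.WAV' pattern can be reproduced from the taxon and the event time T1 of each record. The lines below are a sketch of that naming scheme only, using the T1 values from the Records table; how BirdNET_extract() builds the names internally is not shown here.

```r
## Sketch only: build 'Species_Date_Time.WAV' names from taxon and event time.
taxon <- "Sperlingskauz"
t1 <- as.POSIXct(c("2021-12-20 06:43:38",
                   "2021-12-20 06:45:29",
                   "2021-12-20 06:45:35"), tz = "UTC")

file_names <- paste0(taxon, "_", format(t1, "%Y%m%d_%H%M%S"), ".WAV")
print(file_names)
```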

    ## clean-up
    unlink("example", recursive = TRUE)
