wavlm
Here are 16 public repositories matching this topic...
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
- Updated Aug 10, 2024 - Python
Self-Supervised Speech Pre-training and Representation Learning Toolkit
- Updated Mar 11, 2025 - Python
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
- Updated Feb 26, 2025 - Python
A low-bitrate single-codebook 16 kHz speech codec based on focal modulation
- Updated Feb 12, 2025 - Python
This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and t…
- Updated Sep 4, 2022 - Python
A neural speech codec based on discrete WavLM representations
- Updated Aug 28, 2024 - Python
A collection of audio codecs with a standardized API
- Updated Feb 12, 2025 - Python
This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino
- Updated Mar 5, 2023 - Python
This repository applies the WavLM model to speaker verification on both high-quality and poor-quality data, and uses the PyCM library for evaluation.
- Updated May 27, 2023 - Jupyter Notebook
SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.
- Updated Feb 19, 2025 - Python
This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from Nvidia.
- Updated Mar 10, 2025 - Jupyter Notebook
Universal Pooling Method for Speaker Verification Utilizing Pre-trained Multi-layer Features, 2025 preprint
- Updated Sep 19, 2024 - Python
This repo contains code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection"
- Updated Oct 19, 2023 - Python
WavLM Large + RawNetX Speaker Verification Base: End-to-End Speaker Verification Architecture
- Updated Mar 10, 2025 - Python
Acoustic Transformer Models for Audio Classification
- Updated Feb 15, 2025 - Python
CryCeleb2023 experiments
- Updated Jul 5, 2023 - Jupyter Notebook