Meyer et al., 2023

| Publication | Title | Publication Date |
|---|---|---|
| Meyer et al. | Anonymizing speech with generative adversarial networks to preserve speaker privacy | |
| Wu et al. | VQVC+: One-shot voice conversion by vector quantization and U-Net architecture | |
| Casanova et al. | SC-GlowTTS: An efficient zero-shot multi-speaker text-to-speech model | |
| Fang et al. | Speaker anonymization using x-vector and neural waveform models | |
| Min et al. | Meta-StyleSpeech: Multi-speaker adaptive text-to-speech generation | |
| Sadjadi et al. | The 2021 NIST speaker recognition evaluation | |
| Luo et al. | Conv-TasNet: Surpassing ideal time–frequency magnitude masking for speech separation | |
| Kameoka et al. | ACVAE-VC: Non-parallel voice conversion with auxiliary classifier variational autoencoder | |
| Shon et al. | VoiceID loss: Speech enhancement for speaker verification | |
| Kaneko et al. | Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks | |
| Chen et al. | Towards understanding and mitigating audio adversarial examples for speaker recognition | |
| Justin et al. | Speaker de-identification using diphone recognition and speech synthesis | |
| Mansour et al. | Voice recognition using dynamic time warping and mel-frequency cepstral coefficients algorithms | |
| Yang et al. | Genhancer: High-fidelity speech enhancement via generative modeling on discrete codec tokens | |
| Ferrer et al. | Spoken language recognition based on senone posteriors | |
| CN106328123A (en) | Method of recognizing ear speech in normal speech flow under condition of small database | |
| Huang et al. | FastDiff 2: Revisiting and incorporating GANs and diffusion models in high-fidelity speech synthesis | |
| Kheder et al. | A unified joint model to deal with nuisance variabilities in the i-vector space | |
| Khamsehashari et al. | Voice Privacy-leveraging multi-scale blocks with ECAPA-TDNN SE-Res2NeXt extension for speaker anonymization | |
| Ng et al. | Teacher-student training for text-independent speaker recognition | |
| Ranjan et al. | An i-vector PLDA based gender identification approach for severely distorted and multilingual DARPA RATS data | |
| CN108665901B (en) | Phoneme/syllable extraction method and device | |
| Ho et al. | Cross-lingual voice conversion with controllable speaker individuality using variational autoencoder and star generative adversarial network | |
| Esmaeilpour et al. | Cyclic defense gan against speech adversarial attacks | |
| Sim et al. | SKQVC: One-Shot Voice Conversion by K-Means Quantization with Self-Supervised Speech Representations | |