Disclosure of Invention
The invention aims to solve the above technical problems by providing a Codec-based method, device, equipment and medium for separating speech from background sound. Taking the Codec technology as the core, the method separates speech from background sound in a representation space, avoiding the computational burden and separation errors caused by direct processing at the waveform or spectrum level, and thereby solving the problem that traditional separation methods struggle to balance separation performance and audio quality under low-bit-rate compression.
In a first aspect, the present invention provides a method for separating speech from background sound based on Codec, including:
A separation model construction process: constructing a Codec-based separation model comprising an encoder, a separation module, two audio representation quantizers and a decoder. The encoder encodes audio into continuous audio representations; the separation module separates the continuous audio representations into a hidden representation containing speech and a hidden representation containing background sound; the two audio representation quantizers respectively discretize the hidden representation containing speech and the hidden representation containing background sound and compress them to a low bit rate level, yielding a compressed discrete speech representation and a compressed discrete background sound representation; and the decoder restores the compressed discrete speech representation and the compressed discrete background sound representation into a separated speech signal and a separated background sound signal, realizing audio separation in the feature representation space;
A model training process: randomly selecting a clean speech sample and superposing background sound on it to generate mixed input audio; performing audio separation with the Codec-based separation model to obtain a speech signal and a background sound signal; mixing the two to obtain a predicted mixed audio; evaluating the post-separation reconstruction quality of the speech and the background sound with a first discrimination module and a second discrimination module respectively, while evaluating the reconstruction quality of the predicted mixed audio with a third discrimination module; thereby training the Codec-based separation model until training is complete;
An audio separation process: separating target audio with the trained Codec-based separation model, and quantitatively evaluating the speech quality of the model's separated output.
Further, the separation module comprises two groups of nonlinear layers, each group comprising a plurality of nonlinear layers of different dimensions.
Further, the process of generating the mixed input audio is specifically: obtaining an audio signal x_s containing one or more voices and an audio signal x_b containing only background sound, and mixing them through volume normalization and signal superposition into an audio signal in which speech and background sound are mixed, i.e. the mixed input audio x_mix, with the formula:
x_s ∈ R^T, x_b ∈ R^T; x_mix = Norm(x_s) + Norm(x_b);
wherein T is the number of sampling points of the audio and Norm(·) denotes volume normalization.
Further, the two audio representation quantizers each adopt an FSQ (finite scalar quantization) quantizer. The quantization process specifically discretizes the hidden representation containing speech and the hidden representation containing background sound respectively and compresses them to a low bit rate level: each scalar contained in the tensor X is converted into the range 0 to L-1, so that the set of all vectors composed of such scalars is finite, with the formula:
X̂ = round((L-1)/2 · (tanh(X) + 1));
wherein round(·) is the rounding function, L is the discretization level, and d is the dimension of each vector in X, so that L^d is the vocabulary size of the discretized vectors.
Further, at an input audio sampling rate of 24 kHz, the quantization levels of the FSQ quantizer are set to [8,8,8,8,5,5,5,5,5,5,5,5,5,5,5,5], compressing the hidden representation to a bit rate level of 3 kbps.
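As a sanity check on the claimed figure, the sketch below computes the bit rate implied by that level vector. The 75 frames-per-second figure is our assumption (a 24 kHz input with a 320x downsampling encoder), not a value stated in this disclosure.

```python
import math

# Each frame of the hidden representation is coded with log2(prod(levels)) bits
# under FSQ with the level vector given above.
levels = [8] * 4 + [5] * 12
bits_per_frame = math.log2(math.prod(levels))       # ~39.86 bits per frame

# Assumed: 24 kHz input, 320x downsampling -> 75 frames per second.
frames_per_second = 24000 / 320
bitrate_bps = bits_per_frame * frames_per_second
print(round(bits_per_frame, 2), round(bitrate_bps))  # ~39.86 bits -> ~2990 bps
```

Under these assumptions the stream lands at roughly 2.99 kbps, i.e. the "3 kbps level" stated above.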
Further, the first discrimination module, the second discrimination module and the third discrimination module each comprise a multi-filter bank discriminator (MFBD) and a multi-scale short-time Fourier transform discriminator (MSTFTD).
Further, the speech quality of the model's separated output is quantitatively evaluated with the Perceptual Evaluation of Speech Quality (PESQ) metric.
In a second aspect, the invention provides a Codec-based speech and background sound separation device comprising a Codec-based separation model. The Codec-based separation model comprises an encoder, a separation module, two audio representation quantizers and a decoder: the encoder encodes audio into continuous audio representations; the separation module separates the continuous audio representations into a hidden representation containing speech and a hidden representation containing background sound; the two audio representation quantizers respectively discretize these hidden representations and compress them to a low bit rate level, yielding a compressed discrete speech representation and a compressed discrete background sound representation; and the decoder restores the compressed discrete speech representation and the compressed discrete background sound representation into a separated speech signal and a separated background sound signal, realizing audio separation in the feature representation space.
In a third aspect, the invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first aspect when executing the program.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method of the first aspect.
The technical scheme provided by the embodiment of the invention has at least the following technical effects:
The speech and background sound separation method for low-bit-rate environments extends the end-to-end neural audio codec framework (Codec) to the speech separation task for the first time. It not only retains the modeling advantage of neural codec models at high compression ratios, but also achieves cooperative modeling of the compression and separation tasks through systematic optimization of the internal modules, remarkably improving separation quality and robustness at low bit rates and solving the problem that traditional separation methods struggle to balance separation performance and audio quality under low-bit-rate compression. Compared with prior methods, the invention has at least the following advantages:
1. In terms of model architecture design, it innovatively addresses the problems that speech and background sound are easily confused and feature expression is limited under low-bit-rate conditions, providing a solution of practical value for application scenarios such as remote communication, speech recognition preprocessing and hearing aid design.
2. At a low bit rate of 3 kbps, although the PESQ of the separated speech is 1.853, slightly lower than the single-audio reconstruction result (1.901) at the same bit rate, the speech quality remains good given the complexity of the task, showing that the method retains practicality and robustness in resource-limited scenarios.
3. For the separation scenario at low bit rates, targeted structural improvements are made to the Codec model, expanding the application boundary of Codec models in the field of audio processing.
4. The method has important theoretical significance and broad application prospects for the fusion of audio compression and multitask speech processing, and has remarkable innovation and promotion value.
The foregoing description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be more clearly understood and implemented in accordance with the content of the specification, and to make the above and other objects, features and advantages of the present invention more readily apparent, specific embodiments of the invention are set forth below.
Detailed Description
The embodiment of the invention provides a Codec-based method, device, equipment and medium for separating speech from background sound. Taking the Codec technology as the core, the method separates speech from background sound in a representation space, avoiding the computational burden and separation errors caused by direct processing at the waveform or spectrum level, and thereby solving the problem that traditional separation methods struggle to balance separation performance and audio quality under low-bit-rate compression.
The overall idea of the technical scheme in the embodiment of the invention is as follows:
The Codec model focuses on end-to-end neural audio codec technology and can achieve high-fidelity audio reconstruction at extremely low bit rates. Existing Codec systems are mainly used for compressing and reconstructing audio; their input and output are the same piece of audio. Such a system generally consists of four modules, as shown in fig. 1: 1) an audio encoder, which extracts audio features, encodes them into a high-dimensional representation space, and simultaneously downsamples the audio; 2) a quantizer, which discretizes the high-dimensional representation and generates a discrete representation composed of a finite number of vectors; 3) an audio decoder, which restores the discrete representation to the original audio, realizing the compression and reconstruction functions; and 4) a discriminator, which evaluates audio reconstruction quality along multiple dimensions such as time-domain sampling points, the spectrogram, and the Perceptual Evaluation of Speech Quality (PESQ). During model training, the design and performance of any module can influence the overall system effect, and the model selection of each module is closely related to the final performance. Current Codec systems are limited to audio compression and reconstruction; although efficient compression and restoration can be achieved, the impact of the above modules on the audio reconstruction task at low bit rates, the specific impact of the internal modules (especially the audio representation quantizer) on model performance, and their potential for separating speech from background sound have not been investigated systematically.
Against this background, the invention provides a speech and background sound separation method for low-bit-rate environments, introducing the end-to-end neural audio codec framework (Codec) into the field of speech and background sound separation for the first time. By constructing a unified evaluation framework and a series of modularized ablation experiments, it systematically explores how combinations of key modules such as the encoder, quantizer, decoder and discriminator affect system performance, and identifies the optimal structural configuration for audio compression and separation tasks. The scheme not only retains the modeling advantage of neural codec models at high compression ratios, but also achieves collaborative modeling of the compression and separation tasks through systematic optimization of internal modules (such as the encoder architecture, discriminator design and audio quantizer), remarkably improving separation quality and robustness at low bit rates. The embodiment of the invention provides a speech and background sound separation method based on the Codec technology, aiming to achieve efficient separation of speech and background sound while realizing low-bit-rate compression. Addressing the limitations of existing audio codec technology in the speech separation task, the invention systematically explores the applicability of various Codec frameworks to the speech and background sound separation task, and further studies the specific influence of different modules, particularly the audio quantization module, on separation performance.
On this basis, the invention provides an optimally designed Codec-based separation model in which the encoder structure, discriminator design and quantizer model are cooperatively improved, so that the model retains good speech and background sound separation performance under low-bit-rate compression. The invention is the first systematic study of the influence of each Codec module (including the quantizer, discriminator and encoder framework) on speech and background sound separation performance; it verifies the extensibility of the Codec to multi-task audio processing, makes targeted structural improvements for the new task scenario, and expands the application boundary of Codec models in the field of audio processing. Instead of a Codec-model-based implementation, the architecture can also be implemented on the basis of other speech codecs (e.g., Opus, AMR, or the like).
Compared with the prior art, the method starts from model architecture design, creatively addresses the problems that speech and background sound are easily confused and feature expression is limited under low-bit-rate conditions, and provides a solution of practical value for application scenarios such as remote communication, speech recognition preprocessing and hearing aid design. The scheme therefore has important theoretical significance and broad application prospects for the fusion of audio compression and multitask speech processing, and has remarkable innovation and promotion value.
Example 1
The embodiment provides a method for separating speech from background sound based on Codec, as shown in fig. 2, including:
S1, a separation model construction process: a Codec-based separation model is constructed, as shown in fig. 3, comprising an encoder, a separation module, two audio representation quantizers and a decoder. The encoder encodes audio into continuous audio representations; the separation module separates the continuous audio representations into a hidden representation containing speech and a hidden representation containing background sound; the two audio representation quantizers respectively discretize these hidden representations and compress them to a low bit rate level, yielding compressed discrete speech representations and compressed discrete background sound representations; and the decoder restores the compressed discrete speech representations and compressed discrete background sound representations into separated speech signals and background sound signals, achieving audio separation in the feature representation space.
S2, a model training process: a clean speech sample is randomly selected and superposed with background sound to generate mixed input audio; the Codec-based separation model performs audio separation to obtain a speech signal and a background sound signal; the two are mixed to obtain a predicted mixed audio; a first discrimination module and a second discrimination module respectively evaluate the model's post-separation reconstruction quality of the speech and the background sound, while a third discrimination module evaluates the reconstruction quality of the predicted mixed audio; the Codec-based separation model is thereby trained until training is complete.
S3, an audio separation process: target audio is separated with the trained Codec-based separation model, and the speech quality of the model's separated output is quantitatively evaluated with the Perceptual Evaluation of Speech Quality (PESQ) metric.
In the model training process, the design and performance of any module can influence the overall system effect, and the model selection of each module is closely related to the final performance. The influence of these modules on the audio reconstruction task at low bit rates has not previously been studied systematically. In one embodiment, the impact of different module designs on audio compression and reconstruction performance is systematically explored under different neural codec architectures (e.g., EnCodec and HILCodec). Specifically, the study includes comparative experiments on discriminator combinations, such as a multi-filter bank discriminator (MFBD) combined with a multi-scale short-time Fourier transform discriminator (MSTFTD), a multi-period discriminator (MPD) combined with a multi-scale discriminator (MSD), and all four discriminators used together; on different quantizer designs, including residual vector quantization (RVQ), finite scalar quantization (FSQ), binary spherical quantization (BSQ), lookup-free quantization (LFQ), and residual variants of RVQ and LFQ; and on configuration combinations of different codebook sizes and codebook numbers.
The research is carried out in an ultra-low-bit-rate compression scenario of 3 kbps; the study evaluates the trade-off of each design between audio compression rate and reconstruction fidelity, and provides an optimization scheme for low-bit-rate speech processing applications. The study flow chart of this section is detailed in fig. 5.
The finally selected Codec-based separation model is shown in fig. 3. It is based on the HILCodec model architecture and comprises four main parts: an encoder, a separation module, two audio representation quantizers and a decoder. The separation module comprises two groups of nonlinear layers, each group comprising nonlinear layers of different dimensions, specifically a 64-dimensional nonlinear layer, a 32-dimensional nonlinear layer and a 16-dimensional nonlinear layer. The first, second and third discrimination modules each comprise a multi-filter bank discriminator (MFBD) and a multi-scale short-time Fourier transform discriminator (MSTFTD). The audio representation quantizer adopts finite scalar quantization (FSQ).
The specific steps of the model training process are as follows:
1. The model is trained with multiple open-source data sets, including: 1) speech data sets: DNS-Challenge, LibriTTS and VCTK; 2) a noise data set: the DNS-Challenge noise set; 3) music and other sound data sets: MTG-Jamendo and AudioSet (audio_event). The process of generating mixed input audio for training is specifically: randomly acquiring a segment of audio signal x_s containing one or more voices and an audio signal x_b containing only background sound, and mixing them through volume normalization and signal superposition into an audio signal in which speech and background sound are mixed, i.e. the mixed input audio x_mix, with the formula:
x_s ∈ R^T, x_b ∈ R^T; x_mix = Norm(x_s) + Norm(x_b);
wherein T is the number of sampling points of the audio and Norm(·) denotes volume normalization.
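The mixing step above can be sketched as follows. This is an illustrative sketch, not code from the disclosure: the function names and the RMS-based normalization target are our assumptions, since the patent specifies only "volume normalization and signal superposition".

```python
import numpy as np

def normalize_volume(x: np.ndarray, target_rms: float = 0.1) -> np.ndarray:
    """Scale the signal so its RMS matches target_rms (assumed normalization)."""
    rms = np.sqrt(np.mean(x ** 2)) + 1e-8
    return x * (target_rms / rms)

def mix(x_s: np.ndarray, x_b: np.ndarray) -> np.ndarray:
    """x_mix = Norm(x_s) + Norm(x_b); both clips have shape (T,)."""
    assert x_s.shape == x_b.shape, "speech and background must have T samples each"
    return normalize_volume(x_s) + normalize_volume(x_b)

# One second of placeholder audio at the 24 kHz rate used later in this example.
T = 24000
rng = np.random.default_rng(0)
x_mix = mix(rng.standard_normal(T), rng.standard_normal(T))
print(x_mix.shape)  # (24000,)
```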
2. The encoder encodes the mixed audio into continuous audio representations z, where N is the representation length. The formula is as follows:
z = Enc(x_mix);
wherein Enc(·) is the audio encoder function.
3. The output of the audio encoder is passed through a separation module Sep(·) consisting of a plurality of nonlinear layers, which outputs a hidden representation h_s containing speech and a hidden representation h_b containing background sound. The formula is as follows:
(h_s, h_b) = Sep(z);
4. The hidden representation containing speech and the hidden representation containing background sound are separately discretized by the FSQ quantizer and compressed to a bit rate level of 3 kbps. FSQ bounds each scalar so that the set of all vectors composed of such scalars is finite; the FSQ quantizer principle is shown in fig. 6. The specific formula is as follows:
X̂ = round((L-1)/2 · (tanh(X) + 1));
wherein round(·) is the rounding function, L is the discretization level, and d is the dimension of each vector in X, so that L^d is the vocabulary size of the discretized vectors. FSQ thus converts each scalar contained in the tensor X into the range 0 to L-1. This embodiment uses an input audio sampling rate of 24 kHz and sets the quantization levels of the FSQ quantizer to [8,8,8,8,5,5,5,5,5,5,5,5,5,5,5,5].
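A sketch of this quantization step with the per-dimension level vector above. The 0-to-(L-1) integer-code formulation follows the range stated in this disclosure; the straight-through gradient estimator that FSQ uses during training is omitted, and the function name is ours.

```python
import numpy as np

LEVELS = np.array([8, 8, 8, 8] + [5] * 12)  # d = 16 dimensions, as configured above

def fsq_quantize(x: np.ndarray) -> np.ndarray:
    """Quantize x of shape (N, 16): tanh bounds each scalar to (-1, 1),
    which is then shifted/scaled to (0, L_i - 1) per dimension and rounded."""
    half = (LEVELS - 1) / 2.0
    return np.round(half * (np.tanh(x) + 1.0)).astype(int)

rng = np.random.default_rng(0)
codes = fsq_quantize(rng.standard_normal((100, 16)))
print(codes.min() >= 0, (codes < LEVELS).all())  # True True

# Vocabulary size of the discrete vectors: 8^4 * 5^12 = 2^12 * 5^12 = 10^12.
vocab_size = int(np.prod(LEVELS.astype(object)))
```

Note that because the codebook is implicit (a fixed integer grid per dimension), FSQ needs no learned codebook, unlike RVQ.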
5. The compressed discrete audio representations are fed into a decoder Dec(·), which restores them to the separated multi-speaker speech signal and background sound signal, thereby achieving audio separation in the feature representation space. The formula is as follows:
x̂_s = Dec(q_s), x̂_b = Dec(q_b);
wherein q_s and q_b are the compressed discrete speech and background sound representations.
6. MFBD (multi-filter bank discriminator) and MSTFTD (multi-scale short-time Fourier transform discriminator) are adopted as the discrimination modules for evaluating the quality of the model's separation and reconstruction of audio in the representation space; this discriminator combination can evaluate the performance of the audio reconstruction task at low bit rates. Specifically, the first and second discrimination modules respectively evaluate the model's post-separation reconstruction quality of the speech and the background sound, while the third discrimination module evaluates the reconstruction quality of the whole mixed audio (comparing the predicted mixed audio with the mixed input audio), improving the separation effect and reducing information loss as much as possible.
Finally, in the audio separation process, no discrimination module is needed, and the speech quality of the model's separated output can be quantitatively evaluated with the Perceptual Evaluation of Speech Quality (PESQ) metric.
The method of this embodiment takes the Codec technology as its core and, by separating speech from background sound in the representation space, solves the problem that traditional separation methods struggle to balance separation performance and audio quality under low-bit-rate compression. Specifically, the Codec model can effectively extract compact, information-dense intermediate representations of speech and background sound, so that the separation module can accurately distinguish the different sound sources on the compressed representation, remarkably reducing the risk of information loss. Meanwhile, the scheme avoids the computational burden and separation errors caused by direct processing at the waveform or spectrum level. At a low bit rate of 3 kbps, although the PESQ of the separated speech is 1.853, slightly lower than the single-audio reconstruction result (1.901) at the same bit rate, the speech quality remains good given the complexity of the task, demonstrating the method's practicality and robustness in resource-limited scenarios.
Based on the same inventive concept, the application also provides a device corresponding to the method of the first embodiment, as detailed in the second embodiment below.
Example two
In this embodiment, a Codec-based speech and background sound separation device is provided, comprising a Codec-based separation model (for example, the model architecture shown in fig. 3). The Codec-based separation model comprises an encoder, a separation module, two audio representation quantizers and a decoder: the encoder encodes audio into continuous audio representations; the separation module separates the continuous audio representations into a hidden representation containing speech and a hidden representation containing background sound; the two audio representation quantizers respectively discretize these hidden representations and compress them to a low bit rate level, yielding a compressed discrete speech representation and a compressed discrete background sound representation; and the decoder restores the compressed discrete speech representation and the compressed discrete background sound representation into a separated speech signal and a separated background sound signal, thereby realizing audio separation in the feature representation space.
Since the device described in the second embodiment of the present invention is a device for implementing the method described in the first embodiment of the present invention, based on the method described in the first embodiment of the present invention, a person skilled in the art can understand the specific structure and the deformation of the device, and thus the detailed description thereof is omitted herein. All devices used in the method according to the first embodiment of the present invention are within the scope of the present invention.
Based on the same inventive concept, the application provides an electronic device embodiment corresponding to the first embodiment, as detailed in the third embodiment below.
Example III
The present embodiment provides an electronic device, as shown in fig. 7, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where any implementation of the first embodiment may be implemented when the processor executes the computer program.
Since the electronic device described in this embodiment is a device for implementing the method in the first embodiment of the present application, those skilled in the art will be able to understand the specific implementation of the electronic device and various modifications thereof based on the method described in the first embodiment of the present application, so how the electronic device implements the method in the embodiment of the present application will not be described in detail herein. The apparatus used to implement the methods of embodiments of the present application will be within the scope of the intended protection of the present application.
Based on the same inventive concept, the application provides a storage medium corresponding to the first embodiment, as detailed in the fourth embodiment below.
Example IV
The present embodiment provides a computer readable storage medium, as shown in fig. 8, on which a computer program is stored, which when executed by a processor, can implement any implementation of the first embodiment.
Since the computer readable storage medium described in this embodiment is the storage medium used to implement the method in the first embodiment of the present application, those skilled in the art can, based on the method described in the first embodiment, understand its specific implementation and various modifications, so how the computer readable storage medium implements the method of the embodiment of the present application is not described in detail herein. Any computer readable storage medium employed by those skilled in the art to implement the method of the embodiments of the present application falls within the intended scope of protection of the present application.
The invention provides a speech and background sound separation method for low-bit-rate environments, which extends the end-to-end neural audio codec framework (Codec) to the speech separation task for the first time. It retains the modeling advantage of neural codec models at high compression ratios while achieving cooperative modeling of the compression and separation tasks through systematic optimization of the internal modules, remarkably improving separation quality and robustness at low bit rates. Compared with the prior art, the method starts from model architecture design, creatively addresses the problems that speech and background sound are easily confused and feature expression is limited under low-bit-rate conditions, and provides a practical solution for application scenarios such as remote communication, speech recognition preprocessing and hearing aid design. The scheme has important theoretical significance and broad application prospects for the fusion of audio compression and multitask speech processing, and has remarkable innovation and promotion value. The technical scheme of the invention mainly concerns a speech and background sound separation system under low-bit-rate conditions; once the scheme is mature, its speech component service can be invoked on power-related systems.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that the specific embodiments described are illustrative only and not intended to limit the scope of the invention, and that equivalent modifications and variations of the invention in light of the spirit of the invention will be covered by the claims of the present invention.