





Technical Field
The present invention relates to voice communication technology, and in particular to a background noise encoding/decoding apparatus, method, and communication device.
Background
During voice communication, the transmitted signal can be divided into two types: active signals and inactive signals. An active signal is the speech signal sent when valid speech is detected; it mainly carries the talker's speech together with the background noise present at that time. An inactive signal is the background noise signal sent in the gaps where no valid speech is detected; it mainly consists of various kinds of background noise, such as white noise, babble, and silence. For the two communicating parties, the active signal is the useful signal and carries the main communication information, whereas the inactive signal is not useful and carries no main communication information.
Because both parties mainly care about the useful active signal, and bandwidth is under pressure, it is undesirable to transmit the inactive signal; omitting it reduces transmission bandwidth and increases system capacity. However, transmitting only the active signal makes the background noise discontinuous, which is very uncomfortable for the listener, especially when the background noise is strong, and can even make the speech hard to understand. Transmission of the inactive signal can therefore only be reduced as far as possible, lowering bandwidth while preserving the user's perceived quality. CNG (Comfort Noise Generation) technology emerged for this purpose. With CNG, the transmitting end encodes background noise frames during inactive periods using silence compression coding and transmits them using DTX (Discontinuous Transmission); the receiving end generates a continuous background noise signal by interpolation to improve listener comfort.
Figure 1 is a schematic diagram of a communication system that uses silence compression coding. The transmitting end includes:
A VAD (Voice Activity Detector). The original signal is input to the VAD after some preprocessing. The VAD typically uses a feature value of each frame, such as the signal-to-noise ratio, to detect whether the signal contains valid speech, and routes the signal to different encoders according to the detection result;
A speech encoder, which encodes the active signal containing valid speech into speech frames and outputs them; the speech frames are transmitted to the receiving end over the communication channel;
A non-speech encoder, which uses silence compression to encode the inactive signal, which contains no valid speech, into background noise frames;
A background noise frame transmitter, which transmits background noise frames to the receiving end over the communication channel using a configured DTX scheme, for example sending one background noise frame every few frames, or deciding whether to send a background noise frame according to how the characteristic parameters of the background noise change.
Correspondingly, the receiving end includes:
A speech decoder, which decodes the speech frames received from the communication channel, reconstructs them into an active signal, and outputs it;
A non-speech decoder, which decodes the background noise frames received from the communication channel;
A reconstructor, which reconstructs the decoded data into a continuous background noise signal by interpolation to improve listener comfort.
The non-speech encoder and the background noise frame transmitter at the transmitting end form the DTX unit, and the non-speech decoder and the reconstructor at the receiving end form the CNG unit. Silence compression is implemented by the VAD and DTX at the transmitting end together with CNG at the receiving end.
Take the widely used CELP (Code-Excited Linear Prediction) speech coding technique as an example. CELP is a highly efficient speech coding technique: by using perceptual weighting, analysis-by-synthesis, vector quantization, and post-filtering, it can deliver high-quality synthesized speech at low and medium bit rates. In CELP coding, the transmitting end analyzes the speech signal to obtain its characteristic parameters and encodes them into the bitstream; the receiving end synthesizes speech from the characteristic parameters in the bitstream to reconstruct the speech signal. The speech signal s(n) is divided into short time segments called frames, and both the analysis at the transmitting end and the synthesis at the receiving end are performed frame by frame.
At the transmitting end, the original signal s(n) can be regarded as the output produced when an excitation signal e(n) drives a filter v(n), that is, s(n) = e(n) * v(n), where * denotes convolution. The receiving end synthesizes speech with the same filter, which is called the synthesis filter. The characteristic parameters carried in the encoded speech bitstream therefore comprise synthesis filter parameters and excitation parameters. The synthesis filter parameters are mainly quantized LSF (Line Spectrum Frequency) parameters. The excitation parameters comprise adaptive codebook parameters (the pitch delay and pitch gain) and fixed codebook parameters (the positions and signs of the non-zero pulses in the fixed codebook, and the fixed codebook gain).
Unlike speech coding parameters, background noise coding parameters describe the characteristics of the background noise. The excitation signal of background noise can be regarded as a simple random noise sequence. Such sequences can be produced at both the encoder and the decoder with a simple random noise generator, and scaling their amplitude with an energy parameter yields the final excitation signal. The excitation can therefore be represented by an energy parameter alone, with no need for further characteristic parameters. Consequently, the background noise bitstream differs from a speech frame in that its excitation parameter is the energy parameter of the current background noise frame. It resembles a speech frame in that its synthesis filter parameters are also quantized LSF parameters, although the specific quantization method differs. From this analysis, background noise coding can be regarded as essentially a simplified form of speech coding.
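The noise-excitation idea described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the Gaussian noise source, the fixed seed, and the use of direct-form predictor coefficients (rather than quantized LSF parameters) for the synthesis filter are all hypothetical choices and are not taken from any codec discussed here.

```python
import math
import random


def comfort_noise_excitation(num_samples, target_energy, seed=0):
    """Random excitation scaled so its mean-square energy equals target_energy."""
    rng = random.Random(seed)  # assumption: any pseudo-random generator would do
    raw = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    raw_energy = sum(x * x for x in raw) / num_samples
    gain = math.sqrt(target_energy / raw_energy)
    return [gain * x for x in raw]


def synthesize(excitation, coeffs):
    """All-pole synthesis filter: s[n] = e[n] - sum_k coeffs[k] * s[n-1-k].

    coeffs are hypothetical predictor coefficients standing in for the
    decoded synthesis filter parameters.
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out
```

A decoder built this way would need only the energy value and the filter parameters from each background noise frame; the random sequence itself is never transmitted.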
Background noise frames that are compression-coded by DTX and transmitted discontinuously are generally called SID (Silence Descriptor) frames. In current speech coding systems, the background noise signal within each system uses an SID frame format and a discontinuous transmission mechanism.
The G.729B silence compression scheme is a representative example. G.729B is based on the CELP model, so the background noise parameters it transmits are also obtained from the CELP model and comprise the synthesis filter parameters and excitation parameters that describe the background noise. At the encoder, when the VAD decision is, for example, 0, the current signal is noise, and the preprocessed original signal is sent to the background noise encoding apparatus. That apparatus extracts and quantizes the background noise parameters to form an SID frame, and then decides whether to send the SID frame according to how its parameters have changed; for example, an SID frame is sent when the filter parameters and energy parameter change substantially. In G.729B, two adjacent SID frames are at least 2 frames apart. At the decoder, once the background noise decoding and reconstruction stage is entered, the background noise decoding apparatus synthesizes, from the received SID frames, comfort noise that describes the characteristics of the encoder-side background noise. The number of bits allocated to each parameter in the G.729B SID frame is shown in Table 1 below:
Table 1.
DTX in the AMR (Adaptive Multi-Rate speech codec) and AMR-WB (Adaptive Multi-Rate Wideband speech codec) silence compression schemes uses a fixed transmission mechanism: an SID frame is sent once every 8 frames, regardless of how the characteristics of the background noise change. The number of bits allocated to each parameter in the AMR SID frame is shown in Table 2 below, and that for the AMR-WB SID frame is shown in Table 3 below.
Table 2.
Table 3.
Existing background noise coding methods encode background noise in a simple way and synthesize simple comfort noise at the decoder, which satisfies listening-quality requirements to a reasonable degree. However, as communication technology develops, speech coding quality keeps rising, and with it the coding quality of the background noise during active periods. To keep the background noise continuous throughout a call, the coding quality of the background noise during inactive periods must therefore also be improved. The existing single SID frame encoding method and single discontinuous transmission mechanism clearly cannot meet the specific requirements of diverse communication environments.
Summary of the Invention
Embodiments of the present invention provide a background noise encoding/decoding apparatus, method, and communication device, so that the SID frame encoding method and the discontinuous transmission mechanism can be adjusted flexibly according to the background noise of the communication environment.
A background noise encoding apparatus, comprising:
a background noise analysis unit, configured to obtain at least one signal characteristic parameter value of a background noise signal;
a background noise classification unit, configured to determine a background noise type according to the at least one signal characteristic parameter value;
a coding switching unit and at least two discontinuous transmission units. The coding switching unit is configured to input the background noise signal to the corresponding discontinuous transmission unit according to the background noise type. Each discontinuous transmission unit is configured to encode the background noise signal into silence descriptor frames that contain discontinuous transmission unit indication information, and to transmit them discontinuously. The discontinuous transmission units differ in their silence descriptor frame encoding method and/or their silence descriptor frame discontinuous transmission mechanism. Differing in encoding method specifically means that different silence descriptor frames use different coding parameters, or that the same coding parameter is quantized with different precision in different silence descriptor frames. Differing in discontinuous transmission mechanism specifically means that the transmission interval between two adjacent transmitted silence descriptor frames is different.
A background noise decoding apparatus, comprising:
a first identification unit, configured to identify the discontinuous transmission unit indication information in a silence descriptor frame;
a decoding switching unit and at least two comfort noise generation units. The decoding switching unit is configured to input the silence descriptor frame to the corresponding comfort noise generation unit according to the identification result of the identification unit. Each comfort noise generation unit decodes the silence descriptor frame in the manner corresponding to the encoding method of the received frame, and reconstructs the background noise signal.
A communication device, comprising:
a first voice activity detector, configured to detect whether the captured signal contains valid speech and to output the signal according to that classification;
a speech encoder, configured to encode the signal containing valid speech into speech frames and output them continuously;
and further comprising a background noise encoding apparatus, which specifically comprises:
a background noise analysis unit, configured to obtain at least one signal characteristic parameter value of a background noise signal that contains no valid speech;
a background noise classification unit, configured to determine a background noise type according to the at least one signal characteristic parameter value;
a coding switching unit and at least two discontinuous transmission units. The coding switching unit is configured to input the background noise signal to the corresponding discontinuous transmission unit according to the background noise type. Each discontinuous transmission unit is configured to encode the background noise signal into silence descriptor frames that contain discontinuous transmission unit indication information, and to transmit them discontinuously. The discontinuous transmission units differ in their silence descriptor frame encoding method and/or their silence descriptor frame discontinuous transmission mechanism. Differing in encoding method specifically means that different silence descriptor frames use different coding parameters, or that the same coding parameter is quantized with different precision in different silence descriptor frames. Differing in discontinuous transmission mechanism specifically means that the transmission interval between two adjacent transmitted silence descriptor frames is different.
A communication device, comprising a background noise decoding apparatus, the background noise decoding apparatus comprising:
a first identification unit, configured to identify the discontinuous transmission unit indication information in a silence descriptor frame;
a decoding switching unit and at least two comfort noise generation units. The decoding switching unit is configured to input the silence descriptor frame to the corresponding comfort noise generation unit according to the identification result of the identification unit. Each comfort noise generation unit decodes the silence descriptor frame in the manner corresponding to the encoding method of the received frame, and reconstructs the background noise signal.
A background noise encoding method, comprising:
obtaining at least one signal characteristic parameter value of a background noise signal;
determining a background noise type according to the at least one signal characteristic parameter value;
selecting, according to the background noise type, one discontinuous transmission mechanism with which to encode the background noise signal into silence descriptor frames that contain discontinuous transmission mechanism indication information, and sending them discontinuously. The discontinuous transmission mechanisms differ in their silence descriptor frame encoding method and/or their discontinuous sending method. Differing in encoding method specifically means that different silence descriptor frames use different coding parameters, or that the same coding parameter is quantized with different precision in different silence descriptor frames. Differing in discontinuous transmission mechanism specifically means that the transmission interval between two adjacent transmitted silence descriptor frames is different.
A decoding method, comprising:
identifying the discontinuous transmission mechanism indication information in a silence descriptor frame, wherein the silence descriptor frame encoding method differs for each discontinuous transmission mechanism;
decoding the silence descriptor frame with the corresponding decoding method according to the discontinuous transmission mechanism indication information, and reconstructing it into a background noise signal.
With the technical solution of the present invention, different SID frame encoding methods and discontinuous transmission mechanisms can be applied flexibly according to the type of background noise. While making reasonable use of the communication system's actual resources, the solution adapts to the coding precision and time resolution that each type of background noise requires, so that the decoder can reproduce the various types of background noise with high quality without wasting channel resources.
Brief Description of the Drawings
Figure 1 is a schematic diagram of an existing communication system that uses silence compression coding;
Figures 2, 3, 5, and 6 are each a schematic diagram of the main structure of a background noise encoding apparatus provided by an embodiment of the present invention;
Figure 4 is a schematic diagram of the switching modes provided by an embodiment of the present invention;
Figures 7, 8, and 9 are each a schematic diagram of the main structure of a communication device provided by an embodiment of the present invention.
Detailed Description
Further analysis of existing background noise coding shows that every codec system uses a single background noise frame encoding method. Such a method cannot specifically describe the particular background noise signals of communication environments with differing characteristics, and so cannot meet the needs of diverse communication environments. A single discontinuous transmission mechanism likewise cannot transmit background noise frames in a manner tailored to their characteristics, so the decoder cannot generate well-matched comfort noise. Moreover, a single discontinuous transmission mechanism cannot adapt to the actual state of the communication system, such as the communication bandwidth, so system resources cannot be used efficiently.
The codec solution provided by the embodiments of the present invention therefore processes background noise according to its characteristics: the encoder analyzes the background noise and uses different DTX schemes to encode and send SID frames according to those characteristics, and the receiving end reconstructs the background noise signal with the corresponding CNG scheme.
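Figure 2 shows a background noise encoding apparatus 20 provided by an embodiment of the present invention. The background noise encoding apparatus 20 is used in a silence compression coding communication system, and its main structure includes:
At least two DTX units 21, comprising DTX unit 211, DTX unit 212, ..., DTX unit 21n. Each DTX unit 21 encodes the background noise signal into SID frames according to a different DTX mechanism and outputs them discontinuously. The DTX mechanisms differ in their SID frame encoding method, in their SID frame discontinuous transmission mechanism, or in both;
A background noise analysis unit 22, which computes at least one signal characteristic parameter value from the background noise signal;
A background noise classification unit 23, which determines the background noise type according to the at least one signal characteristic parameter value;
A coding switching unit 24, which inputs the received background noise signal to the corresponding DTX unit 21 according to the background noise type.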
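The background noise analysis unit 22 analyzes the input background noise signal to obtain its characteristic parameters. It does so by computing one or more signal characteristic parameter values that characterize the background noise signal. These parameters may be any one or any combination of the following: long-term SNR (Signal-to-Noise Ratio), estimated background noise level, estimated background noise energy fluctuation, estimated background noise spectral fluctuation, background noise fluctuation frequency, and so on. Other parameters that characterize the level of the background, its kind (stationary or non-stationary, magnitude of variation, speed of variation), or its SNR (long-term, short-term, peak) may also be computed and used in the analysis. In addition to the background noise signal itself, the VAD's indication that the current signal is background may also be input to the background noise analysis unit 22, which then uses this indication to trigger the computation of the signal characteristic parameter values.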
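A few of the parameters named above could be computed along the following lines. This is a rough sketch only: the function names, the use of the standard deviation of per-frame levels as the "energy fluctuation" estimate, and the assumption that a speech level estimate is supplied from elsewhere are all illustrative and not specified by the text.

```python
import math
import statistics


def frame_energy_db(frame):
    """Mean-square energy of one frame, in dB (a small floor avoids log of zero)."""
    energy = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(energy + 1e-12)


def noise_features(noise_frames, speech_level_db):
    """Hypothetical feature set for a run of background noise frames.

    speech_level_db is assumed to be a long-term speech level estimate
    obtained elsewhere, for example from frames the VAD marked as speech.
    """
    levels = [frame_energy_db(f) for f in noise_frames]
    noise_level = sum(levels) / len(levels)
    return {
        "noise_level_db": noise_level,
        "energy_fluctuation_db": statistics.pstdev(levels),
        "long_term_snr_db": speech_level_db - noise_level,
    }
```

Spectral fluctuation and fluctuation frequency would need a spectral estimate per frame and are left out of this sketch.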
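The background noise classification unit 23 classifies the current background noise signal according to the signal characteristic parameter values computed by the background noise analysis unit 22. At least two background noise types are defined. They may be defined by whether the background noise is stationary, for example dividing background noise into stationary and non-stationary background noise signals. They may also be defined by both stationarity and long-term SNR, dividing background noise into four classes: stationary high-SNR, stationary low-SNR, non-stationary high-SNR, and non-stationary low-SNR background noise signals. Background noise types may further be defined by how meaningful the information in the background noise is; for example, background noise signals may be divided into three classes: pleasant background signals (such as music, running water, or birdsong), environmental background noise signals (such as a cocktail party, a street, or an airport), and other background noise signals. In short, the classification can be set as required and does not limit the implementation of the technical solution of the present invention.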
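The four-class division by stationarity and long-term SNR might look like the sketch below. The threshold values and the label strings are purely hypothetical; the text does not give any thresholds, and a real classifier would tune them and probably add hysteresis.

```python
def classify_background_noise(energy_fluctuation_db, long_term_snr_db,
                              fluctuation_threshold_db=3.0,
                              snr_threshold_db=15.0):
    """Return one of four hypothetical background noise type labels.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    if energy_fluctuation_db < fluctuation_threshold_db:
        stationarity = "stationary"
    else:
        stationarity = "non_stationary"
    snr_class = "high_snr" if long_term_snr_db >= snr_threshold_db else "low_snr"
    return f"{stationarity}_{snr_class}"
```

The returned label is what the coding switching unit would use to pick a DTX unit.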
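The coding switching unit 24 stores in advance which DTX unit 21 applies to each background noise type. According to the background noise type determined by the background noise classification unit 23, it switches among the at least two DTX units 21 and selects the DTX unit 21 that corresponds to the current background noise type.
In general, each background noise type corresponds to a different DTX scheme. However, to account for the needs of a specific communication scenario or for changes in channel resources, different switching modes can also be defined, each corresponding to a different switching control scheme. The switching control scheme for each switching mode can be stored in the coding switching unit 24 in advance, and a switching-mode control instruction then tells the coding switching unit 24 which switching mode to use. The switching-mode control instructions can be a finite instruction set that reflects the current channel resource conditions, with each instruction corresponding to one switching mode, that is, one way of allocating DTX units. Alternatively, the switching control scheme for the required switching mode can be sent to the coding switching unit 24 together with each switching-mode control instruction.
The coding switching unit 24 determines the corresponding switching mode from the indication of the specific cause. Those skilled in the art will know many specific methods for controlling switching-mode transitions, so they are not enumerated here.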
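Accordingly, as shown in Figure 3, an instruction receiving unit 25 may additionally be provided in the background noise encoding apparatus 20 of Figure 2. It receives switching-mode control instructions and forwards them to the coding switching unit 24. A switching-mode control instruction may be entered externally by an operator, or generated by the transmitting end according to the detected channel resource conditions, among other possibilities.
The coding switching unit 24 can select the appropriate switching mode according to the switching-mode control instruction. Between any two switching modes, the DTX unit 21 assigned to at least one background noise type differs. Taking a background noise encoding apparatus 20 with four DTX units 21 as an example, the switching modes may be configured as shown in Figure 4:
When no instruction receiving unit 25 is provided, the coding switching unit 24 selects the appropriate DTX unit for the background noise according to the background noise type alone, for example using switching mode 1;
When an instruction receiving unit 25 is provided, the coding switching unit 24 determines the mode for selecting a DTX unit according to the switching-mode control instruction. For example:
On receiving switching-mode control instruction 1, it switches according to switching mode 1;
On receiving switching-mode control instruction 2, it switches according to switching mode 2;
On receiving switching-mode control instruction 3, it switches according to switching mode 3.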
Switching-mode control instructions can be issued according to the current channel resource conditions. For example, when channel resources are tight, several background noise types may share the same lower-rate DTX unit, or all background noise types may even use the lowest-rate DTX unit. As another example, only some of the DTX units may be used when there is no switching-mode control instruction, while a different subset of DTX units is used when a switching-mode control instruction indicates that channel resources are plentiful or tight.
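The example in Figure 4 shows the following:
Instruction 1 selects switching mode 1, in which each background noise type corresponds to a different DTX unit. Instruction 1 may indicate that the current channel resources are normal;
Instruction 2 selects switching mode 2, in which some background noise types share the same DTX unit: background noise types 1 and 2 both correspond to DTX unit 212, and background noise types 3 and 4 both correspond to DTX unit 214. Instruction 2 may indicate that channel resources are relatively plentiful (in which case DTX unit 212 has a higher coding rate and uses more resources than DTX unit 211, and DTX unit 214 has a higher coding rate and uses more resources than DTX unit 213), or that channel resources are relatively tight (in which case DTX unit 212 has a lower coding rate and uses fewer resources than DTX unit 211, and DTX unit 214 has a lower coding rate and uses fewer resources than DTX unit 213);
Instruction 3 selects switching mode 3, in which all background noise types correspond to DTX unit 214 only. Instruction 3 may indicate that channel resources are very plentiful (in which case DTX unit 214 has the highest coding rate and uses the most resources), or that channel resources are very tight (in which case DTX unit 214 has the lowest coding rate and uses the fewest resources).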
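The three switching modes can be expressed as a lookup table keyed by mode and background noise type. The sketch below follows the mappings stated for modes 2 and 3; the exact one-to-one assignment used for mode 1 (type i to the i-th DTX unit) is an assumption, since the text only says that each type uses a different unit. The names `SWITCHING_MODES` and `select_dtx_unit` are hypothetical.

```python
# Keys: switching mode -> {background noise type -> DTX unit reference numeral}.
SWITCHING_MODES = {
    # Mode 1: every type has its own unit (this particular pairing is assumed).
    1: {1: 211, 2: 212, 3: 213, 4: 214},
    # Mode 2: types 1 and 2 share unit 212; types 3 and 4 share unit 214.
    2: {1: 212, 2: 212, 3: 214, 4: 214},
    # Mode 3: all types use unit 214.
    3: {1: 214, 2: 214, 3: 214, 4: 214},
}


def select_dtx_unit(noise_type, mode_instruction=None):
    """Pick a DTX unit; with no instruction, fall back to switching mode 1."""
    mode = 1 if mode_instruction is None else mode_instruction
    return SWITCHING_MODES[mode][noise_type]
```

Keeping the modes in a table makes it easy to confirm that any two modes differ in the unit assigned to at least one background noise type.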
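As shown in Figure 5, when channel resources are very plentiful or the background is particularly pleasant, a speech encoder may additionally be provided in the background noise encoding apparatus 20 and used to encode specific background noise at full rate. Specific background noise is background noise whose signal characteristic parameter values satisfy a set condition: the background noise classification unit 23 judges whether at least one signal characteristic parameter value satisfies the set condition in order to identify the specific background noise type, and instructs the coding switching unit 24 to switch to the speech encoder for such noise. The speech encoder encodes the specific background noise signal into speech frames and outputs them continuously. The speech encoder already present at the transmitting end may of course be used for this function instead.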
Referring to Figure 1, each DTX unit 21 comprises two parts: a non-speech encoder, which encodes the background noise signal into SID frames, and a background noise frame transmitter, which sends the background noise frames according to a configured discontinuous scheme. DTX units may differ in their SID frame encoding method, in their discontinuous transmission mechanism, or in both. Differences in SID frame encoding method can take the following forms:
1. Different coding parameters
Different SID frames may use different kinds of parameters and different numbers of parameters;
2. Different quantization precision for the same parameter
For the same parameter, different SID frames may use different quantization precision.
Consequently, SID frames produced by different SID frame encoding methods may differ in structure and length. For example, for stationary background noise the coding parameters can be relatively simple, their quantization precision relatively low, and the SID frame relatively short. For non-stationary background noise, by contrast, there can be relatively more coding parameters. One approach is to add excitation parameters from speech coding to the existing SID frame, such as adaptive codebook parameters (pitch delay and gain) and fixed codebook parameters (positions and signs of the non-zero pulses, and gain). The quantization precision can also be relatively higher, for example by increasing the quantization precision of the filter parameters and excitation parameters in the existing SID frame, with the SID frame length increased accordingly. As another example, SID frames for pleasant background signals should use the most complex encoding method (possibly full-rate coding), SID frames for environmental background noise signals the next most complex, and SID frames for other background noise signals a simpler one.
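The effect of quantizing the same parameter with different precision can be illustrated with a plain uniform scalar quantizer. This is only a sketch: the function names, the 0 to 90 dB range used in the usage note, and the bit widths are hypothetical, and real SID frames typically use codebook-based quantizers rather than a uniform one.

```python
def quantize_uniform(value, lo, hi, bits):
    """Map value in [lo, hi] to an integer index using 2**bits levels."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize_uniform(index, lo, hi, bits):
    """Recover the reconstruction value for a quantizer index."""
    levels = (1 << bits) - 1
    return lo + index * (hi - lo) / levels
```

For instance, quantizing an energy value of 37.3 dB over an assumed 0 to 90 dB range gives a reconstruction error of roughly 1.3 dB with 3 bits and roughly 0.16 dB with 6 bits, at the cost of three more bits in the frame.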
Different background noise types can also use different discontinuous transmission mechanisms. The difference between the SID frame sending mechanisms for different background noise types lies mainly in the transmission interval between two adjacent SID frames. For stationary background noise, the background varies little and the human auditory system is not very sensitive to the quality loss that DTX/CNG introduces for this kind of noise, so a relatively long discontinuous transmission interval is sufficient. For non-stationary background noise, the background varies considerably over time, so a relatively short discontinuous transmission interval, or even continuous transmission, is needed to keep the subjective quality high after DTX/CNG. Likewise, a pleasant background is not really noise and the subjective quality requirement is very high, so the DTX system needs to transmit it continuously. Environmental background noise carries some information about the environment, so the DTX system also needs to update SID frames at a relatively high rate to avoid losing that information. For other background noise, faithfully restoring the background is not the main goal, so the DTX system can update SID frames relatively slowly.
SID帧的传输间隔分为固定传输间隔和自适应传输间隔两种。在固定传输间隔的机制下,相邻两个SID帧间的间隔是固定的,间隔的大小就取决于背景噪声的类型。在自适应传输间隔的机制下,虽然SID帧的发送是自适应于噪声变化的快慢,但也有最小传输间隔的限制。本发明实施例中,根据不同背景噪声类型可决定采用不同的最小间隔。 The transmission interval of the SID frame is divided into two types: fixed transmission interval and adaptive transmission interval. Under the mechanism of fixed transmission interval, the interval between two adjacent SID frames is fixed, and the size of the interval depends on the type of background noise. Under the mechanism of adaptive transmission interval, although the transmission of SID frame is adaptive to the speed of noise change, there is also a limit on the minimum transmission interval. In the embodiment of the present invention, different minimum intervals may be adopted according to different background noise types. the
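The fixed and adaptive interval mechanisms described above can be illustrated with a minimal Python sketch. The class name `SidScheduler`, the `noise_changed` flag, and the interval values in `INTERVAL_FRAMES` are assumptions made for illustration only; the text states that the interval depends on the noise type but does not give these values here.

```python
from dataclasses import dataclass

# Hypothetical interval table (in frames). In fixed mode the value is the
# fixed SID spacing; in adaptive mode it is the minimum SID spacing.
INTERVAL_FRAMES = {"steady": 8, "non_steady": 3}


@dataclass
class SidScheduler:
    noise_type: str          # key into INTERVAL_FRAMES
    adaptive: bool           # True: adaptive interval, False: fixed interval
    frames_since_sid: int = 0

    def should_send(self, noise_changed: bool) -> bool:
        """Decide whether a SID frame is sent for the current noise frame."""
        self.frames_since_sid += 1
        if self.frames_since_sid < INTERVAL_FRAMES[self.noise_type]:
            return False     # (minimum) spacing not reached yet
        if self.adaptive and not noise_changed:
            return False     # adaptive mode waits for a change in the noise
        self.frames_since_sid = 0
        return True
```

In fixed mode the scheduler fires on every eighth (or third) frame regardless of `noise_changed`; in adaptive mode it fires only when the noise has changed and the minimum spacing has elapsed.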
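FIG. 6 is a schematic diagram of the main structure of the first communication device provided by the present invention. The device includes a VAD 10, a background noise encoding apparatus 20, a speech encoder 30, and a communication interface 40, wherein:
The VAD 10 detects active speech. It sends the voiced signal to the speech encoder 30, which encodes it into speech frames that the communication interface transmits over the communication channel. It sends the background noise signal to the background noise encoding apparatus 20, which outputs SID frames that the communication interface transmits over the communication channel.
In the background noise encoding apparatus 20, the encoding switching unit 24 switches to the applicable DTX unit 21 according to the type of the background noise signal.
FIG. 7 is a schematic diagram of the main structure of the second communication device provided by an embodiment of the present invention, in which the encoding switching unit 24 in the background noise encoding apparatus 20 determines the switching mode according to an instruction from the instruction receiving unit 25, and switches to the applicable DTX unit 21 according to the type of the background noise signal and the switching mode currently in use.
Referring to FIG. 5, a speech encoder may also be provided in the background noise encoding apparatus 20 of the communication devices shown in FIG. 6 and FIG. 7; this speech encoder performs full-rate encoding of the background noise.
As shown in FIG. 8, if the existing speech encoder 30 is used for full-rate encoding of the background noise, the encoding switching unit 24 is connected to the speech encoder 30 and sends the background noise signal to it, where the signal is encoded into full-rate speech frames and output continuously.
With the technical solution provided by the embodiments of the present invention, DTX unit indication information must be set in the SID frame for each SID frame encoding and transmission mode; the decoding end uses this indication information to determine the corresponding decoding and reconstruction scheme. Speech frames are decoded by the speech decoder and output continuously.
Because part of the calculation that the background noise analysis unit 22 performs to obtain the background feature parameter values is the same as part of the calculation that each DTX unit 21 performs when encoding SID frames, the background noise analysis unit 22 may also be connected to each DTX unit 21 and output its analysis results to them, which reduces some of the computation in the DTX units 21.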
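FIG. 9 is a schematic structural diagram of a communication device provided by an embodiment of the present invention. The units that implement the receiving-end functions mainly include a background noise decoding apparatus 50, an identification unit 60, a decoding switching unit 70, a speech decoder 80, and a communication interface, wherein:
The background noise decoding apparatus 50 includes at least two CNG units (51, 52 ... 5n), each of which decodes SID frames according to one preset decoding and reconstruction scheme and reconstructs them into background noise;
The identification unit 60 identifies the speech frames and SID frames received through the communication interface, and further identifies the DTX unit indication information in each SID frame;
The decoding switching unit 70 switches according to the identification result of the identification unit 60, sending speech frames to the speech decoder 80 and SID frames to the corresponding CNG unit for decoding and reconstruction.
Referring to FIG. 1, each CNG unit includes a non-speech decoder, which decodes the background noise frames received from the communication channel, and a reconstructor, which reconstructs the decoded data into a continuous background noise signal by interpolation to make listening more comfortable.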
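The identification unit 60 first identifies whether a received encoded frame is a full-rate speech frame or a SID frame; if it is a SID frame, the unit further identifies the DTX unit indication information in it. The SID frame bitstream should contain one or more bits that carry the DTX unit indication information. These bits may be located at the beginning or the end of the SID frame, or at another specified position in it, for example as shown in Table 3 or Table 4 below:
Table 3.
Table 4.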
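The placement of the indication bits at the beginning or the end of the SID frame can be sketched as follows. This is only an illustration under stated assumptions: frames are modeled as strings of '0'/'1' characters, and the function names `pack_sid` and `unpack_sid` and their parameters are hypothetical, not part of the described apparatus.

```python
def pack_sid(payload_bits: str, indicator: int, width: int = 1,
             at_head: bool = True) -> str:
    """Attach a DTX unit indicator of `width` bits to a SID payload."""
    flag = format(indicator, f"0{width}b")
    if len(flag) != width:
        raise ValueError("indicator does not fit in the given width")
    return flag + payload_bits if at_head else payload_bits + flag


def unpack_sid(frame_bits: str, width: int = 1,
               at_head: bool = True) -> tuple[int, str]:
    """Split a SID frame into (indicator value, payload bits)."""
    if at_head:
        flag, payload = frame_bits[:width], frame_bits[width:]
    else:
        flag, payload = frame_bits[-width:], frame_bits[:-width]
    return int(flag, 2), payload
```

Encoder and decoder must agree on `width` and on the position of the indicator, since the payload layout can only be interpreted after the indicator has been read.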
The identification unit 60 may include two identification subunits. One subunit identifies whether a received encoded frame is a full-rate speech frame or a SID frame. If it identifies a speech frame, it notifies the decoding switching unit 70, which sends the speech frame to the speech decoder 80 to be decoded and output continuously. If it identifies a SID frame, it sends the SID frame to the other subunit, which identifies the DTX unit indication information in the SID frame.
Because different background noise types have different SID frame formats or update mechanisms, the appropriate decoding scheme is selected according to the DTX unit indication information to decode the SID frame. The noise interpolation method differs between decoding schemes, and the result is the recovered comfort noise.
A communication device can have both transmitting-end and receiving-end functions; the structure shown in FIG. 9 is usually combined with the structure shown in FIG. 7 or FIG. 8 in one communication device.
Specific embodiments are described in detail below.
Embodiment 1
At the encoding end, the background noise signal is input frame by frame to the background noise analysis unit 22, which calculates the subband levels level[k,i] of each current signal frame, where level[k,i] denotes the level of the k-th subband of the i-th frame, and stores them in memory. The subbands can be computed either with a filter bank or with a transform.
Once enough signal frames have accumulated, the background noise analysis unit 22 begins to calculate the background noise spectral fluctuation parameter spec_var[i] from the subband levels stored in memory, and thereafter updates the result on receipt of each background noise frame.
The background noise spectral fluctuation parameter spec_var[i] is calculated as:
where N is the number of subbands and L is the number of background noise signal frames selected for long-term averaging. The background noise spectral fluctuation parameter spec_var[i] can also be calculated from the LSF coefficients, namely:
where lsf[k,i] denotes the k-th LSF coefficient of the i-th frame, N is the number of LSF coefficients, and L is the number of background noise signal frames selected for long-term averaging.
The background noise analysis unit 22 outputs the calculated background noise spectral fluctuation parameter to the background noise classification unit 23.
The encoding switching unit 24 selects the corresponding DTX unit for the current background noise frame according to the background noise type information output by the background noise classification unit 23.
In this embodiment, two background types are defined: steady-state background noise signals and non-steady-state background noise signals. The background noise classification unit 23 compares the background noise spectral fluctuation parameter with a threshold spec_var_thr to classify the current background noise as either steady-state or non-steady-state. If spec_var[i] < spec_var_thr, the current background noise signal is classified as a steady-state background noise signal; otherwise it is classified as a non-steady-state background noise signal.
In this embodiment, the DTX system has two different working schemes:
Scheme 1 uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; the SID frame contains only filter parameters and energy parameters;
Scheme 2 uses a discontinuous transmission mechanism that sends one SID frame every 3 frames; in addition to filter parameters and energy parameters, the SID frame also contains fixed codebook parameters (which are excitation parameters).
When the current background noise is steady-state, the encoding switching unit 24 selects working scheme 1. When the current background noise is non-steady-state, the encoding switching unit 24 selects working scheme 2.
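The classification and scheme selection of Embodiment 1 can be sketched in a few lines of Python. The type `DtxScheme`, the field names, and the function `select_scheme` are hypothetical names chosen for illustration; the threshold value is left to the caller because the text does not specify one.

```python
from typing import NamedTuple


class DtxScheme(NamedTuple):
    sid_interval: int         # one SID frame is sent every `sid_interval` frames
    params: tuple[str, ...]   # parameter groups carried in the SID frame


SCHEME_1 = DtxScheme(8, ("filter", "energy"))
SCHEME_2 = DtxScheme(3, ("filter", "energy", "fixed_codebook"))


def select_scheme(spec_var: float, spec_var_thr: float) -> DtxScheme:
    """Steady-state noise (spec_var < threshold) uses scheme 1, else scheme 2."""
    return SCHEME_1 if spec_var < spec_var_thr else SCHEME_2
```

As in the text, a value equal to the threshold falls into the "otherwise" branch and is treated as non-steady-state.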
When each SID frame is assembled, DTX unit indication information is set in the frame to declare which DTX unit the SID frame belongs to. In this embodiment, the DTX unit indication information can be represented with 1 bit, where 0 and 1 denote DTX unit 1 and DTX unit 2 respectively.
At the decoding end, the identification unit 60 first determines, from the DTX unit indication information in the SID frame, which DTX unit the SID frame belongs to, and the decoding switching unit 70 sends the SID frame to the corresponding CNG unit of the background noise decoding apparatus 50 for decoding and reconstruction, wherein:
If the DTX unit indication information is 0, the corresponding CNG unit decodes the SID frame according to the SID frame format of DTX working scheme 1 above to obtain the filter parameters and energy parameters, and interpolates over the following 8 background noise signal frames;
If the DTX unit indication information is 1, the corresponding CNG unit decodes the SID frame according to the SID frame format of DTX working scheme 2 above to obtain the filter parameters, energy parameters, and fixed codebook parameters, and interpolates over the following 3 background noise signal frames.
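The text does not specify the interpolation method used by the CNG units. As one possible illustration, the sketch below assumes simple linear interpolation of decoded parameter values between the previous and the newly received SID frame; the names `interpolate_params`, `reconstruct`, and `INTERP_FRAMES` are hypothetical.

```python
# Number of frames to interpolate over, keyed by the 1-bit DTX indicator
# of Embodiment 1 (0: scheme 1, 8 frames; 1: scheme 2, 3 frames).
INTERP_FRAMES = {0: 8, 1: 3}


def interpolate_params(prev: list[float], new: list[float],
                       n_frames: int) -> list[list[float]]:
    """Move each parameter linearly from `prev` to `new` over n_frames frames.

    Linear interpolation is an assumption of this sketch, not a method
    stated in the text.
    """
    frames = []
    for j in range(1, n_frames + 1):
        w = j / n_frames
        frames.append([(1 - w) * p + w * q for p, q in zip(prev, new)])
    return frames


def reconstruct(indicator: int, prev: list[float],
                new: list[float]) -> list[list[float]]:
    """Pick the interpolation span from the indicator, then interpolate."""
    return interpolate_params(prev, new, INTERP_FRAMES[indicator])
```

A real CNG unit would interpolate in a domain suited to each parameter (for example LSF values for the filter and a logarithmic domain for energy); the sketch treats all parameters as plain numbers.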
Embodiment 2
At the encoding end, the background noise signal is input frame by frame to the background noise analysis unit 22, which calculates the subband levels level[k,i], the frame energy pow[i], and the short-term signal-to-noise ratio snr[i] of the current background noise signal frame and stores them in memory, wherein:
The subband levels level[k,i] can be computed either with a filter bank or with a transform;
The frame energy is calculated as:
The short-term signal-to-noise ratio is calculated as:
Once enough background noise signal frames have accumulated, the background noise analysis unit 22 begins to calculate, from the short-term feature parameters stored in memory (subband levels, frame energy, and short-term SNR), the background noise energy fluctuation pow_var[i], the long-term SNR snr_long[i], and the long-term background noise energy estimate bckr_noise_pow[i], and updates the results on receipt of each background noise signal frame.
Except for the long-term SNR, these parameters are updated only when the current signal frame is a background noise signal frame; the long-term SNR is updated only when the current signal frame is not a background noise signal frame.
The long-term SNR snr_long[i] is calculated as:
The background noise energy fluctuation pow_var[i] is calculated as:
The long-term background noise energy estimate bckr_noise_pow[i] is calculated as:
bckr_noise_pow[i] = (1-α)*bckr_noise_pow[i-1] + α*pow[i], where α is a scaling factor between 0 and 1, set to 5% (0.05) in this embodiment.
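The recursive estimate above is a first-order exponential smoother. A minimal Python sketch follows; the function name `update_noise_power` is hypothetical, and the initial value of the estimate is left to the caller because the text does not state it.

```python
def update_noise_power(prev_estimate: float, frame_power: float,
                       alpha: float = 0.05) -> float:
    """One update step of bckr_noise_pow with smoothing factor alpha."""
    return (1 - alpha) * prev_estimate + alpha * frame_power


def track_noise_power(frame_powers: list[float], initial: float,
                      alpha: float = 0.05) -> float:
    """Run the update over a sequence of background noise frame energies."""
    estimate = initial
    for p in frame_powers:
        estimate = update_noise_power(estimate, p, alpha)
    return estimate
```

With α = 0.05 each new frame contributes 5% of the estimate, so a step change in noise energy is followed gradually over several tens of frames rather than immediately.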
The background noise analysis unit 22 outputs the calculated background noise energy fluctuation pow_var[i] and long-term SNR snr_long[i] to the background noise classification unit 23.
In this embodiment, four background types are defined: steady-state large-SNR, steady-state small-SNR, non-steady-state large-SNR, and non-steady-state small-SNR background noise signals.
The background noise classification unit 23 compares pow_var[i] and snr_long[i] with an energy fluctuation threshold pow_var_thr and an SNR threshold snr_thr respectively, and classifies the current background noise as one of the four background noise types, wherein:
When pow_var[i] > pow_var_thr and snr_long[i] > snr_thr, the current background is a non-steady-state large-SNR background noise signal;
When pow_var[i] > pow_var_thr and snr_long[i] < snr_thr, the current background is a non-steady-state small-SNR background noise signal;
When pow_var[i] < pow_var_thr and snr_long[i] > snr_thr, the current background is a steady-state large-SNR background noise signal;
When pow_var[i] < pow_var_thr and snr_long[i] < snr_thr, the current background is a steady-state small-SNR background noise signal.
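The four-way classification can be sketched as two independent threshold tests. The function name `classify_noise` and the returned label strings are hypothetical. The text does not say how values exactly equal to a threshold are handled; this sketch assigns them to the non-steady-state and small-SNR classes as an assumption.

```python
def classify_noise(pow_var: float, snr_long: float,
                   pow_var_thr: float, snr_thr: float) -> str:
    """Map (energy fluctuation, long-term SNR) to one of four noise types."""
    # Ties are an assumption of this sketch: equality counts as
    # non-steady-state and as small SNR.
    stability = "steady" if pow_var < pow_var_thr else "non_steady"
    snr_class = "large_snr" if snr_long > snr_thr else "small_snr"
    return f"{stability}_{snr_class}"
```

For example, `classify_noise(0.1, 20.0, 0.5, 15.0)` gives `"steady_large_snr"`.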
In this embodiment, there are four different DTX units, wherein:
DTX unit 1 uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; the SID frame contains only filter parameters and energy parameters;
DTX unit 2 uses a discontinuous transmission mechanism that sends one SID frame every 8 frames; in addition to filter parameters and energy parameters, the SID frame also contains fixed codebook parameters;
DTX unit 3 uses a discontinuous transmission mechanism that sends one SID frame every 3 frames; in addition to filter parameters and energy parameters, the SID frame also contains fixed codebook parameters;
DTX unit 4 uses a continuous transmission mechanism with full-rate encoding.
In addition to receiving the background noise type information from the background noise classification unit 23, the encoding switching unit 24 is connected to a switching mode control instruction interface. The switching mode control instruction has five possible indications: normal bandwidth, more bandwidth, less bandwidth, abundant bandwidth, and very little bandwidth.
The switching mode control instruction maps to the switching mode of the DTX system as follows:
When the instruction indicates normal bandwidth, the switching mode is: DTX unit 1 for steady-state large-SNR backgrounds; DTX unit 2 for steady-state small-SNR backgrounds; DTX unit 3 for non-steady-state large-SNR backgrounds; DTX unit 4 for non-steady-state small-SNR backgrounds;
When the instruction indicates more bandwidth, the switching mode is: DTX unit 2 for both steady-state large-SNR and steady-state small-SNR backgrounds; DTX unit 4 for both non-steady-state large-SNR and non-steady-state small-SNR backgrounds;
When the instruction indicates less bandwidth, the switching mode is: DTX unit 1 for both steady-state large-SNR and steady-state small-SNR backgrounds; DTX unit 3 for both non-steady-state large-SNR and non-steady-state small-SNR backgrounds;
When the instruction indicates abundant bandwidth, the switching mode is: DTX unit 4 for all four background noise types;
When the instruction indicates very little bandwidth, the switching mode is: DTX unit 1 for all four background noise types.
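The five switching modes amount to a lookup table from (bandwidth indication, noise type) to DTX unit number. The sketch below encodes the mapping stated above; the dictionary keys (`"normal"`, `"more"`, and so on) and the function name `select_dtx_unit` are hypothetical labels introduced for illustration.

```python
NOISE_TYPES = ("steady_large_snr", "steady_small_snr",
               "non_steady_large_snr", "non_steady_small_snr")

# DTX unit per noise type, in the order of NOISE_TYPES, for each
# bandwidth indication of the switching mode control instruction.
SWITCHING_MODES = {
    "normal":      (1, 2, 3, 4),
    "more":        (2, 2, 4, 4),
    "less":        (1, 1, 3, 3),
    "abundant":    (4, 4, 4, 4),
    "very_little": (1, 1, 1, 1),
}


def select_dtx_unit(bandwidth: str, noise_type: str) -> int:
    """Return the DTX unit number for a bandwidth indication and noise type."""
    return SWITCHING_MODES[bandwidth][NOISE_TYPES.index(noise_type)]
```

Keeping the policy in a table rather than in branching code makes it straightforward to add or change switching modes without touching the selection logic.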
When each SID frame is assembled, DTX unit indication information is set in the frame to declare which DTX unit the SID frame belongs to. In this embodiment, the DTX unit indication information can be represented with 2 bits, where 00, 01, and 10 denote DTX unit 1, DTX unit 2, and DTX unit 3 respectively. DTX unit 4 does not need to be flagged because it uses full-rate encoding.
At the decoding end, the identification unit 60 first determines, from the DTX unit indication information in the SID frame, which DTX unit the SID frame belongs to, and instructs the corresponding CNG unit to decode and reconstruct it, wherein:
If the DTX unit indication information is 00, the SID frame is decoded according to the SID frame format of DTX unit 1 above to obtain the filter parameters and energy parameters, and interpolation is performed over the following 8 background noise signal frames;
If the DTX unit indication information is 01, the SID frame is decoded according to the SID frame format of DTX unit 2 above to obtain the fixed codebook (excitation) parameters, filter parameters, and energy parameters, and interpolation is performed over the following 8 background noise signal frames;
If the DTX unit indication information is 10, the SID frame is decoded according to the SID frame format of DTX unit 3 above to obtain the filter parameters, energy parameters, and fixed codebook parameters, and interpolation is performed over the following 3 background noise signal frames.
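The decoder-side dispatch on the 2-bit indication can be sketched as a table lookup. The names `SID_FORMATS` and `decode_plan` are hypothetical. Treating the unused code 11 as an error is an assumption of this sketch, since the text leaves that code undefined.

```python
# indicator value -> (parameter groups in the SID frame, frames to interpolate)
SID_FORMATS = {
    0b00: (("filter", "energy"), 8),                     # DTX unit 1
    0b01: (("filter", "energy", "fixed_codebook"), 8),   # DTX unit 2
    0b10: (("filter", "energy", "fixed_codebook"), 3),   # DTX unit 3
}


def decode_plan(indicator_bits: str) -> tuple[tuple[str, ...], int]:
    """Return the parameter layout and interpolation span for a SID frame."""
    key = int(indicator_bits, 2)
    if key not in SID_FORMATS:
        # Assumption: code 11 is unused because DTX unit 4 sends
        # full-rate speech frames rather than SID frames.
        raise ValueError(f"unknown DTX unit indicator: {indicator_bits}")
    return SID_FORMATS[key]
```

For example, `decode_plan("10")` yields the three parameter groups and an interpolation span of 3 frames.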
Those skilled in the art can flexibly define background noise types based on signal feature parameters and set switching modes accordingly; the possibilities are not enumerated here.
In summary, an embodiment of the present invention provides an encoding method, including:
obtaining at least one signal feature parameter value of the background noise signal to be transmitted;
determining the background noise type according to the at least one signal feature parameter value;
selecting a DTX mechanism according to the background noise type, encoding the background noise signal into SID frames that contain DTX mechanism indication information, and transmitting the SID frames discontinuously, wherein the DTX mechanisms differ from one another in SID frame encoding method and/or SID frame discontinuous transmission method.
Further, the encoding method also includes: setting different DTX mechanism selection modes, that is, different switching modes, where for any two DTX mechanism selection modes at least one background noise type corresponds to a different DTX mechanism; and determining the DTX mechanism for each background noise type according to the selected DTX mechanism selection mode.
Still further, the method also includes: when the at least one signal feature parameter value equals a set value, encoding the background noise signal into full-rate speech frames and transmitting them continuously.
The signal feature parameter values include one or any combination of the following: long-term SNR, background noise level estimate, background noise energy fluctuation estimate, background noise spectral fluctuation estimate, and background noise fluctuation frequency.
An embodiment of the present invention also provides a method for decoding the corresponding SID frames, including:
identifying the DTX mechanism indication information in the SID frame, wherein each DTX mechanism uses a different SID frame encoding method;
decoding the SID frame with the decoding method that corresponds to the DTX mechanism indication information, and reconstructing it into a background noise signal.
With the technical solutions proposed in the embodiments of the present invention, different SID frame encoding methods and discontinuous transmission mechanisms can be used according to the type of background noise and the actual state of the communication system. The encoding precision and time resolution adapt to what each type of background noise requires while the actual resources of the communication system are used sensibly, so that the decoding end can provide all types of background noise with high quality without wasting channel resources.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 200710110573CN101320563B (en) | 2007-06-05 | 2007-06-05 | Background noise encoding/decoding device, method and communication equipment |
| PCT/CN2008/070886WO2008148321A1 (en) | 2007-06-05 | 2008-05-06 | An encoding or decoding apparatus and method for background noise, and a communication device using the same |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201210194567.8ADivisionCN102760441B (en) | 2007-06-05 | 2007-06-05 | Background noise coding/decoding device and method as well as communication equipment |
| Publication Number | Publication Date |
|---|---|
| CN101320563A CN101320563A (en) | 2008-12-10 |
| CN101320563Btrue CN101320563B (en) | 2012-06-27 |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN 200710110573ActiveCN101320563B (en) | 2007-06-05 | 2007-06-05 | Background noise encoding/decoding device, method and communication equipment |
| Country | Link |
|---|---|
| CN (1) | CN101320563B (en) |
| WO (1) | WO2008148321A1 (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102760441A (en)* | 2007-06-05 | 2012-10-31 | 华为技术有限公司 | Background noise coding/decoding device and method as well as communication equipment |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102044241B (en)* | 2009-10-15 | 2012-04-04 | 华为技术有限公司 | Method and device for tracking background noise in communication system |
| CN103187065B (en) | 2011-12-30 | 2015-12-16 | 华为技术有限公司 | The disposal route of voice data, device and system |
| US9179404B2 (en)* | 2013-03-25 | 2015-11-03 | Qualcomm Incorporated | Method and apparatus for UE-only discontinuous-TX smart blanking |
| CN106169297B (en) | 2013-05-30 | 2019-04-19 | 华为技术有限公司 | Signal coding method and device |
| CN104378774A (en)* | 2013-08-15 | 2015-02-25 | 中兴通讯股份有限公司 | Voice quality processing method and device |
| CN103617797A (en) | 2013-12-09 | 2014-03-05 | 腾讯科技(深圳)有限公司 | Voice processing method and device |
| CN110097892B (en) | 2014-06-03 | 2022-05-10 | 华为技术有限公司 | Voice frequency signal processing method and device |
| EP2980790A1 (en)* | 2014-07-28 | 2016-02-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for comfort noise generation mode selection |
| CN105356886B (en)* | 2015-06-26 | 2021-03-26 | 深圳市国电科技通信有限公司 | Power line noise compression method and device based on compressed sensing |
| CN117392988A (en) | 2016-09-28 | 2024-01-12 | 华为技术有限公司 | A method, device and system for processing multi-channel audio signals |
| CN110634469B (en)* | 2019-09-27 | 2022-03-11 | 腾讯科技(深圳)有限公司 | Speech signal processing method and device based on artificial intelligence and storage medium |
| CN111586245B (en)* | 2020-04-07 | 2021-12-10 | 深圳震有科技股份有限公司 | Transmission control method of mute packet, electronic device and storage medium |
| CN112037803B (en)* | 2020-05-08 | 2023-09-29 | 珠海市杰理科技股份有限公司 | Audio encoding method and device, electronic equipment and storage medium |
| CN111783434B (en)* | 2020-07-10 | 2023-06-23 | 思必驰科技股份有限公司 | Method and system for improving anti-noise ability of reply generation model |
| CN114448957B (en)* | 2022-01-28 | 2024-03-29 | 上海小度技术有限公司 | Audio data transmission method and device |
| CN118055442B (en)* | 2024-04-11 | 2024-06-25 | 中国电子科技集团公司第五十四研究所 | Discontinuous service transmission method for mobile phone direct connection satellite communication system |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6144936A (en)* | 1994-12-05 | 2000-11-07 | Nokia Telecommunications Oy | Method for substituting bad speech frames in a digital communication system |
| CN1327574A (en)* | 1998-12-07 | 2001-12-19 | 三菱电机株式会社 | Sound decoding device and sound decoding method |
| CN1367918A (en)* | 1999-06-07 | 2002-09-04 | 艾利森公司 | Methods and apparatus for generating comfort noise using parametric noise model statistics |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW376611B (en)* | 1998-05-26 | 1999-12-11 | Koninkl Philips Electronics Nv | Transmission system with improved speech encoder |
| US6275798B1 (en)* | 1998-09-16 | 2001-08-14 | Telefonaktiebolaget L M Ericsson | Speech coding with improved background noise reproduction |
| US6691084B2 (en)* | 1998-12-21 | 2004-02-10 | Qualcomm Incorporated | Multiple mode variable rate speech coding |
| US6631352B1 (en)* | 1999-01-08 | 2003-10-07 | Matushita Electric Industrial Co. Ltd. | Decoding circuit and reproduction apparatus which mutes audio after header parameter changes |
| CN1275223C (en)* | 2004-12-31 | 2006-09-13 | 苏州大学 | A low bit-rate speech coder |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102760441A (en)* | 2007-06-05 | 2012-10-31 | 华为技术有限公司 | Background noise coding/decoding device and method as well as communication equipment |
| CN102760441B (en)* | 2007-06-05 | 2014-03-12 | 华为技术有限公司 | Background noise coding/decoding device and method as well as communication equipment |
| Publication number | Publication date |
|---|---|
| CN101320563A (en) | 2008-12-10 |
| WO2008148321A1 (en) | 2008-12-11 |
| Publication | Publication Date | Title |
|---|---|---|
| CN101320563B (en) | Background noise encoding/decoding device, method and communication equipment | |
| KR102677745B1 (en) | Method and system for encoding a stereo sound signal using coding parameters of the primary channel to encode the secondary channel | |
| CN101335000B (en) | Coding method and device | |
| KR100883656B1 (en) | Method and apparatus for classifying audio signals and method and apparatus for encoding / decoding audio signals using the same | |
| JP4444749B2 (en) | Method and apparatus for performing reduced rate, variable rate speech analysis synthesis | |
| US7203638B2 (en) | Method for interoperation between adaptive multi-rate wideband (AMR-WB) and multi-mode variable bit-rate wideband (VMR-WB) codecs | |
| KR100964402B1 (en) | Method and apparatus for determining encoding mode of audio signal and method and apparatus for encoding / decoding audio signal using same | |
| CN1954367B (en) | Support for switching between audio encoder modes | |
| CN1828722B (en) | Complex signal activated detection for improved speech/noise classification of an audio signal | |
| KR101771828B1 (en) | Audio Encoder, Audio Decoder, Method for Providing an Encoded Audio Information, Method for Providing a Decoded Audio Information, Computer Program and Encoded Representation Using a Signal-Adaptive Bandwidth Extension | |
| KR101698905B1 (en) | Apparatus and method for encoding and decoding an audio signal using an aligned look-ahead portion | |
| US20050177364A1 (en) | Methods and devices for source controlled variable bit-rate wideband speech coding | |
| CN101087319B (en) | A method and device for sending and receiving background noise and silence compression system | |
| KR20080093074A (en) | Classification of Audio Signals | |
| EP1747442A1 (en) | Selection of coding models for encoding an audio signal | |
| CN103050122B (en) | MELP-based (Mixed Excitation Linear Prediction-based) multi-frame joint quantization low-rate speech coding and decoding method | |
| CN108231083A (en) | A kind of speech coder code efficiency based on SILK improves method | |
| KR20100095509A (en) | A method and an apparatus for processing a signal | |
| CN103680509A (en) | Method for discontinuous transmission of voice signals and generation of background noise | |
| CN101609681B (en) | Encoding method, encoder, decoding method and decoder | |
| CN102760441A (en) | Background noise coding/decoding device and method as well as communication equipment | |
| CN101170590B (en) | A method, system and device for background noise coded stream transmission | |
| CN101651752B (en) | Decoding method and decoding device | |
| CN102254562B (en) | Method for coding variable speed audio frequency switching between adjacent high/low speed coding modes | |
| CN1275223C (en) | A low bit-rate speech coder |
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C14 | Grant of patent or utility model | ||
| GR01 | Patent grant |