Technical Field
The present invention relates to the field of communications, and in particular to a method and device for repairing speech frames based on source prior information.
Background
With the development of communication technology, much voice communication is now connected and transmitted over Internet Protocol (IP) networks and wireless networks, where the probability of packet loss is higher. A common problem with many speech recognition systems is accuracy: a user speaks to a speech recognizer and the system responds with recognized text, but that text often contains many errors because the recognizer fails to recognize the user's utterance correctly.
At present, digital voice communication technology covers almost every field of communications. The development of computers has provided powerful tools for research on digital voice communication, and the advent of large-scale and very-large-scale integrated circuits has provided the basis for implementing speech coding. Obtaining high-quality speech coding at medium and low bit rates (below 16 kb/s) has always been the main goal of speech coding research.
In medium- and low-rate speech coding, a vocoder based on linear prediction analyzes the speech signal at the transmitting end and extracts its characteristic parameters, mainly excitation parameters, vocal tract parameters, and energy parameters. The parameters are then quantized and encoded, and after transmission over the channel the receiving end reconstructs the speech signal from the received parameters.
With the rapid development of mobile communication and the mobile Internet and the continuing increase in network transmission rates, voice communication has made great progress in convenience and mobility, so mobile and Internet voice communication account for a growing share of traffic. Some problems nevertheless cannot be ignored. On the one hand, mobile networks are inherently unstable; on the other hand, Internet voice requires low latency, while its requirement on error rate is comparatively relaxed. As a result, errors arise in the speech data during channel transmission, the decoder produces various distortions when synthesizing speech, and the quality of voice communication fluctuates widely. Previous methods address this mainly by frame dropping or error concealment (interpolation): at the receiving end, the parameter values of the current frame are interpolated from the corresponding parameter values of the frame preceding the lost speech frame, or of the preceding and following frames. Such methods meet the basic requirements to a certain extent.
However, these methods do not make effective use of the correlation between adjacent speech frames or of the slowly varying nature of the vocal tract, and the resulting speech quality is still inadequate.
Summary of the Invention
The object of the present invention is to overcome the above defects of the prior art by providing a method and device for repairing speech frames based on source prior information. In vocoders of the linear prediction class, the key parameters (vocal tract parameters and energy parameters) are strongly continuous. The present invention uses prior knowledge of the source together with the erroneous frame at the receiving end to perform error-tolerant decoding of that frame, thereby improving the quality of the synthesized speech.
The object of the present invention can be achieved by the following technical solutions:
A first object of the present invention is to provide a method for repairing speech frames based on source prior information, comprising the following steps:
detecting a damaged speech frame;
determining, according to the source prior information, the positions of the damaged speech parameters in the damaged speech frame;
repairing the damaged speech parameters according to the source prior information;
wherein the source prior information includes several frames preceding the current frame, several frames following the current frame, the speech parameters of each preceding frame and/or the speech parameters of each following frame;
the speech parameters include vocal tract parameters and/or energy parameters.
Detecting the damaged speech frame specifically comprises:
calculating the CRC value of the current frame;
determining whether the CRC value equals the check value; if so, the current frame is undamaged, and if not, the current frame is a damaged speech frame.
Alternatively, detecting the damaged speech frame specifically comprises:
obtaining the source prior information, and calculating from it the conditional probability of a given speech parameter vector of the current frame;
determining whether that conditional probability is greater than or equal to the detection threshold of the corresponding speech parameter vector; if so, the current frame is undamaged, and if not, the current frame is a damaged speech frame.
Determining the positions of the damaged speech parameters in the damaged speech frame specifically comprises:
calculating the conditional probability of each corresponding speech parameter of the current frame from a given speech parameter vector of several preceding frames and/or several following frames;
determining in turn whether the conditional probability of each speech parameter is greater than or equal to the detection threshold of that speech parameter; if so, the speech parameter at that position is undamaged, and if not, the speech parameter at that position is a damaged speech parameter.
Alternatively, determining the positions of the damaged speech parameters in the damaged speech frame specifically comprises:
calculating the conditional probability of the i-th corresponding speech parameter of the current frame from the i-th value of a given speech parameter of several preceding frames and/or several following frames;
determining in turn whether the conditional probability of each speech parameter is greater than or equal to the detection threshold of that speech parameter; if so, the speech parameter at that position is undamaged, and if not, the speech parameter at that position is a damaged speech parameter.
Repairing the damaged speech parameters specifically comprises:
recovering, according to the maximum prior probability criterion, the corresponding damaged speech parameters of the damaged frame from the speech parameter vectors of several preceding frames and/or several following frames.
Alternatively, repairing the damaged speech parameters specifically comprises:
recovering, according to the maximum prior probability criterion, the i-th damaged speech parameter of the damaged frame from the i-th speech parameter of several preceding frames and/or several following frames.
A second object of the present invention is to provide an error-tolerant speech decoding method located at the decoding end of a vocoder, which includes the above method for repairing speech frames based on source prior information.
A third object of the present invention is to provide a device for repairing speech frames based on source prior information, comprising:
a detection module, configured to detect whether a damaged speech frame exists in the current sequence of speech frames and to determine the positions of the damaged speech parameters;
a repair module, configured to perform bit-level repair of the damaged speech parameters in the damaged speech frame according to the source prior information, the damaged speech parameters including damaged vocal tract parameters and/or damaged energy parameters.
The detection module comprises:
a damaged-frame determination unit, configured to detect whether a damaged speech frame exists according to the check value or the continuity of the speech parameters;
a parameter position determination unit, configured to determine, according to the source prior information, the positions of the damaged speech parameters in the damaged speech frame.
Compared with the prior art, the present invention has the following beneficial effects:
The present invention repairs damaged frames on the basis of source prior information, which avoids discarding erroneous frames entirely as in the prior art. By making the fullest use of the source prior information to repair the vocal tract parameters and energy parameters in a speech frame, it improves speech quality and user experience.
The present invention can be used for voice communication in the boundary regions between cells of a cellular network: by repairing damaged speech frames it reduces the probability of voice interruption and improves the quality of voice communication and user experience.
The present invention can also be used for voice communication in remote areas, where cells are relatively large and data transmission rates are low. With the present invention, the cell area can be enlarged without degrading voice communication quality, which reduces the cost of building base stations.
The present invention is fully compatible with existing speech decoding technology. Adding this error-tolerant speech decoding algorithm to a communication terminal is enough to improve voice communication quality, with no change to existing communication network equipment, so it is a low-cost, high-benefit solution with considerable market value.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of the repair method of the present invention;
Fig. 2 is a schematic flowchart of one procedure for detecting damaged speech frames according to the present invention;
Fig. 3 is a schematic flowchart of another procedure for detecting damaged speech frames according to the present invention;
Fig. 4 is a flowchart of one procedure for determining the positions of damaged vocal tract parameters according to the present invention;
Fig. 5 is a flowchart of another procedure for determining the positions of damaged vocal tract parameters according to the present invention;
Fig. 6 is a flowchart of one procedure for determining the positions of damaged energy parameters according to the present invention;
Fig. 7 is a flowchart of another procedure for determining the positions of damaged energy parameters according to the present invention;
Fig. 8 is a flowchart of one procedure for repairing vocal tract parameters according to the present invention;
Fig. 9 is a flowchart of another procedure for repairing vocal tract parameters according to the present invention;
Fig. 10 is a flowchart of one procedure for repairing energy parameters according to the present invention;
Fig. 11 is a flowchart of another procedure for repairing energy parameters according to the present invention.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments. The embodiments are implemented on the premise of the technical solution of the present invention, and detailed implementations and specific operating procedures are given, but the scope of protection of the present invention is not limited to the following embodiments.
As shown in Fig. 1, this embodiment provides a method for repairing speech frames based on source prior information, the method comprising:
Step 101: detect damaged speech frames according to the check value and/or the continuity of the speech parameters;
Step 102: determine the positions of the damaged speech parameters according to the source prior information;
Step 103: repair the damaged speech parameters according to the source prior information, using several preceding frames and/or several following frames.
Here the source prior information includes several frames preceding the current frame, several frames following the current frame, the speech parameters of each preceding frame and/or the speech parameters of each following frame, and the speech parameters include vocal tract parameters and/or energy parameters.
As shown in Fig. 2, the process of detecting damaged speech frames provided by this embodiment includes:
Step 201: calculate the CRC value of the current frame;
Step 202: determine whether the CRC value equals the check value;
Step 203: if the CRC value equals the check value, the current frame is undamaged;
Step 204: if the CRC value does not equal the check value, the current frame is a damaged speech frame.
The above process applies when the speech frame contains a check value.
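Steps 201 to 204 amount to recomputing a checksum over the received frame and comparing it with the transmitted check value. The following minimal Python sketch assumes a 32-bit CRC computed with the standard-library `zlib.crc32`; the source does not specify the CRC width or polynomial, and the name `frame_is_damaged` and the sample bytes are illustrative only.

```python
import zlib


def frame_is_damaged(payload: bytes, received_check: int) -> bool:
    """Steps 201-204: recompute the CRC and compare it with the check value."""
    computed = zlib.crc32(payload) & 0xFFFFFFFF  # step 201 (CRC-32 is an assumption)
    return computed != received_check            # steps 202-204


payload = bytes([0x12, 0x34, 0x56, 0x78])        # hypothetical frame bits
check = zlib.crc32(payload) & 0xFFFFFFFF         # value the sender would attach
corrupted = bytes([0x12, 0x34, 0x56, 0x79])      # one bit flipped in transit

print(frame_is_damaged(payload, check))          # False
print(frame_is_damaged(corrupted, check))        # True
```

A CRC only says that the frame is damaged, not where, which is why the position-finding steps that follow rely on the source prior information instead.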
As shown in Fig. 3, in another embodiment of the present invention, damaged speech frames are detected by first calculating the conditional probabilities of the vocal tract parameter vector and the energy parameter vector of the current frame from the vocal tract parameter vectors and energy parameter vectors of the m preceding frames and n following frames, and then comparing these probabilities with a detection threshold to decide whether the vocal tract parameter vector and the energy parameter vector are damaged. The process specifically includes:
Step 301: calculate the conditional probability $P(s_0 \mid s_{-m}, s_{-m+1}, \ldots, s_{-1}, s_1, s_2, \ldots, s_n)$ of the vocal tract parameter vector of the current frame from the vocal tract parameter vectors of the m preceding frames and n following frames, where m may be any positive integer, and $s_0$, $s_n$, and $s_{-m}$ denote the vocal tract parameter vector of the current frame, of the n-th frame after the current frame, and of the m-th frame before the current frame, respectively;
Step 302: calculate the conditional probability $P(e_0 \mid e_{-m}, e_{-m+1}, \ldots, e_{-1}, e_1, e_2, \ldots, e_n)$ of the energy parameter vector of the current frame from the energy parameter vectors of the m preceding frames and n following frames, where n may be any positive integer, and $e_0$, $e_n$, and $e_{-m}$ denote the energy parameter vector of the current frame, of the n-th frame after the current frame, and of the m-th frame before the current frame, respectively;
Step 303: determine whether the inequality $P(s_0 \mid s_{-m}, \ldots, s_{-1}, s_1, \ldots, s_n) \geq P_{TF}$ holds; if it holds, the vocal tract parameters are undamaged, otherwise they are damaged, where $P_{TF}$ is the detection threshold;
Step 304: determine whether the inequality $P(e_0 \mid e_{-m}, \ldots, e_{-1}, e_1, \ldots, e_n) \geq P_{TF}$ holds; if it holds, the energy parameters are undamaged, otherwise they are damaged.
The above process applies when the speech frame does not contain a check value.
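The source does not say how the conditional probability of steps 301 to 304 is obtained. As a rough sketch only, the Python below assumes the parameters are quantizer indices, treats the components of a vector as independent, and models each component with a discretised Gaussian centred on the mean of the neighbouring frames. The model, the function names, `levels`, `sigma`, and the threshold value are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def level_probs(centre: float, levels: int, sigma: float) -> List[float]:
    """Discretised Gaussian over quantiser indices 0..levels-1, normalised to sum to 1."""
    weights = [math.exp(-0.5 * ((q - centre) / sigma) ** 2) for q in range(levels)]
    total = sum(weights)
    return [w / total for w in weights]


def vector_cond_prob(current: Sequence[int], neighbours: List[Sequence[int]],
                     levels: int = 32, sigma: float = 2.0) -> float:
    """Toy stand-in for P(s0 | s-m..s-1, s1..sn); components assumed independent."""
    prob = 1.0
    for i, q in enumerate(current):
        centre = sum(frame[i] for frame in neighbours) / len(neighbours)
        prob *= level_probs(centre, levels, sigma)[q]
    return prob


def vector_is_damaged(current: Sequence[int], neighbours: List[Sequence[int]],
                      p_tf: float) -> bool:
    """Steps 303/304: flag the vector when its probability falls below P_TF."""
    return vector_cond_prob(current, neighbours) < p_tf


neighbours = [[10, 20, 5], [10, 20, 5]]  # one preceding, one following frame (m = n = 1)
P_TF = 1e-4                              # hypothetical detection threshold

print(vector_is_damaged([10, 20, 5], neighbours, P_TF))  # False
print(vector_is_damaged([10, 31, 5], neighbours, P_TF))  # True
```

The same function would serve for the energy parameter vector of step 304 by passing energy indices instead of vocal tract indices.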
In step 102, the damaged speech parameters include damaged vocal tract parameters and/or damaged energy parameters. As shown in Fig. 4, the damaged vocal tract parameters in a damaged speech frame are determined by calculating the conditional probability of each vocal tract parameter of the current speech frame from the vocal tract parameter vectors of the m preceding frames and n following frames, and judging whether the parameter is damaged from that conditional probability and the vocal tract detection threshold $P_{TS}$. The process specifically includes:
Step 401: determine whether the vocal tract parameter vector is damaged; if it is undamaged, go to step 407;
Step 402: set i = 0;
Step 403: calculate the conditional probability $P(s_{0,i} \mid s_{-m}, \ldots, s_{-1}, s_1, \ldots, s_n)$ of the i-th vocal tract parameter $s_{0,i}$ of the current speech frame from the vocal tract parameter vectors of the m preceding frames and n following frames, where $0 \leq i \leq k_s - 1$ and $k_s$ is the number of vocal tract parameters;
Step 404: determine whether the conditional probability is greater than or equal to the vocal tract detection threshold $P_{TS}$:
$P(s_{0,i} \mid s_{-m}, \ldots, s_{-1}, s_1, \ldots, s_n) \geq P_{TS}$,
if this inequality does not hold, the vocal tract parameter $s_{0,i}$ is damaged;
Step 405: i = i + 1;
Step 406: determine whether $i \geq k_s$ holds; if it does not, go to step 403;
Step 407: output the damaged vocal tract parameters.
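The flow of steps 401 to 407 can be sketched as a loop over the parameter index with the probability model passed in as a function. In the Python below, `toy_cond_prob` is a hypothetical placeholder (a discretised Laplace distribution centred on the neighbours' mean); a real model could condition on the whole neighbouring vectors, as the step 403 formula allows. All names and the threshold value are assumptions.

```python
import math
from typing import Callable, List, Sequence

Frame = Sequence[int]
CondProb = Callable[[int, int, List[Frame]], float]


def toy_cond_prob(i: int, q: int, neighbours: List[Frame], levels: int = 32) -> float:
    """Hypothetical model: discretised Laplace centred on the neighbours' mean."""
    centre = sum(frame[i] for frame in neighbours) / len(neighbours)
    weights = [math.exp(-abs(x - centre)) for x in range(levels)]
    return weights[q] / sum(weights)


def locate_damaged(current: Frame, neighbours: List[Frame], threshold: float,
                   cond_prob: CondProb = toy_cond_prob,
                   vector_damaged: bool = True) -> List[int]:
    if not vector_damaged:                        # step 401: nothing to locate
        return []
    damaged = []
    for i in range(len(current)):                 # steps 402, 405, 406: i = 0 .. k_s - 1
        p = cond_prob(i, current[i], neighbours)  # step 403
        if p < threshold:                         # step 404: inequality fails
            damaged.append(i)
    return damaged                                # step 407


print(locate_damaged([10, 31, 5], [[10, 20, 5], [10, 22, 5]], threshold=0.01))  # [1]
```

Passing energy indices and the threshold $P_{TE}$ instead of $P_{TS}$ would reuse the same loop for energy parameters.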
In another embodiment of the present invention, as shown in Fig. 5, the damaged vocal tract parameters in a damaged speech frame are determined by calculating the conditional probability of the i-th vocal tract parameter of the current speech frame from the i-th vocal tract parameters of the m preceding frames and n following frames, and judging whether the parameter is damaged from that conditional probability and the vocal tract detection threshold $P_{TS}$. The process specifically includes:
Step 501: determine whether the vocal tract parameter vector is damaged; if it is undamaged, go to step 507;
Step 502: set i = 0;
Step 503: calculate the conditional probability $P(s_{0,i} \mid s_{-m,i}, \ldots, s_{-1,i}, s_{1,i}, \ldots, s_{n,i})$ of the i-th vocal tract parameter $s_{0,i}$ of the current speech frame from the i-th vocal tract parameters $(s_{-m,i}, \ldots, s_{-1,i}, s_{1,i}, \ldots, s_{n,i})$ of the m preceding frames and n following frames;
Step 504: determine whether the conditional probability is greater than or equal to the vocal tract detection threshold $P_{TS}$:
$P(s_{0,i} \mid s_{-m,i}, \ldots, s_{-1,i}, s_{1,i}, \ldots, s_{n,i}) \geq P_{TS}$,
if this inequality does not hold, the vocal tract parameter $s_{0,i}$ is damaged;
Step 505: i = i + 1;
Step 506: determine whether $i \geq k_s$ holds; if it does not, go to step 503;
Step 507: output the damaged vocal tract parameters.
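Because this variant conditions only on the i-th parameter of the neighbouring frames, the probability can be estimated from statistics of that single parameter. The sketch below is one possible reading, not the source's stated method: it takes m = n = 1, assumes each parameter index follows a first-order Markov chain across frames, and estimates the transition table from training sequences with add-one smoothing. The names, the training data, and the threshold are hypothetical.

```python
from typing import List, Sequence


def train_transitions(sequences: List[Sequence[int]], levels: int) -> List[List[float]]:
    """Estimate T[a][b] = P(next index = b | current index = a), add-one smoothed."""
    counts = [[1.0] * levels for _ in range(levels)]
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def cond_prob(T: List[List[float]], prev_q: int, q0: int, next_q: int) -> float:
    """P(q0 | prev_q, next_q) under the first-order Markov assumption."""
    weights = [T[prev_q][q] * T[q][next_q] for q in range(len(T))]
    return weights[q0] / sum(weights)


def locate_damaged(current, prev_frame, next_frame, tables, p_ts):
    """Steps 502-507: indices i whose conditional probability is below P_TS."""
    return [i for i, q in enumerate(current)
            if cond_prob(tables[i], prev_frame[i], q, next_frame[i]) < p_ts]


history = [[0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]]  # made-up training indices
T = train_transitions(history, levels=4)
tables = [T, T]                                          # one table per parameter index

print(locate_damaged([1, 3], [1, 1], [1, 1], tables, p_ts=0.1))  # [1]
```

Here the second parameter jumps from 1 to 3 and back to 1, which the slowly varying training data makes unlikely, so only index 1 is flagged.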
As shown in Fig. 6, the damaged energy parameters in a damaged speech frame are determined by calculating the conditional probability of each energy parameter of the current speech frame from the energy parameter vectors of the m preceding frames and n following frames, and judging whether the parameter is damaged from that conditional probability and the energy detection threshold $P_{TE}$. The process specifically includes:
Step 601: determine whether the energy parameter vector is damaged; if it is undamaged, go to step 607;
Step 602: set i = 0;
Step 603: calculate the conditional probability $P(e_{0,i} \mid e_{-m}, \ldots, e_{-1}, e_1, \ldots, e_n)$ of the i-th energy parameter $e_{0,i}$ of the current speech frame from the energy parameter vectors of the m preceding frames and n following frames, where $0 \leq i \leq k_e - 1$ and $k_e$ is the number of energy parameters;
Step 604: determine whether the conditional probability is greater than or equal to the energy detection threshold $P_{TE}$:
$P(e_{0,i} \mid e_{-m}, \ldots, e_{-1}, e_1, \ldots, e_n) \geq P_{TE}$,
if this inequality does not hold, the energy parameter $e_{0,i}$ is damaged;
Step 605: i = i + 1;
Step 606: determine whether $i \geq k_e$ holds; if it does not, go to step 603;
Step 607: output the damaged energy parameters.
In another embodiment of the present invention, as shown in Fig. 7, the damaged energy parameters in a damaged speech frame are determined by calculating the conditional probability of the i-th energy parameter of the current speech frame from the i-th energy parameters of the m preceding frames and n following frames, and judging whether the parameter is damaged from that conditional probability and the energy detection threshold $P_{TE}$. The process specifically includes:
Step 701: determine whether the energy parameter vector is damaged; if it is undamaged, go to step 707;
Step 702: set i = 0;
Step 703: calculate the conditional probability $P(e_{0,i} \mid e_{-m,i}, \ldots, e_{-1,i}, e_{1,i}, \ldots, e_{n,i})$ of the i-th energy parameter $e_{0,i}$ of the current speech frame from the i-th energy parameters $(e_{-m,i}, \ldots, e_{-1,i}, e_{1,i}, \ldots, e_{n,i})$ of the m preceding frames and n following frames;
Step 704: determine whether the conditional probability is greater than or equal to the energy detection threshold $P_{TE}$:
$P(e_{0,i} \mid e_{-m,i}, \ldots, e_{-1,i}, e_{1,i}, \ldots, e_{n,i}) \geq P_{TE}$,
if this inequality does not hold, the energy parameter $e_{0,i}$ is damaged;
Step 705: i = i + 1;
Step 706: determine whether $i \geq k_e$ holds; if it does not, go to step 703;
Step 707: output the damaged energy parameters.
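For the case m = n = 1, one way to make the conditional probability of step 703 computable is to assume, purely as an illustration, that the i-th energy parameter forms a first-order Markov chain across frames. Under that assumption, which the source does not state, Bayes' rule gives:

$$
P(e_{0,i} \mid e_{-1,i}, e_{1,i}) = \frac{P(e_{0,i} \mid e_{-1,i})\, P(e_{1,i} \mid e_{0,i})}{\sum_{q} P(q \mid e_{-1,i})\, P(e_{1,i} \mid q)}
$$

where the sum runs over all quantization levels $q$ of the parameter. Only one-step transition probabilities are then needed, and $e_{0,i}$ is flagged as damaged when the left-hand side falls below $P_{TE}$.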
In step 103, repairing the damaged speech parameters likewise includes repairing damaged vocal tract parameters and repairing damaged energy parameters.
As shown in Fig. 8, in this embodiment the damaged vocal tract parameters in a damaged speech frame are repaired by estimating the vocal tract parameters of the current frame from the source prior information and the vocal tract parameter vectors of the m preceding frames and n following frames. The process specifically includes:
Step 801: i = 0;
Step 802: determine whether the vocal tract parameter $s_{0,i}$ is damaged; if it is not damaged, go to step 804;
Step 803: repair the vocal tract parameter $s_{0,i}$:
$$\hat{s}_{0,i} = \arg\max_{s_{0,i}} P(s_{0,i} \mid s_{-m}, \ldots, s_{-1}, s_1, \ldots, s_n)$$
that is, the value of $s_{0,i}$ that maximizes $P(s_{0,i} \mid s_{-m}, \ldots, s_{-1}, s_1, \ldots, s_n)$ is taken as the repaired vocal tract parameter;
Step 804: i = i + 1;
Step 805: determine whether the inequality $i \geq k_s$ holds; if it does not, go to step 802;
Step 806: output the repaired vocal tract parameter vector $s_0$.
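Steps 801 to 806 can be sketched as an argmax over the quantization levels for each flagged index. The Python below assumes the parameters are quantizer indices and uses `toy_weight`, a hypothetical and unnormalised stand-in for the conditional probability (normalisation does not change the argmax). The function names and sample values are illustrative.

```python
import math
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[int]


def toy_weight(i: int, q: int, neighbours: List[Frame]) -> float:
    """Hypothetical, unnormalised stand-in for P(s0,i = q | neighbouring vectors)."""
    centre = sum(frame[i] for frame in neighbours) / len(neighbours)
    return math.exp(-abs(q - centre))


def repair_vector(current: Frame, damaged: Iterable[int], neighbours: List[Frame],
                  levels: int,
                  weight: Callable[[int, int, List[Frame]], float] = toy_weight
                  ) -> List[int]:
    flagged = set(damaged)
    repaired = list(current)
    for i in range(len(current)):   # steps 801, 804, 805
        if i not in flagged:        # step 802: keep undamaged parameters
            continue
        # step 803: choose the level with the highest conditional probability
        repaired[i] = max(range(levels), key=lambda q: weight(i, q, neighbours))
    return repaired                 # step 806


print(repair_vector([10, 31, 5], [1], [[10, 20, 5], [10, 22, 5]], levels=32))
# [10, 21, 5]
```

Only the flagged index is replaced, so the correctly received parameters of the damaged frame are still used rather than discarded.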
As shown in Fig. 9, in another embodiment of the present invention, the damaged vocal tract parameters in a damaged speech frame are repaired by estimating the i-th vocal tract parameter of the current frame from the source prior information and the i-th vocal tract parameters of the m preceding frames and n following frames. The process specifically includes:
Step 901: i = 0;
Step 902: determine whether the vocal tract parameter $s_{0,i}$ is damaged; if it is not damaged, go to step 904;
Step 903: repair the vocal tract parameter $s_{0,i}$:
$$\hat{s}_{0,i} = \arg\max_{s_{0,i}} P(s_{0,i} \mid s_{-m,i}, \ldots, s_{-1,i}, s_{1,i}, \ldots, s_{n,i})$$
that is, the value of $s_{0,i}$ that maximizes $P(s_{0,i} \mid s_{-m,i}, \ldots, s_{-1,i}, s_{1,i}, \ldots, s_{n,i})$ is taken as the repaired vocal tract parameter;
Step 904: i = i + 1;
Step 905: determine whether the inequality $i \geq k_s$ holds; if it does not, go to step 902;
Step 906: output the repaired vocal tract parameter vector $s_0$.
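For the per-index variant, the argmax of step 903 needs only statistics of a single parameter. The sketch below takes m = n = 1 and assumes a first-order Markov chain with a transition table `T`; the table values are made up for illustration, and neither the Markov assumption nor the function names come from the source.

```python
# Hypothetical transition table T[a][b] = P(next index = b | current index = a),
# e.g. estimated offline from clean speech; the values are illustrative only.
T = [
    [0.70, 0.20, 0.08, 0.02],
    [0.15, 0.60, 0.20, 0.05],
    [0.05, 0.20, 0.60, 0.15],
    [0.02, 0.08, 0.20, 0.70],
]


def repair_index(prev_q: int, next_q: int, table=T) -> int:
    """Step 903 with m = n = 1: argmax over q of P(q | prev_q) * P(next_q | q)."""
    return max(range(len(table)), key=lambda q: table[prev_q][q] * table[q][next_q])


def repair_vector(current, damaged, prev_frame, next_frame):
    """Steps 901-906: replace only the flagged indices and keep the rest."""
    return [repair_index(prev_frame[i], next_frame[i]) if i in damaged else q
            for i, q in enumerate(current)]


print(repair_vector([3, 2], {0}, [1, 2], [1, 2]))  # [1, 2]
```

The denominator of the conditional probability is the same for every candidate level, so it can be dropped when only the maximizing level is needed.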
As shown in Fig. 10, in this embodiment the damaged energy parameters in a damaged speech frame are repaired by estimating the energy parameters of the current frame from the source prior information and the energy parameter vectors of the m preceding frames and n following frames. The process specifically includes:
Step 1001: i = 0;
Step 1002: determine whether the energy parameter $e_{0,i}$ is damaged; if it is not damaged, go to step 1004;
Step 1003: repair the energy parameter $e_{0,i}$:
$$\hat{e}_{0,i} = \arg\max_{e_{0,i}} P(e_{0,i} \mid e_{-m}, \ldots, e_{-1}, e_1, \ldots, e_n)$$
that is, the value of $e_{0,i}$ that maximizes $P(e_{0,i} \mid e_{-m}, \ldots, e_{-1}, e_1, \ldots, e_n)$ is taken as the repaired energy parameter;
Step 1004: i = i + 1;
Step 1005: determine whether the inequality $i \geq k_e$ holds; if it does not, go to step 1002;
Step 1006: output the repaired energy parameter vector $e_0$.
As shown in Fig. 11, in another embodiment of the present invention, the damaged energy parameters in a damaged speech frame are repaired by estimating the i-th energy parameter of the current frame from the source prior information and the i-th energy parameters of the m preceding frames and n following frames. The process specifically includes:
Step 1101: i = 0;
Step 1102: determine whether the energy parameter $e_{0,i}$ is damaged; if it is not damaged, go to step 1104;
Step 1103: repair the energy parameter $e_{0,i}$:
$$\hat{e}_{0,i} = \arg\max_{e_{0,i}} P(e_{0,i} \mid e_{-m,i}, \ldots, e_{-1,i}, e_{1,i}, \ldots, e_{n,i})$$
that is, the value of $e_{0,i}$ that maximizes $P(e_{0,i} \mid e_{-m,i}, \ldots, e_{-1,i}, e_{1,i}, \ldots, e_{n,i})$ is taken as the repaired energy parameter;
Step 1104: i = i + 1;
Step 1105: determine whether the inequality $i \geq k_e$ holds; if it does not, go to step 1102;
Step 1106: output the repaired energy parameter vector $e_0$.
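The summary says the erroneous frame itself is used together with the source prior, but the steps above maximize only the probability given neighbouring frames. One hedged way to bring the received bits in, shown here as an extension and not as the source's method, is to multiply the prior by a channel likelihood from a binary symmetric channel with bit error rate `ber`. The Gaussian prior, `bits`, `ber`, and all names are assumptions.

```python
import math


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codeword indices."""
    return bin(a ^ b).count("1")


def prior(q: int, prev_q: int, next_q: int, sigma: float = 1.5) -> float:
    """Hypothetical unnormalised prior: Gaussian around the neighbours' mean."""
    centre = (prev_q + next_q) / 2
    return math.exp(-0.5 * ((q - centre) / sigma) ** 2)


def repair_energy(received: int, prev_q: int, next_q: int,
                  bits: int = 5, ber: float = 0.05) -> int:
    """Pick the index maximising prior(q) * P(received | q) over all 2**bits levels."""
    def score(q: int) -> float:
        d = hamming(q, received)
        channel = ber ** d * (1 - ber) ** (bits - d)  # binary symmetric channel
        return prior(q, prev_q, next_q) * channel
    return max(range(1 << bits), key=score)


# Index 12 (01100) was sent; the top bit flipped, so 28 (11100) was received.
print(repair_energy(received=28, prev_q=12, next_q=12))  # 12
```

With this weighting, candidates that differ from the received codeword in few bits are preferred among those the prior considers plausible, which is one possible interpretation of bit-level repair.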
The device that implements the above method for repairing speech frames based on source prior information includes a detection module and a repair module. The detection module detects whether a damaged speech frame exists in the current sequence of speech frames and determines the positions of the damaged speech parameters; the repair module performs bit-level repair of the damaged speech parameters in the damaged speech frame according to the source prior information.
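The two-module structure could be organised as below. This is a structural sketch under stated assumptions: both modules share a probability model supplied as a function, `toy_model` is a hypothetical placeholder, and the class and method names are not taken from the source.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = Sequence[int]
Model = Callable[[int, int, List[Frame]], float]


def toy_model(i: int, q: int, neighbours: List[Frame], levels: int = 32) -> float:
    """Hypothetical P(parameter i = q | neighbours): discretised Laplace."""
    centre = sum(frame[i] for frame in neighbours) / len(neighbours)
    weights = [math.exp(-abs(x - centre)) for x in range(levels)]
    return weights[q] / sum(weights)


@dataclass
class DetectionModule:
    model: Model
    threshold: float

    def damaged_positions(self, frame: Frame, neighbours: List[Frame]) -> List[int]:
        """Positions whose conditional probability is below the threshold."""
        return [i for i, q in enumerate(frame)
                if self.model(i, q, neighbours) < self.threshold]


@dataclass
class RepairModule:
    model: Model
    levels: int

    def repair(self, frame: Frame, positions: List[int],
               neighbours: List[Frame]) -> List[int]:
        """Replace each flagged parameter by its most probable level."""
        out = list(frame)
        for i in positions:
            out[i] = max(range(self.levels),
                         key=lambda q: self.model(i, q, neighbours))
        return out


detector = DetectionModule(toy_model, threshold=0.01)
repairer = RepairModule(toy_model, levels=32)
neighbours = [[10, 20, 5], [10, 20, 5]]
frame = [10, 31, 5]

positions = detector.damaged_positions(frame, neighbours)
print(positions, repairer.repair(frame, positions, neighbours))  # [1] [10, 20, 5]
```

Keeping the model as an injected function means the vector-conditioned and per-index variants described earlier could be swapped without changing either module.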
In another embodiment of the present invention, an error-tolerant speech decoding method located at the decoding end of a vocoder is implemented on the basis of the above method for repairing speech frames based on source prior information.
The preferred specific embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain through logical analysis, reasoning, or limited experimentation on the basis of the prior art and in accordance with the concept of the present invention shall fall within the scope of protection defined by the claims.
| Application Number | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN201710565971.4A | CN107564533A (en) | 2017-07-12 | 2017-07-12 | Method and device for repairing speech frames based on source prior information |
| Publication Number | Publication Date |
|---|---|
| CN107564533A (en) | 2018-01-09 |
| Application Number | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN201710565971.4A (CN107564533A (en)) | Pending | 2017-07-12 | 2017-07-12 | Speech frame restorative procedure and device based on information source prior information |
| Country | Link |
|---|---|
| CN (1) | CN107564533A (en) |
| Code | Title | Description |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2018-01-09 |