Technical Field
The invention relates to the field of speech recognition, and in particular to a method for correcting errors in speech recognition results.
Background Art
At present, most speech recognition systems use an N-gram language model. This model rests on an imperfect independence assumption, namely that the current word depends only on the N-1 words preceding it. Because it reasons under uncertainty from only those N-1 words, recognition results often contain meaningless sentences or fragments.
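For reference, the independence assumption described above is usually written as follows. This is the standard textbook form of the N-gram approximation, not a formula taken from this document; $w_i$ denotes the $i$-th word of a sentence of $T$ words.

```latex
P(w_1, \dots, w_T) \;=\; \prod_{i=1}^{T} P(w_i \mid w_1, \dots, w_{i-1})
\;\approx\; \prod_{i=1}^{T} P(w_i \mid w_{i-N+1}, \dots, w_{i-1})
```

Any dependency that reaches further back than $N-1$ words is invisible to the model, which is why locally plausible but globally meaningless word sequences can survive decoding.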
Summary of the Invention
The invention proposes an error correction method for speech recognition results that uses variable-length error correction templates and corrects the recognition result according to confidence and acoustic confusion. The invention can be used in large-vocabulary continuous speech recognition systems. Its main features are as follows. First, continuous language segments in a corpus serve as error correction templates, and an error correction template library is built from the corpus. Second, the template library is indexed, and fast search techniques are used to query it. Third, following the error correction pattern, the recognition result is split by confidence into short recognition segments, and the reliable part of each segment is submitted to the error correction template system for fast lookup, yielding template candidates that are highly correlated with the segment. Fourth, an acoustic confusion matrix is used to select, from the candidates, a template whose acoustic characteristics are close to those of the recognition segment, and that template replaces the segment.
Technical Solution
A method for correcting errors in speech recognition results, comprising the following steps:
1) The recognition system performs recognition and confidence calculation on the input speech, obtaining a recognition result annotated with confidence;
2) Following the error correction pattern, the recognition result is split into small recognition segments according to confidence;
3) The recognition segments are submitted to the error correction template retrieval system, which returns a candidate list of error correction templates highly correlated with each segment;
4) The acoustic confusion between the recognition segment and each template in the candidate list is calculated, and the template with the highest acoustic similarity is selected; when the similarity between the recognition segment and that template exceeds a reliable threshold, the template replaces the recognition segment;
5) The corrected segments are merged to obtain the corrected recognition result.
The method further comprises the step of calculating confidence while recognizing the input speech, obtaining a recognition result annotated with confidence.
The method further comprises: when splitting the recognition result into small recognition segments according to confidence, first setting a confidence threshold CM-threshold and the maximum number of characters in a system error correction template, max-var-length. A recognition result whose confidence is higher than CM-threshold is considered reliable, and the number of reliable characters in a split recognition segment must not exceed max-var-length.
The method further comprises the step of dividing the recognition result into blocks, where consecutive characters whose confidence is all above or all below CM-threshold form one block. The recognition result is thus made up of one or more (A, x, B) patterns, where A and B are blocks with confidence above CM-threshold, x is a block with confidence below CM-threshold, and at most one of A and B is empty.
The method further comprises the step of, for every low-confidence block x in the recognition result, if the length of A is greater than or equal to max-var-length, taking the part of A adjacent to x with length max-var-length as sub-A and forming the segment (sub-A, x) with x. sub-A is used to search the error correction template library, and the length of sub-A is not fixed.
The method further comprises the step of, for every low-confidence block x in the recognition result, if the length of A is less than max-var-length, taking the part of B adjacent to x as sub-B and forming the segment (A, x, sub-B) with A and x. A and sub-B are used to search the error correction template library. At most one of A and sub-B may be empty, the combined length of A and sub-B must not exceed max-var-length, and the lengths of A and sub-B are not fixed.
The method further comprises the step of, after splitting the recognition result into segments, quickly looking up the reliable part of each segment in the error correction template library to obtain one or more error correction templates highly correlated with the segment.
In the method, the error correction template retrieval system comprises two parts: the first builds the error correction template index, and the second searches the error correction templates.
In the method, the basic principle of the first part is as follows. Every continuous language segment of 6 to 12 characters in the corpus is treated as an error correction template. All templates are first extracted from the corpus, and the template library is then indexed using an inverted file as the index structure. To reduce the size of the inverted file, it must be compressed.
In the method, the basic principle of the second part is as follows. At query time, the reliable part is first converted into a Boolean query, which is searched quickly in the index. Because speech recognition results are sequential and local in nature, the conversion to a Boolean query must add an ordering requirement on the reliable part and a locality requirement between words.
The method further comprises the step of selecting the optimal template from all results returned by the template search, using the acoustic confusion between each template and the recognition segment. For the recognition segment $A$ and each template $T_i$ in the candidate list, the confusion $C(A, T_i)$ is calculated. When the maximum value $\max C(A, T_i)$ exceeds a reliable threshold, the corresponding template replaces the recognition segment; if $\max C(A, T_i)$ is below the threshold, the recognition segment is kept.
In the method, the calculation of the acoustic confusion between a template and a recognition segment consists of three parts: the first gathers statistics on recognition confusions among Chinese initials and finals, the second calculates posterior probabilities from those confusions, and the third performs a fuzzy overall match between the recognition segment and the error correction template.
In the method, the basic principle of the first part is as follows. A speech database is recognized, and the confusions among all initials and among all finals are obtained in the following way. Initials and finals are assumed never to be confused with each other. If the recognition result of a sample is the pinyin string $C'_1V'_1C'_2V'_2\ldots C'_mV'_m$, it is dynamically aligned with the correct string $C_1V_1C_2V_2\ldots C_nV_n$ so that the number of matching pinyin units is maximized. This yields a large number of pinyin pairs, namely $(C'_1, C_1), (V'_1, V_1), \ldots, (C'_m, C_n), (V'_m, V_n)$. Counting the occurrences of these pairs gives, for each initial, its total number of samples and the number of times it was recognized as each other initial, and likewise, for each final, its total number of samples and the number of times it was recognized as each other final.
In the method, the basic principle of the second part is as follows. From the statistics of the first part, the probability that each initial is recognized as each other initial is calculated first. The fuzziness with which $C_i$ is confused as $C_j$ is

$$P(C_i \rightarrow C_j) = \frac{\sum(C_i, C_j)}{|C_i|}$$

where $\sum(C_i, C_j)$ is the total number of times $C_i$ was recognized as $C_j$ and $|C_i|$ is the total number of $C_i$ samples. From these probabilities, the posterior probability that the correct result is $C_j$ when the recognition result is $C_i$ is then calculated.
In the method, the basic principle of the third part is as follows. Let the pinyin string of the recognition segment $A$ be $C'_1V'_1C'_2V'_2\ldots C'_mV'_m$, and let the pinyin string of the $i$-th template $T_i$ in the candidate list be $C_1V_1C_2V_2\ldots C_nV_n$. The acoustic confusion $C(A, T_i)$ is defined by finding an alignment $(1, i_1), (2, i_2), \ldots, (k, i_k), \ldots, (m, i_m)$ that maximizes the matching objective built from the posterior probabilities; that maximum is the acoustic confusion of $A$ and $T_i$.
In practice, the logarithm of the posterior probabilities is taken first, so the problem becomes one of maximizing the corresponding objective in the log domain. That maximum is used as the logarithmic acoustic confusion of $A$ and $T_i$.
Detailed Description of the Embodiments
The invention has three main modules: splitting the recognition result by confidence, obtaining the candidate list of error correction templates, and calculating the acoustic confusion between a recognition segment and an error correction template. Each is described in detail below.
Splitting the recognition result by confidence. First set the confidence threshold CM-threshold and the maximum number of characters in a system error correction template, max-var-length. Recognition results with confidence higher than CM-threshold are considered reliable. The recognition result is then split as follows:
1. Divide the recognition result into blocks, where consecutive characters whose confidence is all above or all below CM-threshold form one block. The recognition result is thus made up of one or more (A, x, B) structures, where A and B are blocks with confidence above CM-threshold, x is a block with confidence below CM-threshold, and at most one of A and B is empty.
2. For every low-confidence block x in the recognition result:
a) If the length of A is greater than or equal to max-var-length, take the part of A adjacent to x with length max-var-length as sub-A and form the segment (sub-A, x) with x. sub-A is used to search the error correction template library, and the length of sub-A is not fixed;
b) If the length of A is less than max-var-length, take the part of B adjacent to x as sub-B and form the segment (A, x, sub-B) with A and x. A and sub-B are used to search the error correction template library, and at most one of A and sub-B may be empty. The combined length of A and sub-B must not exceed max-var-length, and the lengths of A and sub-B are not fixed.
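The splitting steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `split_blocks` and `make_segments` are hypothetical, confidence is assumed to be given per character, sub-A is taken as the last max-var-length characters of A, and sub-B is taken as however many leading characters of B fit in the remaining budget.

```python
from itertools import groupby


def split_blocks(chars, confs, cm_threshold):
    """Group consecutive characters into (is_reliable, text) blocks.

    A character is reliable when its confidence is strictly above the threshold.
    """
    flagged = [(conf > cm_threshold, ch) for ch, conf in zip(chars, confs)]
    return [
        (reliable, "".join(ch for _, ch in group))
        for reliable, group in groupby(flagged, key=lambda item: item[0])
    ]


def make_segments(blocks, max_var_length):
    """Build one (left_context, x, right_context) segment per low-confidence block x."""
    segments = []
    for i, (reliable, text) in enumerate(blocks):
        if reliable:
            continue
        # Blocks alternate, so the neighbours of x (if any) are reliable blocks A and B.
        a = blocks[i - 1][1] if i > 0 else ""
        b = blocks[i + 1][1] if i + 1 < len(blocks) else ""
        if len(a) >= max_var_length:
            # Case a): keep only the part of A adjacent to x (assumed: its tail).
            segments.append((a[-max_var_length:], text, ""))
        else:
            # Case b): keep all of A and fill the remaining budget from the head of B.
            budget = max_var_length - len(a)
            segments.append((a, text, b[:budget]))
    return segments


blocks = split_blocks(list("abcdefgh"), [0.9, 0.9, 0.9, 0.2, 0.3, 0.9, 0.9, 0.9], 0.5)
print(make_segments(blocks, max_var_length=5))
```

Latin letters stand in for Chinese characters here; the logic does not depend on the script.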
Obtaining the candidate list of error correction templates. After the recognition result is split into segments, the reliable part of each segment is submitted to the error correction template retrieval system, which returns template candidates highly correlated with the segment. The retrieval system comprises two parts: the first builds the error correction template index, and the second performs fast search over the template library.
The basic principle of the first part is as follows. Every continuous language segment of 6 to 12 characters in the corpus is treated as an error correction template. All templates are first extracted from the corpus, and the template library is then indexed using an inverted file as the index structure. To reduce the size of the inverted file, it must be compressed.
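A minimal sketch of template extraction and inverted-file construction is shown below. The character-level, positional posting lists and the names `extract_templates` and `build_inverted_index` are assumptions made for illustration; the text does not say which indexing unit is used, and the compression step it mentions is omitted here.

```python
from collections import defaultdict


def extract_templates(corpus_sentences, min_len=6, max_len=12):
    """Collect every contiguous span of min_len..max_len characters as a template."""
    templates = set()
    for sentence in corpus_sentences:
        for length in range(min_len, max_len + 1):
            for start in range(len(sentence) - length + 1):
                templates.add(sentence[start:start + length])
    return sorted(templates)


def build_inverted_index(templates):
    """Map each character to a posting list of (template_id, position) pairs."""
    index = defaultdict(list)
    for template_id, template in enumerate(templates):
        for position, ch in enumerate(template):
            index[ch].append((template_id, position))
    return dict(index)


templates = extract_templates(["abcdefg"])
index = build_inverted_index(templates)
print(templates)
print(index["a"])
```

Because every span of every admissible length is stored, the template count grows quickly with corpus size, which is consistent with the text's remark that the inverted file needs compression.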
The basic principle of the second part is as follows. The reliable part of the segment is first converted into a Boolean query, which is retrieved quickly from the index. Because speech recognition results are sequential and local in nature, the conversion to a Boolean query must add an ordering requirement on the reliable part of the segment and a locality requirement between words.
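One way to read the ordering and locality requirements is sketched below: a Boolean AND over the query characters, followed by a check that the characters occur in the same order and within a bounded distance of each other. The positional index layout, the single `max_gap` parameter, and the function names are all assumptions for illustration; the text does not specify how the constraints are encoded.

```python
def build_index(templates):
    """Positional index: character -> {template_id: [positions]}."""
    index = {}
    for template_id, template in enumerate(templates):
        for position, ch in enumerate(template):
            index.setdefault(ch, {}).setdefault(template_id, []).append(position)
    return index


def search(index, query, max_gap=3):
    """Return ids of templates containing all query characters, in order,
    with each consecutive pair at most max_gap positions apart."""
    if not query:
        return []
    # Boolean AND: keep templates that contain every query character.
    candidates = set(index.get(query[0], {}))
    for ch in query[1:]:
        candidates &= set(index.get(ch, {}))
    hits = []
    for template_id in sorted(candidates):
        # Positions where the query prefix matched so far can end.
        reachable = set(index[query[0]][template_id])
        for ch in query[1:]:
            reachable = {
                pos for pos in index[ch][template_id]
                if any(0 < pos - prev <= max_gap for prev in reachable)
            }
            if not reachable:
                break
        if reachable:
            hits.append(template_id)
    return hits


index = build_index(["abcdef", "fedcba", "abxxxxcd"])
print(search(index, "ac", max_gap=3))
```

In a fuller version the allowed gap between the last character of A and the first character of sub-B would presumably depend on the length of the low-confidence block x between them; a single `max_gap` is used here only to keep the sketch short.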
Calculating the acoustic confusion. The template search usually returns one or more candidates, and the acoustic confusion between each template and the recognition segment is used to select the optimal one. For the recognition segment $A$ and each template $T_i$ in the candidate list, the confusion $C(A, T_i)$ is calculated. When the maximum value $\max C(A, T_i)$ exceeds a reliable threshold, the corresponding template replaces the recognition segment; if $\max C(A, T_i)$ is below the threshold, the recognition segment is kept.
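The selection rule can be sketched as follows. The scoring function is passed in as a parameter; `toy_confusion` is a placeholder that only counts matching positions and is not the acoustic confusion defined in this document. The case where the maximum equals the threshold is not specified in the text, so this sketch keeps the original segment in that case.

```python
def toy_confusion(segment, template):
    """Placeholder score: fraction of positions where the two strings agree."""
    same = sum(a == b for a, b in zip(segment, template))
    return same / max(len(segment), len(template), 1)


def pick_template(segment, candidates, confusion, threshold):
    """Replace the segment by the best-scoring template only if it beats the threshold."""
    if not candidates:
        return segment
    best = max(candidates, key=lambda template: confusion(segment, template))
    return best if confusion(segment, best) > threshold else segment


print(pick_template("abcd", ["abce", "xxxx"], toy_confusion, threshold=0.5))
```

Keeping the threshold test separate from the scoring function lets the same rule be reused with a log-domain score, where the threshold would be a negative number.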
The confusion calculation consists of three parts: the first gathers statistics on recognition confusions among Chinese initials and finals, the second calculates posterior probabilities from those confusions, and the third performs a fuzzy overall match between the recognition segment and the error correction template.
The basic principle of the first part is as follows. A speech database is recognized, and the confusions among all initials and among all finals are obtained in the following way. Initials and finals are assumed never to be confused with each other. If the recognition result of a sample is the pinyin string $C'_1V'_1C'_2V'_2\ldots C'_mV'_m$, it is dynamically aligned with the correct string $C_1V_1C_2V_2\ldots C_nV_n$ so that the number of matching pinyin units is maximized. This yields a large number of pinyin pairs, namely $(C'_1, C_1), (V'_1, V_1), \ldots, (C'_m, C_n), (V'_m, V_n)$. Counting the occurrences of these pairs gives, for each initial, its total number of samples and the number of times it was recognized as each other initial, and likewise, for each final, its total number of samples and the number of times it was recognized as each other final.
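The counting procedure can be sketched as below. The dynamic alignment is implemented here as a standard edit-distance alignment in which an initial may only be paired with an initial and a final with a final; that choice, the unit representation `(kind, symbol)`, and the function names are assumptions for illustration. Reference units that the alignment deletes are not counted in the totals.

```python
from collections import Counter

INF = float("inf")


def sub_cost(ref_unit, hyp_unit):
    """0 for a match, 1 for a same-kind substitution, forbidden across kinds."""
    if ref_unit == hyp_unit:
        return 0
    return 1 if ref_unit[0] == hyp_unit[0] else INF


def align(ref, hyp):
    """Minimum-edit alignment; returns the (ref_unit, hyp_unit) pairs it matched."""
    n, m = len(ref), len(hyp)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j] + 1,  # reference unit deleted
                cost[i][j - 1] + 1,  # recognized unit inserted
                cost[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1]),
            )
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + sub_cost(ref[i - 1], hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def count_confusions(samples):
    """Accumulate per-unit totals and (true, recognized) pair counts."""
    totals, confusions = Counter(), Counter()
    for ref, hyp in samples:
        for true_unit, rec_unit in align(ref, hyp):
            totals[true_unit] += 1
            confusions[(true_unit, rec_unit)] += 1
    return totals, confusions


ref = [("C", "zh"), ("V", "ang"), ("C", "s"), ("V", "an")]
hyp = [("C", "z"), ("V", "ang"), ("C", "s"), ("V", "an")]
totals, confusions = count_confusions([(ref, hyp)])
print(confusions[(("C", "zh"), ("C", "z"))])
```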
The basic principle of the second part is as follows. From the statistics of the first part, the probability that each initial is recognized as each other initial is calculated first. The fuzziness with which $C_i$ is confused as $C_j$ is

$$P(C_i \rightarrow C_j) = \frac{\sum(C_i, C_j)}{|C_i|}$$

where $\sum(C_i, C_j)$ is the total number of times $C_i$ was recognized as $C_j$ and $|C_i|$ is the total number of $C_i$ samples. From these probabilities, the posterior probability that the correct result is $C_j$ when the recognition result is $C_i$ is then calculated.
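A sketch of this second part is given below. The confusion probability follows the count ratio described in the text. The posterior is computed with Bayes' rule using priors estimated from the sample counts; that particular form is an assumption made for illustration and should not be read as the document's own formula. Dictionary keys are `(true_unit, recognized_unit)` throughout, and the function names are hypothetical.

```python
from collections import Counter


def confusion_prob(totals, confusions):
    """P(recognized = h | true = r) = count(r recognized as h) / count(r)."""
    return {(r, h): n / totals[r] for (r, h), n in confusions.items()}


def posterior(totals, confusions):
    """P(true = r | recognized = h), assuming Bayes' rule with count-based priors."""
    n_all = sum(totals.values())
    likelihood = confusion_prob(totals, confusions)
    # Joint probability of (true = r, recognized = h) under the assumed prior.
    joint = {(r, h): likelihood[(r, h)] * totals[r] / n_all for (r, h) in likelihood}
    evidence = Counter()
    for (r, h), p in joint.items():
        evidence[h] += p
    return {(r, h): p / evidence[h] for (r, h), p in joint.items()}


totals = {"z": 10, "zh": 10}
confusions = {("z", "z"): 8, ("z", "zh"): 2, ("zh", "zh"): 9, ("zh", "z"): 1}
print(confusion_prob(totals, confusions)[("z", "zh")])
print(posterior(totals, confusions)[("zh", "z")])
```

With count-based priors the posterior reduces to the share of each true unit among all occurrences of a given recognized unit, so in the example the posterior that "z" was really "zh" is 1 out of 9.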
The basic principle of the third part is as follows. Let the pinyin string of the recognition segment $A$ be $C'_1V'_1C'_2V'_2\ldots C'_mV'_m$, and let the pinyin string of the $i$-th template $T_i$ in the candidate list be $C_1V_1C_2V_2\ldots C_nV_n$. The acoustic confusion $C(A, T_i)$ is defined by finding an alignment $(1, i_1), (2, i_2), \ldots, (k, i_k), \ldots, (m, i_m)$ that maximizes the matching objective built from the posterior probabilities; that maximum is defined as the acoustic confusion of $A$ and $T_i$.
In practice, the logarithm of the posterior probabilities is taken first, so the problem becomes one of maximizing the corresponding objective in the log domain. That maximum is used as the logarithmic acoustic confusion of $A$ and $T_i$.
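The log-domain matching can be sketched with dynamic programming. Several assumptions are made here that the text does not confirm: the objective is taken to be the sum, over aligned syllables, of the log posterior of the initial plus the log posterior of the final; the alignment is assumed to be strictly increasing ($i_1 < i_2 < \ldots < i_m$); and pairs never observed in training receive a small floor probability. Syllables are `(initial, final)` tuples, posterior tables are keyed `(true_unit, recognized_unit)`, and all names are hypothetical.

```python
import math


def log_post(post, true_unit, rec_unit, floor=1e-6):
    """Log posterior with a floor for pairs that were never observed."""
    return math.log(max(post.get((true_unit, rec_unit), 0.0), floor))


def log_confusion(segment, template, post_c, post_v):
    """Maximum total log posterior over strictly increasing alignments."""
    m, n = len(segment), len(template)
    if m == 0 or m > n:
        return -math.inf
    neg = -math.inf
    # best[k][j]: best score with segment[:k+1] aligned and segment[k] -> template[j]
    best = [[neg] * n for _ in range(m)]
    for k in range(m):
        running = neg  # max of best[k-1][j'] over j' < j
        for j in range(n):
            local = (log_post(post_c, template[j][0], segment[k][0])
                     + log_post(post_v, template[j][1], segment[k][1]))
            if k == 0:
                best[k][j] = local
            elif running > neg:
                best[k][j] = running + local
            if k > 0:
                running = max(running, best[k - 1][j])
    return max(best[m - 1])


post_c = {("b", "b"): 1.0, ("d", "d"): 1.0, ("zh", "z"): 0.25}
post_v = {("a", "a"): 1.0, ("i", "i"): 1.0, ("ang", "ang"): 1.0}
segment = [("b", "a"), ("d", "i")]
template = [("b", "a"), ("x", "o"), ("d", "i")]
print(log_confusion(segment, template, post_c, post_v))
```

Working in the log domain turns the product of small probabilities into a sum, which avoids floating-point underflow on longer segments and makes the recurrence additive.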
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNA2005101274476A (CN1979638A) | 2005-12-02 | 2005-12-02 | Method for correcting error of voice identification result |
| Publication Number | Publication Date |
|---|---|
| CN1979638A | 2007-06-13 |
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C12 | Rejection of a patent application after its publication | |
| | RJ01 | Rejection of invention patent application after publication | Open date: 2007-06-13 |