CN116912832A - Error correction method, error correction device, equipment and medium for image character recognition - Google Patents

Error correction method, error correction device, equipment and medium for image character recognition

Info

Publication number
CN116912832A
Authority
CN
China
Prior art keywords: character, corrected, text, candidate, characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310792032.9A
Other languages
Chinese (zh)
Inventor
周俊
李学勇
何海清
姜超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bohai Bank Co ltd
Original Assignee
Bohai Bank Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bohai Bank Co ltd
Priority to CN202310792032.9A
Publication of CN116912832A
Legal status: Pending (Current)

Abstract

Translated from Chinese

This application provides an error correction method, error correction device, equipment, and medium for image text recognition. The method includes: extracting the text to be corrected from an original text image and inputting it into a statistical language model to determine multiple characters to be corrected in the text and multiple candidate characters corresponding to each character to be corrected; for each character to be corrected, inputting the character similarity features between that character and each of its candidate characters into a pre-trained replaceability probability prediction model to determine the replaceability probability of each candidate character; and, based on the replaceability probabilities, determining from the candidate characters a target character to replace the character to be corrected and correcting the character accordingly to obtain the corrected text. The method and device improve the accuracy of image text recognition and the efficiency of correcting erroneous text.

Description

Translated from Chinese
An error correction method, error correction device, equipment, and medium for image text recognition

Technical Field

The present application relates to the field of text processing technology and, in particular, to an error correction method, error correction device, equipment, and medium for image text recognition.

Background

Banks generate a large amount of image data in the course of long-term business processing, and in daily operations many scenarios involve large volumes of data that must be entered manually. For example, data acquisition for financial statements is still largely based on manual entry, which is inefficient and prevents approval automation. The traditional manual approach to obtaining financial data has seriously hindered banks' operational efficiency and business development, yet financial report data is core data in the bank approval process; only by improving the collection of financial statement data can a bank's operational and approval efficiency be substantially increased. High-precision, stable, and reliable image text recognition technology that replaces traditional manual entry can effectively improve the automated processing capability of banking business.

The most important metric for an image text recognition system is accuracy: whether the recognized text matches the actual text in the image, with any mismatch counting as an error. Too many errors make the system unreliable and instead increase labor costs and operational risk. How to improve the accuracy of image text recognition has therefore become a computational problem that cannot be underestimated.

发明内容Contents of the invention

有鉴于此,本申请的目的在于提供一种图像文字识别的纠错方法、纠错装置、设备及介质,提高了图像文字识别的准确率,同时提高了对错误文字进行纠错的效率。In view of this, the purpose of this application is to provide an error correction method, error correction device, equipment and medium for image character recognition, which improves the accuracy of image character recognition and improves the efficiency of correcting erroneous characters.

第一方面,本申请实施例提供了一种图像文字识别的纠错方法,所述纠错方法包括:In a first aspect, embodiments of the present application provide an error correction method for image text recognition. The error correction method includes:

从原始文字图像中提取待纠错文本,并将所述待纠错文本输入到统计语言模型中,确定出所述待纠错文本中的多个待纠错字符以及每个待纠错字符对应的多个候选字符;Extract the text to be corrected from the original text image, input the text to be corrected into the statistical language model, and determine the multiple characters to be corrected in the text to be corrected and the corresponding characters for each character to be corrected. multiple candidate characters;

针对于每个待纠错字符,将该待纠错字符与该待纠错字符对应的每个候选字符之间的字符相似度特征输入到预先训练好的可替换概率预测模型中,确定出每个候选字符的可替换概率;For each character to be corrected, the character similarity features between the character to be corrected and each candidate character corresponding to the character to be corrected are input into the pre-trained replaceable probability prediction model, and each character is determined. The replaceability probability of candidate characters;

基于每个候选字符的可替换概率从多个所述候选字符中确定出替换该待纠错字符的目标字符,并基于所述目标字符对该待纠错字符进行纠错,以得到纠错后文本。Based on the replaceability probability of each candidate character, a target character that replaces the character to be corrected is determined from a plurality of candidate characters, and the character to be corrected is corrected based on the target character to obtain the corrected result. text.
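The three claimed steps can be sketched as a short pipeline. This is a minimal illustration under assumed toy stand-ins: the dict-based feature table and the averaging "model" below are not the patent's statistical language model or trained predictor.

```python
# Minimal sketch of the claimed pipeline. All helper names and the toy
# "model" are illustrative assumptions, not the patent's components.

def replace_probability(features):
    """Stand-in for the trained replaceability probability model."""
    return sum(features) / len(features)

def correct(text, suspect_positions, feature_table):
    """For each suspect position, replace the character with the candidate
    that has the highest replaceability probability."""
    out = list(text)
    for pos, candidates in suspect_positions.items():
        best = max(candidates,
                   key=lambda c: replace_probability(feature_table[(text[pos], c)]))
        out[pos] = best
    return "".join(out)

# toy example: the "m" in OCR output "bamk" is suspect
feats = {("m", "n"): [0.9, 0.8], ("m", "r"): [0.3, 0.2]}
print(correct("bamk", {2: ["n", "r"]}, feats))  # bank
```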

Further, extracting the text to be corrected from the original text image includes:

performing text recognition on the original text image to obtain a first recognized text corresponding to the original text image and a confidence level corresponding to the first recognized text;

when the confidence level corresponding to the first recognized text is lower than a confidence threshold, performing image super-resolution reconstruction on the original text image to obtain a target text image;

performing text recognition on the target text image to obtain a second recognized text corresponding to the target text image and a confidence level corresponding to the second recognized text, and, when the confidence level corresponding to the second recognized text is lower than the confidence threshold, determining the second recognized text to be the text to be corrected.
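The confidence-gated retry described in these three sub-steps can be sketched as follows. The OCR engine and super-resolution step are illustrative stubs, and the 0.9 threshold is an assumed value, since the patent does not fix one:

```python
# Sketch of the confidence-gated extraction flow. The OCR and
# super-resolution callables are stand-ins for a real OCR engine and a
# trained super-resolution model.

CONFIDENCE_THRESHOLD = 0.9  # assumed value; the patent does not fix one

def extract_text_to_correct(image, ocr, super_resolve):
    """Return (text, needs_correction) following the three sub-steps."""
    first_text, first_conf = ocr(image)
    if first_conf >= CONFIDENCE_THRESHOLD:
        return first_text, False               # confident: no correction pass
    target_image = super_resolve(image)        # retry on the enhanced image
    second_text, second_conf = ocr(target_image)
    if second_conf >= CONFIDENCE_THRESHOLD:
        return second_text, False
    return second_text, True                   # still low confidence: correct it

# toy stubs: both recognition passes stay below the threshold
readings = iter([("bamk", 0.5), ("bamk", 0.7)])
text, needs_fix = extract_text_to_correct(
    object(), lambda img: next(readings), lambda img: img)
print(text, needs_fix)  # bamk True
```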

Further, inputting the text to be corrected into the statistical language model and determining the multiple characters to be corrected in the text to be corrected and the multiple candidate characters corresponding to each character to be corrected includes:

inputting the text to be corrected into the statistical language model to obtain the occurrence probability corresponding to each word in the text to be corrected;

for each word whose occurrence probability is less than or equal to a probability threshold, determining multiple candidate words corresponding to the word;

for each of the multiple candidate words corresponding to the word, replacing the word in the text to be corrected with the candidate word to obtain a replaced text, and determining the semantic perplexity corresponding to the replaced text;

when the semantic perplexity is less than or equal to a perplexity threshold, determining the characters in the word to be the characters to be corrected, and determining the characters in the candidate word at the same positions as the characters to be corrected to be the candidate characters corresponding to the characters to be corrected.
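A self-contained sketch of this candidate-generation logic, with toy word probabilities, candidate lists, and perplexity scores standing in for what an actual N-gram model would supply (the threshold values are assumptions):

```python
# Candidate generation: flag low-probability words, try candidate words,
# keep only replacements whose perplexity passes the threshold, and record
# the differing characters as per-position candidate characters.

PROB_THRESHOLD = 0.1        # assumed value
PERPLEXITY_THRESHOLD = 50.0  # assumed value

def find_char_candidates(words, word_prob, candidates_for, perplexity):
    """Map (word_index, char_index) -> candidate replacement characters."""
    result = {}
    for i, word in enumerate(words):
        if word_prob(word) > PROB_THRESHOLD:
            continue                           # word looks fine
        for cand in candidates_for(word):
            replaced = words[:i] + [cand] + words[i + 1:]
            if perplexity(replaced) <= PERPLEXITY_THRESHOLD:
                for j, (orig_ch, cand_ch) in enumerate(zip(word, cand)):
                    if orig_ch != cand_ch:
                        result.setdefault((i, j), []).append(cand_ch)
    return result

# toy example: "sat" in "cat sat" is suspicious, and "sit" scores well
probs = {"cat": 0.8, "sat": 0.05}
cands = {"sat": ["sit", "set"]}
ppl = {("cat", "sit"): 20.0, ("cat", "set"): 90.0}
found = find_char_candidates(
    ["cat", "sat"], probs.get, lambda w: cands.get(w, []),
    lambda ws: ppl[tuple(ws)])
print(found)  # {(1, 1): ['i']}
```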

Further, the character similarity features between the character to be corrected and each candidate character corresponding to it are determined through the following steps:

for each candidate character corresponding to the character to be corrected, computing multiple similarity features between the character to be corrected and the candidate character, where the multiple similarity features include at least one of the following: four-code similarity, stroke edit distance, stroke-count similarity, structural similarity, radical similarity, and pinyin similarity;

normalizing the multiple similarity features between the character to be corrected and the candidate character, the semantic perplexity corresponding to the replaced text to which the candidate character belongs, and the confidence level corresponding to the text to be corrected, to obtain the character similarity features between the character to be corrected and the candidate character.
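The feature assembly and normalization step might look like the following sketch. Min-max scaling and the per-feature ranges are assumptions, since the patent says only that the values are normalized:

```python
# Concatenate the raw similarity features with the perplexity and the OCR
# confidence, then scale each value into [0, 1] using an assumed
# (min, max) range per feature.

def build_feature_vector(similarities, perplexity, confidence, ranges):
    raw = list(similarities) + [perplexity, confidence]
    normalized = []
    for value, (lo, hi) in zip(raw, ranges):
        normalized.append((value - lo) / (hi - lo) if hi > lo else 0.0)
    return normalized

# toy example: three similarity features + perplexity + OCR confidence
sims = [0.8, 0.5, 0.9]             # e.g. stroke, radical, pinyin similarity
ranges = [(0, 1), (0, 1), (0, 1), (0, 100), (0, 1)]
vec = build_feature_vector(sims, perplexity=25.0, confidence=0.7, ranges=ranges)
print([round(v, 2) for v in vec])  # [0.8, 0.5, 0.9, 0.25, 0.7]
```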

Further, the replaceability probability prediction model is trained through the following steps:

obtaining a training data set, where the training data in the training data set includes sample similarity features between sample characters to be replaced and candidate sample characters, together with labeled replacement probabilities for replacing each sample character to be replaced with the corresponding candidate sample character;

training an original replaceability probability prediction model based on the training data set to obtain the replaceability probability prediction model.

Further, training the original replaceability probability prediction model based on the training data set to obtain the replaceability probability prediction model includes:

generating a first decision tree based on the training data, where the first decision tree represents the predicted probability corresponding to each candidate sample character in the sample data;

comparing the prediction results of the first decision tree with the labeled replacement probabilities in the training data, and computing the prediction error and loss value of the first decision tree;

generating a second decision tree based on the prediction error of the first decision tree;

comparing the prediction results of the second decision tree with the labeled replacement probabilities in the training data, and computing the prediction error and loss value of the second decision tree;

if the loss value of the second decision tree is greater than a loss threshold, or the number of decision trees in the original replaceability probability prediction model is less than a number threshold, generating the next decision tree based on the prediction error of the second decision tree, continuing until the loss value of the next decision tree is less than the loss threshold or the number of decision trees in the original replaceability probability prediction model equals the number threshold, thereby obtaining the replaceability probability prediction model.
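The tree-on-residuals loop described above is essentially gradient boosting with a squared loss. The sketch below keeps the example self-contained by substituting a trivial "tree" that predicts the mean of its targets; a real implementation would fit regression trees on the sample similarity features:

```python
# Gradient-boosting-style training loop: each new tree fits the prediction
# error (residuals) of the ensemble so far, stopping on the loss threshold
# or the tree-count threshold, as in the claimed procedure.

def mean_tree(targets):
    """Stand-in for a fitted decision tree: always predicts the mean."""
    m = sum(targets) / len(targets)
    return lambda _features: m

def train_boosted_model(features, labels, loss_threshold=1e-6, max_trees=10):
    trees = []
    residuals = list(labels)                   # the first tree fits the labels
    while True:
        tree = mean_tree(residuals)
        preds = [tree(f) for f in features]
        residuals = [r - p for r, p in zip(residuals, preds)]  # prediction error
        loss = sum(r * r for r in residuals) / len(residuals)  # squared loss
        trees.append(tree)
        if loss < loss_threshold or len(trees) >= max_trees:
            return trees

def predict(trees, feature_vector):
    """The ensemble prediction is the sum of all tree outputs."""
    return sum(tree(feature_vector) for tree in trees)

X = [[0.1], [0.2], [0.3]]
y = [0.4, 0.4, 0.4]                            # labeled replacement probabilities
model = train_boosted_model(X, y)
print(round(predict(model, [0.15]), 3))  # 0.4
```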

In a second aspect, embodiments of the present application also provide an error correction device for image text recognition, the error correction device comprising:

an error correction character determination module, configured to extract the text to be corrected from an original text image, input the text to be corrected into a statistical language model, and determine multiple characters to be corrected in the text to be corrected and multiple candidate characters corresponding to each character to be corrected;

a replaceability probability determination module, configured to, for each character to be corrected, input the character similarity features between that character and each of its corresponding candidate characters into a pre-trained replaceability probability prediction model to determine the replaceability probability of each candidate character;

a text error correction module, configured to determine, based on the replaceability probability of each candidate character, a target character from the multiple candidate characters to replace the character to be corrected, and to correct the character to be corrected based on the target character to obtain the corrected text.

Further, when used to extract the text to be corrected from the original text image, the error correction character determination module is further configured to:

perform text recognition on the original text image to obtain a first recognized text corresponding to the original text image and a confidence level corresponding to the first recognized text;

when the confidence level corresponding to the first recognized text is lower than a confidence threshold, perform image super-resolution reconstruction on the original text image to obtain a target text image;

perform text recognition on the target text image to obtain a second recognized text corresponding to the target text image and a confidence level corresponding to the second recognized text, and, when the confidence level corresponding to the second recognized text is lower than the confidence threshold, determine the second recognized text to be the text to be corrected.

Further, when used to input the text to be corrected into the statistical language model and determine the multiple characters to be corrected in the text to be corrected and the multiple candidate characters corresponding to each character to be corrected, the error correction character determination module is further configured to:

input the text to be corrected into the statistical language model to obtain the occurrence probability corresponding to each word in the text to be corrected;

for each word whose occurrence probability is less than or equal to a probability threshold, determine multiple candidate words corresponding to the word;

for each of the multiple candidate words corresponding to the word, replace the word in the text to be corrected with the candidate word to obtain a replaced text, and determine the semantic perplexity corresponding to the replaced text;

when the semantic perplexity is less than or equal to a perplexity threshold, determine the characters in the word to be the characters to be corrected, and determine the characters in the candidate word at the same positions as the characters to be corrected to be the candidate characters corresponding to the characters to be corrected.

Further, the error correction device also includes a character similarity feature determination module, configured to determine the character similarity features between the character to be corrected and each candidate character corresponding to it through the following steps:

for each candidate character corresponding to the character to be corrected, compute multiple similarity features between the character to be corrected and the candidate character, where the multiple similarity features include at least one of the following: four-code similarity, stroke edit distance, stroke-count similarity, structural similarity, radical similarity, and pinyin similarity;

normalize the multiple similarity features between the character to be corrected and the candidate character, the semantic perplexity corresponding to the replaced text to which the candidate character belongs, and the confidence level corresponding to the text to be corrected, to obtain the character similarity features between the character to be corrected and the candidate character.

Further, the error correction device also includes a model training module, configured to train the replaceability probability prediction model through the following steps:

obtain a training data set, where the training data in the training data set includes sample similarity features between sample characters to be replaced and candidate sample characters, together with labeled replacement probabilities for replacing each sample character to be replaced with the corresponding candidate sample character;

train an original replaceability probability prediction model based on the training data set to obtain the replaceability probability prediction model.

Further, when used to train the original replaceability probability prediction model based on the training data set to obtain the replaceability probability prediction model, the model training module is further configured to:

generate a first decision tree based on the training data, where the first decision tree represents the predicted probability corresponding to each candidate sample character in the sample data;

compare the prediction results of the first decision tree with the labeled replacement probabilities in the training data, and compute the prediction error and loss value of the first decision tree;

generate a second decision tree based on the prediction error of the first decision tree;

compare the prediction results of the second decision tree with the labeled replacement probabilities in the training data, and compute the prediction error and loss value of the second decision tree;

if the loss value of the second decision tree is greater than a loss threshold, or the number of decision trees in the original replaceability probability prediction model is less than a number threshold, generate the next decision tree based on the prediction error of the second decision tree, continuing until the loss value of the next decision tree is less than the loss threshold or the number of decision trees in the original replaceability probability prediction model equals the number threshold, thereby obtaining the replaceability probability prediction model.

In a third aspect, embodiments of the present application further provide an electronic device comprising a processor, a memory, and a bus, the memory storing machine-readable instructions executable by the processor. When the electronic device runs, the processor and the memory communicate via the bus, and when the machine-readable instructions are executed by the processor, the steps of the error correction method for image text recognition described above are performed.

In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium storing a computer program which, when run by a processor, performs the steps of the error correction method for image text recognition described above.

In the error correction method, error correction device, equipment, and medium for image text recognition provided by the embodiments of this application, first, the text to be corrected is extracted from an original text image and input into a statistical language model to determine multiple characters to be corrected in the text and multiple candidate characters corresponding to each character to be corrected; then, for each character to be corrected, the character similarity features between that character and each of its corresponding candidate characters are input into a pre-trained replaceability probability prediction model to determine the replaceability probability of each candidate character; finally, based on the replaceability probability of each candidate character, a target character to replace the character to be corrected is determined from the multiple candidate characters, and the character to be corrected is corrected based on the target character to obtain the corrected text.

This application uses a statistical language model to determine candidate characters and takes short texts into account, which improves the accuracy of obtaining candidate characters, works better on short texts, and suits standardized text scenarios such as banking. Based on the character similarity features between the characters to be corrected and the candidate characters, a replaceability probability prediction model is used to judge whether a candidate character can replace a character to be corrected, avoiding the errors introduced by manually set thresholds. The error correction method provided by this application improves the accuracy of image text recognition while also improving the efficiency of correcting erroneous text.

To make the above purposes, features, and advantages of this application clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of this application more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of this application and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.

Figure 1 is a flow chart of an error correction method for image text recognition provided by an embodiment of this application;

Figure 2 is a first structural schematic diagram of an error correction device for image text recognition provided by an embodiment of this application;

Figure 3 is a second structural schematic diagram of an error correction device for image text recognition provided by an embodiment of this application;

Figure 4 is a structural schematic diagram of an electronic device provided by an embodiment of this application.

Detailed Description

To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. The components of the embodiments of this application, as generally described and illustrated in the figures herein, may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments provided in the accompanying drawings is not intended to limit the scope of the claimed application but merely represents selected embodiments. Every other embodiment obtained by those skilled in the art on the basis of these embodiments without creative effort falls within the scope of protection of this application.

First, the application scenarios to which this application is applicable are introduced. This application can be applied to the field of text processing technology.

Banks generate a large amount of image data in the course of long-term business processing, and in daily operations many scenarios involve large volumes of data that must be entered manually. For example, data acquisition for financial statements is still largely based on manual entry, which is inefficient and prevents approval automation. The traditional manual approach to obtaining financial data has seriously hindered banks' operational efficiency and business development, yet financial report data is core data in the bank approval process; only by improving the collection of financial statement data can a bank's operational and approval efficiency be substantially increased. High-precision, stable, and reliable image text recognition technology that replaces traditional manual entry can effectively improve the automated processing capability of banking business.

The most important metric for an image text recognition system is accuracy: whether the recognized text matches the actual text in the image, with any mismatch counting as an error. Too many errors make the system unreliable and instead increase labor costs and operational risk. How to improve the accuracy of image text recognition has therefore become a computational problem that cannot be underestimated.

On this basis, embodiments of this application provide an error correction method for image text recognition to improve the accuracy of image text recognition while improving the efficiency of correcting erroneous text.

Please refer to Figure 1, a flow chart of an error correction method for image text recognition provided by an embodiment of this application. As shown in Figure 1, the error correction method provided by the embodiment of this application includes:

S101: extract the text to be corrected from an original text image, input the text to be corrected into a statistical language model, and determine multiple characters to be corrected in the text to be corrected and multiple candidate characters corresponding to each character to be corrected.

It should be noted that the original text image refers to an image containing text, such as a document, an ID card, or an advertisement; this application does not specifically limit it. The text to be corrected is the text, obtained by performing text recognition on the original text image, that contains incorrectly recognized characters. The characters to be corrected are the characters in the text to be corrected that need to be corrected. The candidate characters are characters predicted by the statistical language model that can replace the characters to be corrected so as to correct the text to be corrected. The statistical language model may be an N-gram language model, which may be a language model trained on a large amount of unsupervised corpus. The N-gram language model is an algorithm based on statistical language modeling; in essence, it uses probability to judge whether a given piece of text is plausible. If the probability is high, the text is normal and conforms to human usage; if the probability is low, the text is abnormal and does not conform to human usage.
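The scoring idea can be illustrated with a tiny bigram (N = 2) model over characters. The toy corpus and add-epsilon smoothing below are assumptions made for the sake of a runnable example, not the patent's actual model:

```python
# Bigram language model sketch: score a string by the product of
# conditional probabilities of each character given the previous one,
# estimated from a tiny toy corpus.

from collections import Counter

def train_bigram(corpus):
    pairs = Counter()
    unigrams = Counter()
    for sent in corpus:
        padded = "^" + sent                    # "^" marks the sentence start
        for a, b in zip(padded, padded[1:]):
            pairs[(a, b)] += 1
            unigrams[a] += 1
    return pairs, unigrams

def score(text, pairs, unigrams, smoothing=1e-6):
    """Higher score = more plausible; unseen pairs get a tiny probability."""
    prob = 1.0
    padded = "^" + text
    for a, b in zip(padded, padded[1:]):
        prob *= (pairs[(a, b)] + smoothing) / (unigrams[a] + smoothing)
    return prob

pairs, unigrams = train_bigram(["abc", "abd", "abc"])
assert score("abc", pairs, unigrams) > score("axc", pairs, unigrams)
```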

For the above step S101, in a specific implementation, an original text image containing text is first obtained, and the original text image is recognized to extract the text to be corrected. The text to be corrected is then input into the statistical language model to determine multiple characters to be corrected in the text to be corrected and multiple candidate characters corresponding to each character to be corrected.

As an optional implementation, for the above step S101, extracting the text to be corrected from the original text image may include the following steps 1 to 3:

Step 1: Perform text recognition on the original text image to obtain the first recognized text corresponding to the original text image and the confidence corresponding to the first recognized text.

In this step, in a specific implementation, text recognition is performed on the original text image to obtain the first recognized text carried in the original text image and the confidence corresponding to the first recognized text.

It should be noted that confidence refers to the degree to which a specific individual believes in the truth of a specific proposition.

Here, text recognition may be performed on the original text image through an optical character recognition (OCR) system; this part is not limited. OCR text recognition refers to the process in which an electronic device (such as a scanner or digital camera) examines characters printed on paper and translates their shapes into computer text using character recognition methods; that is, the process of scanning text material and then analyzing and processing the image file to obtain text and layout information.

Step 2: When the confidence corresponding to the first recognized text is lower than a confidence threshold, perform image super-resolution reconstruction on the original text image to obtain a target text image.

It should be noted that the confidence threshold refers to a preset threshold used to determine whether image super-resolution reconstruction of the original text image is required.

In this step, after the first recognized text is obtained, the confidence corresponding to the first recognized text is determined. The specific method of determining text recognition confidence is described in detail in the prior art and is not repeated here. The confidence corresponding to the first recognized text is then compared with the preset confidence threshold; when the confidence corresponding to the first recognized text is judged to be lower than the confidence threshold, image super-resolution reconstruction is performed on the original text image to obtain the target text image.

Here, in current applied research on deep learning, super-resolution technology can improve the quality of low-quality images; that is, it uses signal-processing methods to recover high-frequency information from a given low-frequency image, obtaining images with a resolution higher than that of the imaging system without changing the hardware. Common methods include SRGAN and ESRGAN. This technology improves image quality through super-resolution, thereby increasing the clarity of low-quality text images, and can assist the recognition of low-quality text.

In image text recognition, many recognition errors are caused by low image quality. In theory, improving image quality through super-resolution reconstruction can reduce the error rate of text recognition. However, single-frame super-resolution techniques are currently rarely used in this field, because super-resolution reconstruction increases resource consumption and the time spent on text recognition.

The image super-resolution reconstruction method used in this application is TextSR, which is based on SRGAN and generates high-resolution images through a generator and a discriminator. TextSR, short for Text Super Resolution, adds a text recognition discriminator as a loss on top of SRGAN; introducing this loss improves the super-resolution reconstruction of text in images. To make the generator produce better text textures, the ASTER text recognition network and a text-perception loss function are also introduced, ultimately achieving a better image super-resolution reconstruction effect.

With the above method, high-frequency text edge information can be recovered from a given low-frequency image, so that images with a resolution higher than that of the imaging system are obtained without changing the hardware, improving the accuracy of image text recognition. Moreover, this application performs super-resolution reconstruction only on text images whose text recognition confidence is lower than the threshold; the number of images that need reconstruction is small and their size is small, so the overall time cost is low.

Step 3: Perform text recognition on the target text image to obtain the second recognized text corresponding to the target text image and the confidence corresponding to the second recognized text; when the confidence corresponding to the second recognized text is lower than the confidence threshold, determine the second recognized text as the text to be corrected.

In this step, after the target text image is obtained through image super-resolution reconstruction, text recognition is performed on the target text image to obtain the second recognized text corresponding to the target text image. The confidence corresponding to the second recognized text is then determined and compared with the preset confidence threshold; when the confidence corresponding to the second recognized text is judged to be lower than the confidence threshold, the second recognized text is determined as the text to be corrected.

This application introduces a second round of image text recognition after image super-resolution reconstruction, and performs super-resolution reconstruction only on original text images whose first-round recognition confidence is lower than the threshold; the number of images to be reconstructed is small and their size is small, so reconstruction takes little time. Compared with directly correcting the first-round recognition result, super-resolution reconstruction reduces the number of characters to be corrected, and overall the time performance is better.
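The three steps above can be sketched as a small control-flow routine. Note that `run_ocr`, `super_resolve`, and the 0.9 threshold below are hypothetical stand-ins: the application fixes neither the OCR engine, nor the TextSR interface, nor the threshold value.

```python
# Sketch of steps 1-3: recognize, optionally super-resolve and re-recognize,
# and flag still-low-confidence text for error correction. `run_ocr` and
# `super_resolve` are hypothetical stand-ins for the real OCR engine and
# the TextSR reconstructor.
CONF_THRESHOLD = 0.9  # example value; the application leaves this unspecified

def extract_text_to_correct(image, run_ocr, super_resolve):
    text1, conf1 = run_ocr(image)          # step 1: first recognition
    if conf1 >= CONF_THRESHOLD:
        return text1, False                # confident enough, no correction needed
    target = super_resolve(image)          # step 2: super-resolution reconstruction
    text2, conf2 = run_ocr(target)         # step 3: second recognition
    if conf2 >= CONF_THRESHOLD:
        return text2, False
    return text2, True                     # still low confidence: send to correction

# Tiny demonstration with stubbed components.
fake_sr = lambda img: img + "_sr"
results = {"img": ("毎天", 0.5), "img_sr": ("每天", 0.6)}
fake_ocr = lambda img: results[img]
text, needs_fix = extract_text_to_correct("img", fake_ocr, fake_sr)
```

The stubbed run returns the second-pass text with `needs_fix` set, mirroring how only the low-confidence branch reaches the downstream correction model.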

As an optional implementation, for the above step S101, inputting the text to be corrected into the statistical language model and determining multiple characters to be corrected in the text to be corrected and multiple candidate characters corresponding to each character to be corrected may include the following steps 1011 to 1014:

Step 1011: Input the text to be corrected into the statistical language model to obtain the occurrence probability corresponding to each word in the text to be corrected.

For the above step 1011, in a specific implementation, the text to be corrected is input into the statistical language model, which first segments the text into words; the statistical language model can then determine the occurrence probability corresponding to each word in the text to be corrected.

Here, from a mathematical point of view, a statistical language model is a probability distribution over word sequences. For a given sequence of length n, it assigns a probability P(w1, w2, ..., wn) to the whole sequence, where wn denotes the n-th word after the sentence is segmented. To compute the probability of this text, the above expression is transformed by the chain rule into the following formula:

P(w1, w2, ..., wn) = P(w1) P(w2|w1) ... P(wn|w1, ..., wn-1)

For example, take the sentence "我今天很开心" ("I am very happy today"); after segmentation it becomes "我/今天/很/开心", and its probability is computed as P(我) * P(今天|我) * P(很|我,今天) * P(开心|我,今天,很).

The specific computation is based entirely on statistical counting. Taking P(今天|我) as an example, it is computed by the following formula:

P(今天|我) = freq(我, 今天) / freq(我)

Here, freq can be estimated as the number of occurrences divided by the total word count. Because statistics are involved, to ensure good statistical significance, a widely applicable lexicon needs to be built from as large a corpus as possible that matches the business scenario, so as to improve the accuracy of the computation.

Under long-text conditions, however, distant words tend to be weakly correlated, which makes this computation complex and sparse, and therefore inefficient.

To solve this problem, the n-gram model introduces the Markov assumption: the probability of the current word depends only on the previous n-1 words. Here n is the n in "n-gram", and a gram is a word of the sentence; for Chinese text, these are the words obtained after segmentation. The final optimized results are shown in the following formulas:

n=1 (unigram): P(w1, w2, ..., wn) ≈ P(w1) P(w2) ... P(wn)

n=2 (bigram): P(w1, w2, ..., wn) ≈ P(w1) P(w2|w1) ... P(wn|wn-1)

n=3 (trigram): P(w1, w2, ..., wn) ≈ P(w1) P(w2|w1) P(w3|w1,w2) ... P(wn|wn-2,wn-1)

With a trained 3-gram model, this application can obtain the occurrence probability corresponding to each word in the text to be corrected.
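The count-based estimation described above can be illustrated with a toy segmented corpus; the corpus and the bigram case below are illustrative only, not the trained 3-gram model of the application.

```python
# Minimal sketch of the statistical estimate behind the n-gram model:
# P(w_i | context) = freq(context, w_i) / freq(context), with counts taken
# from a (here, toy) segmented corpus. A production model would be trained
# on a large unsupervised corpus, as the text describes.
from collections import Counter

corpus = [["我", "今天", "很", "开心"],
          ["我", "今天", "去", "上班"],
          ["我", "明天", "很", "忙"]]

unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter((s[i], s[i + 1]) for s in corpus for i in range(len(s) - 1))

def p_bigram(prev, word):
    """MLE estimate of P(word | prev); 0.0 if the context was never seen."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]
```

In this corpus "我" occurs three times and the pair ("我", "今天") twice, so `p_bigram("我", "今天")` evaluates to 2/3.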

Step 1012: For each word whose occurrence probability is less than or equal to a probability threshold, determine multiple candidate words corresponding to the word.

It should be noted that candidate words are the words recalled for each word. The probability threshold is a preset threshold used to determine whether a word needs to be replaced.

For the above step 1012, in a specific implementation, after the occurrence probability of each word in the text to be corrected is obtained, it is compared with the preset probability threshold. For each word whose occurrence probability is less than or equal to the probability threshold, multiple candidate words corresponding to the word are determined. Here, a similar-pinyin strategy can be used to determine the candidate words. For example, when a word is "每天" ("every day"), the candidate words obtained with a similar-pinyin strategy may be "煤田" ("coalfield"), "梅田" ("Umeda"), "没填" ("not filled in"), and so on. Other strategies, such as a similar-structure strategy, may also be used to determine candidate words; this application does not specifically limit this.
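A minimal sketch of the similar-pinyin recall strategy, with a tiny hypothetical pinyin lexicon standing in for a real dictionary:

```python
# Recall candidates for a suspicious word by matching its (toneless) pinyin
# against a lexicon. The PINYIN table below is a hypothetical stand-in; a
# real system would cover the full vocabulary and could also use fuzzy
# pinyin matching rather than exact equality.
PINYIN = {
    "每天": "meitian", "煤田": "meitian", "梅田": "meitian",
    "没填": "meitian", "明天": "mingtian",
}

def recall_candidates(word, lexicon=PINYIN):
    """Return the other lexicon words that share this word's pinyin."""
    target = lexicon.get(word)
    if target is None:
        return []
    return [w for w, py in lexicon.items() if py == target and w != word]
```

For "每天" this recalls exactly the three candidates named in the example above.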

Step 1013: For each of the multiple candidate words corresponding to the word, replace the word in the text to be corrected with the candidate word to obtain a replaced text, and determine the semantic perplexity corresponding to the replaced text.

Step 1014: When the semantic perplexity is less than or equal to a perplexity threshold, determine the characters in the word as characters to be corrected, and determine the characters in the candidate word at the same positions as the characters to be corrected as the candidate characters corresponding to the characters to be corrected.

The perplexity threshold can be set according to the actual situation, for example, 0.5; this application does not specifically limit it.

Here, semantic perplexity is a metric for evaluating statistical language models. The basic idea of perplexity is that a language model assigning a higher probability to the sentences in a test set (the sentences to be predicted later) is a better model; once the language model has been trained, the sentences in the test set are all normal sentences, so the higher the probability the trained model assigns on the test set, the better. The perplexity formula is as follows:

PPL(W) = P(w1, w2, ..., wN)^(-1/N)

where N is the number of words in the segmented sentence W.

The N-gram language model can be used to evaluate whether a sentence is reasonable, that is, whether it is fluent. Language models usually measure fluency with semantic perplexity: fluent sentences tend to have high sentence probability and low perplexity, while non-fluent sentences tend to have low sentence probability and high perplexity. Therefore, an N-gram language model trained on a large amount of unsupervised corpus can compute the sentence PPL to measure whether a sentence is fluent.

For the above steps 1013 and 1014, in a specific implementation, after the multiple candidate words corresponding to the word are determined in step 1012, each candidate word is used to replace the word in the text to be corrected to obtain a replaced text, and the semantic perplexity corresponding to the replaced text is determined. The smaller the perplexity of the replaced text, the more fluent it is. Therefore, when the perplexity of the replaced text is less than or equal to the preset perplexity threshold, the characters in the word are determined as characters to be corrected, and the characters in the candidate word at the same positions are determined as the corresponding candidate characters. This application introduces the n-gram model to generate candidate characters and produces a perplexity feature for measuring semantic plausibility, which avoids the drawback of poor correction on short texts and achieves better time performance.
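Steps 1013 and 1014 can be sketched as follows; `ppl` is a stub standing in for the trained n-gram model's perplexity computation, and the threshold and words are illustrative.

```python
# Replace the suspicious word with each candidate, score the replaced
# sentence with a (stubbed) perplexity function, and when the perplexity is
# low enough, pair up same-position characters as
# (char_to_correct, candidate_char) pairs -- steps 1013-1014.
def candidate_chars(words, idx, candidates, ppl, ppl_threshold):
    pairs = []
    bad_word = words[idx]
    for cand in candidates:
        replaced = words[:idx] + [cand] + words[idx + 1:]
        if ppl(replaced) <= ppl_threshold:
            # characters at the same positions become (to-correct, candidate) pairs
            pairs.extend(zip(bad_word, cand))
    return pairs

# Toy perplexity: pretend only the sentence containing "煤田" is fluent.
toy_ppl = lambda ws: 0.3 if "煤田" in ws else 5.0
pairs = candidate_chars(["开采", "每天"], 1, ["煤田", "没填"], toy_ppl, 0.5)
```

With the toy scorer, only "煤田" passes the threshold, yielding the character pairs (每, 煤) and (天, 田).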

S102: For each character to be corrected, input the character similarity features between the character to be corrected and each candidate character corresponding to it into a pre-trained replaceability probability prediction model, and determine the replaceability probability of each candidate character.

It should be noted that the replaceability probability prediction model is a pre-trained model that predicts the replaceability probability of candidate characters. Here, the replaceability probability prediction model may be a GBDT model. GBDT stands for Gradient Boosting Decision Tree. GBDT is essentially a regression-style decision tree ensemble whose core idea is to accumulate the results of all trees as the final result. Through multiple rounds of iteration, each round produces a weak classifier (by default, usually a CART tree). A CART (Classification And Regression Tree) is a decision tree method. It uses binary recursive partitioning, where splits are based on the Gini index estimated with minimum distance; the current sample set is split into two subsets so that each generated non-leaf node has exactly two branches, yielding a structurally simple binary tree suitable as a weak classifier in GBDT. Each classifier is trained on the residuals of the previous round; each tree learns the residual between the true value and the sum of the conclusions of all previous trees, and this residual, the amount that must be added to the prediction to obtain the true value, can be understood as a gradient.
The reason this application uses the GBDT model is that it can realize multi-order feature combination and selection, the algorithm is mature, training is simple, and interpretability is strong. By collectively optimizing along the gradient, a suitable tree model is finally obtained. The replaceability probability refers to the probability that a candidate character can replace the character to be corrected.

For the above step S102, in a specific implementation, after the characters to be corrected and their corresponding candidate characters are determined, for each character to be corrected, the character similarity features between it and each of its candidate characters are input into the pre-trained replaceability probability prediction model to determine the replaceability probability of each candidate character.

As an optional implementation, for the above step S102, the character similarity features between the character to be corrected and each candidate character corresponding to it are determined through the following steps 1021 to 1022:

Step 1021: For each candidate character corresponding to the character to be corrected, compute multiple similarity features between the character to be corrected and the candidate character.

Here, the multiple similarity features include at least one of the following: four-corner-code similarity, stroke edit distance, stroke-count similarity, structure similarity, radical similarity, and pinyin similarity.

For the above step 1021, in a specific implementation, for each candidate character corresponding to the character to be corrected, multiple similarity features between the character to be corrected and the candidate character are computed. Specifically, the similarity features are computed as follows:

(1) Four-corner-code similarity:

The four-corner code is one of the character-lookup methods commonly used in Chinese dictionaries; it classifies Chinese characters using up to five Arabic numerals. Please refer to Table 1 below, a four-corner-code value rule table provided by an embodiment of the present application.

Table 1: Four-corner-code value rules

According to an embodiment of the present application, in a specific implementation, the four-corner codes of about 5,000 commonly used Chinese characters are built into a local dictionary; the dictionary is queried directly to obtain the four-corner codes of the character to be corrected and the candidate character, and the edit distance between the two codes is computed as the four-corner-code similarity.
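The lookup-plus-edit-distance computation can be sketched as follows; the two four-corner dictionary entries are illustrative examples, not the real 5,000-character dictionary.

```python
# Look up each character's four-corner code in a local dictionary and take
# the Levenshtein edit distance between the two codes. The FOUR_CORNER
# entries below are hypothetical examples for demonstration only.
def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

FOUR_CORNER = {"王": "10104", "玉": "10103"}  # hypothetical dictionary entries

def four_corner_distance(c1, c2, table=FOUR_CORNER):
    return edit_distance(table[c1], table[c2])
```

The same `edit_distance` routine also serves the stroke edit distance and the pinyin similarity described later, since all three features are edit distances over short code strings.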

(2) Stroke edit distance and stroke-count similarity:

There are 32 kinds of strokes in Chinese; the traditional basic strokes include the dot, horizontal, vertical, left-falling, right-falling, rising, turning, and hook strokes. Please refer to Table 2 below, a base-32 encoding table of Chinese character strokes provided by an embodiment of the present application; this application represents each stroke with one base-32 digit, as shown in Table 2:

Table 2: Base-32 encoding of Chinese character strokes

According to an embodiment of the present application, in a specific implementation, the base-32 stroke encodings of about 5,000 commonly used Chinese characters are built into a local dictionary; the dictionary is queried directly to obtain the stroke encodings of the character to be corrected and the candidate character, and the edit distance between the two encodings is computed as the stroke edit distance. In addition, the length of the encoding equals the number of strokes of the character, i.e., the stroke-count feature; if the stroke counts are the same, the stroke-count similarity is 1, otherwise it is 0.

(3) Structure similarity

Chinese characters are divided into single-component characters and compound characters. Compound characters can be split dichotomously into four major categories: left-right, top-bottom, enclosing, and nested structures. Each component of each category is in turn either an independent component or one of these four structures, so the structure of a character can be obtained by recursive splitting. Chinese character structures are ultimately classified into the following 12 categories. Please refer to Table 3 below, a table of Chinese character structures and examples provided by an embodiment of the present application; the specific structures and examples are shown in Table 3:

Table 3: Chinese character structures and examples

Structure name              | Examples
----------------------------|--------------------
none (single-component)     | 一、丁、了、七、儿
left-right                  | 怪、哎、炮、柏、海
top-bottom                  | 否、热、吕、哲、盐
upper-left enclosure        | 石、左、右、有、尼
lower-left enclosure        | 进、迫、辽、赵、飓
upper-right enclosure       | 匀、司、或、式、戎
top three-side enclosure    | 风、向、问、闪、冈
left three-side enclosure   | 医、匠、匝、区、匹
bottom three-side enclosure | 凶、函、幽、画、凼
nested                      | 团、国、困、围、圆
same-component repetition   | 瞐、众、淼、垚、品
other                       | 巫、果、爽、市、夷

According to an embodiment of the present application, in a specific implementation, the structures of about 5,000 commonly used Chinese characters are built into a local dictionary; the dictionary is queried directly to obtain the structures of the character to be corrected and the candidate character. If the structures are the same, the structure similarity is 1; if they differ, it is 0.

(4) Radical similarity and pinyin similarity

These two similarities are easy to understand: they concern the radical and the pinyin of a Chinese character. This solution builds a local dictionary from the radicals (left empty if a character has no radical) and pinyin of about 5,000 commonly used Chinese characters, and queries the dictionary directly to obtain the radicals and pinyin of the character to be corrected and the candidate character. If the radicals are the same, the radical similarity is 1; otherwise it is 0.

Please refer to Table 4 below, an example table of pinyin similarity provided by an embodiment of the present application. For pinyin similarity, this application uses the edit distance, for example between 阿 (a), 安 (an), 昂 (ang), 方 (fang), and 范 (fan); the computed pinyin similarities are shown in Table 4:

Table 4: Pinyin similarity example

         | 阿 (a) | 安 (an) | 昂 (ang) | 方 (fang) | 范 (fan)
---------|--------|---------|----------|-----------|---------
阿 (a)   |   0    |    1    |    2     |     3     |    2
安 (an)  |   1    |    0    |    1     |     2     |    1
昂 (ang) |   2    |    1    |    0     |     1     |    2
方 (fang)|   3    |    2    |    1     |     0     |    1
范 (fan) |   2    |    1    |    2     |     1     |    0

According to an embodiment of the present application, in a specific implementation, the pinyin similarity between the character to be corrected and a candidate character can be determined directly from the pinyin of the two characters.

According to an embodiment of the present application, this application introduces six character similarity features and uses them jointly to judge how close the character to be corrected and a candidate character are. These six features can be computed in parallel, and all can be read directly from local dictionaries followed by simple computation, so the time performance is good. With sufficient computing resources, the total time can be kept below 100 ms.

Step 1022: Normalize the multiple similarity features between the character to be corrected and the candidate character, the semantic perplexity corresponding to the replaced text to which the candidate character belongs, and the confidence corresponding to the text to be corrected, so as to obtain the character similarity features between the character to be corrected and the candidate character.

For the above step 1022, in a specific implementation, the multiple similarity features between the character to be corrected and the candidate character obtained in step 1021, the semantic perplexity corresponding to the replaced text to which the candidate character belongs, and the confidence corresponding to the text to be corrected are normalized, so as to obtain the character similarity features between the character to be corrected and the candidate character.
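A sketch of this normalization step, assuming min-max scaling over the candidate set; the application does not specify which normalization method is used, so this is one common, assumed choice.

```python
# Concatenate the six similarity features with the perplexity of the
# candidate's replaced text and the OCR confidence, then min-max normalize
# each feature column over all candidates so the GBDT input is on a
# comparable scale. (Min-max scaling is an assumed choice here.)
def min_max_normalize(rows):
    """rows: list of equal-length feature vectors -> normalized copies."""
    cols = list(zip(*rows))
    spans = [(min(c), max(c) - min(c)) for c in cols]
    return [[(v - lo) / rng if rng else 0.0
             for v, (lo, rng) in zip(row, spans)]
            for row in rows]

# Each row: [four_corner, stroke_edit, stroke_cnt, structure, radical,
#            pinyin, perplexity, ocr_confidence] for one candidate.
feats = [[1, 2, 1, 1, 0, 1, 0.3, 0.6],
         [4, 7, 0, 0, 0, 3, 4.0, 0.6]]
norm = min_max_normalize(feats)
```

Constant columns (here the radical similarity and the shared OCR confidence) are mapped to 0.0 rather than dividing by a zero range.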

As an optional implementation, for the above step S102, the replaceability probability prediction model is trained through the following steps (1) to (2):

Step (1): Obtain a training data set.

Step (2): Train the original replaceability probability prediction model on the training data set to obtain the replaceability probability prediction model.

It should be noted that the training data set refers to the data set used to train the original replaceability probability prediction model. Specifically, each training sample in the training data set includes the sample similarity features between a sample character to be replaced and a candidate sample character, as well as the labeled replacement probability of replacing that sample character with that candidate. The original replaceability probability prediction model refers to a pre-built original GBDT model used to predict the replaceability probability of candidate characters.

For the above steps (1) and (2), in a specific implementation, a training data set is obtained, and the original replaceability probability prediction model is trained on the training data set to obtain the replaceability probability prediction model.

As an optional implementation, for the above step (2), training the original replaceability probability prediction model on the training data set to obtain the replaceability probability prediction model may include the following steps I to V:

步骤I:基于训练数据生成第一决策树。Step I: Generate the first decision tree based on the training data.

其中,第一决策树代表训练数据中每个候选样本字符对应的预测概率。Among them, the first decision tree represents the predicted probability corresponding to each candidate sample character in the training data.

需要说明的是,梯度提升决策树模型是以决策树为基分类器按顺序建立多棵决策树,每棵树学习拟合之前所有树构成的模型的负梯度,从而不断逼近最优分类。第一决策树则是原始可替换概率预测模型根据输入的训练数据所生成的决策树。这里,第一决策树的因变量是待替换字符样本与候选样本字符之间的样本相似度特征,自变量是预测概率,第一决策树代表训练数据中每个候选样本字符对应的预测概率。It should be noted that the gradient boosting decision tree model uses the decision tree as the base classifier to build multiple decision trees in sequence. Each tree learns to fit the negative gradient of the model composed of all previous trees, thereby continuously approaching the optimal classification. The first decision tree is a decision tree generated by the original alternative probability prediction model based on the input training data. Here, the dependent variable of the first decision tree is the sample similarity feature between the character sample to be replaced and the candidate sample character, the independent variable is the predicted probability, and the first decision tree represents the predicted probability corresponding to each candidate sample character in the training data.

针对上述步骤I,在具体实施时,基于训练数据集中的训练数据生成第一决策树。Regarding the above step I, during specific implementation, a first decision tree is generated based on the training data in the training data set.

步骤II:将第一决策树的预测结果与训练数据中的标注替换概率进行对比,计算第一决策树的预测误差和损失值。Step II: Compare the prediction result of the first decision tree with the label replacement probability in the training data, and calculate the prediction error and loss value of the first decision tree.

需要说明的是,这里的预测误差指的是第一决策树的预测结果与训练数据进行对比而产生的预测误差。损失值指的是将随机事件或其有关随机变量的取值映射为非负实数以表示该随机事件的“风险”或“损失”的函数。在应用中,损失值通常作为学习准则与优化问题相联系,即通过最小化损失函数求解和评估模型。例如,当第一决策树预测的某个候选样本字符的预测概率与训练数据中的标注替换概率不相同时,则认为第一决策树的预测结果存在误差,基于第一决策树的预测结果计算第一决策树的预测误差和损失值。在梯度提升决策树模型训练的过程中,根据第一决策树的预测结果计算预测误差和损失值的方式在现有技术中有详细说明,在此不再过多赘述。It should be noted that the prediction error here refers to the error produced by comparing the prediction result of the first decision tree with the training data. The loss value refers to a function that maps the value of a random event or its related random variables to a non-negative real number representing the "risk" or "loss" of that event. In applications, the loss value usually serves as a learning criterion linked to an optimization problem, i.e., the model is solved and evaluated by minimizing the loss function. For example, when the predicted probability of a candidate sample character predicted by the first decision tree differs from the annotated replacement probability in the training data, the prediction result of the first decision tree is considered to contain an error, and the prediction error and loss value of the first decision tree are calculated from its prediction result. In training a gradient boosting decision tree model, the way of calculating the prediction error and loss value from the prediction result of the first decision tree is described in detail in the prior art and will not be repeated here.

针对上述步骤II,在具体实施时,将第一决策树的预测结果与训练数据中的标注替换概率进行对比,计算第一决策树的预测误差和损失值。Regarding the above step II, during specific implementation, the prediction result of the first decision tree is compared with the annotation replacement probability in the training data, and the prediction error and loss value of the first decision tree are calculated.

步骤III:基于第一决策树的预测误差生成第二决策树。Step III: Generate a second decision tree based on the prediction error of the first decision tree.

需要说明的是,第二决策树则是原始可替换概率预测模型根据第一决策树的预测误差所生成的决策树。这里,第二决策树的因变量是待替换字符样本与候选样本字符之间的样本相似度特征,自变量是第一决策树得到的预测误差,也就是,第二决策树代表第一决策树的预测误差。It should be noted that the second decision tree is the decision tree generated by the original replaceable probability prediction model from the prediction error of the first decision tree. Here, the dependent variable of the second decision tree is the sample similarity feature between the character sample to be replaced and the candidate sample character, and the independent variable is the prediction error obtained by the first decision tree; that is, the second decision tree represents the prediction error of the first decision tree.

针对上述步骤III,在具体实施时,基于第一决策树的预测误差生成第二决策树。Regarding the above step III, during specific implementation, a second decision tree is generated based on the prediction error of the first decision tree.

步骤IV:将第二决策树的预测结果与训练数据中的标注替换概率进行对比,计算第二决策树的预测误差和损失值。Step IV: Compare the prediction result of the second decision tree with the label replacement probability in the training data, and calculate the prediction error and loss value of the second decision tree.

针对上述步骤IV,在具体实施时,第二决策树生成后,根据第二决策树的预测结果与训练数据中的标注替换概率进行对比,以计算第二决策树的预测误差和损失值。具体的,如何根据第二决策树的预测结果计算第二决策树的预测误差和损失值的方法与步骤II中的方法相同,在此不再赘述。Regarding the above step IV, during specific implementation, after the second decision tree is generated, the prediction result of the second decision tree is compared with the annotation replacement probability in the training data to calculate the prediction error and loss value of the second decision tree. Specifically, the method of calculating the prediction error and loss value of the second decision tree based on the prediction result of the second decision tree is the same as the method in step II, and will not be described again here.

步骤V:若所述第二决策树的损失值大于损失阈值,或,原始可替换概率预测模型中决策树的数量小于数量阈值,则基于第二决策树的预测误差生成下一决策树,直至下一决策树的损失值小于损失阈值或原始可替换概率预测模型中决策树的数量等于数量阈值,得到可替换概率预测模型。Step V: If the loss value of the second decision tree is greater than the loss threshold, or the number of decision trees in the original replaceable probability prediction model is less than the number threshold, generate the next decision tree based on the prediction error of the second decision tree, until the loss value of the next decision tree is less than the loss threshold or the number of decision trees in the original replaceable probability prediction model equals the number threshold, thereby obtaining the replaceable probability prediction model.

需要说明的是,损失阈值指的是提前设定好的一个标准,作为一种可选的实施方式,损失阈值可以设定为损失值的二阶导数接近于0,因为在二阶导数接近0时,则损失值的斜率最小,即原始可替换概率预测模型中最后两个决策树的损失值变化已经很小,当损失值接近于这个损失阈值时,则认为原始可替换概率预测模型达到收敛状态。数量阈值指的是提前设定好的原始可替换概率预测模型中所需的决策树的数量,具体的,数量阈值可以提前设定为10个。It should be noted that the loss threshold refers to a standard set in advance. As an optional implementation, the loss threshold can be set to the point where the second derivative of the loss value approaches 0, because when the second derivative approaches 0 the slope of the loss value is smallest, i.e., the change in the loss values of the last two decision trees in the original replaceable probability prediction model is already very small; when the loss value approaches this threshold, the original replaceable probability prediction model is considered to have converged. The number threshold refers to the preset number of decision trees required in the original replaceable probability prediction model; specifically, the number threshold can be set to 10 in advance.

针对上述步骤V,在具体实施时,在第二决策树生成好后,判断第二决策树的损失值是否大于损失阈值,或原始可替换概率预测模型中的决策树的数量是否小于数量阈值。如果第二决策树的损失值大于损失阈值,或原始可替换概率预测模型中的决策树的数量小于数量阈值,则认为当前模型并没有训练结束,还需再生成下一决策树。生成下一决策树后,再计算该决策树的损失值,再次判断下一决策树的损失值是否大于损失阈值,或原始可替换概率预测模型中的决策树的数量是否小于数量阈值。若下一决策树的损失值小于或等于损失阈值,或原始可替换概率预测模型中的决策树的数量等于数量阈值,则认为当前模型训练结束,可以得到可替换概率预测模型。若下一决策树的损失值大于损失阈值,或原始可替换概率预测模型中的决策树的数量小于数量阈值,则还需继续生成决策树,直至最后一个决策树的损失值小于损失阈值或原始可替换概率预测模型中决策树的数量等于数量阈值,即可得到可替换概率预测模型。Regarding the above step V, during specific implementation, after the second decision tree is generated, it is judged whether the loss value of the second decision tree is greater than the loss threshold, or whether the number of decision trees in the original replaceable probability prediction model is less than the number threshold. If so, the current model is considered not yet fully trained, and the next decision tree needs to be generated. After the next decision tree is generated, its loss value is calculated and the same judgment is made again. If the loss value of the next decision tree is less than or equal to the loss threshold, or the number of decision trees in the original replaceable probability prediction model equals the number threshold, the training of the current model is considered complete and the replaceable probability prediction model is obtained. Otherwise, decision trees continue to be generated until the loss value of the last decision tree is less than the loss threshold or the number of decision trees in the original replaceable probability prediction model equals the number threshold, at which point the replaceable probability prediction model is obtained.
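The step I-V loop can be sketched as follows. This is a deliberately simplified illustration: each "tree" is collapsed to a constant mean-residual predictor so that only the stopping logic (loss below the loss threshold, or tree count reaching the number threshold) stays visible; a real GBDT fits a full regression tree on the residuals at each round, and the threshold values shown are made up:

```python
def train_boosting(labels, loss_threshold=1e-4, max_trees=10):
    """Sketch of steps I-V: fit a model, measure its error on the labels,
    fit the next model on the residuals, and stop once the loss drops
    below loss_threshold or max_trees models have been built.

    Each "tree" is reduced to a constant (the mean residual) purely to
    keep the control flow readable.
    """
    predictions = [0.0] * len(labels)
    trees = []
    while True:
        residuals = [y - p for y, p in zip(labels, predictions)]
        tree = sum(residuals) / len(residuals)        # "fit" on the residuals
        trees.append(tree)
        predictions = [p + tree for p in predictions]
        loss = sum((y - p) ** 2
                   for y, p in zip(labels, predictions)) / len(labels)
        # stop: converged, or the model already holds max_trees trees
        if loss < loss_threshold or len(trees) == max_trees:
            return trees, loss
```

With identical labels the first tree already drives the loss to zero and training stops after one round; with conflicting labels the loop runs until the tree-count threshold is hit, mirroring the two exit conditions of step V.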

S103,基于每个候选字符的可替换概率从多个候选字符中确定出替换该待纠错字符的目标字符,并基于目标字符对该待纠错字符进行纠错,以得到纠错后文本。S103: Determine the target character that replaces the character to be corrected from multiple candidate characters based on the replaceability probability of each candidate character, and perform error correction on the character to be corrected based on the target character to obtain the error-corrected text.

针对上述步骤S103,在具体实施时,该待纠错字符对应的每个候选字符的可替换概率确定出后,从多个候选字符中选择可替换概率大于预设阈值且可替换概率最大的候选字符作为该待纠错字符的目标字符,并将该待纠错字符替换为目标字符,以基于目标字符对该待纠错字符进行纠错,得到纠错后文本。Regarding the above step S103, during specific implementation, after the replaceability probability of each candidate character corresponding to the character to be corrected is determined, the candidate with the replaceability probability greater than the preset threshold and the highest replaceability probability is selected from multiple candidate characters. The character is used as the target character of the character to be corrected, and the character to be corrected is replaced with the target character, so that the character to be corrected is corrected based on the target character, and the error-corrected text is obtained.
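Step S103 can be sketched as below; the 0.5 preset threshold, the function names, and the example characters are illustrative assumptions, not values fixed by this application:

```python
def pick_target(candidates, threshold=0.5):
    """Pick the replacement for one suspect character.

    candidates -- list of (candidate_char, replace_prob) pairs from the model
    Returns the candidate with the highest probability above the threshold,
    or None when no candidate qualifies (the character is left unchanged).
    """
    best = max(candidates, key=lambda c: c[1], default=None)
    if best is not None and best[1] > threshold:
        return best[0]
    return None

def correct_text(text, suspects):
    """suspects -- dict: index of a suspect character -> [(candidate, prob)]."""
    chars = list(text)
    for idx, cands in suspects.items():
        target = pick_target(cands)
        if target is not None:
            chars[idx] = target          # replace only when a target qualifies
    return "".join(chars)
```

Requiring the probability to clear a threshold (rather than always taking the argmax) keeps low-confidence "corrections" from introducing new errors.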

本申请实施例提供的一种图像文字识别的纠错方法,首先,从原始文字图像中提取待纠错文本,并将待纠错文本输入到统计语言模型中,确定出待纠错文本中的多个待纠错字符以及每个待纠错字符对应的多个候选字符;然后,针对于每个待纠错字符,将该待纠错字符与该待纠错字符对应的每个候选字符之间的字符相似度特征输入到预先训练好的可替换概率预测模型中,确定出每个候选字符的可替换概率;最后,基于每个候选字符的可替换概率从多个候选字符中确定出替换该待纠错字符的目标字符,并基于目标字符对该待纠错字符进行纠错,以得到纠错后文本。An embodiment of the present application provides an error correction method for image text recognition. First, the text to be corrected is extracted from the original text image and input into a statistical language model to determine multiple characters to be corrected in the text and multiple candidate characters corresponding to each character to be corrected. Then, for each character to be corrected, the character similarity features between that character and each of its corresponding candidate characters are input into a pre-trained replaceable probability prediction model to determine the replaceable probability of each candidate character. Finally, based on the replaceable probability of each candidate character, the target character that replaces the character to be corrected is determined from the multiple candidate characters, and the character to be corrected is corrected based on the target character to obtain the error-corrected text.

本申请通过统计语言模型来确定候选字符,考虑了短文本,提高了获取候选字符的准确率,对短文本效果更佳,适用于银行等制式文本场景。基于待纠错字符与候选字符之间的字符相似度特征,使用可替换概率预测模型来判定候选字符是否可替代待纠错字符,避免人为设定阈值带来的误差。通过本申请提供的纠错方法,提高了图像文字识别的准确率,同时提高了对错误文字进行纠错的效率。This application uses a statistical language model to determine candidate characters, taking into account short texts, which improves the accuracy of obtaining candidate characters, has better effects on short texts, and is suitable for standard text scenarios such as banks. Based on the character similarity characteristics between the characters to be corrected and the candidate characters, a replaceable probability prediction model is used to determine whether the candidate characters can replace the characters to be corrected, to avoid errors caused by artificially setting thresholds. Through the error correction method provided by this application, the accuracy of image text recognition is improved, and the efficiency of correcting erroneous text is also improved.

请参阅图2、图3,图2为本申请实施例所提供的一种图像文字识别的纠错装置的结构示意图之一,图3为本申请实施例所提供的一种图像文字识别的纠错装置的结构示意图之二。如图2中所示,所述纠错装置200包括:Please refer to Figures 2 and 3. Figure 2 is the first and Figure 3 the second structural schematic diagram of an error correction device for image text recognition provided by an embodiment of the present application. As shown in Figure 2, the error correction device 200 includes:

纠错字符确定模块201,用于从原始文字图像中提取待纠错文本,并将所述待纠错文本输入到统计语言模型中,确定出所述待纠错文本中的多个待纠错字符以及每个待纠错字符对应的多个候选字符;The error correction character determination module 201 is used to extract the text to be corrected from the original text image, input the text to be corrected into a statistical language model, and determine multiple characters to be corrected in the text to be corrected and multiple candidate characters corresponding to each character to be corrected;

可替换概率确定模块202,用于针对于每个待纠错字符,将该待纠错字符与该待纠错字符对应的每个候选字符之间的字符相似度特征输入到预先训练好的可替换概率预测模型中,确定出每个候选字符的可替换概率;The replaceable probability determination module 202 is used to, for each character to be corrected, input the character similarity features between that character and each of its corresponding candidate characters into the pre-trained replaceable probability prediction model to determine the replaceable probability of each candidate character;

文本纠错模块203,用于基于每个候选字符的可替换概率从多个所述候选字符中确定出替换该待纠错字符的目标字符,并基于所述目标字符对该待纠错字符进行纠错,以得到纠错后文本。The text error correction module 203 is used to determine, based on the replaceable probability of each candidate character, the target character that replaces the character to be corrected from the multiple candidate characters, and to correct the character to be corrected based on the target character to obtain the error-corrected text.

进一步的,所述纠错字符确定模块201在用于从原始文字图像中提取待纠错文本时,所述纠错字符确定模块201还用于:Further, when the error correction character determination module 201 is used to extract the text to be corrected from the original text image, the error correction character determination module 201 is also used to:

对所述原始文字图像进行文字识别,得到所述原始文字图像对应的第一识别文字和所述第一识别文字对应的置信度;Perform text recognition on the original text image to obtain the first recognized text corresponding to the original text image and the confidence level corresponding to the first recognized text;

当所述第一识别文字对应的置信度低于置信度阈值时,则对所述原始文字图像进行图像超分辨率重建,得到目标文字图像;When the confidence corresponding to the first recognized text is lower than the confidence threshold, image super-resolution reconstruction is performed on the original text image to obtain the target text image;

对所述目标文字图像进行文字识别,得到所述目标文字图像对应的第二识别文字和所述第二识别文字对应的置信度,当所述第二识别文字对应的置信度低于所述置信度阈值时,将所述第二识别文字确定为所述待纠错文本。Perform text recognition on the target text image to obtain the second recognized text corresponding to the target text image and the confidence corresponding to the second recognized text; when the confidence corresponding to the second recognized text is lower than the confidence threshold, determine the second recognized text as the text to be corrected.

进一步的,所述纠错字符确定模块201在用于将所述待纠错文本输入到统计语言模型中,确定出所述待纠错文本中的多个待纠错字符以及每个待纠错字符对应的多个候选字符时,所述纠错字符确定模块201还用于:Further, when the error correction character determination module 201 is used to input the text to be corrected into the statistical language model and determine the multiple characters to be corrected in the text to be corrected and the multiple candidate characters corresponding to each character to be corrected, the error correction character determination module 201 is also used to:

将所述待纠错文本输入到统计语言模型中,得到所述待纠错文本中每个词对应的出现概率;Input the text to be corrected into a statistical language model to obtain the occurrence probability corresponding to each word in the text to be corrected;

针对于所述出现概率小于或等于概率阈值的每个词,确定该词对应的多个候选词;For each word whose occurrence probability is less than or equal to the probability threshold, determine multiple candidate words corresponding to the word;

针对于该词对应的多个候选词,使用该候选词对所述待纠错文本中的该词进行替换,得到替换后文本,并确定所述替换后文本对应的语义困惑度;For multiple candidate words corresponding to the word, use the candidate words to replace the word in the text to be corrected, obtain the replaced text, and determine the semantic confusion corresponding to the replaced text;

当所述语义困惑度小于或等于困惑度阈值时,则将该词中的字符确定为所述待纠错字符,将该候选词中与所述待纠错字符位置相同的字符确定为所述待纠错字符对应的候选字符。When the semantic perplexity is less than or equal to the perplexity threshold, determine the characters in the word as characters to be corrected, and determine the characters in the candidate word at the same positions as the characters to be corrected as the candidate characters corresponding to those characters to be corrected.
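The perplexity check above can be sketched with a toy bigram language model; the bigram probabilities, the floor value for unseen pairs, and the example tokens are invented for illustration, and a real system would estimate the probabilities from a corpus:

```python
import math

def perplexity(tokens, bigram_prob, floor=1e-6):
    """Perplexity of a token sequence under a toy bigram model.

    bigram_prob -- (prev, cur) -> P(cur | prev); unseen pairs fall back
    to a small floor probability.
    """
    log_sum = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        log_sum += math.log(bigram_prob.get((prev, cur), floor))
    n = max(len(tokens) - 1, 1)
    return math.exp(-log_sum / n)   # lower perplexity = more fluent text

def accept_replacement(replaced_tokens, bigram_prob, ppl_threshold):
    """Keep a candidate word only when the replaced text scores at or
    below the perplexity threshold, mirroring the step above."""
    return perplexity(replaced_tokens, bigram_prob) <= ppl_threshold

lm = {("开", "户"): 0.5, ("户", "名"): 0.5}   # made-up probabilities
```

A fluent replacement keeps the perplexity low, while a replacement that produces unseen transitions drives it up sharply, so thresholding perplexity filters out implausible candidates.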

进一步的,如图3所示,所述纠错装置200还包括字符相似度特征确定模块204,所述字符相似度特征确定模块204用于通过以下步骤确定该待纠错字符与该待纠错字符对应的每个候选字符之间的字符相似度特征:Further, as shown in Figure 3, the error correction device 200 also includes a character similarity feature determination module 204, which is used to determine, through the following steps, the character similarity features between the character to be corrected and each candidate character corresponding to the character to be corrected:

针对于该待纠错字符对应的每个候选字符,计算该待纠错字符与该候选字符之间的多个相似度特征;其中,所述多个相似度特征包括以下至少一项:四码相似度、笔画编辑距离、笔画数相似度、结构相似度、偏旁相似度和拼音相似度;For each candidate character corresponding to the character to be corrected, calculate multiple similarity features between the character to be corrected and the candidate character, where the multiple similarity features include at least one of the following: four-corner code similarity, stroke edit distance, stroke-count similarity, structural similarity, radical similarity and pinyin similarity;

将该待纠错字符与该候选字符之间的多个相似度特征、该候选字符所属的替换后文本对应的语义困惑度以及所述待纠错文本对应的置信度进行归一化,以得到该待纠错字符与该候选字符之间的字符相似度特征。Normalize the multiple similarity features between the character to be corrected and the candidate character, the semantic perplexity corresponding to the replaced text to which the candidate character belongs, and the confidence corresponding to the text to be corrected, to obtain the character similarity features between the character to be corrected and the candidate character.
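As one hedged illustration of how such similarity features might be computed, both the stroke edit distance and the pinyin similarity can be built on a Levenshtein distance; the stroke sequences and pinyin strings are assumed to come from an external lookup table not shown here:

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, e.g. stroke sequences
    or pinyin strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def pinyin_similarity(p1, p2):
    """Similarity in [0, 1]: 1 minus the length-normalized edit distance.
    The pinyin strings themselves must be supplied by a lookup table."""
    longest = max(len(p1), len(p2), 1)
    return 1.0 - edit_distance(p1, p2) / longest
```

The same normalization trick (1 minus distance over the longer length) can be reused for the stroke-based features, so every raw distance lands in the [0, 1] range expected by the later normalization step.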

进一步的,如图3所示,所述纠错装置200还包括模型训练模块205,所述模型训练模块205用于通过以下步骤训练所述可替换概率预测模型:Further, as shown in Figure 3, the error correction device 200 also includes a model training module 205. The model training module 205 is used to train the alternative probability prediction model through the following steps:

获取训练数据集;其中,所述训练数据集中的训练数据包括待替换样本字符与候选样本字符之间的样本相似度特征以及用该候选样本字符替换该待替换样本字符的标注替换概率;Obtain a training data set; wherein, the training data in the training data set includes sample similarity features between the sample characters to be replaced and candidate sample characters and the annotation replacement probability of using the candidate sample characters to replace the sample characters to be replaced;

基于所述训练数据集对原始可替换概率预测模型进行训练,以得到所述可替换概率预测模型。The original alternative probability prediction model is trained based on the training data set to obtain the alternative probability prediction model.

进一步的,所述模型训练模块205在用于基于所述训练数据集对所述原始可替换概率预测模型进行训练,以得到所述可替换概率预测模型时,所述模型训练模块205还用于:Further, when the model training module 205 is used to train the original replaceable probability prediction model based on the training data set to obtain the replaceable probability prediction model, the model training module 205 is also used to:

基于所述训练数据生成第一决策树,其中,所述第一决策树代表所述样本数据中每个候选样本字符对应的预测概率;Generate a first decision tree based on the training data, wherein the first decision tree represents the predicted probability corresponding to each candidate sample character in the sample data;

将所述第一决策树的预测结果与所述训练数据中的标注替换概率进行对比,计算所述第一决策树的预测误差和损失值;Compare the prediction result of the first decision tree with the label replacement probability in the training data, and calculate the prediction error and loss value of the first decision tree;

基于所述第一决策树的预测误差生成第二决策树;Generate a second decision tree based on the prediction error of the first decision tree;

将所述第二决策树的预测结果与所述训练数据中的标注替换概率进行对比,计算所述第二决策树的预测误差和损失值;Compare the prediction result of the second decision tree with the label replacement probability in the training data, and calculate the prediction error and loss value of the second decision tree;

若所述第二决策树的损失值大于损失阈值,或,所述原始可替换概率预测模型中决策树的数量小于数量阈值,则基于所述第二决策树的预测误差生成下一决策树,直至所述下一决策树的损失值小于所述损失阈值或所述原始可替换概率预测模型中决策树的数量等于所述数量阈值,得到所述可替换概率预测模型。If the loss value of the second decision tree is greater than the loss threshold, or the number of decision trees in the original replaceable probability prediction model is less than the number threshold, generate the next decision tree based on the prediction error of the second decision tree, until the loss value of the next decision tree is less than the loss threshold or the number of decision trees in the original replaceable probability prediction model equals the number threshold, thereby obtaining the replaceable probability prediction model.

请参阅图4,图4为本申请实施例所提供的一种电子设备的结构示意图。如图4中所示,所述电子设备400包括处理器410、存储器420和总线430。Please refer to FIG. 4 , which is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 4 , the electronic device 400 includes a processor 410 , a memory 420 and a bus 430 .

所述存储器420存储有所述处理器410可执行的机器可读指令,当电子设备400运行时,所述处理器410与所述存储器420之间通过总线430通信,所述机器可读指令被所述处理器410执行时,可以执行如上述图1所示方法实施例中的图像文字识别的纠错方法的步骤,具体实现方式可参见方法实施例,在此不再赘述。The memory 420 stores machine-readable instructions executable by the processor 410. When the electronic device 400 is running, the processor 410 and the memory 420 communicate through the bus 430; when the machine-readable instructions are executed by the processor 410, the steps of the error correction method for image text recognition in the method embodiment shown in Figure 1 can be performed. For specific implementations, refer to the method embodiment, which will not be repeated here.

本申请实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时可以执行如上述图1所示方法实施例中的图像文字识别的纠错方法的步骤,具体实现方式可参见方法实施例,在此不再赘述。Embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the error correction method for image text recognition in the method embodiment shown in Figure 1 can be performed. For specific implementations, refer to the method embodiment, which will not be repeated here.

所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that for the convenience and simplicity of description, the specific working processes of the systems, devices and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be described again here.

在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,又例如,多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods can be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components may be combined or can be integrated into another system, or some features can be ignored, or not implemented. On the other hand, the coupling or direct coupling or communication connection between each other shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.

所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。In addition, each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.

所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-OnlyMemory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile, processor-executable computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks and other media that can store program code.

应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步定义和解释,此外,术语“第一”、“第二”、“第三”等仅用于区分描述,而不能理解为指示或暗示相对重要性。It should be noted that similar reference numerals and letters represent similar items in the following drawings. Therefore, once an item is defined in one drawing, it does not need further definition and explanation in subsequent drawings. In addition, the terms "first", "second", "third", etc. are only used to distinguish descriptions and shall not be understood as indicating or implying relative importance.

最后应说明的是:以上所述实施例,仅为本申请的具体实施方式,用以说明本申请的技术方案,而非对其限制,本申请的保护范围并不局限于此,尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,其依然可以对前述实施例所记载的技术方案进行修改或可轻易想到变化,或者对其中部分技术特征进行等同替换;而这些修改、变化或者替换,并不使相应技术方案的本质脱离本申请实施例技术方案的精神和范围,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。Finally, it should be noted that the above embodiments are only specific implementations of the present application, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed in this application, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions of some of the technical features; such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and should all be covered within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (10)

CN202310792032.9A2023-06-302023-06-30Error correction method, error correction device, equipment and medium for image character recognitionPendingCN116912832A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202310792032.9A (published as CN116912832A (en)) | 2023-06-30 | 2023-06-30 | Error correction method, error correction device, equipment and medium for image character recognition


Publications (1)

Publication Number | Publication Date
CN116912832A | 2023-10-20

Family

ID=88355527



Citations (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US10445966B1 (en)* | 2018-07-27 | 2019-10-15 | Hart Intercivic, Inc. | Optical character recognition of voter selections for cast vote records
CN111723791A (en)* | 2020-06-11 | 2020-09-29 | 腾讯科技(深圳)有限公司 | Character error correction method, device, equipment and storage medium
CN112991168A (en)* | 2021-02-08 | 2021-06-18 | 上海电力大学 | Text detection method based on target detection and super-resolution
CN113420546A (en)* | 2021-06-24 | 2021-09-21 | 平安国际智慧城市科技股份有限公司 | Text error correction method and device, electronic equipment and readable storage medium
CN114677689A (en)* | 2022-03-29 | 2022-06-28 | 上海弘玑信息技术有限公司 | Character and image recognition error correction method and electronic equipment
US20220318571A1 (en)* | 2019-05-08 | 2022-10-06 | Ntt Docomo, Inc. | Recognition error correction device and correction model
US20220350998A1 (en)* | 2021-04-30 | 2022-11-03 | International Business Machines Corporation | Multi-Modal Learning Based Intelligent Enhancement of Post Optical Character Recognition Error Correction
CN115862040A (en)* | 2022-12-12 | 2023-03-28 | 杭州恒生聚源信息技术有限公司 | Text error correction method and device, computer equipment and readable storage medium
CN116258137A (en)* | 2023-03-03 | 2023-06-13 | 华润数字科技有限公司 | Text error correction method, device, equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FANG Lingyu; GONG Wenyou: "Research on recognition methods for similar license plate characters based on SVM multi-classification", 计算机与数字工程 (Computer and Digital Engineering), no. 07, 20 July 2017 (2017-07-20) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117456532A (en) * | 2023-11-10 | 2024-01-26 | 广州方舟信息科技有限公司 | Correction method, device, equipment and storage medium for medicine amount
CN117456532B (en) * | 2023-11-10 | 2024-05-31 | 北京方易行信息科技有限公司 | Correction method, device, equipment and storage medium for medicine amount

Similar Documents

Publication | Title
CN113011533B (en) | Text classification method, apparatus, computer device and storage medium
Messina et al. | Segmentation-free handwritten Chinese text recognition with LSTM-RNN
Nurseitov et al. | Handwritten Kazakh and Russian (HKR) database for text recognition
Su et al. | Off-line recognition of realistic Chinese handwriting using segmentation-free strategy
Bag et al. | A survey on optical character recognition for Bangla and Devanagari scripts
Simistira et al. | Recognition of online handwritten mathematical formulas using probabilistic SVMs and stochastic context free grammars
US20200104635A1 (en) | Invertible text embedding for lexicon-free offline handwriting recognition
Sharma | Adapting off-the-shelf CNNs for word spotting & recognition
US20110091110A1 (en) | Classifying a string formed from a known number of hand-written characters
Kaur et al. | A comprehensive survey on word recognition for non-Indic and Indic scripts
CN110928981A (en) | Method, system and storage medium for establishing and perfecting iteration of text label system
CN112966068A (en) | Resume identification method and device based on webpage information
CN112307749A (en) | Text error detection method, apparatus, computer equipment and storage medium
Roy et al. | Date-field retrieval in scene image and video frames using text enhancement and shape coding
Roy et al. | Word retrieval in historical document using character-primitives
Romero et al. | Modern vs diplomatic transcripts for historical handwritten text recognition
Kaur et al. | A survey of mono- and multi-lingual character recognition using deep and shallow architectures: Indic and non-Indic scripts
CN114021549A (en) | Chinese named entity recognition method and device based on vocabulary enhancement and multi-feature
Das et al. | A cost efficient approach to correct OCR errors in large document collections
Roy et al. | Word searching in scene image and video frame in multi-script scenario using dynamic shape coding
CN110414622 (en) | Classifier training method and device based on semi-supervised learning
Chamchong et al. | Thai handwritten recognition on text block-based from Thai archive manuscripts
Kumari et al. | A review of deep learning techniques in document image word spotting
CN116912832A (en) | Error correction method, error correction device, equipment and medium for image character recognition
Quirós et al. | From HMMs to RNNs: computer-assisted transcription of a handwritten notarial records collection

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
