Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
The terms "first," "second," and the like in this application are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency or limitation on the number or order of execution.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision making.
Artificial intelligence technology is a comprehensive discipline that spans a wide range of fields, covering both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, and machine learning/deep learning. The embodiments of the present application relate to natural language processing technology within artificial intelligence.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics; because it concerns natural language, the language people use every day, it is closely related to linguistic research. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
To facilitate understanding of the technical processes of the embodiments of the present application, some terms used in the embodiments are explained below:
BERT (Bidirectional Encoder Representations from Transformers): a natural language processing model released by Google. The BERT model comprises a plurality of operation layers, namely a plurality of Transformers, each of which extracts and encodes the features of a text based on an attention mechanism. The BERT model is pre-trained; when applying it, a developer only needs to fine-tune the parameters of the model for a specific natural language processing task, which effectively reduces the training difficulty and the time consumed by model training. For example, in the embodiments of the present application, the developer further trains the BERT model on a text error correction task, so that the BERT model can recognize wrong characters in a text and correct them into the right characters.
Attention mechanism: a means of quickly screening high-value information from a large amount of information using limited attention resources. The attention mechanism has two aspects: deciding which part of the input information needs to be focused on, and allocating the limited information processing resources to that important part.
Fig. 1 is a schematic diagram of an implementation environment of a text error correction method according to an embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 110 and a server 140.
The terminal 110 installs and runs an application program supporting text error correction, for example a browser application or an information application. When a user inputs a piece of text for a data search, the application can correct the text input by the user to improve search accuracy, correct wrong characters in the search results based on the text error correction function, or filter out search results containing too many wrong characters. It should be noted that the embodiments of the present application do not limit the type of the application program. Optionally, the terminal 110 is a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, or the like; the device type of the terminal 110 is not limited in the embodiments of the present application. Illustratively, the terminal 110 is a terminal used by a user, and the application running in the terminal 110 is logged in with a user account. The terminal 110 generally refers to one of a plurality of terminals, and this embodiment is illustrated with the terminal 110 only.
The server 140 may be at least one of a single server, a plurality of servers, a cloud computing platform, and a virtualization center. The server 140 provides background services for applications that support text error correction. Optionally, the server 140 undertakes the primary text correction work and the terminal 110 undertakes the secondary text correction work; alternatively, the server 140 undertakes the secondary work and the terminal 110 the primary work; alternatively, the server 140 or the terminal 110 can each undertake the text correction work alone.
Optionally, the server 140 includes an access server, a natural language processing server, and a database. The access server provides access services for the terminal 110. The natural language processing server provides background services related to text error correction; it can be equipped with a natural language processing processor and supports multithreaded parallel computation on that processor. There may be one or more natural language processing servers. When there are multiple natural language processing servers, at least two of them provide different services, and/or at least two of them provide the same service, for example in a load-balanced manner, which is not limited in the embodiments of the present application. The natural language processing server can host a text error correction model and supports parallel operation of the processor during model training and application. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms.
The terminal 110 and the server 140 may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
The text error correction method provided by the embodiments of the present application can be combined with various application scenarios and applied in various application programs. For example, in an information application, the quality of the information content seriously affects user experience; such content often contains many wrongly written characters and is of poor quality, and wrong characters may even be set intentionally to evade machine review.
Fig. 2 is a flowchart of a text error correction method according to an embodiment of the present application. The method can be applied to the terminal or the server, both of which can be regarded as a computer device. In the embodiments of the present application, the text error correction method is described with the computer device as the execution entity. Referring to fig. 2, in one possible implementation, the embodiment includes the following steps:
201. The computer device obtains a first text to be corrected, the first text including at least two characters.
Optionally, the first text is a text segment stored in the computer device, a text segment captured by the computer device from the network, a text segment input by the user, or a text segment obtained by recognizing speech data; the embodiments of the present application do not limit which kind of text is used. In the embodiments of the present application, the first text includes a plurality of characters, for example Chinese characters or foreign-language characters, and optionally the plurality of characters includes an erroneous character.
202. The computer device obtains the font feature, the pronunciation feature and the semantic feature of each character based on the structure, the pronunciation and the context information of each character in the first text.
Wherein the structure of a character is its stroke structure; optionally, the pronunciation of a character is represented by its pinyin or by an audio file of the character being read aloud, which is not limited in the embodiments of the present application; the context information is used to indicate the meaning of the character in the text.
In the embodiment of the application, the features of each character in the text are extracted from multiple dimensions, the font and pronunciation of each character and the context semantics of each character in the first text are fully considered, and subsequent text error correction is performed based on the multi-dimensional features, so that the comprehensiveness and accuracy of error correction can be improved.
203. The computer device performs weighted fusion of the font feature, the pronunciation feature, and the semantic feature of each character to obtain the fusion feature of each character.
In a possible implementation, the font feature, the pronunciation feature, and the semantic feature of a character correspond to different weights, that is, different features have different degrees of importance in the subsequent text error correction process. The computer device weights the font feature, the pronunciation feature, and the semantic feature based on these different weights and then fuses the weighted features to obtain the fusion feature. In the subsequent text error correction process, the correct character is predicted based on the fusion feature, so that the computer device pays more attention to the important features and can predict the correct character accurately. The method for fusing the features is not limited in the embodiments of the present application.
204. The computer device decodes the obtained at least two fusion features to obtain at least two target characters, and the at least two target characters form a second text, which is the text after the erroneous characters in the first text have been corrected.
In a possible implementation, the computer device maps each fusion feature to a target character, where the target character is the correct character predicted by the computer device; the computer device then determines the arrangement order of the target characters and sorts the at least two target characters in that order to form the second text. The embodiments of the present application do not limit the method used to decode the fusion features.
According to the technical solution provided by the embodiments of the present application, when the first text is corrected, the font feature and pronunciation feature of each character, as well as its contextual semantic feature in the first text, are fully considered, and the features of these three dimensions are fused to predict the correct character. Any character appearing in the first text can be identified and corrected, which effectively expands the coverage of text error correction, and the multi-dimensional feature fusion improves the accuracy of text error correction.
The above embodiment is a brief introduction to the embodiments of the present application. In a possible implementation, the text error correction process is implemented based on a text error correction model, which is a trained model; optionally, the text error correction model is a model stored in the computer device or a model in the network. In a possible implementation, the text error correction model is constructed based on BERT. Fig. 3 is a schematic diagram of a text error correction model provided in an embodiment of the present application. Referring to fig. 3, the text error correction model includes an input layer 301, a feature extraction layer 302, and an output layer 303. The input layer 301 includes a glyph analysis network, a voice recognition network, and a semantic recognition network, wherein the glyph analysis network is used to extract the font features (Shape Embedding) of characters, the voice recognition network is used to extract the pronunciation features (Pinyin Embedding) of characters, and the semantic recognition network is used to extract the semantic features (Char Embedding) of characters in the text. The feature extraction layer 302 comprises a feature fusion network, composed of the backbone network of a BERT model and an attention module, which performs feature fusion on the features extracted by the input layer based on an attention mechanism. The output layer 303 outputs the error-corrected text. It should be noted that the above description of the text error correction model is only exemplary, and the structure of the text error correction model is not limited in the embodiments of the present application. Fig. 4 is a flowchart of a text error correction method provided in an embodiment of the present application. The text error correction method based on the above model is described below with reference to fig. 3 and fig. 4. In a possible implementation, the embodiment includes the following steps:
401. The computer device acquires a first text to be corrected and inputs the first text into the text error correction model.
In one possible implementation, the computer device obtains the first text to be corrected in response to a text error correction instruction. Illustratively, the computer device is a terminal used by a user, and the terminal installs and runs an application program supporting the text error correction function, for example an information application in which the user can search information content. In one possible implementation, a search operation by the user in the application triggers both a search instruction and a text error correction instruction: in response to the search instruction, the computer device queries search results based on the search keyword provided by the user, and in response to the text error correction instruction, performs the subsequent text error correction steps on the search results. Optionally, the computer device takes all the text in the obtained search results as the first text to be corrected, or takes the title of each search result as the first text, which is not limited by the embodiments of the present application.
In the embodiments of the present application, the text error correction model is constructed based on the BERT model as an example. In a possible implementation, before the computer device inputs the first text into the text error correction model, the first text is preprocessed. Optionally, the terminal unifies the at least two characters in the first text into a reference font, where the reference font is set by a developer and is not limited in the embodiments of the present application; for example, if the first text includes both traditional and simplified Chinese characters, the computer device converts the traditional characters in the first text into the corresponding simplified characters. Optionally, the computer device removes foreign-language characters from the first text; for example, if the error correction model is directed at correcting Chinese characters, the computer device may remove foreign characters in the first text that are not within the error correction range. Optionally, the computer device may also remove special characters in the first text, perform case conversion on English characters, and the like. Of course, the computer device may also preprocess the first text by other methods, which is not limited in the embodiments of the present application. A sketch of such preprocessing follows.
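The following is a minimal preprocessing sketch, not part of the claimed method: it assumes the third-party OpenCC package for traditional-to-simplified conversion, and the specific character classes kept by the regular expression are illustrative choices.

```python
# Minimal preprocessing sketch (assumption: the OpenCC package is installed,
# e.g. opencc-python-reimplemented; the regex rules are illustrative).
import re
from opencc import OpenCC

_t2s = OpenCC("t2s")  # traditional -> simplified converter

def preprocess(text: str) -> str:
    text = _t2s.convert(text)  # unify characters into the reference (simplified) font
    text = text.lower()        # case conversion for English characters
    # keep Chinese characters, ASCII letters/digits and basic punctuation; drop other symbols
    text = re.sub(r"[^\u4e00-\u9fa5a-z0-9，。！？,.!? ]", "", text)
    return text

print(preprocess("這是一段測試文本 Test123 ★"))  # -> "这是一段测试文本 test123 "
```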
402. Through the glyph analysis network in the text error correction model, the computer device extracts the font features of the at least two characters in the first text based on the structures of the at least two characters.
In one possible implementation, the method for the computer device to obtain the glyph features of a character includes any one of the following implementations:
In a first implementation, a character set is stored in the computer device, the character set including a plurality of reference characters. For any character, the computer device generates, through the glyph analysis network, a character node graph based on the structure of that character and the structures of at least two reference characters, where the character node graph indicates the association, in the structural dimension, between that character and the at least two reference characters; the computer device then performs feature extraction on the character node graph to obtain the glyph feature of that character. In the embodiments of the present application, the glyph analysis network can decompose characters, for example based on their components or based on their strokes, and constructs the character node graph from the structures of the characters, that is, from the decomposition results. Fig. 5 is a schematic diagram of a character node graph provided in an embodiment of the present application. Taking the character "rain" as an example, the computer device generates the character node graph according to the structural similarity between the character "rain" and the reference characters; each character serves as a node, and nodes having an association are connected by edges. In one possible implementation, the glyph analysis network assigns a different weight to each edge in the character node graph. Taking node 501 and node 502 in fig. 5 as an example, the computer device determines the weight of the edge between node 501 and node 502 based on the numbers of strokes and the numbers of adjacent nodes of the characters corresponding to the two nodes. Illustratively, the computer device determines the weight of an edge between nodes by equation (1), whose variables are defined as follows:
where w_ij represents the weight of the edge connecting the node of character i and the node of character j, s_i represents the number of strokes of character i, s_j represents the number of strokes of character j, and d_j represents the number of nodes adjacent to the node of character j.
In one possible implementation, the computer device vectorizes the character node graph based on the graph and the weights of its edges; illustratively, the computer device converts the character node graph into a glyph vector based on the node2vec algorithm, and the glyph vector represents the glyph feature of the character. Of course, the glyph feature may also be expressed in other forms, which is not limited in the embodiments of the present application. In this embodiment, the association between characters with similar glyphs is obtained by constructing the character node graph, and the glyph feature of each character is generated based on these structural associations, which ensures the objectivity and accuracy of the obtained glyph features. A sketch of this graph-based approach follows.
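The sketch below illustrates the graph-plus-node2vec idea under stated assumptions: the stroke counts, the edge set, and the edge-weight rule are invented for illustration (the actual weights come from equation (1)), and it uses the third-party networkx and node2vec packages.

```python
# Sketch: glyph features via a character node graph + node2vec.
# Assumptions: toy stroke counts and edges; the weight rule below is only a
# stand-in for equation (1) of the embodiment.
import networkx as nx
from node2vec import Node2Vec  # pip install node2vec

strokes = {"雨": 8, "两": 7, "雷": 13, "雪": 11}  # illustrative stroke counts

G = nx.Graph()
G.add_nodes_from(strokes)
# connect structurally similar characters; weight stands in for equation (1)
for a, b in [("雨", "两"), ("雨", "雷"), ("雨", "雪")]:
    G.add_edge(a, b, weight=1.0 / (1 + abs(strokes[a] - strokes[b])))

# weighted random walks + skip-gram over the graph -> one vector per character
n2v = Node2Vec(G, dimensions=32, walk_length=10, num_walks=50, weight_key="weight")
model = n2v.fit(window=3, min_count=1)
glyph_vector = model.wv["雨"]  # the glyph feature of the character "雨"
print(glyph_vector.shape)      # (32,)
```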
In a second implementation, the glyph analysis network has an image processing function and can perform image feature extraction; illustratively, it is constructed based on a convolutional neural network and includes a plurality of convolutional layers for image feature extraction. The computer device obtains a character image corresponding to each character, the character image indicating the structure of the character; optionally, the character image is obtained by taking a screenshot of each character in the first text. The computer device performs image feature extraction on the character image of each character through the glyph analysis network to obtain the glyph feature of each character. The embodiments of the present application do not limit the structure of the glyph analysis network or the method of extracting image features. In the embodiments of the present application, the image features of the character images are extracted directly and used as the glyph features of the characters, so that the glyph features can be acquired efficiently and quickly. A sketch of this image-based variant follows.
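A minimal sketch of the image-based variant, assuming PyTorch and 32x32 grayscale renderings of each character; the layer sizes are illustrative and not specified by the embodiment.

```python
# Sketch: glyph features from character images with a small CNN.
# Assumptions: 32x32 grayscale character images; layer sizes are illustrative.
import torch
import torch.nn as nn

class GlyphCNN(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 16 -> 8
        )
        self.proj = nn.Linear(32 * 8 * 8, feat_dim)

    def forward(self, imgs: torch.Tensor) -> torch.Tensor:
        # imgs: (batch, 1, 32, 32) character images -> (batch, feat_dim) glyph features
        h = self.conv(imgs).flatten(1)
        return self.proj(h)

glyph_net = GlyphCNN()
fake_images = torch.rand(5, 1, 32, 32)  # 5 character images
print(glyph_net(fake_images).shape)     # torch.Size([5, 128])
```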
It should be noted that the above description of methods for extracting the glyph features of characters is only an exemplary description of possible implementations, and the embodiments of the present application do not limit which method is used to obtain the glyph features. In the embodiments of the present application, glyph-dimension features are incorporated into the text error correction process, so that the model can fully learn the association between correct and erroneous glyphs; this makes the text error correction process closer to the way humans correct wrong characters and further improves the accuracy of the text error correction results.
403. Through the voice recognition network in the text error correction model, the computer device extracts the pronunciation features of the at least two characters in the first text based on the pronunciations of the at least two characters.
In one possible implementation, the method for the computer device to obtain the pronunciation features of a character includes any one of the following implementations:
In a first implementation, the computer device obtains the pinyin corresponding to each character, the pinyin indicating the pronunciation of the character, and encodes the pinyin of each character through the voice recognition network to obtain the pronunciation feature of each character. Illustratively, the computer device stores a correspondence table between characters and pinyin, determines the pinyin of each character by querying the table, maps each pinyin to a pinyin vector, and uses the pinyin vector to indicate the pronunciation feature of the character. Optionally, the computer device maps each letter in the pinyin to a sub-vector and then concatenates the sub-vectors in the order of the letters in the pinyin to obtain the pinyin vector; of course, the computer device may also obtain the pinyin vector of each character in other ways, which is not limited in the embodiments of the present application. The embodiments of the present application take the case where the pronunciation features of characters are expressed as vectors only as an example; the pronunciation features may also be expressed in other forms such as matrices.
In one possible implementation, before encoding the pinyin of each character, the computer device may also normalize similar-sounding pinyin so that characters with similar pronunciations correspond to the same pronunciation feature. Illustratively, the computer device processes the pinyin of each character through the voice recognition network based on a reference mapping condition, and then encodes the processed pinyin through the voice recognition network to obtain the pronunciation feature of each character. The reference mapping condition is set by a developer and is not limited in the embodiments of the present application; illustratively, it includes at least one of mapping a retroflex ("warped-tongue") initial in the pinyin to the corresponding dental ("flat-tongue") initial, mapping a nasal initial to the corresponding lateral initial, mapping a back nasal final to the corresponding front nasal final, and removing the tone of the pinyin. In the embodiments of the present application, the pronunciation features of characters are extracted based on their pinyin, and similar-sounding pinyin are normalized so that they map to the same pronunciation feature, which gives the model good performance in correcting near-sound characters, as in the sketch below.
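The following sketch implements one reading of the reference mapping condition (retroflex to dental, nasal to lateral, back nasal to front nasal, tone removal) plus the letter-wise encoding; the mapping table, padding letter, and embedding sizes are illustrative assumptions.

```python
# Sketch: pinyin normalization + letter-wise encoding.
# Assumptions: tones are written as trailing digits (e.g. "shang4");
# the embedding dimensions and padding letter "_" are illustrative.
import re
import torch
import torch.nn as nn

def normalize_pinyin(pinyin: str) -> str:
    pinyin = re.sub(r"[1-5]$", "", pinyin)  # remove the tone: "shang4" -> "shang"
    for src, dst in [("zh", "z"), ("ch", "c"), ("sh", "s")]:  # retroflex -> dental
        if pinyin.startswith(src):
            pinyin = dst + pinyin[len(src):]
    if pinyin.startswith("n"):              # nasal initial -> lateral initial
        pinyin = "l" + pinyin[1:]
    for src, dst in [("ang", "an"), ("eng", "en"), ("ing", "in")]:  # back -> front nasal
        if pinyin.endswith(src):
            pinyin = pinyin[: -len(src)] + dst
    return pinyin

letters = "abcdefghijklmnopqrstuvwxyz_"     # "_" is the padding letter
letter_emb = nn.Embedding(len(letters), 8)  # one sub-vector per letter

def pinyin_vector(pinyin: str, max_len: int = 8) -> torch.Tensor:
    p = normalize_pinyin(pinyin).ljust(max_len, "_")[:max_len]
    ids = torch.tensor([letters.index(c) for c in p])
    return letter_emb(ids).flatten()        # sub-vectors concatenated in letter order

print(normalize_pinyin("shang4"))           # "san" - merged with "san4", a near-sound pinyin
print(pinyin_vector("zhang1").shape)        # torch.Size([64])
```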
In a second implementation, the voice recognition network has an audio processing function and can perform audio feature extraction; illustratively, it is constructed based on a convolutional neural network and includes a plurality of convolutional layers for audio feature extraction. The computer device obtains an audio file corresponding to each character, where one audio file contains the voice information of one character being read aloud; optionally, the audio files are pre-recorded and stored in the computer device. The computer device performs audio feature extraction on the audio file of each character through the voice recognition network to obtain the pronunciation feature of each character. It should be noted that the structure of the voice recognition network and the method for extracting the audio features are not limited in the embodiments of the present application. In the embodiments of the present application, extracting the pronunciation features from the audio files corresponding to the characters allows the pronunciation feature of each character to be acquired efficiently and quickly. A sketch of this audio-based variant follows.
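A sketch of the audio-based variant, assuming the torchaudio package and a per-character WAV recording; the file name "yu3.wav" and the network shape are illustrative assumptions.

```python
# Sketch: pronunciation features from a per-character audio file.
# Assumptions: torchaudio is available; "yu3.wav" is a recording of one character.
import torch
import torch.nn as nn
import torchaudio

waveform, sample_rate = torchaudio.load("yu3.wav")  # (channels, samples)
mel = torchaudio.transforms.MelSpectrogram(sample_rate, n_mels=40)(waveform)  # (ch, 40, frames)

conv = nn.Sequential(                               # illustrative audio CNN
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d((4, 4)),
)
feat = conv(mel[:1].unsqueeze(0)).flatten(1)        # (1, 16*4*4) pronunciation feature
print(feat.shape)                                   # torch.Size([1, 256])
```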
It should be noted that the above description of methods for extracting the pronunciation features of characters is only an exemplary description of possible implementations, and the embodiments of the present application do not limit which method is used. In the embodiments of the present application, the pronunciation features of characters are fused into the text error correction process, so that the model can fully learn the similarity between the pronunciations of correct and erroneous characters and performs well in the subsequent correction of near-sound characters.
404. Through the semantic recognition network in the text error correction model, the computer device extracts the semantic features of the at least two characters in the first text based on their context information in the first text.
In a possible implementation, the semantic recognition network is the network in the input layer of the BERT model used for extracting semantic features; the BERT model maps each character of the input first text into a Char Embedding, which is the semantic feature of the character. In a possible implementation, the semantic recognition network may also be constructed from a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or the like; the structure of the semantic recognition network is not limited in the embodiments of the present application. Illustratively, the computer device performs bidirectional feature extraction on the first text in the semantic recognition network, so that the semantic feature of each character is fused with the context information of the first text. That is, the semantic recognition network extracts semantic features for each character sequentially from left to right and from right to left; each pass of feature extraction produces one hidden-layer feature per character, so after the two passes each character corresponds to two hidden-layer features. Take feature extraction of a first character and an adjacent second character in the first text as an example, where the first character is located to the left of the second character. In the left-to-right pass, after extracting the hidden-layer feature of the first character, the computer device passes it to the second character and generates the hidden-layer feature of the second character in combination with the hidden-layer feature of the first character; that is, the hidden-layer feature of each character fuses semantic information from the preceding character. In the right-to-left pass, after obtaining the hidden-layer feature of the second character, the computer device passes it to the first character and generates the hidden-layer feature of the first character in combination with the hidden-layer feature of the second character; that is, the hidden-layer feature of each character fuses semantic information from the following character. The computer device then fuses the two hidden-layer features of each character to obtain its semantic feature. It should be noted that the above description of the semantic feature acquisition method is only an exemplary description of one possible implementation, and the embodiments of the present application do not limit which method is used. In the embodiments of the present application, semantic features containing the context information of the text are obtained, and erroneous characters are identified and corrected in combination with the text context, which improves the accuracy of text error correction. A sketch of such a bidirectional extractor follows.
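The sketch below uses a BiLSTM as a stand-in for the bidirectional pass described above; the vocabulary size, dimensions, and the choice of concatenation as the fusion of the two hidden-layer features are illustrative assumptions.

```python
# Sketch: bidirectional semantic feature extraction over a character sequence.
# Assumptions: a BiLSTM stands in for the two directional passes; concatenation
# is the fusion of the two hidden-layer features; sizes are illustrative.
import torch
import torch.nn as nn

char_emb = nn.Embedding(5000, 64)  # character id -> input vector
bilstm = nn.LSTM(64, 64, batch_first=True, bidirectional=True)

char_ids = torch.tensor([[11, 52, 7, 903]])  # one 4-character text
hidden, _ = bilstm(char_emb(char_ids))       # (1, 4, 128): left-to-right and
                                             # right-to-left features, concatenated
semantic_features = hidden                   # each character fuses both directions
print(semantic_features.shape)               # torch.Size([1, 4, 128])
```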
It should be noted that steps 402 to 404 are the steps of acquiring the font feature, the pronunciation feature, and the semantic feature of each character based on the structure and pronunciation of each character and its context information in the first text. In the embodiments of the present application, information external to the text, such as the pronunciation and glyph features of the characters, is fully fused and then combined with the semantic features of the characters in the text, which effectively expands the coverage of text error correction and improves the accuracy of the text error correction results.
405. The computer device determines a first weight corresponding to the font feature, a second weight corresponding to the pronunciation feature, and a third weight corresponding to the semantic feature of each character.
In the embodiments of the present application, the computer device can assign different weights to different features, so that in the subsequent feature decoding process more attention is paid to the important features, that is, the features with larger weights. In a possible implementation, the process by which the computer device obtains the weight of each feature includes the following two steps:
Step one: for any character, the computer device performs feature fusion on the font feature, the pronunciation feature, and the semantic feature of the character through the feature fusion network in the text error correction model, namely the backbone network of the BERT model, to obtain the initial fusion feature (Bert Hidden) corresponding to the character. It should be noted that the method by which the BERT model extracts the initial fusion feature is not limited in the embodiments of the present application.
Step two: based on the initial fusion feature of any character, the computer device determines the first weight corresponding to the font feature, the second weight corresponding to the pronunciation feature, and the third weight corresponding to the semantic feature of that character. In a possible implementation, the computer device first obtains the text semantic feature corresponding to the first text; that is, it performs overall feature extraction on the first text through the input layer of the text error correction model, namely the input layer of the BERT model, to obtain the text feature (CLS) of the first text. Then, for any character, the computer device fuses the text feature, the initial fusion feature, and the font feature of the character to obtain a first intermediate feature; fuses the text feature, the initial fusion feature, and the pronunciation feature of the character to obtain a second intermediate feature; and fuses the text feature, the initial fusion feature, and the semantic feature of the character to obtain a third intermediate feature. Finally, the computer device determines the first weight, the second weight, and the third weight based on the first, second, and third intermediate features, respectively. In one possible implementation, this process is expressed by the following formulas (2) to (5):
f_ij = softmax(ReLU(W_1 · [H_cls; H_i; E_ij] + b_1))   (2)

h_ij = ReLU(W_2 · [H_cls; H_i; E_ij] + b_2) ⊙ f_ij   (3)

u_ij = W_3 · [H_i; h_ij] + b_3   (4)

α_ij = exp(u_ij) / Σ_j' exp(u_ij')   (5)

where α_ij represents the weight corresponding to the j-th feature of the i-th character, with j ∈ {font feature, pronunciation feature, semantic feature}; H_cls represents the text semantic feature of the first text; H_i represents the initial fusion feature of the i-th character; E_ij represents the j-th feature of the i-th character; f_ij represents the intermediate gating feature obtained by fusing the text semantic feature H_cls, the initial fusion feature H_i, and the j-th feature of the character; h_ij represents the intermediate feature corresponding to the j-th feature of the character; [·;·] denotes concatenation and ⊙ denotes element-wise multiplication; and W_1, W_2, W_3, b_1, b_2, and b_3 are parameters whose values are set by the developer.
Alternatively, the computer device may, for example, obtain the first weight, the second weight, and the third weight by taking the dot product of the initial fusion feature with the font feature, the pronunciation feature, and the semantic feature of the character respectively, which is not limited in the embodiments of the present application. In the embodiments of the present application, different weights are assigned to the features of different dimensions based on the attention mechanism: for erroneous characters with similar glyphs, the weight of the font feature is larger, and for erroneous characters with similar pronunciations, a larger weight is assigned to the pronunciation feature. The computer device can thus pay more attention to the features with larger weights, that is, the more important features, in the subsequent text error correction process, thereby improving the accuracy of text error correction.
406. Based on the first weight, the second weight, and the third weight, the computer device performs weighted fusion of the font feature, the pronunciation feature, and the semantic feature of any character to obtain the fusion feature of that character.
In a possible implementation, the computer device may apply the first, second, and third weights directly to the font feature, the pronunciation feature, and the semantic feature, or may weight the intermediate features obtained in step 405, that is, perform weighted fusion of the first, second, and third intermediate features based on the first, second, and third weights respectively to obtain the fusion feature. The embodiments of the present application are described taking the weighted fusion of the intermediate features as an example: the intermediate features fuse the overall text semantic feature of the first text and therefore contain richer information, so performing feature fusion based on the intermediate features yields a fusion feature containing richer, multi-dimensional information, which further improves the accuracy of subsequent character prediction. In one possible implementation, the feature fusion process is expressed by the following formula (6):
Z = Σ_j α_ij · h_ij   (6)

where Z represents the fusion feature of the character, j ∈ {font feature, pronunciation feature, semantic feature}, α_ij represents the weight corresponding to the j-th feature of the i-th character, and h_ij represents the intermediate feature corresponding to the j-th feature of the character.
The above steps 405 and 406 are the steps of performing weighted fusion of the font feature, the pronunciation feature, and the semantic feature of each character to obtain the fusion feature of each character. Fig. 6 is a schematic diagram of a feature fusion method provided in an embodiment of the present application; the processes of steps 405 and 406 are described below with reference to fig. 6, taking the fusion of the features of one character as an example. The computer device fuses the text semantic feature (CLS) and the initial fusion feature (Bert Hidden) of the first text with the font feature (Shape Embedding), the pronunciation feature (Pinyin Embedding), and the semantic feature (Char Embedding) of the character respectively, obtaining a first intermediate feature 601, a second intermediate feature 602, and a third intermediate feature 603, and then performs weighted fusion of the three intermediate features based on the first weight, the second weight, and the third weight to obtain the fusion feature. A sketch of this fusion module follows.
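The following is a minimal sketch of formulas (2) to (6) under stated assumptions: all features share one dimension d, formula (5) is taken as a softmax normalization of the scalar scores u_ij over the three feature types, and all layer shapes are illustrative.

```python
# Sketch of the attention-based weighted fusion, formulas (2)-(6).
# Assumptions: every feature has dimension d; the softmax in (5) normalizes
# the scalar scores u_ij over {font, pronunciation, semantic}.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    def __init__(self, d: int = 128):
        super().__init__()
        self.w1 = nn.Linear(3 * d, d)  # gate over [H_cls; H_i; E_ij]   (2)
        self.w2 = nn.Linear(3 * d, d)  # intermediate feature h_ij      (3)
        self.w3 = nn.Linear(2 * d, 1)  # scalar score u_ij              (4)

    def forward(self, h_cls, h_i, feats):
        # h_cls: (d,) text feature; h_i: (d,) initial fusion feature;
        # feats: (3, d) font / pronunciation / semantic features E_ij
        h_list, u_list = [], []
        for e_ij in feats:
            x = torch.cat([h_cls, h_i, e_ij])
            f_ij = F.softmax(F.relu(self.w1(x)), dim=-1)    # (2)
            h_ij = F.relu(self.w2(x)) * f_ij                # (3)
            u_list.append(self.w3(torch.cat([h_i, h_ij])))  # (4)
            h_list.append(h_ij)
        alpha = F.softmax(torch.cat(u_list), dim=-1)        # (5) weights over 3 features
        return sum(a * h for a, h in zip(alpha, h_list))    # (6) fusion feature Z

fusion = AttentionFusion()
z = fusion(torch.rand(128), torch.rand(128), torch.rand(3, 128))
print(z.shape)  # torch.Size([128])
```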
407. The computer device decodes the obtained at least two fusion features to obtain at least two target characters, and the at least two target characters form the second text.
Wherein the second text is the text in which the erroneous characters of the first text have been corrected.
In one possible implementation, the computer device decodes each fusion feature into a classification vector through the output layer of the text error correction model, namely the output layer of the BERT model, where each element in the classification vector indicates the probability that the fusion feature corresponds to one candidate character, and the computer device determines each target character as the candidate character indicated by the element with the largest value in the corresponding classification vector. It should be noted that the above description of the target character determination method is only an exemplary description of one possible implementation, and the embodiments of the present application do not limit which method is used to determine the target characters. The computer device sorts the target characters to obtain the second text, where the order of the target characters is the same as the order of the corresponding fusion features; illustratively, the computer device determines the order of each fusion feature based on the position in the first text of the character corresponding to that fusion feature. A sketch of this decoding step follows.
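A minimal decoding sketch, assuming a toy four-character candidate vocabulary and an illustrative linear classifier in place of the BERT output layer.

```python
# Sketch: each fusion feature -> classification vector -> largest element.
# Assumptions: a toy candidate vocabulary; the linear classifier is illustrative.
import torch
import torch.nn as nn

vocab = ["天", "气", "很", "好"]                  # candidate characters (illustrative)
classifier = nn.Linear(128, len(vocab))           # fusion feature -> classification vector

fusion_feats = torch.rand(4, 128)                 # one fusion feature per character position
class_vectors = classifier(fusion_feats)          # (4, |vocab|) scores per candidate character
target_ids = class_vectors.argmax(dim=-1)         # element with the largest value
second_text = "".join(vocab[i] for i in target_ids)  # characters kept in position order
print(second_text)
```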
Fig. 7 is a schematic diagram of a text error correction method according to an embodiment of the present application; the text error correction process is described below with reference to fig. 7. As shown in fig. 7, in a possible implementation, the computer device inputs the first text into the text error correction model, namely the BERT model, and the BERT model extracts the font feature (Shape Embedding), the pronunciation feature (Pinyin Embedding), and the semantic feature (Char Embedding) corresponding to each character in the first text; optionally, the BERT model further extracts a position feature (Position Embedding) and a segmentation feature (Segment Embedding) of each character. The computer device performs feature fusion based on the extracted multi-dimensional features to obtain the initial fusion feature of each character and the text semantic feature of the first text. For any character, the computer device performs weighted fusion of the features of each dimension based on the attention mechanism, that is, executes steps 405 and 406, to obtain the fusion feature, maps the fusion feature into a classification vector (classifier), determines the target character based on the classification vector, and then obtains the second text.
According to the technical solution provided by the embodiments of the present application, when the first text is corrected, the font feature and pronunciation feature of each character, as well as its contextual semantic feature in the first text, are fully considered, and the features of these three dimensions are fused to predict the correct character. Any character appearing in the first text can be identified and corrected, which effectively expands the coverage of text error correction, and the multi-dimensional feature fusion improves the accuracy of text error correction.
Fig. 8 is an interface schematic diagram of an information application provided in an embodiment of the present application. As shown in fig. 8 (a), when a user searches information content, the search results may contain wrongly written characters, for example the search result shown in area 801. With the technical solution provided in the embodiments of the present application, the computer device may perform text error correction on the search results and replace the erroneous characters, as shown in fig. 8 (b); alternatively, the computer device may directly filter out content containing erroneous characters, or move search content containing erroneous characters to a later position in the search result interface. Combining the text error correction method provided by the embodiments of the present application with an information application can effectively improve the quality of the information content browsed by the user, accurately filter malicious content in the application, and improve the user experience of the application.
The text error correction model in the above embodiments is a pre-trained model stored in the computer device; it may be a model trained by the computer device itself or a model trained by another device. Fig. 9 is a flowchart of a training method for the text error correction model provided in an embodiment of the present application. Referring to fig. 9, in a possible implementation, the training method includes the following steps:
901. The computer device obtains a text error correction model to be trained and at least two training samples.
In the embodiments of the present application, the text error correction model to be trained is a BERT model that has been pre-trained. Each training sample comprises a first training text and a second training text, where the first training text is a text containing erroneous characters and the second training text is the text in which the erroneous characters of the first training text have been corrected. In one possible implementation, the computer device trains the text error correction model on equal-length sequences, that is, every text input into the model contains the same number of characters; illustratively, before each training text is input into the model, the computer device adjusts its length, for example by appending placeholder characters at the end of the training text, so that all training texts have the same length, as in the sketch below.
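A minimal sketch of the equal-length adjustment; the choice of "□" as the placeholder character is an illustrative assumption.

```python
# Sketch: pad every training text to the same length with a placeholder character.
# Assumption: "□" is the placeholder appended at the end of shorter texts.
def pad_batch(texts: list[str], pad: str = "□") -> list[str]:
    max_len = max(len(t) for t in texts)
    return [t + pad * (max_len - len(t)) for t in texts]

print(pad_batch(["天气很好", "今天天气真不错"]))
# ['天气很好□□□', '今天天气真不错']
```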
902. The computer device inputs the at least two training samples into the text error correction model and obtains the error between the output result of the text error correction model and the correct result.
In the embodiments of the present application, the computer device corrects the first training text in a training sample through the text error correction model to obtain the output result of the model; the error correction process for the first training text is the same as steps 401 to 407 above. The computer device obtains the error between the output result of the model and the correct result, namely the corresponding second training text, based on a loss function. The loss function may be a cross-entropy loss function, which is not limited in the embodiments of the present application, nor is the method for obtaining the error between the model output and the correct result.
903. The computer device adjusts the parameters of the text error correction model based on the error until the model convergence condition is met, obtaining the trained text error correction model.
In one possible implementation, the computer device compares the obtained error with an error threshold; if the error is greater than the error threshold, the computer device back-propagates the error through the text error correction model and updates the parameters of the model using an Adaptive Moment Estimation (Adam) optimization algorithm combined with the SWA (Stochastic Weight Averaging) technique. If the error obtained by the computer device is smaller than the error threshold, the output result of the model is determined to be correct, and the computer device continues to read the next set of training samples and performs step 902 again. The error threshold is set by a developer and is not limited in the embodiments of the present application. A sketch of one such training step follows.
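The following sketch shows one training step combining a cross-entropy loss, Adam, and PyTorch's SWA utilities; the stand-in model, batch shapes, and learning rate are illustrative assumptions.

```python
# Sketch: one training step with cross-entropy loss, Adam, and SWA.
# Assumptions: a linear layer stands in for the text error correction model;
# batch shapes and the learning rate are illustrative.
import torch
import torch.nn as nn
from torch.optim.swa_utils import AveragedModel

model = nn.Linear(128, 5000)                  # stand-in for the error correction model
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
swa_model = AveragedModel(model)              # keeps the running weight average
criterion = nn.CrossEntropyLoss()

fusion_feats = torch.rand(8, 128)             # a batch of fusion features
target_chars = torch.randint(0, 5000, (8,))   # ids of the correct characters

loss = criterion(model(fusion_feats), target_chars)  # error vs. the second training text
loss.backward()                               # back-propagate the error
optimizer.step()                              # Adam parameter update
optimizer.zero_grad()
swa_model.update_parameters(model)            # SWA: fold current weights into the average
```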
In a possible implementation, if the number of correct output results obtained by the computer device reaches a reference number, or all the training samples have been read, it is determined that the text error correction model meets the model convergence condition, and the trained text error correction model is obtained. The reference number is set by a developer and is not limited in the embodiments of the present application. The model convergence condition may also be set to other contents, which is likewise not limited in the embodiments of the present application.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
Fig. 10 is a schematic structural diagram of a text error correction apparatus according to an embodiment of the present application. Referring to fig. 10, the apparatus includes:
a text obtaining module 1001, configured to obtain a first text to be corrected, where the first text includes at least two characters;
a feature obtaining module 1002, configured to obtain the font feature, the pronunciation feature, and the semantic feature of each character based on the structure and pronunciation of each character and its context information in the first text;
a feature fusion module 1003, configured to perform weighted fusion of the font feature, the pronunciation feature, and the semantic feature of each character to obtain the fusion feature of each character;
a feature decoding module 1004, configured to decode the obtained at least two fusion features to obtain at least two target characters and form a second text from the at least two target characters, where the second text is the text in which the erroneous characters of the first text have been corrected.
In one possible implementation, the feature obtaining module 1002 includes:
a first obtaining sub-module, configured to extract the font features of the at least two characters based on their structures through the glyph analysis network in the text error correction model;

a second obtaining sub-module, configured to extract the pronunciation features of the at least two characters based on their pronunciations through the voice recognition network in the text error correction model;

and a third obtaining sub-module, configured to extract the semantic features of the at least two characters based on their context information in the first text through the semantic recognition network in the text error correction model.
In one possible implementation, the first obtaining sub-module is configured to:
generate, through the glyph analysis network, a character node graph corresponding to any character based on the structure of that character and the structures of at least two reference characters, where the character node graph indicates the association, in the structural dimension, between that character and the at least two reference characters;

and perform feature extraction on the character node graph corresponding to that character to obtain the font feature of that character.
In one possible implementation, the first obtaining sub-module is configured to:
acquiring a character image corresponding to each character, wherein the character image is used for indicating the structure of the character;
and perform image feature extraction on the character image corresponding to each character through the glyph analysis network to obtain the font feature of each character.
In one possible implementation, the second obtaining sub-module includes:
a pinyin obtaining unit, configured to obtain a pinyin corresponding to each character, where the pinyin is used to indicate the pronunciation of the character;
and the pinyin coding unit is used for coding the pinyin corresponding to each character through the voice recognition network to obtain the pronunciation characteristics of each character.
In one possible implementation manner, the pinyin coding unit is configured to:
process the pinyin corresponding to each character through the voice recognition network based on a reference mapping condition, where the reference mapping condition includes at least one of mapping a retroflex initial in the pinyin to the corresponding dental initial, mapping a nasal initial to the corresponding lateral initial, mapping a back nasal final to the corresponding front nasal final, and removing the tone of the pinyin;
and encode the processed pinyin through the voice recognition network to obtain the pronunciation feature of each character.
In one possible implementation, the second obtaining sub-module is configured to:
acquiring audio files corresponding to each character, wherein one audio file comprises voice information for reading one character;
and extracting audio features of the audio file corresponding to each character through the voice recognition network to obtain the pronunciation features of each character.
In one possible implementation, the feature fusion module 1003 includes:
a first fusion sub-module, configured to perform, for any character, feature fusion on the font feature, the pronunciation feature, and the semantic feature of that character through the feature fusion network in the text error correction model to obtain the initial fusion feature corresponding to that character;
the weight determining submodule is used for respectively determining a first weight corresponding to the font feature, a second weight corresponding to the pronunciation feature and a third weight corresponding to the semantic feature of any character based on the initial fusion feature corresponding to the any character;
and the second fusion submodule is used for performing weighted fusion on the font characteristic, the pronunciation characteristic and the semantic characteristic of any character based on the first weight, the second weight and the third weight to obtain the fusion characteristic of any character.
In one possible implementation, the weight determination submodule is configured to:
acquiring text semantic features corresponding to the first text;
for any character, performing feature fusion on the text feature, the initial fusion feature and the font feature of the character to obtain a first intermediate feature;
performing feature fusion on the text feature, the initial fusion feature and the pronunciation feature of any character to obtain a second intermediate feature;
performing feature fusion on the text feature, the initial fusion feature and the semantic feature of any character to obtain a third intermediate feature;
determining the first weight, the second weight, and the third weight based on the first intermediate feature, the second intermediate feature, and the third intermediate feature, respectively.
In one possible implementation, the second fusion submodule is configured to:
and performing weighted fusion on the first intermediate feature, the second intermediate feature and the third intermediate feature based on the first weight, the second weight and the third weight respectively to obtain the fused feature.
In one possible implementation, the feature decoding module 1004 is configured to:
decoding each fusion feature into a classification vector respectively, wherein one element in the classification vector is used for indicating the probability that the fusion feature corresponds to one candidate character;
and determine each target character as the candidate character indicated by the element with the largest value in the corresponding classification vector.
In one possible implementation, the apparatus further comprises at least one of:
the font adjusting module is used for unifying the at least two characters in the first text into a reference font;
and the character removing module is used for removing foreign characters in the first text.
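Purely as an illustration of these two preprocessing modules, and assuming that "foreign characters" means characters outside the CJK range (the embodiments leave the definition open) and that unifying into a reference font means rendering each character with a single font via the Pillow library, a sketch could be:

```python
import re
from PIL import Image, ImageDraw, ImageFont

def remove_foreign_characters(text: str) -> str:
    """Drop characters outside the CJK Unified Ideographs block and common
    Chinese punctuation; what counts as 'foreign' is an assumption here."""
    return re.sub(r"[^\u4e00-\u9fff\u3000-\u303f，。！？；：]", "", text)

def render_in_reference_font(char: str, font_path: str = "SimSun.ttf",
                             size: int = 32) -> Image.Image:
    """Render one character in a single reference font so that glyph
    features are comparable; font path and image size are illustrative."""
    font = ImageFont.truetype(font_path, size)
    image = Image.new("L", (size, size), color=255)  # white grayscale canvas
    ImageDraw.Draw(image).text((0, 0), char, fill=0, font=font)
    return image
```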
According to the device provided by the embodiment of the application, when the first text is corrected, the font feature and the pronunciation feature of each character as well as the contextual semantic feature of each character in the first text are fully considered, and the features of these three dimensions are fused to predict the correct character. Any character appearing in the first text can therefore be identified and corrected, which effectively expands the coverage of text error correction, while the multi-dimensional feature fusion improves its accuracy.
It should be noted that the division into the above functional modules in the text error correction device provided by the above embodiment is merely illustrative. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the text error correction device and the text error correction method provided by the above embodiments belong to the same concept; for the specific implementation process, refer to the method embodiment, which is not described again here.
The computer device provided in the foregoing technical solution may be implemented as a terminal or a server. For example, Fig. 11 is a schematic structural diagram of a terminal provided in this embodiment of the present application. The terminal 1100 may be a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 1100 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 1100 includes one or more processors 1101 and one or more memories 1102.
The processor 1101 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1101 may be implemented in at least one hardware form among a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1101 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1101 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1101 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 1102 may include one or more computer-readable storage media, which may be non-transitory. The memory 1102 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 1102 is used to store at least one computer program that is executed by the processor 1101 to implement the text error correction method provided by the method embodiments herein.
In some embodiments, the terminal 1100 may further include a peripheral interface 1103 and at least one peripheral. The processor 1101, the memory 1102 and the peripheral interface 1103 may be connected by buses or signal lines. Each peripheral may be connected to the peripheral interface 1103 by a bus, a signal line, or a circuit board. Specifically, the peripherals include at least one of a radio frequency circuit 1104, a display screen 1105, a camera assembly 1106, an audio circuit 1107, a positioning assembly 1108, and a power supply 1109.
The peripheral interface 1103 may be used to connect at least one I/O (Input/Output) related peripheral to the processor 1101 and the memory 1102. In some embodiments, the processor 1101, the memory 1102, and the peripheral interface 1103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1101, the memory 1102 and the peripheral interface 1103 may be implemented on separate chips or circuit boards, which is not limited by this embodiment.
The radio frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1104 communicates with communication networks and other communication devices via electromagnetic signals, converting an electric signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1104 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 1104 may communicate with other terminals via at least one wireless communication protocol, including but not limited to metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1104 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 1105 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display screen 1105 is a touch display screen, it also has the ability to capture touch signals on or over its surface. Such a touch signal may be input to the processor 1101 as a control signal for processing. In this case, the display screen 1105 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1105, providing the front panel of the terminal 1100; in other embodiments, there may be at least two display screens 1105, respectively disposed on different surfaces of the terminal 1100 or in a folded design; in still other embodiments, the display screen 1105 may be a flexible display disposed on a curved surface or a folded surface of the terminal 1100. The display screen 1105 may even be arranged in an irregular, non-rectangular shape, that is, a shaped screen. The display screen 1105 may be an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or the like.
The camera assembly 1106 is used to capture images or video. Optionally, the camera assembly 1106 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be combined to realize a background blurring function, and the main camera and the wide-angle camera can be combined to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 1106 may also include a flash, which can be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 1107 may include a microphone and a speaker. The microphone is used to collect sound waves of the user and the environment, convert the sound waves into electric signals, and input them to the processor 1101 for processing or to the radio frequency circuit 1104 for voice communication. For stereo capture or noise reduction purposes, there may be multiple microphones, each disposed at a different location of the terminal 1100. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electric signals from the processor 1101 or the radio frequency circuit 1104 into sound waves, and may be a conventional diaphragm speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can not only convert an electric signal into sound waves audible to humans, but also convert an electric signal into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 1107 may also include a headphone jack.
The positioning component 1108 is used to locate the current geographic position of the terminal 1100 for navigation or LBS (Location Based Service). The positioning component 1108 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 1109 is used to supply power to the various components in the terminal 1100. The power supply 1109 may be an alternating current supply, a direct current supply, a disposable battery, or a rechargeable battery. When the power supply 1109 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging, and may also support fast-charging technology.
In some embodiments, terminal 1100 can also include one or more sensors 1110. The one or more sensors 1110 include, but are not limited to: acceleration sensor 1111, gyro sensor 1112, pressure sensor 1113, fingerprint sensor 1114, optical sensor 1115, and proximity sensor 1116.
The acceleration sensor 1111 may detect the magnitudes of acceleration on the three coordinate axes of a coordinate system established with the terminal 1100; for example, it may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 1101 may control the display screen 1105 to display the user interface in a landscape or portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1111. The acceleration sensor 1111 may also be used to collect motion data of a game or a user.
The gyro sensor 1112 may detect the body direction and rotation angle of the terminal 1100, and may cooperate with the acceleration sensor 1111 to acquire the user's 3D motion with respect to the terminal 1100. From the data collected by the gyro sensor 1112, the processor 1101 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 1113 may be disposed on a side bezel of the terminal 1100 and/or under the display screen 1105. When the pressure sensor 1113 is disposed on the side bezel of the terminal 1100, a user's holding signal on the terminal 1100 can be detected, and the processor 1101 performs left/right-hand recognition or shortcut operations according to the holding signal collected by the pressure sensor 1113. When the pressure sensor 1113 is disposed under the display screen 1105, the processor 1101 controls operability controls on the UI according to the user's pressure operation on the display screen 1105. The operability controls include at least one of a button control, a scroll-bar control, an icon control, and a menu control.
The fingerprint sensor 1114 is used to collect the user's fingerprint, and the processor 1101 identifies the user according to the fingerprint collected by the fingerprint sensor 1114, or the fingerprint sensor 1114 itself identifies the user according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 1101 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 1114 may be disposed on the front, back, or side of the terminal 1100. When a physical button or vendor logo is provided on the terminal 1100, the fingerprint sensor 1114 may be integrated with the physical button or vendor logo.
The optical sensor 1115 is used to collect the ambient light intensity. In one embodiment, the processor 1101 may control the display brightness of the display screen 1105 based on the ambient light intensity collected by the optical sensor 1115: when the ambient light intensity is high, the display brightness of the display screen 1105 is increased; when the ambient light intensity is low, the display brightness of the display screen 1105 is reduced. In another embodiment, the processor 1101 may also dynamically adjust the shooting parameters of the camera assembly 1106 based on the ambient light intensity collected by the optical sensor 1115.
The proximity sensor 1116, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 1100 and is used to capture the distance between the user and the front face of the terminal 1100. In one embodiment, when the proximity sensor 1116 detects that the distance between the user and the front face of the terminal 1100 gradually decreases, the processor 1101 controls the display screen 1105 to switch from a screen-on state to a screen-off state; when the proximity sensor 1116 detects that the distance gradually increases, the processor 1101 controls the display screen 1105 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in Fig. 11 does not constitute a limitation of the terminal 1100, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
Fig. 12 is a schematic structural diagram of a server 1200 according to an embodiment of the present application. The server 1200 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 1201 and one or more memories 1202, where the one or more memories 1202 store at least one computer program that is loaded and executed by the one or more processors 1201 to implement the methods provided by the foregoing method embodiments. Of course, the server 1200 may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described here again.
In an exemplary embodiment, a computer-readable storage medium, such as a memory including at least one computer program, is also provided, where the at least one computer program is executable by a processor to perform the text error correction method in the above embodiments. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a computer program product is also provided, the computer program product comprising at least one computer program, the at least one computer program being stored in a computer readable storage medium. The at least one computer program is read by a processor of the computer device from the computer-readable storage medium, and the at least one computer program is executed by the processor to cause the computer device to implement the operations performed by the text error correction method.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.