CN113761190B - Text recognition method, device, computer readable medium and electronic device - Google Patents

Text recognition method, device, computer readable medium and electronic device

Info

Publication number
CN113761190B
CN113761190B (application CN202110492354.2A)
Authority
CN
China
Prior art keywords
text
classification
entity
label
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110492354.2A
Other languages
Chinese (zh)
Other versions
CN113761190A (en)
Inventor
铁瑞雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110492354.2A
Publication of CN113761190A
Application granted
Publication of CN113761190B
Status: Active
Anticipated expiration

Abstract

Translated from Chinese


The embodiments of the present application provide a text recognition method, device, computer-readable medium and electronic device. The text recognition method includes: adding a first classification mark to the text to be recognized to generate an input object corresponding to the text to be recognized; inputting the input object into a text recognition model, the text recognition model is trained based on a target sample text carrying annotated entity labels and annotated classification labels; obtaining the predicted entity labels corresponding to each character in the text to be recognized output by the text recognition model and the predicted classification labels corresponding to the first classification mark; generating an entity recognition result for the text to be recognized based on the predicted entity labels, and generating a classification result for the text to be recognized based on the predicted classification labels. The technical solution of the embodiments of the present application can realize the extraction of entities in the text while classifying the text, and can improve the accuracy of text recognition.

Description

Text recognition method, text recognition device, computer readable medium and electronic equipment
Technical Field
The present application relates to the field of computers and communication technologies, and in particular, to a text recognition method, a text recognition device, a computer readable medium, and an electronic device.
Background
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. Text recognition, as an important application of natural language processing, is widely used for text content detection.
To identify negative information in web text, the related art typically relies on manually constructed features or on models trained with large-scale corpora. However, manual feature construction is costly and the features easily become sparse, while the labeling quality of a large corpus strongly affects the training result, so the effect is often poor and the accuracy low.
Disclosure of Invention
The embodiment of the application provides a text recognition method, a text recognition device, a computer readable medium and electronic equipment, which can be used for extracting entities in a text while classifying the text to a certain extent and improving the accuracy of text recognition.
Other features and advantages of the application will be apparent from the following detailed description, or may be learned by the practice of the application.
According to one aspect of the embodiments of the present application, a text recognition method is provided, which comprises the steps of: adding a first classification mark to a text to be recognized to generate an input object corresponding to the text to be recognized; inputting the input object into a text recognition model, where the text recognition model is trained on a target sample text carrying annotated entity labels and an annotated classification label, the annotated entity labels being the entity labels corresponding to each character in the target sample text and the annotated classification label being the classification label corresponding to a second classification mark added to the target sample text; obtaining the predicted entity label corresponding to each character in the text to be recognized and the predicted classification label corresponding to the first classification mark, both output by the text recognition model; generating an entity recognition result for the text to be recognized according to the predicted entity labels; and generating a classification result for the text to be recognized according to the predicted classification label.
According to one aspect of the embodiments of the present application, a text recognition device is provided, which comprises a first adding unit, a first input unit, an acquisition unit and a generation unit. The first adding unit is configured to add a first classification mark to the text to be recognized to generate an input object corresponding to the text to be recognized. The first input unit is configured to input the input object into a text recognition model, where the text recognition model is trained on a target sample text carrying annotated entity labels and an annotated classification label, the annotated entity labels being the entity labels corresponding to each character in the target sample text and the annotated classification label being the classification label corresponding to a second classification mark added to the target sample text. The acquisition unit is configured to acquire the predicted entity label corresponding to each character in the text to be recognized and the predicted classification label corresponding to the first classification mark, both output by the text recognition model. The generation unit is configured to generate an entity recognition result for the text to be recognized according to the predicted entity labels, and to generate a classification result for the text to be recognized according to the predicted classification label.
In some embodiments of the present application, based on the foregoing solutions, the first adding unit is configured to split the text to be recognized into individual characters to generate a character sequence corresponding to the text to be recognized, and to add the first classification mark to the character sequence to obtain a new character sequence, which serves as the input object corresponding to the text to be recognized.
In some embodiments of the present application, based on the foregoing solutions, the generating unit is configured to recognize, as one entity, characters that occupy consecutive positions in the text to be recognized and whose corresponding predicted entity labels indicate the same entity, thereby obtaining the entity recognition result for the text to be recognized, and to use the classification category indicated by the predicted classification label as the classification result for the text to be recognized.
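The merging rule described for the generating unit (consecutive characters whose predicted entity labels indicate the same entity are recognized as one entity) can be sketched as follows; the label names used here ("ORG", and "O" for non-entity characters) are illustrative assumptions, not the patent's actual label set.

```python
def merge_entities(chars, tags):
    """Group consecutive characters whose predicted entity labels name the
    same entity type into single entity spans; "O" marks a non-entity
    character (an assumed convention for this sketch)."""
    entities = []
    cur_text, cur_tag = "", None
    for ch, tag in zip(chars, tags):
        if tag != "O" and tag == cur_tag:
            cur_text += ch                     # extend the running entity
        else:
            if cur_tag not in (None, "O"):
                entities.append((cur_text, cur_tag))
            cur_text, cur_tag = (ch, tag) if tag != "O" else ("", None)
    if cur_tag not in (None, "O"):
        entities.append((cur_text, cur_tag))   # flush the final entity
    return entities
```

For example, `merge_entities(list("腾讯发布财报"), ["ORG", "ORG", "O", "O", "O", "O"])` yields `[("腾讯", "ORG")]`: the two consecutive "ORG" characters are merged into one entity.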
In some embodiments of the present application, based on the foregoing solutions, the apparatus further includes a second adding unit configured to add the second classification mark to the target sample text to generate a sample input object corresponding to the target sample text; a second input unit configured to input the sample input object into a model to be trained, to obtain the prediction scores, output by the model to be trained, of each character in the target sample text for each candidate entity label and of the second classification mark for each candidate classification label; and a determining unit configured to determine a loss function according to the annotated entity labels, the annotated classification label and the prediction scores, and to adjust the parameters of the model to be trained according to the loss function to obtain the text recognition model.
In some embodiments of the application, the determining unit comprises a prediction score determining subunit configured to determine, from the prediction scores and according to the annotated entity labels and the annotated classification label, the target prediction scores of the entity label that matches the annotated entity label and of the classification label that matches the annotated classification label, and a loss function determining subunit configured to determine the loss function according to the target prediction scores.
In some embodiments of the present application, based on the foregoing scheme, the loss function determining subunit is configured to calculate a ratio between the target prediction score and a sum of the prediction scores, perform a logarithmic operation on the ratio to obtain an operation result, and determine the loss function according to the operation result.
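The ratio-then-logarithm construction described for the loss function determining subunit resembles a cross-entropy-style loss. Below is a minimal per-token sketch, assuming non-negative prediction scores and a final negation so that the quantity can be minimized (the negation is an assumption, since the text only says the loss is determined "according to the operation result"):

```python
import math

def token_loss(scores, target_label):
    """Compute -log(score of the annotated label / sum of all scores) for
    one character (or for the classification mark). `scores` maps each
    candidate label to a non-negative prediction score."""
    ratio = scores[target_label] / sum(scores.values())
    return -math.log(ratio)
```

With uniform scores over two labels the loss is `log 2 ≈ 0.693`; as the target label's share of the total score approaches 1, the loss approaches 0.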
In some embodiments of the present application, based on the foregoing solutions, the second adding unit is configured to split the target sample text into individual characters to generate a character sequence corresponding to the target sample text, to add the second classification mark to the character sequence to obtain a new character sequence, and to use the new character sequence as the sample input object corresponding to the target sample text.
In some embodiments of the present application, based on the foregoing solutions, the apparatus further includes a keyword obtaining unit configured to obtain a keyword from a keyword library that is built in advance, a sample text obtaining unit configured to obtain a sample text including the keyword according to the keyword, a labeling unit configured to perform entity tag labeling and category tag labeling on a part of sample texts in the obtained sample text to generate an initial sample text, and a processing unit configured to perform data enhancement processing on the initial sample text to generate the target sample text.
In some embodiments of the present application, based on the foregoing, the processing unit is configured to copy the initial sample text to obtain a copy of the initial sample text, and generate the target sample text according to the initial sample text and the copy of the initial sample text.
In some embodiments of the application, based on the foregoing scheme, the processing unit is configured to perform synonym replacement on the target keyword included in the initial sample text to generate sample text including synonyms of the target keyword, and generate the target sample text according to the initial sample text and the sample text including synonyms of the target keyword.
In some embodiments of the present application, based on the foregoing solutions, the processing unit is configured to delete a portion of text included in the initial sample text to obtain a processed initial sample text, and generate the target sample text according to the initial sample text and the processed initial sample text.
In some embodiments of the present application, based on the foregoing scheme, the processing unit is configured to randomly insert words into the initial sample text to obtain a new initial sample text, and generate the target sample text according to the initial sample text and the new initial sample text.
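The four data enhancement strategies attributed to the processing unit above (copying, synonym replacement, deleting part of the text, randomly inserting words) can be sketched together as follows. The synonym table and fallback insertion vocabulary are placeholder assumptions, and in a real pipeline the per-character entity labels would have to be realigned after a deletion or insertion:

```python
import random

def augment(text, synonyms, rng=None):
    """Return the original sample plus augmented variants: a plain copy,
    synonym replacements, a one-character deletion, and a random word
    insertion."""
    rng = rng or random.Random(0)               # seeded for reproducibility
    out = [text, text]                          # original + plain copy
    for kw, syn in synonyms.items():            # synonym replacement
        if kw in text:
            out.append(text.replace(kw, syn))
    if len(text) > 1:                           # delete one character
        i = rng.randrange(len(text))
        out.append(text[:i] + text[i + 1:])
    vocab = list(synonyms.values()) or ["的"]   # placeholder vocabulary
    i = rng.randrange(len(text) + 1)            # insert a random word
    out.append(text[:i] + rng.choice(vocab) + text[i:])
    return out
```

For example, `augment("公司亏损", {"亏损": "赤字"})` returns five samples: the original, its copy, the synonym-replaced "公司赤字", a three-character deletion variant, and a six-character insertion variant.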
According to an aspect of an embodiment of the present application, there is provided a computer-readable medium having stored thereon a computer program which, when executed by a processor, implements a text recognition method as described in the above embodiments.
According to an aspect of an embodiment of the present application, there is provided an electronic device including one or more processors, and storage means for storing one or more programs, which when executed by the one or more processors, cause the one or more processors to implement the text recognition method as described in the above embodiment.
According to an aspect of embodiments of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the text recognition method provided in the above-described various alternative embodiments.
In the technical solutions provided in some embodiments of the present application, a first classification mark is added to the text to be recognized to generate an input object corresponding to the text to be recognized; the input object is then input into a text recognition model trained on a target sample text carrying annotated entity labels (the entity labels corresponding to each character in the target sample text) and an annotated classification label (the classification label corresponding to a second classification mark added to the target sample text); the predicted entity label for each character in the text to be recognized and the predicted classification label for the first classification mark, both output by the text recognition model, can then be obtained; finally, an entity recognition result for the text to be recognized can be generated from the predicted entity labels, and a classification result from the predicted classification label. In this technical scheme, the text to be recognized is recognized by the text recognition model, so no manual feature construction is needed and semantic knowledge can be extracted automatically, which greatly reduces labor cost and improves the efficiency of entity recognition and text classification. Moreover, because the text recognition model extracts the entities in the text while classifying it, the entity recognition task and the text classification task are trained jointly: the two tasks share semantic knowledge, the model's attention to entities is increased, and the accuracy of both entity recognition and text classification is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. It is evident that the drawings in the following description are only some embodiments of the present application and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art. In the drawings:
FIG. 1 is a schematic view of an application environment of a text recognition method according to an embodiment of the present application;
FIG. 2 is a flow chart of a text recognition method provided by one embodiment of the present application;
FIG. 3 illustrates a block diagram of a text recognition model;
FIG. 4 is a flow chart of a text recognition model training method provided by one embodiment of the present application;
FIG. 5 is a flow chart of a method of determining a loss function provided by one embodiment of the application;
FIG. 6 is a flow chart of a target sample text generation method provided by one embodiment of the present application;
FIG. 7 is a block diagram of a text recognition device provided by one embodiment of the present application;
Fig. 8 shows a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be embodied in many different forms and should not be construed as limited to the examples set forth herein, but rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the exemplary embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the application may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the application.
It should be noted that the terms used in the description of the present application and the claims and the above-mentioned drawings are only used for describing the embodiments, and are not intended to limit the scope of the present application. It will be understood that the terms "comprises," "comprising," "includes," "including" and/or "having," when used herein, specify the presence of stated features, integers, steps, operations, elements, components, and/or groups thereof, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It will be further understood that, although the terms "first," "second," "third," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element without departing from the scope of the present invention. Similarly, the second element may be referred to as a first element. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more.
Artificial Intelligence (AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, so that machines gain the capabilities of perception, reasoning and decision-making.
Artificial intelligence is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
With the research and advancement of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, unmanned aerial vehicles, robots, smart healthcare and smart customer service. It is believed that, as technology develops, artificial intelligence will be applied in ever more fields and bring ever greater value.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science and mathematics; research in this field involves natural language, i.e. the language people use daily, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graph techniques, and the like.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and learning from instruction.
The scheme provided by the embodiment of the application relates to artificial intelligence natural language processing, machine learning and other technologies, and is specifically described by the following embodiments:
According to an aspect of the embodiments of the present application, a text recognition method is provided. Optionally, as an alternative implementation, the text recognition method may be applied, but is not limited to, in the environment shown in fig. 1. The application environment includes a terminal device 101 and a server 102. The terminal device 101 and the server 102 communicate data through a communication network; optionally, the communication network may be a wired network or a wireless network, and may be at least one of a local area network, a metropolitan area network and a wide area network.
The terminal device 101 is an electronic device with a network-based search function. The electronic device may be a mobile terminal such as a smartphone, a tablet computer or a laptop computer, or a terminal such as a desktop computer or a projection computer, which is not limited in the embodiments of the present application.
The server 102 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), big data and artificial intelligence platforms.
In one embodiment of the present application, a text recognition model is provided in the server 102, the text recognition model being trained from target sample text carrying labeled entity tags and labeled class tags.
After the terminal device 101 sends a text recognition request to the server 102, the server 102 adds a first classification mark in the text to be recognized, generates an input object corresponding to the text to be recognized, takes the input object as input of a text recognition model, performs text recognition by the text recognition model, and outputs a prediction entity tag corresponding to each text in the text to be recognized and a prediction classification tag corresponding to the first classification mark.
In other possible embodiments, the text recognition model may be issued by the server 102 to the terminal device 101, and the terminal device 101 uses the text recognition model to recognize the text to be recognized, which is not limited in this embodiment.
It should be noted that the text recognition model provided by the application can be widely applied in various fields. For example, in the field of web text recognition (web texts may include, for example, public opinion texts), a text recognition model is deployed in the background server of an application program or portal site and is used to recognize the classification (negative or non-negative) of web texts and the entities in them. After the application program or portal site goes live, users can publish web texts according to their actual situations; the server acquires the web texts published by users from the open interface of the application program or portal site, then recognizes each web text through the text recognition model, extracting the entities and relations in the text while judging whether it is negative, so that risks can be discovered in time.
The identification of negative information about an enterprise is crucial to the enterprise's risk control. In practice, however, it is difficult for the enterprise to perceive external risks in time, and actively searching for negative information manually takes a great deal of time.
Current web text recognition schemes fall roughly into three types: (1) rule templates, which judge a text as negative when negative keywords trigger the corresponding rules; (2) traditional machine learning models, which perform classification training on labeled corpora with a large number of manually constructed features; (3) deep learning models, which need no manual feature construction and extract semantic feature vectors from the text to train a classification model.
However, (1) rule templates are prone to misjudgment, require constant supplementing and maintenance of the rule template library, rely on historical experience, and cannot proactively discover new expressions of negative patterns; (2) traditional machine learning models incur a high cost for manual feature construction, and the features easily become sparse; (3) deep learning models depend on large-scale training corpora, whose labeling quality strongly affects the training effect; in addition, web text is usually complex, and a classification model used alone often performs poorly.
On this basis, the embodiments of the present application provide a text recognition method that can be applied to the recognition of web text. The web text is recognized through a text recognition model that extracts the entities and relations in the text while judging whether it is negative. By jointly training the named entity recognition task and the classification task, the model implicitly increases its attention to important entities, the two tasks share semantic knowledge, the problem of risk ambiguity is better resolved, and task accuracy is improved.
Of course, the text recognition method of the present application can also be applied to other fields, and is not exemplified herein. In addition, in practical application, a plurality of text recognition models may be used to recognize the text to be recognized at the same time, for example, a first text recognition model is used to determine a domain to which the text to be recognized belongs, and a text recognition model corresponding to the domain is selected based on the domain to which the text to be recognized belongs to classify the text to be recognized and identify an entity.
For convenience of description, in the following method embodiments, only the execution subject of each step is taken as a computer device for description, and the computer device may be any electronic device with computing and storage capabilities. For example, the computer device may be the server 102 or the terminal device 101, and it should be noted that in the embodiment of the present application, the execution subject of each step may be the same computer device, or may be executed by a plurality of different computer devices in an interactive manner, which is not limited herein. It should be noted that, in the embodiment of the present application, the execution subject of the text recognition method and the execution subject of the training method of the text recognition model may be the same computer device or may be different computer devices, which is not limited in the embodiment of the present application.
The implementation details of the technical scheme of the embodiment of the application are described in detail below:
fig. 2 shows a flowchart of a text recognition method according to an embodiment of the present application, and referring to fig. 2, the text recognition method includes:
step S210, adding a first classification mark into a text to be identified to generate an input object corresponding to the text to be identified;
step S220, inputting an input object into a text recognition model, wherein the text recognition model is obtained through training according to a target sample text carrying labeling entity labels and labeling classification labels, the labeling entity labels are entity labels corresponding to all characters in the target sample text, and the labeling classification labels are classification labels corresponding to second classification labels added in the target sample text;
Step S230, obtaining the predicted entity label corresponding to each character in the text to be recognized and the predicted classification label corresponding to the first classification mark, as output by the text recognition model;
step S240, generating an entity recognition result aiming at the text to be recognized according to the predicted entity tag, and generating a classification result aiming at the text to be recognized according to the predicted classification tag.
These steps are described in detail below.
In step S210, a first classification flag is added to the text to be recognized, so as to generate an input object corresponding to the text to be recognized.
Text to be recognized refers to text of an unknown class, i.e., unclassified text. The text to be recognized may include a plurality of text characters, for example, may include a plurality of words, and may include groups of characters consisting of words, numbers, letters, or punctuation marks. In the embodiment of the application, the computer equipment acquires the text to be recognized before text recognition. Alternatively, the text to be recognized may be text acquired in real time, or text previously acquired and stored in a computer device.
In one possible implementation, the text to be recognized is provided to the computer device actively by the user. Optionally, the user determines the text to be recognized according to the actual situation, and inputs the text to be recognized to the computer device or the associated device of the computer device, and further, the computer device acquires the text to be recognized. The text to be recognized may be input by text input, voice input, image input or gesture input, and the embodiment of the present application is not limited thereto.
In another possible implementation manner, the text to be recognized is actively acquired by the computer device. For example, the computer device may acquire texts to be recognized from the network environment at a certain time interval; in this case, after classifying a text to be recognized, the computer device may store it to a suitable location, such as a classification database, according to its classification. The time interval may be, for example, 1 s, 1 h, 1 day, or 1 week.
After the computer device obtains the text to be recognized, a first classification mark can be added to the text to be recognized to generate an input object corresponding to the text to be recognized. Compared with the text to be recognized itself, the corresponding input object additionally contains the first classification mark. It should be noted that the added first classification mark is only a mark representing classification, not a specific classification category of the text to be recognized. The first classification mark may be added at the start position, a middle position, or the end position of the text to be recognized. As an example, the first classification mark may be "RN"; it is not limited to English letters, and may include, but is not limited to, at least one form of numbers, characters, symbols, and the like.
In one illustrative example, the text to be recognized is "how beautiful she is at the bottom", and the computer device adds the first classification mark RN at the start position of the text to be recognized, generating the input object "RN how beautiful she is at the bottom" corresponding to the text to be recognized.
In one embodiment of the application, the computer device can divide the text to be recognized character by character to obtain a character sequence corresponding to the text to be recognized, add the first classification mark to the character sequence to obtain a new character sequence, and then take the new character sequence as the input object corresponding to the text to be recognized.
In one illustrative example, the text to be recognized is "I want to hear Daoxiang by XXX", and the computer device divides the text to be recognized to obtain the character sequence I want/hear/X/X/X/Daoxiang; it then adds the first classification mark RN to the divided character sequence, generating the new character sequence I want/hear/X/X/X/Daoxiang/RN.
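As a minimal sketch of this preprocessing step (the marker string "RN" and the character-level split follow the examples above; the function name is hypothetical):

```python
def build_input_object(text, cls_mark="RN"):
    """Split the text to be recognized into a character-level sequence
    and append the first classification mark at the end position."""
    chars = list(text)       # character-by-character division
    chars.append(cls_mark)   # the mark is a placeholder, not a category
    return chars
```

The resulting sequence is what is fed into the text recognition model as the input object.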
In step S220, the input object is input into the text recognition model. The text recognition model is trained on a target sample text carrying labeling entity labels and a labeling classification label, where the labeling entity labels are the entity labels corresponding to the respective characters in the target sample text, and the labeling classification label is the classification label corresponding to the second classification mark added in the target sample text.
The text recognition model can be a model obtained by training a model to be trained by utilizing a preset target sample text in advance. The text recognition model obtained through training can have entity recognition capability and text classification capability at the same time.
In one embodiment, the model to be trained may be a combined model, which may be: (1) a model combining a Bidirectional Encoder Representation from Transformers (BERT), a Long Short-Term Memory network (LSTM), and a Conditional Random Field (CRF); (2) a model combining a Bi-directional Long Short-Term Memory network (Bi-LSTM) and a Conditional Random Field (CRF); or (3) a model combining a Bidirectional Encoder Representation from Transformers (BERT) and a Conditional Random Field (CRF).
The target sample text is a specific data set related to text recognition, and the target sample text contains labeling entity tags and labeling classification tags, which can be manually labeled.
The labeling entity labels are the entity labels corresponding to the respective characters in the target sample text; optionally, the labeling scheme of the entity labels can be designed according to actual needs. As an example, assume that the entity tag labeling scheme introduces seven roles (subject company\subject company qualifier\subject person\associated company\associated product\risk word\tense word). If the target sample text is "A company denies violent threat", then after entity tag labeling is performed on the target sample text, each character is paired with its entity tag: the characters of "A company" are labeled "subject company", the characters of "denies" are labeled "tense word", and the characters of "violent threat" are labeled "risk word". It should be noted that the labeling of entity tags is not limited to the above manner, and the labeling scheme of entity tags can be designed according to actual needs.
The labeling classification label is the classification label corresponding to the second classification mark added in the target sample text. In one illustrative example: when the text recognition model is used to distinguish compliant text from violating text, the classification tags may include a compliance tag and a violation tag; when the text recognition model is used to distinguish compliant text, violation-information text, and fraud-information text, the classification tag may be at least one of a compliance tag, a violation information tag, and a fraud information tag; and when the text recognition model is used to distinguish negative text from non-negative text, the classification tags may include a negative tag and a non-negative tag. The embodiment of the application does not limit the specific content of the classification tags.
Specifically, the computer device may input the input object corresponding to the text to be recognized into the text recognition model, and determine, according to the output probabilities, the feature matrix of labels, and the transition matrix between labels, the prediction score of each character for each entity label and of the first classification mark for each classification label.
In step S230, the predicted entity tag corresponding to each character in the text to be recognized and the predicted classification label corresponding to the first classification mark, as output by the text recognition model, are obtained.
After the computer device inputs the input object into the text recognition model, the prediction score of each character for each entity tag and of the first classification mark for each classification tag can be determined according to the feature matrix formed by the output probabilities of the tags and the transition matrix between the tags.
In other words, the text recognition model may output the prediction scores of multiple paths, where each path is a combination of one label for each position in the text to be recognized. The path with the highest prediction score is the real path: the entity label corresponding to each character in the real path is the predicted entity label of that character in the text to be recognized, and the classification label corresponding to the first classification mark in the real path is the predicted classification label of the first classification mark.
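A brute-force sketch of this path selection (the per-position emission scores and the label-to-label transition table are assumed inputs; a practical CRF would use Viterbi decoding rather than enumerating every path):

```python
import itertools

def best_path(emissions, transitions, labels):
    """Score every candidate label path and return the highest-scoring one.

    emissions[i][l]: score of label l at position i (model output)
    transitions[a][b]: score of moving from label a to label b
    """
    best, best_score = None, float("-inf")
    for path in itertools.product(labels, repeat=len(emissions)):
        score = sum(emissions[i][l] for i, l in enumerate(path))
        score += sum(transitions[a][b] for a, b in zip(path, path[1:]))
        if score > best_score:
            best, best_score = list(path), score
    return best, best_score
```

The position-wise scores plus transition scores correspond to the feature matrix and transition matrix described above.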
In step S240, an entity recognition result for the text to be recognized is generated according to the predicted entity tag, and a classification result for the text to be recognized is generated according to the predicted classification tag.
In this embodiment, after obtaining the predicted entity tag and the predicted classification tag output by the text recognition model, the computer device may generate an entity recognition result for the text to be recognized according to the predicted entity tag, and may generate a classification result for the text to be recognized according to the predicted classification tag.
As an example, the computer device may identify characters whose positions in the text to be recognized are consecutive and whose corresponding predicted entity tags indicate the same entity as one entity, thereby obtaining the entity recognition result for the text to be recognized.
Taking the text to be recognized "A company enters a clearing program" as an example: the predicted entity labels corresponding to the consecutive characters of "A company" all indicate the entity "subject company", so these characters are continuous in position and their predicted entity labels indicate the same entity; it can therefore be determined that "A company" in the text to be recognized is one entity.
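The merging of consecutive same-tag characters can be sketched as follows (the tag names and the "O" non-entity tag are illustrative assumptions):

```python
def extract_entities(chars, tags, non_entity="O"):
    """Merge consecutive characters whose predicted entity tags
    indicate the same entity into one entity span."""
    entities, current, current_tag = [], "", None
    for ch, tag in zip(chars, tags):
        if tag != non_entity and tag == current_tag:
            current += ch                       # extend the running span
        else:
            if current:                         # close the previous span
                entities.append((current, current_tag))
            current, current_tag = (ch, tag) if tag != non_entity else ("", None)
    if current:
        entities.append((current, current_tag))
    return entities
```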
As one example, the computer device may take the classification category indicated by the predicted classification label corresponding to the first classification mark as the classification result for the text to be recognized; for example, if the classification category indicated by the predicted classification label corresponding to the first classification mark added to the text to be recognized is "negative", it may be determined that the text to be recognized is negative text.
Based on the technical scheme of this embodiment, the text to be recognized is recognized by the text recognition model, so manual construction of features is not needed and semantic knowledge can be extracted automatically, which greatly reduces labor cost and improves the efficiency of entity recognition and text classification. Moreover, the text recognition model can extract the entities in the text to be recognized while classifying it: the entity recognition task and the text classification task are trained jointly, the two tasks share semantic knowledge, the attention of the text recognition model to entities is increased, and the accuracy of entity recognition and text classification is improved.
The above is a detailed description of the text recognition method, in which the entity recognition task and the classification task are both processed by the text recognition model.
Illustratively, referring to FIG. 3, FIG. 3 shows a block diagram of a text recognition model; the text recognition model 30 is a combined model including a Bert model 301, an LSTM model 302, and a CRF model 303. The Bert model 301 is used to extract lexical, syntactic, and bidirectional semantic features; the semantic features are features that reflect the semantics of the corresponding character/word. It is understood that the semantics here are the semantics expressed by the character/word in the target sample text, that is, the semantics the corresponding character/word reflects in the context of the target sample text in combination with the surrounding content. The LSTM model 302 is used to capture forward and backward position information and to output the entity and category probability distributions. The CRF model 303 is configured to learn the transition rules between adjacent labels and finally output the optimal prediction result.
In this embodiment, the process of recognizing the text to be recognized may be described as follows:
Firstly, the text to be recognized is divided and a first classification mark RL is added, yielding the character sequence X1, X2, …, XN, which is input into the Bert model 301; X1 and X2 respectively represent the characters at the first and second positions in the text to be recognized, and the first classification mark RL corresponds to XN. After the Bert model 301 obtains the input character sequence, each character is converted into a vector (Embedding, represented by E), i.e., the input vector sequence E1, E2, …, EN is obtained by conversion. Multiple layers of coding networks (i.e., Trm) are arranged in the Bert model 301; each layer of coding network comprises a multi-head attention layer and a feedforward neural network layer, both of which are followed by a summation layer and a normalization layer. The input vector sequence E1, E2, …, EN is converted into the output vector sequence T1, T2, …, TN through the coding networks arranged in the Bert model 301.
Further, the output vector sequence T1, T2, …, TN is input to the LSTM model 302, and the LSTM model 302 outputs the probability of each character for each entity tag and of the first classification mark for each classification tag. These probabilities are the inputs to the CRF model 303. At the output layer of the CRF model 303, the prediction score of each character for each entity tag and of the first classification mark for each classification tag is determined based on the feature matrix formed by the probabilities output by the LSTM model 302 and the transition matrix between the tags.
In other words, the prediction result finally output by the CRF model 303 is a set of prediction scores corresponding to multiple paths, each path being a combination of one label for each position in the text to be recognized. The path with the highest prediction score is the real path: the entity labels corresponding to the respective characters in the real path are the predicted entity labels of those characters in the text to be recognized, and the classification label corresponding to the first classification mark in the real path is the predicted classification label of the first classification mark.
Illustratively, assume the real path A is (B-PER, E-PER, O, O, B-COM, I-COM, E-COM, TRUE), where B-PER represents the beginning-character tag of a person name, E-PER the end-character tag of a person name, O an independent character tag, B-COM the beginning-character tag of a business name, I-COM an intermediate-character tag of a business name, E-COM the end-character tag of a business name, and TRUE a classification label. It can then be determined that the text to be recognized contains a person-name entity (B-PER, E-PER) and a business entity (B-COM, I-COM, E-COM): the person-name entity is composed of the first and second characters of the text to be recognized, the business entity is composed of the fifth, sixth, and seventh characters, and the classification of the text to be recognized is the classification category indicated by the classification label TRUE.
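Decoding such a path can be sketched as follows (the B-/I-/E-/O tag scheme and the trailing classification label follow the example above; the function itself is a hypothetical illustration, not the patented implementation):

```python
def decode_path(chars, path):
    """Decode a predicted path: the last label classifies the text,
    the preceding B-/I-/E-/O labels delimit entity spans."""
    classification = path[-1]                 # label of the classification mark
    entities, span, etype = [], "", None
    for ch, tag in zip(chars, path[:-1]):
        if tag.startswith("B-"):
            span, etype = ch, tag[2:]         # open a new entity span
        elif tag.startswith("I-") and etype == tag[2:]:
            span += ch                        # continue the current span
        elif tag.startswith("E-") and etype == tag[2:]:
            entities.append((span + ch, etype))  # close the span
            span, etype = "", None
        else:                                 # "O" or inconsistent tag: reset
            span, etype = "", None
    return entities, classification
```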
Referring to fig. 4, a flowchart of a training method of a text recognition model according to an embodiment of the present application is shown. The method may include steps S410-S430, described in detail below:
in step S410, a second classification flag is added to the target sample text to generate a sample input object corresponding to the target sample text.
Here, the target sample text is sample data for training the text recognition model, and the computer device may pull the target sample text directly from the Internet. After the target sample text is pulled, a second classification mark can be added to it, yielding the sample input object corresponding to the target sample text. Compared with the target sample text itself, the corresponding input object additionally contains the second classification mark. It should be noted that the added second classification mark is only a mark representing classification, not a specific classification category. The second classification mark may be added at the start position, a middle position, or the end position of the target sample text. As an example, the second classification mark may be "RN"; of course, it is not limited to English letters, and may include, but is not limited to, at least one form of numerals, letters, symbols, and the like.
In one embodiment, the computer device may divide the target sample text character by character to obtain a character sequence corresponding to the target sample text, add the second classification mark to that character sequence to obtain a new character sequence, and then use the new character sequence as the sample input object corresponding to the target sample text.
In step S420, the sample input object is input into the model to be trained, so as to obtain the prediction score, output by the model to be trained, of each character in the target sample text for each entity tag and of the second classification mark for each classification tag.
During the training process, the computer device may input the sample input object into the model to be trained to train it. The model to be trained can be a combined model, specifically: (1) a model combining a Bidirectional Encoder Representation from Transformers (BERT), a Long Short-Term Memory network (LSTM), and a Conditional Random Field (CRF); (2) a model combining a Bi-directional Long Short-Term Memory network (Bi-LSTM) and a Conditional Random Field (CRF); or (3) a model combining a Bidirectional Encoder Representation from Transformers (BERT) and a Conditional Random Field (CRF).
Specifically, the computer device may divide the sample input object to generate a character sequence corresponding to the target sample text, then process each character according to its order in the character sequence to obtain a vector sequence, and further determine, according to the vector sequence, the output probability of each character in the target sample text for each entity tag and of the second classification mark for each classification tag. According to the feature matrix formed by the output probabilities and the transition matrix between the tags, the prediction score of each character for each entity tag and of the second classification mark for each classification tag is determined. That is, after the sample input object is input into the model to be trained, the prediction scores corresponding to multiple paths output by the model to be trained may be obtained, where each path is a combination of one label for each position in the target sample text.
For example, assume that the target sample text includes 2 characters, denoted w0 and w1 in order, and that the added second classification mark is denoted RN; the entity tags include label1 and label2, and the classification tags include class1 and class2. The model to be trained may then output prediction scores corresponding to 8 paths respectively:
prediction score P1 for path 1 {label1, label1, class1}
prediction score P2 for path 2 {label1, label1, class2}
prediction score P3 for path 3 {label1, label2, class1}
prediction score P4 for path 4 {label1, label2, class2}
prediction score P5 for path 5 {label2, label1, class1}
prediction score P6 for path 6 {label2, label1, class2}
prediction score P7 for path 7 {label2, label2, class1}
prediction score P8 for path 8 {label2, label2, class2}
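The 8 candidate paths above can be enumerated mechanically (a sketch; the label and class names follow the example):

```python
import itertools

entity_labels = ["label1", "label2"]
class_labels = ["class1", "class2"]

# one entity label per character (w0, w1) plus one classification label (RN)
paths = [list(e) + [c]
         for e in itertools.product(entity_labels, repeat=2)
         for c in class_labels]
# 2 entity labels ^ 2 characters * 2 classification labels = 8 paths
```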
In step S430, a loss function is determined according to the labeling entity label, the labeling classification label and the prediction score, and parameters of the model to be trained are adjusted according to the loss function, so as to obtain a text recognition model.
Because the labeling entity labels and labeling classification labels can be manually labeled labels representing the correct labels, the computer device can determine a loss function according to the prediction scores output by the model to be trained, the labeling entity labels, and the labeling classification labels, and adjust the model parameters of the model to be trained in the direction of minimizing the loss function. By continuously updating the model parameters to reduce the loss function, the model parameters that minimize the loss function are determined according to the minimization principle, and the text recognition model is obtained.
In one embodiment, as shown in fig. 5, the method for determining the loss function may specifically include steps S510 to S520, which are described in detail below:
Step S510, determining, from the prediction scores and according to the labeling entity labels and the labeling classification label, the target prediction score corresponding to the entity labels identical to the labeling entity labels and the classification label identical to the labeling classification label.
After the sample input object is input into the model to be trained, the prediction score of each character in the target sample text for each entity tag and of the second classification mark for each classification tag, as output by the model to be trained, can be obtained.
Because the target sample text carries its corresponding labeling entity labels and labeling classification label, and these manually labeled labels can be regarded as the correct, real labels, when determining the loss function the target prediction score corresponding to the entity labels identical to the labeling entity labels and the classification label identical to the labeling classification label can be determined from the prediction scores according to the labeling entity labels and the labeling classification label.
Continuing with the above example in step S420, assuming that the labeling entity tag of w0 is label1, the labeling entity tag of w1 is label2, and the labeling classification tag of the second classification tag is class1, the target prediction score corresponding to the same entity tag as the labeling entity tag and the classification tag as the labeling classification tag is determined to be P3 from the prediction scores.
Step S520, determining a loss function according to the target prediction score.
The target prediction score is determined according to the labeling entity labels and the labeling classification label and is the prediction score corresponding to the real path; that is to say, the target prediction score should be the highest among the prediction scores corresponding to all paths, so the loss function can be determined according to the target prediction score.
In one embodiment of the application, determining the loss function according to the target prediction score may specifically include: calculating the ratio between the target prediction score and the sum of the prediction scores of all paths, performing a logarithmic operation on the ratio to obtain an operation result, and determining the loss function according to the operation result. Illustratively, the loss function Loss can be defined as the following formula (1):

Loss = −log(P_real / Σi Pi)    formula (1)

wherein P_real is the target prediction score, and Pi is the prediction score corresponding to the i-th path.
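A minimal numeric sketch of formula (1) (the path scores are assumed to be already-positive values, e.g. exponentiated path scores; the function name is illustrative):

```python
import math

def crf_loss(path_scores, real_path_index):
    """Loss = -log(P_real / sum(P_i)) over all candidate paths."""
    total = sum(path_scores)
    return -math.log(path_scores[real_path_index] / total)
```

Minimizing this loss pushes the real path's share of the total score toward 1, i.e. makes the real path the highest-scoring path.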
In one embodiment of the present application, the target sample text may be training text directly obtained from the Internet by the computer device, or may be sample text obtained by processing the directly obtained text. As shown in fig. 6, in this embodiment the text recognition method may further specifically include steps S610 to S640, described as follows:
Step S610, obtaining keywords from a keyword library which is built in advance.
The preset keyword library is a word library formed by preset words which can be selected as keywords, and the keyword library comprises a plurality of keywords.
Alternatively, different keyword libraries are set for different text classifications. The keyword library corresponding to a classification refers to a word library composed of preset words that can be selected as keywords of that classification. For example, the keyword library corresponding to the entertainment classification includes words related to entertainment, such as names of entertainment stars, names of movies and television plays, and names of variety programs; the keyword library corresponding to the sports classification includes words related to sports, such as names of sports stars, names of sports, and names of teams; and the keyword library corresponding to the public-opinion classification includes words related to public-opinion risk, such as "notification" and "false".
Therefore, in order to acquire the target sample text for model training, keywords may be acquired from a keyword library in advance, and then, training text may be acquired by means of the keywords. It will be appreciated that text obtained from keywords will be more relevant to the text required for model training.
Step S620, according to the keywords, a sample text containing the keywords is obtained.
After obtaining the keywords, the computer device may further obtain sample text containing the keywords, for example, the computer device may obtain text containing the keywords published by the user in the social software from an open interface of the social software as sample text.
Step S630, performing entity tag labeling and classification tag labeling on a part of the sample text in the obtained sample text to generate an initial sample text.
Further, after the sample text is obtained, a portion of the sample text may be selected for entity tag labeling and class tag labeling to generate an initial sample text.
It should be noted that, the selection mode may be a random selection, or may be selecting a part of the sample text according to a preset rule.
Step S640, performing data enhancement processing on the initial sample text to generate a target sample text.
Because the problem of lack of training data is often faced in the model training process, in this embodiment, after the initial sample text marked with the label is obtained, data enhancement processing may be further performed on the initial sample text to generate the target sample text.
In some embodiments of the present application, the data enhancement processing method may specifically include copying the initial sample text to obtain a copy of the initial sample text, and generating the target sample text according to the initial sample text and the copy of the initial sample text.
In some embodiments of the present application, the data enhancement processing method may further include performing synonym replacement on the target keywords included in the initial sample text to generate sample text including synonyms of the target keywords, and generating the target sample text according to the initial sample text and the sample text including the synonyms of the target keywords.
In some embodiments of the present application, the data enhancement processing method may further include deleting a portion of text included in the initial sample text to obtain a processed initial sample text, and generating the target sample text according to the initial sample text and the processed initial sample text.
In some embodiments of the present application, the data enhancement processing method may further include randomly inserting characters into the initial sample text to obtain a new initial sample text, and generating the target sample text according to the initial sample text and the new initial sample text.
Of course, it is understood that the data enhancement processing manner for the initial sample text may also be any combination of the foregoing manners, for example, a combination of a duplication manner and a synonym substitution manner, a combination of a duplication manner and a deletion processing manner, and so on. The embodiments of the present application are not particularly limited herein.
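The four enhancement manners above can be sketched together as follows (a hedged illustration: the function name, the synonym table, and the "X" insertion character are assumptions, not part of the embodiment):

```python
import random

def augment(text, synonyms=None, seed=0):
    """Illustrative data-enhancement sketch: duplication, synonym
    replacement, random character deletion, and random insertion."""
    rng = random.Random(seed)
    samples = [text, text]                      # duplication (copy)
    for word, syn in (synonyms or {}).items():  # synonym replacement
        if word in text:
            samples.append(text.replace(word, syn))
    if len(text) > 1:                           # delete a random character
        i = rng.randrange(len(text))
        samples.append(text[:i] + text[i + 1:])
    i = rng.randrange(len(text) + 1)            # insert a random character
    samples.append(text[:i] + "X" + text[i:])
    return samples
```

In practice these manners can be combined freely, as noted above.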
In one embodiment of the application, after the data enhancement processing is performed on the initial sample text, posterior cleaning can further be performed on the data obtained from the data enhancement processing: if an abnormal text is found, it is rejected, and the target sample text is then generated from the texts remaining after rejection.
In order to analyze the effect of the text recognition model (denoted as V3 model) in the embodiment of the present application, the effect is now compared with a keyword trigger model (denoted as V1 model), a pipeline model (denoted as V2 model) trained by the named entity recognition and classification tasks, respectively, and the comparison results are shown in table 1 below.
TABLE 1
The comparison indexes include the accuracy rate, the coverage rate, and the F1 value. The accuracy rate measures, for a certain class of a given test set, the proportion of the classification model's predictions for that class that are correct; the coverage rate measures, for a certain class of the given test set, how many of the positive samples of that class are correctly predicted by the classification model; and the F1 value comprehensively considers the accuracy rate and the coverage rate. As can be seen from Table 1, the accuracy rate and coverage rate of the text recognition model of the embodiment of the application are obviously improved.
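For reference, the three comparison indexes can be computed as follows (a standard sketch; tp/fp/fn are counts of true positives, false positives, and false negatives for the class under evaluation):

```python
def f1_metrics(tp, fp, fn):
    """Accuracy rate (precision), coverage rate (recall), and F1 value."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```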
The following describes embodiments of the apparatus of the present application that may be used to perform the text recognition method of the above-described embodiments of the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the above-mentioned embodiments of the text recognition method of the present application.
Fig. 7 shows a block diagram of a text recognition apparatus according to an embodiment of the present application, and referring to fig. 7, a text recognition apparatus 700 according to an embodiment of the present application includes a first adding unit 702, a first input unit 704, an obtaining unit 706, and a generating unit 708.
The first adding unit 702 is configured to add a first classification mark to a text to be identified to generate an input object corresponding to the text to be identified. The first input unit 704 is configured to input the input object to a text recognition model, where the text recognition model is obtained through training on a target sample text carrying labeled entity labels and a labeled classification label; the labeled entity labels are the entity labels corresponding to the individual characters in the target sample text, and the labeled classification label is the classification label corresponding to a second classification mark added to the target sample text. The obtaining unit 706 is configured to obtain, as output by the text recognition model, the predicted entity label corresponding to each character in the text to be identified and the predicted classification label corresponding to the first classification mark. The generating unit 708 is configured to generate an entity recognition result for the text to be identified according to the predicted entity labels, and a classification result for the text to be identified according to the predicted classification label.
In some embodiments of the present application, the first adding unit 702 is configured to divide the text to be identified character by character to generate a character sequence corresponding to the text to be identified, and to add the first classification mark to the character sequence to obtain a new character sequence, where the new character sequence is used as the input object corresponding to the text to be identified.
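A minimal sketch of this dividing-and-marking step; the token name "[CLS]" is an assumption (the embodiment only speaks of a first classification mark, without naming it):

```python
def build_input_object(text, cls_token="[CLS]"):
    """Split the text to be identified character by character and prepend
    the first classification mark, yielding the model's input sequence."""
    char_sequence = list(text)          # divide character by character
    return [cls_token] + char_sequence  # new sequence used as the input object

print(build_input_object("abc"))  # ['[CLS]', 'a', 'b', 'c']
```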
In some embodiments of the present application, the generating unit 708 is configured to recognize, as the same entity, characters whose positions in the text to be identified are continuous and whose corresponding predicted entity tags indicate the same entity, to obtain the entity recognition result for the text to be identified, and to take the classification category indicated by the predicted classification label as the classification result for the text to be identified.
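Merging position-continuous characters whose predicted tags indicate the same entity can be sketched as follows, assuming a BIO-style tag scheme (the scheme name is an assumption; the embodiment only requires that the tags indicate the same entity):

```python
def decode_entities(chars, tags):
    """Merge characters whose positions are continuous and whose predicted
    entity tags indicate the same entity (BIO scheme assumed)."""
    entities, current, current_type = [], [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):              # a new entity begins here
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [ch], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current.append(ch)                # continuation of the same entity
        else:                                 # outside tag or type break
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append(("".join(current), current_type))
    return entities

print(decode_entities(list("北京大学好"), ["B-ORG", "I-ORG", "I-ORG", "I-ORG", "O"]))
# [('北京大学', 'ORG')]
```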
In some embodiments of the application, the device further comprises: a second adding unit configured to add the second classification mark to the target sample text so as to generate a sample input object corresponding to the target sample text; a second input unit configured to input the sample input object to a model to be trained, to obtain prediction scores, output by the model to be trained, of each character in the target sample text for each entity label and of the second classification mark for each classification label; and a determining unit configured to determine a loss function according to the labeled entity labels, the labeled classification label and the prediction scores, and to adjust parameters of the model to be trained according to the loss function so as to obtain the text recognition model.
In some embodiments of the application, the determining unit comprises: a prediction score determining subunit configured to determine, from the prediction scores and according to the labeled entity labels and the labeled classification label, the target prediction score corresponding to the entity labels identical to the labeled entity labels and the classification label identical to the labeled classification label; and a loss function determining subunit configured to determine the loss function according to the target prediction score.
In some embodiments of the application, the loss function determining subunit is configured to calculate the ratio between the target prediction score and the sum of the prediction scores, perform a logarithmic operation on the ratio to obtain an operation result, and determine the loss function according to the operation result.
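The ratio-and-logarithm computation can be sketched as the negative log of the target prediction score divided by the sum of all prediction scores; treating the scores as already-exponentiated (CRF-style) path scores is an assumption for this illustration.

```python
import math

def path_loss(target_score, all_path_scores):
    """Negative logarithm of the ratio between the score of the path matching
    the labeled entity/classification tags and the sum of all path scores."""
    ratio = target_score / sum(all_path_scores)  # ratio of target score to sum
    return -math.log(ratio)                      # logarithmic operation on the ratio

# Hypothetical exponentiated path scores; the target path is the first one
loss = path_loss(6.0, [6.0, 2.0, 2.0])
print(round(loss, 4))  # -log(0.6) ≈ 0.5108
```

Minimizing this quantity over training data pushes the score of the labeled tag path toward the total mass of all candidate paths.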
In some embodiments of the present application, the second adding unit is configured to divide the target sample text character by character to generate a character sequence corresponding to the target sample text, add the second classification mark to the character sequence to obtain a new character sequence, and use the new character sequence as the sample input object corresponding to the target sample text.
In some embodiments of the application, the device further comprises: a keyword acquisition unit configured to acquire keywords from a pre-established keyword library; a sample text acquisition unit configured to acquire, according to the keywords, sample texts containing the keywords; a labeling unit configured to perform entity label labeling and category label labeling on part of the acquired sample texts to generate initial sample texts; and a processing unit configured to perform data enhancement processing on the initial sample texts to generate the target sample text.
In some embodiments of the application, the processing unit is configured to copy the initial sample text to obtain a copy of the initial sample text, and generate the target sample text according to the initial sample text and the copy of the initial sample text.
In some embodiments of the application, the processing unit is configured to perform synonym replacement on target keywords contained in the initial sample text to generate sample text containing synonyms of the target keywords, and generate the target sample text according to the initial sample text and the sample text containing the synonyms of the target keywords.
In some embodiments of the present application, the processing unit is configured to delete a portion of text included in the initial sample text to obtain a processed initial sample text, and generate the target sample text according to the initial sample text and the processed initial sample text.
In some embodiments of the present application, the processing unit is configured to randomly insert text into the initial sample text to obtain a new initial sample text, and generate the target sample text according to the initial sample text and the new initial sample text.
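The synonym replacement, random deletion and random insertion operations described in these embodiments can be sketched together as follows. The character-level synonym map and the 10 % deletion probability are hypothetical illustrations; the embodiments operate on target keywords and text segments rather than single characters.

```python
import random

def augment(chars, synonyms, rng):
    """EDA-style data enhancement sketched from the embodiments:
    synonym replacement, random deletion and random insertion.
    `synonyms` maps a character to its (hypothetical) synonym."""
    replaced = [synonyms.get(c, c) for c in chars]       # synonym replacement
    deleted = [c for c in chars if rng.random() > 0.1]   # delete part of the text
    inserted = list(chars)
    inserted.insert(rng.randrange(len(inserted) + 1),    # random insertion
                    rng.choice(chars))
    return replaced, deleted, inserted

rng = random.Random(0)
r, d, i = augment(list("text"), {"t": "T"}, rng)
print("".join(r))  # TexT
```

The target sample text would then be formed from the initial sample text together with the augmented variants, as the embodiments describe.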
Fig. 8 shows a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.
It should be noted that, the computer system 800 of the electronic device shown in fig. 8 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801, which can perform various appropriate actions and processes, such as performing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for system operation are also stored. The CPU 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An Input/Output (I/O) interface 805 is also connected to the bus 804.
Connected to the I/O interface 805 are: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a Local Area Network (LAN) card or a modem. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read out therefrom is installed into the storage section 808 as needed.
In particular, according to embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809, and/or installed from the removable medium 811. When the computer program is executed by the Central Processing Unit (CPU) 801, the various functions defined in the system of the present application are performed.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer program embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustrations, and combinations of blocks therein, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software or by hardware, and the described units may also be provided in a processor. In some cases, the names of the units do not constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer-readable medium that may be included in the electronic device described in the above embodiment, or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the above embodiments.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, and includes several instructions to cause a computing device (may be a personal computer, a server, a touch terminal, or a network device, etc.) to perform the method according to the embodiments of the present application.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (16)

1. A text recognition method, characterized in that the method comprises:
adding a first classification mark to a text to be recognized to generate an input object corresponding to the text to be recognized;
inputting the input object into a text recognition model, wherein the text recognition model is trained on a target sample text carrying annotated entity labels and an annotated classification label, the annotated entity labels being the entity labels corresponding to the individual characters in the target sample text, and the annotated classification label being the classification label corresponding to a second classification mark added to the target sample text;
obtaining, as output by the text recognition model, the predicted entity label corresponding to each character in the text to be recognized and the predicted classification label corresponding to the first classification mark, wherein the predicted entity labels and the predicted classification label are obtained by a CRF layer in the text recognition model from the probability of each input character for each entity label and of the first classification mark for each classification label, and those probabilities are obtained by an LSTM layer in the text recognition model based on the input object; and
generating an entity recognition result for the text to be recognized according to the predicted entity labels, and generating a classification result for the text to be recognized according to the predicted classification label.
2. The method according to claim 1, characterized in that adding the first classification mark to the text to be recognized to generate the input object corresponding to the text to be recognized comprises:
dividing the text to be recognized character by character to generate a character sequence corresponding to the text to be recognized; and
adding the first classification mark to the character sequence to obtain a new character sequence, the new character sequence being used as the input object corresponding to the text to be recognized.
3. The method according to claim 1, characterized in that generating the entity recognition result and the classification result for the text to be recognized comprises:
recognizing, as the same entity, characters in the text to be recognized whose positions are continuous and whose corresponding predicted entity tags indicate the same entity, to obtain the entity recognition result for the text to be recognized; and
taking the classification category indicated by the predicted classification label as the classification result for the text to be recognized.
4. The method according to any one of claims 1 to 3, characterized in that the method further comprises:
adding the second classification mark to the target sample text to generate a sample input object corresponding to the target sample text;
inputting the sample input object into a model to be trained, to obtain prediction scores, output by the model to be trained, of each character in the target sample text for each entity label and of the second classification mark for each classification label; and
determining a loss function according to the annotated entity labels, the annotated classification label and the prediction scores, and adjusting parameters of the model to be trained according to the loss function to obtain the text recognition model.
5. The method according to claim 4, characterized in that determining the loss function according to the annotated entity labels, the annotated classification label and the prediction scores comprises:
determining, from the prediction scores and according to the annotated entity labels and the annotated classification label, the target prediction score corresponding to the entity labels identical to the annotated entity labels and the classification label identical to the annotated classification label; and
determining the loss function according to the target prediction score.
6. The method according to claim 5, characterized in that determining the loss function according to the target prediction score comprises:
calculating the ratio between the target prediction score and the sum of the prediction scores; and
performing a logarithmic operation on the ratio to obtain an operation result, and determining the loss function according to the operation result.
7. The method according to claim 4, characterized in that adding the second classification mark to the target sample text to generate the sample input object comprises:
dividing the target sample text character by character to generate a character sequence corresponding to the target sample text; and
adding the second classification mark to the character sequence to obtain a new character sequence, the new character sequence being used as the sample input object corresponding to the target sample text.
8. The method according to claim 4, characterized in that the method further comprises:
obtaining keywords from a pre-established keyword library;
obtaining, according to the keywords, sample texts containing the keywords;
performing entity label annotation and category label annotation on part of the obtained sample texts to generate initial sample texts; and
performing data enhancement processing on the initial sample texts to generate the target sample text.
9. The method according to claim 8, characterized in that performing data enhancement processing on the initial sample text to generate the target sample text comprises:
copying the initial sample text to obtain a copy of the initial sample text; and
generating the target sample text according to the initial sample text and the copy of the initial sample text.
10. The method according to claim 8, characterized in that performing data enhancement processing on the initial sample text to generate the target sample text comprises:
performing synonym replacement on a target keyword contained in the initial sample text to generate a sample text containing a synonym of the target keyword; and
generating the target sample text according to the initial sample text and the sample text containing the synonym of the target keyword.
11. The method according to claim 8, characterized in that performing data enhancement processing on the initial sample text to generate the target sample text comprises:
deleting part of the characters contained in the initial sample text to obtain a processed initial sample text; and
generating the target sample text according to the initial sample text and the processed initial sample text.
12. The method according to claim 8, characterized in that performing data enhancement processing on the initial sample text to generate the target sample text comprises:
randomly inserting characters into the initial sample text to obtain a new initial sample text; and
generating the target sample text according to the initial sample text and the new initial sample text.
13. A text recognition apparatus, characterized in that the apparatus comprises:
a first adding unit configured to add a first classification mark to a text to be recognized to generate an input object corresponding to the text to be recognized;
a first input unit configured to input the input object into a text recognition model, wherein the text recognition model is trained on a target sample text carrying annotated entity labels and an annotated classification label, the annotated entity labels being the entity labels corresponding to the individual characters in the target sample text, and the annotated classification label being the classification label corresponding to a second classification mark added to the target sample text;
an obtaining unit configured to obtain, as output by the text recognition model, the predicted entity label corresponding to each character in the text to be recognized and the predicted classification label corresponding to the first classification mark, wherein the predicted entity labels and the predicted classification label are obtained by a CRF layer in the text recognition model from the probability of each input character for each entity label and of the first classification mark for each classification label, and those probabilities are obtained by an LSTM layer in the text recognition model based on the input object; and
a generating unit configured to generate an entity recognition result for the text to be recognized according to the predicted entity labels, and to generate a classification result for the text to be recognized according to the predicted classification label.
14. A computer-readable medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, the text recognition method according to any one of claims 1 to 12 is implemented.
15. An electronic device, characterized in that it comprises:
one or more processors; and
a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the text recognition method according to any one of claims 1 to 12.
16. A computer program product, characterized in that the computer program product comprises computer instructions adapted to be loaded by a processor to execute the text recognition method according to any one of claims 1 to 12.
CN202110492354.2A | 2021-05-06 (priority and filing) | Text recognition method, device, computer readable medium and electronic device | Active | CN113761190B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110492354.2A (CN113761190B) | 2021-05-06 | 2021-05-06 | Text recognition method, device, computer readable medium and electronic device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110492354.2A (CN113761190B) | 2021-05-06 | 2021-05-06 | Text recognition method, device, computer readable medium and electronic device

Publications (2)

Publication Number | Publication Date
CN113761190A (en) | 2021-12-07
CN113761190B (en) | 2025-02-14

Family

ID=78787098

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110492354.2A (CN113761190B, Active) | Text recognition method, device, computer readable medium and electronic device | 2021-05-06 | 2021-05-06

Country Status (1)

Country | Link
CN (1) | CN113761190B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN114265919B (en)* | 2021-12-24 | 2025-03-25 | 中电信数智科技有限公司 | Entity extraction method, device, electronic device and storage medium
CN114140673B (en)* | 2022-02-07 | 2022-05-20 | 人民中科(北京)智能技术有限公司 | A method, system and device for identifying illegal images
CN114637824B (en)* | 2022-03-18 | 2023-12-01 | 马上消费金融股份有限公司 | Data enhancement processing method and device
CN114661908A (en) | 2022-03-25 | 2022-06-24 | 鼎富智能科技有限公司 | Intention category identification method and device, electronic equipment and storage medium
CN115984882A (en) | 2023-01-06 | 2023-04-18 | 北京有竹居网络技术有限公司 | Price entity identification method and device, storage medium and electronic equipment
CN116486490A (en) | 2023-04-24 | 2023-07-25 | 中国工商银行股份有限公司 | Emotion recognition method and device, nonvolatile storage medium and electronic equipment
CN116738345B (en)* | 2023-08-15 | 2024-03-01 | 腾讯科技(深圳)有限公司 | Classification processing method, related device and medium
CN118940116B (en)* | 2024-07-23 | 2025-09-16 | 上海哔哩哔哩科技有限公司 | Method for classifying information, related device and computer program product

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112084336A (en)* | 2020-09-09 | 2020-12-15 | 浙江综合交通大数据中心有限公司 | Entity extraction and event classification method and device for expressway emergency
CN112269949A (en)* | 2020-10-19 | 2021-01-26 | 杭州叙简科技股份有限公司 | Information structuring method based on accident disaster news

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110929521B (en)* | 2019-12-06 | 2023-10-27 | 北京知道创宇信息技术股份有限公司 | Model generation method, entity identification method, device and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112084336A (en)* | 2020-09-09 | 2020-12-15 | 浙江综合交通大数据中心有限公司 | Entity extraction and event classification method and device for expressway emergency
CN112269949A (en)* | 2020-10-19 | 2021-01-26 | 杭州叙简科技股份有限公司 | Information structuring method based on accident disaster news

Also Published As

Publication number | Publication date
CN113761190A (en) | 2021-12-07

Similar Documents

Publication | Title
CN113761190B (en) | Text recognition method, device, computer readable medium and electronic device
US12271701B2 (en) | Method and apparatus for training text classification model
CN110737758B (en) | Method and apparatus for generating models
CN112131366B (en) | Method, device and storage medium for training text classification model and text classification
US11334635B2 (en) | Domain specific natural language understanding of customer intent in self-help
CN107679039B (en) | Method and apparatus for determining sentence intent
CN107783960B (en) | Method, apparatus and device for extracting information
CN112100332B (en) | Word embedding representation learning method and device, text recall method and device
KR101754473B1 (en) | Method and system for automatically summarizing documents to images and providing the image-based contents
CN113704460B (en) | Text classification method and device, electronic equipment and storage medium
CN113792112A (en) | Visual language task processing system, training method, device, equipment and medium
US20180068221A1 (en) | System and Method of Advising Human Verification of Machine-Annotated Ground Truth - High Entropy Focus
CN111368548A (en) | Semantic recognition method and device, electronic equipment and computer-readable storage medium
CN113421551B (en) | Speech recognition method, speech recognition device, computer readable medium and electronic equipment
US20230376537A1 (en) | Multi-chunk relationship extraction and maximization of query answer coherence
CN113704393B (en) | Keyword extraction method, device, equipment and medium
CN113705207B (en) | Grammatical error identification method and device
CN113779225B (en) | Training method of entity link model, entity link method and device
CN114385817B (en) | Entity relationship identification method, device and readable storage medium
CN114416995A (en) | Information recommendation method, device and equipment
CN113836866B (en) | Text encoding method, text encoding device, computer readable medium and electronic equipment
CN112528654A (en) | Natural language processing method and device and electronic equipment
CN117573817A (en) | Model training method, correlation determining method, device, equipment and storage medium
CN110457436B (en) | Information labeling method and device, computer readable storage medium and electronic equipment
CN116680392A (en) | Relation triplet extraction method and device

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
