CN113673247A - Entity identification method, device, medium and electronic equipment based on deep learning - Google Patents

Entity identification method, device, medium and electronic equipment based on deep learning

Info

Publication number
CN113673247A
Authority
CN
China
Prior art keywords
information, word, word vectors, word vector, vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110965679.8A
Other languages
Chinese (zh)
Inventor
鲁冰青
丁川
叶凯
樊海东
王剑斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Mandala Software Co ltd
Original Assignee
Jiangsu Mandala Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Mandala Software Co ltd
Publication of CN113673247A
Legal status: Pending (current)

Abstract

The application discloses a deep learning-based entity recognition method and apparatus, a computer-readable storage medium, and an electronic device. An input natural sentence is split into a plurality of word vectors; feature extraction is performed on the word vectors to obtain the feature information of each word vector; the word vectors are bidirectionally encoded to obtain the bidirectional encoding information of each word vector; and a recognition result is then synthesized from the feature information and bidirectional encoding information of the word vectors. By extracting features from each word in the natural sentence and bidirectionally encoding each word, the semantic features and context features of every word are obtained, so that named entities can be recognized accurately.

Description

Entity identification method, device, medium and electronic equipment based on deep learning
Technical Field
The application relates to the technical field of entity recognition in unstructured text, and in particular to a deep learning-based entity recognition method and apparatus, a computer-readable storage medium, and an electronic device.
Background
Named Entity Recognition (NER) is a fundamental task of natural language processing. Early approaches were based on rules and dictionaries and relied mainly on templates that linguists generalized from contextual semantic structure. Such methods cannot cover cases that are hard to generalize, their recognition performance is limited, and building the templates is costly, so researchers turned to machine learning and decomposed NER into three sub-problems: feature selection, choice of machine learning strategy, and sequence labeling. To handle NER this way, a model is trained on a large-scale annotated corpus, and the trained model then performs sequence decoding on the test corpus to obtain the named entities.
However, machine learning methods place high demands on text feature extraction, and current models have huge numbers of parameters and occupy a large amount of memory, so their computational effectiveness and efficiency are low and the recognition accuracy is limited.
Disclosure of Invention
The present application is proposed to solve the above-mentioned technical problems. Embodiments of the present application provide an entity identification method and apparatus based on deep learning, a computer-readable storage medium, and an electronic device, which solve the problem of low identification accuracy of the machine learning method.
According to one aspect of the application, an entity identification method based on deep learning is provided, and comprises the following steps: splitting an input natural sentence into a plurality of word vectors; wherein the plurality of word vectors constitute the natural sentence; respectively extracting features of the word vectors to obtain characteristic information of each word vector; wherein the feature information comprises category information of the word vector; performing bidirectional coding on the plurality of word vectors respectively to obtain bidirectional coding information of each word vector; wherein the bidirectional encoding information includes relationship information between a corresponding current word vector and a previous word vector of the current word vector and a subsequent word vector of the current word vector; and obtaining an identification result according to the characteristic information and the bidirectional coding information of the word vectors.
In an embodiment, after the feature extracting is performed on the plurality of word vectors, the entity identification method further includes: performing dimension reduction processing on the feature information to obtain dimension-reduced feature information; wherein obtaining an identification result according to the feature information and the bidirectional coding information of the plurality of word vectors comprises: and obtaining an identification result according to the feature information after the dimension reduction and the bidirectional coding information.
In an embodiment, the performing dimension reduction processing on the feature information includes: global parameter information and attention parameter information of the plurality of word vectors are shared.
In an embodiment, the bi-directionally encoding the plurality of word vectors respectively comprises: converting the chain structure of the plurality of word vectors into a graph structure; and setting weight for the coding information between every two word vectors in the graph structure.
In an embodiment, the converting the chained structure of the plurality of word vectors into the graph structure comprises: setting an information node between every two word vectors; the information node comprises the bidirectional coding information, and the byte length of the information node is a preset fixed value.
In one embodiment, the setting an information node between every two word vectors includes: and when the bidirectional coding information does not exist between the two word vectors, setting an information node between the two word vectors as a null vector with a preset byte length.
In an embodiment, the obtaining a recognition result according to the feature information and the bidirectional encoding information of the plurality of word vectors includes: obtaining a plurality of prediction paths according to the feature information and the bidirectional coding information of the plurality of word vectors; the prediction path characterizes an order of arrangement of the plurality of word vectors; evaluating the plurality of predicted paths to obtain a plurality of evaluation results; and selecting a prediction path corresponding to the optimal result in the plurality of evaluation results as the identification result.
According to another aspect of the present application, there is provided an entity recognition apparatus based on deep learning, including: the splitting module is used for splitting an input natural sentence into a plurality of word vectors; wherein the plurality of word vectors constitute the natural sentence; the extraction module is used for respectively extracting the characteristics of the plurality of word vectors to obtain the characteristic information of each word vector; wherein the feature information comprises category information of the word vector; the coding module is used for respectively carrying out bidirectional coding on the plurality of word vectors to obtain bidirectional coding information of each word vector; wherein the bidirectional encoding information includes relationship information between a corresponding current word vector and a previous word vector of the current word vector and a subsequent word vector of the current word vector; and the identification module is used for obtaining an identification result according to the characteristic information and the bidirectional coding information of the plurality of word vectors.
According to another aspect of the present application, there is provided a computer-readable storage medium storing a computer program for executing any one of the deep learning based entity recognition methods described above.
According to another aspect of the present application, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; the processor is configured to execute any one of the deep learning-based entity identification methods.
According to the deep learning-based entity recognition method and apparatus, the computer-readable storage medium, and the electronic device of the present application, an input natural sentence is split into a plurality of word vectors; feature extraction is performed on the word vectors to obtain the feature information of each word vector; the word vectors are bidirectionally encoded to obtain the bidirectional encoding information of each word vector; and a recognition result is then synthesized from the feature information and bidirectional encoding information of the word vectors. By extracting features from each word in the natural sentence and bidirectionally encoding each word, the semantic features and context features of every word are obtained, so that named entities can be recognized accurately.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a flowchart illustrating an entity identification method based on deep learning according to an exemplary embodiment of the present application.
Fig. 2 is a flowchart illustrating an entity recognition method based on deep learning according to another exemplary embodiment of the present application.
Fig. 3 is a flowchart illustrating a bidirectional encoding method according to an exemplary embodiment of the present application.
Fig. 4 is a flowchart illustrating a method for obtaining an identification result according to an exemplary embodiment of the present application.
FIG. 5 is a flowchart illustrating a training process of an entity recognition model based on deep learning according to an exemplary embodiment of the present application.
Fig. 6 is a flowchart illustrating an entity recognition process based on deep learning according to an exemplary embodiment of the present application.
Fig. 7 is a schematic structural diagram of an entity recognition apparatus based on deep learning according to an exemplary embodiment of the present application.
Fig. 8 is a schematic structural diagram of an entity recognition apparatus based on deep learning according to another exemplary embodiment of the present application.
Fig. 9 is a block diagram of an electronic device provided in an exemplary embodiment of the present application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.
Exemplary method
Fig. 1 is a flowchart illustrating an entity identification method based on deep learning according to an exemplary embodiment of the present application. As shown in fig. 1, the entity recognition method based on deep learning includes:
step 110: splitting an input natural sentence into a plurality of word vectors; wherein a plurality of word vectors form a natural sentence.
Because a natural sentence may contain words that contribute little to its meaning, the sentence is split into a plurality of word vectors, so that it is converted into the word vectors that compose it and each word vector can be analyzed and recognized separately.
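As a rough illustration of step 110, the sketch below maps each character of an input sentence to a dense vector through an embedding table. The vocabulary, dimensions, and random embeddings are hypothetical placeholders, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {ch: i for i, ch in enumerate("腹痛3天头1小时")}   # toy character vocabulary
embed_dim = 8
embedding_table = rng.normal(size=(len(vocab), embed_dim))

def split_to_vectors(sentence):
    """Split a natural sentence into one dense vector per character unit."""
    return [embedding_table[vocab[ch]] for ch in sentence if ch in vocab]

word_vectors = split_to_vectors("腹痛3天")
print(len(word_vectors), word_vectors[0].shape)   # 4 vectors, each of dimension 8
```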
Step 120: respectively extracting the features of the plurality of word vectors to obtain the characteristic information of each word vector; wherein the feature information comprises class information of the word vector.
After the plurality of word vectors is obtained, features are extracted from each word vector to obtain its feature information, such as category information and semantic information. In the Transformer-based bidirectional encoder BERT, the word-embedding dimension and the dimension of the encoder output are both 768. Word-level embeddings are context-independent representations, whereas the hidden-layer outputs carry not only the ordinary meaning of a word but also contextual information, so the hidden representation should hold more information; accordingly, the word-vector dimension of ALBERT (A Lite BERT), used in the present application, is smaller than the dimension of the encoder output. In recognition tasks the dictionary is usually large, the embedding matrix therefore has a large number of parameters, and its updates during back-propagation are sparse. Combining these two points, the present application uses factorization to reduce the number of parameters: the word vector is first mapped to a low-dimensional space and then to a high-dimensional space, i.e., the dimension is first reduced through a very low-dimensional embedding matrix and then brought up to the hidden-layer dimension through a high-dimensional matrix, which reduces the embedding parameters.

The present application also uses parameter sharing. A Transformer offers several sharing schemes, for example sharing only the fully-connected layers or only the attention layers; here both the fully-connected layers and the attention layers are shared, i.e., all parameters in the encoder are shared. For a Transformer of the same size, this scheme does lower accuracy somewhat, but the parameter count drops greatly and training speed improves greatly.

BERT's NSP (next sentence prediction) task is in fact a binary classification: positive training samples are two consecutive sentences sampled from the same document, and negative samples are sentences taken from two different documents. In ALBERT, in order to keep only the coherence task and remove the influence of topic recognition, a new task, sentence-order prediction (SOP), is proposed. NSP (next sentence prediction): positive samples are two adjacent sentences, negative samples are two random sentences. SOP (sentence-order prediction): positive samples are two adjacent sentences in their normal order, negative samples are the same two adjacent sentences with the order reversed; the task is aimed at natural language inference (NLI). Studies have found that the NSP task is not very effective, mainly because it is too easy. NSP actually contains two sub-tasks, topic prediction and coherence prediction, but topic prediction is much simpler than coherence prediction, because the model can learn enough merely by noticing that the topics of the two sentences differ. SOP, because both sentences are selected from the same document, concerns only the order of the sentences and is unaffected by topic.
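A minimal sketch of the factorization described above, assuming illustrative sizes (a 21128-token vocabulary, a 128-dimensional embedding space, and a 768-dimensional hidden space; none of these values are stated in the patent): the single V x H embedding matrix is replaced by a V x E lookup followed by an E x H projection.

```python
import numpy as np

V, E, H = 21128, 128, 768              # vocab size, low embedding dim, hidden dim
rng = np.random.default_rng(0)

embed_low = rng.normal(size=(V, E))    # V x E lookup table
project_up = rng.normal(size=(E, H))   # E x H projection into the hidden space

def embed(token_ids):
    """Factorized embedding lookup: (seq_len,) ids -> (seq_len, H) hidden vectors."""
    return embed_low[token_ids] @ project_up

print(V * H, V * E + E * H)                 # ~16.2M parameters vs ~2.8M after factorization
print(embed(np.array([1, 5, 42])).shape)    # (3, 768)
```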
Step 130: carrying out bidirectional coding on a plurality of word vectors respectively to obtain bidirectional coding information of each word vector; the bidirectional coding information comprises corresponding relation information between the current word vector and a word vector before and after the current word vector.
After the plurality of word vectors is obtained, each word vector is bidirectionally encoded to obtain its bidirectional encoding information, namely the relation information between each word vector and the word vectors before and after it, including the probability of combining with other word vectors, the semantics expressed, and so on. Specifically, a word-character LSTM (WC-LSTM) is used, which applies four different strategies to encode word information into fixed-size vectors, so that training can be batched and adapted to various application scenarios. On the basis of LSTM, WC-LSTM uses dictionary information to convert the chain structure into a graph structure; the extra nodes carry the dictionary information, and their weights are updated during training. WC-LSTM follows the same idea as lattice LSTM but modifies it to address its shortcomings. The reason lattice LSTM cannot be trained in batches is that the number of nodes added per word is inconsistent and may be zero or more. WC-LSTM instead rigidly specifies that there is exactly one node carrying word information between each pair of words; if no word information exists between them, it is represented by a null value. This modification makes the structure uniform, so batch training can be used. Finally, the word vector and the word-information vector are concatenated to output the final vector. The word-encoding strategies are: shortest word first, longest word first, average (the mean of the first two), and self-attention.
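The following is a hedged sketch of collapsing the dictionary-matched word information at one position into a single fixed-size node, in the spirit of the WC-LSTM strategies just described. The embeddings are placeholders, and "average" is implemented here as the mean over all matched vectors purely for illustration.

```python
import numpy as np

dim = 8

def word_info_node(matched_word_vectors, strategy="average"):
    """Collapse the 0..n dictionary-matched word vectors at a position into one node."""
    if not matched_word_vectors:
        return np.zeros(dim)                          # null vector when nothing matches
    if strategy == "shortest":
        return matched_word_vectors[0]                # assumes vectors sorted by word length
    if strategy == "longest":
        return matched_word_vectors[-1]
    return np.mean(matched_word_vectors, axis=0)      # illustrative "average" strategy

node = word_info_node([np.ones(dim), 3 * np.ones(dim)])
print(node)   # element-wise mean of the two matched-word vectors
```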
Step 140: and obtaining an identification result according to the characteristic information and the bidirectional coding information of the plurality of word vectors.
After the feature information and bidirectional encoding information of each word vector are obtained, the semantics of each word vector and information such as the probability of combining with its context can be synthesized, and part of the word vectors of the natural sentence are combined in a certain order to obtain the recognition result. Specifically, the present application uses a Conditional Random Field (CRF): an emission probability matrix and a transition probability matrix are used in the computation, the Viterbi algorithm is used for prediction, and the optimal path that it solves is the final recognition result for the output sequence.
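A minimal Viterbi decoding sketch over an emission matrix and a transition matrix, as used by the CRF layer described above; the sizes and random scores are toy values, not the patent's trained parameters.

```python
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (seq_len, n_tags); transitions: (n_tags, n_tags) -> best tag path."""
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()                       # best score ending in each tag
    backpointers = []
    for t in range(1, seq_len):
        # score of moving from every previous tag to every current tag
        total = score[:, None] + transitions + emissions[t][None, :]
        backpointers.append(total.argmax(axis=0))
        score = total.max(axis=0)
    best_last = int(score.argmax())
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    return list(reversed(path))

rng = np.random.default_rng(0)
print(viterbi(rng.normal(size=(5, 4)), rng.normal(size=(4, 4))))  # highest-scoring tag sequence
```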
According to this deep learning-based entity recognition method, an input natural sentence is split into a plurality of word vectors; feature extraction is performed on the word vectors to obtain the feature information of each word vector; the word vectors are bidirectionally encoded to obtain the bidirectional encoding information of each word vector; and a recognition result is then synthesized from the feature information and bidirectional encoding information of the word vectors. By extracting features from each word in the natural sentence and bidirectionally encoding each word, the semantic features and context features of every word are obtained, so that named entities can be recognized accurately.
Fig. 2 is a flowchart illustrating an entity recognition method based on deep learning according to another exemplary embodiment of the present application. As shown in fig. 2, after step 120, the entity identification method may further include:
step 150: and performing dimension reduction processing on the feature information to obtain the feature information after dimension reduction.
Correspondingly, step 140 is adjusted to: obtaining an identification result according to the feature information after dimension reduction and the bidirectional coding information. The dimension reduction simplifies the feature information of each word vector, which speeds up the computation and improves recognition efficiency.
In an embodiment, the specific implementation manner of step 150 may be: global parameter information and attention parameter information of the plurality of word vectors are shared.
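A hedged sketch of what such parameter sharing can look like in practice, following the ALBERT-style scheme described in step 120: a single encoder block (attention plus feed-forward) is instantiated once and reused for every layer, so depth does not increase the parameter count. The dimensions and the use of PyTorch modules are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    def __init__(self, hidden=768, heads=12, layers=12):
        super().__init__()
        self.layers = layers
        # one block whose attention and feed-forward weights are shared across all layers
        self.block = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads,
                                                batch_first=True)

    def forward(self, x):
        for _ in range(self.layers):      # apply the same block repeatedly
            x = self.block(x)
        return x

x = torch.randn(2, 16, 768)               # (batch, seq_len, hidden)
print(SharedEncoder()(x).shape)            # torch.Size([2, 16, 768])
```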
Fig. 3 is a flowchart illustrating a bidirectional encoding method according to an exemplary embodiment of the present application. As shown in fig. 3, step 130 may include:
step 131: converting the chain structure of the plurality of word vectors into a graph structure.
Step 132: weights are set for the encoded information between every two word vectors in the graph structure.
The chain structure of the plurality of word vectors is converted into a graph structure, and a weight is set for the encoding information between every two word vectors in the graph structure, which yields the probability of each pair of word vectors being combined and provides a reference for the subsequent combination that produces the recognition result.
In an embodiment, the specific implementation manner of step 131 may be: setting an information node between every two word vectors, where the information node contains the bidirectional encoding information and the byte length of the information node is a preset fixed value. Encoding the word information into fixed-size vectors allows training to be batched and adapted to various application scenarios. Specifically, when no bidirectional encoding information exists between two word vectors, the information node between them is set to a null vector of the preset byte length; representing the absence of word information with a null vector makes the structure uniform, so batch training can be used and training efficiency is improved.
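A toy sketch of this uniform chain-to-graph conversion: every gap between two adjacent word vectors gets exactly one information node (a null vector of the preset length when there is no word information), plus an edge weight. The names and the placeholder weighting are assumptions for illustration only.

```python
import numpy as np

dim = 8
null_node = np.zeros(dim)

def build_graph(char_vectors, dictionary_hits):
    """dictionary_hits[i] is the word-info vector between position i and i+1, or absent."""
    nodes, weights = [], []
    for i in range(len(char_vectors) - 1):
        info = dictionary_hits.get(i)
        nodes.append(info if info is not None else null_node)   # exactly one node per gap
        weights.append(1.0 if info is not None else 0.0)         # placeholder initial weights
    return np.stack(nodes), np.array(weights)

chars = [np.ones(dim) * i for i in range(4)]
nodes, weights = build_graph(chars, {1: np.full(dim, 0.5)})
print(nodes.shape, weights)   # (3, 8) [0. 1. 0.]
```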
Fig. 4 is a flowchart illustrating a method for obtaining an identification result according to an exemplary embodiment of the present application. As shown in fig. 4, step 140 may include:
step 141: obtaining a plurality of prediction paths according to the characteristic information and the bidirectional coding information of the plurality of word vectors; wherein the predicted path characterizes an order of arrangement of the plurality of word vectors.
According to the feature information and bidirectional encoding information of the plurality of word vectors, the word vectors are combined in certain orders to obtain a plurality of prediction paths, i.e., a plurality of candidate recognition results.
Step 142: and evaluating the plurality of predicted paths to obtain a plurality of evaluation results.
And evaluating the plurality of predicted paths, for example, scoring each predicted path according to a preset rule to obtain a plurality of evaluation results.
Step 143: and selecting a prediction path corresponding to the optimal result in the multiple evaluation results as an identification result.
And selecting a prediction path corresponding to the optimal result (namely the evaluation result with the highest score) in the plurality of evaluation results as an identification result so as to ensure the identification precision.
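A small sketch of steps 141 to 143 under the CRF view used above: each candidate tag path is scored as the sum of its emission and transition scores, and the path with the best evaluation result is selected. The scores below are toy values, not trained CRF parameters.

```python
import numpy as np

def path_score(path, emissions, transitions):
    """Sum of emission scores along the path plus transition scores between adjacent tags."""
    score = emissions[0, path[0]]
    for t in range(1, len(path)):
        score += transitions[path[t - 1], path[t]] + emissions[t, path[t]]
    return score

rng = np.random.default_rng(0)
emissions, transitions = rng.normal(size=(3, 2)), rng.normal(size=(2, 2))
candidates = [(a, b, c) for a in range(2) for b in range(2) for c in range(2)]
best = max(candidates, key=lambda p: path_score(p, emissions, transitions))
print(best)   # the candidate path with the highest evaluation score
```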
Take the chief complaint "abdominal pain 3 days" as an example: it is labeled as "abdominal pain" (clinical manifestation) + "3 days" (duration) and input to the model, which learns its semantic structure and meaning to complete training; the specific training process is shown in fig. 5. During entity recognition, a new input "headache 1 hour" is recognized by the model as "headache" (clinical manifestation) + "1 hour" (duration); the specific recognition process is shown in fig. 6.
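For concreteness, the labeled example above might be represented as the following BIO-tagged training sample; the tag names are an assumption, since the patent only names the entity types "clinical manifestation" and "duration".

```python
# "abdominal pain 3 days" as a character-level BIO sequence (hypothetical tag set)
training_sample = [
    ("腹", "B-CLINICAL"), ("痛", "I-CLINICAL"),   # "abdominal pain" -> clinical manifestation
    ("3", "B-DURATION"), ("天", "I-DURATION"),    # "3 days" -> duration
]
new_input = "头痛1小时"   # "headache 1 hour": expected tags B/I-CLINICAL then B/I-DURATION
print(training_sample, new_input)
```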
The WC-LSTM + CRF portion of the ALBERT + WC-LSTM + CRF architecture in this application may also be replaced with LM-LSTM + CRF. In the traditional BiLSTM + CRF pipeline, word/token-level syntactic and semantic features are extracted by the BiLSTM, and a CRF layer is then attached to guarantee valid and globally optimal label transitions. The main feature of the LM-LSTM scheme is that, on top of BiLSTM + CRF, a character-level (char-level) language model is additionally introduced for joint training. The overall structure of the model includes:
(1) The char-level language model. A language model is built with a char-level BiLSTM so that latent features can be extracted from the self-supervised corpus. Unlike the simple approach of having each char predict the next char (which may cause the model to merely memorize how words are spelled), a space marker is inserted after the last char of each word, and the next word is predicted at that position.
(2) The word-level sequence labeling model. The embedding of each word is the concatenation of a pre-trained word vector and the output at the space marker, and a BiLSTM + CRF is then attached to predict the sequence.
The LM-BiLSTM + CRF model is carefully designed and trained; its main points include:
(1) Use of information at different granularities. The text sequence labeling task is inherently a word-level task, but by means of a char-level language model (or other jointly trained tasks), structural and semantic features between chars and words can be learned from the task text in a self-supervised manner.
(2) The highway trick. Considering the weak correlation between the language model and the labeling task, when the char-level output is used both for LM prediction and for concatenation with word embeddings, highway layers are introduced on the char side to map the char-level output into different semantic spaces, so that the underlying BiLSTM focuses on extracting general features among chars (while still allowing information to flow during gradient back-propagation) and the highway parameters focus on the labeling task; a minimal highway-layer sketch follows this list.
(3) Fusion and fine-tuning of word vectors. Considering that the corpus is not large and computation is time-consuming, the paper directly fine-tunes GloVe word vectors trained on a massive corpus, rather than pre-training on the task corpus or randomly initializing and jointly training inside LM-BiLSTM + CRF. Meanwhile, in addition to GloVe, the word-level vectors fully fuse the information obtained from the char level, so the top-level BiLSTM + CRF model can make use of it.
(4) Alignment of tokens. Normally, when a BiLSTM is used, the output at each token (the concatenation of the two directions) is taken directly. Here, however, word vectors are fused at the word level, so token alignment needs particular attention: for the forward LSTM, the hidden-state information after each token is used; for the backward LSTM, the information before each token (in terms of absolute position) is used.
(5) Differences between the training and inference phases. During training, the model must simultaneously consider the cross-entropy loss of the char-level LM and the Viterbi loss of the word-level sequence labeling task; during prediction, only the sequence-labeling output is needed. Therefore, the vocabulary of the char-level LM and the word-level embedding vocabulary may differ: the char-level LM vocabulary only needs to cover the words in the training samples, whereas the embedding should use a larger vocabulary so that the model suits inference on more corpora and overcomes the OOV problem beyond the training text. However, the char-level task does not suit Chinese, so this model is hard to apply directly to Chinese corpora and needs to be retrained at the word (Chinese character) level.
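Below is a minimal highway-layer sketch for point (2) above: a gated mix of a non-linear transform and the identity, used here to map char-level LM outputs into a separate semantic space before concatenation with the word embeddings. The module layout and dimensions are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class Highway(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        t = torch.sigmoid(self.gate(x))             # how much of the transform to use
        h = torch.relu(self.transform(x))
        return t * h + (1.0 - t) * x                # gated residual mix

char_lm_output = torch.randn(4, 100)                # 4 tokens, 100-dim char-level features
print(Highway(100)(char_lm_output).shape)           # torch.Size([4, 100])
```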
Exemplary devices
Fig. 7 is a schematic structural diagram of an entity recognition apparatus based on deep learning according to an exemplary embodiment of the present application. As shown in fig. 7, the entity identifying apparatus 50 includes: a splitting module 51, configured to split an input natural sentence into a plurality of word vectors; wherein the plurality of word vectors form a natural sentence; the extracting module 52 is configured to perform feature extraction on the plurality of word vectors respectively to obtain characteristic information of each word vector; wherein the feature information comprises class information of the word vector; the encoding module 53 is configured to perform bidirectional encoding on the plurality of word vectors respectively to obtain bidirectional encoding information of each word vector; the bidirectional coding information comprises corresponding relation information between a current word vector and a word vector before and after the current word vector; and an identification module 54, configured to obtain an identification result according to the feature information and the bidirectional coding information of the plurality of word vectors.
In the deep learning-based entity recognition apparatus, the splitting module 51 splits an input natural sentence into a plurality of word vectors; the extraction module 52 performs feature extraction on the word vectors to obtain the feature information of each word vector; the encoding module 53 bidirectionally encodes the word vectors to obtain the bidirectional encoding information of each word vector; and finally the recognition module 54 synthesizes the feature information and bidirectional encoding information of the word vectors to obtain a recognition result. By extracting features from each word in the natural sentence and bidirectionally encoding each word, the semantic features and context features of every word are obtained, so that named entities can be recognized accurately.
Fig. 8 is a schematic structural diagram of an entity recognition apparatus based on deep learning according to another exemplary embodiment of the present application. As shown in fig. 8, the entity identifying apparatus 50 may further include: and the dimension reduction module 55 is configured to perform dimension reduction processing on the feature information to obtain the feature information after dimension reduction.
In an embodiment, as shown in fig. 8, the encoding module 53 may include: a conversion unit 531 for converting the chain structure of the plurality of word vectors into a graph structure; a weight setting unit 532, configured to set a weight for the encoded information between every two word vectors in the graph structure.
In one embodiment, as shown in fig. 8, the identification module 54 may include: a predicted path obtaining unit 541 configured to obtain a plurality of predicted paths from feature information and bidirectional encoding information of the plurality of word vectors; wherein the prediction path characterizes an arrangement order of the plurality of word vectors; the evaluation unit 542 is configured to evaluate the plurality of predicted paths to obtain a plurality of evaluation results; the result determining unit 543 is configured to select a predicted path corresponding to an optimal result of the multiple evaluation results as an identification result.
Exemplary electronic device
Next, an electronic apparatus according to an embodiment of the present application is described with reference to fig. 9. The electronic device may be either or both of the first device and the second device, or a stand-alone device separate from them, which stand-alone device may communicate with the first device and the second device to receive the acquired input signals therefrom.
FIG. 9 illustrates a block diagram of an electronic device in accordance with an embodiment of the present application.
As shown in fig. 9, the electronic device 10 includes one or more processors 11 and memory 12.
The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
Memory 12 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 11 to implement the deep learning based entity identification methods of the various embodiments of the present application described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
When the electronic device is a stand-alone device, the input means 13 may be a communication network connector for receiving the acquired input signals from the first device and the second device.
The input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 may output various information including the determined distance information, direction information, and the like to the outside. The output devices 14 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 10 relevant to the present application are shown in fig. 9, and components such as buses, input/output interfaces, and the like are omitted. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the deep learning based entity recognition method according to various embodiments of the present application described in the "exemplary methods" section of this specification, supra.
The computer program product may be written with program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the deep learning based entity recognition method according to various embodiments of the present application described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.
The block diagrams of devices, apparatuses, systems referred to in this application are only given as illustrative examples and are not intended to require or imply that the connections, arrangements, configurations, etc. must be made in the manner shown in the block diagrams. These devices, apparatuses, devices, systems may be connected, arranged, configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "as used herein mean, and are used interchangeably with, the word" and/or, "unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

CN202110965679.8A (priority date 2021-05-13, filed 2021-08-20): Entity identification method, device, medium and electronic equipment based on deep learning. Status: Pending. Published as CN113673247A (en).

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
CN2021105247778 | 2021-05-13 | |
CN202110524777 | 2021-05-13 | |

Publications (1)

Publication Number | Publication Date
CN113673247A | 2021-11-19

Family

ID=78544854

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110965679.8A | Entity identification method, device, medium and electronic equipment based on deep learning (Pending, published as CN113673247A) | 2021-05-13 | 2021-08-20

Country Status (1)

Country | Link
CN (1) | CN113673247A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110298019A (en) * | 2019-05-20 | 2019-10-01 | 平安科技(深圳)有限公司 | Name entity recognition method, device, equipment and computer readable storage medium
WO2021072852A1 (en) * | 2019-10-16 | 2021-04-22 | 平安科技(深圳)有限公司 | Sequence labeling method and system, and computer device
CN111125331A (en) * | 2019-12-20 | 2020-05-08 | 京东方科技集团股份有限公司 | Semantic recognition method, apparatus, electronic device, and computer-readable storage medium
CN111126068A (en) * | 2019-12-25 | 2020-05-08 | 中电云脑(天津)科技有限公司 | Chinese named entity recognition method and device and electronic equipment
CN111709241A (en) * | 2020-05-27 | 2020-09-25 | 西安交通大学 | A Named Entity Recognition Method for Network Security
CN112232058A (en) * | 2020-10-15 | 2021-01-15 | 济南大学 | Fake news identification method and system based on deep learning three-layer semantic extraction framework
CN112528654A (en) * | 2020-12-15 | 2021-03-19 | 作业帮教育科技(北京)有限公司 | Natural language processing method and device and electronic equipment
CN112417881A (en) * | 2020-12-17 | 2021-02-26 | 江苏满运物流信息有限公司 | Logistics information identification method, device, electronic device, storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JOHNCHENYHL: "Introduction to the Latest Chinese NER Models (Part 2)" (最新中文NER模型介绍(二)), https://zhuanlan.zhihu.com/p/77788495 *
吴小雪 et al.: "Application of Pre-trained Language Models to Named Entity Recognition in Chinese Electronic Medical Records" (预训练语言模型在中文电子病历命名实体识别上的应用), 《电子质量》 (Electronics Quality) *
谢腾 et al.: "Chinese Entity Recognition Based on the BERT-BiLSTM-CRF Model" (基于BERT-BiLSTM-CRF模型的中文实体识别), 《计算机系统应用》 (Computer Systems &amp; Applications) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115938470A (en) * | 2023-01-04 | 2023-04-07 | 抖音视界有限公司 | Protein characteristic pretreatment method, device, medium and equipment
CN115938470B (en) * | 2023-01-04 | 2024-01-19 | 抖音视界有限公司 | Protein characteristic pretreatment method, device, medium and equipment


Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 2021-11-19
