CN112560408A - Text labeling method, text labeling device, text labeling terminal and storage medium - Google Patents

Text labeling method, text labeling device, text labeling terminal and storage medium

Info

Publication number
CN112560408A
Authority
CN
China
Prior art keywords
word
entity
information
text
labeling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011510630.5A
Other languages
Chinese (zh)
Inventor
黄月森
李嘉明
邓强
宋敏芳
黄丽
谢黛娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Xuanyuan Network & Technology Co ltd
Original Assignee
Guangdong Xuanyuan Network & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Xuanyuan Network & Technology Co ltd
Priority to CN202011510630.5A
Publication of CN112560408A
Legal status: Pending

Abstract

Translated from Chinese

The present application discloses a text labeling method, apparatus, terminal and storage medium. Entity words are obtained by performing word segmentation on the text to be labeled. Using the word meaning labeling information of a first entity word, in combination with a preset word meaning association relationship, a second entity word is screened out from the entity words together with the word meaning relationship information of the first entity word and the second entity word, and word meaning association labeling information of the two entity words is generated. This automatically labels both the entity words in the text and the associations between them, which improves the accuracy of text labeling and solves the technical problem that, with existing labeling methods, accuracy drops markedly when the labeling volume is large and the associated labels are far apart.

Figure 202011510630

Description

Text labeling method, text labeling device, text labeling terminal and storage medium
Technical Field
The present application relates to the field of text processing technologies, and in particular, to a text labeling method, apparatus, terminal, and storage medium.
Background
With the rapid development of the artificial intelligence industry, the massive demand for labeled data has driven the growth of the artificial intelligence data labeling service industry, and both industry demand and scale keep expanding. Artificial intelligence has achieved many results at home and abroad, and a large amount of labeled data is needed to train artificial intelligence algorithms and models.
Text labeling is an essential link in training artificial intelligence models. It is the process of turning raw data into data usable by an algorithm: the raw data is generally obtained through data acquisition, and the subsequent data labeling amounts to processing a text and then handing the processed text to artificial intelligence algorithms and models for use. Current text labeling includes word meaning labeling and association labeling, both carried out manually. The labeling accuracy therefore depends on the labeler; even when a professional is responsible for the labeling, the accuracy is limited by labeling efficiency, and it drops markedly when the labeling volume is large and the associated labels are far apart.
Disclosure of Invention
The application provides a text labeling method, apparatus, terminal and storage medium, which are used to solve the technical problems that current text labeling is performed manually, that the labeling accuracy depends on the labeler, and that the labeling accuracy drops markedly when the labeling volume is large and the associated labels are far apart.
First, a first aspect of the present application provides a text annotation method, including:
acquiring text data to be marked;
performing word segmentation processing on the text data to obtain vocabulary information contained in the text data, and identifying entity words in the vocabulary information;
determining word meaning labeling information of each entity word in a semantic recognition mode according to the entity words and context keywords;
according to word meaning labeling information of a first entity word, in combination with a preset word meaning association relationship, screening out from the entity words a second entity word, together with word meaning relationship information of the first entity word and the second entity word, wherein the second entity word is an entity word whose word meaning labeling information has an association relationship with the word meaning labeling information of the first entity word;
and generating word meaning association labeling information according to the first entity words, the second entity words and the word meaning relation information.
Preferably, after identifying the entity words in the vocabulary information, the method further includes:
and generating text positioning information of the entity words according to the positions of the entity words in the text data.
Preferably, after determining word sense labeling information of each entity word by a semantic recognition mode according to the entity word and in combination with a context text of the entity word, the method further includes:
and determining a first label display area corresponding to the entity words according to the text positioning information of the entity words respectively so as to display the word meaning label information of the entity words on the first label display area.
Preferably, after generating the word sense associated tagging information according to the first entity word, the second entity word and the word sense relation information, the method further includes:
determining a second label display area according to the text positioning information of the first entity word and the second entity word so as to display the word meaning associated label information on the second label display area, wherein the word meaning associated label information comprises: word sense association label text and word sense association vector graphics.
Meanwhile, a second aspect of the present application provides a text labeling apparatus, including:
the text acquisition unit is used for acquiring text data to be marked;
the entity word recognition unit is used for performing word segmentation processing on the text data to obtain vocabulary information contained in the text data and recognizing entity words in the vocabulary information;
the word meaning labeling processing unit is used for determining word meaning labeling information of each entity word in a semantic recognition mode according to the entity words and in combination with context keywords;
the associated entity word identification unit is used for screening out from the entity words, according to word meaning labeling information of a first entity word and in combination with a preset word meaning association relationship, a second entity word together with word meaning relationship information of the first entity word and the second entity word, wherein the second entity word is an entity word whose word meaning labeling information has an association relationship with the word meaning labeling information of the first entity word;
and the word meaning association labeling information generating unit is used for generating word meaning association labeling information according to the first entity words, the second entity words and the word meaning relation information.
Preferably, the apparatus further includes:
and the entity word positioning unit is used for generating text positioning information of the entity words according to the positions of the entity words in the text data.
Preferably, the apparatus further includes:
and the word meaning label display unit is used for determining a first label display area corresponding to the entity word according to the text positioning information of the entity word so as to display the word meaning label information of the entity word on the first label display area.
Preferably, the apparatus further includes:
a word sense associated label display unit, configured to determine a second label display area according to text positioning information of the first entity word and the second entity word, so as to display the word sense associated label information on the second label display area, where the word sense associated label information includes: word sense association label text and word sense association vector graphics.
A third aspect of the present application provides a text annotation terminal, including: a memory and a processor;
the memory is used for storing program codes, and the program codes correspond to the text labeling method of the first aspect of the application;
the processor is configured to execute the program code.
A fourth aspect of the present application provides a storage medium having stored therein program code corresponding to the text labeling method of the first aspect of the present application.
According to the technical scheme, the method has the following advantages:
The application provides a text labeling method, which comprises the following steps: acquiring text data to be labeled; performing word segmentation processing on the text data to obtain vocabulary information contained in the text data, and identifying entity words in the vocabulary information; determining word meaning labeling information of each entity word by means of semantic recognition, according to the entity words and context keywords; according to word meaning labeling information of a first entity word, in combination with a preset word meaning association relationship, screening out from the entity words a second entity word, together with word meaning relationship information of the first entity word and the second entity word, wherein the second entity word is an entity word whose word meaning labeling information has an association relationship with the word meaning labeling information of the first entity word; and generating word meaning association labeling information according to the first entity word, the second entity word and the word meaning relationship information.
The method uses the entity words obtained by word segmentation of the text to be labeled and the word meaning labeling information of a first entity word, in combination with a preset word meaning association relationship, to screen out a second entity word from the entity words together with the word meaning relationship information of the two entity words, and generates word meaning association labeling information of the first entity word and the second entity word. Entity words and the associations between entity words in the text are thereby labeled automatically, which improves the accuracy of text labeling and solves the technical problem that, with existing labeling methods, the labeling accuracy drops markedly when the labeling volume is large and the associated labels are far apart.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flowchart of a text labeling method according to a first embodiment of the present application;
Fig. 2 is a schematic flowchart of a text labeling method according to a second embodiment of the present application;
Fig. 3 is a schematic structural diagram of a first embodiment of a text labeling device provided in the present application;
Fig. 4 is a schematic diagram illustrating the effect of labeling according to the text labeling method provided by the present application.
Detailed Description
The embodiment of the application provides a text labeling method, a text labeling device, a text labeling terminal and a text labeling storage medium, which are used for solving the technical problems that the existing text labeling mode is manual labeling, the accuracy of labeling is associated with a labeler, and the labeling accuracy is obviously reduced when the labeling quantity is large and the associated labels are far away.
In order to make the objects, features and advantages of the present invention more apparent and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the embodiments described below are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a first embodiment of the present application provides a text annotation method, including:
Step 101, obtaining text data to be labeled.
Step 102, performing word segmentation processing on the text data to obtain vocabulary information contained in the text data, and identifying entity words in the vocabulary information.
It should be noted that the text data to be labeled is first acquired and segmented into individual words, yielding vocabulary information composed of those words; entity words are then identified among the words according to their parts of speech.
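Steps 101 and 102 can be sketched as follows. The patent does not name a segmentation algorithm, so this sketch assumes forward maximum matching over a tiny hypothetical lexicon that also carries part-of-speech tags; `LEXICON`, `segment` and `entity_words` are illustrative names, the sample sentence is invented, and treating nouns as entity words is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: word -> part of speech ("n" = noun, "v" = verb, "det" = determiner).
LEXICON: Dict[str, str] = {
    "the": "det",
    "college": "n",
    "hires": "v",
    "a": "det",
    "psychology teacher": "n",
}


def segment(text: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Forward maximum matching over whitespace tokens.

    Returns (word, part_of_speech) pairs; unknown tokens get the tag "x".
    """
    tokens = text.lower().split()
    max_len = max(len(word.split()) for word in lexicon)
    vocabulary: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                vocabulary.append((candidate, lexicon[candidate]))
                i += n
                break
        else:
            vocabulary.append((tokens[i], "x"))
            i += 1
    return vocabulary


def entity_words(vocabulary: List[Tuple[str, str]], entity_pos=("n",)) -> List[str]:
    """Keep the words whose part of speech marks them as entity words."""
    return [word for word, pos in vocabulary if pos in entity_pos]


vocabulary = segment("The college hires a psychology teacher", LEXICON)
print(entity_words(vocabulary))
```

A production system for Chinese text would more likely rely on a dedicated segmenter with part-of-speech tagging; the lexicon approach here only keeps the example self-contained.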
Step 103, determining word meaning labeling information of each entity word by means of semantic recognition, according to the entity words and the context keywords.
It should be noted that, based on the entity words identified in step 102 and the keyword information in the context of each entity word, the word meaning or paraphrase of each entity word in the text is identified by a semantic recognition method, such as a machine learning semantic recognition method, and the word meaning labeling information corresponding to each entity word is determined, so that the word meanings of the entity words are labeled automatically.
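As a rough illustration of step 103, the sketch below picks a word meaning label by counting how many context keywords overlap with cue words for each candidate label. This keyword-overlap heuristic is a stand-in chosen for brevity, not the machine learning method the text mentions; `SENSES`, `label_sense` and the cue words are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical inventory: entity word -> {word meaning label: cue keywords}.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "college": {
        "school": {"teacher", "student", "hires"},
        "building": {"roof", "built"},
    },
    "teacher": {
        "employee": {"hires", "salary", "college"},
    },
}


def label_sense(
    entity: str,
    tokens: List[str],
    index: int,
    senses: Dict[str, Dict[str, Set[str]]],
    window: int = 3,
) -> Optional[str]:
    """Return the label whose cue keywords overlap most with the context window.

    `index` is the position of `entity` in `tokens`. Ties go to the label
    listed first; entities missing from the inventory yield None.
    """
    context = set(tokens[max(0, index - window):index])
    context |= set(tokens[index + 1:index + 1 + window])
    candidates = senses.get(entity, {})
    if not candidates:
        return None
    return max(candidates, key=lambda label: len(candidates[label] & context))


tokens = ["the", "college", "hires", "a", "teacher"]
print(label_sense("college", tokens, 1, SENSES))
print(label_sense("teacher", tokens, 4, SENSES))
```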
Step 104, according to the word meaning labeling information of a first entity word, in combination with a preset word meaning association relationship, screening out from the entity words a second entity word, together with word meaning relationship information of the first entity word and the second entity word, wherein the second entity word is an entity word whose word meaning labeling information has an association relationship with the word meaning labeling information of the first entity word.
It should be noted that one entity word is arbitrarily selected from the entity words of step 102 as the first entity word. According to the word meaning labeling information of the first entity word, in combination with the preset word meaning association relationship, an entity word whose word meaning labeling information is associated with that of the first entity word is screened out from the entity words as the second entity word; the word meaning relationship information of the first entity word and the second entity word is then determined from the word meaning association relationship.
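The screening in step 104 amounts to a lookup in a table of preset associations keyed by pairs of word meaning labels. The sketch below assumes that representation; `ASSOCIATIONS`, `screen_second_entities` and the sample labels are hypothetical, with the school/employee relation borrowed from the example used later in the description.

```python
from typing import Dict, List, Tuple

# Hypothetical preset associations: (label of first word, label of second word) -> relation.
ASSOCIATIONS: Dict[Tuple[str, str], str] = {
    ("school", "employee"): "hires",
    ("employee", "school"): "is hired by",
}


def screen_second_entities(
    first: str,
    sense_of: Dict[str, str],
    associations: Dict[Tuple[str, str], str],
) -> List[Tuple[str, str]]:
    """Return (second entity word, relation) pairs associated with `first`.

    `sense_of` maps every entity word in the text to its word meaning label.
    """
    first_sense = sense_of[first]
    matches: List[Tuple[str, str]] = []
    for word, sense in sense_of.items():
        if word == first:
            continue
        relation = associations.get((first_sense, sense))
        if relation is not None:
            matches.append((word, relation))
    return matches


sense_of = {"college": "school", "teacher": "employee", "campus": "place"}
print(screen_second_entities("college", sense_of, ASSOCIATIONS))
```

Because the lookup never considers how far apart the two words are in the text, distant pairs are handled the same way as adjacent ones, which is the case the description says manual labeling handles poorly.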
Step 105, generating word meaning association labeling information according to the first entity word, the second entity word and the word meaning relationship information.
It should be noted that word meaning association labeling information linking the first entity word and the second entity word is generated from their word meaning relationship information, so that the association between entity words is labeled automatically.
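The patent does not specify a format for the word meaning association labeling information. One plausible shape, assumed here purely for illustration, is a small record per entity pair that can be serialized for a downstream training pipeline; `AssociationLabel` and `generate_association_labels` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AssociationLabel:
    first: str     # first entity word
    second: str    # second entity word
    relation: str  # word meaning relationship information


def generate_association_labels(
    first: str, related: List[Tuple[str, str]]
) -> List[AssociationLabel]:
    """Build one record per (second entity word, relation) pair found for `first`."""
    return [AssociationLabel(first, second, relation) for second, relation in related]


labels = generate_association_labels("college", [("teacher", "hires")])
print(json.dumps([asdict(label) for label in labels]))
```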
The embodiment of the application uses the entity words obtained by word segmentation of the text to be labeled and the word meaning labeling information of a first entity word, in combination with a preset word meaning association relationship, to screen out a second entity word from the entity words together with the word meaning relationship information of the two entity words, and generates word meaning association labeling information of the first entity word and the second entity word. Entity words and the associations between entity words in the text are thereby labeled automatically, which improves the accuracy of text labeling and solves the technical problem that, with existing labeling methods, the labeling accuracy drops markedly when the labeling volume is large and the associated labels are far apart.
The above is a detailed description of a first embodiment of a text annotation method provided in the present application, and the following is a detailed description of a second embodiment of a text annotation method provided in the present application.
Referring to fig. 2 and fig. 4, a second embodiment of the present application provides a text annotation method based on the first embodiment, including:
Step 201, obtaining text data to be labeled.
Step 202, performing word segmentation processing on the text data to obtain vocabulary information contained in the text data, and identifying entity words in the vocabulary information.
It should be noted that the text data to be labeled is first acquired and segmented into individual words, yielding vocabulary information composed of those words; entity words are then identified among the words according to their parts of speech.
Step 2001, generating text positioning information of the entity words according to the positions of the entity words in the text data.
It should be noted that, in this embodiment, after the text is split into words, each entity word is assigned a pair of (x, y) coordinates as its text positioning information, so that the entity word can be located quickly and/or the coordinates can be used for label display processing in subsequent steps.
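The description does not say what the (x, y) coordinates measure. The sketch below assumes a simple character grid in which the text is hard-wrapped at a fixed width, x is the column of the word's first character and y is its row; `locate` and `line_width` are hypothetical.

```python
from typing import List, Tuple


def locate(text: str, words: List[str], line_width: int = 40) -> List[Tuple[str, int, int]]:
    """Return (word, x, y) for each word, scanning the text left to right.

    Assumes the text is rendered as rows of exactly `line_width` characters,
    so a character offset maps to column = offset % line_width and
    row = offset // line_width. Words not found after the cursor are skipped.
    """
    positions: List[Tuple[str, int, int]] = []
    cursor = 0
    for word in words:
        start = text.find(word, cursor)
        if start < 0:
            continue
        positions.append((word, start % line_width, start // line_width))
        cursor = start + len(word)
    return positions


print(locate("The college hires a teacher", ["college", "teacher"], line_width=10))
```

Advancing the cursor past each match lets repeated occurrences of the same word receive distinct coordinates, provided the words are passed in reading order.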
Step 203, determining word meaning labeling information of each entity word by means of semantic recognition, according to the entity words and the context keywords.
Step 2002, determining a first label display area corresponding to each entity word according to the text positioning information of the entity word, so as to display the word meaning labeling information of the entity word on the first label display area.
As shown in fig. 4, based on the text positioning information obtained in step 2001, an area for labeling the entity is calculated from that information and used as the first label display area, for example an adjacent area above or below the entity word, so that the word meaning labeling information of the entity word can subsequently be displayed there. Taking the entity word "college" as an example: once its word meaning labeling information is identified as "school", a first label display area may be formed close to "college" so that the entity word and its word meaning labeling information are displayed together.
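One way to compute such an area, sketched under assumed layout constants, is to reserve a strip above and below every text row and place the label rectangle in the strip adjacent to the word. The pixel sizes, `Rect` and `first_label_area` are hypothetical; the patent gives no concrete geometry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int


CHAR_W = 10   # assumed pixel width of one character cell
LABEL_H = 12  # assumed height of a label strip
TEXT_H = 16   # assumed height of the text itself
ROW_H = LABEL_H + TEXT_H + LABEL_H  # each row: strip above, text, strip below


def first_label_area(x: int, y: int, word_len: int, above: bool = True) -> Rect:
    """Rectangle directly above (or below) a word at grid position (x, y)."""
    row_top = y * ROW_H
    top = row_top if above else row_top + LABEL_H + TEXT_H
    return Rect(left=x * CHAR_W, top=top, width=word_len * CHAR_W, height=LABEL_H)


print(first_label_area(4, 0, len("college")))
```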
Step 204, according to the word meaning labeling information of a first entity word, in combination with a preset word meaning association relationship, screening out from the entity words a second entity word, together with word meaning relationship information of the first entity word and the second entity word, wherein the second entity word is an entity word whose word meaning labeling information has an association relationship with the word meaning labeling information of the first entity word.
Step 205, generating word meaning association labeling information according to the first entity word, the second entity word and the word meaning relationship information.
Step 2003, determining a second label display area according to the text positioning information of the first entity word and the second entity word, so as to display the word meaning association labeling information on the second label display area, wherein the word meaning association labeling information includes: word meaning association label text and word meaning association vector graphics.
As shown in fig. 4, based on the text positioning information obtained in step 2001, an area for labeling the relationship is calculated from the text positioning information of the two entity words and used as the second label display area, for example an area above or below the first entity word and the second entity word, so that the word meaning association labeling information of the two entity words can subsequently be displayed there. For example, suppose the preset word meaning association relationship between school/unit and employee is hires/is hired by, and the current first entity word is "college". When the entity word "mental professional teacher", whose word meaning labeling information is "employee", is screened out as the second entity word, a second label display area is determined from the text positioning information of the two entity words, and the word meaning association labeling information is displayed on it. When the first entity word is "mental professional teacher" or another entity word, the word meaning association labeling information is generated in the same manner, which is not repeated here.
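The second label display area and its vector graphic can be sketched as a bracket drawn over the two entity words with the relation text centered on it. Rendering to SVG is an assumption made for this example, as are `Box`, `association_area`, `association_svg` and the pixel values; the patent only states that the label consists of text plus a vector graphic.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    width: int
    height: int


def association_area(a: Box, b: Box, gap: int = 4, height: int = 14) -> Box:
    """Strip spanning both word boxes, placed `gap` pixels above the higher one."""
    left = min(a.left, b.left)
    right = max(a.left + a.width, b.left + b.width)
    top = min(a.top, b.top) - gap - height
    return Box(left, top, right - left, height)


def association_svg(a: Box, b: Box, text: str) -> str:
    """SVG fragment: a bracket joining the two words plus the relation text."""
    area = association_area(a, b)
    ax = a.left + a.width // 2
    bx = b.left + b.width // 2
    rail = area.top + area.height  # y coordinate of the bracket's horizontal bar
    path = f"M {ax} {a.top} L {ax} {rail} L {bx} {rail} L {bx} {b.top}"
    tx = area.left + area.width // 2
    return (
        f'<g><path d="{path}" fill="none" stroke="black"/>'
        f'<text x="{tx}" y="{rail - 2}" text-anchor="middle">{escape(text)}</text></g>'
    )


first_box = Box(40, 52, 70, 16)
second_box = Box(200, 52, 70, 16)
print(association_svg(first_box, second_box, "hires"))
```

The strip is derived only from the two word boxes, so the same computation applies whether the words are adjacent or far apart on the page.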
It should be noted that steps 201, 202, 203, 204, and 205 of this embodiment correspond to steps 101 to 105 of the first embodiment, and these steps are not described again here.
The above is a detailed description of the second embodiment of the text labeling method provided in the present application, and the following is a detailed description of the first embodiment of the text labeling apparatus provided in the present application.
Referring to fig. 3, a third embodiment of the present application provides a text annotation device, including:
a text acquiring unit 301, configured to acquire text data to be labeled;
the entity word recognition unit 302 is configured to perform word segmentation processing on the text data to obtain vocabulary information included in the text data, and recognize entity words in the vocabulary information;
a word sense tagging processing unit 303, configured to determine, according to the entity words, word sense tagging information of each entity word in a semantic recognition manner in combination with the context keywords;
the associated entity word recognition unit 304 is configured to, according to the word meaning labeling information of a first entity word and in combination with a preset word meaning association relationship, screen out from the entity words a second entity word, together with word meaning relationship information of the first entity word and the second entity word, where the second entity word is an entity word whose word meaning labeling information has an association relationship with the word meaning labeling information of the first entity word;
a word sense associated labeling information generating unit 305 for generating word sense associated labeling information based on the first entity word, the second entity word and the word sense relation information.
Further, the device also includes:
the entity word positioning unit 3001 is configured to generate text positioning information of an entity word according to a position of the entity word in the text data.
Further, the device also includes:
the word meaning label display unit 3002 is configured to determine, according to the text positioning information of the entity word, a first label display area corresponding to the entity word, so as to display the word meaning labeling information of the entity word on the first label display area.
Further, the device also includes:
a word meaning association label display unit 3003, configured to determine a second label display area according to the text positioning information of the first entity word and the second entity word, so as to display the word meaning association labeling information on the second label display area, where the word meaning association labeling information includes: word meaning association label text and word meaning association vector graphics.
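The unit structure above maps naturally onto a small class whose fields are pluggable callables, one per unit. This is only one possible arrangement, assumed for illustration; `TextLabelingDevice`, its field names and the toy units wired in below are hypothetical, and the display units 3001 to 3003 are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class TextLabelingDevice:
    acquire_text: Callable[[], str]                                        # unit 301
    recognize_entities: Callable[[str], List[str]]                         # unit 302
    label_senses: Callable[[str, List[str]], Dict[str, str]]               # unit 303
    find_related: Callable[[str, Dict[str, str]], List[Tuple[str, str]]]   # unit 304

    def run(self) -> List[Tuple[str, str, str]]:
        """Run the units in order; the final loop plays the role of unit 305."""
        text = self.acquire_text()
        entities = self.recognize_entities(text)
        senses = self.label_senses(text, entities)
        labels: List[Tuple[str, str, str]] = []
        for first in entities:
            for second, relation in self.find_related(first, senses):
                labels.append((first, relation, second))
        return labels


# Toy data standing in for real models and the preset association table.
SENSES = {"college": "school", "teacher": "employee"}
RELATIONS = {("school", "employee"): "hires"}

device = TextLabelingDevice(
    acquire_text=lambda: "the college hires a teacher",
    recognize_entities=lambda text: [w for w in text.split() if w in SENSES],
    label_senses=lambda text, entities: {e: SENSES[e] for e in entities},
    find_related=lambda first, senses: [
        (other, RELATIONS[(senses[first], sense)])
        for other, sense in senses.items()
        if other != first and (senses[first], sense) in RELATIONS
    ],
)
print(device.run())
```

Keeping each unit behind a callable matches the description's later remark that the division into units is only logical: units can be merged, swapped or distributed without changing the control flow in `run`.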
The above is a detailed description of an embodiment of a text labeling apparatus provided in the present application, and the following is a detailed description of a text labeling terminal and a storage medium provided in the present application.
A third aspect of the present application provides a text annotation terminal, including: a memory and a processor;
the memory is used for storing program codes, and the program codes correspond to the text labeling methods mentioned in the first embodiment or the second embodiment of the application;
the processor is used for executing the program codes to realize the text labeling method mentioned in the first embodiment or the second embodiment of the application.
A fourth aspect of the present application provides a storage medium having stored therein program code corresponding to the text labeling method as mentioned in the first embodiment or the second embodiment of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical division, and other divisions are possible in practice; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

Translated fromChinese
1. A text labeling method, characterized by comprising:
acquiring text data to be labeled;
performing word segmentation on the text data to obtain the vocabulary information contained in the text data, and identifying the entity words in the vocabulary information;
determining word sense labeling information for each entity word by semantic recognition, according to the entity word in combination with keywords from its context;
screening out, from the entity words, a second entity word and word sense relationship information between a first entity word and the second entity word, according to the word sense labeling information of the first entity word in combination with a preset word sense association relationship, wherein the second entity word is an entity word whose word sense labeling information is associated with the word sense labeling information of the first entity word;
generating word sense association labeling information according to the first entity word, the second entity word, and the word sense relationship information.

2. The text labeling method according to claim 1, characterized in that, after identifying the entity words in the vocabulary information, the method further comprises:
generating text positioning information for each entity word according to the position of the entity word in the text data.

3. The text labeling method according to claim 2, characterized in that, after determining the word sense labeling information of each entity word by semantic recognition according to the entity word in combination with the context text of the entity word, the method further comprises:
determining, according to the text positioning information of each entity word, a first label display area corresponding to that entity word, so that the word sense labeling information of the entity word is displayed in the first label display area.

4. The text labeling method according to claim 2 or 3, characterized in that, after generating the word sense association labeling information according to the first entity word, the second entity word, and the word sense relationship information, the method further comprises:
determining a second label display area according to the text positioning information of the first entity word and the second entity word, so that the word sense association labeling information is displayed in the second label display area, wherein the word sense association labeling information comprises word sense association label text and a word sense association vector graphic.

5. A text labeling device, characterized by comprising:
a text acquisition unit, configured to acquire text data to be labeled;
an entity word recognition unit, configured to perform word segmentation on the text data to obtain the vocabulary information contained in the text data, and to identify the entity words in the vocabulary information;
a word sense labeling processing unit, configured to determine word sense labeling information for each entity word by semantic recognition, according to the entity word in combination with keywords from its context;
an associated entity word recognition unit, configured to screen out, from the entity words, a second entity word and word sense relationship information between a first entity word and the second entity word, according to the word sense labeling information of the first entity word in combination with a preset word sense association relationship, wherein the second entity word is an entity word whose word sense labeling information is associated with the word sense labeling information of the first entity word;
a word sense association labeling information generating unit, configured to generate word sense association labeling information according to the first entity word, the second entity word, and the word sense relationship information.

6. The text labeling device according to claim 5, characterized by further comprising:
an entity word positioning unit, configured to generate text positioning information for each entity word according to the position of the entity word in the text data.

7. The text labeling device according to claim 6, characterized by further comprising:
a word sense labeling display unit, configured to determine, according to the text positioning information of each entity word, a first label display area corresponding to that entity word, so that the word sense labeling information of the entity word is displayed in the first label display area.

8. The text labeling device according to claim 7, characterized by further comprising:
a word sense association labeling display unit, configured to determine a second label display area according to the text positioning information of the first entity word and the second entity word, so that the word sense association labeling information is displayed in the second label display area, wherein the word sense association labeling information comprises word sense association label text and a word sense association vector graphic.

9. A text labeling terminal, characterized by comprising a memory and a processor;
the memory is configured to store program code corresponding to the text labeling method according to any one of claims 1 to 4 of the present application;
the processor is configured to execute the program code.

10. A storage medium, characterized in that the storage medium stores program code corresponding to the text labeling method according to any one of claims 1 to 4 of the present application.
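
The steps of claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the entity lexicon `ENTITY_SENSES`, the relation table `SENSE_RELATIONS`, and all class and function names are hypothetical, a regular-expression tokenizer stands in for Chinese word segmentation, and a dictionary lookup stands in for the context-based semantic recognition that the claims describe.

```python
import re
from dataclasses import dataclass

# Hypothetical entity lexicon: surface form -> word sense label.
# Assumption: a real system would infer the label from context keywords.
ENTITY_SENSES = {
    "aspirin": "drug",
    "headache": "symptom",
    "fever": "symptom",
}

# Hypothetical preset word sense associations:
# (sense of first entity, sense of second entity) -> relationship.
SENSE_RELATIONS = {
    ("drug", "symptom"): "treats",
}


@dataclass(frozen=True)
class Entity:
    word: str
    start: int  # offset of the first character (text positioning information)
    end: int    # offset one past the last character
    sense: str  # word sense labeling information


@dataclass(frozen=True)
class Association:
    first: Entity
    second: Entity
    relation: str  # word sense relationship information


def find_entities(text):
    """Split the text into words, keep the entity words, record their positions."""
    entities = []
    for match in re.finditer(r"\w+", text):
        sense = ENTITY_SENSES.get(match.group().lower())
        if sense is not None:
            entities.append(Entity(match.group(), match.start(), match.end(), sense))
    return entities


def associate(entities):
    """Pair each first entity with every other entity whose sense is related to it."""
    associations = []
    for first in entities:
        for second in entities:
            if first is second:
                continue
            relation = SENSE_RELATIONS.get((first.sense, second.sense))
            if relation is not None:
                associations.append(Association(first, second, relation))
    return associations


def annotate(text):
    """Run the whole pipeline and return (entities, associations)."""
    entities = find_entities(text)
    return entities, associate(entities)
```

Calling `annotate("Aspirin relieves headache and fever.")` returns three entities with their character offsets and two associations that link "Aspirin" to "headache" and to "fever". Because every entity carries its own offsets, a display layer like the one in claims 3 and 4 could place each label, and a connector between two related words, without rescanning the text, however far apart the related words are.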
Application CN202011510630.5A | Priority date 2020-12-18 | Filing date 2020-12-18 | Text labeling method, text labeling device, text labeling terminal and storage medium | Pending | CN112560408A (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN202011510630.5A | CN112560408A (en) | 2020-12-18 | 2020-12-18 | Text labeling method, text labeling device, text labeling terminal and storage medium

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN202011510630.5A | CN112560408A (en) | 2020-12-18 | 2020-12-18 | Text labeling method, text labeling device, text labeling terminal and storage medium

Publications (1)

Publication Number | Publication Date
CN112560408A (en) | 2021-03-26

Family

ID=75030479

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202011510630.5A | Pending | CN112560408A (en) | 2020-12-18 | 2020-12-18 | Text labeling method, text labeling device, text labeling terminal and storage medium

Country Status (1)

Country | Link
CN (1) | CN112560408A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113343709A (en)* | 2021-06-22 | 2021-09-03 | 北京三快在线科技有限公司 | Method for training intention recognition model, method, device and equipment for intention recognition

Citations (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103678281A (en)* | 2013-12-31 | 2014-03-26 | 北京百度网讯科技有限公司 | Method and device for automatically labeling text
CN104933164A (en)* | 2015-06-26 | 2015-09-23 | 华南理工大学 | Method for extracting relations among named entities in Internet massive data and system thereof
CN106372060A (en)* | 2016-08-31 | 2017-02-01 | 北京百度网讯科技有限公司 | Search text labeling method and device
CN107515848A (en)* | 2017-10-12 | 2017-12-26 | 刘啸旻 | The bilingual mark and composition method of books or electronic document
CN107526722A (en)* | 2017-07-31 | 2017-12-29 | 努比亚技术有限公司 | A kind of character relation analysis method and terminal
CN107657063A (en)* | 2017-10-30 | 2018-02-02 | 合肥工业大学 | The construction method and device of medical knowledge collection of illustrative plates
CN109492115A (en)* | 2018-11-23 | 2019-03-19 | 深圳市元征科技股份有限公司 | A kind of Automobile Service knowledge physical network construction method, device and equipment
CN109885251A (en)* | 2017-03-27 | 2019-06-14 | 三角兽(北京)科技有限公司 | Information processing unit, information processing method and storage medium
CN110377743A (en)* | 2019-07-25 | 2019-10-25 | 北京明略软件系统有限公司 | A kind of text marking method and device
CN111859968A (en)* | 2020-06-15 | 2020-10-30 | 深圳航天科创实业有限公司 | A text structuring method, text structuring device and terminal device
CN111859857A (en)* | 2020-06-30 | 2020-10-30 | 上海森亿医疗科技有限公司 | Method, system, device and medium for generating training data set based on labeled text

Similar Documents

Publication | Publication Date | Title
CN108255857B (en) | - | Statement detection method and device
CN107832662B (en) | - | Method and system for acquiring image annotation data
CN110929015B (en) | - | Multi-text analysis method and device
CN112528030A (en) | - | Semi-supervised learning method and system for text classification
CN113297379A (en) | - | Text data multi-label classification method and device
CN113205814A (en) | - | Voice data labeling method and device, electronic equipment and storage medium
CN108121715B (en) | - | Character labeling method and character labeling device
CN109766550A (en) | - | A kind of text brand identification method, identification device and storage medium
CN111462752A (en) | - | Client intention identification method based on attention mechanism, feature embedding and BI-LSTM
CN104598510A (en) | - | Event trigger word recognition method and device
CN116030295A (en) | - | Item identification method, device, electronic device and storage medium
CN112613367A (en) | - | Bill information text box acquisition method, system, equipment and storage medium
WO2023231380A1 (en) | - | Electrode plate defect recognition method and apparatus, and electrode plate defect recognition model training method and apparatus, and electronic device
CN114240672A (en) | - | Method for identifying green asset proportion and related product
CN114529933A (en) | - | Contract data difference comparison method, device, equipment and medium
CN119445203A (en) | - | Image label understanding method, device, electronic device and storage medium
CN110096574B (en) | - | Scheme for establishing and subsequently optimizing and expanding data set in E-commerce comment classification task
CN109033078B (en) | - | Sentence category recognition method and device, storage medium, processor
CN112560408A (en) | - | Text labeling method, text labeling device, text labeling terminal and storage medium
CN120032380A (en) | - | Commodity price tag identification method, device, medium, processor and program product
CN111126038A (en) | - | Information acquisition model generation method and device and information acquisition method and device
CN112860860A (en) | - | Method and device for answering questions
CN114399699B (en) | - | Method, device, electronic device and storage medium for determining target recommendation object
CN116894192A (en) | - | Large model training method, and related method, device, equipment, system and medium
CN116166889A (en) | - | Hotel product screening method, device, equipment and storage medium

Legal Events

Date | Code | Title | Description
- | PB01 | Publication | -
- | SE01 | Entry into force of request for substantive examination | -
- | RJ01 | Rejection of invention patent application after publication | Application publication date: 2021-03-26

