CN114385812B

Info

Publication number: CN114385812B
Application number: CN202111598713.9A
Authority: CN
Inventors: 杨一帆; 李茂龙; 施淼元
Original assignee: Sipic Technology Co Ltd
Current assignee: Sipic Technology Co Ltd
Priority date: 2021-12-24
Filing date: 2021-12-24
Publication date: 2025-02-07
Anticipated expiration: 2041-12-24
Also published as: CN114385812A

Abstract

Translated from Chinese

An embodiment of the present invention provides a relation extraction method for text. The method includes: encoding the text with BERT, and replicating the encoding result along its sequence dimension to raise its dimensionality, yielding a matrix based on the arrangement of text segments; performing multi-label classification on the character string corresponding to each coordinate of the matrix to obtain a label set for each coordinate; and traversing all head labels, tail labels, and entity labels in the label sets of the coordinates, pairing them by handshaking-style tagging to extract relations, and determining at least one triple of a restricted relation or an open relation, the triple representing a relation between entities in the text. An embodiment of the present invention also provides a relation extraction system for text. By using a predefined label set together with pairing-based relation extraction, the embodiments support both restricted relation extraction and open relation extraction. They can substantially reduce the loss of encoded information and accurately represent multi-dimensional relation triples that do not conflict with one another.
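The two structural steps named in the abstract, building the segment matrix and pairing labels into triples, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function names `span_matrix` and `decode_triples`, the label strings `ENT`, `<relation>:HEAD` and `<relation>:TAIL`, and the convention that a head label sits at (subject start, object start) and a tail label at (subject end, object end) are all hypothetical. The BERT encoder and the multi-label classifier are omitted, so the label sets are supplied by hand, and only restricted relations are covered.

```python
import numpy as np


def span_matrix(encoded):
    """Copy an (n, d) token encoding along the sequence axis to get (n, n, 2d).

    Cell (i, j) holds the encoding of token i concatenated with that of token j,
    so every coordinate corresponds to one pair of positions in the text.
    """
    n, d = encoded.shape
    rows = np.broadcast_to(encoded[:, None, :], (n, n, d))  # token i repeated across columns
    cols = np.broadcast_to(encoded[None, :, :], (n, n, d))  # token j repeated across rows
    return np.concatenate([rows, cols], axis=-1)


def decode_triples(tokens, labels):
    """Pair entity, head and tail labels into (subject, relation, object) triples.

    `labels` maps a coordinate (row, col) to a set of label strings.
    Assumed (hypothetical) scheme:
      "ENT"        at (start, end)                -> tokens[start..end] is an entity
      "<rel>:HEAD" at (subj_start, obj_start)     -> first tokens of subject and object
      "<rel>:TAIL" at (subj_end, obj_end)         -> last tokens of subject and object
    """
    entities = sorted(pos for pos, tags in labels.items() if "ENT" in tags)
    relations = sorted(
        {tag[: -len(":HEAD")] for tags in labels.values() for tag in tags if tag.endswith(":HEAD")}
    )
    triples = []
    for s_start, s_end in entities:
        for o_start, o_end in entities:
            if (s_start, s_end) == (o_start, o_end):
                continue  # an entity is not paired with itself
            for rel in relations:
                head_ok = f"{rel}:HEAD" in labels.get((s_start, o_start), set())
                tail_ok = f"{rel}:TAIL" in labels.get((s_end, o_end), set())
                if head_ok and tail_ok:  # both "hands" agree, so emit the triple
                    subject = " ".join(tokens[s_start : s_end + 1])
                    obj = " ".join(tokens[o_start : o_end + 1])
                    triples.append((subject, rel, obj))
    return triples


# Hand-written example; in practice the label sets would come from the classifier.
example_tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
example_labels = {
    (0, 1): {"ENT"},           # "Marie Curie"
    (5, 5): {"ENT"},           # "Warsaw"
    (0, 5): {"born_in:HEAD"},  # subject starts at 0, object starts at 5
    (1, 5): {"born_in:TAIL"},  # subject ends at 1, object ends at 5
}
example_triples = decode_triples(example_tokens, example_labels)
```

Because each coordinate carries a set of labels rather than a single class, one cell can take part in several relations at once, which is one plausible reading of how the triples avoid conflicting with one another.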