Detailed Description
The embodiments of the application provide a training method, a training apparatus, training equipment and a storage medium for a similar text matching model, which shorten the distance between a positive example sample and a similar sample and push away the distance between the positive example sample and a heterogeneous sample through a triplet loss function, so that similar text vectors form clusters in the feature space, the learning capability of the similar text matching model with respect to the similarity between text vectors is improved, the similar text matching model is better fitted, and the recall rate of the target similar text matching model for similar text is further improved.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "includes" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
With the rapid development of information technology, cloud technology is gradually entering every aspect of people's lives. Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, application technology and the like based on the cloud computing business model; it can form a resource pool to be used on demand, flexibly and conveniently, and cloud computing technology will become an important support for this. Background services of technical networking systems, such as video websites, picture websites and other portals, require a large amount of computing and storage resources. With the continued development of the internet industry, each article may in the future carry its own identification mark, which needs to be transmitted to a background system for logic processing; data of different levels will be processed separately, and all kinds of industry data require strong back-end system support, which can only be realized through cloud computing.
Cloud security refers to a general term for security software, hardware, users, institutions, and security cloud platforms based on the cloud computing business model. Cloud security fuses emerging technologies and concepts such as parallel processing, grid computing, and unknown-virus behavior judgment; it acquires the latest information on Trojans and malicious programs on the Internet through the abnormal-behavior monitoring of a large number of network clients, sends this information to the server for automatic analysis and processing, and then distributes solutions for the viruses and Trojans to each client. The training method of the similar text matching model provided by the embodiments of the application can be realized through cloud computing technology and cloud security technology.
It should be understood that the training method of the similar text matching model provided by the application can be applied to fields such as cloud technology, artificial intelligence, and intelligent traffic, and is used to complete the pushing or delivery of similar text to a target object through similar text matching. For example, better-matched advertisements can be recommended to the target object through similar text matching on advertisement text; better-matched commodities can be recommended to the target object through similar text matching on commodity text; and better-matched books or documents can be recommended to the target object through similar text matching on book text. In the above scenarios, similar text matching is usually realized by a judgment model based on deep learning, but a large number of manually labeled training samples are needed for the model to fit and generalize well. However, text semantics are rich, and the standards of different labeling personnel regarding text similarity differ greatly and are very difficult to unify; this brings great difficulty to labeling the training samples needed for training, optimizing and iterating the model, so that the model is difficult to fit well and its recall rate for similar text is low.
In order to solve the above-mentioned problems, the present application provides a training method of a similar text matching model, which is applied to the text data control system shown in fig. 1. Referring to fig. 1, fig. 1 is a schematic diagram of the structure of the text data control system in an embodiment of the present application. As shown in fig. 1, the server inputs a first batch of positive example samples and a first batch of negative example samples in a first batch sample set to an original similar text matching model respectively to perform a vector conversion operation, so as to obtain first batch positive example sentence vectors and first batch negative example sentence vectors; performs a triplet construction operation on the first batch positive example sentence vectors, so as to obtain a plurality of first batch triples; further performs a loss calculation operation on the plurality of first batch triples, so as to obtain a first batch loss function corresponding to the first batch sample set; performs a parameter adjustment operation on the original similar text matching model according to the first batch loss function, so as to obtain an intermediate similar text matching model; and then, based on the intermediate similar text matching model, repeatedly obtains a second batch sample set corresponding to the target scene and performs the vector conversion operation, triplet construction operation, loss calculation operation and parameter adjustment operation, so as to obtain a target similar text matching model. Through the above method, a triplet loss function can be obtained by constructing triples from the first batch positive example sentence vectors and the first batch negative example sentence vectors; through the triplet loss function, the distance between a positive example sample and a similar sample can be shortened and the distance between the positive example sample and a heterogeneous sample can be pushed away, so that similar text vectors form clusters in the feature space, the learning capability of the similar text matching model with respect to the similarity between text vectors is improved, the similar text matching model is better fitted, and the recall rate of the target similar text matching model for similar text is improved.
It should be understood that only one terminal device is shown in fig. 1; in an actual scenario, a greater variety of terminal devices may participate in the data processing process, including but not limited to mobile phones, computers, intelligent voice interaction devices, intelligent home appliances, vehicle-mounted terminals, etc.; the specific number and variety are determined by the actual scenario and are not limited herein. In addition, one server is shown in fig. 1, but in an actual scenario there may also be a plurality of servers involved, especially in a scenario of multi-model training interaction; the number of servers depends on the actual scenario, and the present application is not limited thereto.
It should be noted that in this embodiment, the server may be an independent physical server, or may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content distribution network (content delivery network, CDN), and basic cloud computing services such as big data and an artificial intelligence platform. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the terminal device and the server may be connected to form a blockchain network, which is not limited herein.
In order to solve the above-mentioned problems, the present application proposes a training method of a similar text matching model, which is generally executed by a server or a terminal device, and accordingly, a training apparatus applied to the similar text matching model is generally provided in the server or the terminal device.
It is to be understood that the training method, apparatus, device and storage medium of the similar text matching model disclosed in the present application, wherein a plurality of servers or terminal devices may be formed into a blockchain, and the servers or terminal devices are nodes on the blockchain. In practical applications, data sharing between nodes may be required in a blockchain, and text data or the like may be stored on each node.
Referring to fig. 2 and fig. 6, an embodiment of a training method for a similar text matching model in an embodiment of the present application includes:
In step S101, a first batch sample set corresponding to a target scene is obtained, where the first batch sample set includes a first batch positive sample and a first batch negative sample;
In this embodiment, before a matching request or a retrieval request sent by a target terminal device is obtained, that is, before a text to be matched or a text to be retrieved is obtained, a first batch sample set corresponding to a target scene may be obtained, so that the original similar text matching model can be trained with the first batch of positive example samples and the first batch of negative example samples in the first batch sample set, so as to optimize the original similar text matching model.
Specifically, before the text to be matched or the text to be retrieved is obtained, the target scene may specifically be an advertisement scene, a news information scene, a book management scene, or another target scene, which is not specifically limited herein. Further, a first batch sample set corresponding to the target scene may be obtained, where the first batch sample set may be one batch of sample data, containing a positive example sample set and a negative example sample set, randomly extracted from the overall sample set; the first batch of positive example samples are, for example, advertisement texts under a known advertisement scene, and the first batch of negative example samples are text data obtained by ES matching with the first batch of positive example samples, where, as shown in table 1, one positive example sample may correspond to one or more negative example samples.
TABLE 1
For example, a first batch sample set may be a batch of 128 pieces of sample data, which may include 10 positive example samples, i.e. the first batch of positive example samples, and 118 pieces of text data that are ES-matched with the 10 positive example samples, i.e. the first batch of negative example samples.
In step S102, the first batch of positive examples and the first batch of negative examples are respectively input to the original similar text matching model for vector conversion operation, so as to obtain a first batch of positive examples and a first batch of negative examples;
In this embodiment, after the first batch of positive example samples and the first batch of negative example samples are obtained, they may be respectively input to the original similar text matching model for vector conversion, so as to obtain first batch positive example sentence vectors and first batch negative example sentence vectors; in this way, the distance between each positive example sample and each negative example sample can be better calculated from the sentence vectors, and the similarity between the positive example samples and the negative example samples can be better represented by this distance.
Specifically, as shown in fig. 6, the original similar text matching model may specifically be a Bert model combined with a plurality of fully connected layers and a pooling layer; other text processing models may also be used, which are not particularly limited herein. After the first batch of positive example samples and the first batch of negative example samples are obtained, as shown in fig. 6, they may be respectively input into the original similar text matching model to perform vector conversion: for example, the first batch of positive example samples and the first batch of negative example samples are each encoded by the Bert model, so that at least two word vectors corresponding to each positive example sample and at least two word vectors corresponding to each negative example sample are obtained; these word vectors are then respectively passed through the plurality of fully connected layers and the pooling layer, so that the first batch positive example sentence vectors and the first batch negative example sentence vectors are obtained.
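By way of a non-limiting sketch of the vector conversion operation described above, the following Python code encodes a batch of texts into sentence vectors; it assumes the Hugging Face transformers and PyTorch libraries, a mean-pooling strategy, a single projection layer, and the bert-base-chinese checkpoint, none of which are fixed by this embodiment:

    import torch
    from transformers import BertModel, BertTokenizer

    class SentenceEncoder(torch.nn.Module):
        # Hypothetical encoder: Bert, a fully connected projection, then mean pooling.
        def __init__(self, dim=128):
            super().__init__()
            self.bert = BertModel.from_pretrained("bert-base-chinese")
            self.fc = torch.nn.Linear(self.bert.config.hidden_size, dim)

        def forward(self, input_ids, attention_mask):
            hidden = self.bert(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
            hidden = self.fc(hidden)                      # project each word vector
            mask = attention_mask.unsqueeze(-1).float()
            return (hidden * mask).sum(1) / mask.sum(1)   # pool word vectors into one sentence vector

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    encoder = SentenceEncoder()
    batch = tokenizer(["a positive example text", "a negative example text"],
                      padding=True, truncation=True, return_tensors="pt")
    sentence_vectors = encoder(batch["input_ids"], batch["attention_mask"])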
In step S103, performing a triplet construction operation on the first lot normal sentence vector to obtain a plurality of first lot triples, where each first lot triplet includes a first lot normal sentence vector, a first lot similar sentence vector, and a first lot heterogeneous sentence vector, and the first lot similar sentence vector and the first lot heterogeneous sentence vector are derived from the first lot negative example sentence vector;
In this embodiment, after the first lot of positive example sentence vectors and the first lot of negative example sentence vectors are obtained, for each first lot of positive example sentence vectors, one first lot of positive example sentence vector, one first lot of similar sentence vector and one first lot of heterogeneous sentence vector may be randomly combined into one triplet, where one first lot of positive example sentence vector may correspond to one or more triples, so that a plurality of first lot triples may be obtained.
Here, a first batch similar sentence vector can be understood as the sentence vector of a negative example sample with a relatively high similarity to the first batch positive example sentence vector, specifically a negative example sample whose sample matching score is greater than 0.5; a first batch heterogeneous sentence vector can be understood as the sentence vector of a negative example sample with a relatively low similarity to the first batch positive example sentence vector, specifically a negative example sample whose sample matching score is less than 0.5.
Specifically, after the first batch positive example sentence vectors and the first batch negative example sentence vectors are obtained, for example 10 first batch positive example samples and 118 first batch negative example samples, suppose that for one first batch positive example sample there are 3 first batch negative example samples matched with it through ES, whose corresponding sentence vectors comprise 2 first batch similar sentence vectors and 1 first batch heterogeneous sentence vector; then one first batch positive example sentence vector, one randomly extracted first batch similar sentence vector and one randomly extracted first batch heterogeneous sentence vector form one triplet, so that 2 triples corresponding to this first batch positive example sample can be obtained.
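A minimal Python sketch of this triplet construction operation is given below; the 0.5 score threshold follows the description above, while the data layout (a list of (vector, score) pairs per positive example sample) is an assumption made for illustration:

    import random

    def build_triplets(pos_vec, matched_negatives, num_triplets=2):
        # matched_negatives: (sentence_vector, sample_matching_score) pairs
        # that were ES-matched to this positive example sample.
        similar = [v for v, s in matched_negatives if s > 0.5]    # similar sentence vectors
        hetero = [v for v, s in matched_negatives if s <= 0.5]    # heterogeneous sentence vectors
        triplets = []
        for _ in range(num_triplets):
            if similar and hetero:   # one triplet = (positive, similar, heterogeneous)
                triplets.append((pos_vec, random.choice(similar), random.choice(hetero)))
        return triplets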
In step S104, performing a loss calculation operation on the first lot triples to obtain a first lot loss function corresponding to the first lot sample set;
In this embodiment, after the plurality of first batch triples are obtained, a loss may be calculated for each triplet, and the obtained plurality of losses may then be integrated into one loss function, that is, the first batch loss function corresponding to the first batch sample set. In this way, the triplet loss function can be constructed based on the triples, the distances between the positive example samples and the similar samples can be shortened, and the distances between the positive example samples and the heterogeneous samples can be pushed away, so that similar text vectors can form clusters in the feature space, thereby achieving the purpose of text matching.
Further, the similarity between text sentences can be regressed through the triplet loss function, so that the similarity between the embedding vectors (Embedding), i.e. the sentence vectors, learned by the original similar text matching model is as close as possible to the normalized matching score.
The first batch loss function may be expressed as a triplet loss function, which may be specifically shown as follows:
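The formula itself is not reproduced in this text. One plausible form, consistent with the terms defined below and with the stated goals of regressing the normalized matching score and pushing heterogeneous samples away (this reconstruction is an assumption, not the original formula), is:

    L = ∑_{(a,p,n)} [ |d(a,p) − (1 − ES(a,p))| + max(0, d(a,p) − d(a,n) + m) ]

where m is a margin hyperparameter (also an assumption), d(x, y) = 1 − cosine(x, y), and the sum runs over all first batch triples.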
Wherein L is the first batch loss function; ES(a, p) represents the normalized matching score between the positive example sample a corresponding to the positive example sentence vector and the negative example sample p corresponding to the similar sentence vector; n denotes the negative example sample corresponding to the heterogeneous sentence vector; and d(·,·) denotes the cosine distance between two sentence vectors, i.e., d(a, p) = 1 − cosine(a, p).
In step S105, according to the first batch loss function, performing parameter adjustment operation on the original similar text matching model to obtain an intermediate similar text matching model;
Specifically, after the first batch loss function is obtained, a parameter adjustment operation may be performed on the original similar text matching model; specifically, a reverse gradient descent algorithm may be adopted to update the model parameters in Bert until convergence, so that an intermediate similar text matching model is obtained.
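A hedged Python sketch of one such parameter adjustment step is shown below; it reuses the SentenceEncoder sketched earlier, and the Adam optimizer, the margin value and the loss form are assumptions standing in for the reverse gradient descent algorithm and the triplet loss of step S104:

    import torch
    import torch.nn.functional as F

    optimizer = torch.optim.Adam(encoder.parameters(), lr=2e-5)  # optimizer and lr are assumptions

    def train_step(a, p, n, es_score, margin=0.2):
        # a, p, n: batches of positive / similar / heterogeneous sentence vectors
        # produced by the encoder, so gradients flow back into Bert.
        d_ap = 1 - F.cosine_similarity(a, p)          # cosine distance to the similar sample
        d_an = 1 - F.cosine_similarity(a, n)          # cosine distance to the heterogeneous sample
        loss = (d_ap - (1 - es_score)).abs().mean() \
               + F.relu(d_ap - d_an + margin).mean()  # hedged form of the triplet loss
        optimizer.zero_grad()
        loss.backward()                               # reverse gradient propagation
        optimizer.step()                              # update the model parameters
        return loss.item()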
In step S106, based on the intermediate similar text matching model, a second batch of sample sets corresponding to the target scene is repeatedly acquired, and a vector conversion operation, a triplet construction operation, a loss calculation operation, and a parameter adjustment operation are performed, so as to obtain the target similar text matching model.
In this embodiment, after the intermediate similar text matching model is obtained, a second batch of sample sets corresponding to the target scene may be repeatedly obtained, and based on the obtained second batch of sample sets, the vector conversion operation, the triplet construction operation, the loss calculation operation, and the parameter adjustment operation similar to those of steps S102 to S105 may be repeatedly performed until model parameters of the intermediate similar text matching model tend to be stable, and the intermediate similar text matching model may be used as the target similar text matching model.
According to the training method of the similar text matching model described above, triples can be constructed from the first batch positive example sentence vectors and the first batch negative example sentence vectors to obtain the triplet loss function; through the triplet loss function, the distances between the positive example samples and the similar samples can be shortened and the distances between the positive example samples and the heterogeneous samples pushed away, so that similar text vectors form clusters in the feature space, the learning capability of the similar text matching model with respect to the similarity between text vectors is improved, the similar text matching model is better fitted, and the recall rate of the target similar text matching model for similar text is improved.
Optionally, on the basis of the embodiment corresponding to fig. 2, in another optional embodiment of the training method for a similar text matching model provided by the embodiment of the present application, obtaining a first batch of sample sets corresponding to a target scene includes:
Acquiring a target text dataset corresponding to a target scene, wherein the target text dataset at least comprises the first batch of positive example samples and source text data corresponding to the target scene;
Retrieving N first matching texts corresponding to the first batch of positive examples from the target text data set as N first batch of negative examples, wherein N is an integer greater than 1;
Calculating matching scores between the first batch of positive examples and each first batch of negative examples to obtain N first matching scores;
respectively carrying out normalization operation on the N first matching scores to obtain N sample matching scores;
and constructing a first batch sample set according to the first batch positive example sample, the first batch negative example sample and the sample matching score.
In this embodiment, as shown in fig. 5, before model training, a corresponding target text dataset may be obtained according to the target scene, N first matching texts corresponding to the first batch of positive example samples may be retrieved from the target text dataset as the first batch of negative example samples, and normalization operations may then be performed on the N first matching scores respectively to obtain N sample matching scores, thereby obtaining the first batch sample set corresponding to the target scene. The target text dataset may be constructed through a search engine (Elasticsearch, ES), where ES is a text search engine supporting efficient retrieval and multiple scoring strategies. Self-supervised training samples, such as the first batch sample set, can thus be obtained from the target text dataset without spending a lot of time formulating text similarity standards and without tedious manual labeling; moreover, the target text dataset can be replaced and adjusted according to different target scenes and requirements, so that a sample set suitable for the target scene can be obtained better and more accurately, and the construction of the sample set is more flexible.
Specifically, before the target text dataset corresponding to the target scene is obtained, a retrieval library corresponding to the target scene may be established through the search engine, that is, source text data corresponding to the target scene is obtained. The source text data may specifically be advertisement texts, descriptions or commodity texts under an advertisement scene, or other text data, which is not specifically limited herein; it may be obtained by acquiring initial text data corresponding to the target scene through the search engine. Further, since the initial text data, such as advertisement texts, contain many texts with the same meaning, for example texts differing only in punctuation or in individual characters, the initial text data may be deduplicated in order to enhance its diversity as much as possible: for example, the edit distance between every two texts is calculated, and if the edit distance between two texts is smaller than a preset distance threshold, the two texts are regarded as having the same meaning and the shorter of the two is filtered out. The filtered texts are then stored in the ES as the source text data, where the ES may use a word segmenter (for example an IK analyzer) to extract word units from each text so as to establish an index.
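The deduplication by edit distance can be sketched in Python as follows; the threshold value and the quadratic pairwise scan are illustrative assumptions, not requirements of this embodiment:

    def edit_distance(s1, s2):
        # Classic dynamic-programming Levenshtein distance with a rolling row.
        row = list(range(len(s2) + 1))
        for i, c1 in enumerate(s1, 1):
            prev, row[0] = row[0], i
            for j, c2 in enumerate(s2, 1):
                prev, row[j] = row[j], min(row[j] + 1,        # deletion
                                           row[j - 1] + 1,    # insertion
                                           prev + (c1 != c2)) # substitution
        return row[-1]

    def deduplicate(texts, threshold=2):
        kept = []
        for t in texts:  # keep a text only if it differs enough from every kept text
            if all(edit_distance(t, k) >= threshold for k in kept):
                kept.append(t)
        return kept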
Further, as shown in fig. 5, after the source text data are acquired, positive example samples (queries) collected according to the target scene may be obtained and combined with the acquired source text data to form the target text dataset. For example, assuming that six thousand positive example samples in the advertisement scene are collected, they may be combined with two million pieces of source text data in the advertisement scene to form two million and six thousand pieces of target text data; then, when a piece of target text data is stored by the ES, the ES may extract word units from it using the word segmenter to establish an index of the target text data.
Further, as shown in fig. 5, retrieval may be performed in the target text data for each positive example sample, for example each of the six thousand positive example samples is retrieved among the two million and six thousand pieces of target text data. During retrieval, ES matching may be performed, specifically one or more of a bag-of-words based keyword matching method or a sentence matching method that generates a sentence vector from word vectors. For example, as shown in fig. 3, assuming one positive example sample is "Beijing Tiananmen", a set of word units such as "Beijing" and "Tiananmen" may be obtained after passing through the word segmenter; if one piece of source text data is "I love Beijing Tiananmen", the word units "I", "love", "Beijing" and "Tiananmen" are obtained after word segmentation, and based on bag-of-words keyword matching, i.e. on the hit word units "Beijing" and "Tiananmen", the relevance score between the positive example sample and the source text data may be calculated. The matching scores so obtained may then be ranked from high to low, and the N top-ranked texts may be taken as the N first matching texts, i.e. the N first batch negative example samples, together with their N first matching scores.
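As an illustration of retrieving the N first matching texts, a hedged sketch using the official elasticsearch Python client follows; the index name, the field name and the client version (8.x keyword-argument style) are assumptions:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # address is an assumption

    def retrieve_first_matching_texts(positive_sample, n, index="target_text_data"):
        # Full-text match query; ES ranks hits by its relevance score.
        resp = es.search(index=index,
                         query={"match": {"content": positive_sample}},
                         size=n)
        return [(hit["_source"]["content"], hit["_score"])  # (negative sample, matching score)
                for hit in resp["hits"]["hits"]]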
For example, as shown in fig. 4, assuming that a positive example sample is an advertisement copy for an electric screwdriver, for example one reciting 250 revolutions per minute, drilling into 47 kinds of material and fitting 500 kinds of screws, ES matching may be performed between the positive example sample and the target text data containing the positive example sample. If the text matched with the positive example sample is the positive example sample itself, the maximum matching score "133.11464" may be obtained; the positive example sample may also be matched with a similar piece of source text data, such as an advertisement copy for an electric drill reciting 2500 revolutions per minute and drilling through in 3 seconds at most, so as to obtain a matching score of "105.32658"; or the positive example sample may be matched with another similar piece of source text data, so as to obtain a matching score of "102.89305".
Further, after the N first matching scores are obtained, note that a matching score has no fixed range; for example, the highest matching score may vary from tens to hundreds across different positive example samples, and if the raw matching scores were directly fitted by the deep learning model, the model training process would be difficult to converge. Therefore, as shown in fig. 5, each matching score may be normalized to obtain a sample matching score, which ensures comparability between the matching scores by bringing them to the same order of magnitude, so that they can be better fitted by the deep learning model and the model training process converges more easily. A first batch sample set may then be constructed from the first batch of positive example samples, the first batch of negative example samples, and the sample matching scores. The normalization of each matching score may specifically be obtained by the following formula (1):
es_normed(q, d) = score(q, d) / max_{i ∈ [1, k]} score(q, d_i)  (1)

where q is a positive example sample, d is a piece of source text data, k is the total number of pieces of target text data, es_normed(q, d) is the sample matching score, score(q, d) is the matching score, and the maximum matching score max_{i ∈ [1, k]} score(q, d_i) is obtained by matching the positive example sample with itself, since the positive example sample matches itself best.
Wherein score (q, d) can be obtained by the following formula (2):
score(q, d) = coord(q, d) * queryNorm(q) * ∑_{t∈q} ( tf(t∈d) * idf(t)² * boost(t) * norm(t, d) )  (2)
Wherein coord(q, d) is the coordination factor, queryNorm(q) is the query normalization factor, tf(t∈d) is the term frequency of term t in document d, idf(t)² is the squared inverse document frequency, boost(t) is the term weight, and norm(t, d) is the length norm.
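The normalization of formula (1) can be sketched in a few lines of Python; the numeric values reuse the matching scores from the example above:

    def normalize_scores(raw_scores, self_match_score):
        # self_match_score: score(q, q), the maximum score obtained by
        # matching the positive example sample against itself (formula (1)).
        return [s / self_match_score for s in raw_scores]

    # normalize_scores([105.32658, 102.89305], 133.11464) ≈ [0.791, 0.773]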
Optionally, in another optional embodiment of the training method for a similar text matching model provided by the embodiment of the present application based on the embodiment corresponding to fig. 2, performing a triplet construction operation on the first batch of normal sentence vectors to obtain a plurality of first batch triples, including:
Dividing the first batch of negative example sentence vectors according to the sample matching score to obtain a similar sentence vector set and a heterogeneous sentence vector set;
extracting any similar sentence vector from the similar sentence vector set to obtain similar sentence vectors of a first batch;
any heterogeneous sentence vector is extracted from the heterogeneous sentence vector set, and a first batch of heterogeneous sentence vectors are obtained.
In this embodiment, after the first lot of positive example sentence vectors and the first lot of negative example sentence vectors are obtained, the first lot of negative example sentence vectors may be divided into a similar sentence vector set and a heterogeneous sentence vector set according to the sample matching score of the first lot of sample sets, then any similar sentence vector may be extracted from the similar sentence vector set to obtain the first lot of similar sentence vectors, and any heterogeneous sentence vector may be extracted from the heterogeneous sentence vector set to obtain the first lot of heterogeneous sentence vectors, so that a triplet may be constructed for the first lot of positive example sentence vectors based on the first lot of similar sentence vectors and the first lot of heterogeneous sentence vectors, so that the distance between the similar text and the dissimilar text may be better represented by the triplet.
Specifically, the dividing operation is performed on the first batch of negative example sentence vectors according to the sample matching score, which may specifically be that sentence vectors with the sample matching score greater than 0.5 are used as a similar sentence vector set, sentence vectors with the sample matching score less than 0.5 are used as a different sentence vector set, then any similar sentence vector can be randomly extracted from the similar sentence vector set to obtain the first batch of similar sentence vectors, and similarly, any different sentence vector is randomly extracted from the different sentence vector set to obtain the first batch of different sentence vectors.
TABLE 2
For example, as shown in table 2, the matching scores corresponding to the negative example samples may be obtained: negative example sample 1 and negative example sample 2, both corresponding to the positive example sample of the electric screwdriver advertisement copy, are converted into negative example sentence vector 1 and negative example sentence vector 2; then, according to the sample matching score of 0.89 corresponding to negative example sentence vector 1 and the sample matching score of 0.18 corresponding to negative example sentence vector 2, negative example sentence vector 1 may be taken as a similar sentence vector and negative example sentence vector 2 as a heterogeneous sentence vector.
Optionally, in another optional embodiment of the training method for a similar text matching model provided by the embodiment of the present application based on the embodiment corresponding to fig. 2, performing a loss calculation operation on a plurality of first batch triples, to obtain a first batch loss function corresponding to a first batch sample set, including:
Respectively carrying out loss calculation operation on the first batch of normal sentence vectors, the first batch of similar sentence vectors and the first batch of heterogeneous sentence vectors to obtain loss functions corresponding to a plurality of first batch triples;
And carrying out weighted calculation operation on the loss functions corresponding to the first batch of triples to obtain the first batch of loss functions.
In this embodiment, after the plurality of first batch triples constructed on the basis of the first batch positive example sentence vectors are obtained, a loss calculation may be performed on each triplet, that is, on the first batch positive example sentence vector, the first batch similar sentence vector and the first batch heterogeneous sentence vector, so as to obtain the loss function corresponding to each first batch triplet; then, a weighted calculation may be performed on the loss functions corresponding to the plurality of first batch triples according to preset weights, so as to obtain the first batch loss function. The triplet loss function can thus be constructed based on the triples, so that the distances between the positive example samples and the similar samples are shortened and the distances between the positive example samples and the heterogeneous samples are pushed away, allowing similar text vectors to form clusters in the feature space and thereby achieving the purpose of text matching.
Specifically, after the plurality of first batch triples are obtained, the loss function may be calculated for each triplet; specifically, the first batch positive example sentence vector, the first batch similar sentence vector and the first batch heterogeneous sentence vector may be substituted into the functional expression of the triplet loss function in step S104 for calculation, so that the loss function corresponding to one triplet is obtained. The plurality of loss functions so obtained may then be integrated into one loss function; specifically, a weighted calculation operation may be performed on the loss functions corresponding to the plurality of first batch triples through preset weights, so as to obtain the first batch loss function corresponding to the first batch sample set, where the preset weights are set according to the actual application requirements and are not specifically limited herein.
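A minimal sketch of this weighted combination is shown below; the uniform default weights are an assumption, since the embodiment leaves the preset weights open:

    def first_batch_loss(triplet_losses, weights=None):
        # Combine the per-triplet losses into the first batch loss function.
        if weights is None:
            weights = [1.0 / len(triplet_losses)] * len(triplet_losses)  # uniform by default
        return sum(w * l for w, l in zip(weights, triplet_losses))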
Optionally, in another optional embodiment of the training method for a similar text matching model provided in the embodiment of the present application based on the embodiment corresponding to fig. 2, a second batch of sample sets corresponding to the target scene is repeatedly obtained based on the intermediate similar text matching model, and a vector conversion operation, a triplet configuration operation, a loss calculation operation and a parameter adjustment operation are performed to obtain the target similar text matching model, which includes:
Acquiring a second batch sample set corresponding to the target scene, and executing the vector conversion operation, the triplet construction operation and the loss calculation operation according to the second batch sample set, so as to obtain a second loss function;
And if the second loss function is smaller than the first threshold value, taking the current intermediate similar text matching model as a target similar text matching model.
In this embodiment, after the intermediate similar text matching model is obtained, the vector conversion operation, the triplet construction operation and the loss calculation operation may continue to be performed on a second batch sample set corresponding to the target scene, so as to obtain a second loss function, and reverse iterative training is performed on the intermediate similar text matching model through the second loss function. When the second loss function is smaller than the first threshold, the current intermediate similar text matching model may be taken as the target similar text matching model, so that the model can fully learn the similarity between text vectors and the target similar text matching model can be better fitted, thereby improving to a certain extent the recall rate of the target similar text matching model for similar text.
Specifically, after the second batch sample set corresponding to the target scene is obtained, where the second batch sample set refers generally to a sample set of any other batch, different from the first batch sample set, extracted from the overall sample set and may specifically be the third, fourth or N-th batch sample set, operations similar to the vector conversion operation, the triplet construction operation and the loss calculation operation in steps S102 to S104 may be performed, which are not described again herein, so as to obtain the second loss function, where the second loss function likewise refers generally to the loss function corresponding to each such batch sample set.
Further, after the second loss function is obtained, it may be understood that the smaller the second loss function, the better the fit of the model. Therefore, the second loss function may be compared with a first threshold, where the first threshold may specifically be a relatively small value, such as 0.18, and is set according to the actual application requirements without specific limitation herein. When the second loss function is smaller than the first threshold, i.e. when the second loss function is already small enough, the current intermediate similar text matching model tends to be stable and has converged, and it may then be taken as the target similar text matching model.
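A hedged sketch of this loss-threshold stopping criterion follows; step_fn stands for the combined vector conversion, triplet construction, loss calculation and parameter adjustment operations of steps S102 to S105, and the threshold value is the illustrative 0.18 from the text:

    def train_until_loss_converges(batch_sample_sets, step_fn, first_threshold=0.18):
        # batch_sample_sets: the second, third, ..., N-th batch sample sets.
        for batch in batch_sample_sets:
            loss = step_fn(batch)        # loss of this batch after the parameter update
            if loss < first_threshold:   # loss small enough: the model has converged
                return loss
        return None                      # not converged within the available batches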
Optionally, in another optional embodiment of the training method for a similar text matching model provided in the embodiment of the present application based on the embodiment corresponding to fig. 2, a second batch of sample sets corresponding to the target scene is repeatedly obtained based on the intermediate similar text matching model, and a vector conversion operation, a triplet configuration operation, a loss calculation operation and a parameter adjustment operation are performed to obtain the target similar text matching model, which includes:
obtaining intermediate model parameters of an intermediate similar text matching model;
Obtaining a second batch sample set corresponding to the target scene, and performing the vector conversion operation, triplet construction operation, loss calculation operation and parameter adjustment operation to obtain a current similar text matching model, wherein the current similar text matching model comprises current model parameters;
And if the difference value between the intermediate model parameter and the current model parameter meets a second threshold value, taking the current intermediate similar text matching model as a target similar text matching model.
In this embodiment, after the intermediate similar text matching model is obtained, the vector conversion operation, the triplet construction operation and the loss calculation operation may continue to be performed on a second batch sample set corresponding to the target scene, so as to obtain a second loss function, and reverse iterative training is performed on the intermediate similar text matching model through the second loss function. When the difference between the intermediate model parameters and the current model parameters meets a second threshold, the current intermediate similar text matching model may be taken as the target similar text matching model, so that the model can fully learn the similarity between text vectors and the target similar text matching model can be better fitted, thereby improving to a certain extent the recall rate of the target similar text matching model for similar text. Here, the second threshold may specifically be a relatively small value and is set according to the actual application requirements, without specific limitation herein.
Specifically, after the intermediate similar text matching model is obtained, the intermediate model parameters may be extracted, further, a sample set of the next batch corresponding to the target scene may be obtained, and operations similar to the vector conversion operation, the triplet construction operation and the parameter adjustment operation in the steps S102 to S105 are repeatedly executed, which are not described herein, so that the current similar text matching model can be obtained, and the model parameters of the current similar text matching model are extracted, so as to obtain the current model parameters.
Further, since the intermediate similar text matching model tends to be stable, the convergence of the intermediate similar text matching model may be represented by the stability of the model parameters, and thus, the difference between the intermediate model parameters and the current model parameters may be calculated, and if the difference between the intermediate model parameters and the current model parameters is smaller than the second threshold, it may be understood that the difference between the intermediate model parameters and the current model parameters is sufficiently small, that is, the model parameters tend to be stable, that is, the intermediate similar text matching model tends to be stable, and then the current intermediate similar text matching model may be used as the target similar text matching model.
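A hedged PyTorch sketch of this parameter-stability test follows; the choice of the maximum absolute change as the difference measure and the threshold value are assumptions:

    import torch

    def params_stable(snapshot, model, second_threshold=1e-4):
        # snapshot: parameter tensors cloned before the latest parameter adjustment,
        # e.g. snapshot = [p.detach().clone() for p in model.parameters()].
        deltas = [(p.detach() - q).abs().max().item()
                  for p, q in zip(model.parameters(), snapshot)]
        return max(deltas) < second_threshold  # stable parameters imply convergence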
Optionally, in another optional embodiment of the training method for a similar text matching model provided by the embodiment of the present application based on the embodiment corresponding to fig. 2, based on the intermediate similar text matching model, a second batch of sample sets corresponding to the target scene are repeatedly obtained, and a vector conversion operation, a triplet configuration operation, a loss calculation operation and a parameter adjustment operation are performed, so that after the target similar text matching model is obtained, the method further includes:
receiving a text to be matched;
Respectively passing the text to be matched and the target text data set through a target similar text matching model to obtain sentence vectors to be matched and a plurality of original sentence vectors;
Calculating the similarity between the sentence vector to be matched and each original sentence vector to obtain a plurality of similarity scores;
And determining a target similar text according to the plurality of similar scores, and pushing the target similar text to the target terminal equipment.
In this embodiment, after the target similar text matching model is obtained, the target similar text matching model may be applied, by receiving the text to be matched and obtaining a target text data set corresponding to the text to be matched, then, the text to be matched and the target text data set are respectively input into the target similar text matching model, a sentence vector to be matched and a plurality of original sentence vectors are obtained through the target similar text model, a plurality of similarity scores may be obtained by calculating the similarity between the sentence vector to be matched and each original sentence vector, and the target similar text may be determined according to the plurality of similarity scores, and the target similar text may be pushed to the target terminal device, so that the matched target similar text may be recommended for the target object better and more accurately.
Specifically, the text to be matched may be advertisement text, commodity keywords, and the like, or other text, which is not particularly limited herein. After the target similar text matching model is obtained, if a text to be matched, such as an advertisement copy A sent by the target object through the target terminal device, is received, the target scene to which the text to be matched belongs, such as an advertisement retrieval scene, may first be determined, and the corresponding target similar text matching model and target text dataset, such as an advertisement copy retrieval library, may be determined according to the target scene. The obtained text to be matched and the target text dataset may then be respectively input into the target similar text matching model, so as to obtain the sentence vector to be matched and the plurality of original sentence vectors. Further, the sentence vector to be matched may be matched pairwise against each original sentence vector, and the similarity between them calculated, so as to obtain the similarity score between the sentence vector to be matched and each original sentence vector, where the similarity may specifically be calculated through the Euclidean distance or the cosine similarity, and other similarity calculation methods may also be used, without specific limitation herein.
Further, after the similarity score between the sentence vector to be matched and each original sentence vector is obtained, the similarity scores may be ranked from high to low according to the principle that a higher similarity score indicates a more similar text; the original sentence vectors corresponding, for example, to the top ten or top hundred similarity scores may be selected according to the requirements of the target scene, and the texts corresponding to the selected original sentence vectors are then determined as the target similar texts and pushed to the target terminal device.
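A minimal PyTorch sketch of this similarity ranking step is given below, assuming cosine similarity as the measure (the Euclidean distance is equally possible per the text):

    import torch
    import torch.nn.functional as F

    def rank_similar_texts(query_vec, corpus_vecs, top_k=10):
        # query_vec: (dim,) sentence vector of the text to be matched.
        # corpus_vecs: (num_texts, dim) original sentence vectors of the target text dataset.
        scores = F.cosine_similarity(query_vec.unsqueeze(0), corpus_vecs)  # one score per text
        values, indices = scores.topk(min(top_k, corpus_vecs.size(0)))     # highest scores first
        return list(zip(indices.tolist(), values.tolist()))  # (text index, similarity score)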
When pushing the target similar text to the target terminal device, the pushing may be specifically performed according to the type of the target similar text, for example, the target similar text may be directly pushed to the target terminal device on the assumption that the target similar text is represented by a commodity image or a commodity link, or the target similar text may be further determined to be more matched with the target object from the target similar text on the assumption that the target similar text is represented by a video advertisement or hot spot information, or the like, according to the historical click rate and conversion rate of the target object, and the like, and then pushed to the target terminal device.
It can be understood that the target similar text matching model can also be used for a plurality of links such as commodity retrieval, advertisement material retrieval, advertisement estimation model characteristics, advertisement analysis and diagnosis and the like, and the overall advertisement putting effect of the whole link can be improved. For example, the commodity or advertisement material is usually provided with a corresponding document, and matching precision and recall rate can be greatly improved by combining text keyword search and target similar text matching model search, so that more matched commodity or advertisement is recommended for a target object. Or the method can be applied to a plurality of links such as recall, coarse-ranking or fine-ranking model estimation, strategy adjustment, analysis and diagnosis of the advertising effect and the like related to the advertising process. For example, in the rough ranking step, similar texts and picture videos in advertisements are acquired through the target similar text matching model so as to filter the similar advertisements and increase the diversity of the advertisements, or the advertisements can be better understood and the generalization performance of the models can be increased through the combination of the rough ranking or fine ranking estimated model and the target similar text matching model.
It will be appreciated that the target similar text matching model may also be applied to other text retrieval scenarios, such as common general knowledge, virtual or physical object retrieval, book retrieval in fine-grained fields, legal document retrieval, etc., without specific limitation.
Referring to fig. 7, fig. 7 is a schematic diagram showing an embodiment of a training apparatus for a similar text matching model according to an embodiment of the present application, and the training apparatus 20 for a similar text matching model includes:
An obtaining unit 201, configured to obtain a first batch sample set corresponding to a target scene, where the first batch sample set includes a first batch positive sample and a first batch negative sample;
The processing unit 202 is configured to input a first batch of positive example samples and a first batch of negative example samples to the original similar text matching model to perform vector conversion operation, so as to obtain a first batch of positive example sentence vectors and a first batch of negative example sentence vectors;
The processing unit 202 is further configured to perform a triplet construction operation on the first lot normal sentence vector to obtain a plurality of first lot triples, where each first lot triplet includes a first lot normal sentence vector, a first lot similar sentence vector, and a first lot heterogeneous sentence vector, and the first lot similar sentence vector and the first lot heterogeneous sentence vector are derived from the first lot negative example sentence vector;
the processing unit 202 is further configured to perform a loss calculation operation on the plurality of first batch triples, and obtain a first batch loss function corresponding to the first batch sample set;
The processing unit 202 is further configured to perform parameter adjustment operation on the original similar text matching model according to the first batch loss function, so as to obtain an intermediate similar text matching model;
The processing unit 202 is further configured to repeatedly obtain a second batch sample set corresponding to the target scene based on the intermediate similar text matching model, and perform the vector conversion operation, the triplet construction operation, the loss calculation operation, and the parameter adjustment operation to obtain the target similar text matching model.
Alternatively, in another embodiment of the training device for a similar text matching model provided in the embodiment of the present application based on the embodiment corresponding to fig. 7, the obtaining unit 201 may specifically be configured to:
Acquiring a target text dataset corresponding to a target scene, wherein the target text dataset at least comprises the first batch of positive example samples and source text data corresponding to the target scene;
retrieving N first matching texts corresponding to the first batch of positive examples from the target text data set as N first batch of negative examples;
Calculating matching scores between the first batch of positive examples and each first batch of negative examples to obtain N first matching scores;
respectively carrying out normalization operation on the N first matching scores to obtain N sample matching scores;
And constructing the first batch sample set according to the first batch positive sample, the first batch negative sample and the sample matching score.
Optionally, in another embodiment of the training device for a similar text matching model provided in the embodiment of the present application based on the embodiment corresponding to fig. 7, the processing unit 202 may specifically be configured to:
Dividing the first batch of negative example sentence vectors according to the sample matching score to obtain a similar sentence vector set and a heterogeneous sentence vector set;
extracting any similar sentence vector from the similar sentence vector set to obtain similar sentence vectors of a first batch;
any heterogeneous sentence vector is extracted from the heterogeneous sentence vector set, and a first batch of heterogeneous sentence vectors are obtained.
Optionally, in another embodiment of the training device for a similar text matching model provided in the embodiment of the present application based on the embodiment corresponding to fig. 7, the processing unit 202 may specifically be configured to:
Respectively carrying out loss calculation operation on the first batch of normal sentence vectors, the first batch of similar sentence vectors and the first batch of heterogeneous sentence vectors to obtain loss functions corresponding to a plurality of first batch triples;
And carrying out weighted calculation operation on the loss functions corresponding to the first batch of triples to obtain the first batch of loss functions.
Optionally, in another embodiment of the training device for a similar text matching model provided in the embodiment of the present application based on the embodiment corresponding to fig. 7, the processing unit 202 may specifically be configured to:
Acquiring a second batch sample set corresponding to the target scene, and executing the vector conversion operation, the triplet construction operation and the loss calculation operation according to the second batch sample set, so as to obtain a second loss function;
And if the second loss function is smaller than the first threshold value, taking the current intermediate similar text matching model as a target similar text matching model.
Optionally, in another embodiment of the training device for a similar text matching model provided in the embodiment of the present application based on the embodiment corresponding to fig. 7, the processing unit 202 may specifically be configured to:
obtaining intermediate model parameters of an intermediate similar text matching model;
Acquiring a second batch sample set corresponding to the target scene, and performing the vector conversion operation, triplet construction operation and parameter adjustment operation to obtain a current similar text matching model, wherein the current similar text matching model comprises current model parameters;
And if the difference value between the intermediate model parameter and the current model parameter meets a second threshold value, taking the current intermediate similar text matching model as a target similar text matching model.
Alternatively, based on the embodiment corresponding to fig. 7, in another embodiment of the training device for a similar text matching model provided in the embodiment of the present application,
The obtaining unit 201 is further configured to receive a text to be matched;
The processing unit 202 is further configured to obtain a sentence vector to be matched and a plurality of original sentence vectors by respectively passing the text to be matched and the target text dataset through a target similar text matching model;
the processing unit 202 is further configured to calculate a similarity between the sentence vector to be matched and each original sentence vector, so as to obtain a plurality of similarity scores;
And the determining unit 203 is configured to determine a target similar text according to the plurality of similar scores, and push the target similar text to the target terminal device.
Another aspect of the present application provides another schematic diagram of a computer device. As shown in fig. 8, fig. 8 is a schematic diagram of a computer device structure provided in an embodiment of the present application. The computer device 300 may vary considerably in configuration or performance, and may include one or more central processing units (central processing units, CPUs) 310 (e.g., one or more processors), a memory 320, and one or more storage media 330 (e.g., one or more mass storage devices) storing application programs 331 or data 332. The memory 320 and the storage medium 330 may be transitory or persistent storage. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations for the computer device 300. Still further, the central processing unit 310 may be configured to communicate with the storage medium 330 and execute on the computer device 300 the series of instruction operations in the storage medium 330.
The computer device 300 may also include one or more power supplies 340, one or more wired or wireless network interfaces 350, one or more input/output interfaces 360, and/or one or more operating systems 333, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The computer device 300 described above is also used to perform the steps in the corresponding embodiment as in fig. 2.
Another aspect of the application provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the steps of the method described in the embodiment shown in fig. 2.
Another aspect of the application provides a computer program product comprising instructions which, when run on a computer or processor, cause the computer or processor to perform the steps in the method described in the embodiment shown in fig. 2.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all of the technical solution or in part in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes a U disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.