CN114385814A - Information retrieval method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN114385814A
Authority
CN
China
Prior art keywords
corpus
retrieval
sorting
feature vector
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210027851.XA
Other languages
Chinese (zh)
Inventor
沈越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202210027851.XA
Publication of CN114385814A
Status: Pending


Abstract

The application discloses an information retrieval method and device, computer equipment and a storage medium, belonging to the technical field of artificial intelligence. The method first performs feature encoding on the input corpora through a shared encoding layer, then feeds the feature codes into the classification task layer and the ranking task layer of an initial retrieval model to obtain a corpus classification result and a corpus ranking result, and finally iteratively updates the parameters of the initial retrieval model based on those two results. When the parameters converge, a trained retrieval model is obtained; when corpus retrieval is required, the corpus to be retrieved is imported into the trained retrieval model to obtain its retrieval result. The application also relates to blockchain technology: the corpus to be retrieved can be stored in a blockchain. By training a retrieval model with a shared encoding layer, the model can be used in a variety of application scenarios, which improves its applicability and reduces the consumption of training resources.

Description

Translated from Chinese
A method, device, computer equipment and storage medium for information retrieval

Technical Field

The present application belongs to the technical field of artificial intelligence, and specifically relates to an information retrieval method, device, computer equipment and storage medium.

Background

Information Retrieval is the primary means by which users query and obtain information. In the narrow sense, information retrieval refers only to information search, that is, the process in which a user, according to his or her needs, applies certain methods and retrieval tools to find the required information in an information collection. In the broad sense, information retrieval is the process of processing, organizing and storing information in a certain way, and then accurately finding the relevant information according to the specific needs of information users.

At present, to facilitate information retrieval, a variety of retrieval models have been developed. The mainstream structure of existing retrieval models is still the single-task model, mainly because a single model's output is relatively stable, parameter tuning is less difficult, and the model has a single input end, which makes training easier to implement. However, single-task retrieval models are limited in their usage scenarios: retrieval tasks in different scenarios require training multiple different retrieval models, which consumes excessive training resources.

Summary of the Invention

The purpose of the embodiments of the present application is to propose an information retrieval method, device, computer equipment and storage medium, so as to solve the technical problem that existing information retrieval solutions are limited in their usage scenarios and require training multiple different retrieval models for multi-scenario use, resulting in excessive consumption of training resources.

In order to solve the above technical problems, the embodiments of the present application provide an information retrieval method that adopts the following technical solution:

A method of information retrieval, comprising:

acquiring a training corpus, and obtaining approximate corpora and non-approximate corpora of the training corpus from a preset corpus;

importing the training corpus, the approximate corpora and the non-approximate corpora into a preset initial multi-task retrieval model, wherein the initial multi-task retrieval model includes an encoding layer, a classification task layer and a ranking task layer;

performing feature encoding on the training corpus through the encoding layer to obtain a first feature vector, performing feature encoding on the approximate corpora through the encoding layer to obtain a second feature vector, and performing feature encoding on the non-approximate corpora through the encoding layer to obtain a third feature vector;

importing the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result;

importing the first feature vector, the second feature vector and the third feature vector into the ranking task layer to obtain a corpus ranking result;

iteratively updating the initial retrieval model based on the corpus classification result and the corpus ranking result to obtain a trained retrieval model; and

receiving a corpus retrieval instruction, acquiring a corpus to be retrieved, and importing the corpus to be retrieved into the trained retrieval model to generate a retrieval result for the corpus to be retrieved.

Further, the step of acquiring a training corpus and obtaining approximate and non-approximate corpora of the training corpus from a preset corpus specifically includes:

acquiring the training corpus, and calculating the cosine similarity between the training corpus and the sample corpora in the corpus to obtain corpus similarities;

screening corpora similar to the training corpus based on the corpus similarities and a preset first similarity threshold to obtain the approximate corpora; and

screening corpora dissimilar to the training corpus based on the corpus similarities and a preset second similarity threshold to obtain the non-approximate corpora.
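The similarity screening above can be sketched as follows. The bag-of-words vectors and the threshold values 0.8 / 0.3 are illustrative assumptions; the patent specifies neither the vectorization nor the thresholds.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def split_corpus(train_vec, samples, hi=0.8, lo=0.3):
    """Return (approximate, non-approximate) sample ids, using a first
    threshold `hi` for similar corpora and a second threshold `lo` for
    dissimilar ones; mid-range samples are discarded."""
    approx, non_approx = [], []
    for sid, vec in samples.items():
        sim = cosine(train_vec, vec)
        if sim >= hi:
            approx.append(sid)
        elif sim <= lo:
            non_approx.append(sid)
    return approx, non_approx
```

Samples whose similarity falls between the two thresholds belong to neither set, which matches the two-threshold screening described in the steps above.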

Further, before the step of importing the training corpus, the approximate corpora and the non-approximate corpora into the preset initial multi-task retrieval model, the method further includes:

performing word segmentation on the training corpus, the approximate corpora and the non-approximate corpora respectively to obtain first text tokens, second text tokens and third text tokens; and

removing stop words from the first text tokens, the second text tokens and the third text tokens.
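A minimal sketch of this preprocessing, assuming whitespace tokenization and a toy stop-word list; a real pipeline for Chinese corpora would use a word segmenter (e.g., jieba), which the patent does not name.

```python
# Toy stop-word list; a production system would load a full list for the
# corpus language.
STOP_WORDS = {"the", "a", "of", "to", "is"}

def preprocess(text, stop_words=STOP_WORDS):
    """Tokenize `text` and drop stop words (whitespace split as a stand-in
    for a real word segmenter)."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in stop_words]
```

The same function would be applied to the training, approximate and non-approximate corpora to produce the three token sets.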

Further, the step of importing the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain the corpus classification result specifically includes:

performing feature extraction on the first feature vector, the second feature vector and the third feature vector respectively to obtain a first feature, a second feature and a third feature; and

inputting the first feature, the second feature and the third feature into the classifier to generate the corpus classification result.
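The two sub-steps (feature extraction, then classification) might look like the sketch below. Mean pooling and the logistic scorer are illustrative stand-ins; the patent leaves both the extractor and the trained classifier unspecified.

```python
import math

def extract_feature(vec):
    """Toy feature extraction: mean-pool a feature vector to a scalar."""
    return sum(vec) / len(vec)

def classify(f_q, f_pq, f_nq, w=(1.0, 1.0, -1.0), b=0.0):
    """Toy classifier over the three extracted features; the weights are
    placeholders for learned parameters."""
    z = w[0] * f_q + w[1] * f_pq + w[2] * f_nq + b
    p = 1.0 / (1.0 + math.exp(-z))   # sigmoid score
    return "similar" if p >= 0.5 else "dissimilar"
```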

Further, the step of importing the first feature vector, the second feature vector and the third feature vector into the ranking task layer to obtain the corpus ranking result specifically includes:

calculating the Euclidean distance between the first feature vector and the second feature vector to obtain a first Euclidean distance;

calculating the Euclidean distance between the second feature vector and the third feature vector to obtain a second Euclidean distance; and

inputting the first Euclidean distance and the second Euclidean distance into the ranking model to generate the corpus ranking result.
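The distance computation can be sketched as below. Deciding the order by comparing the two distances is an assumption; the patent only states that both distances are fed to a trained ranking model.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank(v_q, v_pq, v_nq):
    """Rank the approximate (pq) and non-approximate (nq) corpora for the
    query q from the two distances named in the steps above."""
    d1 = euclidean(v_q, v_pq)    # first Euclidean distance: q vs. pq
    d2 = euclidean(v_pq, v_nq)   # second Euclidean distance: pq vs. nq
    # heuristic stand-in for the trained ranking model:
    # a small d1 and a large d2 put the approximate corpus first
    return ["po_query", "neg_query"] if d1 < d2 else ["neg_query", "po_query"]
```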

Further, the step of iteratively updating the initial retrieval model based on the corpus classification result and the corpus ranking result to obtain the trained retrieval model specifically includes:

calculating the error between the corpus classification result and a preset standard classification result to obtain a classification error;

comparing the classification error with a preset classification error threshold; if the classification error is greater than the classification error threshold, iteratively updating the initial parameters of the initial retrieval model until the classification error is less than or equal to the classification error threshold, thereby obtaining intermediate parameters of the initial retrieval model;

calculating the error between the corpus ranking result and a preset standard ranking result to obtain a ranking error; and

comparing the ranking error with a preset ranking error threshold; if the ranking error is greater than the ranking error threshold, iteratively updating the intermediate parameters of the initial retrieval model until the ranking error is less than or equal to the ranking error threshold, thereby obtaining the trained retrieval model.
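A toy sketch of this two-stage update schedule: train until the classification error drops below its threshold (yielding the intermediate parameters), then continue until the ranking error does. The multiplicative "error decay" stands in for real backpropagation, which the patent does not detail.

```python
def fit(initial_params, cls_error, rank_error,
        cls_thresh=0.05, rank_thresh=0.05, lr=0.5, max_steps=1000):
    """Two-stage threshold-driven fitting loop (toy dynamics)."""
    params = list(initial_params)
    steps = 0
    # stage 1: drive the classification error below its threshold
    while cls_error > cls_thresh and steps < max_steps:
        params = [p * (1 - lr * 0.01) for p in params]  # stand-in update
        cls_error *= (1 - lr)                           # simulated decay
        steps += 1
    intermediate = list(params)  # the "intermediate parameters"
    # stage 2: continue from the intermediate parameters for the ranking error
    while rank_error > rank_thresh and steps < max_steps:
        params = [p * (1 - lr * 0.01) for p in params]
        rank_error *= (1 - lr)
        steps += 1
    return params, intermediate, cls_error, rank_error
```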

Further, before the step of calculating the error between the corpus classification result and the preset standard classification result to obtain the classification error, the method further includes:

obtaining the loss function of the classifier as a first loss function, and obtaining the loss function of the ranking model as a second loss function;

calculating the weights of the first loss function and the second loss function respectively in a cross-entropy-combined manner to obtain a first weight and a second weight; and

performing a weighted summation of the first loss function and the second loss function based on the first weight and the second weight to obtain the loss function of the initial retrieval model.
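The weighted summation can be written directly. The weight values passed in below are assumptions: the patent states the weights are derived in a cross-entropy-combined manner but does not give the formula.

```python
def combined_loss(loss_cls, loss_rank, w1=0.5, w2=0.5):
    """Total loss of the initial retrieval model:
    L_total = w1 * L_cls + w2 * L_rank, with w1/w2 assumed given."""
    return w1 * loss_cls + w2 * loss_rank
```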

In order to solve the above technical problems, the embodiments of the present application further provide an information retrieval device that adopts the following technical solution:

An information retrieval device, comprising:

a corpus acquisition module, configured to acquire a training corpus and obtain approximate and non-approximate corpora of the training corpus from a preset corpus;

a corpus import module, configured to import the training corpus, the approximate corpora and the non-approximate corpora into a preset initial multi-task retrieval model, wherein the initial multi-task retrieval model includes an encoding layer, a classification task layer and a ranking task layer;

a feature encoding module, configured to perform feature encoding through the encoding layer on the training corpus to obtain a first feature vector, on the approximate corpora to obtain a second feature vector, and on the non-approximate corpora to obtain a third feature vector;

a feature classification module, configured to import the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result;

a feature ranking module, configured to import the first feature vector, the second feature vector and the third feature vector into the ranking task layer to obtain a corpus ranking result;

a model iteration module, configured to iteratively update the initial retrieval model based on the corpus classification result and the corpus ranking result to obtain a trained retrieval model; and

an information retrieval module, configured to receive a corpus retrieval instruction, acquire a corpus to be retrieved, import the corpus to be retrieved into the retrieval model, and generate a retrieval result for the corpus to be retrieved.

In order to solve the above technical problems, the embodiments of the present application further provide a computer device that adopts the following technical solution:

A computer device, comprising a memory and a processor, wherein computer-readable instructions are stored in the memory, and when the processor executes the computer-readable instructions, the steps of the information retrieval method described in any one of the above are implemented.

In order to solve the above technical problems, the embodiments of the present application further provide a computer-readable storage medium that adopts the following technical solution:

A computer-readable storage medium, on which computer-readable instructions are stored, wherein when the computer-readable instructions are executed by a processor, the steps of the information retrieval method described in any one of the above are implemented.

Compared with the prior art, the embodiments of the present application mainly have the following beneficial effects:

The present application discloses an information retrieval method, device, computer equipment and storage medium, belonging to the technical field of artificial intelligence. The application performs feature encoding on the input corpora through a shared encoding layer, then feeds the feature codes into the classification task layer and the ranking task layer of an initial retrieval model: the classification task layer classifies the input corpora to obtain a corpus classification result, while the ranking task layer ranks them to obtain a corpus ranking result. Finally, the parameters of the initial retrieval model are iteratively updated based on the two results; when the parameters converge, the trained retrieval model is obtained. When corpus retrieval is required, the corpus to be retrieved is imported into the trained retrieval model to obtain its retrieval result.
In this application, the training corpus and its approximate and non-approximate corpora are classified and ranked, and the classification and ranking results are used to train a corpus retrieval model with a shared encoding layer. The shared encoding layer of the resulting model can be used in a variety of application scenarios, improving the applicability of the retrieval model; and training jointly on the training corpus and its approximate and non-approximate corpora makes it possible to train a retrieval model with very little training data, reducing the consumption of training resources while maintaining model accuracy.

Brief Description of the Drawings

In order to illustrate the solutions in the present application more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below depict only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 shows an exemplary system architecture diagram to which the present application can be applied;

FIG. 2 shows a flowchart of an embodiment of the information retrieval method according to the present application;

FIG. 3 shows a schematic structural diagram of an embodiment of the information retrieval device according to the present application;

FIG. 4 shows a schematic structural diagram of an embodiment of a computer device according to the present application.

Detailed Description of the Embodiments

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of this application. The terms used in the specification are for the purpose of describing specific embodiments only and are not intended to limit the application. The terms "comprising" and "having" and any variations thereof in the description, claims and drawings of this application are intended to cover non-exclusive inclusion. The terms "first", "second" and the like in the description, claims or drawings are used to distinguish different objects, not to describe a specific order.

Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearance of the phrase in various places in the specification does not necessarily refer to the same embodiment, nor to a separate or alternative embodiment mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired links, wireless communication links or fiber-optic cables.

Users can use the terminal devices 101, 102 and 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102 and 103, such as web browsers, shopping applications, search applications, instant messaging tools, email clients and social platform software.

The terminal devices 101, 102 and 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers and the like.

The server 105 may be a server that provides various services, for example a back-end server that supports the pages displayed on the terminal devices 101, 102 and 103. The server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.

It should be noted that the information retrieval method provided by the embodiments of the present application is generally executed by a server; accordingly, the information retrieval device is generally arranged in the server.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative; there may be any number of terminal devices, networks and servers according to implementation needs.

Continuing to refer to FIG. 2, a flowchart of an embodiment of the information retrieval method according to the present application is shown. The embodiments of the present application may acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.

The basic technologies of artificial intelligence generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems and mechatronics. AI software technology mainly includes computer vision, robotics, biometrics, speech processing, natural language processing and machine learning/deep learning. The information retrieval method includes the following steps:

S201: Acquire a training corpus, and obtain approximate and non-approximate corpora of the training corpus from a preset corpus.

Specifically, before the retrieval model is trained, the corpus set used to train it is collected: the server acquires the training corpus and obtains its approximate and non-approximate corpora from a preset corpus. The training corpus together with the approximate and non-approximate corpora forms the corpus set of the retrieval model; several historical corpora are pre-stored in the corpus.

In a specific embodiment of the present application, the retrieval model is applied to a telemarketing scenario. The server first acquires the training corpus, which is a customer query sentence query (q), and then obtains from the preset corpus an approximate sentence po_query (pq) and a non-approximate sentence neg_query (nq). The training corpus query (q), the approximate sentence po_query (pq) and the non-approximate sentence neg_query (nq) serve as the inputs for training the retrieval model. The reason for using three inputs is that, combined with the loss function of the retrieval model, feeding the similar and dissimilar information in the rich corpus as pairs helps the encoder extract similar and dissimilar features discriminatively and helps the model converge quickly.
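A common way to exploit such (q, pq, nq) pair/triplet inputs is a margin loss over the encoder distances, sketched below as an assumption; the patent does not give the exact loss used.

```python
def triplet_margin_loss(d_q_pq, d_q_nq, margin=1.0):
    """Triplet-style margin loss (illustrative): push the query q closer to
    the approximate sentence pq than to the non-approximate sentence nq by
    at least `margin`, i.e. penalize d(q, pq) + margin > d(q, nq)."""
    return max(0.0, d_q_pq - d_q_nq + margin)
```

When the approximate sentence is already much closer than the non-approximate one, the loss is zero; otherwise the gradient pulls similar features together and pushes dissimilar features apart, which matches the stated purpose of the three-input design.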

S202: Import the training corpus, the approximate corpora and the non-approximate corpora into a preset initial multi-task retrieval model, wherein the initial multi-task retrieval model includes an encoding layer, a classification task layer and a ranking task layer.

Specifically, in this application the initial retrieval model is built on the MMoE (Multi-gate Mixture-of-Experts) framework. MMoE divides the shared underlying representation layer into multiple feature-mapping units (experts) and adds gate units (gates), so that different tasks can use the shared layer in diverse ways.
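A minimal sketch of the MMoE idea: shared experts whose outputs are mixed per task by a softmax gate. The two-expert, two-task setup and scalar expert outputs below are illustrative only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def mmoe(x, experts, gates):
    """MMoE forward pass (toy): each task mixes the shared expert outputs
    with its own softmax gate weights, so tasks share the bottom layer but
    use it differently."""
    expert_outs = [expert(x) for expert in experts]
    outputs = []
    for gate in gates:                      # one gate per task
        w = softmax(gate(x))
        outputs.append(sum(wi * eo for wi, eo in zip(w, expert_outs)))
    return outputs
```

Usage: with two toy experts and one gate per task, each task receives a different mixture of the same shared experts.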

In a specific embodiment of the present application, the initial multi-task retrieval model includes an encoding layer, a classification task layer and a ranking task layer. The shared layer of this application is the encoding layer, which adopts the Transformer structure; the Transformer is an encoder-decoder neural network, where the Encoder encodes and the Decoder decodes. To ensure semantic coherence, in this embodiment a position encoding (Position Embedding) is added to the encoder, and the position information of each word is represented by the position encoding. By training a retrieval model with a shared encoder, the retrieval model can be used in a variety of application scenarios, improving its applicability and reducing the consumption of training resources.
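The position encoding can be sketched with the standard sinusoidal formula of the original Transformer, assumed here since the patent does not give a formula: even dimensions use sine, odd dimensions use cosine, with wavelengths growing geometrically with the dimension index.

```python
import math

def position_encoding(pos, d_model):
    """Sinusoidal position encoding for one position:
    PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))"""
    pe = []
    for i in range(d_model):
        angle = pos / (10000 ** (2 * (i // 2) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe
```

The resulting vector is added to the token embedding so that word order information survives the order-agnostic attention layers.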

S203: Perform feature encoding on the training corpus through the encoding layer to obtain a first feature vector, on the approximate corpora to obtain a second feature vector, and on the non-approximate corpora to obtain a third feature vector.

Specifically, after performing word segmentation and stop-word removal on the training corpus, the approximate corpora and the non-approximate corpora to obtain the first, second and third text tokens, the server imports the three token sets into the encoding layer, which performs feature encoding on them to obtain the first, second and third feature vectors. The task of feature encoding is to represent text information as structured information that a computer can process.
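The key point of S203 is that one shared encoder maps all three token sequences to fixed-size feature vectors, so q, pq and nq pass through the same weights. The hashing bag-of-words encoder below is a toy stand-in for the Transformer encoding layer.

```python
def encode(tokens, dim=8):
    """Toy shared encoder: hash each token into a fixed-size count vector."""
    vec = [0.0] * dim
    for t in tokens:
        vec[hash(t) % dim] += 1.0
    return vec

def encode_all(q_tokens, pq_tokens, nq_tokens, dim=8):
    """Apply the one shared encoder to all three token sets, producing the
    first, second and third feature vectors."""
    return encode(q_tokens, dim), encode(pq_tokens, dim), encode(nq_tokens, dim)
```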

S204,将所述第一特征向量、所述第二特征向量和所述第三特征向量导入所述分类任务层,得到语料分类结果。S204, import the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result.

在本申请具体的实施例中,分类任务层用于处理分类任务,分类任务层内配置有一个预先训练好的分类器,分类器在进行训练时,对有标签的训练样本进行分析后,获得样本特征与类别标签之间泛化关系,以便于预测未知样本的类别标签,本申请的分类器可采用朴素贝叶斯、K最近邻、支持向量机等方法进行构建。In a specific embodiment of the present application, the classification task layer is used to process classification tasks, and a pre-trained classifier is configured in the classification task layer. During training, the classifier analyzes labeled training samples to obtain the generalization relationship between sample features and class labels, so as to predict the class labels of unknown samples. The classifier of this application can be constructed by methods such as Naive Bayes, K-Nearest Neighbors, and Support Vector Machines.

具体的,服务器通过将第一特征向量、第二特征向量和第三特征向量导入分类任务层的分类器中,通过分类器对第一特征向量、第二特征向量和第三特征向量进行处理,得到语料分类结果。Specifically, the server imports the first feature vector, the second feature vector and the third feature vector into the classifier of the classification task layer, and the classifier processes the three feature vectors to obtain the corpus classification result.

S205,将所述第一特征向量、所述第二特征向量和所述第三特征向量导入所述排序任务层,得到语料排序结果。S205: Import the first feature vector, the second feature vector and the third feature vector into the sorting task layer to obtain a corpus sorting result.

在本申请具体的实施例中,排序任务层用于处理排序任务,排序任务层内配置有预先训练好的排序模型,排序模型通过学习不同样本特征之间的差异,并根据特征差异大小实现排序,本申请的排序模型可以采用Logistic Regression模型或Factorization Machines模型。In a specific embodiment of the present application, the sorting task layer is used to process sorting tasks, and a pre-trained ranking model is configured in the sorting task layer. The ranking model learns the differences between the features of different samples and performs sorting according to the magnitude of those differences. The ranking model of the present application may adopt the Logistic Regression model or the Factorization Machines model.

具体的,服务器通过将第一特征向量、第二特征向量和第三特征向量导入排序任务层的排序模型中,通过排序模型对第一特征向量、第二特征向量和第三特征向量进行处理,得到语料排序结果。Specifically, the server imports the first feature vector, the second feature vector and the third feature vector into the ranking model of the sorting task layer, and the ranking model processes the three feature vectors to obtain the corpus sorting result.

S206,基于所述语料分类结果和所述语料排序结果对所述初始检索模型进行迭代更新,得到训练完成的检索模型。S206, iteratively update the initial retrieval model based on the corpus classification result and the corpus sorting result to obtain a trained retrieval model.

在本申请具体的实施例中,在得到训练语料、近似语料和非近似语料之后,分别对训练语料、近似语料和非近似语料进行分类标注和排序标注,得到标准分类结果和标准排序结果。In a specific embodiment of the present application, after the training corpus, the approximate corpus and the non-approximate corpus are obtained, they are respectively annotated with classification labels and ranking labels to obtain the standard classification result and the standard ranking result.

具体的,基于初始检索模型的损失函数计算语料分类结果和预设的标准分类结果之间的误差,以及计算语料排序结果和预设的标准排序结果之间的误差,并基于反向传播算法在编码层中传递上述误差,并判断上述误差是否大于预设误差阈值,如果上述误差大于预设误差阈值,则对初始检索模型进行反复迭代训练,同时调整初始检索模型的参数,直到上述误差小于或等于预设误差阈值为止,此时检索模型拟合完成,得到训练完成的检索模型。需要说明的是,训练完成的检索模型具有共享编码层。Specifically, the error between the corpus classification result and the preset standard classification result, and the error between the corpus sorting result and the preset standard sorting result, are calculated based on the loss function of the initial retrieval model; the errors are then propagated through the encoding layer based on the back-propagation algorithm, and it is judged whether the error is greater than a preset error threshold. If so, the initial retrieval model is repeatedly and iteratively trained while its parameters are adjusted, until the error is less than or equal to the preset error threshold, at which point the fitting of the retrieval model is complete and the trained retrieval model is obtained. It should be noted that the trained retrieval model has a shared encoding layer.

其中,反向传播算法,即误差反向传播算法(Backpropagation algorithm,BP算法)是一种适合于多层神经元网络的学习算法,它建立在梯度下降法的基础上,用于深度学习网络的误差计算。BP网络的输入、输出关系实质上是一种映射关系:一个n输入m输出的BP神经网络所完成的功能是从n维欧氏空间向m维欧氏空间中一有限域的连续映射,这一映射具有高度非线性。BP算法的学习过程由正向传播过程和反向传播过程组成。在正向传播过程中,输入信息通过输入层经隐含层,逐层处理并传向输出层,并转入反向传播,逐层求出目标函数对各神经元权值的偏导数,构成目标函数对权值向量的梯度,以作为修改权值的依据。Among them, the back-propagation algorithm, that is, the error back-propagation algorithm (Backpropagation algorithm, BP algorithm), is a learning algorithm suitable for multi-layer neural networks. It is based on the gradient descent method and is used for the error calculation of deep learning networks. The input-output relationship of a BP network is essentially a mapping: the function completed by a BP neural network with n inputs and m outputs is a continuous mapping from the n-dimensional Euclidean space to a finite field in the m-dimensional Euclidean space, and this mapping is highly nonlinear. The learning process of the BP algorithm consists of a forward propagation process and a back propagation process. During forward propagation, the input information is processed layer by layer from the input layer through the hidden layers and transmitted to the output layer; the process then switches to back propagation, in which the partial derivative of the objective function with respect to each neuron's weights is obtained layer by layer, constituting the gradient of the objective function with respect to the weight vector, which serves as the basis for modifying the weights.

S207,接收语料检索指令,获取待检索语料,并将所述待检索语料导入所述检索模型,生成所述待检索语料的检索结果。S207: Receive a corpus retrieval instruction, acquire the corpus to be retrieved, import the corpus to be retrieved into the retrieval model, and generate a retrieval result of the corpus to be retrieved.

具体的,当存在语料检索需求时,服务器接收语料检索指令,获取待检索语料,并在对待检索语料进行文本清洗、分词、去除停用词等预处理后,将预处理后的待检索语料导入所述检索模型,生成待检索语料的检索结果。Specifically, when there is a corpus retrieval demand, the server receives the corpus retrieval instruction, obtains the corpus to be retrieved, performs preprocessing such as text cleaning, word segmentation and stop-word removal on it, and then imports the preprocessed corpus into the retrieval model to generate the retrieval result of the corpus to be retrieved.

在上述实施例中,本申请通过对训练语料以及训练语料的近似语料和非近似语料进行分类和排序,并利用分类结果和排序结果训练一个具有共享编码层的语料检索模型,使得该语料检索模型的共享编码层可以在多种应用场景下使用,提高检索模型适用性;通过训练语料以及训练语料的近似语料和非近似语料共同训练语料检索模型,可以在极少训练语料的情况下训练一个语料检索模型,在保证模型精度的同时,降低训练资源的消耗。In the above embodiment, the present application classifies and sorts the training corpus together with its approximate and non-approximate corpora, and uses the classification and sorting results to train a corpus retrieval model with a shared encoding layer, so that the shared encoding layer of the resulting corpus retrieval model can be used in a variety of application scenarios, improving the applicability of the retrieval model. By jointly training the corpus retrieval model on the training corpus and its approximate and non-approximate corpora, a corpus retrieval model can be trained with very little training corpus, reducing the consumption of training resources while ensuring model accuracy.

在本实施例中,信息检索的方法运行于其上的电子设备(例如图1所示的服务器)可以通过有线连接方式或者无线连接方式接收语料检索指令。需要指出的是,上述无线连接方式可以包括但不限于3G/4G连接、WiFi连接、蓝牙连接、WiMAX连接、Zigbee连接、UWB(ultra wideband)连接、以及其他现在已知或将来开发的无线连接方式。In this embodiment, the electronic device (for example, the server shown in FIG. 1) on which the information retrieval method runs may receive the corpus retrieval instruction through a wired or wireless connection. It should be pointed out that the above wireless connection methods may include, but are not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, Zigbee, UWB (ultra wideband), and other wireless connection methods currently known or developed in the future.

进一步地,所述获取训练语料,并在预设的语料库中获取所述训练语料的近似语料和非近似语料的步骤,具体包括:Further, the step of acquiring training corpus and acquiring approximate corpus and non-approximate corpus of the training corpus in a preset corpus specifically includes:

获取训练语料,并计算所述训练语料与所述语料库中样本语料的余弦相似度,得到语料相似度;Obtaining training corpus, and calculating the cosine similarity between the training corpus and the sample corpus in the corpus to obtain the corpus similarity;

基于所述语料相似度和预设的第一相似度阈值,筛选与所述训练语料相近的语料,得到所述近似语料;Based on the similarity of the corpus and a preset first similarity threshold, screening the corpus similar to the training corpus to obtain the approximate corpus;

基于所述语料相似度和预设的第二相似度阈值,筛选与所述训练语料不相近的语料,得到所述非近似语料。Based on the similarity of the corpus and a preset second similarity threshold, the corpus that is not similar to the training corpus is screened to obtain the non-similar corpus.

其中,余弦相似度,又称为余弦相似性,是通过计算两个向量的夹角余弦值来评估他们的相似度。余弦相似度将向量根据坐标值,绘制到向量空间中,如最常见的二维空间。余弦相似性通过测量两个向量的夹角的余弦值来度量它们之间的相似性。0度角的余弦值是1,而其他任何角度的余弦值都不大于1;并且其最小值是-1。Among them, cosine similarity evaluates the similarity of two vectors by calculating the cosine of the angle between them. Cosine similarity maps vectors, according to their coordinate values, into a vector space, such as the most common two-dimensional space, and measures the similarity between two vectors by the cosine of their angle. The cosine of a 0-degree angle is 1, the cosine of any other angle is not greater than 1, and its minimum value is -1.

具体的,服务器在获取到训练语料后,通过预先训练好的相似度识别模型计算训练语料与语料库中样本语料的余弦相似度,得到语料相似度,基于语料相似度和预设的第一相似度阈值,筛选与训练语料相近的语料,得到近似语料,基于语料相似度和预设的第二相似度阈值,筛选与训练语料不相近的语料,得到非近似语料。Specifically, after obtaining the training corpus, the server calculates the cosine similarity between the training corpus and the sample corpora in the corpus through a pre-trained similarity recognition model to obtain the corpus similarity; based on the corpus similarity and the preset first similarity threshold, it screens corpora that are similar to the training corpus to obtain the approximate corpus; and based on the corpus similarity and the preset second similarity threshold, it screens corpora that are not similar to the training corpus to obtain the non-approximate corpus.
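余弦相似度筛选近似/非近似语料的过程可用如下最小示例说明(以词频向量近似表示语料,阈值 0.8/0.2 为假设取值;本申请实际采用预先训练好的相似度识别模型)。A minimal sketch of screening approximate and non-approximate corpora by cosine similarity; the bag-of-words representation and the 0.8/0.2 thresholds are assumptions for illustration, whereas the application itself uses a pre-trained similarity recognition model.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    # 将分词结果表示为词频向量, 计算两向量夹角的余弦值
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def screen_corpora(train_tokens, corpus, high=0.8, low=0.2):
    # 相似度 >= 第一阈值 high 的为近似语料, <= 第二阈值 low 的为非近似语料
    approx = [s for s in corpus if cosine_similarity(train_tokens, s) >= high]
    non_approx = [s for s in corpus if cosine_similarity(train_tokens, s) <= low]
    return approx, non_approx
```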

在上述实施例中,本申请通过计算训练语料与样本语料的余弦相似度来判断训练语料的近似语料和非近似语料。In the above embodiment, the present application determines the approximate corpus and the non-approximate corpus of the training corpus by calculating the cosine similarity between the training corpus and the sample corpus.

进一步地,在所述将所述训练语料、所述近似语料和所述非近似语料导入预设的初始多任务检索模型的步骤之前,还包括:Further, before the step of importing the training corpus, the approximate corpus and the non-approximate corpus into a preset initial multi-task retrieval model, the method further includes:

分别对所述训练语料、所述近似语料和所述非近似语料进行分词处理,得到第一文本分词、第二文本分词和第三文本分词;Perform word segmentation processing on the training corpus, the approximate corpus and the non-approximate corpus respectively to obtain a first text segmentation, a second text segmentation and a third text segmentation;

去除所述第一文本分词、所述第二文本分词和所述第三文本分词中的停用词。Stop words in the first text segment, the second text segment, and the third text segment are removed.

具体的,在将训练语料、近似语料和非近似语料导入初始检索模型之前,还需要对上述语料进行预处理,预处理包括文本清洗、分词、去除停用词。其中,文本清洗用于清除无意义的文本数据或其它的冗余信息或将某些特殊符号进行转换,如清除文本内容中出现了很多除中文之外的字符,如标点符号、数字、字母等等。由于中文不像英文那样具有天然的分隔符,所以一般情况下,中文自然语言处理需要对语料进行分词处理,常见的分词工具有结巴分词工具、HanLP工具、SnowNLP工具等等。停用词(Stop Words)经常出现在文档中,却没有具体的实际意义,如在中文文档中的"啊"、"在"、"的"之类,这些词也可称作虚词,停用词包含副词、冠词、代词等。Specifically, before the training corpus, the approximate corpus and the non-approximate corpus are imported into the initial retrieval model, the corpora need to be preprocessed; the preprocessing includes text cleaning, word segmentation and stop-word removal. Text cleaning is used to remove meaningless text data or other redundant information, or to convert certain special symbols, for example removing the many non-Chinese characters that appear in the text content, such as punctuation marks, numbers and letters. Since Chinese, unlike English, has no natural separators, Chinese natural language processing generally requires word segmentation of the corpus; common word segmentation tools include the jieba word segmentation tool, the HanLP tool, the SnowNLP tool and so on. Stop words often appear in documents but have no specific practical meaning, such as "啊", "在" and "的" in Chinese documents; these words can also be called function words, and stop words include adverbs, articles, pronouns, etc.

在上述实施例中,本申请在将训练语料导入初始多任务检索模型之前,需要对训练语料进行预处理,使得训练语料在格式上符合检索模型的处理标准。In the above embodiment, before importing the training corpus into the initial multi-task retrieval model, the present application needs to preprocess the training corpus, so that the format of the training corpus conforms to the processing standard of the retrieval model.
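上述预处理步骤可由如下示例说明(默认按单字切分仅作演示,实际可传入结巴分词等工具;停用词表为假设取值)。A sketch of the preprocessing step above; the default per-character split is for demonstration only, a tokenizer such as jieba can be passed in, and the stop-word list is an assumed example.

```python
def preprocess(text, stopwords, tokenizer=None):
    # tokenizer 可替换为结巴分词等工具(如 jieba.lcut);
    # 未提供时按单字切分, 仅作演示
    tokens = tokenizer(text) if tokenizer else list(text)
    # 去除停用词与空白符
    return [t for t in tokens if t not in stopwords and t.strip()]
```

例如 preprocess("我在学习", {"在", "的"}) 返回 ["我", "学", "习"]。For example, preprocess("我在学习", {"在", "的"}) returns ["我", "学", "习"].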

进一步地,所述将所述第一特征向量、所述第二特征向量和所述第三特征向量导入所述分类任务层,得到语料分类结果的步骤,具体包括:Further, the step of importing the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result specifically includes:

分别对所述第一特征向量、所述第二特征向量和所述第三特征向量进行特征提取,得到第一特征、第二特征和第三特征;Perform feature extraction on the first feature vector, the second feature vector and the third feature vector respectively to obtain the first feature, the second feature and the third feature;

将所述第一特征、所述第二特征和所述第三特征输入到所述分类器中,生成所述语料分类结果。The first feature, the second feature and the third feature are input into the classifier to generate the corpus classification result.

其中,输入语料中的每个分词都有可能被编码形成特征,如果分词的数目非常多,虽然经过了预处理去掉了停用词等对分类没有太大实际帮助的分词,但是分词过多,仍然会导致特征向量的维数过高,使得文本分类时复杂度过高,影响分类效果,形成维度灾难。因此需要对特征向量进行特征提取,以降低特征维数。Among them, each word segment in the input corpus may be encoded to form a feature. If the number of word segments is very large, then even though preprocessing has removed stop words and other segments of little practical help to classification, too many word segments will still cause the dimension of the feature vector to be too high, making text classification overly complex, degrading the classification effect and producing the curse of dimensionality. Therefore, it is necessary to perform feature extraction on the feature vectors to reduce the feature dimension.

具体的,分别对第一特征向量、第二特征向量和第三特征向量进行特征提取,得到第一特征、第二特征和第三特征,将第一特征、第二特征和第三特征输入到分类器中,分别计算第一特征、第二特征和第三特征与分类器中类别标签之间的关联关系,并根据上述关联关系进行语料分类,生成语料分类结果。Specifically, feature extraction is performed on the first feature vector, the second feature vector and the third feature vector to obtain the first feature, the second feature and the third feature; the three features are input into the classifier, the association between each feature and the class labels in the classifier is calculated, and the corpus is classified according to these associations to generate the corpus classification result.

在上述实施例中,本申请通过对特征向量进行特征提取,以降低特征维数,并通过分类器对特征进行分类,获得语料分类结果。In the above-mentioned embodiment, the present application performs feature extraction on the feature vector to reduce the feature dimension, and classifies the feature with a classifier to obtain a corpus classification result.
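作为分类任务层中分类器的一种示意,下面给出 K 最近邻分类的最小示例(样本与 k 值均为假设;实际也可采用朴素贝叶斯、支持向量机等方法构建)。As an illustration of the classifier in the classification task layer, the following is a minimal K-Nearest-Neighbors sketch; the samples and k are assumptions, and Naive Bayes or Support Vector Machines could equally be used.

```python
from collections import Counter

def knn_classify(feature, labeled_samples, k=3):
    # labeled_samples: [(特征向量, 类别标签), ...]
    # 按与待分类特征的欧氏距离平方排序, 取前 k 个样本投票
    nearest = sorted(
        labeled_samples,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(feature, s[0])),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```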

进一步地,所述将所述第一特征向量、所述第二特征向量和所述第三特征向量导入所述排序任务层,得到语料排序结果的步骤,具体包括:Further, the step of importing the first feature vector, the second feature vector and the third feature vector into the sorting task layer to obtain a corpus sorting result specifically includes:

计算所述第一特征向量和所述第二特征向量之间的欧式距离,得到第一欧式距离;Calculate the Euclidean distance between the first eigenvector and the second eigenvector to obtain the first Euclidean distance;

计算所述第二特征向量和所述第三特征向量之间的欧式距离,得到第二欧式距离;Calculate the Euclidean distance between the second eigenvector and the third eigenvector to obtain the second Euclidean distance;

将所述第一欧式距离和所述第二欧式距离输入到所述排序模型中,生成所述语料排序结果。The first Euclidean distance and the second Euclidean distance are input into the ranking model to generate the corpus ranking result.

具体的,服务器先计算第一特征向量和第二特征向量之间的欧式距离,得到第一欧式距离,再计算第二特征向量和第三特征向量之间的欧式距离,得到第二欧式距离,将第一欧式距离和第二欧式距离输入到排序模型中,生成语料排序结果。Specifically, the server first calculates the Euclidean distance between the first feature vector and the second feature vector to obtain the first Euclidean distance, then calculates the Euclidean distance between the second feature vector and the third feature vector to obtain the second Euclidean distance, and inputs the two Euclidean distances into the ranking model to generate the corpus sorting result.

需要说明的是,对于多个不同的文本或者短文本,要来计算它们之间的相似度,一般是将这些文本中词语,映射到向量空间,形成词和向量数据的映射关系,通过计算两个或者多个不同的向量的差异的大小,来计算文本的相似度。其中,欧式距离可以表征向量之间的相似程度,距离越近就代表越相似。It should be noted that, to calculate the similarity between multiple different texts or short texts, the words in these texts are generally mapped into a vector space to form a mapping between words and vector data, and the similarity of the texts is calculated from the magnitude of the difference between two or more different vectors. Among them, the Euclidean distance can represent the degree of similarity between vectors: the closer the distance, the more similar they are.

在上述实施例中,本申请通过计算特征向量之间的欧式距离,并对计算的欧式距离进行排序,获得语料排序结果。In the above embodiment, the present application obtains the corpus sorting result by calculating the Euclidean distance between the feature vectors and sorting the calculated Euclidean distance.
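按欧式距离对候选语料排序的过程可用如下最小示例说明(候选的表示形式为演示用的假设)。The ranking-by-Euclidean-distance step can be illustrated by the following minimal sketch; the candidate representation is an illustrative assumption.

```python
import math

def euclidean(u, v):
    # 计算两特征向量之间的欧式距离
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_candidates(query_vec, candidates):
    # candidates: [(语料, 特征向量), ...]
    # 距离越近代表越相似, 按距离升序排列即得排序结果
    return sorted(candidates, key=lambda item: euclidean(query_vec, item[1]))
```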

进一步地,所述基于所述语料分类结果和所述语料排序结果对所述初始检索模型进行迭代更新,得到训练完成的检索模型的步骤,具体包括:Further, the step of iteratively updating the initial retrieval model based on the corpus classification result and the corpus sorting result to obtain a trained retrieval model specifically includes:

计算所述语料分类结果和预设的标准分类结果之间的误差,得到分类误差;Calculate the error between the corpus classification result and the preset standard classification result to obtain the classification error;

将分类误差与预设分类误差阈值进行比较,若所述分类误差大于分类误差阈值,则对所述初始检索模型的初始参数进行迭代更新,直至所述分类误差小于或等于分类误差阈值为止,得到所述初始检索模型的中间参数;The classification error is compared with a preset classification error threshold; if the classification error is greater than the classification error threshold, the initial parameters of the initial retrieval model are iteratively updated until the classification error is less than or equal to the classification error threshold, obtaining the intermediate parameters of the initial retrieval model;

计算所述语料排序结果和预设的标准排序结果之间的误差,得到排序误差;Calculate the error between the corpus sorting result and the preset standard sorting result to obtain the sorting error;

将排序误差与预设排序误差阈值进行比较,若所述排序误差大于排序误差阈值,则对所述初始检索模型的中间参数进行迭代更新,直至所述排序误差小于或等于排序误差阈值为止,得到训练完成的检索模型。The sorting error is compared with a preset sorting error threshold; if the sorting error is greater than the sorting error threshold, the intermediate parameters of the initial retrieval model are iteratively updated until the sorting error is less than or equal to the sorting error threshold, obtaining the trained retrieval model.

具体的,服务器先基于初始检索模型的损失函数计算语料分类结果和预设的标准分类结果之间的误差,得到分类误差,将分类误差与预设分类误差阈值进行比较,若分类误差大于分类误差阈值,则对初始检索模型的初始参数进行第一次迭代更新,直至分类误差小于或等于分类误差阈值为止,得到初始检索模型的中间参数。然后服务器再基于初始检索模型的损失函数计算语料排序结果和预设的标准排序结果之间的误差,得到排序误差,将排序误差与预设排序误差阈值进行比较,若排序误差大于排序误差阈值,则对初始检索模型的中间参数进行第二次迭代更新,直至排序误差小于或等于排序误差阈值为止,得到训练完成的检索模型。Specifically, the server first calculates the error between the corpus classification result and the preset standard classification result based on the loss function of the initial retrieval model to obtain the classification error, and compares it with the preset classification error threshold; if the classification error is greater than the threshold, the initial parameters of the initial retrieval model are iteratively updated in a first round until the classification error is less than or equal to the classification error threshold, yielding the intermediate parameters of the initial retrieval model. The server then calculates the error between the corpus sorting result and the preset standard sorting result based on the loss function of the initial retrieval model to obtain the sorting error, and compares it with the preset sorting error threshold; if the sorting error is greater than the threshold, the intermediate parameters of the initial retrieval model are iteratively updated in a second round until the sorting error is less than or equal to the sorting error threshold, yielding the trained retrieval model.
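上述两阶段迭代更新的控制流程可抽象为如下示例(误差函数、更新函数与阈值均为演示用的假设,实际由初始检索模型的损失函数与反向传播给出)。The control flow of the two-stage iterative update above can be abstracted as the following sketch; the error functions, update function and thresholds are illustrative assumptions, with the real ones given by the loss function and back-propagation of the initial retrieval model.

```python
def fit_until(error_fn, update_fn, params, threshold, max_iter=1000):
    # 反复迭代: 误差大于阈值则更新参数, 直至误差 <= 阈值或达到迭代上限
    for _ in range(max_iter):
        if error_fn(params) <= threshold:
            break
        params = update_fn(params)
    return params

def two_stage_fit(cls_err, rank_err, update_fn, init_params, cls_th, rank_th):
    # 第一次迭代: 按分类误差更新初始参数, 得到中间参数
    intermediate = fit_until(cls_err, update_fn, init_params, cls_th)
    # 第二次迭代: 按排序误差更新中间参数, 得到训练完成的参数
    return fit_until(rank_err, update_fn, intermediate, rank_th)
```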

进一步地,在所述计算所述语料分类结果和预设的标准分类结果之间的误差,得到分类误差的步骤之前,还包括:Further, before the step of calculating the error between the corpus classification result and the preset standard classification result and obtaining the classification error, it also includes:

获取所述分类器的损失函数,得到第一损失函数,以及获取所述排序模型的损失函数,得到第二损失函数;Obtain the loss function of the classifier to obtain the first loss function, and obtain the loss function of the sorting model to obtain the second loss function;

基于交叉熵结合的方式分别计算所述第一损失函数和所述第二损失函数的权重,得到第一权重和第二权重;Calculate the weights of the first loss function and the second loss function respectively based on the combination of cross-entropy, to obtain the first weight and the second weight;

基于所述第一权重和所述第二权重对所述第一损失函数和所述第二损失函数进行加权求和,得到所述初始检索模型的损失函数。The first loss function and the second loss function are weighted and summed based on the first weight and the second weight to obtain a loss function of the initial retrieval model.

其中,多任务模型的好处在于共用编码器,减少编码开支,在实际应用场景中,不同任务关注的特征分布是不一样的,在训练共用编码器时,不同的子任务可以根据自己的任务更新共用编码器的参数,从而让共用编码器共同学习多个子任务共同的特征,提升共用编码器的编码能力。但是由于多任务的模型结构较难搭建,以及模型损失函数loss不好融合的特点,导致多任务模型难以得到大规模应用。Among them, the advantage of a multi-task model lies in sharing the encoder, which reduces encoding overhead. In practical application scenarios, the feature distributions that different tasks focus on differ; when training the shared encoder, each subtask can update the parameters of the shared encoder according to its own task, so that the shared encoder jointly learns the features common to the multiple subtasks, improving its encoding ability. However, because multi-task model structures are difficult to build and the loss functions of the subtasks are difficult to fuse, multi-task models are hard to apply on a large scale.

在本申请具体的实施例中,初始检索模型基于MMoE模型框架进行构建,并基于交叉熵结合方法搭建初始检索模型的损失函数,保证在多任务处理的情况下,检索模型能够朝同一个拟合方向拟合,防止模型拟合混乱。其中,交叉熵函数经常用于构建损失函数。在本申请具体的实施例中,采用交叉熵结合triplet loss的方式,进行加权平均,分类器的loss占比40%,排序模型的loss占比60%,得到检索模型的loss,开始训练后将检索模型的loss降低至0.00012以下,且在验证集和测试集的准确率超过90%,即可完成训练。需要说明的是,初始检索模型的损失函数用于计算分类误差和排序误差,以便对初始检索模型进行迭代训练。In a specific embodiment of the present application, the initial retrieval model is constructed based on the MMoE model framework, and its loss function is built by combining cross-entropy terms, ensuring that under multi-task processing the retrieval model fits in the same direction and avoiding chaotic fitting. The cross-entropy function is frequently used to construct loss functions. In a specific embodiment of the present application, cross-entropy is combined with triplet loss in a weighted average, with the classifier loss weighted at 40% and the ranking model loss at 60%, to obtain the loss of the retrieval model; training is complete when the loss of the retrieval model drops below 0.00012 and the accuracy on the validation and test sets exceeds 90%. It should be noted that the loss function of the initial retrieval model is used to calculate the classification error and the sorting error so as to iteratively train the initial retrieval model.
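按文中比例(分类器loss占40%,排序模型loss占60%)组合交叉熵与triplet loss的加权方式可用如下最小示例说明(margin取1.0为假设取值)。The weighted combination of cross-entropy and triplet loss in the stated 40%/60% proportion can be illustrated as follows; the margin of 1.0 is an assumed value.

```python
import math

def binary_cross_entropy(p, y):
    # 二分类交叉熵: p 为预测为正类的概率, y 为 0/1 标签
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def triplet_loss(d_pos, d_neg, margin=1.0):
    # d_pos: 训练语料与近似语料的距离, d_neg: 与非近似语料的距离
    # 希望 d_pos + margin < d_neg, 否则产生损失
    return max(0.0, d_pos - d_neg + margin)

def retrieval_loss(ce, tri, w_cls=0.4, w_rank=0.6):
    # 分类器 loss 占 40%, 排序模型 loss 占 60%, 加权求和得到检索模型的 loss
    return w_cls * ce + w_rank * tri
```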

在上述实施例中,本申请公开了一种信息检索的方法,属于人工智能技术领域。本申请通过一个共用的编码层对输入语料进行特征编码,然后将特征编码分别输入到初始检索模型的分类任务层和排序任务层,通过分类任务层对输入语料进行分类,获得语料分类结果,同时通过排序任务层对输入语料进行排序,获得语料排序结果,最后基于语料分类结果和语料排序结果对初始检索模型的参数进行迭代更新,当初始检索模型的参数拟合时,得到训练完成的检索模型;当需要进行语料检索时,将待检索语料导入训练好的检索模型,得到待检索语料的检索结果。本申请通过对训练语料以及训练语料的近似语料和非近似语料进行分类和排序,并利用分类结果和排序结果训练一个具有共享编码层的语料检索模型,使得该语料检索模型的共享编码层可以在多种应用场景下使用,提高检索模型适用性;通过训练语料以及训练语料的近似语料和非近似语料共同训练语料检索模型,可以在极少训练语料的情况下训练一个语料检索模型,在保证模型精度的同时,降低训练资源的消耗。In the above embodiments, the present application discloses an information retrieval method, which belongs to the technical field of artificial intelligence. The present application performs feature encoding on the input corpus through a shared encoding layer, then inputs the feature codes into the classification task layer and the sorting task layer of the initial retrieval model respectively: the classification task layer classifies the input corpus to obtain the corpus classification result, while the sorting task layer sorts the input corpus to obtain the corpus sorting result. Finally, the parameters of the initial retrieval model are iteratively updated based on the corpus classification result and the corpus sorting result; when the parameters of the initial retrieval model are fitted, the trained retrieval model is obtained. When corpus retrieval is required, the corpus to be retrieved is imported into the trained retrieval model to obtain its retrieval result. By classifying and sorting the training corpus together with its approximate and non-approximate corpora, and using the classification and sorting results to train a corpus retrieval model with a shared encoding layer, the shared encoding layer of the resulting model can be used in a variety of application scenarios, improving the applicability of the retrieval model; by jointly training the corpus retrieval model on the training corpus and its approximate and non-approximate corpora, a corpus retrieval model can be trained with very little training corpus, reducing the consumption of training resources while ensuring model accuracy.

需要强调的是,为进一步保证上述待检索语料的私密和安全性,上述待检索语料还可以存储于一区块链的节点中。It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned corpus to be retrieved, the above-mentioned corpus to be retrieved may also be stored in a node of a blockchain.

本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association using cryptographic methods, where each data block contains a batch of network transaction information, used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.

本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,该计算机可读指令可存储于一计算机可读取存储介质中,该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。其中,前述的存储介质可为磁碟、光盘、只读存储记忆体(Read-Only Memory,ROM)等非易失性存储介质,或随机存储记忆体(Random Access Memory,RAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing the relevant hardware through computer-readable instructions, which can be stored in a computer-readable storage medium; when the computer-readable instructions are executed, they may include the processes of the above method embodiments. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM).

应该理解的是,虽然附图的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,其可以以其他的顺序执行。而且,附图的流程图中的至少一部分步骤可以包括多个子步骤或者多个阶段,这些子步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,其执行顺序也不必然是依次进行,而是可以与其他步骤或者其他步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。It should be understood that although the steps in the flowchart of the accompanying drawings are displayed sequentially as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in the flowchart may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is also not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.

进一步参考图3,作为对上述图2所示方法的实现,本申请提供了一种信息检索的装置的一个实施例,该装置实施例与图2所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。Referring further to FIG. 3, as an implementation of the method shown in FIG. 2 above, the present application provides an embodiment of an apparatus for information retrieval. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.

如图3所示,本实施例所述的信息检索的装置包括:As shown in FIG. 3 , the apparatus for information retrieval described in this embodiment includes:

语料获取模块301,用于获取训练语料,并在预设的语料库中获取所述训练语料的近似语料和非近似语料;Corpus acquisition module 301, used for acquiring training corpus, and acquiring approximate corpus and non-approximate corpus of the training corpus in a preset corpus;

语料导入模块302,用于将所述训练语料、所述近似语料和所述非近似语料导入预设的初始多任务检索模型,其中,所述初始多任务检索模型包括编码层、分类任务层和排序任务层;The corpus import module 302 is used to import the training corpus, the approximate corpus and the non-approximate corpus into a preset initial multi-task retrieval model, wherein the initial multi-task retrieval model includes an encoding layer, a classification task layer and a sorting task layer;

特征编码模块303,用于通过所述编码层对所述训练语料进行特征编码得到第一特征向量,通过所述编码层对所述近似语料进行特征编码得到第二特征向量,以及通过所述编码层对所述非近似语料进行特征编码,得到第三特征向量;The feature encoding module 303 is configured to perform feature encoding on the training corpus through the encoding layer to obtain a first feature vector, perform feature encoding on the approximate corpus through the encoding layer to obtain a second feature vector, and perform feature encoding on the non-approximate corpus through the encoding layer to obtain a third feature vector;

特征分类模块304,用于将所述第一特征向量、所述第二特征向量和所述第三特征向量导入所述分类任务层,得到语料分类结果;The feature classification module 304 is used to import the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result;

特征排序模块305,用于将所述第一特征向量、所述第二特征向量和所述第三特征向量导入所述排序任务层,得到语料排序结果;A feature sorting module 305, configured to import the first feature vector, the second feature vector and the third feature vector into the sorting task layer to obtain a corpus sorting result;

模型迭代模块306,用于基于所述语料分类结果和所述语料排序结果对所述初始检索模型进行迭代更新,得到训练完成的检索模型;A model iteration module 306, configured to iteratively update the initial retrieval model based on the corpus classification result and the corpus sorting result to obtain a trained retrieval model;

信息检索模块307,用于接收语料检索指令,获取待检索语料,并将所述待检索语料导入所述检索模型,生成所述待检索语料的检索结果。The information retrieval module 307 is configured to receive the corpus retrieval instruction, obtain the corpus to be retrieved, import the corpus to be retrieved into the retrieval model, and generate a retrieval result of the corpus to be retrieved.

Further, the corpus acquisition module 301 specifically includes:

a cosine similarity calculation unit, configured to obtain the training corpus and calculate the cosine similarity between the training corpus and each sample corpus in the corpus, obtaining corpus similarities;

an approximate corpus acquisition unit, configured to screen corpora similar to the training corpus based on the corpus similarities and a preset first similarity threshold, obtaining the approximate corpus;

a non-approximate corpus acquisition unit, configured to screen corpora that are not similar to the training corpus based on the corpus similarities and a preset second similarity threshold, obtaining the non-approximate corpus.
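The screening performed by these three units can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the 0.8/0.3 default thresholds, and the assumption that corpora have already been vectorized are all choices made for the example.

```python
import numpy as np

def screen_corpora(train_vec, sample_vecs, first_threshold=0.8, second_threshold=0.3):
    """Split sample corpora into approximate / non-approximate sets by cosine
    similarity to the training-corpus vector. Thresholds are illustrative."""
    sims = sample_vecs @ train_vec / (
        np.linalg.norm(sample_vecs, axis=1) * np.linalg.norm(train_vec))
    approximate = np.where(sims >= first_threshold)[0]       # first similarity threshold
    non_approximate = np.where(sims <= second_threshold)[0]  # second similarity threshold
    return approximate, non_approximate, sims
```

For example, with a training vector [1, 0], samples [1, 0] and [0.9, 0.1] pass the first threshold, while [0, 1] falls below the second and is treated as non-approximate.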

Further, the information retrieval device further includes:

a word segmentation module, configured to perform word segmentation on the training corpus, the approximate corpus and the non-approximate corpus respectively, obtaining a first text segmentation, a second text segmentation and a third text segmentation;

a stop word removal module, configured to remove stop words from the first text segmentation, the second text segmentation and the third text segmentation.
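As one possible sketch of these two preprocessing modules: the stop-word list and whitespace tokenizer below are stand-ins for the example only; a real pipeline for Chinese text would typically plug in a dedicated segmenter such as jieba via the `tokenizer` argument.

```python
# Illustrative stop-word list; a production system would load a full list.
STOP_WORDS = {"的", "了", "是", "the", "a", "of"}

def segment_and_filter(text, tokenizer=None):
    """Tokenize one corpus entry and drop stop words. Whitespace splitting is
    a stand-in; pass a Chinese segmenter for real corpora."""
    tokens = tokenizer(text) if tokenizer else text.split()
    return [t for t in tokens if t not in STOP_WORDS]
```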

Further, the feature classification module 304 specifically includes:

a feature extraction unit, configured to perform feature extraction on the first feature vector, the second feature vector and the third feature vector respectively, obtaining a first feature, a second feature and a third feature;

a feature classification unit, configured to input the first feature, the second feature and the third feature into the classifier to generate the corpus classification result.
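A toy illustration of a classifier head over the extracted features. The linear-softmax form and the parameter names `W` and `b` are assumptions for the example; the patent does not specify the classifier's internals.

```python
import numpy as np

def classify(features, W, b):
    """Toy linear classifier head: map a batch of extracted features to class
    probabilities and return the arg-max label for each corpus entry."""
    logits = features @ W + b
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    probs = exp / exp.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs
```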

Further, the feature sorting module 305 specifically includes:

a first distance calculation unit, configured to calculate the Euclidean distance between the first feature vector and the second feature vector, obtaining a first Euclidean distance;

a second distance calculation unit, configured to calculate the Euclidean distance between the second feature vector and the third feature vector, obtaining a second Euclidean distance;

a feature sorting unit, configured to input the first Euclidean distance and the second Euclidean distance into the sorting model to generate the corpus sorting result.
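The two distance units and the sorting unit can be sketched as follows. Returning the pairs ranked by ascending distance is one plausible reading of the sorting step and is not specified by the patent.

```python
import numpy as np

def corpus_ranking(v1, v2, v3):
    """Compute the two Euclidean distances the sorting task layer consumes and
    rank the pairs: a smaller distance means a closer, higher-ranked pair."""
    d12 = np.linalg.norm(v1 - v2)  # first Euclidean distance
    d23 = np.linalg.norm(v2 - v3)  # second Euclidean distance
    order = sorted([("first", d12), ("second", d23)], key=lambda p: p[1])
    return d12, d23, order
```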

Further, the model iteration module 306 specifically includes:

a classification error calculation unit, configured to calculate the error between the corpus classification result and a preset standard classification result, obtaining a classification error;

a first iteration unit, configured to compare the classification error with a preset classification error threshold and, if the classification error is greater than the classification error threshold, iteratively update the initial parameters of the initial retrieval model until the classification error is less than or equal to the classification error threshold, obtaining intermediate parameters of the initial retrieval model;

a sorting error calculation unit, configured to calculate the error between the corpus sorting result and a preset standard sorting result, obtaining a sorting error;

a second iteration unit, configured to compare the sorting error with a preset sorting error threshold and, if the sorting error is greater than the sorting error threshold, iteratively update the intermediate parameters of the initial retrieval model until the sorting error is less than or equal to the sorting error threshold, obtaining the trained retrieval model.
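A schematic of this two-stage schedule: first iterate on the classification error until it reaches its threshold (yielding the intermediate parameters), then continue iterating on the sorting error until it reaches its threshold. The callback signatures, thresholds, and iteration cap are assumptions for illustration.

```python
def two_stage_training(model, cls_error_fn, sort_error_fn, update_fn,
                       cls_threshold=0.05, sort_threshold=0.05, max_iters=1000):
    """Sketch of the two-stage fit: stage 1 drives the classification error to
    its threshold, stage 2 then drives the sorting error to its threshold."""
    for _ in range(max_iters):                       # stage 1: classification error
        if cls_error_fn(model) <= cls_threshold:
            break                                    # intermediate parameters reached
        update_fn(model)
    for _ in range(max_iters):                       # stage 2: sorting error
        if sort_error_fn(model) <= sort_threshold:
            break                                    # trained retrieval model reached
        update_fn(model)
    return model
```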

Further, the model iteration module 306 also includes:

a loss function acquisition unit, configured to obtain the loss function of the classifier as a first loss function, and obtain the loss function of the sorting model as a second loss function;

a weight calculation unit, configured to calculate the weights of the first loss function and the second loss function respectively in a cross-entropy-combination manner, obtaining a first weight and a second weight;

a weighted summation unit, configured to perform a weighted summation of the first loss function and the second loss function based on the first weight and the second weight, obtaining the loss function of the initial retrieval model.
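The weighted summation itself reduces to the following. The fixed default weights stand in for the cross-entropy-derived first and second weights, whose exact computation the patent does not detail.

```python
def combine_losses(loss_cls, loss_sort, w1=0.5, w2=0.5):
    """Weighted sum of the classification (first) loss and the sorting
    (second) loss, giving the single training objective of the initial
    retrieval model. Default weights are placeholders."""
    return w1 * loss_cls + w2 * loss_sort
```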

In the above embodiments, the present application discloses an information retrieval device, which belongs to the technical field of artificial intelligence. The present application performs feature encoding on the input corpus through a shared encoding layer, then feeds the feature codes into the classification task layer and the sorting task layer of the initial retrieval model respectively: the classification task layer classifies the input corpus to obtain a corpus classification result, while the sorting task layer sorts the input corpus to obtain a corpus sorting result. Finally, the parameters of the initial retrieval model are iteratively updated based on the corpus classification result and the corpus sorting result; when the parameters of the initial retrieval model converge, the trained retrieval model is obtained. When corpus retrieval is required, the corpus to be retrieved is imported into the trained retrieval model to obtain its retrieval result. By classifying and sorting the input corpus, and using the classification and sorting results to train a retrieval model with a shared encoding layer, the trained retrieval model can be used in a variety of application scenarios, which improves the applicability of the retrieval model and reduces the consumption of training resources.

To solve the above technical problems, the embodiments of the present application also provide a computer device. Please refer to FIG. 4 for details; FIG. 4 is a block diagram of the basic structure of the computer device according to this embodiment.

The computer device 4 includes a memory 41, a processor 42 and a network interface 43 that are communicatively connected to each other through a system bus. It should be pointed out that only the computer device 4 with components 41-43 is shown in the figure, but it should be understood that not all of the shown components are required to be implemented; more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes but is not limited to a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server or other computing equipment. The computer device can interact with the user through a keyboard, a mouse, a remote control, a touch pad or a voice control device.

The memory 41 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card equipped on the computer device 4. Of course, the memory 41 may also include both the internal storage unit of the computer device 4 and its external storage device. In this embodiment, the memory 41 is generally used to store the operating system and various application software installed on the computer device 4, such as computer-readable instructions of the method for information retrieval. In addition, the memory 41 can also be used to temporarily store various types of data that have been output or are to be output.

The processor 42 may, in some embodiments, be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the computer-readable instructions stored in the memory 41 or to process data, for example to execute the computer-readable instructions of the method for information retrieval.

The network interface 43 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 4 and other electronic devices.

The present application discloses a computer device, which belongs to the technical field of artificial intelligence. The present application performs feature encoding on the input corpus through a shared encoding layer, then feeds the feature codes into the classification task layer and the sorting task layer of the initial retrieval model respectively: the classification task layer classifies the input corpus to obtain a corpus classification result, while the sorting task layer sorts the input corpus to obtain a corpus sorting result. Finally, the parameters of the initial retrieval model are iteratively updated based on the corpus classification result and the corpus sorting result; when the parameters of the initial retrieval model converge, the trained retrieval model is obtained. When corpus retrieval is required, the corpus to be retrieved is imported into the trained retrieval model to obtain its retrieval result. By classifying and sorting the training corpus together with its approximate and non-approximate corpora, and using the classification and sorting results to train a corpus retrieval model with a shared encoding layer, the shared encoding layer of the resulting model can be used in a variety of application scenarios, improving the applicability of the retrieval model. Training the corpus retrieval model jointly on the training corpus and its approximate and non-approximate corpora also makes it possible to train a corpus retrieval model with very little training data, reducing the consumption of training resources while maintaining model accuracy.

The present application also provides another embodiment, namely a computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor, so as to cause the at least one processor to perform the steps of the method for information retrieval described above.

The present application discloses a storage medium, which belongs to the technical field of artificial intelligence. The present application performs feature encoding on the input corpus through a shared encoding layer, then feeds the feature codes into the classification task layer and the sorting task layer of the initial retrieval model respectively: the classification task layer classifies the input corpus to obtain a corpus classification result, while the sorting task layer sorts the input corpus to obtain a corpus sorting result. Finally, the parameters of the initial retrieval model are iteratively updated based on the corpus classification result and the corpus sorting result; when the parameters of the initial retrieval model converge, the trained retrieval model is obtained. When corpus retrieval is required, the corpus to be retrieved is imported into the trained retrieval model to obtain its retrieval result. By classifying and sorting the training corpus together with its approximate and non-approximate corpora, and using the classification and sorting results to train a corpus retrieval model with a shared encoding layer, the shared encoding layer of the resulting model can be used in a variety of application scenarios, improving the applicability of the retrieval model. Training the corpus retrieval model jointly on the training corpus and its approximate and non-approximate corpora also makes it possible to train a corpus retrieval model with very little training data, reducing the consumption of training resources while maintaining model accuracy.

From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk or optical disc) and includes several instructions to cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the various embodiments of this application.

The present application may be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices. The application may be described in the general context of computer-executable instructions, such as program modules, executed by a computer. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

Obviously, the embodiments described above are only some of the embodiments of the present application, not all of them. The accompanying drawings show preferred embodiments of the present application but do not limit its patent scope. This application may be embodied in many different forms; rather, these embodiments are provided so that the disclosure of this application will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific embodiments, or make equivalent replacements for some of their technical features. Any equivalent structure made using the contents of the specification and drawings of the present application, whether used directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present application.

Claims (10)

1. A method for information retrieval, comprising:
acquiring a training corpus, and acquiring an approximate corpus and a non-approximate corpus of the training corpus from a preset corpus;
importing the training corpus, the approximate corpus and the non-approximate corpus into a preset initial multi-task retrieval model, wherein the initial multi-task retrieval model comprises an encoding layer, a classification task layer and a sorting task layer;
performing feature encoding on the training corpus through the encoding layer to obtain a first feature vector, performing feature encoding on the approximate corpus through the encoding layer to obtain a second feature vector, and performing feature encoding on the non-approximate corpus through the encoding layer to obtain a third feature vector;
importing the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result;
importing the first feature vector, the second feature vector and the third feature vector into the sorting task layer to obtain a corpus sorting result;
iteratively updating the initial retrieval model based on the corpus classification result and the corpus sorting result to obtain a trained retrieval model;
receiving a corpus retrieval instruction, acquiring a corpus to be retrieved, importing the corpus to be retrieved into the trained retrieval model, and generating a retrieval result of the corpus to be retrieved.

2. The method for information retrieval according to claim 1, wherein the step of acquiring a training corpus and acquiring an approximate corpus and a non-approximate corpus of the training corpus from a preset corpus specifically comprises:
acquiring the training corpus, and calculating the cosine similarity between the training corpus and each sample corpus in the corpus to obtain corpus similarities;
screening corpora similar to the training corpus based on the corpus similarities and a preset first similarity threshold to obtain the approximate corpus;
screening corpora not similar to the training corpus based on the corpus similarities and a preset second similarity threshold to obtain the non-approximate corpus.

3. The method for information retrieval according to claim 1, further comprising, before the step of importing the training corpus, the approximate corpus and the non-approximate corpus into a preset initial multi-task retrieval model:
performing word segmentation on the training corpus, the approximate corpus and the non-approximate corpus respectively to obtain a first text segmentation, a second text segmentation and a third text segmentation;
removing stop words from the first text segmentation, the second text segmentation and the third text segmentation.

4. The method for information retrieval according to claim 1, wherein the step of importing the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result specifically comprises:
performing feature extraction on the first feature vector, the second feature vector and the third feature vector respectively to obtain a first feature, a second feature and a third feature;
inputting the first feature, the second feature and the third feature into the classifier to generate the corpus classification result.

5. The method for information retrieval according to claim 1, wherein the step of importing the first feature vector, the second feature vector and the third feature vector into the sorting task layer to obtain a corpus sorting result specifically comprises:
calculating the Euclidean distance between the first feature vector and the second feature vector to obtain a first Euclidean distance;
calculating the Euclidean distance between the second feature vector and the third feature vector to obtain a second Euclidean distance;
inputting the first Euclidean distance and the second Euclidean distance into the sorting model to generate the corpus sorting result.

6. The method for information retrieval according to any one of claims 1 to 5, wherein the step of iteratively updating the initial retrieval model based on the corpus classification result and the corpus sorting result to obtain a trained retrieval model specifically comprises:
calculating the error between the corpus classification result and a preset standard classification result to obtain a classification error;
comparing the classification error with a preset classification error threshold and, if the classification error is greater than the classification error threshold, iteratively updating the initial parameters of the initial retrieval model until the classification error is less than or equal to the classification error threshold, to obtain intermediate parameters of the initial retrieval model;
calculating the error between the corpus sorting result and a preset standard sorting result to obtain a sorting error;
comparing the sorting error with a preset sorting error threshold and, if the sorting error is greater than the sorting error threshold, iteratively updating the intermediate parameters of the initial retrieval model until the sorting error is less than or equal to the sorting error threshold, to obtain the trained retrieval model.

7. The method for information retrieval according to claim 6, further comprising, before the step of calculating the error between the corpus classification result and a preset standard classification result to obtain a classification error:
obtaining the loss function of the classifier as a first loss function, and obtaining the loss function of the sorting model as a second loss function;
calculating the weights of the first loss function and the second loss function respectively in a cross-entropy-combination manner to obtain a first weight and a second weight;
performing a weighted summation of the first loss function and the second loss function based on the first weight and the second weight to obtain the loss function of the initial retrieval model.

8. A device for information retrieval, comprising:
a corpus acquisition module, configured to acquire a training corpus and acquire an approximate corpus and a non-approximate corpus of the training corpus from a preset corpus;
a corpus import module, configured to import the training corpus, the approximate corpus and the non-approximate corpus into a preset initial multi-task retrieval model, wherein the initial multi-task retrieval model comprises an encoding layer, a classification task layer and a sorting task layer;
a feature encoding module, configured to perform feature encoding on the training corpus through the encoding layer to obtain a first feature vector, perform feature encoding on the approximate corpus through the encoding layer to obtain a second feature vector, and perform feature encoding on the non-approximate corpus through the encoding layer to obtain a third feature vector;
a feature classification module, configured to import the first feature vector, the second feature vector and the third feature vector into the classification task layer to obtain a corpus classification result;
a feature sorting module, configured to import the first feature vector, the second feature vector and the third feature vector into the sorting task layer to obtain a corpus sorting result;
a model iteration module, configured to iteratively update the initial retrieval model based on the corpus classification result and the corpus sorting result to obtain a trained retrieval model;
an information retrieval module, configured to receive a corpus retrieval instruction, acquire a corpus to be retrieved, import the corpus to be retrieved into the trained retrieval model, and generate a retrieval result of the corpus to be retrieved.

9. A computer device, comprising a memory and a processor, wherein computer-readable instructions are stored in the memory, and the processor, when executing the computer-readable instructions, implements the steps of the method for information retrieval according to any one of claims 1 to 7.

10. A computer-readable storage medium, wherein computer-readable instructions are stored on the computer-readable storage medium, and the computer-readable instructions, when executed by a processor, implement the steps of the method for information retrieval according to any one of claims 1 to 7.
CN202210027851.XA (filed 2022-01-11, priority 2022-01-11): Information retrieval method and device, computer equipment and storage medium. Status: Pending. Publication: CN114385814A (en)

Priority Applications (1)

Application Number: CN202210027851.XA (CN114385814A (en)); Priority Date: 2022-01-11; Filing Date: 2022-01-11; Title: Information retrieval method and device, computer equipment and storage medium


Publications (1)

Publication Number | Publication Date
CN114385814A (en) | 2022-04-22

Family

ID=81201821

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status | Publication
CN202210027851.XA | Information retrieval method and device, computer equipment and storage medium | 2022-01-11 | 2022-01-11 | Pending | CN114385814A (en)

Country Status (1)

Country | Link
CN (1) | CN114385814A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112148877A (en) * | 2020-09-23 | 2020-12-29 | NetEase (Hangzhou) Network Co., Ltd. | Corpus text processing method and device and electronic equipment
WO2021135910A1 (en) * | 2020-06-24 | 2021-07-08 | Ping An Technology (Shenzhen) Co., Ltd. | Machine reading comprehension-based information extraction method and related device
CN113378970A (en) * | 2021-06-28 | 2021-09-10 | Ping An Puhui Enterprise Management Co., Ltd. | Sentence similarity detection method and device, electronic equipment and storage medium


Similar Documents

Publication | Title
CN113434636B (en) | Semantic-based approximate text searching method, semantic-based approximate text searching device, computer equipment and medium
US11062089B2 (en) | Method and apparatus for generating information
CN112231569A (en) | News recommendation method and device, computer equipment and storage medium
CN112863683A (en) | Medical record quality control method and device based on artificial intelligence, computer equipment and storage medium
CN113505601A (en) | Positive and negative sample pair construction method and device, computer equipment and storage medium
CN114357117A (en) | Transaction information query method, device, computer equipment and storage medium
CN112686022A (en) | Method and device for detecting illegal corpus, computer equipment and storage medium
CN112632278A (en) | Labeling method, device, equipment and storage medium based on multi-label classification
CN110619051A (en) | Question and sentence classification method and device, electronic equipment and storage medium
CN115510232B (en) | Text sentence classification method and classification device, electronic device and storage medium
CN111767375A (en) | Semantic recall method and device, computer equipment and storage medium
CN117874234A (en) | Text classification method and device based on semantics, computer equipment and storage medium
CN112084752A (en) | Statement marking method, device, equipment and storage medium based on natural language
CN113722438A (en) | Sentence vector generation method and device based on sentence vector model and computer equipment
CN117807482B (en) | Method, device, equipment and storage medium for classifying customs clearance notes
CN112598039A (en) | Method for acquiring positive sample in NLP classification field and related equipment
CN113723077B (en) | Sentence vector generation method and device based on bidirectional characterization model and computer equipment
CN112199954A (en) | Disease entity matching method and device based on voice semantics and computer equipment
CN115438149A (en) | End-to-end model training method and device, computer equipment and storage medium
CN115238009A (en) | Metadata management method, device and equipment based on blood vessel margin analysis and storage medium
CN114091452A (en) | Adapter-based transfer learning method, device, equipment and storage medium
CN115730597A (en) | Multi-level semantic intention recognition method and related equipment thereof
CN117333889A (en) | Training method and device for document detection model and electronic equipment
CN114780809A (en) | Knowledge pushing method, device, equipment and storage medium based on reinforcement learning
CN114090792A (en) | Document relation extraction method based on contrastive learning and related equipment

Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2022-04-22
