CN118364062A


Info

Publication number: CN118364062A
Application number: CN202310096566.8A
Authority: CN
Inventors: 许晏铭; 陈小帅; 亓超
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2023-01-17
Filing date: 2023-01-17
Publication date: 2024-07-19

Abstract

The application discloses a reply sentence generation method, apparatus, device and storage medium, and relates to the technical field of artificial intelligence. The method comprises: acquiring an input dialogue, wherein the input dialogue comprises a dialogue generated by a first character and a second character; for a current input sentence corresponding to the first character in the input dialogue, acquiring, through a customized dialogue generation model, a first-type reply sentence corresponding to the current input sentence, wherein the customized dialogue generation model has learned dialogue features of the second character; acquiring, through the customized dialogue generation model, at least one second-type reply sentence corresponding to the current input sentence according to the input dialogue and at least one piece of prompt information, wherein the prompt information is used to guide generation of the reply sentence; and determining a final reply sentence corresponding to the current input sentence from the first-type reply sentence and the at least one second-type reply sentence. Because the prompt information guides the model in generating reply sentences and the final reply sentence is selected from the generated reply sentences, the generation stability of reply sentences is improved.

Description


Reply sentence generation method, apparatus, device and storage medium

Technical Field

The embodiments of the present application relate to the field of artificial intelligence technology, and in particular to a reply sentence generation method, apparatus, device and storage medium.

Background

Character-customized dialogue refers to the process of building, for a given character (such as a film or television character), a dialogue system that matches that character's dialogue features, and using it to generate dialogue. With the development of artificial intelligence technology, artificial intelligence can be effectively applied to character-customized dialogue.

In the related art, a general dialogue generation model obtained through large-scale pre-training is further trained on dialogue data that conforms to a certain character's dialogue style, yielding a customized dialogue generation model for that character. The customized dialogue generation model then replies to the user's input sentences in the character's dialogue style.

However, when replies are generated from the user's input sentences alone, the model has an insufficient understanding of the dialogue background, the input sentences and so on, which makes the generation of reply sentences unstable.

Summary of the Invention

The embodiments of the present application provide a reply sentence generation method, apparatus, device and storage medium that can improve the generation stability of reply sentences. The technical solution is as follows:

According to one aspect of the embodiments of the present application, a reply sentence generation method is provided, the method comprising:

acquiring an input dialogue, the input dialogue comprising a dialogue generated by a first character and a second character, the first character and the second character being different;

for a current input sentence corresponding to the first character in the input dialogue, acquiring, through a customized dialogue generation model and according to the input dialogue, a first-type reply sentence corresponding to the current input sentence, wherein the customized dialogue generation model has learned dialogue features of the second character;

acquiring, through the customized dialogue generation model and according to the input dialogue and at least one piece of prompt information, at least one second-type reply sentence corresponding to the current input sentence, wherein the prompt information is information generated according to the input dialogue and used to guide generation of a reply sentence;

determining, from the first-type reply sentence and the at least one second-type reply sentence, a final reply sentence corresponding to the current input sentence.

According to one aspect of the embodiments of the present application, a reply sentence generation apparatus is provided, the apparatus comprising:

an input dialogue acquisition module, configured to acquire an input dialogue, the input dialogue comprising a dialogue generated by a first character and a second character, the first character and the second character being different;

a candidate reply acquisition module, configured to acquire, for a current input sentence corresponding to the first character in the input dialogue, a first-type reply sentence corresponding to the current input sentence through a customized dialogue generation model and according to the input dialogue, wherein the customized dialogue generation model has learned dialogue features of the second character;

the candidate reply acquisition module being further configured to acquire, through the customized dialogue generation model and according to the input dialogue and at least one piece of prompt information, at least one second-type reply sentence corresponding to the current input sentence, wherein the prompt information is information generated according to the input dialogue and used to guide generation of a reply sentence;

a final reply acquisition module, configured to determine, from the first-type reply sentence and the at least one second-type reply sentence, a final reply sentence corresponding to the current input sentence.

According to one aspect of the embodiments of the present application, a computer device is provided, comprising a processor and a memory, wherein a computer program is stored in the memory, and the computer program is loaded and executed by the processor to implement the above reply sentence generation method.

According to one aspect of the embodiments of the present application, a computer-readable storage medium is provided, in which a computer program is stored, and the computer program is loaded and executed by a processor to implement the above reply sentence generation method.

According to one aspect of the embodiments of the present application, a computer program product is provided, the computer program product comprising a computer program stored in a computer-readable storage medium. A processor of a computer device reads the computer program from the computer-readable storage medium and executes it, so that the computer device performs the above reply sentence generation method.

The technical solution provided by the embodiments of the present application may have the following beneficial effects:

A customized dialogue generation model that has learned the dialogue features of the second character generates second-type reply sentences, which are guided by prompt information, and a first-type reply sentence, which is not. The final reply sentence for the first character's current input sentence in the input dialogue is then determined from the first-type and second-type reply sentences. Because the prompt information guides the generation of reply sentences, the model can generate the second-type reply sentences stably; determining the final reply sentence from the second-type reply sentences in combination with the first-type reply sentence therefore helps improve the generation stability of reply sentences.

Brief Description of the Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an implementation environment of a solution provided by an embodiment of the present application;

FIG. 2 is a flowchart of a reply sentence generation method provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of an input dialogue provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of a customized dialogue generation model provided by an embodiment of the present application;

FIG. 5 is a flowchart of a method for generating second-type reply sentences provided by an embodiment of the present application;

FIG. 6 is a flowchart of a method for obtaining a candidate reply sentence set provided by an embodiment of the present application;

FIG. 7 is a flowchart of a method for determining a final reply sentence provided by an embodiment of the present application;

FIG. 8 is a flowchart of a method for training a customized dialogue generation model provided by an embodiment of the present application;

FIG. 9 is a flowchart of a method for training a ranking model provided by an embodiment of the present application;

FIG. 10 is a schematic diagram of the relationship between a general dialogue generation model, a customized dialogue generation model and a ranking model provided by an embodiment of the present application;

FIG. 11 is a schematic diagram of an input dialogue provided by another embodiment of the present application;

FIG. 12 is a block diagram of a reply sentence generation apparatus provided by an embodiment of the present application;

FIG. 13 is a block diagram of a reply sentence generation apparatus provided by another embodiment of the present application;

FIG. 14 is a block diagram of a computer device provided by an embodiment of the present application.

Detailed Description

To make the objectives, technical solutions and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.

Artificial intelligence (AI) is the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology in computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that machines can perceive, reason and make decisions.

Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, with both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Machine learning (ML) is an interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all areas of artificial intelligence. Machine learning and deep learning usually include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and learning from demonstration.

The technical solution provided by the embodiments of the present application involves the machine learning technology of artificial intelligence. A customized dialogue generation model and a ranking model are trained using the pre-training plus fine-tuning approach, and a discriminant model is trained using supervised training. The discriminant model can be used to expand the training data; the customized dialogue generation model can be used to generate candidate reply sentences (such as first-type and second-type reply sentences); and the ranking model can be used to rank the candidate reply sentences to obtain the final reply sentence.

The technical solution provided by the embodiments of the present application is applicable to any scenario that requires reply sentence generation, such as character-customized dialogue, automatic generation of novels and scripts, voice navigation, intelligent question answering, simulated learning, autonomous driving and games. The technical solution provided by the embodiments of the present application can improve the generation accuracy of reply sentences.

Please refer to FIG. 1, which shows a schematic diagram of an implementation environment of a solution provided by an embodiment of the present application. The implementation environment may include a terminal device 10 and a server 20.

The terminal device 10 may be an electronic device such as a smartphone, a tablet computer, a game console, a multimedia player, a PC (personal computer), a vehicle-mounted terminal, an intelligent robot, a wearable device or a smart TV. A client of a target application may be installed on the terminal device 10, such as a client of a character-customized dialogue application (for example, dialogue customized to film and television characters), an intelligent question-answering application, a navigation application, a game application, a social application or an interactive entertainment application.

The server 20 provides background services for the client of the target application (such as a character-customized dialogue application) on the terminal device 10. For example, the server 20 may be the background server of the target application. The server 20 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server or cloud computing service center that provides cloud computing services.

The terminal device 10 and the server 20 can communicate with each other through a network, which may be a wired network or a wireless network.

As an example, consider a client of a character-customized dialogue application. The user selects a first character for himself or herself and a second character for the client. The client acquires an input dialogue, which includes the user's historical input sentences and current input sentence as well as the historical reply sentences returned by the client, and sends the input dialogue to the server. Through the customized dialogue generation model, the server acquires a first-type reply sentence corresponding to the current input sentence according to the input dialogue, and acquires at least one second-type reply sentence corresponding to the current input sentence according to the input dialogue and at least one piece of prompt information, where the prompt information is information generated according to the input dialogue and used to guide generation of a reply sentence. The server then uses the ranking model to determine the final reply sentence corresponding to the current input sentence from the first-type reply sentence and the at least one second-type reply sentence, and sends the final reply sentence to the client, which displays it.
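The server-side flow described above can be sketched as follows. This is only a minimal illustration under stated assumptions: `generate` and `score` are hypothetical callables standing in for the customized dialogue generation model and the ranking model, and placing the prompt on a line before the dialogue is an assumed splicing format, not one given in the source.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-ins for the trained models described in the text.
Generator = Callable[[str], str]      # text in, reply sentence out
Scorer = Callable[[str, str], float]  # (input dialogue, reply) -> score


def generate_final_reply(
    input_dialogue: str,
    prompts: Sequence[str],
    generate: Generator,
    score: Scorer,
) -> str:
    """Return the highest-scoring candidate reply for the input dialogue."""
    # First-type reply: generated from the input dialogue alone.
    candidates: List[str] = [generate(input_dialogue)]
    # Second-type replies: one per prompt, generated from prompt + dialogue.
    for prompt in prompts:
        candidates.append(generate(prompt + "\n" + input_dialogue))
    # The ranking step selects the final reply among all candidates.
    return max(candidates, key=lambda reply: score(input_dialogue, reply))


# Toy models, for illustration only.
def toy_generate(text: str) -> str:
    return "guided reply" if text.startswith("[hint]") else "plain reply"


def toy_score(dialogue: str, reply: str) -> float:
    return 1.0 if reply.startswith("guided") else 0.0


print(generate_final_reply("A: Hello", ["[hint] polite tone"], toy_generate, toy_score))
```

Keeping the unguided candidate in the same pool as the guided ones mirrors the text: the final reply is chosen from both types rather than from the prompted replies only.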

Optionally, the client may also obtain the final reply sentence on its own. For example, after acquiring the input dialogue, the client obtains the first-type reply sentence and at least one second-type reply sentence through the customized dialogue generation model, determines the final reply sentence corresponding to the current input sentence from them through the ranking model, and finally displays the final reply sentence to the user. The embodiments of the present application do not limit this.

The technical solution of the present application is described below through method embodiments.

Please refer to FIG. 2, which shows a flowchart of a reply sentence generation method provided by an embodiment of the present application. Each step of the method may be performed by a computer device, such as the terminal device 10 or the server 20 in the implementation environment shown in FIG. 1. The method may include the following steps (201 to 204).

Step 201: acquire an input dialogue, where the input dialogue includes a dialogue generated by a first character and a second character, and the first character and the second character are different.

A dialogue is a conversation between two or more different objects. A dialogue may include multiple pairs of questions and reply sentences (question-answer pairs for short); in its initial state, a dialogue may include only a question from one object.

The input dialogue in the embodiments of the present application may be a dialogue expressed in text form. In the embodiments of the present application, different objects may have different character identities, such as that of a character in a film or television work, a character in a novel, or a character in a script.

In one example, the first character is the character corresponding to the user, and the second character is the character corresponding to the client. For example, the user asks questions as the first character and the client replies as the second character, thereby generating a dialogue between the first character and the second character.

In another example, both the first character and the second character correspond to the client, and the client automatically generates a dialogue between them based on the dialogue features of the first character and the dialogue features of the second character.

Optionally, the first character may be associated with the second character; for example, they may be different characters in the same film or television work. The first character may also be unassociated with the second character; for example, the first character may be a character in film or television work 1 and the second character a character in film or television work 2. The input dialogue may also include dialogue data corresponding to other characters (such as a third character or a fourth character), which is not limited in the embodiments of the present application. The dialogue data may include questions and reply sentences.

In one example, the input dialogue ends with the current input sentence of the first character. The current input sentence may be the question raised by the first character at the current moment. The dialogue data that precedes the current input sentence in the input dialogue may be called historical dialogue data.

As an example, referring to FIG. 3, an input dialogue 300 can be obtained from the dialogue shown in FIG. 3. The input dialogue 300 may include the historical dialogue between a first character A and a second character B, and the current input sentence 301 corresponding to the first character A. The sentences corresponding to the first character A are entered by the user, and the sentences corresponding to the second character B are generated by the client. The client needs to generate a reply sentence for the current input sentence 301. Optionally, the input dialogue 300 is constructed by concatenating the turns in dialogue order, each in the form of character name + dialogue data, such as A: what A said\B: what B said\A: what A said\..., and so on.
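The construction of the input dialogue text described above can be sketched as follows. The function name and the example utterances are hypothetical; the backslash separator follows the pattern shown in the text, and is exposed as a parameter because the source presents that form only as an option.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (character name, utterance)


def build_input_dialogue(turns: List[Turn], separator: str = "\\") -> str:
    """Serialize turns in dialogue order as 'name: utterance' segments."""
    if not turns:
        raise ValueError("the dialogue needs at least one turn")
    return separator.join(f"{name}: {text}" for name, text in turns)


turns = [
    ("A", "Good morning."),
    ("B", "Good morning to you."),
    ("A", "Where are you going today?"),  # current input sentence
]
print(build_input_dialogue(turns))
# prints: A: Good morning.\B: Good morning to you.\A: Where are you going today?
```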

Step 202: for the current input sentence corresponding to the first character in the input dialogue, acquire, through the customized dialogue generation model and according to the input dialogue, a first-type reply sentence corresponding to the current input sentence, where the customized dialogue generation model has learned the dialogue features of the second character.

The customized dialogue generation model is used to generate reply sentences that match the dialogue features of a certain character. It can be built on a deep learning network such as a Transformer (a network that relies on a self-attention mechanism to transform representations between input and output) or a ViT (Vision Transformer). As an example, referring to FIG. 4, the customized dialogue generation model 401 can be built from Transformer decoders: the model 401 includes multiple transformation modules, each corresponding to one Transformer decoder. The embodiments of the present application do not limit the number of transformation modules, which can be set and adjusted according to actual needs.

Optionally, the customized dialogue generation model in the embodiments of the present application can generate reply sentences for multiple different characters; that is, it supports dialogue customization for multiple characters. The training process of the customized dialogue generation model is described in detail below and is not repeated here. The dialogue features include at least one of the following: dialogue style, dialogue emotion, dialogue tone and preferred dialogue words.

The first-type reply sentence is the reply sentence that the customized dialogue generation model generates for the current input sentence without the guidance of prompt information. For example, when the current input sentence is a question about the basic persona, the customized dialogue generation model can stably obtain the corresponding reply sentence from the input dialogue alone. Such questions may concern the basic persona of the second character, such as preferences, name and family members.

In one example, referring to FIG. 4, the first-type reply sentence can be generated as follows. Obtain the word embeddings 402 of the input dialogue, for example using BERT (Bidirectional Encoder Representations from Transformers), Word2Vec or Doc2Vec. For the word embedding of each word in the input dialogue, fuse in, in turn, the position information, sentence information and character information corresponding to that word, to obtain the input word embeddings 403 of the input dialogue. The position information of a word indicates the position of the word in the input dialogue. The sentence information of a word indicates the type of the sentence that contains the word, such as background, reply sentence, or dialogue data between the background and the reply sentence. The character information of a word identifies the character that produced the sentence containing the word (for example, a character identifier). Processing the input word embeddings 403 with the customized dialogue generation model 401 yields the first-type reply sentence corresponding to the current input sentence.

By fusing the position information, sentence information and character information of each word into its word embedding, the embodiments of the present application help the customized dialogue generation model understand the input dialogue better, thereby improving the generation stability of reply sentences.
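The fusion of word, position, sentence and character information can be sketched as follows. The source says only that the information is fused; this sketch assumes element-wise addition of four equally sized vectors, the scheme used for BERT-style input embeddings. The lookup tables, their keys and the function name are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import Dict, List

Vector = List[float]


def fuse_embeddings(
    tokens: List[str],
    sentence_types: List[str],
    characters: List[str],
    word_table: Dict[str, Vector],
    position_table: List[Vector],
    sentence_table: Dict[str, Vector],
    character_table: Dict[str, Vector],
) -> List[Vector]:
    """Add position, sentence-type and character vectors to each word vector."""
    fused = []
    for i, token in enumerate(tokens):
        parts = (
            word_table[token],                  # word embedding
            position_table[i],                  # position in the input dialogue
            sentence_table[sentence_types[i]],  # type of the containing sentence
            character_table[characters[i]],     # character who produced it
        )
        # Assumed fusion rule: element-wise sum of the four vectors.
        fused.append([sum(values) for values in zip(*parts)])
    return fused


fused = fuse_embeddings(
    tokens=["hi"],
    sentence_types=["dialogue"],
    characters=["A"],
    word_table={"hi": [1.0, 0.0]},
    position_table=[[0.5, 0.5]],
    sentence_table={"dialogue": [0.0, 2.0]},
    character_table={"A": [1.0, 1.0]},
)
print(fused)  # [[2.5, 3.5]]
```

Addition keeps the embedding dimension unchanged, so the fused vectors can be fed to the decoder stack without changing its input size; concatenation followed by a projection would be another plausible reading of "fuse".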

Step 203: acquire, through the customized dialogue generation model and according to the input dialogue and at least one piece of prompt information, at least one second-type reply sentence corresponding to the current input sentence, where the prompt information is information generated according to the input dialogue and used to guide generation of a reply sentence.

The second-type reply sentence is the reply sentence that the customized dialogue generation model generates for the current input sentence under the guidance of prompt information. The prompt information may be an input form and template designed for the reply sentence generation task, which helps the customized dialogue generation model recall training data it encountered during training.

In the embodiments of the present application, the prompt information may be a prompt word or prompt sentence that indicates the dialogue features of the second character, one that indicates the dialogue background of the input dialogue, or one that indicates both, which is not limited in the embodiments of the present application. The dialogue background may refer to the weather, time, place, environment and so on of the input dialogue, and the dialogue features include at least one of the following: dialogue style, dialogue emotion, dialogue tone and preferred dialogue words.

In one example, referring to FIG. 5, step 203 may include the following sub-steps:

Step 203a: for a target piece of prompt information among the at least one piece of prompt information, splice the target prompt information with the input dialogue to obtain a spliced text.

The target prompt information may be any one of the at least one piece of prompt information. The spliced text is the text obtained by splicing the text of the input dialogue with the text of the target prompt information.

在一个示例中，在目标提示信息为对话特征的情况下，将对话特征拼接到第二角色在输入对话中对应的语句中，得到拼接文本。In one example, when the target prompt information is a dialogue feature, the dialogue feature is spliced into a corresponding sentence of the second character in the input dialogue to obtain a spliced text.

例如，参考图6，获取第一角色A和第二角色B之间的输入对话601，在需要添加对话特征的情况下，将第一角色A对应的对话特征(如对话偏好词：陛下)拼接到第一角色在输入对话601中对应的语句中(如当前输入语句)，得到拼接文本604，再通过定制对话生成模型对拼接文本604进行处理，得到第二类回复语句606。For example, referring to Figure 6, an input dialogue 601 between a first character A and a second character B is obtained. If dialogue features need to be added, the dialogue features corresponding to the first character A (such as the dialogue preference word: Your Majesty) are spliced into the corresponding sentence of the first character in the input dialogue 601 (such as the current input sentence) to obtain a spliced text 604, and then the spliced text 604 is processed by a customized dialogue generation model to obtain a second type of reply sentence 606.

可选地，本申请针对第一角色和第二角色设置有提示信息列表，可以根据提示信息列中的提示信息与输入对话之间的相似度、编辑距离等，来确定对话特征对应的提示词或提示语句。Optionally, the present application provides a prompt information list for the first character and the second character, and can determine the prompt word or prompt sentence corresponding to the dialogue feature based on the similarity, edit distance, etc. between the prompt information in the prompt information list and the input dialogue.
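Selecting a prompt from the prompt information list by similarity can be sketched as follows. This is a minimal sketch, assuming the character-level similarity ratio from Python's standard `difflib` module as the similarity measure; the function name `pick_prompt` and the example prompts are hypothetical.

```python
from difflib import SequenceMatcher


def pick_prompt(prompt_list, input_dialogue):
    """Return the prompt in the list that is most similar to the input dialogue."""
    def similarity(prompt):
        # ratio() is in [0, 1]; higher means more similar
        return SequenceMatcher(None, prompt, input_dialogue).ratio()
    return max(prompt_list, key=similarity)


prompts = ["Your Majesty", "my lord", "dear friend"]
dialogue = "A: Have you eaten? B: Yes, Your Majesty."
chosen = pick_prompt(prompts, dialogue)
print(chosen)
```

An edit-distance measure could be substituted for `similarity` without changing the rest of the sketch.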

在另一个示例中，在目标提示信息为对话背景的情况下，将对话背景拼接到输入对话的头部，得到拼接文本。In another example, when the target prompt information is a conversation background, the conversation background is spliced to the header of the input conversation to obtain a spliced text.

例如，参考图6，在需要添加对话背景的情况下，将输入对话601对应的对话背景(如：天气微凉，A和B在后花园中散步)拼接到输入对话601的头部，得到拼接文本603，再通过定制对话生成模型对拼接文本603进行处理，得到第二类回复语句605。可选地，可以将图6中的第一类回复语句602、第二类回复语句605和第二类回复语句606进行组合，得到候选回复语句集合。For example, referring to FIG6, when it is necessary to add a dialogue background, the dialogue background corresponding to the input dialogue 601 (such as: the weather is slightly cool, A and B are walking in the back garden) is spliced to the head of the input dialogue 601 to obtain a spliced text 603, and then the spliced text 603 is processed by the customized dialogue generation model to obtain a second type of reply sentence 605. Optionally, the first type of reply sentence 602, the second type of reply sentence 605, and the second type of reply sentence 606 in FIG6 can be combined to obtain a candidate reply sentence set.

The prompt word or prompt sentence for the dialogue background may be determined according to the setting in which the input dialogue takes place. By adding the dialogue background or dialogue features to the input dialogue, the embodiment of the present application helps the customized dialogue generation model better understand the input dialogue, thereby improving the stability of reply sentence generation.
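The two splicing modes of step 203a can be sketched as follows. This is an illustrative sketch only: the textual layout (one sentence per line, the feature written in parentheses after the role name) and the function names are assumptions, and the feature is attached to the sentences of the second role as stated in the rule above.

```python
def splice_background(background, turns):
    """Prepend the dialogue background to the head of the input dialogue."""
    head = [background]
    body = [f"{role}: {text}" for role, text in turns]
    return "\n".join(head + body)


def splice_feature(feature, turns, second_role):
    """Attach a dialogue feature to every sentence of the second role."""
    lines = []
    for role, text in turns:
        if role == second_role:
            lines.append(f"{role} ({feature}): {text}")
        else:
            lines.append(f"{role}: {text}")
    return "\n".join(lines)


turns = [("A", "Shall we walk?"), ("B", "Gladly."), ("A", "Is it cold?")]
background = "The weather is slightly cool; A and B walk in the back garden."
with_background = splice_background(background, turns)
with_feature = splice_feature("Your Majesty", turns, second_role="B")
print(with_background)
print(with_feature)
```

Each spliced text would then be passed to the customized dialogue generation model to obtain one second-type reply sentence.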

步骤203b，通过定制对话生成模型对拼接文本进行处理，得到当前输入语句在目标提示信息的引导下生成的第二类回复语句。Step 203b, processing the concatenated text by using a customized dialogue generation model to obtain a second type of reply sentence generated by the current input sentence under the guidance of the target prompt information.

可选地，先获取拼接文本的词嵌入。对于拼接文本中每个词，将词对应的位置信息、句子信息和角色信息，与词对应的词嵌入进行融合，得到定制对话生成模型对应的输入词嵌入；通过定制对话生成模型对输入词嵌入进行处理，得到当前输入语句在目标提示信息的引导下生成的第二类回复语句。定制对话生成模型对拼接文本的处理方法，与定制对话生成模型对输入对话的处理方法相同，本申请实施例未说明的内容，可以参考上述实施例介绍的内容，这里不再赘述。Optionally, first obtain the word embedding of the concatenated text. For each word in the concatenated text, fuse the position information, sentence information, and role information corresponding to the word with the word embedding corresponding to the word to obtain the input word embedding corresponding to the customized dialogue generation model; process the input word embedding through the customized dialogue generation model to obtain the second type of reply sentence generated by the current input sentence under the guidance of the target prompt information. The processing method of the concatenated text by the customized dialogue generation model is the same as the processing method of the input dialogue by the customized dialogue generation model. For the content not described in the embodiments of the present application, please refer to the content introduced in the above embodiments, which will not be repeated here.

In one feasible example, the prompt information may also be represented as vectors, in which case the vectors corresponding to the prompt information are simply concatenated with the word embeddings of the input dialogue. For example, if the prompt information is represented by 3 vectors, each of dimension dim = 1024, a matrix Wp of size 3×1024 can be introduced to represent the prompt information. The prompt information can then be adjusted merely by adjusting the parameters of Wp, which improves the generalization of the prompt information.
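The vector form of the prompt information can be sketched as follows. The sketch assumes that the three prompt vectors are placed in front of the word embeddings of the input dialogue; the description only states that they are concatenated, so the position, the initialization scale, and the sequence length of 12 are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_PROMPT_VECTORS, DIM = 3, 1024

# Wp: the trainable 3 x 1024 matrix representing the prompt information.
W_p = rng.normal(scale=0.02, size=(NUM_PROMPT_VECTORS, DIM))

# Hypothetical word embeddings of an input dialogue with 12 words.
dialogue_embeddings = rng.normal(size=(12, DIM))

# Concatenate along the sequence axis (assumed: prompt vectors first).
model_input = np.concatenate([W_p, dialogue_embeddings], axis=0)
print(model_input.shape)
```

Only `W_p` would need to be updated to change the prompt; the word embeddings of the dialogue are left untouched.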

Step 204: determine the final reply sentence corresponding to the current input sentence from the first-type reply sentence and the at least one second-type reply sentence.

最终回复语句是指用于回复当前输入语句的回复语句。第一类回复语句和至少一个第二类回复语句可以统称为候选回复语句。The final reply sentence refers to a reply sentence used to reply to the current input sentence. The first type of reply sentence and at least one second type of reply sentence can be collectively referred to as candidate reply sentences.

在一个示例中，参考图7，步骤204可以包括如下几个子步骤：In one example, referring to FIG. 7 , step 204 may include the following sub-steps:

Step 204a: obtain, through the ranking model, the relevance score of the first-type reply sentence and the relevance score of each second-type reply sentence, where the relevance score indicates how relevant a reply sentence is to the second role.

Optionally, the ranking model processes the word embeddings of a candidate reply sentence to obtain the relevance score of that candidate. The ranking model may be built on a deep learning network such as Transformer or ViT. Compared with the customized dialogue generation model, the ranking model additionally includes a classifier (for example, a binary classifier) that judges, from the output representation of the candidate reply sentence in the ranking model, whether the candidate is relevant to the second role. The higher the relevance score, the more relevant the candidate reply sentence is to the second role, for example because its dialogue features better match those of the second role. The training process of the ranking model is described in detail below and is not repeated here.

In one example, after the relevance scores are obtained, a post-processing strategy may be applied to adjust them, so as to make the relevance scores more faithful and easier to tell apart, thereby improving the accuracy and stability with which the final reply sentence is determined.

Exemplarily, when a candidate reply sentence contains an entity word whose quality score is greater than or equal to a quality threshold, the relevance score of that candidate reply sentence is increased. The candidate reply sentences include the first-type reply sentence and each of the second-type reply sentences described above.

质量评分用于表示实体词对对话内容的理解所起到的作用的重要程度，实体词的质量评分越高，该实体词对对话内容的理解所起到的作用越大。质量评分可以通过训练完成的模型来获取，也可以根据统计规则来获取，本申请实施例对此不作限定。质量阈值可以根据实际使用需求进行设置与调整。The quality score is used to indicate the importance of the entity word in understanding the content of the conversation. The higher the quality score of the entity word, the greater the role played by the entity word in understanding the content of the conversation. The quality score can be obtained through the trained model or according to statistical rules, which is not limited in the embodiments of the present application. The quality threshold can be set and adjusted according to actual usage requirements.

Optionally, a table lookup may be used to determine the score increment of the entity word (each entity word in the table has a corresponding score increment), and the sum of this score increment and the relevance score of the candidate reply sentence is taken as the adjusted relevance score of the candidate reply sentence. For example, score increments may range from 0.1 to 0.5. If the score increment is 0.2 and the relevance score of the candidate reply sentence is 0.7, then 0.2 + 0.7 = 0.9 is taken as the adjusted relevance score of the candidate reply sentence.

在另一个示例中，在候选回复语句与角色高频短词之间的相似度大于或等于相似度阈值的情况下，提高候选回复语句对应的相关性评分，该角色高频短词用于表示第二角色的对话特征。In another example, when the similarity between the candidate reply sentence and a character's high-frequency short word is greater than or equal to a similarity threshold, the relevance score corresponding to the candidate reply sentence is increased, and the character's high-frequency short word is used to represent the dialogue characteristics of the second character.

可选地，该角色高频短词可以是指用于表示第二角色的对话风格、对话情感和对话语气等的短词，该短词的出现频率大于或等于频率阈值。该角色高频短词也可以是指第二角色对应的对话偏好词。Optionally, the character high-frequency short word may refer to a short word used to represent the second character's dialogue style, dialogue emotion, and dialogue tone, and the appearance frequency of the short word is greater than or equal to the frequency threshold. The character high-frequency short word may also refer to a dialogue preference word corresponding to the second character.

Optionally, a table lookup may be used to determine the score increment of the candidate reply sentence from the similarity: the higher the similarity, the larger the increment. The sum of this score increment and the relevance score of the candidate reply sentence is then taken as the adjusted relevance score of the candidate reply sentence.

可选地，还可以同时采用上述两种后处理策略对相关性评分进行调整，以得到候选回复语句对应的调整后的相关性评分。Optionally, the above two post-processing strategies may be simultaneously used to adjust the relevance score to obtain an adjusted relevance score corresponding to the candidate reply sentence.
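The two post-processing strategies can be sketched together as follows. All table contents are hypothetical: the entity table is assumed to list only entity words that already meet the quality threshold, the similarity steps are invented for illustration, and the `difflib` ratio stands in for whatever similarity measure is actually used.

```python
from difflib import SequenceMatcher

# Hypothetical lookup table: entity word -> score increment (within 0.1 to 0.5).
ENTITY_INCREMENTS = {"imperial garden": 0.2, "palace": 0.1}
# Hypothetical steps: (minimum similarity, score increment), highest first.
SIMILARITY_STEPS = [(0.8, 0.3), (0.5, 0.1)]


def entity_bonus(reply):
    """Largest increment among the table's entity words found in the reply."""
    return max((inc for word, inc in ENTITY_INCREMENTS.items() if word in reply),
               default=0.0)


def style_bonus(reply, role_short_words):
    """Increment looked up from the best similarity to any role high-frequency short word."""
    best = max((SequenceMatcher(None, reply, w).ratio() for w in role_short_words),
               default=0.0)
    for threshold, inc in SIMILARITY_STEPS:
        if best >= threshold:
            return inc
    return 0.0


def adjust_score(score, reply, role_short_words):
    """Apply both post-processing strategies to one relevance score."""
    return score + entity_bonus(reply) + style_bonus(reply, role_short_words)


adjusted = adjust_score(0.7, "a walk in the imperial garden", [])
print(round(adjusted, 2))
```

With a relevance score of 0.7 and an entity increment of 0.2, the sketch reproduces the arithmetic of the example above (0.7 + 0.2).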

Step 204b: determine, as the final reply sentence, the reply sentence with the highest relevance score among the first-type reply sentence and the second-type reply sentences.

Optionally, the first-type reply sentence and the second-type reply sentences are sorted in descending order of relevance score to obtain a reply sentence sequence, and the first reply sentence in the sequence is determined as the final reply sentence.
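The selection in step 204b amounts to a descending sort followed by taking the first element, which can be sketched as follows. The candidate texts and scores are invented for illustration, and the function name is hypothetical.

```python
def select_final_reply(candidates):
    """candidates: list of (reply_sentence, relevance_score) pairs."""
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return ranked[0][0], ranked


candidates = [
    ("first-type reply", 0.7),
    ("second-type reply (background prompt)", 0.9),
    ("second-type reply (feature prompt)", 0.8),
]
final_reply, ranked = select_final_reply(candidates)
print(final_reply)
```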

In summary, in the technical solution provided by the embodiments of the present application, a customized dialogue generation model that has learned the dialogue features of the second role generates second-type reply sentences guided by prompt information, as well as a first-type reply sentence generated without prompt guidance. The final reply sentence for the first role's current input sentence in the input dialogue is then determined from the first-type and second-type reply sentences. Because the prompt information guides generation, the model can stably generate the second-type reply sentences; determining the final reply sentence from the second-type reply sentences together with the first-type reply sentence helps improve the stability of reply sentence generation.

In addition, adding the dialogue background or dialogue features to the input dialogue, and fusing the position information, sentence information, and role information of each word into its word embedding, helps the customized dialogue generation model better understand the input dialogue, thereby improving the stability of reply sentence generation.

In addition, adjusting the relevance scores with a post-processing strategy makes the relevance scores more faithful and easier to tell apart, which helps improve the accuracy and stability with which the final reply sentence is determined.

请参考图8，其示出了本申请一个实施例提供的定制对话生成模型的训练方法的流程图，该方法各步骤的执行主体可以是计算机设备，如图1所示方案实施环境中的终端设备10或服务器20，该方法可以包括如下几个步骤(801～803)。Please refer to Figure 8, which shows a flowchart of a training method for a customized dialogue generation model provided by an embodiment of the present application. The execution subject of each step of the method can be a computer device, such as the terminal device 10 or the server 20 in the implementation environment of the solution shown in Figure 1. The method may include the following steps (801~803).

步骤801，获取通用对话生成模型，该通用对话生成模型是通过对话语料数据预训练得到的。Step 801: Obtain a general dialogue generation model, where the general dialogue generation model is obtained by pre-training with dialogue corpus data.

通用对话生成模型的网络结构与定制对话生成模型的网络结构相同，但通用对话生成模型的网络参数与定制对话生成模型的网络参数不同。上述对话语料数据可以是指基于普通的对话生成的文本数据，如一个完整的对话或者一个完整的对话中的部分对话可以作为对话语料数据。该对话语料数据没有针对角色进行筛选，其具有普适性。The network structure of the general dialogue generation model is the same as that of the customized dialogue generation model, but the network parameters of the general dialogue generation model are different from those of the customized dialogue generation model. The above-mentioned dialogue corpus data may refer to text data generated based on ordinary dialogues, such as a complete dialogue or a part of a complete dialogue can be used as dialogue corpus data. The dialogue corpus data is not screened for roles and is universal.

In one example, a supervised training method may be used to train the general dialogue generation model on the dialogue corpus data to obtain a trained general dialogue generation model. Exemplarily, the word embeddings of the dialogue corpus data are obtained first, and the word embeddings of one reply sentence in the dialogue corpus data are masked. The general dialogue generation model then predicts, from the remaining word embeddings of the dialogue corpus data, a prediction result for that reply sentence. The training loss of the general dialogue generation model (for example, cross-entropy loss, focal loss, or mean squared error loss) is determined from the difference between the word embeddings of the reply sentence and the prediction result, and the general dialogue generation model is trained iteratively with the goal of minimizing the training loss to obtain the trained general dialogue generation model.

In another example, an unsupervised training method may be used, in which the training loss of the general dialogue generation model is calculated from the prediction results corresponding to the reply sentences. For example, the NLL loss (negative log-likelihood loss) may be used to calculate the training loss of the general dialogue generation model from these prediction results, and the calculation can be expressed as follows:

$$\mathcal{L}_{\mathrm{NLL}} = -\sum_{t=1}^{T} \log p(r_t \mid c, r_{<t})$$

Here, c is the input dialogue, T is the total number of reply sentences in the input dialogue, r_t is the t-th reply sentence in the input dialogue, r_{<t} denotes the reply sentences before the t-th reply sentence, and p(r_t | c, r_{<t}) denotes the prediction result corresponding to r_t.
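The negative log-likelihood over the T reply sentences, that is, the negated sum of log p(r_t | c, r_{<t}), can be computed as in the following sketch. The probabilities are invented for illustration; in practice each one would be produced by the model for the t-th reply sentence given the input dialogue and the preceding reply sentences.

```python
import math


def nll_loss(reply_probabilities):
    """Negative log-likelihood: minus the sum of log p(r_t | c, r_<t) over t = 1..T."""
    return -sum(math.log(p) for p in reply_probabilities)


# Hypothetical model probabilities for T = 3 reply sentences.
loss = nll_loss([0.5, 0.25, 0.8])
print(loss)
```

The loss is zero only when every reply sentence is predicted with probability 1, and it grows as the predicted probabilities fall.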

步骤802，从第二角色对应的关联文本数据中，解析出训练样本集合，该训练样本集合包括带有第二角色对应的背景数据的第一类训练样本，以及不带有背景数据的第二类训练样本，训练样本集合中的训练样本由对话数据构成。Step 802, parse out a training sample set from the associated text data corresponding to the second role, the training sample set including a first type of training samples with background data corresponding to the second role and a second type of training samples without background data, the training samples in the training sample set being composed of dialogue data.

该训练样本集合中的训练样本用于对上述通用对话生成模型的网络参数进行微调，以使得述通用对话生成模型学习到第二角色的对话特征，以及对背景数据的理解能力，从而得到可以稳定生成回复语句的定制对话生成模型。The training samples in the training sample set are used to fine-tune the network parameters of the general dialogue generation model so that the general dialogue generation model can learn the dialogue characteristics of the second character and the ability to understand background data, thereby obtaining a customized dialogue generation model that can stably generate response sentences.

关联文本数据是指与第二角色相关联的文本数据。在本申请实施例中，该关联文本数据包括原始文本数据和扩展文本数据。其中，原始文本数据包括与第二角色相关的小说和剧本，扩展文本数据可以包括与第二角色相关的百科、论坛、影视论评、讲解、旁白等文本数据，其是对原始文本数据的补充。第二角色对应的背景数据即为输入对话的背景数据。The associated text data refers to the text data associated with the second character. In the embodiment of the present application, the associated text data includes original text data and extended text data. Among them, the original text data includes novels and scripts related to the second character, and the extended text data may include encyclopedias, forums, film and television reviews, explanations, narrations and other text data related to the second character, which is a supplement to the original text data. The background data corresponding to the second character is the background data of the input dialogue.

在一个示例中，关联文本数据的获取过程可以如下：In one example, the process of obtaining the associated text data may be as follows:

1、根据对话提取规则，从原始文本数据中提取第一类训练样本。1. According to the dialogue extraction rules, extract the first type of training samples from the original text data.

对话提取规则用于对话提取，可以根据提取得到的对话的对话数据(即文本数据)，以及背景数据构建第一类训练样本。The dialogue extraction rule is used for dialogue extraction, and a first type of training samples can be constructed based on the dialogue data (ie, text data) of the extracted dialogue and background data.

示例性地，对于剧本，由于剧本数据格式统一，会有清晰的旁白和对话结构，则可以根据旁白和对话结构构建对话提取规则，以进行对话的提取。For example, for scripts, since the script data format is unified and there will be clear narration and dialogue structure, dialogue extraction rules can be constructed based on the narration and dialogue structure to extract dialogue.

For novels, the data format is complex, so dialogues cannot be extracted effectively with ordinary dialogue extraction rules. To address this, the embodiment of the present application uses the scene as the segmentation granularity. A complete scene includes background data and the dialogue between roles; optionally, the background data may be empty. The screening logic is as follows: if the dialogue of a scene includes dialogue content of the second role, a first-type training sample is constructed from the dialogue and background data of that scene. Optionally, when a scene has multiple pieces of background data, the piece of background data with the smallest edit distance to the second role may be determined as the background data of that scene.

可选地，参与对话的角色也可以提取出来，如提取参与对话的角色的角色名称，并将其添加至第一类训练样本中。在一些可行示例中，角色也可以采用编码(或词嵌入)进行表示，如角色A采用1表示，角色B采用2表示，其他角色采用3表示。Optionally, the characters participating in the conversation can also be extracted, such as extracting the names of the characters participating in the conversation and adding them to the first type of training samples. In some feasible examples, the characters can also be represented by encoding (or word embedding), such as character A is represented by 1, character B is represented by 2, and other characters are represented by 3.

可选地，由于定制对话生成模型无法集成所有的角色的对话特征，可以针对角色的重要程度，对第一类训练样本进行筛选，保留重要程度高的角色对应的第一类训练样本，重要程度低的角色对应的第一类训练样本可作为训练数据的补充，以在有需求的情况拿来使用。Optionally, since the customized dialogue generation model cannot integrate the dialogue features of all characters, the first category of training samples can be screened according to the importance of the characters, and the first category of training samples corresponding to the characters with high importance are retained. The first category of training samples corresponding to the characters with low importance can be used as a supplement to the training data for use when needed.

示例性地，上述第一类训练样本可以表示如下：背景数据\t角色A：A的对话内容\t角色B：B的对话内容\t……\t角色A：A的对话内容\t。其中，\t为结束符号，表示一句话结束。For example, the first type of training samples can be represented as follows: background data\tRole A: A's dialogue content\tRole B: B's dialogue content\t…\tRole A: A's dialogue content\t. Among them, \t is an end symbol, indicating the end of a sentence.

2、基于第一类训练样本构建判别模型的正样本，以及基于与第二角色不相关的其他文本数据构建判别模型的负样本。2. Constructing positive samples of the discriminant model based on the first type of training samples, and constructing negative samples of the discriminant model based on other text data unrelated to the second role.

The network structure of the discriminant model may be the same as that of the ranking model described above, but the task of the discriminant model is to retrieve extended text data related to the second role.

Optionally, the first-type training samples may be used directly as positive samples of the discriminant model, and third-type training samples that are unrelated to the second role may be extracted from other text data and used as negative samples of the discriminant model. The number of positive samples is the same as the number of negative samples.

3、根据判别模型的正样本和负样本，对判别模型进行训练，得到训练完成的判别模型。3. According to the positive samples and negative samples of the discriminant model, the discriminant model is trained to obtain a trained discriminant model.

可选地，采用有监督训练方法，根据判别模型的正样本和负样本，对判别模型进行迭代训练，得到训练完成的判别模型。Optionally, a supervised training method is used to iteratively train the discriminant model based on positive samples and negative samples of the discriminant model to obtain a trained discriminant model.

4、通过训练完成的判别模型，从文本数据库中获取扩展文本数据。4. Obtain extended text data from the text database through the trained discriminant model.

通过训练完成的判别模型，获取文本数据库中各个文本数据分别对应的判别结果，将判别结果大于或等于判别阈值的文本数据，确定为扩展文本数据。The discrimination results corresponding to each text data in the text database are obtained through the trained discrimination model, and the text data with the discrimination results greater than or equal to the discrimination threshold are determined as the extended text data.

可选地，在从扩展文本数据中抽取训练样本的过程中，对于扩展文本数据中有背景数据的，则保留背景数据在训练样本中，也即为第一类训练样本；对于扩展文本数据中没有背景数据的，则将其对应的训练样本的背景数据设置为空，也即为第二类训练样本。Optionally, in the process of extracting training samples from extended text data, if there is background data in the extended text data, the background data is retained in the training samples, that is, the first type of training samples; if there is no background data in the extended text data, the background data of the corresponding training samples is set to empty, that is, the second type of training samples.
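Constructing training samples from scenes can be sketched as follows. This is a minimal sketch under stated assumptions: the scene structure (a dictionary with `background` and `turns`), the function names, and the example scenes are hypothetical; the serialized layout follows the tab-terminated format given above, and a sample counts as first-type when its background data is non-empty and second-type otherwise.

```python
def build_sample(background, turns):
    # Every sentence, including the background data, is terminated by a tab.
    parts = [background] + [f"{role}: {text}" for role, text in turns]
    return "".join(part + "\t" for part in parts)


def sample_type(background):
    # First type: carries background data; second type: background is empty.
    return "first" if background else "second"


def scenes_to_samples(scenes, second_role):
    samples = []
    for scene in scenes:
        speakers = {role for role, _ in scene["turns"]}
        if second_role in speakers:  # keep only scenes in which the second role speaks
            samples.append((sample_type(scene["background"]),
                            build_sample(scene["background"], scene["turns"])))
    return samples


scenes = [
    {"background": "Cool weather, A and B walk in the back garden",
     "turns": [("A", "Shall we rest?"), ("B", "As you wish.")]},
    {"background": "", "turns": [("A", "Good morning."), ("B", "Good morning.")]},
    {"background": "A market street", "turns": [("A", "How much?"), ("C", "Two coins.")]},
]
samples = scenes_to_samples(scenes, second_role="B")
print(samples)
```

The third scene is dropped because the second role does not speak in it.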

步骤803，通过训练样本集合对通用对话生成模型进行训练，得到训练完成的定制对话生成模型。Step 803: Train the general dialogue generation model using the training sample set to obtain a trained customized dialogue generation model.

定制对话生成模型的训练过程与通用对话生成模型的训练过程相同，这里不再赘述。The training process of the customized dialogue generation model is the same as that of the general dialogue generation model, so it will not be repeated here.

Optionally, after the prediction result of the customized dialogue generation model is obtained, a consistency loss may further be obtained from the similarity between the prediction result and the role high-frequency short words. The consistency loss represents the consistency between the dialogue style of the prediction result and the dialogue style of the second role. The general dialogue generation model is trained by maximizing the consistency loss while minimizing the negative log-likelihood loss, yielding the trained customized dialogue generation model. In this way the customized dialogue generation model further learns the dialogue style of the second role, which further improves the stability of the customized dialogue generation model.

综上所述，本申请实施例提供的技术方案，通过结合原始文本数据和扩展文本数据，对通用对话生成模型进行微调，使得通用对话生成模型被充分训练。同时，结合背景数据对通用对话生成模型进行微调，使得训练数据被充分使用，从而得到一个能够稳定生成回复语句的定制对话生成模型，进而提高了回复语句的生成稳定性。In summary, the technical solution provided in the embodiment of the present application fine-tunes the general dialogue generation model by combining the original text data and the extended text data, so that the general dialogue generation model is fully trained. At the same time, the general dialogue generation model is fine-tuned in combination with the background data, so that the training data is fully used, thereby obtaining a customized dialogue generation model that can stably generate reply sentences, thereby improving the generation stability of reply sentences.

Please refer to FIG. 9, which shows a flowchart of a training method for the ranking model provided by an embodiment of the present application. Each step of the method may be executed by a computer device, such as the terminal device 10 or the server 20 in the implementation environment shown in FIG. 1. The method may include the following steps (901 to 903).

Step 901: add a classifier to the general dialogue generation model to obtain the ranking model in its initial state, where the general dialogue generation model is obtained by pre-training on dialogue corpus data.

The network parameters of the classifier in the initial ranking model are randomly initialized, and the initial ranking model uses the network of the general dialogue generation model as its prediction result generation network. The general dialogue generation model is the same as that described in the above embodiment and is not described again here.

Optionally, the prediction result generation network in the ranking model produces a prediction result for a candidate reply sentence, and the classifier in the ranking model then classifies this prediction result to obtain the classification result of the candidate reply sentence (that is, the relevance score described above). The classification result indicates the relevance between the candidate reply sentence and the second role.

Step 902: construct a positive sample set and a negative sample set from the first-type training samples corresponding to the second role, where the positive samples in the positive sample set are the first-type training samples, and the negative samples in the negative sample set are obtained by replacing reply sentences in the first-type training samples.

Optionally, the first-type training samples are used directly as positive samples for the ranking model, and a negative sample for the ranking model is obtained by randomly replacing one reply sentence in a first-type training sample with another reply sentence from the preceding or following text. The number of positive samples for the ranking model is the same as the number of negative samples.

示例性地，设第一类训练样本表示为：背景数据\A1：B1：A2：B2…C1：D1：C2：D2，则可以得到正样本：背景数据\A1：B1：A2：B2(标签数据为1)，负样本：背景数据\A1：B1：A2：D2(标签数据为0)。其中，A1用于简化表示角色A对应的角色标识和第一个对话内容。For example, if the first type of training samples is represented as: background data\A1:B1:A2:B2…C1:D1:C2:D2, then we can get positive samples: background data\A1:B1:A2:B2 (label data is 1), negative samples: background data\A1:B1:A2:D2 (label data is 0). Among them, A1 is used to simplify the role identification corresponding to role A and the first dialogue content.
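The sample construction in step 902 can be sketched as follows. This is a minimal illustration only: the function name `build_ranking_samples`, the tuple layout `(background, turns, label)`, and the `other_replies` pool are assumptions made for the example and are not specified by the application.

```python
import random


def build_ranking_samples(background, context, reply, other_replies, rng=None):
    """Build one positive and one negative sample for the ranking model.

    background    -- background data attached to the first-type training sample
    context       -- list of preceding turns, e.g. ["A1", "B1", "A2"]
    reply         -- the true reply that follows the context, e.g. "B2"
    other_replies -- replies taken from elsewhere in the same text, e.g. ["D1", "D2"]
    """
    rng = rng or random.Random()
    # Only replies that differ from the true reply can serve as replacements.
    candidates = [r for r in other_replies if r != reply]
    if not candidates:
        raise ValueError("need at least one alternative reply")
    positive = (background, context + [reply], 1)  # label data 1
    negative = (background, context + [rng.choice(candidates)], 0)  # label data 0
    return positive, negative
```

Producing exactly one negative sample per positive sample keeps the two sets the same size, as the embodiment requires.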

Step 903: train the ranking model in the initial state with the positive sample set and the negative sample set to obtain the trained ranking model.

Optionally, the classification result of each sample is obtained through the ranking model, and the training loss of the ranking model is computed from the difference between the classification result and the label data. The ranking model in the initial state is then trained iteratively, with the goal of minimizing this training loss, to obtain the trained ranking model.
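The application does not name a specific loss function. As one plausible choice for a binary classifier with 0/1 label data, the sketch below assumes binary cross-entropy; the function name and the `eps` clamp are illustrative assumptions.

```python
import math


def binary_cross_entropy(scores, labels, eps=1e-12):
    """Mean binary cross-entropy between classifier scores in (0, 1) and 0/1 labels."""
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(scores, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(scores)
```

Under this assumption, the loss falls as the classifier assigns scores near 1 to positive samples and near 0 to negative samples, which is the direction that iterative training pursues.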

In summary, in the technical solution provided by the embodiments of the present application, the training samples of the ranking model are constructed from the first type of training samples. This helps improve the ranking model's ability to recognize the relevance between candidate reply sentences and the second role, and thereby improves the quality of the final reply sentence.

In an exemplary embodiment, the reply sentence generation method provided by the embodiments of the present application is described by taking a role dialogue customization scenario as an example, and may include the following:

Referring to FIG. 3, the user selects a first role A for themselves and a second role B for the dialogue system (the first role A and the second role B may be film and television characters, game characters, and so on, which is not limited in the embodiments of the present application). The user role-plays as the first role A and chats with the dialogue system, and the dialogue system replies as the second role B, producing the input dialogue 300 between the first role A and the second role B.

A customized dialogue generation model and a ranking model are deployed in the dialogue system. Referring to FIG. 10, a general dialogue generation model 1001 may first be obtained by pre-training on dialogue corpus data. The general dialogue generation model is then fine-tuned on the dialogue data parsed from original text data such as novels and scripts related to the second role B (that is, the first type of training samples) and on the extended text data (that is, the second type of training samples) to obtain the customized dialogue generation model 1002. The general dialogue generation model 1001 with an added classifier is fine-tuned on the first type of training samples corresponding to the second role B to obtain the ranking model 1003. Finally, the customized dialogue generation model 1002 and the ranking model 1003 are deployed in the dialogue system.

Using the customized dialogue generation model, the dialogue system generates candidate reply sentences for the current input sentence of the first role A according to the input dialogue between the first role A and the second role B. Exemplarily, the customized dialogue generation model processes the input dialogue between the first role A and the second role B directly to obtain the first type of reply sentence, and processes the same input dialogue under the guidance of the prompt information to obtain the second type of reply sentences. The first type of reply sentence and the second type of reply sentences are determined as the candidate reply sentences.

The dialogue system obtains the relevance score of each candidate reply sentence through the ranking model, adjusts the relevance scores with the post-processing strategy described above to obtain the adjusted relevance scores, and finally determines the candidate reply sentence with the highest relevance score as the final reply sentence corresponding to the current input sentence.
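The selection step described above can be sketched as a score, adjust, and pick-the-maximum loop. The names `select_final_reply`, `score_fn`, and `adjust_fn` are hypothetical stand-ins for the ranking model and the post-processing strategy; the application does not define such an interface.

```python
def select_final_reply(candidates, score_fn, adjust_fn=None):
    """Return the candidate with the highest (optionally adjusted) relevance score.

    candidates -- first-type and second-type reply sentences
    score_fn   -- callable mapping a reply to its relevance score (the ranking model)
    adjust_fn  -- optional callable (reply, score) -> adjusted score (post-processing)
    """
    if not candidates:
        raise ValueError("at least one candidate reply is required")
    best_reply, best_score = None, float("-inf")
    for reply in candidates:
        score = score_fn(reply)
        if adjust_fn is not None:
            score = adjust_fn(reply, score)
        if score > best_score:  # strict comparison keeps the earliest candidate on ties
            best_reply, best_score = reply, score
    return best_reply, best_score
```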

Referring to FIG. 11, for every question asked by the first role A in the input dialogue 300, the dialogue system can reply stably as the second role B.

In addition, applying the technical solution provided by the embodiments of the present application to the role dialogue customization scenario allows the dialogue system to converse with the user automatically and stably in the identity of a role. This helps increase user stickiness to games, film and television works, and the like, and strengthens user engagement, thereby improving the conversion success rate. It also helps improve the personalization of the promotion and distribution of games, film and television works, and the like.

The following are apparatus embodiments of the present application, which can be used to execute the method embodiments of the present application. For details not disclosed in the apparatus embodiments, refer to the method embodiments of the present application.

Referring to FIG. 12, a block diagram of a reply sentence generation apparatus provided by an embodiment of the present application is shown. The apparatus has the function of implementing the above method examples, and the function may be implemented by hardware or by hardware executing corresponding software. As shown in FIG. 12, the apparatus 1200 includes: an input dialogue acquisition module 1201, a candidate reply acquisition module 1202, and a final reply acquisition module 1203.

The input dialogue acquisition module 1201 is configured to acquire an input dialogue, where the input dialogue includes a dialogue generated by a first role and a second role, and the first role and the second role are different.

The candidate reply acquisition module 1202 is configured to, for the current input sentence corresponding to the first role in the input dialogue, obtain a first type of reply sentence corresponding to the current input sentence through a customized dialogue generation model according to the input dialogue, where the customized dialogue generation model has learned the dialogue features of the second role.

The candidate reply acquisition module 1202 is further configured to obtain, through the customized dialogue generation model and according to the input dialogue and at least one piece of prompt information, at least one second type of reply sentence corresponding to the current input sentence, where the prompt information is information generated according to the input dialogue and used to guide the generation of reply sentences.

The final reply acquisition module 1203 is configured to determine the final reply sentence corresponding to the current input sentence from the first type of reply sentence and the at least one second type of reply sentence.

In some embodiments, as shown in FIG. 13, the candidate reply acquisition module 1202 includes: a concatenated text acquisition submodule 1202a and a second-type reply acquisition submodule 1202b.

The concatenated text acquisition submodule 1202a is configured to, for target prompt information in the at least one piece of prompt information, concatenate the target prompt information with the input dialogue to obtain a concatenated text.

The second-type reply acquisition submodule 1202b is configured to process the concatenated text through the customized dialogue generation model to obtain the second type of reply sentence generated for the current input sentence under the guidance of the target prompt information.

In some embodiments, the second-type reply acquisition submodule 1202b is configured to:

obtain the word embeddings of the concatenated text;

for each word in the concatenated text, fuse the position information, sentence information, and role information corresponding to the word with the word embedding corresponding to the word to obtain the input word embedding for the customized dialogue generation model;

process the input word embeddings through the customized dialogue generation model to obtain the second type of reply sentence generated for the current input sentence under the guidance of the target prompt information.
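The application says only that position, sentence, and role information are "fused" with the word embedding and does not fix the fusion operator. The sketch below assumes element-wise addition of four embedding vectors, a common choice in Transformer-style models; the lookup tables and the function name are hypothetical.

```python
def fuse_embeddings(token_ids, position_ids, sentence_ids, role_ids, tables):
    """Sum word, position, sentence, and role embeddings for each word.

    tables -- dict with keys "token", "position", "sentence", "role"; each value
              maps an integer id to an embedding vector (list of floats) of equal size
    """
    fused = []
    for t, p, s, r in zip(token_ids, position_ids, sentence_ids, role_ids):
        vectors = (
            tables["token"][t],
            tables["position"][p],
            tables["sentence"][s],
            tables["role"][r],
        )
        # Element-wise sum across the four vectors (assumed fusion operator).
        fused.append([sum(values) for values in zip(*vectors)])
    return fused
```

The role embedding lets the model tell apart words spoken by the first role from words spoken by the second role even when the surface text is similar.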

In some embodiments, the prompt information includes the dialogue features of the second role or the dialogue background of the input dialogue, where the dialogue features include at least one of the following: dialogue style, dialogue emotion, dialogue tone, and preferred dialogue words. The concatenated text acquisition submodule 1202a is configured to:

when the target prompt information is the dialogue feature, concatenate the dialogue feature into the sentence corresponding to the first role in the input dialogue to obtain the concatenated text;

or, when the target prompt information is the dialogue background, concatenate the dialogue background to the head of the input dialogue to obtain the concatenated text.
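The two concatenation cases above can be sketched as follows. The text layout (`role: text` lines, the bracketed feature suffix, and attaching the feature to the first role's most recent sentence) is an assumption chosen for illustration; the application does not prescribe a concrete format.

```python
def splice_prompt(dialogue, prompt, kind, first_role):
    """Concatenate one piece of prompt information with the input dialogue.

    dialogue   -- list of (role, text) turns in order
    prompt     -- the target prompt information
    kind       -- "feature" (dialogue feature) or "background" (dialogue background)
    first_role -- identifier of the first role
    """
    turns = list(dialogue)
    if kind == "background":
        # Dialogue background goes at the head of the input dialogue.
        return "\n".join([prompt] + [f"{role}: {text}" for role, text in turns])
    if kind == "feature":
        # Dialogue feature is attached to the first role's latest sentence (assumed).
        for i in range(len(turns) - 1, -1, -1):
            role, text = turns[i]
            if role == first_role:
                turns[i] = (role, f"{text} [{prompt}]")
                break
        else:
            raise ValueError("the dialogue has no sentence from the first role")
        return "\n".join(f"{role}: {text}" for role, text in turns)
    raise ValueError(f"unknown prompt kind: {kind!r}")
```

Calling the function once per piece of prompt information yields one concatenated text, and therefore one second-type reply sentence, per prompt.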

In some embodiments, the final reply acquisition module 1203 is configured to:

obtain, through a ranking model, the relevance score corresponding to the first type of reply sentence and the relevance score corresponding to each second type of reply sentence, where the relevance score indicates the relevance between a reply sentence and the second role;

determine, among the first type of reply sentence and each second type of reply sentence, the reply sentence with the highest relevance score as the final reply sentence.

In some embodiments, as shown in FIG. 13, the apparatus 1200 further includes: a relevance score adjustment module 1204.

The relevance score adjustment module 1204 is configured to increase the relevance score corresponding to a candidate reply sentence when the candidate reply sentence contains an entity word whose quality score is greater than or equal to a quality threshold.

Alternatively, the relevance score adjustment module 1204 is further configured to increase the relevance score corresponding to a candidate reply sentence when the similarity between the candidate reply sentence and the role high-frequency short words is greater than or equal to a similarity threshold, where the role high-frequency short words represent the dialogue features of the second role, and the candidate reply sentences include the first type of reply sentence and each second type of reply sentence.
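The two adjustment rules of module 1204 can be sketched as below. Several details are assumptions for illustration only: entity quality scores are supplied as a precomputed dictionary, similarity is measured with the standard-library `difflib.SequenceMatcher`, the boost is a fixed additive `bonus`, and both rules are applied independently. The application leaves the similarity measure, the thresholds, and the size of the increase unspecified.

```python
from difflib import SequenceMatcher


def adjust_relevance(reply, score, entity_quality, frequent_short_words,
                     quality_threshold=0.8, similarity_threshold=0.6, bonus=0.1):
    """Raise a candidate's relevance score when either boosting rule applies.

    entity_quality       -- dict: entity word -> quality score (hypothetical input)
    frequent_short_words -- role high-frequency short words of the second role
    """
    # Rule 1: the reply contains an entity word of sufficient quality.
    if any(word in reply and quality >= quality_threshold
           for word, quality in entity_quality.items()):
        score += bonus
    # Rule 2: the reply is similar enough to a role high-frequency short word.
    if any(SequenceMatcher(None, reply, phrase).ratio() >= similarity_threshold
           for phrase in frequent_short_words):
        score += bonus
    return score
```

A function of this shape could be passed as the post-processing step when ranking candidates, with the entity dictionary and short-word list bound in advance.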

In some embodiments, the training process of the customized dialogue generation model is as follows:

obtain a general dialogue generation model, where the general dialogue generation model is obtained by pre-training on dialogue corpus data;

parse a training sample set from the associated text data corresponding to the second role, where the training sample set includes a first type of training samples that carry the background data corresponding to the second role and a second type of training samples that do not carry the background data, and the training samples in the training sample set consist of dialogue data;

train the general dialogue generation model with the training sample set to obtain the trained customized dialogue generation model.

In some embodiments, the associated text data includes original text data and extended text data, and the process of acquiring the associated text data may be as follows:

extract the first type of training samples from the original text data according to dialogue extraction rules, where the original text data includes novels and scripts related to the second role;

construct positive samples of a discriminant model based on the first type of training samples, and construct negative samples of the discriminant model based on other text data unrelated to the second role;

train the discriminant model with its positive samples and negative samples to obtain the trained discriminant model;

obtain the extended text data from a text database through the trained discriminant model.

In some embodiments, the training process of the ranking model is as follows:

add a classifier to the general dialogue generation model to obtain the ranking model in the initial state, where the general dialogue generation model is obtained by pre-training on dialogue corpus data;

construct a positive sample set and a negative sample set from the first type of training samples corresponding to the second role, where the positive samples in the positive sample set are the first type of training samples, and the negative samples in the negative sample set are obtained by replacing reply sentences in the first type of training samples;

train the ranking model in the initial state with the positive sample set and the negative sample set to obtain the trained ranking model.

It should be noted that, when the apparatus provided by the above embodiments implements its functions, the division into the functional modules described above is used only as an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments provided above belong to the same concept; for the specific implementation process, refer to the method embodiments, which is not described again here.

Referring to FIG. 14, a structural block diagram of a computer device provided by an embodiment of the present application is shown. The computer device 1400 can be used to implement the reply sentence generation method provided in the above embodiments. The computer device 1400 may be the terminal device 10 mentioned above or the server 20 mentioned above, which is not limited in the embodiments of the present application. Specifically:

Optionally, as shown in FIG. 14, the computer device 1400 includes a processor 1401 and a memory 1402. The processor 1401 includes, but is not limited to, any one of the following: a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field Programmable Gate Array), and the like. The memory 1402 may include storage devices such as RAM (Random-Access Memory) and ROM (Read-Only Memory). The processor 1401 and the memory 1402 may be connected through a system bus.

In an exemplary embodiment, a computer program is stored in the memory 1402, and the computer program is loaded and executed by the processor 1401 to implement the above reply sentence generation method.

In some embodiments, a computer-readable storage medium is also provided. A computer program is stored in the storage medium, and the computer program, when executed by a processor, implements the above reply sentence generation method.

Optionally, the computer-readable storage medium may include: ROM (Read-Only Memory), RAM (Random-Access Memory), SSD (Solid State Drives), an optical disc, or the like. The random access memory may include ReRAM (Resistance Random Access Memory) and DRAM (Dynamic Random Access Memory).

In some embodiments, a computer program product is also provided. The computer program product includes a computer program stored in a computer-readable storage medium. A processor of a computer device reads the computer program from the computer-readable storage medium and executes it, so that the computer device performs the above reply sentence generation method.

It should be noted that the information (including but not limited to subject device information and subject personal information), data (including but not limited to data used for analysis, stored data, and displayed data), and signals involved in the present application are all authorized by the subject or fully authorized by all parties, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the input dialogue, prompt information, dialogue corpus data, and associated text data involved in the present application are all obtained with full authorization.

It should be understood that “a plurality of” herein means two or more. “And/or” describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist at the same time, or B exists alone. The character “/” generally indicates an “or” relationship between the objects before and after it. In addition, the step numbers described herein only show one possible execution order of the steps by way of example. In some other embodiments, the steps may be executed out of numerical order; for example, two steps with different numbers may be executed at the same time, or in an order opposite to that shown in the figures, which is not limited in the embodiments of the present application.

The above are only exemplary embodiments of the present application and are not intended to limit the present application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the protection scope of the present application.