





TECHNICAL FIELD
The present application relates to the field of computer application technology, and in particular to a face fusion model acquisition method, a face fusion method, apparatuses, and a storage medium in the fields of artificial intelligence, deep learning, and computer vision.
BACKGROUND
At present, face fusion technology is widely used in scenarios such as virtual avatars, interactive features in long- and short-form videos, and photo-album effects. Face fusion typically needs to retain the identity information of one face while fusing in the attribute information of another face.
Owing to its low implementation cost, face fusion is currently performed mostly with blendshape-based methods. However, the fusion quality of such methods is poor; that is, the resulting fused face is not sufficiently realistic.
SUMMARY
The present application provides a face fusion model acquisition method, a face fusion method, apparatuses, and a storage medium.
A face fusion model acquisition method includes:
obtaining a training sample, the training sample including a target face image and a template face image; and
training, with the training sample, a face fusion model for performing face fusion on an input target face image and an input template face image, the training including:
obtaining attribute information of the target face in the target face image;
performing face fusion on the target face image and the template face image to obtain a fused face;
obtaining attribute information of the fused face; and
constructing an attribute consistency loss function from the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model with the attribute consistency loss function.
A face fusion method includes:
obtaining a target face image and a template face image; and
inputting the target face image and the template face image into a face fusion model obtained by the above method to obtain an output face fusion result.
A face fusion model acquisition apparatus includes a sample acquisition module and a model training module.
The sample acquisition module is configured to obtain a training sample, the training sample including a target face image and a template face image.
The model training module is configured to train, with the training sample, a face fusion model for performing face fusion on an input target face image and an input template face image, including: obtaining attribute information of the target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; obtaining attribute information of the fused face; and constructing an attribute consistency loss function from the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model with the attribute consistency loss function.
A face fusion apparatus includes an image acquisition module and a face fusion module.
The image acquisition module is configured to obtain a target face image and a template face image.
The face fusion module is configured to input the target face image and the template face image into a face fusion model obtained by the above method to obtain an output face fusion result.
An electronic device includes:
at least one processor; and
a memory communicatively connected to the at least one processor, wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method described above.
A non-transitory computer-readable storage medium stores computer instructions, the computer instructions being configured to cause a computer to perform the method described above.
An embodiment of the present application has the following advantages or beneficial effects. A face fusion model can be obtained by training. During training, attribute information of the target face and of the fused face can be obtained separately, decoupling identity from attributes, and the obtained attribute information can be used to construct an attribute consistency loss function that guides model training. This improves the training of the model and, in turn, the fusion quality when the trained model is used for face fusion, yielding a more realistic fused face. Moreover, given any target face image and template face image, the trained model can produce a corresponding face fusion result, so the approach is widely applicable and inexpensive to implement. It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings are provided for a better understanding of the solution and do not limit the present application. In the drawings:
FIG. 1 is a flowchart of an embodiment of a face fusion model acquisition method according to the present application;
FIG. 2 is a schematic diagram of a face fusion model acquisition process according to the present application;
FIG. 3 is a flowchart of an embodiment of a face fusion method according to the present application;
FIG. 4 is a schematic structural diagram of an embodiment 40 of a face fusion model acquisition apparatus according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment 50 of a face fusion apparatus according to the present application; and
FIG. 6 is a block diagram of an electronic device for the method according to an embodiment of the present application.
DETAILED DESCRIPTION
Exemplary embodiments of the present application are described below with reference to the accompanying drawings, including various details of the embodiments to facilitate understanding; these should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
In addition, it should be understood that the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects.
FIG. 1 is a flowchart of an embodiment of a face fusion model acquisition method according to the present application. As shown in FIG. 1, the method includes the following specific implementation.
In 101, a training sample is obtained, the training sample including a target face image and a template face image.
In 102, a face fusion model for performing face fusion on an input target face image and an input template face image is trained with the training sample, the training process including the operations shown in 103 to 106.
In 103, attribute information of the target face in the target face image is obtained.
In 104, face fusion is performed on the target face image and the template face image to obtain a fused face.
In 105, attribute information of the fused face is obtained.
In 106, an attribute consistency loss function is constructed from the attribute information of the target face and the attribute information of the fused face, and self-supervised learning of the face fusion model is performed with the attribute consistency loss function.
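As a concrete illustration, the training step described in 103 to 106 can be sketched as follows. This is a minimal sketch only: the two helper functions are hypothetical stand-ins, where in the described embodiment `extract_attributes` would be a 3D face reconstruction network and `fuse_faces` a GAN-based fusion generator.

```python
import math

# Minimal sketch of the training step in 103-106. The helper functions
# are hypothetical stand-ins: in the described embodiment,
# extract_attributes would be a 3D face reconstruction network and
# fuse_faces a GAN-based fusion generator.
def extract_attributes(face):
    # stand-in: treat the input vector itself as the "attribute" vector
    return list(face)

def fuse_faces(target, template):
    # stand-in fusion: element-wise average of the two "images"
    return [(t + s) / 2.0 for t, s in zip(target, template)]

def training_step(target_img, template_img):
    target_attr = extract_attributes(target_img)   # 103
    fused = fuse_faces(target_img, template_img)   # 104
    fused_attr = extract_attributes(fused)         # 105
    # 106: attribute consistency loss ||A - B||_2, used as the
    # self-supervised training signal
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(target_attr, fused_attr)))
```

When the target and template inputs coincide, the fused face equals the target and the loss is zero, which is exactly the behavior the consistency constraint encourages.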
As can be seen, in this embodiment a face fusion model can be obtained by training. Attribute information of the target face and of the fused face can be obtained separately, decoupling identity from attributes, and the obtained attribute information can be used to construct an attribute consistency loss function that guides model training. This improves the training of the model and, in turn, the fusion quality when the trained model is used for face fusion, yielding a more realistic fused face. Moreover, given any target face image and template face image, the trained model can produce a corresponding face fusion result, so the approach is widely applicable and inexpensive to implement.
During model training, multiple training samples may be obtained according to actual needs. Each training sample may include a target face image and a template face image, the target face image and the template face image corresponding to different persons; that is, each training sample corresponds to a pair of persons, and different training samples may correspond to different persons.
For the target face image in a training sample, the attribute information of the target face in it may be obtained with a three-dimensional (3D) face reconstruction method. Preferably, 3D Morphable Model (3DMM) parameters of the target face are first obtained with the 3D face reconstruction method, and the attribute information of the target face is then extracted from the 3DMM parameters.
The specific implementation of 3D face reconstruction is known in the art. Different dimensions of the 3DMM parameters correspond to information such as the identity, expression, pose, and illumination of the person. The attribute information mainly includes expression and pose information.
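For illustration, extracting the attribute part of a 3DMM parameter vector might look as follows. The dimension layout (80 identity, 64 expression, 6 pose, and 27 illumination coefficients) is a hypothetical example; the actual sizes depend on the specific morphable model used.

```python
# Hypothetical 3DMM parameter layout; the actual dimensionalities depend
# on the specific morphable model, so these sizes are illustrative only.
ID_DIM, EXP_DIM, POSE_DIM, LIGHT_DIM = 80, 64, 6, 27

def split_3dmm(params):
    """Split a flat 3DMM parameter vector into its semantic parts."""
    i = 0
    identity = params[i:i + ID_DIM]
    i += ID_DIM
    expression = params[i:i + EXP_DIM]
    i += EXP_DIM
    pose = params[i:i + POSE_DIM]
    i += POSE_DIM
    lighting = params[i:i + LIGHT_DIM]
    return identity, expression, pose, lighting

def attribute_vector(params):
    # Attributes are mainly expression and pose; the identity part is
    # deliberately excluded, which is what decouples identity from
    # attributes.
    _, expression, pose, _ = split_3dmm(params)
    return expression + pose
```

Keeping only the expression and pose slices is what makes the attribute vector identity-free, so comparing the target face's and the fused face's attribute vectors constrains expression and pose without constraining identity.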
In addition, face fusion may be performed on the target face image and the template face image to obtain a fused face. Preferably, a face fusion method based on Generative Adversarial Networks (GANs) is used to fuse the target face image and the template face image, thereby obtaining the fused face. In practice, any GAN-based face fusion method may be used, such as the existing FaceShifter method.
Face fusion typically needs to retain the identity information of one face while fusing in the attribute information of another face. In the FaceShifter method, an autoencoder-based attribute reconstruction network is applied to the target face, and the features at each scale of the attribute reconstruction network are fused into the identity information of the template face.
Using a GAN-based face fusion method yields a good fusion result, which facilitates subsequent processing. Moreover, this embodiment combines the GAN-based face fusion method with the 3D face reconstruction method, exploiting the advantages of each and thereby further improving model training.
Afterwards, the attribute information of the fused face may also be obtained with the 3D face reconstruction method. Preferably, the 3DMM parameters of the fused face are first obtained with the 3D face reconstruction method, and the attribute information of the fused face is then extracted from the 3DMM parameters.
Based on the 3D face reconstruction method, the required attribute information can be obtained conveniently and accurately, thereby decoupling identity from attributes and facilitating subsequent processing.
After the above processing, the attribute information of the target face and the attribute information of the fused face are obtained. Face fusion expects these two pieces of attribute information to be consistent; therefore, an attribute consistency loss function may be constructed from the attribute information of the target face and the attribute information of the fused face, and self-supervised learning of the face fusion model may be performed with the attribute consistency loss function.
Preferably, the L2 norm (L2-norm) of the difference between the attribute information of the target face and the attribute information of the fused face is computed as the attribute consistency loss function, which may take the form ||A-B||2, where A and B denote the attribute information of the target face and the attribute information of the fused face, respectively.
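A direct reading of the formula ||A-B||2, assuming A and B are flat attribute vectors of equal length, can be written as:

```python
import math

def attribute_consistency_loss(target_attr, fused_attr):
    """||A - B||_2 between the target-face attribute vector A and the
    fused-face attribute vector B (assumed flat, equal-length vectors)."""
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(target_attr, fused_attr)))
```

Identical attribute vectors give a loss of zero, so minimizing this term pushes the expression and pose of the fused face toward those of the target face.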
In addition, self-supervised learning of the face fusion model may be performed by combining the attribute consistency loss function with the identity consistency loss function of the GAN-based face fusion method, for example, by combining the attribute consistency loss function with the identity consistency loss function of the FaceShifter method.
In this way, not only is the consistency between the attributes of the fused face and those of the target face ensured, but the consistency between the identity of the fused face and that of the template face is also ensured.
Summarizing the above, FIG. 2 is a schematic diagram of the face fusion model acquisition process according to the present application. As shown in FIG. 2, a 3D face reconstruction method may be used to obtain the attribute information of the target face in the target face image; for example, the 3DMM parameters of the target face are obtained with the 3D face reconstruction method, and the attribute information of the target face is extracted from the 3DMM parameters. Face fusion may be performed on the target face image and the template face image to obtain a fused face, for example, with the FaceShifter method among GAN-based face fusion methods. Afterwards, the 3D face reconstruction method may be used to obtain the attribute information of the fused face; for example, the 3DMM parameters of the fused face are obtained with the 3D face reconstruction method, and the attribute information of the fused face is extracted from the 3DMM parameters. An attribute consistency loss function may then be constructed from the attribute information of the target face and the attribute information of the fused face; for example, the L2-norm of the difference between the two is computed as the attribute consistency loss function. Finally, self-supervised learning of the face fusion model may be performed by combining the attribute consistency loss function with the identity consistency loss function of the FaceShifter method.
FIG. 3 is a flowchart of an embodiment of a face fusion method according to the present application. As shown in FIG. 3, the method includes the following specific implementation.
In 301, a target face image and a template face image are obtained.
In 302, the target face image and the template face image are input into a pre-trained face fusion model to obtain an output face fusion result.
The face fusion model may be a face fusion model obtained by the method shown in FIG. 1.
After the face fusion model is obtained, it can be used for actual face fusion. Any target face image and template face image to be fused may be input into the face fusion model to obtain the output face fusion result.
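A minimal sketch of wrapping the trained model for inference follows; `FusionModel` and its `fuse` method are hypothetical names standing in for the trained face fusion model, not an API defined by the present application.

```python
# Hypothetical inference wrapper; FusionModel and fuse() are stand-in
# names for the trained face fusion model, not an actual API.
class FusionModel:
    def fuse(self, target_img, template_img):
        # A real model would run the GAN generator here; this stand-in
        # just records which input contributes what, mirroring the text:
        # identity from the template face, attributes from the target.
        return {"identity_from": template_img,
                "attributes_from": target_img}

def run_face_fusion(model, target_img, template_img):
    # Any target/template pair may be fed to the trained model (301-302).
    return model.fuse(target_img, template_img)
```

The wrapper makes the usage pattern explicit: once trained, the model is a pure function of an arbitrary target/template image pair.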
It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of action combinations. However, those skilled in the art should appreciate that the present application is not limited by the described order of actions, because according to the present application, certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also appreciate that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application. In addition, for parts not described in detail in one embodiment, reference may be made to the related descriptions in other embodiments.
The solution described in the present application may be used in cloud services. The above describes the method embodiments; the solution of the present application is further described below through apparatus embodiments.
FIG. 4 is a schematic structural diagram of an embodiment 40 of a face fusion model acquisition apparatus according to the present application. As shown in FIG. 4, the apparatus includes a sample acquisition module 401 and a model training module 402.
The sample acquisition module 401 is configured to obtain a training sample, the training sample including a target face image and a template face image.
The model training module 402 is configured to train, with the training sample, a face fusion model for performing face fusion on an input target face image and an input template face image, including: obtaining attribute information of the target face in the target face image; performing face fusion on the target face image and the template face image to obtain a fused face; obtaining attribute information of the fused face; and constructing an attribute consistency loss function from the attribute information of the target face and the attribute information of the fused face, and performing self-supervised learning of the face fusion model with the attribute consistency loss function.
For the target face image in a training sample, the model training module 402 may obtain the attribute information of the target face in it with a 3D face reconstruction method, for example, by obtaining the 3DMM parameters of the target face with the 3D face reconstruction method and extracting the attribute information of the target face from the 3DMM parameters.
Different dimensions of the 3DMM parameters correspond to information such as the identity, expression, pose, and illumination of the person. The attribute information mainly includes expression and pose information.
The model training module 402 may also use a GAN-based face fusion method to fuse the target face image and the template face image. In practice, any GAN-based face fusion method may be used, such as the existing FaceShifter method.
The model training module 402 may also obtain the attribute information of the fused face with the 3D face reconstruction method, for example, by obtaining the 3DMM parameters of the fused face with the 3D face reconstruction method and extracting the attribute information of the fused face from the 3DMM parameters.
After the attribute information of the target face and the attribute information of the fused face are obtained, the model training module 402 may compute the L2-norm of the difference between the attribute information of the target face and the attribute information of the fused face as the attribute consistency loss function, which may take the form ||A-B||2, where A and B denote the attribute information of the target face and the attribute information of the fused face, respectively.
In addition, the model training module 402 may perform self-supervised learning of the face fusion model by combining the attribute consistency loss function with the identity consistency loss function of the GAN-based face fusion method, for example, with the identity consistency loss function of the FaceShifter method.
FIG. 5 is a schematic structural diagram of an embodiment 50 of a face fusion apparatus according to the present application. As shown in FIG. 5, the apparatus includes an image acquisition module 501 and a face fusion module 502.
The image acquisition module 501 is configured to obtain a target face image and a template face image.
The face fusion module 502 is configured to input the target face image and the template face image into a pre-trained face fusion model to obtain an output face fusion result.
The face fusion model may be a face fusion model obtained by the method shown in FIG. 1, that is, a face fusion model obtained by the apparatus shown in FIG. 4.
After the face fusion model is obtained, it can be used for actual face fusion. Any target face image and template face image to be fused may be input into the face fusion model to obtain the output face fusion result.
For the specific workflows of the apparatus embodiments shown in FIG. 4 and FIG. 5, reference may be made to the related descriptions in the foregoing method embodiments, which are not repeated here.
In summary, with the solution of the apparatus embodiments of the present application, a face fusion model can be obtained by training. A 3D face reconstruction method can be used to decouple the identities and attributes of the target face and the fused face, and the obtained attribute information can be used to construct an attribute consistency loss function that guides model training. This improves the training of the model and, in turn, the fusion quality when the trained model is used for face fusion, yielding a more realistic fused face. Moreover, given any target face image and template face image, the trained model can produce a corresponding face fusion result, so the approach is widely applicable and inexpensive to implement. In addition, the GAN-based face fusion method yields a good fusion result, facilitating subsequent processing, and combining the GAN-based face fusion method with the 3D face reconstruction method exploits the advantages of each, further improving model training. Furthermore, based on the 3D face reconstruction method, the required attribute information can be obtained conveniently and accurately. Finally, the attribute consistency loss function can be combined with the identity consistency loss function of the GAN-based face fusion method for self-supervised learning of the face fusion model, which ensures not only the consistency between the attributes of the fused face and those of the target face but also the consistency between the identity of the fused face and that of the template face.
According to embodiments of the present application, the present application further provides an electronic device and a readable storage medium.
FIG. 6 is a block diagram of an electronic device for the method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementations of the application described and/or claimed herein.
As shown in FIG. 6, the electronic device includes one or more processors Y01, a memory Y02, and interfaces for connecting the components, including a high-speed interface and a low-speed interface. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as required. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a graphical user interface on an external input/output apparatus (such as a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses may be used together with multiple memories, if required. Likewise, multiple electronic devices may be connected, with each device providing some of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). In FIG. 6, one processor Y01 is taken as an example.
The memory Y02 is the non-transitory computer-readable storage medium provided in the present application. The memory stores instructions executable by at least one processor, so that the at least one processor performs the method provided in the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the method provided in the present application.
As a non-transitory computer-readable storage medium, the memory Y02 may be used to store non-transitory software programs and non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the present application. By running the non-transitory software programs, instructions, and modules stored in the memory Y02, the processor Y01 executes various functional applications and data processing of the server, that is, implements the methods in the foregoing method embodiments.
The memory Y02 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required by at least one function, and the data storage area may store data created according to the use of the electronic device, and the like. In addition, the memory Y02 may include a high-speed random access memory and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory Y02 optionally includes memories disposed remotely relative to the processor Y01, and these remote memories may be connected to the electronic device over a network. Examples of the network include, but are not limited to, the Internet, an intranet, a blockchain network, a local area network, a mobile communication network, and combinations thereof.
The electronic device may further include an input apparatus Y03 and an output apparatus Y04. The processor Y01, the memory Y02, the input apparatus Y03, and the output apparatus Y04 may be connected by a bus or in other manners; in FIG. 6, connection by a bus is taken as an example.
The input apparatus Y03 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, and may include, for example, a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output apparatus Y04 may include a display device, an auxiliary lighting apparatus, a haptic feedback apparatus (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display, a light-emitting diode display, and a plasma display. In some implementations, the display device may be a touch screen.
此处描述的系统和技术的各种实施方式可以在数字电子电路系统、集成电路系统、专用集成电路、计算机硬件、固件、软件、和/或它们的组合中实现。这些各种实施方式可以包括:实施在一个或者多个计算机程序中,该一个或者多个计算机程序可在包括至少一个可编程处理器的可编程系统上执行和/或解释,该可编程处理器可以是专用或者通用可编程处理器,可以从存储系统、至少一个输入装置、和至少一个输出装置接收数据和指令,并且将数据和指令传输至该存储系统、该至少一个输入装置、和该至少一个输出装置。Various implementations of the systems and techniques described herein can be implemented in digital electronic circuitry, integrated circuit systems, application specific integrated circuits, computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor that The processor, which may be a special purpose or general-purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device an output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, a magnetic disk, an optical disc, a memory, or a programmable logic device) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (for example, a cathode-ray tube or liquid-crystal-display monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user can be received in any form (including acoustic, speech, or tactile input).
The systems and techniques described herein can be implemented in a computing system that includes a back-end component (for example, a data server), or that includes a middleware component (for example, an application server), or that includes a front-end component (for example, a user computer having a graphical user interface or a web browser through which the user can interact with an implementation of the systems and techniques described herein), or any combination of such back-end, middleware, and front-end components. The components of the system can be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include local area networks, wide area networks, blockchain networks, and the Internet.
A computer system can include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.
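The client-server relationship described above can be sketched as two programs exchanging data over a network connection. The following minimal Python example is illustrative only and is not part of the claimed application: the function names and the simple echo protocol are assumptions made for the sketch. It runs a one-shot server on a background thread and connects to it as a client on the same machine.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept a single connection and echo the request back with a prefix."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo:" + data)

def demo():
    # Server side: bind to a free local port (port 0 lets the OS choose).
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    # The server and client are separate flows of control that interact
    # only through the network connection, as in the text above.
    t = threading.Thread(target=serve_once, args=(server_sock,))
    t.start()

    # Client side: open a connection, send a request, read the reply.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello")
        reply = client.recv(1024)

    t.join()
    server_sock.close()
    return reply

print(demo())  # b'echo:hello'
```

In a real deployment the two endpoints would run on different machines; using the loopback address here only keeps the sketch self-contained.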
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order; as long as the desired results of the technical solutions disclosed in the present application can be achieved, no limitation is imposed herein.
The specific embodiments above do not limit the protection scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the protection scope of the present application.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010560733.6A (granted as CN111860167B) | 2020-06-18 | 2020-06-18 | Face fusion model acquisition and face fusion method, device and storage medium |
| Publication Number | Publication Date |
|---|---|
| CN111860167A | 2020-10-30 |
| CN111860167B | 2024-01-26 |
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2024-01-26 |