
Training of face recognition model, face recognition method and electronic device

Info

Publication number
CN114445873A
CN114445873A
Authority
CN
China
Prior art keywords
resolution
channel
low
face
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111581961.2A
Other languages
Chinese (zh)
Other versions
CN114445873B (en)
Inventor
王益斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Cloud Technology Co Ltd
Original Assignee
China Telecom Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Cloud Technology Co Ltd
Priority to CN202111581961.2A
Publication of CN114445873A
Application granted
Publication of CN114445873B
Legal status: Active (current)
Anticipated expiration


Abstract

The invention relates to the field of face recognition, and in particular to a training method for a face recognition model, a face recognition method, and an electronic device. The training method includes: obtaining face sample data and its labels; inputting the face sample data into a near-infrared channel, and extracting features using the source-stream network, the transfer-stream network, and the feature fusion unit in the near-infrared channel to obtain a first prediction result of the near-infrared channel; computing a first loss function based on the first prediction result and the labels, and updating the parameters of the near-infrared channel to determine a target near-infrared channel; inputting the face sample data into a low-resolution channel, and extracting features using the high-resolution branch and the low-resolution branch in the low-resolution channel to obtain a second prediction result of the low-resolution channel; and computing a second loss function based on the second prediction result and the labels, updating the parameters of the low-resolution channel, determining a target low-resolution channel, and determining a target face recognition model.

Description

Translated from Chinese

Training of face recognition model, face recognition method and electronic device

Technical Field

The present invention relates to the technical field of face recognition, and in particular to a training method for a face recognition model, a face recognition method, and an electronic device.

Background

In recent years, as face recognition has spread into various application fields, research on face recognition in general scenarios has matured steadily. In some non-general scenarios, however, such as backlit environments, dark environments, or low resolution, conventional face recognition techniques fail. The complexity and variability of real scenes pose great challenges to face detection at the application layer. Studies have shown that images captured under near-infrared supplementary light preserve facial features of a certain quality, so the idea of matching face images captured under visible light with near-infrared face images offers a possible solution to face recognition in non-general (lighting) scenarios; the literature refers to this as heterogeneous face recognition. With respect to the above problems and existing solutions, the following challenges remain:

(1) Data collection is difficult: the matching task requires visible-light images and near-infrared images to be collected at the same time, and the existing datasets are quite small.

(2) Heterogeneous images differ greatly: images captured in different environments often differ substantially in distribution, appearance, and attributes, which makes matching very difficult.

(3) Image resolutions differ, and the resolution may be very low in real-time recognition environments.

Summary of the Invention

In view of this, embodiments of the present invention provide a training method for a face recognition model, a face recognition method, and an electronic device, so as to solve the problem of heterogeneous face recognition.

According to a first aspect, an embodiment of the present invention provides a training method for a face recognition model, where the face recognition model includes a near-infrared channel and a low-resolution channel arranged in parallel, and the method includes:

obtaining face sample data and its labels;

inputting the face sample data into the near-infrared channel, and performing feature extraction using the source-stream network, the transfer-stream network, and the feature fusion unit in the near-infrared channel to obtain a first prediction result of the near-infrared channel;

computing a first loss function based on the first prediction result and the labels, and updating the parameters of the near-infrared channel to determine a target near-infrared channel;

inputting the face sample data into the low-resolution channel, and performing feature extraction using the high-resolution branch and the low-resolution branch in the low-resolution channel to obtain a second prediction result of the low-resolution channel;

computing a second loss function based on the second prediction result and the labels, and updating the parameters of the low-resolution channel to determine a target low-resolution channel; and

determining a target face recognition model based on the target near-infrared channel and the target low-resolution channel.

In the training method for a face recognition model provided by the embodiments of the present invention, backlit and dark-environment images and low-resolution images are first handled as two separate cases. For the former scenario, the source stream and the transfer stream form two parallel network structures; through transfer learning over this parallel structure, the multi-scale feature map information of the source stream is exploited, and the transfer stream maps visible-light and near-infrared features into a unified common space, eliminating the differences between modalities and improving heterogeneous face recognition. For the latter scenario, the low-resolution face-part parsing map is fed into the recognition network to optimize feature extraction; a feature-domain super-resolution technique is adopted, and the image-similarity loss function is integrated into the face parsing model. Finally, the two scenarios are routed to their corresponding channels, which solves the face recognition problem in these scenarios and ensures the reliability of the resulting target face recognition model.

Optionally, inputting the face sample data into the near-infrared channel and performing feature extraction using the source-stream network, the transfer-stream network, and the feature fusion unit in the near-infrared channel to obtain the first prediction result of the near-infrared channel includes:

fixing the parameters of the source-stream network, the source-stream network being obtained by pre-training on a visible-light dataset;

inputting the face sample data into the near-infrared channel, and using the multi-scale intermediate-layer features extracted by the source-stream network as inputs to the transfer-stream network and the feature fusion unit; and

using the feature fusion unit to absorb the multi-scale intermediate-layer features produced by the source-stream network and fuse them into the feature maps of the transfer-stream network, so as to obtain the first prediction result of the near-infrared channel.

In the training method for a face recognition model provided by the embodiments of the present invention, the source stream and the transfer stream form two parallel network structures; through transfer learning over this parallel structure, the multi-scale feature map information of the source stream is exploited, and the transfer stream maps visible-light and near-infrared features into a unified common space, eliminating the differences between modalities and improving heterogeneous face recognition.

Optionally, computing the first loss function based on the first prediction result and the labels and updating the parameters of the near-infrared channel to determine the target near-infrared channel includes:

computing a cross-entropy loss function based on the first prediction result and the labels to obtain a calculation result; and

updating the parameters of the near-infrared channel based on the calculation result to determine the target near-infrared channel.

Optionally, the high-resolution branch and the low-resolution branch have the same network structure, and inputting the face sample data into the low-resolution channel and performing feature extraction using the high-resolution branch and the low-resolution branch in the low-resolution channel to obtain the second prediction result of the low-resolution channel includes:

inputting the face sample data into the high-resolution branch to obtain high-resolution features;

down-sampling the face sample data and feeding it into the feature extraction unit and the face parsing unit of the low-resolution branch, down-sampling the parsing result of the face parsing unit at multiple scales and fusing it with the corresponding parts of the feature extraction unit, to obtain low-resolution features; and

determining the second prediction result based on the high-resolution features and the low-resolution features.

In the training method for a face recognition model provided by the embodiments of the present invention, the low-resolution face-part parsing map is fed into the recognition network to optimize feature extraction; a feature-domain super-resolution technique is adopted, and the image-similarity loss function is integrated into the face parsing model.

Optionally, computing the second loss function based on the second prediction result and the labels and updating the parameters of the low-resolution channel to determine the target low-resolution channel includes:

computing a classification loss based on the second prediction result and the labels;

computing a feature loss based on the high-resolution features and the low-resolution features;

computing a weighted sum of the classification loss and the feature loss to determine a loss result; and

updating the parameters of the low-resolution channel based on the loss result to determine the target low-resolution channel.

Optionally, determining the target face recognition model based on the target near-infrared channel and the target low-resolution channel includes:

determining the model formed by connecting the target near-infrared channel and the target low-resolution channel in parallel as the target face recognition model.

According to a second aspect, an embodiment of the present invention provides a face recognition method, including:

obtaining a face image to be recognized;

inputting the face image to be recognized into a target face recognition model to obtain a first recognition result of the near-infrared channel and a second recognition result of the low-resolution channel; and

determining the recognition result of the face image to be recognized based on the union of the first recognition result and the second recognition result.

In the face recognition method provided by the embodiments of the present invention, because the target face recognition model is a two-channel model, it can adapt to different imaging environments; therefore, recognizing the face image to be recognized with this target face recognition model yields more accurate recognition results.

According to a third aspect, an embodiment of the present invention provides a face recognition apparatus, including:

an acquisition module, configured to obtain a face image to be recognized;

a recognition module, configured to input the face image to be recognized into a target face recognition model to obtain a first recognition result of the near-infrared channel and a second recognition result of the low-resolution channel; and

a determination module, configured to determine the recognition result of the face image to be recognized based on the union of the first recognition result and the second recognition result.

According to a fourth aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor that are communicatively connected to each other, where the memory stores computer instructions, and the processor executes the computer instructions so as to perform the training method for a face recognition model described in the first aspect or any implementation of the first aspect, or to perform the face recognition method described in the second aspect.

According to a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer instructions, where the computer instructions are used to cause a computer to perform the training method for a face recognition model described in the first aspect or any implementation of the first aspect, or to perform the face recognition method described in the second aspect.

Brief Description of the Drawings

In order to more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings required for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a training method for a face recognition model according to an embodiment of the present invention;

Fig. 2 is a flowchart of a training method for a face recognition model according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the network structure of the near-infrared channel according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the network structure of the low-resolution channel according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of the network structure of the low-resolution branch according to an embodiment of the present invention;

Fig. 6 is a schematic diagram comparing image-domain super-resolution with feature-domain super-resolution according to an embodiment of the present invention;

Fig. 7 is a flowchart of a face recognition method according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present invention.

Detailed Description

In order to make the purposes, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the present invention.

In the training method for a face recognition model provided by the embodiments of the present invention, backlit and dark-environment images and low-resolution images are first handled as two separate cases. For the former scenario, the source stream and the transfer stream form two parallel network structures; through transfer learning over this parallel structure, the multi-scale feature map information of the source stream is exploited, and the transfer stream maps visible-light and near-infrared features into a unified common space, eliminating the differences between modalities and improving heterogeneous face recognition. For the latter scenario, the low-resolution face-part parsing map is fed into the recognition network to optimize feature extraction; a feature-domain super-resolution technique is adopted, and the image-similarity loss function is integrated into the face parsing model. Finally, the two scenarios are routed to their corresponding channels to solve the face recognition problem in these scenarios.

According to an embodiment of the present invention, an embodiment of a training method for a face recognition model is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one described here.

This embodiment provides a training method for a face recognition model, which can be used in electronic devices such as monitoring terminals and servers. Fig. 1 is a flowchart of a training method for a face recognition model according to an embodiment of the present invention. As shown in Fig. 1, the process includes the following steps:

S11: obtain face sample data and its labels.

The face sample data may be collected low-resolution face images, face images captured under poor lighting, face images from other scenarios, and so on. The label identifies the target person corresponding to the face sample image.

S12: input the face sample data into the near-infrared channel, and perform feature extraction using the source-stream network, the transfer-stream network, and the feature fusion units in the near-infrared channel to obtain the first prediction result of the near-infrared channel.

The near-infrared channel includes a source-stream network, a transfer-stream network, and feature fusion units. A feature fusion unit fuses the features of the source-stream network and the transfer-stream network at a given scale, and the fused result is fed into the feature extraction module of the next scale of the transfer-stream network to obtain the feature extraction result; prediction is then performed on the feature extraction result to obtain the first prediction result.

S13: compute the first loss function based on the first prediction result and the labels, and update the parameters of the near-infrared channel to determine the target near-infrared channel.

The difference between the first prediction result and the labels is measured with the first loss function. After the loss value is computed, it is used to update the parameters of the near-infrared channel, and the target near-infrared channel is finally determined.

S14: input the face sample data into the low-resolution channel, and perform feature extraction using the high-resolution branch and the low-resolution branch in the low-resolution channel to obtain the second prediction result of the low-resolution channel.

The low-resolution channel includes a high-resolution branch and a low-resolution branch, which are used to extract high-resolution features and low-resolution features respectively; the second prediction result of the low-resolution channel is then determined based on the two extracted features.

S15: compute the second loss function based on the second prediction result and the labels, and update the parameters of the low-resolution channel to determine the target low-resolution channel.

The electronic device computes the second loss function on the second prediction result and the labels; the specific loss function used can be chosen according to actual requirements and is not limited here. After the loss value is computed, the parameters of the low-resolution channel are updated, and the target low-resolution channel is finally determined after multiple iterations.

S16: determine the target face recognition model based on the target near-infrared channel and the target low-resolution channel.

The electronic device determines the model formed by connecting the target near-infrared channel and the target low-resolution channel in parallel as the target face recognition model; that is, the target near-infrared channel and the target low-resolution channel are two parallel branches of the target face recognition model.
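The following is a minimal sketch of how the two trained channels could be wrapped into a single parallel model, assuming PyTorch; the class and module names are placeholders, since the patent does not give an implementation:

```python
import torch.nn as nn

class TwoChannelFaceModel(nn.Module):
    """Hypothetical wrapper: the target model is simply the two trained
    channels placed side by side; each input is scored by both."""
    def __init__(self, nir_channel: nn.Module, lowres_channel: nn.Module):
        super().__init__()
        self.nir_channel = nir_channel        # target near-infrared channel
        self.lowres_channel = lowres_channel  # target low-resolution channel

    def forward(self, x):
        # Both branches see the same face image, and each returns its own
        # recognition result (e.g. identity logits).
        return self.nir_channel(x), self.lowres_channel(x)
```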

In the training method for a face recognition model provided by this embodiment, backlit and dark-environment images and low-resolution images are first handled as two separate cases. For the former scenario, the source stream and the transfer stream form two parallel network structures; through transfer learning over this parallel structure, the multi-scale feature map information of the source stream is exploited, and the transfer stream maps visible-light and near-infrared features into a unified common space, eliminating the differences between modalities and improving heterogeneous face recognition. For the latter scenario, the low-resolution face-part parsing map is fed into the recognition network to optimize feature extraction; a feature-domain super-resolution technique is adopted, and the image-similarity loss function is integrated into the face parsing model. Finally, the two scenarios are routed to their corresponding channels, which solves the face recognition problem in these scenarios and ensures the reliability of the resulting target face recognition model.

This embodiment provides a training method for a face recognition model, which can be used in electronic devices such as monitoring terminals and servers. Fig. 2 is a flowchart of a training method for a face recognition model according to an embodiment of the present invention. As shown in Fig. 2, the process includes the following steps:

S21: obtain face sample data and its labels.

For details, refer to S11 of the embodiment shown in Fig. 1, which is not repeated here.

S22: input the face sample data into the near-infrared channel, and perform feature extraction using the source-stream network, the transfer-stream network, and the feature fusion units in the near-infrared channel to obtain the first prediction result of the near-infrared channel.

The near-infrared channel consists of a source-stream network, a transfer-stream network, and feature fusion units. Each module is designed as follows:

(a) Source-stream network: adopts a ResNet structure, shown as S-stream in Fig. 3;

(b) Transfer-stream network: a lightweight version of the source-stream structure, shown as T-Stream in Fig. 3;

(c) Feature fusion: convn.x in Fig. 3 denotes a module containing multiple convolutional layers, activation layers, and residual units. A feature fusion unit absorbs the multi-scale intermediate-layer features extracted by the source stream and fuses them with the corresponding feature maps generated by the transfer stream. In this embodiment, four feature fusion units connect the transfer-stream network and the source-stream network; the purpose of this design is to allow feature maps at different resolutions to be fed into the transfer-stream network. Specifically, a feature fusion unit consists of several convolutional layers and activation layers, formally defined as follows:

f = FFU(f_s, f_t) = H(Concat(f_s, f_t))

where f_s and f_t are the intermediate feature maps extracted from the corresponding source-stream and transfer-stream network structures, respectively, and f denotes the fused output feature map at that scale. H(·) denotes a residual block composed of two depthwise separable convolutions and the corresponding activation layers. Concat(·) denotes concatenation along the channel dimension.
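A minimal sketch of such a feature fusion unit, assuming PyTorch; the 1x1 projection used to match channel counts and the exact placement of the activations are assumptions, since the patent only specifies two depthwise separable convolutions with activations inside a residual block:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class FeatureFusionUnit(nn.Module):
    """f = FFU(f_s, f_t) = H(Concat(f_s, f_t)): concatenate the source-stream
    and transfer-stream feature maps along the channel dimension, then apply a
    residual block H built from two depthwise separable convolutions."""
    def __init__(self, channels: int):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, channels, 1)  # bring the concat back to `channels`
        self.h = nn.Sequential(
            DepthwiseSeparableConv(channels, channels), nn.ReLU(inplace=True),
            DepthwiseSeparableConv(channels, channels), nn.ReLU(inplace=True),
        )

    def forward(self, f_s, f_t):
        x = self.reduce(torch.cat([f_s, f_t], dim=1))  # Concat(f_s, f_t)
        return x + self.h(x)                           # residual block H(.)
```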

Specifically, the above S22 includes:

S221: fix the parameters of the source-stream network.

The source-stream network is obtained by pre-training on a visible-light dataset.

The source-stream network is pre-trained on a large-scale visible-light dataset, and its parameters are then fixed.
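A minimal sketch of fixing the pre-trained source-stream parameters, assuming PyTorch; `source_stream` is a placeholder for whatever module implements the S-stream:

```python
import torch.nn as nn

def freeze_source_stream(source_stream: nn.Module) -> nn.Module:
    """Fix the source-stream parameters so that only the transfer stream and
    the feature fusion units are updated during training."""
    for p in source_stream.parameters():
        p.requires_grad = False
    source_stream.eval()  # also freeze batch-norm running statistics
    return source_stream
```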

S222: input the face sample data into the near-infrared channel, and use the multi-scale intermediate-layer features extracted by the source-stream network as inputs to the transfer-stream network and the feature fusion units.

After the source-stream parameters are fixed, the face sample data is used to train the transfer-stream network, and the multi-scale intermediate-layer features extracted by the source-stream network serve as inputs to the transfer stream and the feature fusion units.

S223: use the feature fusion units to absorb the multi-scale intermediate-layer features produced by the source-stream network and fuse them into the feature maps of the transfer-stream network, so as to obtain the first prediction result of the near-infrared channel.

The feature fusion units absorb the multi-scale intermediate-layer features produced by the source stream, fuse them into the corresponding feature maps of the transfer-stream network, and perform transfer learning; on this basis, the first prediction result of the near-infrared channel is obtained.
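A minimal sketch of the resulting two-stream forward pass with per-scale fusion, assuming PyTorch and the `FeatureFusionUnit` sketched above; the four-stage decomposition and the classification head are placeholders rather than the patent's exact architecture:

```python
import torch.nn as nn

class NearInfraredChannel(nn.Module):
    """Frozen source stream + lightweight transfer stream, connected by four
    feature fusion units (one per scale), followed by a classifier head."""
    def __init__(self, s_stages, t_stages, fusion_units, head: nn.Module):
        super().__init__()
        self.s_stages = nn.ModuleList(s_stages)   # frozen ResNet stages (S-stream)
        self.t_stages = nn.ModuleList(t_stages)   # lightweight counterparts (T-stream)
        self.ffus = nn.ModuleList(fusion_units)   # four FeatureFusionUnit modules
        self.head = head                          # produces identity logits

    def forward(self, x):
        f_s, f_t = x, x
        for s_stage, t_stage, ffu in zip(self.s_stages, self.t_stages, self.ffus):
            f_s = s_stage(f_s)      # source-stream multi-scale intermediate feature
            f_t = t_stage(f_t)      # transfer-stream feature at the same scale
            f_t = ffu(f_s, f_t)     # fuse, then feed the result to the next scale
        return self.head(f_t)       # first prediction result
```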

S23: compute the first loss function based on the first prediction result and the labels, and update the parameters of the near-infrared channel to determine the target near-infrared channel.

A margin-based classification loss is designed to supervise model training and to optimize the face embedding formed from the transferred top-layer features. Specifically, the above S23 includes:

S231: compute a cross-entropy loss function based on the first prediction result and the labels to obtain a calculation result.

The cross-entropy loss function is designed to ensure more effectively that the distribution of the data features is compact. It is formally defined as follows:

ζ_cls = -log( exp(s(θ_y)) / Σ_{i=1}^{n} exp(s(θ_i)) )

where s(θ_y) is the class score of the sample for its ground-truth class y, and n is the number of classes in the training dataset.
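A minimal sketch of this loss, assuming PyTorch and treating s(θ_i) as the i-th entry of the logits produced by the near-infrared channel; any margin applied to the ground-truth score (the patent mentions a margin-based design without giving its exact form) would be folded into the logits before this step:

```python
import torch
import torch.nn.functional as F

def classification_loss(scores: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """scores: (batch, n) class scores s(theta_i); labels: (batch,) class indices y.
    Computes -log( exp(s(theta_y)) / sum_i exp(s(theta_i)) ), averaged over the batch."""
    log_probs = F.log_softmax(scores, dim=1)
    return -log_probs.gather(1, labels.unsqueeze(1)).mean()
```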

S232: update the parameters of the near-infrared channel based on the calculation result to determine the target near-infrared channel.

After obtaining the calculation result, the electronic device updates the parameters of the near-infrared channel and, after multiple training iterations, determines the target near-infrared channel.
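A minimal sketch of the resulting update loop, assuming PyTorch; only the parameters that were not frozen (the transfer stream and the fusion units) are handed to the optimizer, and the optimizer choice and hyperparameters are assumptions:

```python
import torch
import torch.nn.functional as F

def train_nir_channel(nir_channel, loader, epochs: int = 10, lr: float = 1e-3):
    """Iteratively update the unfrozen parts of the near-infrared channel
    with the cross-entropy loss defined above."""
    trainable = [p for p in nir_channel.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(trainable, lr=lr, momentum=0.9)
    for _ in range(epochs):
        for images, labels in loader:
            logits = nir_channel(images)              # first prediction result
            loss = F.cross_entropy(logits, labels)    # first loss function
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return nir_channel  # becomes the target near-infrared channel after convergence
```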

S24: input the face sample data into the low-resolution channel, and perform feature extraction using the high-resolution branch and the low-resolution branch in the low-resolution channel to obtain the second prediction result of the low-resolution channel.

The high-resolution branch and the low-resolution branch have the same network structure. A deep-learning-based method is used to super-resolve low-resolution images; Fig. 6 shows a comparison before and after super-resolution. A Siamese network is designed here: it has two branches, a high-resolution branch and a low-resolution branch, which do not share network parameters but have identical structures. The left part of Fig. 6 shows image-domain super-resolution, in which a low-resolution image is passed through a super-resolution model to generate a high-resolution image; the right part of Fig. 6 shows feature-domain super-resolution, in which the features extracted from the low-resolution image are constrained by a similarity measure against the features extracted from the high-resolution image. Specifically, the above S24 includes:

S241: input the face sample data into the high-resolution branch to obtain high-resolution features.

The high-resolution features are produced by the high-resolution branch, whose parameters can be pre-trained on a large-scale dataset; in the feature hallucination stage, this branch is used only to extract features from high-resolution face images.

S242: down-sample the face sample data and feed it into the feature extraction unit and the face parsing unit of the low-resolution branch, down-sample the parsing result of the face parsing unit at multiple scales and fuse it with the corresponding parts of the feature extraction unit, and obtain the low-resolution features.

The low-resolution branch processes low-resolution images; this branch is also pre-trained, and its parameters are continuously updated. The face parsing unit uses BiSeNet as its backbone network. The purpose of incorporating face-part parsing is to feed facial prior information into the recognition network, helping the recognition model extract more discriminative features from low-resolution face images. The main improvements are: a) the original interpolation-based upsampling is replaced with Pixel Shuffle upsampling, whose parameters can be learned; b) the ResNet backbone of the Context Path is replaced with the face recognition feature extraction network.
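A minimal sketch of improvement a), replacing interpolation with learnable Pixel Shuffle upsampling, assuming PyTorch; the kernel size and channel handling are illustrative:

```python
import torch.nn as nn

class PixelShuffleUpsample(nn.Module):
    """Learnable upsampling: a convolution expands the channels by a factor of
    scale*scale, and PixelShuffle rearranges them into spatial resolution."""
    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.expand = nn.Conv2d(channels, channels * scale * scale, 3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.expand(x))
```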

After obtaining the parsing result for the low-resolution face sample data, the electronic device feeds the parsing result into the low-resolution recognition branch as a feature map, providing prior information for the low-resolution branch model. The flow is shown in Fig. 5: the parsing result of the low-resolution face image is fed as a feature map into the recognition network modules at each scale. For a particular scale, the face parsing result is sampled to that scale (p), fused with the recognition feature f_r of that scale through the feature fusion module in Fig. 5, and the fused feature map f_m is then fed into the recognition network module of the next scale.
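A minimal sketch of this per-scale fusion of the parsing map with the recognition features, assuming PyTorch; the concatenate-and-project fusion module is an assumption where the patent only names a feature fusion module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParsingGuidedStage(nn.Module):
    """One recognition stage of the low-resolution branch: sample the face
    parsing map to the stage's resolution (p), fuse it with the recognition
    feature f_r, and pass the fused map f_m to the next-scale module."""
    def __init__(self, feat_ch: int, parse_ch: int, stage: nn.Module):
        super().__init__()
        self.fuse = nn.Conv2d(feat_ch + parse_ch, feat_ch, 1)  # assumed fusion module
        self.stage = stage

    def forward(self, f_r: torch.Tensor, parsing_map: torch.Tensor) -> torch.Tensor:
        p = F.interpolate(parsing_map, size=f_r.shape[-2:], mode="bilinear",
                          align_corners=False)       # parsing result sampled to this scale
        f_m = self.fuse(torch.cat([f_r, p], dim=1))   # fused feature map f_m
        return self.stage(f_m)                        # next-scale recognition module
```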

S243: determine the second prediction result based on the high-resolution features and the low-resolution features.

As shown in Fig. 4, classification prediction is performed by combining the high-resolution features and the low-resolution features to obtain the second prediction result.

S25: compute the second loss function based on the second prediction result and the labels, and update the parameters of the low-resolution channel to determine the target low-resolution channel.

Specifically, the above S25 includes:

S251: compute a classification loss based on the second prediction result and the labels.

S252: compute a feature loss based on the high-resolution features and the low-resolution features.

S253: compute the weighted sum of the classification loss and the feature loss to determine the loss result.

When parsing low-resolution face images, using a classification loss function alone is insufficient. The process by which the parsing model parses a low-resolution face is similar to an image generation process, so the loss functions used in super-resolution tasks are used to supervise and optimize the parsing result against the pixel-level labels. This is formally expressed as follows:

ζ_PF = ζ_PWCE + λ_1·ζ_L1[I_fP, I_gt] + λ_2·ζ_SSIM[I_fP, I_gt]

where ζ_PWCE denotes the pixel-level cross-entropy classification loss, ζ_L1 and ζ_SSIM denote the L1 loss and the structural similarity (SSIM) loss commonly used in super-resolution tasks, λ_1 and λ_2 are weighting coefficients, and I_fP and I_gt are the parsing map output by the parsing model and the pixel-level annotation, respectively.

Specifically, as expressed in the above formula, the feature loss is λ_1·ζ_L1[I_fP, I_gt] + λ_2·ζ_SSIM[I_fP, I_gt].
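A minimal sketch of this combined loss, assuming PyTorch; the SSIM term is delegated to a caller-supplied `ssim` function (for example from a third-party package), since the patent does not specify an implementation:

```python
import torch
import torch.nn.functional as F

def parsing_loss(parse_logits: torch.Tensor, parse_labels: torch.Tensor,
                 i_fp: torch.Tensor, i_gt: torch.Tensor, ssim,
                 lambda1: float = 1.0, lambda2: float = 1.0) -> torch.Tensor:
    """zeta_PF = zeta_PWCE + lambda1 * L1(I_fP, I_gt) + lambda2 * SSIM_loss(I_fP, I_gt)."""
    pwce = F.cross_entropy(parse_logits, parse_labels)   # pixel-level cross-entropy
    l1 = F.l1_loss(i_fp, i_gt)                           # L1 between parsing map and annotation
    ssim_loss = 1.0 - ssim(i_fp, i_gt)                   # SSIM expressed as a loss (assumption)
    return pwce + lambda1 * l1 + lambda2 * ssim_loss
```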

S254: update the parameters of the low-resolution channel based on the loss result to determine the target low-resolution channel.

The electronic device updates the parameters of the low-resolution channel on the basis of the computed loss result, and the target low-resolution channel is determined through continued iterative training.

The overall framework for recognizing lower-resolution images is shown in Fig. 4. The network backbone is a two-stream network consisting of a high-resolution recognition branch and a low-resolution recognition branch. The high-resolution recognition branch is pre-trained on a large-scale, high-resolution, visible-light general face recognition dataset, and its parameters are kept fixed during subsequent training. A low-resolution image first passes through the face parsing model to generate face-part parsing results, and its features are then extracted by the low-resolution recognition branch.

S26: determine the target face recognition model based on the target near-infrared channel and the target low-resolution channel.

For details, refer to S16 of the embodiment shown in Fig. 1, which is not repeated here.

In the training method for a face recognition model provided by this embodiment, the source stream and the transfer stream form two parallel network structures; through transfer learning over this parallel structure, the multi-scale feature map information of the source stream is exploited, and the transfer stream maps visible-light and near-infrared features into a unified common space, eliminating the differences between modalities and improving heterogeneous face recognition. The low-resolution face-part parsing map is fed into the recognition network to optimize feature extraction; a feature-domain super-resolution technique is adopted, and the image-similarity loss function is integrated into the face parsing model.

As a specific application example of this embodiment, the training method for the face recognition model includes:

(1) In the near-infrared recognition model training stage, the source-stream network is pre-trained on a large-scale visible-light dataset.

(2) In the near-infrared recognition model training stage, after the source-stream parameters are fixed, visible-light images and near-infrared images are used to train the transfer-stream network, and the multi-scale intermediate-layer features extracted by the source-stream network serve as inputs to the transfer stream and the feature fusion units.

(3) In the near-infrared recognition model training stage, the loss function is computed from the annotated data and the predicted data, and the model is optimized through repeated iterations.

(4) The training process of the low-resolution recognition model is similar to that of the near-infrared model and is not repeated here.

Compared with the prior art: many face detectors achieve high-precision real-time face detection, but these algorithms generally perform well only on upright faces and recognize poorly in environments such as backlighting and low resolution. This method splits such special cases into a low-resolution case and a poor-lighting case, and designs a two-channel solution for these two classes of problems. In the near-infrared recognition scheme, the source stream and the transfer stream form two parallel network structures; through transfer learning over this parallel structure, the multi-scale feature map information of the source stream is exploited, and the transfer stream maps visible-light and near-infrared features into a unified common space, eliminating the differences between modalities and improving heterogeneous face recognition. In the low-resolution recognition scheme, the low-resolution face-part parsing map is fed into the recognition network to optimize feature extraction; a feature-domain super-resolution technique is adopted, and the image-similarity loss function is integrated into the face parsing model.

This embodiment provides a face recognition method, which can be used in electronic devices such as monitoring terminals and servers. Fig. 7 is a flowchart of a face recognition method according to an embodiment of the present invention. As shown in Fig. 7, the process includes the following steps:

S31: obtain the face image to be recognized.

The face image to be recognized may be captured in real time by a monitoring device, stored in the electronic device, or captured by a monitoring device and then sent to the electronic device, and so on.

S32: input the face image to be recognized into the target face recognition model to obtain the first recognition result of the near-infrared channel and the second recognition result of the low-resolution channel.

The target face recognition model recognizes the input face image with the near-infrared channel and the low-resolution channel respectively, obtaining the first recognition result and the second recognition result correspondingly.

S33: determine the recognition result of the face image to be recognized based on the union of the first recognition result and the second recognition result.

If either the first recognition result or the second recognition result indicates successful recognition, the recognition is considered successful; that is, the union of the two recognition results is taken to obtain the recognition result of the face image to be recognized.
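A minimal sketch of taking the union of the two channels' results at inference time, assuming each channel returns the set of matched identities; how a single channel decides a match is not specified in the patent:

```python
from typing import Set

def combine_results(nir_matches: Set[str], lowres_matches: Set[str]) -> Set[str]:
    """Recognition succeeds if either channel succeeds: the final result is the
    union of the identities returned by the two channels."""
    return nir_matches | lowres_matches
```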

In the face recognition method provided by this embodiment, because the target face recognition model is a two-channel model, it can adapt to different imaging environments; therefore, recognizing the face image to be recognized with this target face recognition model yields more accurate recognition results.

This face recognition method splits heterogeneous face recognition into a low-resolution case and a poor-lighting case, and designs a two-channel solution for these two classes of problems. Specifically, in the near-infrared recognition scheme, the source stream and the transfer stream form two parallel network structures; through transfer learning over this parallel structure, the multi-scale feature map information of the source stream is exploited, and the transfer stream maps visible-light and near-infrared features into a unified common space, eliminating the differences between modalities and improving heterogeneous face recognition. In the low-resolution recognition scheme, the low-resolution face-part parsing map is fed into the recognition network to optimize feature extraction; a feature-domain super-resolution technique is adopted, and the image-similarity loss function is integrated into the face parsing model.

This embodiment also provides a face recognition apparatus, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiment is preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.

This embodiment provides a face recognition apparatus, including:

an acquisition module, configured to obtain the face image to be recognized;

a recognition module, configured to input the face image to be recognized into the target face recognition model to obtain the first recognition result of the near-infrared channel and the second recognition result of the low-resolution channel; and

a determination module, configured to determine the recognition result of the face image to be recognized based on the union of the first recognition result and the second recognition result.

The face recognition apparatus in this embodiment is presented in the form of functional units, where a unit refers to an ASIC circuit, a processor and memory executing one or more pieces of software or firmware, and/or other devices that can provide the above functions.

Further functional descriptions of the above modules are the same as those of the corresponding embodiments above and are not repeated here.

An embodiment of the present invention further provides an electronic device. Referring to Fig. 8, which is a schematic structural diagram of an electronic device provided by an optional embodiment of the present invention, the electronic device may include: at least one processor 41, such as a CPU (Central Processing Unit); at least one communication interface 43; a memory 44; and at least one communication bus 42. The communication bus 42 is used to implement connection and communication between these components. The communication interface 43 may include a display (Display) and a keyboard (Keyboard); optionally, the communication interface 43 may also include a standard wired interface and a wireless interface. The memory 44 may be a high-speed RAM (Random Access Memory, a volatile random access memory) or a non-volatile memory, such as at least one disk memory. Optionally, the memory 44 may also be at least one storage device located remotely from the aforementioned processor 41. The memory 44 stores an application program, and the processor 41 calls the program code stored in the memory 44 to perform any of the above method steps.

The communication bus 42 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 42 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in Fig. 8, but this does not mean that there is only one bus or only one type of bus.

The memory 44 may include volatile memory, such as random-access memory (RAM); the memory may also include non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 44 may also include a combination of the above types of memory.

The processor 41 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.

The processor 41 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.

Optionally, the memory 44 is also used to store program instructions. The processor 41 may invoke the program instructions to implement the training method for a face recognition model, or the face recognition method, shown in any embodiment of the present application.

An embodiment of the present invention further provides a non-transitory computer storage medium storing computer-executable instructions, where the computer-executable instructions can execute the training method for a face recognition model, or the face recognition method, of any of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the storage medium may also include a combination of the above types of memory.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

Translated from Chinese
1. A training method for a face recognition model, wherein the face recognition model comprises a near-infrared channel and a low-resolution channel in parallel, and the method comprises:
acquiring face sample data and labels thereof;
inputting the face sample data into the near-infrared channel, and performing feature extraction by using a source flow network, a migration flow network and a feature fusion unit in the near-infrared channel, to obtain a first prediction result of the near-infrared channel;
performing a first loss function calculation based on the first prediction result and the labels, and updating parameters of the near-infrared channel, to determine a target near-infrared channel;
inputting the face sample data into the low-resolution channel, and performing feature extraction by using a high-resolution branch and a low-resolution branch in the low-resolution channel, to obtain a second prediction result of the low-resolution channel;
performing a second loss function calculation based on the second prediction result and the labels, and updating parameters of the low-resolution channel, to determine a target low-resolution channel;
determining a target face recognition model based on the target near-infrared channel and the target low-resolution channel.

2. The method according to claim 1, wherein the inputting the face sample data into the near-infrared channel and performing feature extraction by using the source flow network, the migration flow network and the feature fusion unit in the near-infrared channel to obtain the first prediction result of the near-infrared channel comprises:
fixing parameters of the source flow network, the source flow network being obtained by pre-training on a visible-light data set;
inputting the face sample data into the near-infrared channel, and using multi-scale intermediate-layer features extracted by the source flow network as inputs of the migration flow network and the feature fusion unit;
absorbing, by the feature fusion unit, the multi-scale intermediate-layer features generated by the source flow network and fusing them into feature maps of the migration flow network, to obtain the first prediction result of the near-infrared channel.

3. The method according to claim 2, wherein the performing the first loss function calculation based on the first prediction result and the labels and updating the parameters of the near-infrared channel to determine the target near-infrared channel comprises:
calculating a cross-entropy loss function based on the first prediction result and the labels, to obtain a calculation result;
updating the parameters of the near-infrared channel based on the calculation result, to determine the target near-infrared channel.

4. The method according to claim 1, wherein the high-resolution branch and the low-resolution branch have the same network structure, and the inputting the face sample data into the low-resolution channel and performing feature extraction by using the high-resolution branch and the low-resolution branch in the low-resolution channel to obtain the second prediction result of the low-resolution channel comprises:
inputting the face sample data into the high-resolution branch to obtain high-resolution features;
down-sampling the face sample data and then inputting it into a feature extraction unit and a face parsing unit of the low-resolution branch respectively, performing multi-scale down-sampling on a parsing result of the face parsing unit, and fusing it with corresponding parts of the feature extraction unit, to obtain low-resolution features;
determining the second prediction result based on the high-resolution features and the low-resolution features.

5. The method according to claim 4, wherein the performing the second loss function calculation based on the second prediction result and the labels and updating the parameters of the low-resolution channel to determine the target low-resolution channel comprises:
calculating a classification loss based on the second prediction result and the labels;
calculating a feature loss based on the high-resolution features and the low-resolution features;
calculating a weighted sum of the classification loss and the feature loss to determine a loss result;
updating the parameters of the low-resolution channel based on the loss result, to determine the target low-resolution channel.

6. The training method according to claim 1, wherein the determining the target face recognition model based on the target near-infrared channel and the target low-resolution channel comprises:
determining a model obtained by connecting the target near-infrared channel and the target low-resolution channel in parallel as the target face recognition model.

7. A face recognition method, comprising:
acquiring a face image to be recognized;
inputting the face image to be recognized into a target face recognition model, to obtain a first recognition result of the near-infrared channel and a second recognition result of the low-resolution channel;
determining a recognition result of the face image to be recognized based on a union of the first recognition result and the second recognition result.

8. A face recognition apparatus, comprising:
an acquisition module, configured to acquire a face image to be recognized;
a recognition module, configured to input the face image to be recognized into a target face recognition model, to obtain a first recognition result of the near-infrared channel and a second recognition result of the low-resolution channel;
a determination module, configured to determine a recognition result of the face image to be recognized based on a union of the first recognition result and the second recognition result.

9. An electronic device, comprising:
a memory and a processor, the memory and the processor being communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the training method of the face recognition model according to any one of claims 1 to 6, or the face recognition method according to claim 7.

10. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a computer to perform the training method of the face recognition model according to any one of claims 1 to 6, or the face recognition method according to claim 7.
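The loss calculations of claims 3 and 5 and the result fusion of claim 7 can be illustrated with a short sketch. The code below is only one possible reading of those claims, assuming a PyTorch-style implementation; the function names, the use of MSE as the feature loss, and the weighting factor alpha are assumptions for illustration and are not fixed by the claims.

```python
import torch
import torch.nn.functional as F


def near_infrared_channel_loss(first_prediction: torch.Tensor,
                               labels: torch.Tensor) -> torch.Tensor:
    """First loss function (claim 3): cross-entropy between the near-infrared
    channel's first prediction result and the labels of the face sample data."""
    return F.cross_entropy(first_prediction, labels)


def low_resolution_channel_loss(second_prediction: torch.Tensor,
                                labels: torch.Tensor,
                                high_res_features: torch.Tensor,
                                low_res_features: torch.Tensor,
                                alpha: float = 0.5) -> torch.Tensor:
    """Second loss function (claim 5): weighted sum of a classification loss and
    a feature loss. Using MSE as the feature loss and alpha = 0.5 as the weight
    are assumptions; the claim only requires a feature loss and a weighted sum."""
    classification_loss = F.cross_entropy(second_prediction, labels)
    feature_loss = F.mse_loss(low_res_features, high_res_features.detach())
    return classification_loss + alpha * feature_loss


def fuse_recognition_results(first_result: set, second_result: set) -> set:
    """Claim 7: the final recognition result is the union of the first recognition
    result (near-infrared channel) and the second recognition result
    (low-resolution channel)."""
    return first_result | second_result
```

Detaching the high-resolution features treats the high-resolution branch as a fixed reference while the feature loss pulls the low-resolution features towards it; this matches the spirit of claim 5 but is only one possible design choice.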
Application CN202111581961.2A, priority date 2021-12-22, filing date 2021-12-22: Face recognition model training, face recognition method and electronic device. Status: Active. Granted publication: CN114445873B (en).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111581961.2A (granted as CN114445873B) | 2021-12-22 | 2021-12-22 | Face recognition model training, face recognition method and electronic device


Publications (2)

Publication Number | Publication Date
CN114445873A | 2022-05-06
CN114445873B (en) | 2024-10-25

Family

ID=81363851

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202111581961.2A (Active, granted as CN114445873B) | Face recognition model training, face recognition method and electronic device | 2021-12-22 | 2021-12-22

Country Status (1)

Country | Link
CN (1) | CN114445873B (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20180096595A1 (en)* | 2016-10-04 | 2018-04-05 | Street Simplified, LLC | Traffic Control Systems and Methods
CN111209901A (en)* | 2020-04-20 | 2020-05-29 | 湖南极点智能科技有限公司 | Face recognition method, system and related device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Hyunju Maeng: "NFRAD: Near-Infrared Face Recognition at a Distance", 2011 International Joint Conference on Biometrics (IJCB), 31 December 2011, pages 1-12 *
张典: "Research on Real-Time Face Recognition Algorithms Based on Lightweight Networks and Heterogeneous Image Fusion", China Master's Theses Full-text Database, Information Science and Technology, no. 05, 31 May 2021, pages 138-1131 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117830645A (en)* | 2024-02-23 | 2024-04-05 | 中国科学院空天信息创新研究院 | Feature extraction network training method, device, equipment and medium
CN117830645B (en)* | 2024-02-23 | 2025-01-28 | 中国科学院空天信息创新研究院 | Remote sensing basic model network rapid fine-tuning method and device based on parameter freezing

Also Published As

Publication number | Publication date
CN114445873B (en) | 2024-10-25

Similar Documents

Publication | Title
CN110516670B (en) | Target detection method based on scene level and area suggestion self-attention module
CN109241895B (en) | Dense crowd counting method and device
CN112149637B (en) | Method and device for generating a target re-recognition model and for target re-recognition
CN109784183B (en) | Video saliency object detection method based on cascaded convolutional networks and optical flow
CN112132156A (en) | Multi-depth feature fusion image saliency target detection method and system
KR102606734B1 (en) | Method and apparatus for spoof detection
CN111931859B (en) | Multi-label image recognition method and device
CN115131281B (en) | Change detection model training and image change detection method, device and equipment
CN110781980B (en) | Training method of target detection model, target detection method and device
CN112836597A (en) | Multi-hand pose keypoint estimation method based on cascaded parallel convolutional neural network
US11195024B1 (en) | Context-aware action recognition by dual attention networks
CN113553909A (en) | Model training method for skin detection, skin detection method
CN115345905A (en) | Target object tracking method, device, terminal and storage medium
Yuan et al. | CurSeg: A pavement crack detector based on a deep hierarchical feature learning segmentation framework
CN116205927A (en) | Image segmentation method based on boundary enhancement
CN110427915B (en) | Method and apparatus for outputting information
WO2025030907A1 (en) | Low-altitude unmanned aerial vehicle image change detection method and system
CN114445873B (en) | Face recognition model training, face recognition method and electronic device
Chen et al. | LMSA-Net: A lightweight multi-scale aware network for retinal vessel segmentation
CN113343979A (en) | Method, apparatus, device, medium and program product for training a model
WO2024160275A1 (en) | Semantic segmentation model training method and device, and semantic segmentation method and device
WO2024230076A1 (en) | Positioning verification method and apparatus, and electronic device and storage medium
CN114998702B (en) | Entity recognition and knowledge graph generation method and system based on BlendMask
CN115909408A (en) | A method and device for pedestrian re-identification based on Transformer network
CN117292122A (en) | RGB-D significance object detection and semantic segmentation method and system

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
