Technical Field
The invention relates to the fields of deep learning and image processing and recognition, and in particular to a face recognition method incorporating an attention mechanism.
Background Art
Face recognition has been one of the most challenging topics in computer vision and machine learning in recent years and has received extensive attention from researchers. Effective face recognition has broad application prospects and can play an important role in scenarios such as national defense and security, video surveillance, human-computer interaction, and video indexing.
At present, most CNN-based feature extraction networks use a classification loss (Softmax Loss) as the supervision signal for training. These networks take classification as the learning objective, so the distance between different classes gradually increases during training. DeepFace uses a classification network together with a complex 3D alignment scheme and a large amount of training data. DeepID first divides the face image into patches, then uses multiple classification networks to extract features from the different patches, and finally fuses these features with the Joint Bayesian algorithm. Because features are extracted from separate face patches, the data set grows to several times the size of the original image set, which greatly increases training time and consumes substantial computing resources. In addition, the patches are divided in a strictly fixed manner, so accuracy drops considerably on profile or otherwise irregular face images, and the algorithm is not robust enough.
Summary of the Invention
To overcome the defects of the prior art, the present invention provides a face recognition method incorporating an attention mechanism. Through the attention module, the neural network automatically learns discriminative face patch features instead of relying on a fixed division of the face into patches. Features extracted in this way are more conducive to improving classification accuracy and are more robust. In addition, because the attention module has a simple structure, it consumes few computing resources and the network converges quickly.
To achieve the above object, the present invention adopts the following technical solution:
The invention discloses a face recognition method incorporating an attention mechanism, comprising the following steps:
S1: performing image preprocessing with a cascaded convolutional neural network to obtain aligned face images;
S2: performing data augmentation on the preprocessed images, the data augmentation comprising random cropping and random flipping: a region of a set size is randomly cropped from the image processed in step S1, the image is flipped with a set probability, and the image is finally whitened; a test sample is instead directly normalized to an image of a set size and then whitened, this set size being the same as the set size used for random cropping;
S3: providing an attention mechanism module that allows the network to automatically learn discriminative face patch features; the attention mechanism module applies convolution operations to the input image and then performs fully connected regression to output M angle values, M being a natural number; a matrix is constructed from the M angle values, and local features of the image are extracted by matrix operations;
S4: building an attention mechanism network, in which a deep neural network extracts image features and the attention mechanism module is added; the attention mechanism network comprises a main path and a branch path; the main path is the output obtained after the image passes through the deep neural network; the branch path is the output obtained by passing the output of each stage of the deep neural network through a separate attention mechanism module and then combining the results in sequence by elementwise-add; finally, the outputs of the main path and the branch path are concatenated to obtain the final image feature map, which is used both to compute the loss function and as the feature for face recognition;
S5: training the attention mechanism network with a face recognition loss function and saving the trained network;
S6: extracting image features by feeding the test sample into the trained attention mechanism network to obtain high-quality image features;
S7: performing face recognition by classifying the extracted image features with the softmax regression method, thereby completing recognition of the test sample.
As a preferred technical solution, the cascaded convolutional neural network in step S1 is MTCNN, which comprises P-Net, R-Net and O-Net. Any image to be tested is scaled to different ratios to build an image pyramid, which is then fed through P-Net, R-Net and O-Net in turn to extract face candidate boxes. Training fits three objectives: face/non-face classification, bounding box regression, and facial landmark coordinate regression. The loss functions are as follows:
For face/non-face classification, MTCNN uses cross-entropy as the loss function, denoted $L_{det}$:
$$L_{det}^{(i)} = -\left( y_{det}^{(i)} \log p^{(i)} + \left(1 - y_{det}^{(i)}\right) \log\left(1 - p^{(i)}\right) \right)$$
where $p^{(i)}$ is the probability predicted by the model and $y_{det}^{(i)} \in \{0, 1\}$ is the label of sample $x^{(i)}$.
For bounding box regression, MTCNN uses the L2 loss as the loss function, denoted $L_{box}$:
$$L_{box}^{(i)} = \left\| \hat{y}_{box}^{(i)} - y_{box}^{(i)} \right\|_2^2$$
where $\hat{y}_{box}^{(i)}$ is the regression value predicted by the model, $y_{box}^{(i)}$ is the true coordinate value of sample $x^{(i)}$, and $y_{box}^{(i)} \in \mathbb{R}^4$.
For facial landmark coordinate regression, MTCNN likewise uses the L2 loss as the loss function, denoted $L_{landmark}$:
$$L_{landmark}^{(i)} = \left\| \hat{y}_{landmark}^{(i)} - y_{landmark}^{(i)} \right\|_2^2$$
where $\hat{y}_{landmark}^{(i)}$ is the regression value predicted by the model, $y_{landmark}^{(i)}$ is the true landmark coordinate value of sample $x^{(i)}$, and $y_{landmark}^{(i)} \in \mathbb{R}^{10}$.
As a preferred technical solution, MTCNN introduces an overall objective function that excludes non-face data from the loss terms to which it is irrelevant. The overall objective function is:
$$\min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} \alpha_j \, \beta_j^{(i)} \, L_j^{(i)}$$
where N is the total number of training samples, $\alpha_j$ is the importance of the corresponding objective within the overall objective function, and $\beta_j^{(i)} \in \{0, 1\}$ indicates whether sample $x^{(i)}$ belongs to type j. For P-Net and R-Net the weights are ($\alpha_{det}=1$, $\alpha_{box}=0.5$, $\alpha_{landmark}=0.5$); for O-Net the weights are ($\alpha_{det}=1$, $\alpha_{box}=0.5$, $\alpha_{landmark}=1$).
As a preferred technical solution, the attention mechanism module in step S3 is an STN module comprising a localisation network module, a grid generator and a sampler.
The localisation network module applies convolution operations to the input image and then performs fully connected regression to output 6 angle values, which form a 2*3 matrix.
The grid generator uses a matrix operation to compute, for each position in the target image V, the corresponding coordinate position in the original image U, that is, it generates $T_\theta(G_i)$:
$$\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix} = T_\theta(G_i) = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix} = \begin{bmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{bmatrix} \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}$$
where $(x_i^s, y_i^s)$ are coordinates in the original image, $(x_i^t, y_i^t)$ are coordinates in the target image, and $A_\theta$ holds the 6 angle values regressed by the localisation network module.
The sampler samples the original image U according to the coordinate information in $T_\theta(G)$ and copies the pixels of U into the target image V.
As a preferred technical solution, in step S4 the base network of the deep neural network is resnet50, which comprises 5 stages as follows:
Stage0: a convolution layer and a pooling layer; the convolution kernel size is 7x7, the number of output channels is 64 and the stride is 2; the pooling layer uses maxpooling with a 3x3 window and a stride of 2;
Stage1: 3 blocks with 256 output channels;
Stage2: 4 blocks with 512 output channels;
Stage3: 5 blocks with 1024 output channels;
Stage4: 6 blocks with 2048 output channels.
The branch network feeds the image feature maps produced by stage0, 1, 2, 3 and 4 of the base network resnet50 into separate STN modules to obtain features L0, L1, L2, L3 and L4. Each of L1 to L4 undergoes one convolution with a 1x1 kernel and a stride of 1, whose number of output channels equals the number of channels of the preceding feature. The features are then added in sequence by elementwise-add:
L0+f(L1)+f(L2)+f(L3)+f(L4)
where "+" is the elementwise-add operation and f(·) is the convolution operation.
As a preferred technical solution, each block is structured as follows:
a 1x1 convolution reduces the dimension, a 3x3 convolution follows, and a further 1x1 convolution restores the dimension; the block output is the result of an elementwise-add between this output and the block input.
Finally, a 128-dimensional fully connected layer is added for dimensionality reduction.
As a preferred technical solution, the face recognition loss function in step S5 is the Softmax function, and the k-th output of the Softmax-based classification model is:
$$z_k = W_k^{T} x + b_k$$
where $W_k$ and $b_k$ are the two parameters of the Softmax layer, meaning that there are K sets of weights and biases.
As a preferred technical solution, the Softmax layer is an un-activated fully connected layer.
As a preferred technical solution, after the output of the Softmax layer is transformed, the posterior probability of the k-th class is:
$$P(y = k \mid x) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}$$
To maximise the probability of the class to which each sample belongs, the Softmax Loss is defined as:
$$L_{softmax}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log P\left(y^{(i)} \mid x^{(i)}; \theta\right)$$
where θ denotes the model parameters and $y^{(i)}$ denotes the class to which sample $x^{(i)}$ belongs.
As a preferred technical solution, the Softmax-based classification model further comprises an optimizer, and the optimizer is Adam.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
(1) Starting from the goal of extracting more discriminative local face features, the invention designs an attention mechanism module within the framework of a base neural network and combines it with the deep neural network through a distinctive connection scheme. The resulting face recognition method can extract face features rich in class-relevant information.
(2) The invention performs data augmentation on the preprocessed images, including random cropping and random flipping, to increase the amount of training sample data; augmenting the training set strengthens the robustness of the network.
(3) The attention mechanism module of the invention is an STN module comprising a localisation network module, a grid generator and a sampler. The STN module has a simple structure, consumes few computing resources, and lets the network converge quickly.
Description of the Drawings
Fig. 1 is a schematic structural diagram of the face alignment network of the present invention;
Fig. 2 is a schematic structural diagram of the STN module of the present invention;
Fig. 3 is a schematic structural diagram of the base deep convolutional neural network of the present invention;
Fig. 4 is a schematic diagram of the block structure in the base deep convolutional neural network of the present invention;
Fig. 5 is a schematic structural diagram of the attention mechanism network of the present invention.
Detailed Description
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
This embodiment discloses a face recognition algorithm incorporating an attention mechanism, the algorithm comprising the following steps:
Step 1: Data preprocessing by face detection and face alignment with a cascaded neural network. The cascaded convolutional neural network used is MTCNN, whose cascade consists mainly of three convolutional neural networks: P-Net, R-Net and O-Net. Given an image to be detected, the image is first scaled to different ratios to build its scale space and is then fed through the three networks in turn to extract face candidate boxes. As shown in Fig. 1, the algorithm consists of three stages: in the first stage, a shallow CNN quickly generates candidate windows; in the second stage, a more complex CNN refines the candidate windows and discards a large number of overlapping windows; in the third stage, a more powerful CNN decides which candidate windows to keep and outputs the positions of five facial landmarks. During model training, in order to combine the face detection and face alignment tasks, MTCNN fits three objectives simultaneously: face/non-face classification, bounding box regression, and facial landmark coordinate regression. The three loss functions are as follows:
(1) Face/non-face classification
Face/non-face classification is a binary classification problem, so MTCNN uses cross-entropy as the loss function, denoted $L_{det}$. For each sample $x^{(i)}$:
$$L_{det}^{(i)} = -\left( y_{det}^{(i)} \log p^{(i)} + \left(1 - y_{det}^{(i)}\right) \log\left(1 - p^{(i)}\right) \right)$$
where $p^{(i)}$ is the probability predicted by the model and $y_{det}^{(i)} \in \{0, 1\}$ is the label of sample $x^{(i)}$.
(2) Bounding box regression
The purpose of bounding box regression is to estimate, for each face candidate box, its offset from the nearby ground-truth face region in terms of left, top, width and height. Bounding box regression is therefore a regression problem with these 4 values as targets, so MTCNN uses the L2 loss as the loss function, denoted $L_{box}$. For each sample $x^{(i)}$:
$$L_{box}^{(i)} = \left\| \hat{y}_{box}^{(i)} - y_{box}^{(i)} \right\|_2^2$$
where $\hat{y}_{box}^{(i)}$ is the regression value predicted by the model and $y_{box}^{(i)}$ is the true coordinate value of sample $x^{(i)}$. Because the regression target has 4 values, $y_{box}^{(i)} \in \mathbb{R}^4$.
(3) Facial landmark coordinate regression
Facial landmark coordinate regression is likewise a regression problem. MTCNN detects only 5 facial landmarks, each with an x and a y coordinate, so there are 10 regression targets in total. The L2 loss is again used as the loss function, denoted $L_{landmark}$. For each sample $x^{(i)}$:
$$L_{landmark}^{(i)} = \left\| \hat{y}_{landmark}^{(i)} - y_{landmark}^{(i)} \right\|_2^2$$
where $\hat{y}_{landmark}^{(i)}$ is the regression value predicted by the model and $y_{landmark}^{(i)}$ is the true landmark coordinate value of sample $x^{(i)}$. Because the regression target has 10 values, $y_{landmark}^{(i)} \in \mathbb{R}^{10}$.
(4) Overall objective function
Fitting different objectives at the same time requires different types of training data, such as non-face images, partial-face images, and face data annotated with landmarks. Not all data is meaningful for every objective function; for example, non-face data is meaningless for $L_{landmark}$. During training, therefore, not every sample needs to take part in every loss computation. To distinguish the different samples, MTCNN introduces a sample type indicator $\beta_j^{(i)} \in \{0, 1\}$ that denotes whether sample $x^{(i)}$ belongs to type j, and the overall objective function is expressed as
$$\min \sum_{i=1}^{N} \sum_{j \in \{det,\, box,\, landmark\}} \alpha_j \, \beta_j^{(i)} \, L_j^{(i)}$$
where N is the total number of training samples and $\alpha_j$ is the importance of the corresponding objective within the overall objective function. For P-Net and R-Net the weights are ($\alpha_{det}=1$, $\alpha_{box}=0.5$, $\alpha_{landmark}=0.5$). For O-Net, to ensure the accuracy of the facial landmarks, the weight of the landmark coordinate regression objective is raised, giving ($\alpha_{det}=1$, $\alpha_{box}=0.5$, $\alpha_{landmark}=1$).
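As a non-limiting illustration, the three per-sample losses and the weighted overall objective described above can be sketched in plain Python. The dictionary keys (`beta`, `p`, `y_det`, `box_pred` and so on) are hypothetical names chosen for this sketch, and the clamping constant `eps` is an added numerical safeguard; only the loss forms and the α weights come from the text.

```python
import math

def det_loss(p, y):
    """Cross-entropy for face / non-face classification; y is 0 or 1."""
    eps = 1e-12  # assumption: clamp to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def l2_loss(pred, target):
    """Squared Euclidean distance, used for box and landmark regression."""
    return sum((a - b) ** 2 for a, b in zip(pred, target))

# Task weights quoted in the text.
ALPHA_PNET_RNET = {"det": 1.0, "box": 0.5, "landmark": 0.5}
ALPHA_ONET = {"det": 1.0, "box": 0.5, "landmark": 1.0}

def total_objective(samples, alpha):
    """Sum alpha_j * beta_j * L_j over all samples and all three tasks.

    Each sample is a dict with hypothetical keys:
    'beta' (task name -> 0 or 1), 'p', 'y_det',
    'box_pred', 'box_true', 'lm_pred', 'lm_true'.
    """
    total = 0.0
    for s in samples:
        losses = {
            "det": det_loss(s["p"], s["y_det"]),
            "box": l2_loss(s["box_pred"], s["box_true"]),
            "landmark": l2_loss(s["lm_pred"], s["lm_true"]),
        }
        total += sum(alpha[j] * s["beta"][j] * losses[j] for j in losses)
    return total
```

For a non-face sample, setting `beta` to `{"det": 1, "box": 0, "landmark": 0}` makes only the classification term contribute, which is the exclusion behaviour the indicator is introduced for.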
Step 2: Data augmentation
Data augmentation uses random cropping and random flipping. The former randomly crops a 160x160 region from the image processed in Step 1; the latter flips the image with a probability of 0.5. The image is then whitened. A test sample is instead directly normalized to a 160x160 image and then whitened in the same way.
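The augmentation described above can be sketched, by way of example only, on a single-channel image held as a nested list. Two points are assumptions of this sketch rather than statements of the text: the flip is taken to be horizontal, and whitening is taken to mean per-image standardisation (zero mean, unit variance). The function names are illustrative.

```python
import math
import random

def random_crop(img, size, rng):
    """Cut a size x size window at a random position from a 2-D list."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - size)    # randint is inclusive on both ends
    left = rng.randint(0, w - size)
    return [row[left:left + size] for row in img[top:top + size]]

def random_flip(img, prob, rng):
    """Mirror the image left-to-right with probability prob."""
    if rng.random() < prob:
        return [row[::-1] for row in img]
    return img

def whiten(img):
    """Per-image standardisation: subtract the mean, divide by the std."""
    pixels = [v for row in img for v in row]
    mean = sum(pixels) / len(pixels)
    var = sum((v - mean) ** 2 for v in pixels) / len(pixels)
    # Lower bound on std avoids division by zero for constant images.
    std = max(math.sqrt(var), 1.0 / math.sqrt(len(pixels)))
    return [[(v - mean) / std for v in row] for row in img]

def augment(img, size=160, flip_prob=0.5, rng=random):
    """Training-time pipeline: random crop, random flip, whitening."""
    return whiten(random_flip(random_crop(img, size, rng), flip_prob, rng))
```

Passing a seeded `random.Random` instance as `rng` makes the crop position and flip decision reproducible; a test sample would skip `random_crop` and `random_flip` and be resized to 160x160 before `whiten`.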
Step 3: Designing the attention mechanism module
The attention mechanism module is an STN module. As shown in Fig. 2, the STN module consists of three parts: the localisation network module (Localisation Network), the grid generator (Grid generator) and the sampler (Sampler).
Localisation Network: this is a simple regression network. It applies several convolution operations to the input image and then uses fully connected regression to output 6 angle values (assuming an affine transformation), which form a 2*3 matrix.
Grid generator: the grid generator uses a matrix operation to compute, for each position in the target image V, the corresponding coordinate position in the original image U, that is, it generates $T_\theta(G_i)$.
For a two-dimensional affine transformation (rotation, translation, scaling), this grid generation is a simple matrix operation:
$$\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix} = T_\theta(G_i) = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix} = \begin{bmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{bmatrix} \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}$$
In the above formula, $(x_i^s, y_i^s)$ are coordinates in the original image and $(x_i^t, y_i^t)$ are coordinates in the target image. $A_\theta$ holds the 6 angle values regressed by the Localisation Network.
Sampler: the sampler samples the original image U according to the coordinate information in $T_\theta(G_i)$ and copies the pixels of U into the target image V.
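A minimal sketch of the grid generator and sampler is given below for illustration. It assumes coordinates normalised to [-1, 1], bilinear interpolation, and zero padding outside the source image; the text itself only states that pixels of U are copied into V, so these choices and the function names are assumptions of the sketch.

```python
import math

def affine_grid(theta, out_h, out_w):
    """For every target pixel, compute the source coordinate
    A_theta @ (x_t, y_t, 1). theta is a 2x3 nested list."""
    grid = []
    for i in range(out_h):
        y_t = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            x_t = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            row.append((x_s, y_s))
        grid.append(row)
    return grid

def bilinear_sample(U, grid):
    """Fill the target image V by interpolating U at the grid coordinates."""
    h, w = len(U), len(U[0])

    def pixel(r, c):
        # Zero padding outside the source image.
        return U[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    V = []
    for grid_row in grid:
        out_row = []
        for x_s, y_s in grid_row:
            # Map normalised coordinates back to pixel indices.
            x = (x_s + 1.0) * (w - 1) / 2.0
            y = (y_s + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(x), math.floor(y)
            dx, dy = x - x0, y - y0
            out_row.append(
                pixel(y0, x0) * (1 - dx) * (1 - dy)
                + pixel(y0, x0 + 1) * dx * (1 - dy)
                + pixel(y0 + 1, x0) * (1 - dx) * dy
                + pixel(y0 + 1, x0 + 1) * dx * dy
            )
        V.append(out_row)
    return V
```

With the identity matrix `[[1, 0, 0], [0, 1, 0]]` the module reproduces its input; a matrix such as `[[0.5, 0, -0.5], [0, 0.5, -0.5]]` selects the top-left region, which is how a learned matrix can attend to a local face patch.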
Step 4: Building the attention mechanism network
Feature extraction uses a deep neural network. The base network is resnet50, and the attention mechanism module is added on top of it. The attention mechanism module is the STN module: it applies several convolution operations to the input features and then uses fully connected regression to output 6 angle values (assuming an affine transformation), which form a 2*3 matrix. Applying this matrix to the input then yields locally meaningful features.
The network is divided into a main path and a branch path. The main path is the output obtained by passing the image through resnet50; the branch path is the output obtained by passing the stage outputs through separate STN modules and then combining them in sequence by elementwise-add.
Main path: resnet50, which consists of 5 stages, each comprising several convolution and pooling operations.
As shown in Fig. 3, resnet50 can be divided by output feature map size into 5 stages, each of which outputs a feature map of a different size.
Stage0 has a convolution layer and a pooling layer. The convolution kernel size is 7x7, the number of output channels is 64 and the stride is 2. Pooling uses maxpooling with a 3x3 window and a stride of 2.
Stage1 consists of 3 blocks with 256 output channels.
Stage2 consists of 4 blocks with 512 output channels.
Stage3 consists of 5 blocks with 1024 output channels.
Stage4 consists of 6 blocks with 2048 output channels.
As shown in Fig. 4, each block first uses a 1x1 convolution to reduce the dimension, then applies a 3x3 convolution, and finally uses a 1x1 convolution to restore the dimension; the result is obtained by an elementwise-add between this output and the block input.
Finally, a 128-dimensional fully connected layer follows for information integration.
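To make the stage layout above concrete, the following sketch tabulates the feature map shape after each stage for a 160x160 input. The block counts and channel numbers are taken from the text as written; the paddings (3 for the 7x7 convolution, 1 for the pooling) and the per-stage strides (1, 2, 2, 2) are assumptions borrowed from the usual ResNet layout and are not stated in the text.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# (name, number of blocks, output channels, assumed stride of the stage)
STAGES = [
    ("stage1", 3, 256, 1),
    ("stage2", 4, 512, 2),
    ("stage3", 5, 1024, 2),
    ("stage4", 6, 2048, 2),
]

def feature_shapes(input_size=160):
    """Return {stage name: (channels, height, width)} for a square input."""
    size = conv_out(input_size, kernel=7, stride=2, pad=3)  # 7x7 conv, stride 2
    size = conv_out(size, kernel=3, stride=2, pad=1)        # 3x3 maxpooling, stride 2
    shapes = {"stage0": (64, size, size)}
    for name, _blocks, channels, stride in STAGES:
        # A stride-2 stage halves the resolution; ceil division covers odd sizes.
        size = (size + stride - 1) // stride
        shapes[name] = (channels, size, size)
    return shapes
```

Under these assumptions the five stages produce feature maps with different channel counts and resolutions, which is why the branch path needs a channel-matching step before the stage features can be added.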
Branch path: the feature maps produced by stage0, 1, 2, 3 and 4 are fed into separate STN modules to obtain their respective features:
the output of Stage0 after its STN is L0;
the output of Stage1 after its STN is L1;
the output of Stage2 after its STN is L2;
the output of Stage3 after its STN is L3;
the output of Stage4 after its STN is L4.
As shown in Fig. 5, every feature except the first undergoes one convolution with a 1x1 kernel and a stride of 1, whose number of output channels equals the number of channels of the preceding feature. The features are then fused in sequence by elementwise-add, so the purpose of the convolution is to change the feature dimension so that the features can be added. The addition is performed as follows:
L0+f(L1)+f(L2)+f(L3)+f(L4)
where "+" is the elementwise-add operation and f(·) is the convolution operation.
This yields the main path output and the branch path output. Finally, the two outputs are concatenated to obtain the final feature, which is used directly to compute the loss function and as the feature for face recognition.
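The branch fusion and the final splicing can be sketched as follows, purely as an illustration. The sketch assumes that all STN outputs share one spatial size, that every f(·) projects to the channel count of L0, that bias terms are omitted, and that the branch feature map is flattened before being joined to the main path vector; none of these details is fixed by the text, and the function names are hypothetical.

```python
def conv1x1(feat, weight):
    """1x1 convolution with stride 1: a per-pixel linear map over channels.

    feat is indexed [C_in][H][W]; weight is indexed [C_out][C_in].
    """
    h, w = len(feat[0]), len(feat[0][0])
    return [
        [[sum(weight[o][c] * feat[c][y][x] for c in range(len(feat)))
          for x in range(w)]
         for y in range(h)]
        for o in range(len(weight))
    ]

def elementwise_add(a, b):
    """Add two feature maps of identical shape."""
    return [
        [[va + vb for va, vb in zip(row_a, row_b)]
         for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]

def fuse_branch(features, weights):
    """Compute L0 + f(L1) + f(L2) + ... with one 1x1 convolution
    per feature after the first."""
    fused = features[0]
    for feat, weight in zip(features[1:], weights):
        fused = elementwise_add(fused, conv1x1(feat, weight))
    return fused

def concat_features(main_vec, branch_map):
    """Feature splicing: main path vector followed by the flattened branch map."""
    flat = [v for ch in branch_map for row in ch for v in row]
    return list(main_vec) + flat
```

The 1x1 convolution leaves the spatial layout untouched and changes only the channel dimension, which is what allows features with different channel counts to be added element by element.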
Step 5: Training the attention mechanism neural network
In this embodiment, when the Softmax classification model is built, the feature output x is fed into a K-way Softmax layer (implemented with an un-activated fully connected layer) to compute the posterior probabilities of the sample over the different classes, where K is the number of classes. The Softmax layer has two parameters, W and b, so the k-th output $z_k$ can be written as:
$$z_k = W_k^{T} x + b_k$$
Because the output of the fully connected layer can take arbitrary values, the Softmax layer output must be transformed to obtain normalized probabilities of the sample over the different classes. The resulting posterior probability of the k-th class is:
$$P(y = k \mid x) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}$$
In this embodiment, to maximise the probability of the class to which each sample belongs, the Softmax Loss can be defined as:
$$L_{softmax}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log P\left(y^{(i)} \mid x^{(i)}; \theta\right)$$
where θ denotes the model parameters and $y^{(i)}$ denotes the class to which sample $x^{(i)}$ belongs.
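The Softmax layer, the posterior probabilities and the Softmax Loss can be illustrated with the short sketch below. Averaging the loss over the batch and subtracting the maximum logit for numerical stability are choices made for this sketch; the function names are illustrative.

```python
import math

def logits(x, W, b):
    """z_k = W_k . x + b_k for each of the K classes
    (an un-activated fully connected layer)."""
    return [sum(wi * xi for wi, xi in zip(Wk, x)) + bk for Wk, bk in zip(W, b)]

def softmax(z):
    """Turn arbitrary logits into probabilities that sum to 1.
    Subtracting max(z) avoids overflow without changing the result."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_loss(batch_x, batch_y, W, b):
    """Mean negative log-probability of the true class over the batch."""
    loss = 0.0
    for x, y in zip(batch_x, batch_y):
        loss -= math.log(softmax(logits(x, W, b))[y])
    return loss / len(batch_x)

def predict(x, W, b):
    """Index of the class with the highest posterior probability."""
    probs = softmax(logits(x, W, b))
    return max(range(len(probs)), key=probs.__getitem__)
```

Minimising `softmax_loss` is equivalent to maximising the probability assigned to each sample's own class, and `predict` corresponds to the softmax regression classification used at recognition time.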
In this embodiment, the optimizer is Adam, the weight decay is 5e-5 and the batch size is 128. Dropout is applied to the output of the average pooling layer with a keep probability of 0.8. The learning rate schedule is as follows: the training set is first trained for 3 epochs with a learning rate of 0.1, then for 2 epochs at 0.01, and then for 2 epochs at 0.001, for 7 epochs in total. After each epoch the classification model is validated on LFW, and the trained classification model is finally saved.
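The learning rate schedule of this embodiment is piecewise constant and can be written down directly. In the sketch below the epoch index is assumed to start at 0, and the dictionary keys in `TRAIN_CONFIG` are illustrative names rather than the options of any particular framework.

```python
# Hyperparameters quoted in the text, gathered under illustrative key names.
TRAIN_CONFIG = {
    "optimizer": "Adam",
    "weight_decay": 5e-5,
    "batch_size": 128,
    "dropout_keep_prob": 0.8,
    "epochs": 7,
}

# (first epoch index after the phase ends, learning rate during the phase)
LR_PHASES = [(3, 0.1), (5, 0.01), (7, 0.001)]

def learning_rate(epoch):
    """Learning rate for a 0-based epoch index:
    3 epochs at 0.1, then 2 at 0.01, then 2 at 0.001."""
    for end, rate in LR_PHASES:
        if epoch < end:
            return rate
    raise ValueError("training is configured for 7 epochs")
```

Calling `learning_rate(e)` for `e` in `range(TRAIN_CONFIG["epochs"])` yields the seven per-epoch rates in order.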
Step 6: Learning high-level and abstract features of the image
Image features are extracted by feeding the test sample into the trained attention mechanism network, yielding high-quality image features.
Step 7: Face recognition
The extracted image features are classified with the softmax regression method, completing recognition of the test sample.
The above embodiment is a preferred embodiment of the present invention, but embodiments of the invention are not limited to it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the invention shall be regarded as an equivalent replacement and falls within the protection scope of the invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811396296.8A | 2018-11-22 | 2018-11-22 | A face recognition method with attention mechanism |
| Publication Number | Publication Date |
|---|---|
| CN109543606A (en) | 2019-03-29 |
| CN109543606B (en) | 2022-09-27 |
| CN111967427A (en)* | 2020-08-28 | 2020-11-20 | 广东工业大学 | Fake face video identification method, system and readable storage medium |
| CN111985323A (en)* | 2020-07-14 | 2020-11-24 | 珠海市卓轩科技有限公司 | Face recognition method and system based on deep convolutional neural network |
| CN112101266A (en)* | 2020-09-25 | 2020-12-18 | 重庆电政信息科技有限公司 | Multi-ARM-based distributed inference method for action recognition model |
| CN112163462A (en)* | 2020-09-08 | 2021-01-01 | 北京数美时代科技有限公司 | Face-based juvenile recognition method and device and computer equipment |
| CN112365717A (en)* | 2020-10-10 | 2021-02-12 | 新疆爱华盈通信息技术有限公司 | Vehicle information acquisition method and system |
| CN112464912A (en)* | 2020-12-22 | 2021-03-09 | 杭州电子科技大学 | Robot-end face detection method based on YOLO-RGGNet |
| CN112507995A (en)* | 2021-02-05 | 2021-03-16 | 成都东方天呈智能科技有限公司 | Cross-model face feature vector conversion system and method |
| CN112560756A (en)* | 2020-12-24 | 2021-03-26 | 北京嘀嘀无限科技发展有限公司 | Method, device, electronic equipment and storage medium for recognizing human face |
| CN112597888A (en)* | 2020-12-22 | 2021-04-02 | 西北工业大学 | On-line education scene student attention recognition method aiming at CPU operation optimization |
| CN112699847A (en)* | 2021-01-15 | 2021-04-23 | 苏州大学 | Face characteristic point detection method based on deep learning |
| CN112766422A (en)* | 2021-03-15 | 2021-05-07 | 山东大学 | Privacy protection method based on lightweight face recognition model |
| CN112766158A (en)* | 2021-01-20 | 2021-05-07 | 重庆邮电大学 | Multi-task cascaded face occlusion expression recognition method |
| WO2021088640A1 (en)* | 2019-11-06 | 2021-05-14 | 重庆邮电大学 | Facial recognition technology based on heuristic gaussian cloud transformation |
| CN113034457A (en)* | 2021-03-18 | 2021-06-25 | 广州市索图智能电子有限公司 | Face detection device based on FPGA |
| CN113239866A (en)* | 2021-05-31 | 2021-08-10 | 西安电子科技大学 | Face recognition method and system based on space-time feature fusion and sample attention enhancement |
| CN113408549A (en)* | 2021-07-14 | 2021-09-17 | 西安电子科技大学 | Few-sample weak and small target detection method based on template matching and attention mechanism |
| CN113822203A (en)* | 2021-09-26 | 2021-12-21 | 中国民用航空飞行学院 | Face recognition device and method based on reinforcement learning and deep convolutional neural network |
| CN113971745A (en)* | 2021-09-27 | 2022-01-25 | 哈尔滨工业大学 | Exit-entry verification stamp identification method and device based on deep neural network |
| CN113989906A (en)* | 2021-11-26 | 2022-01-28 | 江苏科技大学 | A face recognition method |
| CN114418838A (en)* | 2021-12-03 | 2022-04-29 | 浙江大华技术股份有限公司 | Sample image processing method and related device |
| CN114677672A (en)* | 2022-03-27 | 2022-06-28 | 河南科技大学 | A method for identifying ripe blueberry fruit based on deep learning neural network |
| CN114943251A (en)* | 2022-05-20 | 2022-08-26 | 电子科技大学 | Unmanned aerial vehicle target identification method based on fusion attention mechanism |
| CN114943845A (en)* | 2022-05-23 | 2022-08-26 | 天津城建大学 | Domain picture fine-grained classification and identification method and system |
| CN115205177A (en)* | 2022-06-22 | 2022-10-18 | 京东方科技集团股份有限公司 | Image acquisition method, apparatus, device, and non-transitory computer storage medium |
| CN115713795A (en)* | 2022-11-04 | 2023-02-24 | 四川轻化工大学 | Human face typical region classification method based on attention mechanism |
| CN115993365A (en)* | 2023-03-23 | 2023-04-21 | 山东省科学院激光研究所 | A belt defect detection method and system based on deep learning |
| CN116309703A (en)* | 2023-02-14 | 2023-06-23 | 天津市口腔医院(天津市整形外科医院、南开大学口腔医院) | A Method for Computer-Aided Analysis of Mandibular Range of Motion |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180018539A1 (en)* | 2016-07-12 | 2018-01-18 | Beihang University | Ranking convolutional neural network constructing method and image processing method and apparatus thereof |
| CN108009493A (en)* | 2017-11-30 | 2018-05-08 | 电子科技大学 | Face anti-spoofing recognition method based on action enhancement |
| CN108416314A (en)* | 2018-03-16 | 2018-08-17 | 中山大学 | Method for detecting important faces in a picture |
| CN108537135A (en)* | 2018-03-16 | 2018-09-14 | 北京市商汤科技开发有限公司 | Object recognition and object recognition network training method and device, and electronic equipment |
| CN108564029A (en)* | 2018-04-12 | 2018-09-21 | 厦门大学 | Face attribute recognition method based on cascaded multi-task learning deep neural network |
| CN108805089A (en)* | 2018-06-14 | 2018-11-13 | 南京云思创智信息科技有限公司 | Multi-modal emotion recognition method |
| Title |
|---|
| Zheng Weishi et al.: "Asymmetric person re-identification: persistent cross-camera pedestrian tracking", 《中国科学》 (Scientia Sinica)* |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110135243A (en)* | 2019-04-02 | 2019-08-16 | 上海交通大学 | A pedestrian detection method and system based on two-level attention mechanism |
| CN110135251A (en)* | 2019-04-09 | 2019-08-16 | 上海电力学院 | A Group Image Emotion Recognition Method Based on Attention Mechanism and Hybrid Network |
| CN110135251B (en)* | 2019-04-09 | 2023-08-08 | 上海电力学院 | Group image emotion recognition method based on attention mechanism and hybrid network |
| CN111652020A (en)* | 2019-04-16 | 2020-09-11 | 上海铼锶信息技术有限公司 | Method for identifying rotation angle of human face around Z axis |
| CN111652020B (en)* | 2019-04-16 | 2023-07-11 | 上海铼锶信息技术有限公司 | Face rotation angle identification method around Z axis |
| CN110110642A (en)* | 2019-04-29 | 2019-08-09 | 华南理工大学 | A pedestrian re-identification method based on multi-channel attention features |
| CN110334588A (en)* | 2019-05-23 | 2019-10-15 | 北京邮电大学 | Kinship recognition method and device based on a local feature attention network |
| CN110781760A (en)* | 2019-05-24 | 2020-02-11 | 西安电子科技大学 | A method and device for facial expression recognition based on spatial attention |
| CN110188730A (en)* | 2019-06-06 | 2019-08-30 | 山东大学 | Face detection and alignment method based on MTCNN |
| CN110188730B (en)* | 2019-06-06 | 2022-12-23 | 山东大学 | MTCNN-based face detection and alignment method |
| CN110287846A (en)* | 2019-06-19 | 2019-09-27 | 南京云智控产业技术研究院有限公司 | A face key point detection method based on attention mechanism |
| CN110287846B (en)* | 2019-06-19 | 2023-08-04 | 南京云智控产业技术研究院有限公司 | A face key point detection method based on attention mechanism |
| CN110610129A (en)* | 2019-08-05 | 2019-12-24 | 华中科技大学 | A deep learning face recognition system and method based on self-attention mechanism |
| CN110598022A (en)* | 2019-08-05 | 2019-12-20 | 华中科技大学 | An Image Retrieval System and Method Based on Robust Deep Hash Network |
| CN110598022B (en)* | 2019-08-05 | 2021-11-19 | 华中科技大学 | Image retrieval system and method based on robust deep hash network |
| CN110458829B (en)* | 2019-08-13 | 2024-01-30 | 腾讯医疗健康(深圳)有限公司 | Image quality control method, device, equipment and storage medium based on artificial intelligence |
| CN110458829A (en)* | 2019-08-13 | 2019-11-15 | 腾讯医疗健康(深圳)有限公司 | Image quality control method, device, equipment and storage medium based on artificial intelligence |
| CN110569905A (en)* | 2019-09-10 | 2019-12-13 | 江苏鸿信系统集成有限公司 | Fine-grained image classification method based on generative adversarial network and attention network |
| CN110569905B (en)* | 2019-09-10 | 2023-04-14 | 中电鸿信信息科技有限公司 | Fine-grained image classification method based on generative adversarial network and attention network |
| CN110378961A (en)* | 2019-09-11 | 2019-10-25 | 图谱未来(南京)人工智能研究院有限公司 | Model optimization method, key point detection method, apparatus, and storage medium |
| CN110633689A (en)* | 2019-09-23 | 2019-12-31 | 天津天地基业科技有限公司 | Face recognition model based on semi-supervised attention network |
| CN110688938A (en)* | 2019-09-25 | 2020-01-14 | 江苏省未来网络创新研究院 | Pedestrian re-identification method integrated with attention mechanism |
| CN111339813A (en)* | 2019-09-30 | 2020-06-26 | 深圳市商汤科技有限公司 | Face attribute recognition method and device, electronic equipment and storage medium |
| CN111339813B (en)* | 2019-09-30 | 2022-09-27 | 深圳市商汤科技有限公司 | Face attribute recognition method and device, electronic equipment and storage medium |
| WO2021063056A1 (en)* | 2019-09-30 | 2021-04-08 | 深圳市商汤科技有限公司 | Facial attribute recognition method and apparatus, and electronic device and storage medium |
| CN110796072B (en)* | 2019-10-28 | 2023-04-07 | 桂林电子科技大学 | Target tracking and identity recognition method based on double-task learning |
| CN110796072A (en)* | 2019-10-28 | 2020-02-14 | 桂林电子科技大学 | Target tracking and identity recognition method based on double-task learning |
| WO2021088640A1 (en)* | 2019-11-06 | 2021-05-14 | 重庆邮电大学 | Facial recognition technology based on heuristic gaussian cloud transformation |
| CN110837840A (en)* | 2019-11-07 | 2020-02-25 | 中国石油大学(华东) | Picture feature detection method based on attention mechanism |
| CN111046781A (en)* | 2019-12-09 | 2020-04-21 | 华中科技大学 | Robust three-dimensional target detection method based on ternary attention mechanism |
| CN111046781B (en)* | 2019-12-09 | 2022-05-27 | 华中科技大学 | A Robust 3D Object Detection Method Based on Ternary Attention Mechanism |
| CN111178183A (en)* | 2019-12-16 | 2020-05-19 | 深圳市华尊科技股份有限公司 | Face detection method and related device |
| CN111242038B (en)* | 2020-01-15 | 2024-06-07 | 北京工业大学 | Dynamic tongue fibrillation detection method based on frame prediction network |
| CN111242038A (en)* | 2020-01-15 | 2020-06-05 | 北京工业大学 | Dynamic tongue tremor detection method based on frame prediction network |
| CN111325161A (en)* | 2020-02-25 | 2020-06-23 | 四川翼飞视科技有限公司 | Method for constructing human face detection neural network based on attention mechanism |
| CN111325161B (en)* | 2020-02-25 | 2023-04-18 | 四川翼飞视科技有限公司 | Method for constructing human face detection neural network based on attention mechanism |
| CN111582044A (en)* | 2020-04-15 | 2020-08-25 | 华南理工大学 | Face recognition method based on convolutional neural network and attention model |
| CN111582044B (en)* | 2020-04-15 | 2023-06-20 | 华南理工大学 | Face recognition method based on convolutional neural network and attention model |
| CN111563468A (en)* | 2020-05-13 | 2020-08-21 | 电子科技大学 | A method for detecting abnormal driver behavior based on neural network attention |
| CN111680732A (en)* | 2020-05-28 | 2020-09-18 | 浙江师范大学 | A training method for dish recognition based on deep learning attention mechanism |
| CN111738099B (en)* | 2020-05-30 | 2023-11-07 | 华南理工大学 | Automatic face detection method based on video image scene understanding |
| CN111738099A (en)* | 2020-05-30 | 2020-10-02 | 华南理工大学 | An automatic face detection method based on video image scene understanding |
| CN111950586A (en)* | 2020-07-01 | 2020-11-17 | 银江股份有限公司 | A Target Detection Method Introducing Bidirectional Attention |
| CN111950586B (en)* | 2020-07-01 | 2024-01-19 | 银江技术股份有限公司 | Target detection method for introducing bidirectional attention |
| CN111783681A (en)* | 2020-07-02 | 2020-10-16 | 深圳市万睿智能科技有限公司 | Large-scale face library recognition method, system, computer equipment and storage medium |
| CN111985323A (en)* | 2020-07-14 | 2020-11-24 | 珠海市卓轩科技有限公司 | Face recognition method and system based on deep convolutional neural network |
| CN111860393A (en)* | 2020-07-28 | 2020-10-30 | 浙江工业大学 | A face detection and recognition method on a security system |
| CN111967427A (en)* | 2020-08-28 | 2020-11-20 | 广东工业大学 | Fake face video identification method, system and readable storage medium |
| CN112163462A (en)* | 2020-09-08 | 2021-01-01 | 北京数美时代科技有限公司 | Face-based juvenile recognition method and device and computer equipment |
| CN112101266A (en)* | 2020-09-25 | 2020-12-18 | 重庆电政信息科技有限公司 | Multi-ARM-based distributed inference method for action recognition model |
| CN112365717A (en)* | 2020-10-10 | 2021-02-12 | 新疆爱华盈通信息技术有限公司 | Vehicle information acquisition method and system |
| CN112464912A (en)* | 2020-12-22 | 2021-03-09 | 杭州电子科技大学 | Robot-end face detection method based on YOLO-RGGNet |
| CN112464912B (en)* | 2020-12-22 | 2024-02-09 | 杭州电子科技大学 | Robot end face detection method based on YOLO-RGGNet |
| CN112597888B (en)* | 2020-12-22 | 2024-03-08 | 西北工业大学 | Online education scene student attention recognition method aiming at CPU operation optimization |
| CN112597888A (en)* | 2020-12-22 | 2021-04-02 | 西北工业大学 | On-line education scene student attention recognition method aiming at CPU operation optimization |
| CN112560756A (en)* | 2020-12-24 | 2021-03-26 | 北京嘀嘀无限科技发展有限公司 | Method, device, electronic equipment and storage medium for recognizing human face |
| CN112699847A (en)* | 2021-01-15 | 2021-04-23 | 苏州大学 | Face characteristic point detection method based on deep learning |
| CN112766158A (en)* | 2021-01-20 | 2021-05-07 | 重庆邮电大学 | Multi-task cascaded face occlusion expression recognition method |
| CN112766158B (en)* | 2021-01-20 | 2022-06-03 | 重庆邮电大学 | Face occlusion expression recognition method based on multi-task cascade |
| CN112507995A (en)* | 2021-02-05 | 2021-03-16 | 成都东方天呈智能科技有限公司 | Cross-model face feature vector conversion system and method |
| CN112507995B (en)* | 2021-02-05 | 2021-06-01 | 成都东方天呈智能科技有限公司 | A cross-model face feature vector conversion system and method |
| CN112766422A (en)* | 2021-03-15 | 2021-05-07 | 山东大学 | Privacy protection method based on lightweight face recognition model |
| CN113034457A (en)* | 2021-03-18 | 2021-06-25 | 广州市索图智能电子有限公司 | Face detection device based on FPGA |
| CN113239866B (en)* | 2021-05-31 | 2022-12-13 | 西安电子科技大学 | Face recognition method and system based on space-time feature fusion and sample attention enhancement |
| CN113239866A (en)* | 2021-05-31 | 2021-08-10 | 西安电子科技大学 | Face recognition method and system based on space-time feature fusion and sample attention enhancement |
| CN113408549B (en)* | 2021-07-14 | 2023-01-24 | 西安电子科技大学 | Few-sample Weak Object Detection Method Based on Template Matching and Attention Mechanism |
| CN113408549A (en)* | 2021-07-14 | 2021-09-17 | 西安电子科技大学 | Few-sample weak and small target detection method based on template matching and attention mechanism |
| CN113822203A (en)* | 2021-09-26 | 2021-12-21 | 中国民用航空飞行学院 | Face recognition device and method based on reinforcement learning and deep convolutional neural network |
| CN113971745A (en)* | 2021-09-27 | 2022-01-25 | 哈尔滨工业大学 | Exit-entry verification stamp identification method and device based on deep neural network |
| CN113971745B (en)* | 2021-09-27 | 2024-04-16 | 哈尔滨工业大学 | A method and device for identifying entry-exit verification stamp based on deep neural network |
| CN113989906A (en)* | 2021-11-26 | 2022-01-28 | 江苏科技大学 | A face recognition method |
| CN113989906B (en)* | 2021-11-26 | 2024-11-15 | 江苏科技大学 | A face recognition method |
| CN114418838A (en)* | 2021-12-03 | 2022-04-29 | 浙江大华技术股份有限公司 | Sample image processing method and related device |
| CN114677672A (en)* | 2022-03-27 | 2022-06-28 | 河南科技大学 | A method for identifying ripe blueberry fruit based on deep learning neural network |
| CN114943251B (en)* | 2022-05-20 | 2023-05-02 | 电子科技大学 | Unmanned aerial vehicle target recognition method based on fusion attention mechanism |
| CN114943251A (en)* | 2022-05-20 | 2022-08-26 | 电子科技大学 | Unmanned aerial vehicle target identification method based on fusion attention mechanism |
| CN114943845A (en)* | 2022-05-23 | 2022-08-26 | 天津城建大学 | Domain picture fine-grained classification and identification method and system |
| CN115205177A (en)* | 2022-06-22 | 2022-10-18 | 京东方科技集团股份有限公司 | Image acquisition method, apparatus, device, and non-transitory computer storage medium |
| CN115205177B (en)* | 2022-06-22 | 2025-09-26 | 京东方科技集团股份有限公司 | Image acquisition method, device, apparatus, and non-transitory computer storage medium |
| CN115713795A (en)* | 2022-11-04 | 2023-02-24 | 四川轻化工大学 | Human face typical region classification method based on attention mechanism |
| CN116309703A (en)* | 2023-02-14 | 2023-06-23 | 天津市口腔医院(天津市整形外科医院、南开大学口腔医院) | A Method for Computer-Aided Analysis of Mandibular Range of Motion |
| CN115993365A (en)* | 2023-03-23 | 2023-04-21 | 山东省科学院激光研究所 | A belt defect detection method and system based on deep learning |
| Publication number | Publication date |
|---|---|
| CN109543606B (en) | 2022-09-27 |
| Publication | Publication Date | Title |
|---|---|---|
| CN109543606B (en) | | A face recognition method with attention mechanism |
| Yuliang et al. | | Detecting curve text in the wild: New dataset and new solution |
| Tao et al. | | Smoke detection based on deep convolutional neural networks |
| CN110298266A (en) | | Deep neural network object detection method based on multi-scale receptive field feature fusion |
| US20180018503A1 (en) | | Method, terminal, and storage medium for tracking facial critical area |
| CN112766186B (en) | | Real-time face detection and head posture estimation method based on multitask learning |
| CN107808376B (en) | | A Deep Learning-Based Hand Raised Detection Method |
| CN113888461B (en) | | Small hardware defect detection method, system and equipment based on deep learning |
| CN103886325B (en) | | Cyclic matrix video tracking method with partition |
| CN115797970B (en) | | Dense pedestrian target detection method and system based on YOLOv5 model |
| CN104794449B (en) | | Gait energy diagram based on human body HOG features obtains and personal identification method |
| CN109948457B (en) | | Real-time object recognition method based on convolutional neural network and CUDA acceleration |
| CN111310609B (en) | | Video target detection method based on time sequence information and local feature similarity |
| CN108121972A (en) | | A target recognition method under partial occlusion conditions |
| CN116912670A (en) | | Deep sea fish identification method based on improved YOLO model |
| Xie et al. | | Research on MTCNN face recognition system in low computing power scenarios |
| CN110334622A (en) | | Pedestrian Retrieval Method Based on Adaptive Feature Pyramid |
| CN110728214A (en) | | A Scale Matching-Based Approach for Weak and Small Person Object Detection |
| Zhang et al. | | A robust chinese license plate detection and recognition system in natural scenes |
| Tu et al. | | Improved pedestrian detection algorithm based on HOG and SVM |
| Eldho et al. | | YOLO based Logo detection |
| CN110046650B (en) | | Express package bar code rapid detection method |
| CN118196396A (en) | | Underwater target detection method based on deep learning |
| Li et al. | | CDMY: A lightweight object detection model based on coordinate attention |
| Zhan et al. | | Scale-equivariant steerable networks for crowd counting |
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |