

Technical Field
The present invention relates to automatic face age estimation, and in particular to an automatic face age estimation method based on weak supervision.
Background
Automatic face age estimation means taking a picture and automatically determining the age of the person in it. The technology has many applications, such as video surveillance, product recommendation, human-computer interaction, market analysis, user profiling, and prediction of age-related changes. In recent years, with the development of deep learning and convolutional neural networks in particular, the performance of face age estimation algorithms has improved rapidly. Estimating age algorithmically nevertheless remains difficult: facial age is related to skull shape, the position of facial features, wrinkles, and so on, and is affected by illumination, pose, and expression. Even human observers estimate age with large deviations, so manually labelled ages are themselves imprecise. Most current machine-learning age estimation methods rely on manually labelled exact ages and let the algorithm learn the correspondence between images and the labelled ages. Because the labels are imprecise, the learned models carry errors. Manual labelling is also very time-consuming, and in many cases face data with accurately labelled ages simply cannot be obtained.
Summary of the Invention
The purpose of the present invention is to provide a face age estimation method based on weak supervision. Here, weak supervision (or weak labels) means that when a person annotates an image with an age, no exact age is required; only a coarse age group such as child, teenager, youth, prime-aged, or elderly is marked. This, first, reduces the time needed for manual age annotation and, second, reduces the algorithm's dependence on the accuracy of the manual labels.
The specific technical scheme of the present invention is an automatic face age estimation method based on weak supervision, comprising the following steps:
Step 1. Preprocessing: uniformly scale and crop the face image to a set size, and normalize the image using the image's own mean and variance.
Step 2. Convolutional neural network: any convolutional neural network model may be used. Taking the classic residual network as an example, it contains several convolutional layers and a final fully connected classification layer, with the layers connected by identity (direct) mappings.
Step 3. Extract the feature vector and category: the final output of the network is the class score of the image, and the input to the last fully connected classification layer is taken as the feature vector of the image.
Step 4. Compute the center point of the feature vectors of each category: take all images in the training set, group them by category, and compute the center point of the set of feature vectors in each group.
Step 5. Compute the distance between each image and the center point of each category: for an image to be tested, extract its feature vector and compute the distance between that feature vector and the center point of each category using the following formula:
d_{j,i} = (f_j - C_i) · (f_j - C_i)
where f_j denotes the feature vector of the j-th test image, C_i denotes the center point of the i-th category, and d_{j,i} denotes the distance between the test image and C_i;
Step 6. Estimate the age from the distance between the image and each category center: for an image p to be tested, first compute the distance between the center points of each pair of categories using the following formula:
C_{ij} = (C_i - C_j) · (C_i - C_j)
where C_i and C_j denote the center points of the i-th and j-th categories, and C_{ij} denotes the distance between the two category centers;
The age estimation formula is as follows:
y_p = Y_{c_i} + d_{p,i} × (C_{i,i+1} > d_{p,i}) × (G_i × 0.5)
where G_i denotes the span of the i-th age group; Y_{c_i} denotes the center age of the i-th age group, the center age being the average of the preset span (for example, the preset span of the children group is 0-12 years, so the center age is 6); d_{p,i} denotes the distance between the predicted image p and its predicted category i; C_{i,i+1} denotes the distance between category i and the adjacent category; and y_p is the final estimated age.
Further, in step 1 above, scaling and cropping to the set size means scaling the image to 124x124 and then randomly cropping it to 112x112.
Further, the above convolutional neural network model is a residual network, which contains several convolutional layers and a final fully connected classification layer, with the layers connected by identity (direct) mappings.
Commonly used age estimation algorithms suffer from two general problems: first, they rely heavily on accurate manual annotation, which costs a great deal of time and effort; second, the models are computationally complex and hard to deploy in practice. The age estimation algorithm based on imprecise labels proposed by the present invention, i.e. the weakly supervised automatic face age estimation method, can automatically train a strong-classification (specific age) model from weak labels (age groups). The trained model estimates a specific age for an unlabelled image, which reduces the reliance on accurate manual labelling. Moreover, the computation needed to go from the weak classification (age group) to the strong classification (specific age) is very simple; compared with the neural network itself, the additional cost is negligible.
In an evaluation using the mean absolute error (MAE) as the criterion, with the same network (e.g. res18) and the IMDB-WIKI dataset as an example, the error is 3.45. This result is comparable to methods that require accurate labels, and achieving the same accuracy without accurate labels demonstrates the effectiveness of the method.
Description of the Drawings
FIG. 1 is a flowchart of the weakly supervised face age estimation method of the present invention.
FIG. 2 is a schematic diagram of extracting the feature vector and category in Step 3.
Detailed Description of the Embodiments
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
A face age estimation method based on weak supervision, as shown in FIG. 1, comprises the following steps:
Step 1. Data preprocessing: use MTCNN or any existing face detection and alignment method to crop the face region and align the facial landmarks. The image data are then preprocessed before training the model: first, the size is uniformly resized to 144x144; the image is then randomly cropped to 122x122, randomly rotated (for example within ±20°), and its pixel brightness randomly adjusted; finally, the image is normalized by computing its own mean and variance and, for every pixel, subtracting the mean and dividing by the variance.
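As a concrete illustration of this preprocessing step, the following Python sketch uses PIL and NumPy. The function name, the 0.8-1.2 brightness-jitter range, and the use of the standard deviation as the conventional reading of "divide by the variance" are assumptions made for illustration only.

```python
import random

import numpy as np
from PIL import Image

def preprocess(path, train=True):
    """Step 1 preprocessing: resize to 144x144, random 122x122 crop,
    random rotation and brightness jitter, then per-image normalization."""
    img = Image.open(path).convert("RGB").resize((144, 144))
    if train:
        img = img.rotate(random.uniform(-20.0, 20.0))   # random rotation within +/-20 degrees
        arr = np.asarray(img, dtype=np.float32)
        arr = arr * random.uniform(0.8, 1.2)            # brightness jitter (range is an assumption)
        x = random.randint(0, 144 - 122)
        y = random.randint(0, 144 - 122)
        arr = arr[y:y + 122, x:x + 122, :]              # random 122x122 crop
    else:
        arr = np.asarray(img, dtype=np.float32)
        off = (144 - 122) // 2
        arr = arr[off:off + 122, off:off + 122, :]      # center crop at test time
    # per-image normalization: subtract the image's own mean and divide by its spread
    return (arr - arr.mean()) / (arr.std() + 1e-8)
```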
Step 2. Convolutional neural network: the proposed algorithm does not depend on one fixed network structure. To describe the implementation in more detail, the residual network resnet18 is taken as an example; its structure is shown in FIG. 2. It contains 17 convolutional layers and one fully connected layer. Except for the first layer, all convolutional layers use 3x3 kernels. To let gradients propagate properly, the convolutional layers are connected by identity (direct) mappings in addition to the ordinary mappings. The final fully connected layer outputs the class scores.
Step 3. Extract the feature vector and category: as shown in FIG. 2, the features immediately before the fully connected layer fc1000 are taken as the feature vector, and the class with the maximum predicted score is taken as the corresponding category. In this embodiment, the dimension of the feature vector is 256.
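A minimal sketch of Steps 2-3, assuming PyTorch and torchvision for the resnet18 backbone. The 512-to-256 projection layer is an assumption added so that the classifier input matches the 256-dimensional feature vector described in this embodiment; class and variable names are illustrative.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # child, teenager, youth, prime-aged, elderly

class AgeGroupNet(nn.Module):
    """resnet18 backbone, a 256-d feature layer, and a 5-way age-group classifier."""

    def __init__(self, num_classes=NUM_CLASSES, feat_dim=256):
        super().__init__()
        backbone = models.resnet18(weights=None)                          # torchvision >= 0.13
        self.features = nn.Sequential(*list(backbone.children())[:-1])    # conv layers + global pooling
        self.project = nn.Linear(backbone.fc.in_features, feat_dim)       # 512 -> 256 (assumed projection)
        self.classifier = nn.Linear(feat_dim, num_classes)                # outputs the class scores

    def forward(self, x):
        f = self.features(x).flatten(1)   # pooled backbone features
        f = self.project(f)               # 256-d feature vector, the input of the classification layer
        return f, self.classifier(f)

model = AgeGroupNet().eval()
x = torch.randn(1, 3, 122, 122)             # one preprocessed face image
with torch.no_grad():
    feat, scores = model(x)
pred_class = scores.argmax(dim=1).item()    # category with the maximum score
```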
Step 4. Compute the center point of the feature vectors of each category: take all training images, group them by category, and for each category compute the mean of the feature vectors extracted from all images in that category; this mean is the center point of the category.
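A NumPy sketch of this center-point computation; the names are illustrative, and it is assumed that features holds the 256-dimensional vectors extracted by the network for all training images and labels holds their weak age-group labels.

```python
import numpy as np

def class_centers(features, labels, num_classes=5):
    """Step 4: the center of each category is the mean of the feature vectors
    extracted from all training images of that category."""
    features = np.asarray(features)   # shape (N, 256)
    labels = np.asarray(labels)       # weak labels in {0, ..., num_classes - 1}
    return np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])
```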
Step 5. Compute the distance between the image to be predicted and the class centers: the image to be predicted is preprocessed, fed into the network, and its feature vector and predicted category are extracted. The distance between the feature vector and each class center is then computed as follows:
d_{j,i} = (f_j - C_i) · (f_j - C_i)
where f_j denotes the feature vector of the j-th test image, C_i denotes the center point of the i-th category, and d_{j,i} denotes the distance between the test image and C_i.
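Under the same assumptions, this distance (the squared Euclidean distance to every class center) can be computed as in the following sketch:

```python
import numpy as np

def distances_to_centers(f_j, centers):
    """Step 5: d_{j,i} = (f_j - C_i) . (f_j - C_i), the squared Euclidean
    distance between the feature vector and every class center."""
    diff = np.asarray(centers) - np.asarray(f_j)   # shape (num_classes, 256)
    return np.sum(diff * diff, axis=1)             # one distance per category
```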
Step 6. Estimate the age from the distance between the image and each category center: after extracting the feature vector and the predicted category of the image to be predicted, the estimate starts from the center age Y_{c_i} of the predicted category's age group and uses the distance d_{p,i} between the feature vector and the corresponding class center. The larger d_{p,i} is, the larger the age offset; the smaller d_{p,i} is, the smaller the offset. The direction of the offset is determined by d_{p,i} and the distance C_{i,i+1} between the predicted category's center and the adjacent category's center: if d_{p,i} is greater than C_{i,i+1}, the direction is negative; otherwise it is positive.
For an image p to be tested, first compute the distance between the center points of each pair of classes:
C_{ij} = (C_i - C_j) · (C_i - C_j)
where C_i and C_j denote the center points of the i-th and j-th categories, and C_{ij} denotes the distance between the two category centers;
The final age estimation formula is as follows:
y_p = Y_{c_i} + d_{p,i} × (C_{i,i+1} > d_{p,i}) × (G_i × 0.5)
where G_i denotes the span of the i-th age group; Y_{c_i} denotes the center age of the i-th age group, the center age being the average of the preset span (for example, the preset span of the children group is 0-12 years, so the center age is 6); d_{p,i} denotes the distance between the predicted image p and its predicted category i; C_{i,i+1} denotes the distance between category i and the adjacent category; and y_p is the final estimated age.
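The sketch below is one possible reading of this formula, using the age ranges given in the labelling example further below. The comparison term (C_{i,i+1} > d_{p,i}) is interpreted as a ±1 sign so that the result matches the offset-direction prose of Step 6, and the fallback choice of the adjacent category for the last class is an assumption; all names are illustrative.

```python
import numpy as np

# Age-group ranges of the embodiment: (lower, upper) in years
RANGES = [(0, 12), (13, 19), (20, 30), (31, 49), (50, 100)]
CENTERS = [(lo + hi) / 2.0 for lo, hi in RANGES]   # Y_{c_i}: 6, 16, 25, 40, 75
SPANS = [float(hi - lo) for lo, hi in RANGES]      # G_i, taken as the width of each range

def estimate_age(f_p, class_centers_arr, pred_class):
    """Step 6: y_p = Y_{c_i} + d_{p,i} * sign * (G_i * 0.5), with the sign following the
    prose of Step 6 (negative when d_{p,i} exceeds C_{i,i+1}, positive otherwise)."""
    i = pred_class
    diff = np.asarray(f_p) - class_centers_arr[i]
    d_p_i = float(np.dot(diff, diff))                        # distance to the predicted class center
    j = i + 1 if i + 1 < len(class_centers_arr) else i - 1   # adjacent category (fallback for the last class)
    cdiff = class_centers_arr[i] - class_centers_arr[j]
    c_i_next = float(np.dot(cdiff, cdiff))                   # distance between the two class centers
    sign = 1.0 if c_i_next > d_p_i else -1.0
    return CENTERS[i] + d_p_i * sign * (SPANS[i] * 0.5)
```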
In the data labelling stage, before training and estimation, the data are annotated as follows:
Take labelling into the five age groups child, teenager, youth, prime-aged, and elderly as an example, corresponding roughly to the age ranges 0-12, 13-19, 20-30, 31-49, and 50-100 years. These five age groups are the five categories, and their center ages are 6, 16, 25, 40, and 75 respectively. In the implementation of Step 6 above, the calculation uses the center ages and age ranges of these five categories.
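For reference, a small helper that maps a coarse human age judgement to one of these five weak labels might look like the following sketch; the names and the clamping of ages above 100 to the last group are assumptions.

```python
GROUP_NAMES = ["child", "teenager", "youth", "prime-aged", "elderly"]
GROUP_RANGES = [(0, 12), (13, 19), (20, 30), (31, 49), (50, 100)]

def weak_label(rough_age):
    """Map a rough (approximate) age judgement to one of the five weak labels (indices 0-4)."""
    for idx, (lo, hi) in enumerate(GROUP_RANGES):
        if lo <= rough_age <= hi:
            return idx
    return len(GROUP_RANGES) - 1   # ages above 100 fall into the last group (an assumption)
```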
In an evaluation using the mean absolute error (MAE) as the criterion, with the same network (e.g. res18) and the IMDB-WIKI dataset as an example, the error is 3.45. This result is comparable to methods that require accurate labels, and achieving the same accuracy without accurate labels demonstrates the effectiveness of the method.
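The MAE reported above is the standard mean absolute error between estimated and ground-truth ages; a minimal sketch of its computation:

```python
import numpy as np

def mean_absolute_error(pred_ages, true_ages):
    """Mean absolute error (MAE) between estimated and ground-truth ages, in years."""
    pred_ages = np.asarray(pred_ages, dtype=np.float64)
    true_ages = np.asarray(true_ages, dtype=np.float64)
    return float(np.mean(np.abs(pred_ages - true_ages)))
```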