CN105005768B - Dynamic percentage sample cuts AdaBoost method for detecting human face - Google Patents

Publication number
CN105005768B
Authority
CN
China
Prior art keywords
sample
training
samples
error rate
weak classifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510391152.3A
Other languages
Chinese (zh)
Other versions
CN105005768A (en)
Inventor
李东新
张鸿鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-07-06
Filing date
2015-07-06
Publication date
2018-09-14
Application filed by Hohai University HHU
Priority to CN201510391152.3A
Publication of CN105005768A
Application granted
Publication of CN105005768B
Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses an AdaBoost face detection algorithm with dynamic percentage sample cropping. At the start of each iteration, the percentage f of samples to be cropped is determined; in each round the samples with the smallest weights are cropped according to f, and the remaining samples are used for training. If the error rate of the best weak classifier obtained in the current iteration is greater than the error rate of random guessing, the cropping percentage f is reduced to enlarge the sample set, and the current iteration is retrained. If the error rate still exceeds 0.5 when all samples are used for training, iteration stops. The invention is suited to cases where the number of training samples is very large: training time is saved by training only on the subset of samples that contributes most to performance.

Description

AdaBoost face detection method based on dynamic percentage sample cropping

Technical Field

The invention relates to an AdaBoost face detection method with dynamic percentage sample cropping, and belongs to the technical field of pattern recognition.

Background

Biometric identification verifies identity or distinguishes individuals using the physiological and behavioral characteristics unique to each person. As one such biometric, the face is easy to capture and offers a friendly interface. Compared with common methods such as passwords, credit cards, and identity cards, it cannot be copied, is always carried, and is highly discriminative. It therefore has broad prospects in video surveillance, smart homes, and criminal investigation. As the computing power of embedded devices grows, intelligent algorithms are increasingly applied in embedded development to implement various functions. Face detection, as the basis of face recognition, has become a research hotspot in artificial intelligence.

The core of the AdaBoost algorithm is to iteratively select, from a large pool of Haar features, the features that classify best and use them as weak classifiers; the final strong classifier is composed of a large number of weak classifiers. AdaBoost is simple and practical. For detecting a single face image, face detection based on AdaBoost offers both very high detection accuracy and fast detection speed, so face recognition technology based on this algorithm has been widely applied.

When the numbers of training samples, sample features, and weak classifiers are large, training a classifier with AdaBoost consumes a great deal of time. The number of features determines the number of inner iterations: each iteration computes the error rate of one feature over the training sample set, and the best weak classifier is obtained by comparing these error rates. After each best weak classifier is trained, the weights of the training samples change accordingly, so obtaining more weak classifiers requires repeating the above steps that many times. Training time therefore grows with the product of three quantities (the number of training samples, the number of features, and the number of weak classifiers), that is, cubically when all three grow together.
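The cost argument above can be made concrete with a rough count of feature evaluations. The following is a minimal sketch, not part of the patent: the function name `stump_evaluations` and the example sizes are hypothetical, and the model assumes that every round scans every feature over every retained sample.

```python
def stump_evaluations(n_samples: int, n_features: int, n_rounds: int, f: float = 0.0) -> int:
    """Rough count of feature evaluations during training.

    Assumes each round scans every feature over every retained sample,
    where a fraction f of the samples has been cropped.
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1)")
    retained = n_samples - int(n_samples * f)  # samples left after cropping
    return n_rounds * n_features * retained


# Hypothetical sizes: cropping half of the samples halves the work per round.
full = stump_evaluations(10_000, 50_000, 200)
cropped = stump_evaluations(10_000, 50_000, 200, f=0.5)
print(full, cropped)
```

Under this model the saving is linear in f, which is why cropping low-weight samples pays off only when the sample count dominates the cost.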

Summary of the Invention

The purpose of the invention is to overcome the deficiencies of the prior art by providing an AdaBoost face detection method with dynamic percentage sample cropping, solving the technical problem that classifiers trained with the AdaBoost algorithm in the prior art consume a large amount of training time.

To solve this technical problem, the invention adopts the following technical solution. In the dynamic percentage sample cropping AdaBoost face detection method, the percentage f of samples to be cropped is determined at the start of each iteration; in each round the samples with the smallest weights are cropped according to f, and the remaining samples are used for training.

If the error rate of the best weak classifier obtained in the current iteration is greater than the error rate of random guessing, the cropping percentage f is reduced to enlarge the sample set, and the current iteration is retrained.

If the error rate still exceeds 0.5 when all samples are used for training, iteration stops.
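The way f shrinks after a failed round (the rule stated in step 404) can be sketched as follows. The function names are hypothetical, and treating "all samples are used" as the point where fewer than one sample would be cropped is an assumption, since the source does not say how that condition is detected.

```python
def relax_crop_fraction(f: float) -> float:
    """Shrink the cropped fraction f after a failed round (rule of step 404)."""
    if f > 2.0 / 3.0:
        return 2.0 * f - 1.0  # the retained share 1 - f doubles
    return f / 2.0            # the cropped share halves


def relaxation_schedule(f: float, n_samples: int) -> list[float]:
    """Successive values of f until fewer than one sample would be cropped.

    Assumption: int(n_samples * f) == 0 is taken to mean "all samples are used".
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1)")
    schedule = [f]
    while int(n_samples * schedule[-1]) > 0:
        schedule.append(relax_crop_fraction(schedule[-1]))
    return schedule


print(relaxation_schedule(0.75, 8))
```

Both branches agree at f = 2/3, where each gives 1/3, so the rule is continuous: above 2/3 it doubles the retained share, and at or below 2/3 it halves the cropped share.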

The specific algorithm comprises the following steps:

Step 1: Let the total number of input training samples be N, of which m are negative samples and n are positive samples. The training sample set is S = {(x_1, y_1), ..., (x_n, y_n)}, where x_i denotes the i-th sample and y_i ∈ {1, 0} marks positive and negative samples respectively.

Step 2: Initialize the sample weights:

Step 3: Assume the percentage of samples discarded in each round is f; the number of samples participating in training in each round is then N×(1-f), with iterations t = 1, 2, …, T.

Step 4: Obtain the optimal weak classifier and compute the weighting coefficient α_t of the weak classifier h_t in the strong classifier, as follows:

Step 401: Normalize the sample weights:

Step 402: For each feature j, train a simple weak classifier h_j(x, f_j, p_j, θ_j):

where f_j(x) is the feature value, p_j indicates the direction of the inequality, and θ_j is the threshold of the weak classifier.

Step 403: Select the weak classifier h_t(x) with the minimum error rate, where the minimum error rate is defined as:

Step 404: If ε_t = 0, or if ε_t ≥ 0.5 occurs in the first round of training, set T = t-1 and skip to step 6. If ε_t ≥ 0.5 and it is not the first round, set T = t-1 and check whether f is greater than 2/3: if so, set f = 2×f-1; otherwise set f = f/2. Then jump to step 5.

Step 405: Update the sample weights:

where e_i = 0 when sample x_i is misclassified, and e_i = 1 otherwise.

Step 406: Compute the weighting coefficient of the weak classifier h_t in the strong classifier:

Step 5: Sort the samples in the training set by weight in ascending order and, according to the cropping percentage f, crop the first n×f samples, which have the smallest weights.

Step 6: Output the strong classifier:
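The formulas belonging to steps 2, 401 to 403, 405, 406 and 6 are not reproduced in this text. For orientation only, the block below gives the corresponding expressions from the standard Viola-Jones form of AdaBoost, written with the symbols defined in the steps. That the patent uses exactly these forms is an assumption; the symbol β_t is introduced here and does not appear in the surviving text. The exponent in the weight update follows the convention stated in step 405 (e_i = 1 for a correctly classified sample), so correctly classified samples are the ones whose weights shrink.

```latex
\begin{align*}
w_{1,i} &= \begin{cases} \dfrac{1}{2n}, & y_i = 1 \\[4pt] \dfrac{1}{2m}, & y_i = 0 \end{cases}
  && \text{(step 2)} \\
w_{t,i} &\leftarrow \frac{w_{t,i}}{\sum_{k=1}^{N} w_{t,k}}
  && \text{(step 401)} \\
h_j(x, f_j, p_j, \theta_j) &= \begin{cases} 1, & p_j f_j(x) < p_j \theta_j \\ 0, & \text{otherwise} \end{cases}
  && \text{(step 402)} \\
\varepsilon_t &= \min_{j} \sum_{i} w_{t,i}\, \bigl\lvert h_j(x_i, f_j, p_j, \theta_j) - y_i \bigr\rvert
  && \text{(step 403)} \\
w_{t+1,i} &= w_{t,i}\, \beta_t^{\,e_i}, \qquad \beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}
  && \text{(step 405)} \\
\alpha_t &= \log \frac{1}{\beta_t}
  && \text{(step 406)} \\
H(x) &= \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \tfrac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases}
  && \text{(step 6)}
\end{align*}
```

With these forms, ε_t ≥ 0.5 gives β_t ≥ 1 and α_t ≤ 0, and ε_t = 0 gives an unbounded α_t, which is consistent with step 404 treating both cases specially.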

Compared with the prior art, the beneficial effect of the invention is that, when the number of samples participating in training is very large, training time is saved by selecting the subset of samples that contributes most to performance.

Description of the Drawings

Figure 1 is a flow chart of the method of the invention.

Figure 2 is a flow chart of obtaining the optimal weak classifier.

Detailed Description

The invention is further described below with reference to the accompanying drawings.

The functions shown in the drawings have the following meanings:

Function cvGetTickCount(): returns a tick count measured from a fixed reference point such as operating system startup. The time spent on training is obtained from the difference between two return values, converted to time units with the tick frequency.

Function Single_Classifier(int i): generates a strong classifier; the argument is the number of weak classifiers that make up this strong classifier.

Function Generate_AllFeatures(int count): generates all Haar-like features; count is the number of feature types used. The invention uses 5 common feature templates, so count is 5.

Function Input_Samples(): reads the positive and negative samples from the specified directory.

Function Select_WeakClassifier(): obtains the optimal weak classifier.

Function Output_WeakClassifier(): outputs the generated weak classifier.

Function Cal_HaarValue(j,k): computes the j-th feature of the k-th sample.

Function qsort(): sorts the samples by feature value.
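As an illustration of what a routine like Cal_HaarValue(j,k) has to compute, the sketch below evaluates one two-rectangle Haar-like feature with an integral image, the usual technique for such features. It is not the patent's implementation: the function names, the pure-Python image representation (a list of rows), and the choice of a left-minus-right template are all assumptions.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top and left."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Pixel sum inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a window that is height x (2 * width)."""
    return (rect_sum(ii, top, left, height, width)
            - rect_sum(ii, top, left + width, height, width))


# Hypothetical 4x4 "image" holding the values 0..15 row by row.
img = [[4 * r + c for c in range(4)] for r in range(4)]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 4, 4), two_rect_feature(ii, 0, 0, 4, 2))
```

The table is built once per sample, after which every rectangle sum costs four lookups regardless of its size, which keeps the evaluation of many features per sample affordable.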

As shown in Figure 1, in the dynamic percentage sample cropping AdaBoost face detection method, the percentage f of samples to be cropped is determined at the start of each iteration; in each round the samples with the smallest weights are cropped according to f, and the remaining samples are used for training.

If the error rate of the best weak classifier obtained in the current iteration is greater than the error rate of random guessing, the cropping percentage f is reduced to enlarge the sample set, and the current iteration is retrained.

If the error rate still exceeds 0.5 when all samples are used for training, iteration stops.

The specific algorithm comprises the following steps:

Step 1: Let the total number of input training samples be N, of which m are negative samples and n are positive samples. The training sample set is S = {(x_1, y_1), ..., (x_n, y_n)}, where x_i denotes the i-th sample and y_i ∈ {1, 0} marks positive and negative samples respectively.

Step 2: Initialize the sample weights:

Step 3: Assume the percentage of samples discarded in each round is f; the number of samples participating in training in each round is then N×(1-f), with iterations t = 1, 2, …, T.

Step 4: Obtain the optimal weak classifier and compute the weighting coefficient α_t of the weak classifier h_t in the strong classifier, as shown in Figure 2, as follows:

Step 401: Normalize the sample weights:

Step 402: For each feature j, train a simple weak classifier h_j(x, f_j, p_j, θ_j):

where f_j(x) is the feature value, p_j indicates the direction of the inequality, and θ_j is the threshold of the weak classifier.

Step 403: Select the weak classifier h_t(x) with the minimum error rate, where the minimum error rate is defined as:

Step 404: If ε_t = 0, or if ε_t ≥ 0.5 occurs in the first round of training, set T = t-1 and skip to step 6. If ε_t ≥ 0.5 and it is not the first round, set T = t-1 and check whether f is greater than 2/3: if so, set f = 2×f-1; otherwise set f = f/2. Then jump to step 5.

Step 405: Update the sample weights:

where e_i = 0 when sample x_i is misclassified, and e_i = 1 otherwise.

Step 406: Compute the weighting coefficient of the weak classifier h_t in the strong classifier:

Step 5: Sort the samples in the training set by weight in ascending order and, according to the cropping percentage f, crop the first n×f samples, which have the smallest weights.

Step 6: Output the strong classifier:
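The steps above can be sketched end to end as a small training loop. This is an illustrative reading, not the patented implementation, and it rests on several assumptions: samples are given as precomputed feature vectors, the weak learner is an exhaustive threshold search, the weight update and α_t use the standard Viola-Jones expressions (the source's formulas are not reproduced in this text), ε_t is measured on all samples after training on the retained ones, and "all samples are used" means that int(N×f) is 0. All function names are hypothetical.

```python
import math


def stump_predict(x, j, p, theta):
    """Weak classifier: 1 if p * f_j(x) < p * theta, else 0."""
    return 1 if p * x[j] < p * theta else 0


def best_stump(X, y, w, idx):
    """Exhaustively pick (feature, polarity, threshold) with least weighted error on idx."""
    best_err, best = float("inf"), None
    for j in range(len(X[0])):
        for theta in sorted({X[i][j] for i in idx}):
            for p in (1, -1):
                err = sum(w[i] for i in idx if stump_predict(X[i], j, p, theta) != y[i])
                if err < best_err:
                    best_err, best = err, (j, p, theta)
    return best


def train(X, y, rounds, f):
    """Return a list of (alpha, feature, polarity, threshold) weak classifiers."""
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1)")
    n = len(X)
    n_pos = sum(y)
    n_neg = n - n_pos  # both classes are assumed to be present
    w = [1 / (2 * n_pos) if yi == 1 else 1 / (2 * n_neg) for yi in y]  # step 2
    model, t = [], 0
    while t < rounds:
        total = sum(w)
        w = [wi / total for wi in w]                       # step 401
        n_crop = int(n * f)
        order = sorted(range(n), key=lambda i: w[i])       # step 5: ascending weight
        kept = order[n_crop:]                              # drop the lightest samples
        j, p, theta = best_stump(X, y, w, kept)            # steps 402-403
        wrong = [stump_predict(X[i], j, p, theta) != y[i] for i in range(n)]
        eps = sum(wi for wi, bad in zip(w, wrong) if bad)  # assumption: error on all samples
        if eps == 0 or (eps >= 0.5 and (t == 0 or n_crop == 0)):
            break                                          # step 404: stop, keep t classifiers
        if eps >= 0.5:
            f = 2 * f - 1 if f > 2 / 3 else f / 2          # step 404: crop less, retry round
            continue
        beta = eps / (1 - eps)
        w = [wi if bad else wi * beta for wi, bad in zip(w, wrong)]  # step 405
        model.append((math.log(1 / beta), j, p, theta))    # step 406
        t += 1
    return model


def predict(model, x):
    """Strong classifier: weighted vote against half of the total alpha (step 6)."""
    score = sum(a for a, j, p, theta in model if stump_predict(x, j, p, theta))
    return 1 if model and score >= 0.5 * sum(a for a, *_ in model) else 0


# Toy data: one feature, labels that no single threshold separates.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, 0, 1, 0, 0]
model = train(X, y, rounds=3, f=0.5)
print([predict(model, x) for x in X])
```

Measuring ε_t on all samples is a design choice made here so that the ε_t ≥ 0.5 test is meaningful: a threshold classifier chosen over both polarities can never do worse than 0.5 on the very samples it was fitted to, so a failure can only show up on the samples that were cropped away.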

The above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the technical principle of the invention, and such improvements and modifications shall also fall within the scope of protection of the invention.

Claims (1)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201510391152.3A | CN105005768B (en) | 2015-07-06 | 2015-07-06 | Dynamic percentage sample cuts AdaBoost method for detecting human face

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201510391152.3A | CN105005768B (en) | 2015-07-06 | 2015-07-06 | Dynamic percentage sample cuts AdaBoost method for detecting human face

Publications (2)

Publication Number | Publication Date
CN105005768A (en) | 2015-10-28
CN105005768B (en) | 2018-09-14

Family ID: 54378433

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201510391152.3A | Expired - Fee Related | CN105005768B (en) | 2015-07-06 | 2015-07-06 | Dynamic percentage sample cuts AdaBoost method for detecting human face

Country Status (1)

Country | Link
CN (1) | CN105005768B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106022225B * | 2016-05-10 | 2019-03-05 | 中科天网(广东)科技有限公司 | A kind of Face datection classifier building method based on AdaBoost
CN106951930A * | 2017-04-13 | 2017-07-14 | 杭州申昊科技股份有限公司 | A kind of instrument localization method suitable for Intelligent Mobile Robot
CN107477809A * | 2017-09-20 | 2017-12-15 | 四川长虹电器股份有限公司 | Air conditioner energy source management system based on Adaboost

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101196984A * | 2006-12-18 | 2008-06-11 | 北京海鑫科金高科技股份有限公司 | Fast face detecting method
CN103116756A * | 2013-01-23 | 2013-05-22 | 北京工商大学 | Face detecting and tracking method and device
CN103605964A * | 2013-11-25 | 2014-02-26 | 上海骏聿数码科技有限公司 | Face detection method and system based on image on-line learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7953253B2 * | 2005-12-31 | 2011-05-31 | Arcsoft, Inc. | Face detection on mobile devices

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
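Title
Research on face detection based on the AdaBoost algorithm; 李长风; China Master's Theses Full-text Database, Information Science and Technology; 2014-10-15 (No. 10); full text *
Fast AdaBoost training algorithm based on dynamic weight trimming; 贾慧星; Chinese Journal of Computers; 2009-02-28; Vol. 32 (No. 2); full text *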

Legal Events

Date | Code | Title | Description
 | C06 | Publication |
 | PB01 | Publication |
 | C10 | Entry into substantive examination |
 | SE01 | Entry into force of request for substantive examination |
 | GR01 | Patent grant |
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2018-09-14; Termination date: 2021-07-06
