CN102930278A

Movatterモバイル変換

Info

Publication number: CN102930278A
Application number: CN2012103929754A
Authority: CN
Inventors: 车明; 常轶松; 刘学毅; 李维超; 秦超; 黎贺
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2012-10-16
Filing date: 2012-10-16
Publication date: 2013-02-13

Abstract

本发明公开了一种人眼视线估计方法及其装置，涉及人机交互领域。通过对人脸区域进行眼部粗定位，先粗略选取脸部图像的1/2-7/8处为眼部区域；再通过混合积分投影函数确定人眼纵坐标位置，以获取人眼精确边界；在人眼精确边界进行瞳孔扫描，根据经验值确定一个和眼球大小相近的框对人眼精确边界内部进行窗口扫描，选取灰度值和最小的窗口为瞳孔，并将窗口中心作为瞳孔中心；利用角点匹配算法在人眼精确边界内部获取内眼角点坐标；将瞳孔中心坐标和内眼角点坐标值传入视线估计模型中，确定视线方向。本发明通过视线估计模型提高了视线估计的精度；通过设计的硬件结构实现了在高检测率的条件下，消耗较少资源。

The invention discloses a method and a device for estimating the line of sight of human eyes, and relates to the field of human-computer interaction. By roughly positioning the eyes of the face area, first roughly select 1/2-7/8 of the face image as the eye area; then determine the ordinate position of the human eye through the mixed integral projection function to obtain the precise boundary of the human eye ;Pupil scanning is performed at the precise boundary of the human eye, and a frame similar to the size of the eyeball is determined according to experience to scan the window inside the precise boundary of the human eye, and the window with the gray value and the smallest value is selected as the pupil, and the center of the window is used as the pupil center; Use the corner point matching algorithm to obtain the coordinates of the inner corner of the eye within the precise boundary of the human eye; transfer the coordinates of the center of the pupil and the coordinates of the inner corner of the eye into the line of sight estimation model to determine the direction of the line of sight. The present invention improves the accuracy of line of sight estimation through the line of sight estimation model; realizes the consumption of less resources under the condition of high detection rate through the designed hardware structure.

Description

Translated fromChinese

一种人眼视线估计方法及其装置A method and device for estimating human eye sight line

技术领域technical field

本发明涉及人机交互领域，尤其涉及一种人眼视线估计方法及其装置。The invention relates to the field of human-computer interaction, in particular to a method and device for estimating a human eye sight line.

背景技术Background technique

HCI(Human-Computer Interaction，人机交互)是研究人与计算机相互作用的技术，研究目的在于提高人机交流的自然性和高效性。眼睛作为人类面部最显著的特征，其运动在信息表达与交流中起到了非常重要的作用。因此，通过拍摄图像对人眼睛的信息进行提取和分析成为人机交互领域中的热点研究问题。HCI (Human-Computer Interaction, Human-Computer Interaction) is a technology that studies the interaction between humans and computers. The purpose of the research is to improve the naturalness and efficiency of human-computer communication. Eyes are the most prominent feature of the human face, and their movement plays a very important role in information expression and communication. Therefore, extracting and analyzing the information of human eyes by taking images has become a hot research issue in the field of human-computer interaction.

脸部特征点检测是决定一幅图像中人脸各个特征的位置和大小，它有着广泛的应用：监控追踪、人机交互、智能机器人以及视线估计等。1995年，Freund和Schapire提出AdaBoost算法，这种算法根据弱学习的反馈，适应性地调整假设的错误率，使在效率不降低的情况下，检测正确率得到了很大的提高。Viola提出将积分图应用到特征值的计算之中，使其计算速度大幅增加。Facial feature point detection is to determine the position and size of each feature of a face in an image. It has a wide range of applications: monitoring and tracking, human-computer interaction, intelligent robots, and line of sight estimation. In 1995, Freund and Schapire proposed the AdaBoost algorithm, which adaptively adjusts the error rate of assumptions according to the feedback of weak learning, so that the detection accuracy rate is greatly improved without reducing the efficiency. Viola proposes to apply the integral graph to the calculation of eigenvalues, so that the calculation speed can be greatly increased.

视线估计在很多领域比如人机交互、医学诊断、航空和助残等等都有着广阔的发展前景，因此近年来该技术得到了广泛的关注。常用的研究方法是通过设置参考点光源和三维重建技术来实现视线估计。Sight estimation has broad development prospects in many fields such as human-computer interaction, medical diagnosis, aviation and disability assistance, so this technology has received extensive attention in recent years. The commonly used research method is to realize line-of-sight estimation by setting reference point light source and 3D reconstruction technology.

发明人在实现本发明的过程中，发现现有技术中至少存在以下缺点和不足：In the process of realizing the present invention, the inventor finds that at least the following disadvantages and deficiencies exist in the prior art:

1）由于上述算法本身的特性，使得获取到人脸检测结果的时间较长；1) Due to the characteristics of the above algorithm itself, it takes a long time to obtain the face detection results;

2）由于参考点光源需要复杂的实验设置以及严格的光照环境，限制了应用范围。2) Due to the complex experimental setup and strict lighting environment required for the reference point light source, the scope of application is limited.

发明内容Contents of the invention

本发明提供了一种人眼视线估计方法及其装置，本发明缩短了获取人脸检测结果的时间，扩大了应用范围，详见下文描述：The present invention provides a method and device for estimating the line of sight of the human eye. The present invention shortens the time for obtaining face detection results and expands the scope of application. See the following description for details:

一种人眼视线估计方法，其特征在于，所述方法包括以下步骤：A method for estimating the line of sight of human eyes is characterized in that the method comprises the following steps:

（1）获取包含人脸的图像并进行位图转换，获取RGB位图；将RGB位图转换为灰度图和肤色二值图；(1) Obtain an image containing a human face and perform bitmap conversion to obtain an RGB bitmap; convert the RGB bitmap into a grayscale image and a skin color binary image;

（2）将灰度图进行各级缩放，对每一级缩放后的图像进行窗口扫描，扫描窗口大小为20×20，并对扫描窗口中的灰度图进行积分图和平方积分图计算；(2) Scale the grayscale image at various levels, perform window scanning on the image scaled at each level, the size of the scanning window is 20×20, and calculate the integral map and square integral map of the grayscale image in the scanning window;

（3）通过积分图数据和平方积分图数据进行弱分类器计算，将同级弱分类器计算结果累加与对应的强分类器阈值比较，淘汰非人脸窗口；如果一个待选窗口通过所有的强分类器则判定为人脸图像；(3) Calculate the weak classifier through the integral map data and the square integral map data, and compare the cumulative results of the weak classifiers at the same level with the corresponding strong classifier thresholds, and eliminate non-face windows; if a candidate window passes all The strong classifier is judged as a face image;

（4）Nios II核处理器将识别为人脸的待选窗口进行合并，获取最终人脸区域；根据最终人脸区域、肤色二值图和肤色模型进行脸部区域精确定位，获取人脸区域；(4) The Nios II core processor merges the candidate windows identified as faces to obtain the final face area; according to the final face area, skin color binary image and skin color model, the face area is accurately positioned to obtain the face area;

（5）根据人脸区域进行眼部粗定位，先粗略选取脸部图像的1/2-7/8处为眼部区域；再通过混合积分投影函数确定人眼纵坐标位置，以获取人眼精确边界；(5) Carry out rough positioning of the eyes according to the face area, first roughly select 1/2-7/8 of the face image as the eye area; then determine the ordinate position of the human eye through the mixed integral projection function to obtain the precise boundaries;

（6）对人眼精确边界进行窗口扫描，根据经验值确定一个和眼球大小相近的框对人眼精确边界进行扫描，选取灰度值和最小的窗口为瞳孔，并将窗口中心作为瞳孔中心；(6) Window scan the precise boundary of the human eye, determine a frame similar to the size of the eyeball to scan the precise boundary of the human eye according to empirical values, select the window with the smallest gray value and the smallest value as the pupil, and use the center of the window as the pupil center;

（7）以获得的瞳孔中心为基准，截取包含内眼角的内眼角窗口并对内眼角窗口进行灰度等级拉伸预处理，然后利用Susan算子及角点检测算子，在内眼角窗口中提取候选内眼角点，最后筛选出正确内眼角点坐标；(7) Based on the obtained pupil center, intercept the inner corner window including the inner corner of the eye and perform grayscale stretching preprocessing on the inner corner window, and then use the Susan operator and corner detection operator to detect the inner corner window in the inner corner window Extract candidate inner corner points, and finally filter out the correct inner corner point coordinates;

（8）将瞳孔中心的坐标和正确内眼角点坐标传入PC机的视线估计模型中，确定视线方向。(8) Transfer the coordinates of the center of the pupil and the correct coordinates of the inner corner of the eye to the line-of-sight estimation model of the PC to determine the line-of-sight direction.

所述将识别为人脸的待选窗口进行合并具体为：The merging of the candidate windows identified as human faces is specifically:

1)当第二个人脸框与第一个人脸框相距不到第一个人脸框宽的1/2时，将第一个人脸框和第二个人脸框合并为一类，当满足条件时依次进行其他人脸框和第一个人脸框的合并；1) When the distance between the second face frame and the first face frame is less than 1/2 of the width of the first face frame, merge the first face frame and the second face frame into one category, when When the conditions are met, the other face frames are merged with the first face frame in turn;

2)将人脸框数目大于阈值的类进行人脸最终区域的计算。2) Calculate the final area of the face for the class whose number of face frames is greater than the threshold.

所述将人脸框数目大于阈值的类进行人脸最终区域的计算具体为：The calculation of the final area of the face with the number of face frames greater than the threshold is specifically:

将所有框的对应左上角点坐标进行求平均值计算，将计算结果作为整合框的左上角点坐标，再依次进行右上角点、左下角点和右下角点坐标的计算，至此上述四点的坐标即可确定人脸最终区域。Calculate the average of the coordinates of the corresponding upper left corners of all boxes, and use the calculation results as the coordinates of the upper left corner of the integrated frame, and then calculate the coordinates of the upper right corner, lower left corner and lower right corner in turn, so far the above four points The coordinates can determine the final area of the face.

所述肤色模型具体为：（Cg，Cb，Cr为肤色二值图）；The skin color model is specifically: (Cg, Cb, Cr are binary images of skin color);

$\frac{{((Cg C g - - 107107))}^{22} + + {((Cb Cb - - 110110))}^{22}}{{12.25 12.25}^{22}} \leq \leq 11$

Cr∈[260-Cg,280-Cg]Cr∈[260-Cg,280-Cg]

Cg∈[85,135]Cg∈[85,135]

所述视线估计模型具体为：The line of sight estimation model is specifically:

$p p [\begin{matrix} {x x}_{s the s} \\ {y the y}_{s the s} \\ 11 \end{matrix}] = = H h [\begin{matrix} {x x}_{p p} \\ {y the y}_{P P} \\ 11 \end{matrix}]$

(x_s，y_s)为屏幕注视点坐标，(x_p,y_p)为照片中的瞳孔坐标，H为屏幕和照片投影平面间的矩阵，P为照片中的瞳孔中心。(x_s , y_s ) is the gaze point coordinates of the screen, (x_p , y_p ) is the pupil coordinates in the photo, H is the matrix between the screen and the projection plane of the photo, and P is the pupil center in the photo.

所述在内眼角窗口中提取候选内眼角点，最后筛选出正确内眼角点坐标具体为：In the inner corner window, the candidate inner corner points are extracted, and finally the correct inner corner point coordinates are selected as follows:

1）如果只有一个候选角点，则候选角点即为正确内眼角点；1) If there is only one candidate corner point, the candidate corner point is the correct inner corner point;

2）如果有两个候选角点，则选择距瞳孔中心最远的角点为正确内眼角点；2) If there are two candidate corner points, select the corner point farthest from the pupil center as the correct inner corner point;

3）如果有三个或三个以上候选角点，则根据如下算法进行筛选：3) If there are three or more candidate corner points, filter according to the following algorithm:

${X x}_{max max} = = \underset{((x x,, y the y)) &Element; &Element; S S}{max max} {S S}_{x x}$ ${Y Y}_{min min} = = \underset{((x x,, y the y)) &Element; &Element; S S}{min min} {S S}_{y the y}$

T={(x,y)|(X_max-x)<5∩(y-Y_min)<5,(x,y)∈S}T={(x,y)|(X_max -x)<5∩(yY_min )<5,(x,y)∈S}

C_x=mean(_Tx) C_y=mean(T_y)C_x =mean(_Tx ) C_y =mean(T_y )

其中，S为候选角点集合，X_max为S中所有的点横坐标的最大值，Y_min为S中所有的点纵坐标的最小值，T为S中与点（X_max，Y_min）的横纵坐标均相差不大于5个像素的点的集合，点（C_x，C_y）即为所选正确内眼角点坐标；mean表示取平均值。Among them, S is the set of candidate corner points, X_max is the maximum value of the abscissa of all points in S, Y_min is the minimum value of the ordinate of all points in S, and T is the point in S (X_max , Y_min ) A set of points whose horizontal and vertical coordinates differ by no more than 5 pixels, and the point (C_x , C_y ) is the coordinates of the selected correct inner corner of the eye; mean means to take the average value.

一种人眼视线估计装置，包括：A device for estimating human line of sight, comprising:

人脸图像获取模块，用于获取人脸图像；A human face image acquisition module is used to obtain a human face image;

位图转换模块，用于对人脸图像进行位图转换，获取RGB位图；Bitmap conversion module, for carrying out bitmap conversion to face image, obtains RGB bitmap;

灰度图模块，用于将RGB位图转换为灰度图；Grayscale module for converting RGB bitmaps to grayscale images;

肤色二值图模块，用于将RGB位图转换为肤色二值图；The skin color binary image module is used to convert the RGB bitmap into a skin color binary image;

积分图模块，用于进行积分图和平方积分图计算，获取积分图数据和平方积分图数据；Integral map module, used for integral map and square integral map calculation, to obtain integral map data and square integral map data;

弱分类器计算模块，用于对积分图数据和平方积分图数据进行计算并淘汰非人脸窗口；The weak classifier calculation module is used to calculate integral map data and square integral map data and eliminate non-face windows;

Nios II核处理器，用于人脸窗口的合并，并获取最终人脸区域；根据最终人脸区域、肤色二值图和肤色模型进行脸部区域精确定位，获取人脸区域；Nios II core processor, used for merging of face windows and obtaining the final face area; according to the final face area, skin color binary image and skin color model, the face area is accurately positioned to obtain the face area;

眼部特征点检测模块，用于根据人脸区域进行眼部粗定位，选取脸部图像的1/2-7/8处为眼部区域；通过混合积分投影函数确定人眼坐标位置，获取人眼精确边界；对人眼精确边界进行窗口扫描，根据经验值确定一个和眼球大小相近的框对人眼精确边界进行扫描，选取灰度值和最小的窗口为瞳孔，并将窗口中心作为瞳孔中心，获取正确内眼角点坐标；将瞳孔中心的坐标和正确内眼角点坐标传入PC机的视线估计模型中，确定视线方向。The eye feature point detection module is used for rough positioning of the eyes according to the face area, and selects 1/2-7/8 of the face image as the eye area; determines the coordinate position of the human eye through the mixed integral projection function, and obtains the Precise boundary of the eye; scan the window of the precise boundary of the human eye, determine a frame similar to the size of the eyeball to scan the precise boundary of the human eye according to empirical values, select the window with the gray value and the smallest value as the pupil, and use the center of the window as the pupil center , to obtain the correct coordinates of the inner corner of the eye; transfer the coordinates of the center of the pupil and the correct coordinates of the inner corner of the eye into the line of sight estimation model of the PC, and determine the direction of the line of sight.

所述肤色二值图模块是由Cg-Cb和Cg-Cr两组分割器的结果进行逻辑与得到的，硬件设计为：将Cg、Cb分别与107、110做绝对值减法，经过两级流水线实现，结果用1bit来表示，32个像素点的结果进行拼接，存储到缓存中。The skin color binary map module is obtained by logically ANDing the results of the two groups of dividers Cg-Cb and Cg-Cr. The hardware design is as follows: Cg, Cb are respectively subtracted from 107 and 110 in absolute value, and after a two-stage pipeline Implementation, the result is represented by 1 bit, and the results of 32 pixels are spliced and stored in the cache.

所述积分图模块使用四级流水线实现，首先，积分图计算部分延迟一个周期，在此周期内进行灰度值的平方运算；在第二级中，将当前像素的值与左累加寄存器做加法，结果保存到左累加寄存器中，并把相应位置的地址送到行缓存的地址寄存器中；在第三级中，将读出的数据和左累加寄存器做加法；在第四级中，将当前位置的计算结果输出并写回到行缓存中，以供下一行计算使用。The integral image module is implemented using a four-stage pipeline. First, the integral image calculation part is delayed by one cycle, and the square operation of the gray value is performed within this cycle; in the second stage, the value of the current pixel is added to the left accumulation register , the result is stored in the left accumulation register, and the address of the corresponding position is sent to the address register of the line buffer; in the third stage, the read data is added to the left accumulation register; in the fourth stage, the current The calculation result of the position is output and written back into the row buffer for use in the next row calculation.

所述弱分类器计算模块采用了三级并行的硬件架构：The weak classifier calculation module adopts a three-level parallel hardware architecture:

（1）窗口间任务级并行：四个待检窗口同时进行扫描，第一个窗口设定流水线切分时序和读取弱分类器信息，其余三个窗口在时序上与第一个窗口对齐，并且共享第一个窗口读出的弱分类器信息；(1) Task-level parallelism between windows: four windows to be inspected are scanned at the same time, the first window sets the timing of pipeline segmentation and reads weak classifier information, and the remaining three windows are aligned with the first window in timing, And share the weak classifier information read out by the first window;

（2）窗口内任务级并行：每个窗口内部有三条流水线同时计算弱分类器，根据两种弱分类器的个数，由两条流水线计算两个矩形的弱分类器，由第三条流水线计算三个矩形的弱分类器；(2) Task-level parallelism within the window: There are three pipelines inside each window to calculate weak classifiers at the same time. According to the number of two weak classifiers, two pipelines calculate two rectangular weak classifiers, and the third pipeline Compute weak classifiers for three rectangles;

（3）数据级并行：单条流水线结构划分为7级，每个周期计算一个弱分类器。(3) Data level parallelism: a single pipeline structure is divided into 7 levels, and a weak classifier is calculated in each cycle.

本发明提供的技术方案的有益效果是：本发明通过视线估计模型提高了视线估计的精度；本发明所设计的硬件结构在保证高检测率的条件下，消耗的资源较少，硬件系统消耗了12,181个逻辑单元(LE)，91个9位的乘法器，1,507,176位存储空间；针对640×480分辨率的图像，检测速率为12帧/秒，针对320×240分辨率的图像，检测速率为41帧/秒。The beneficial effects of the technical solution provided by the present invention are: the present invention improves the accuracy of sight line estimation through the line of sight estimation model; the hardware structure designed by the present invention consumes less resources under the condition of ensuring a high detection rate, and the hardware system consumes 12,181 logic units (LE), 91 9-bit multipliers, 1,507,176-bit storage space; for images with a resolution of 640×480, the detection rate is 12 frames per second; for images with a resolution of 320×240, the detection rate is 41 frames per second.

附图说明Description of drawings

图1为视线估计模型的示意图；Figure 1 is a schematic diagram of a line of sight estimation model;

图2为一种人眼视线估计方法的流程图；Fig. 2 is a flow chart of a method for estimating the line of sight of human eyes;

图3人眼视线估计装置结构图；Fig. 3 structure diagram of human eye line of sight estimation device;

图4位图转换模块硬件结构；Fig. 4 bitmap conversion module hardware structure;

图5肤色二值图转换硬件结构；Fig. 5 skin color binary image conversion hardware structure;

图6积分图计算的流水线结构图；Figure 6 The pipeline structure diagram of integral graph calculation;

图7弱分类器计算的流水线结构图。Figure 7. The pipeline structure diagram of weak classifier calculation.

具体实施方式Detailed ways

为使本发明的目的、技术方案和优点更加清楚，下面将结合附图对本发明实施方式作进一步地详细描述。In order to make the object, technical solution and advantages of the present invention clearer, the implementation manner of the present invention will be further described in detail below in conjunction with the accompanying drawings.

为了缩短获取人脸检测结果的时间，扩大应用范围，本发明实施例提供了一种人眼视线估计方法及其装置，详见下文描述：In order to shorten the time for obtaining face detection results and expand the scope of application, an embodiment of the present invention provides a method and device for estimating human eye sight line, see the following description for details:

随着后PC时代的到来，SoC(System On-Chip，嵌入式片上系统)取得了长足的发展和进步，在更小的芯片上可以集成更多、更快的以处理器为核心的复杂功能单元，这使得设计更高性能的SoC逐步成为可能，因此AdaBoost算法的硬件实现是一种提高计算速度的有效途径。With the advent of the post-PC era, SoC (System On-Chip, embedded system on a chip) has made great progress and progress, and more and faster complex functions centered on the processor can be integrated on a smaller chip. Unit, which makes it possible to design a higher performance SoC gradually, so the hardware implementation of AdaBoost algorithm is an effective way to improve the calculation speed.

一种人眼视线估计方法，参见图1和图2，包括以下步骤：A method for estimating the line of sight of human eyes, referring to Fig. 1 and Fig. 2, comprises the following steps:

101：获取包含人脸的图像并进行位图转换，获取RGB位图；将RGB位图转换为灰度图和肤色二值图；101: Obtain an image containing a human face and perform bitmap conversion to obtain an RGB bitmap; convert the RGB bitmap into a grayscale image and a skin color binary image;

102：将灰度图进行各级缩放，对每一级缩放后的图像进行窗口扫描，扫描窗口大小为20×20，并对扫描窗口中的灰度图进行积分图和平方积分图计算；102: Scale the grayscale image at various levels, perform window scanning on the image scaled at each level, the size of the scanning window is 20×20, and perform integral and square integral image calculations on the grayscale image in the scanning window;

其中，本方法采用缩放系数为1.25，具体实现时，根据实际应用中的需要进行设定，本发明实施例对此不做限制。Wherein, this method adopts a scaling factor of 1.25, which is set according to actual application requirements during specific implementation, and is not limited in this embodiment of the present invention.

103：通过积分图数据和平方积分图数据进行弱分类器计算，将同级弱分类器计算结果累加与对应的强分类器阈值比较，淘汰非人脸窗口；如果一个待选窗口通过所有的强分类器则判定为人脸图像；103: Carry out weak classifier calculations through integral image data and square integral image data, compare the cumulative results of weak classifiers at the same level with the corresponding strong classifier thresholds, and eliminate non-face windows; if a candidate window passes all strong classifiers The classifier determines that it is a face image;

本方法使用OpenCV库训练好的弱分类器，在训练的过程中，样本集包括大量的人脸图像和非人脸图像，将每种Haar特征的每种尺寸扫描完样本集中每幅图像的所有位置，挑选出对人脸图像和非人脸图像具有最好区分度的Haar特征，最终得到弱分类器，由若干个弱分类器组合得到各级强分类器，强分类器的阈值由弱分类器共同决定。This method uses the weak classifier trained by the OpenCV library. During the training process, the sample set includes a large number of face images and non-face images, and scans all the images of each image in the sample set for each size of each Haar feature. position, select the Haar feature with the best discrimination between face images and non-face images, and finally get a weak classifier, and combine several weak classifiers to get strong classifiers at all levels. The threshold of the strong classifier is determined by the weak classifier jointly determined by the device.

本方法使用的弱分类器基于20×20的样本集图像，最终得到两个矩形表示的弱分类器1775个，三个矩形表示的弱分类器360个，这些弱分类器组成22级强分类器，如果一个待选窗口通过所有的强分类器则判定为人脸图像。The weak classifiers used in this method are based on 20×20 sample set images, and finally 1775 weak classifiers represented by two rectangles and 360 weak classifiers represented by three rectangles are obtained. These weak classifiers form 22 levels of strong classifiers , if a candidate window passes all strong classifiers, it is judged as a face image.

104：Nios II核处理器将识别为人脸的待选窗口进行合并，获取最终人脸区域；根据最终人脸区域、肤色二值图和肤色模型进行脸部区域精确定位，获取人脸区域；104: The Nios II core processor merges the candidate windows identified as faces to obtain the final face area; perform precise positioning of the face area according to the final face area, skin color binary image and skin color model to obtain the face area;

其中，一副图像经过步骤103检测过后，输出的结果往往会出现交错和包含等现象，即大的人脸框里包含了小的人脸框或两个人脸框的位置相差并不大并且都包含了同一个人脸。在这种情况下，就需要一种整合算法把包含同一个人脸的人脸框合并在一起，以便获得最终的检测结果。Among them, after an image is detected in step 103, the output results often have phenomena such as interlacing and inclusion, that is, the large face frame contains a small face frame or the positions of the two face frames are not much different and both Contains the same face. In this case, an integration algorithm is needed to merge the face frames containing the same face together to obtain the final detection result.

其中，将识别为人脸的待选窗口进行合并具体为：Among them, merging the candidate windows that are recognized as human faces is specifically:

该步骤具体为：将所有框的对应左上角点坐标进行求平均值计算，将计算结果作为整合框的左上角点坐标，再依次进行右上角点、左下角点和右下角点坐标的计算，至此上述四点的坐标即可确定人脸最终区域。This step is specifically: calculate the average value of the corresponding upper left corner point coordinates of all boxes, use the calculation result as the upper left corner point coordinates of the integrated frame, and then perform the calculation of the upper right corner point, lower left corner point and lower right corner point coordinates in sequence, So far, the coordinates of the above four points can determine the final area of the face.

实际应用时具体实现为：首先设定一个判定条件来判定哪些人脸框是相近的人脸框，然后将这些相近的人脸框归为一类，本方法认为只要第二个人脸框与第一个人脸框相距不到第一个人脸框宽的1/2，那么他们就是相近的，就应该合并为一类。同时设定一个阈值（本方法以5为例进行说明，具体实现时，根据实际应用中的需要进行设定，本发明实施例对此不做限制），表明如果某一类中有至少五个人脸框，本方法认为它就是人脸的位置了，否则，就放弃这一类人脸框。然后对人脸框数目大于5的类进行合并计算，计算方法是将所有框的对应左上角点坐标进行求平均值计算，将计算结果作为整合框的左上角点坐标，再依次进行右上角点、左下角点和右下角点坐标的计算，至此上述四点的坐标即可确定人脸最终区域。The specific implementation in practical application is as follows: first, set a judgment condition to determine which face frames are similar face frames, and then classify these similar face frames into one category. If the distance between a face frame is less than 1/2 of the width of the first face frame, then they are similar and should be merged into one category. At the same time, a threshold is set (this method takes 5 as an example for illustration, and it is set according to the needs of practical applications during specific implementation, and the embodiment of the present invention does not limit this), indicating that if there are at least five people in a certain category Face frame, this method considers it to be the position of the face, otherwise, this type of face frame is discarded. Then, the class with a number of face frames greater than 5 is combined and calculated. The calculation method is to calculate the average value of the corresponding upper left corner point coordinates of all frames, and use the calculation result as the upper left corner point coordinate of the integrated frame, and then perform the upper right corner point coordinates in turn. , Calculation of the coordinates of the lower left corner point and the lower right corner point, so far the coordinates of the above four points can determine the final area of the face.

其中，肤色模型具体为：（Cg，Cb，Cr为肤色二值图）Among them, the skin color model is specifically: (Cg, Cb, Cr are the skin color binary image)

Cr∈[260-Cg,280-Cg]Cr∈[260-Cg,280-Cg]

Cg∈[85,135]Cg∈[85,135]

即满足肤色模型的判定为人脸区域，不满足肤色模型的为非人脸区域。That is, the judgment that satisfies the skin color model is a face area, and the judgment that does not meet the skin color model is a non-face area.

105：根据人脸区域进行眼部粗定位，先粗略选取脸部图像的1/2-7/8处为眼部区域；再通过混合积分投影函数确定人眼纵坐标位置，以获取人眼精确边界；105: Carry out rough positioning of the eyes according to the face area, first roughly select 1/2-7/8 of the face image as the eye area; then determine the ordinate position of the human eye through the mixed integral projection function to obtain the precise boundary;

其中，混合积分投影函数具体为：Among them, the mixed integral projection function is specifically:

设在区间[y1,y2]内的水平混合投影函数为H(y)，则其表达式为：Assuming that the horizontal mixed projection function in the interval [y1,y2] is H(y), its expression is:

H_(y)=0.4×（1-M′_(y)）+0.6×D′_(y)H_(y) =0.4×(1-M′_(y) )+0.6×D′_(y)

${M m}_{((y the y))}^{' '} = = \frac{{M m}_{((y the y))} - - min min (({M m}_{((y the y))}))}{max max (({M m}_{((y the y))})) - - min min (({M m}_{((y the y))}))}$ ${M m}_{((y the y))} = = \frac{11}{{x x}_{22} - - {x x}_{11}} {&Integral; &Integral;}_{{x x}_{11}}^{{x x}_{22}} I I ((x x,, y the y)) dx dx$

${D D.}_{((y the y))}^{' '} = = \frac{{D D.}_{((y the y))} - - min min (({D D.}_{((y the y))}))}{max max (({M m}_{((y the y))})) - - min min (({M m}_{((y the y))}))}$ ${D D.}_{((y the y))} = = {Σ Σ}_{{x x}_{i i} = = {x x}_{11}}^{{x x}_{22}} | | {I I}_{((x x,, y the y))} - - {I I}_{((x x - - 11,, y the y))} | |$

其中，参数0.4和0.6根据实际计算效果指定，具体实现时根据实际应用中的需要进行设定。Among them, the parameters 0.4 and 0.6 are specified according to the actual calculation effect, and are set according to the needs of the actual application during the specific implementation.

106：对人眼精确边界进行窗口扫描，根据经验值确定一个和眼球大小相近的框对人眼精确边界进行扫描，选取灰度值和最小的窗口为瞳孔，并将窗口中心作为瞳孔中心；106: Perform window scanning on the precise boundary of the human eye, determine a frame similar to the size of the eyeball to scan the precise boundary of the human eye according to empirical values, select the window with the gray value and the smallest value as the pupil, and use the center of the window as the pupil center;

其中，对人眼精确边界进行扫描时，可采用步长为1像素，从左到右、从上到下进行扫描，直至覆盖整个人眼精确边界。具体实现时，还可以采用其他的扫描顺序，只要覆盖整个扫描区域即可。Wherein, when scanning the precise boundary of the human eye, a step size of 1 pixel may be used to scan from left to right and from top to bottom until the entire precise boundary of the human eye is covered. During specific implementation, other scanning sequences may also be used, as long as the entire scanning area is covered.

107：以获得的瞳孔中心为基准，截取包含内眼角的内眼角窗口并对内眼角窗口进行灰度等级拉伸预处理，然后利用Susan算子及角点检测算子，在内眼角窗口中提取候选内眼角点，最后筛选出正确内眼角点坐标；107: Based on the obtained pupil center, intercept the inner eye corner window including the inner eye corner and perform grayscale stretching preprocessing on the inner eye corner window, and then use the Susan operator and corner detection operator to extract the inner eye corner window Candidate inner corner points, and finally select the correct inner corner point coordinates;

其中，该步骤具体为：Among them, this step is specifically:

1）截取包含内眼角的内眼角窗口并对内眼角窗口进行灰度等级拉伸预处理；1) Intercept the inner corner window including the inner corner of the eye and perform grayscale stretching preprocessing on the inner corner window;

设眼睛矩形区域尺寸为M×N(M行N列)，瞳孔中心的坐标为(x0，y0)，瞳孔半径r=M/8，设定任意参数fixY=N/3。Let the size of the rectangular area of the eye be M×N (M rows and N columns), the coordinates of the center of the pupil are (x0, y0), the pupil radius r=M/8, and set any parameter fixY=N/3.

以左眼为例，结合眼睛先验规则设定内眼角窗口的上边界top=y0+fixY；下边界bottom=y0-fixY；左边界left=x0+r-1；右边界right=left+N/3。Taking the left eye as an example, set the upper boundary top=y0+fixY of the inner corner window in combination with the prior rules of the eye; the lower boundary bottom=y0-fixY; the left boundary left=x0+r-1; the right boundary right=left+N /3.

同理，可设定右眼内眼角窗口如下：top=y0+fixY；bottom=y0-fixY；left=right-N/3；right=x0-r+1。Similarly, the inner corner window of the right eye can be set as follows: top=y0+fixY; bottom=y0-fixY; left=right-N/3; right=x0-r+1.

眼睛移动会造成眼角亮度变化，影响Susan算子的执行效果，为此对眼角窗口进行灰度等级拉伸变换，公式如下所示The movement of the eyes will cause the brightness of the corners of the eyes to change, which will affect the execution effect of the Susan operator. To this end, the grayscale stretching transformation is performed on the corners of the eyes. The formula is as follows

$g g ((x x,, y the y)) = = \frac{{b b}^{' '} - - {a a}^{' '}}{b b - - a a} [[f f ((x x,, y the y)) - - a a]] + + {a a}^{' '}$

其中，f(x,y)为原灰度图像，其灰度范围为[a,b]；灰度等级拉伸后图像g(x,y)的灰度范围为本文中取a＝10，b＝140，可得到较好的效果。Among them, f(x,y) is the original grayscale image, and its grayscale range is [a,b]; the grayscale range of the image g(x,y) after grayscale stretching is In this paper, take a=10, b=140, better results can be obtained.

2）确定Susan算法中灰度亮度差门限t和非极大值抑制门限g；2) Determine the gray level brightness difference threshold t and the non-maximum value suppression threshold g in the Susan algorithm;

利用均方差来确定灰度亮度差门限t，针对不同光照强度的图像，t值可进行自适应调整。假设I(x,y)表示灰度等级拉伸后点错误！未找到引用源。处像素的灰度值，均值μ与方差σ²的数学表述如下所示：The mean square error is used to determine the gray level brightness difference threshold t, and the t value can be adaptively adjusted for images with different light intensities. Assuming that I(x,y) indicates that the gray scale is stretched, the point is wrong! Reference source not found. The gray value of the pixel at , the mathematical expression of mean μ and variance σ² is as follows:

$μ μ = = \frac{11}{MN MN} {Σ Σ}_{x x = = 00}^{M m - - 11} {Σ Σ}_{y the y = = 00}^{N N - - 11} I I ((x x,, y the y))$

${σ σ}^{22} = = \frac{11}{MN MN} {Σ Σ}_{x x = = 00}^{M m - - 11} {Σ Σ}_{y the y = = 00}^{N N - - 11} {[[I I ((x x,, y the y)) - - μ μ]]}^{22}$

其中，M为图像高度，N为图像宽度，单位均为像素。本文中设定t＝σ，

n_max是n(x₀)所能达到的最大值，本文选取圆形模板大小为7×7，即n_max=37。然后利用Susan算子检测内眼角窗口的边缘图（SA），然后利用内眼角检测算子（CF）对内眼角窗口边缘图进行卷积运算，取最大值处作为眼角的候选位置。Among them, M is the image height, N is the image width, and the unit is pixel. In this paper, we set t=σ,

n_max is the maximum value that n(x₀ ) can achieve. In this paper, the size of the circular template is selected as 7×7, that is, n_max =37. Then use the Susan operator to detect the edge map (SA) of the inner corner window, and then use the inner corner detection operator (CF) to perform convolution operation on the edge map of the inner corner window, and take the maximum value as the candidate position of the eye corner.

内眼角检测算子具体为：The inner corner detection operator is specifically:

$[\begin{matrix} - - 11 & - - 11 & 11 & 11 & - - 11 & - - 11 \\ - - 11 & - - 11 & - - 11 & 11 & - - & - - 11 \\ - - 11 & - - 11 & - - 11 & - - 11 & 11 & 11 \\ 11 & 11 & 11 & 11 & 11 & 11 \end{matrix}] [\begin{matrix} - - 11 & - - 11 & 11 & 11 & - - 11 & - - 11 \\ - - 11 & 11 & 11 & - - 11 & - - 11 & - - 11 \\ 11 & 11 & - - 11 & - - 11 & - - 11 & 11 \\ 11 & 11 & 11 & 11 & 11 & 11 \end{matrix}]$

a.左眼角检测算子 b.右眼角检测算子a. Left eye corner detection operator b. Right eye corner detection operator

卷积计算公式为： $Corners = \max_{(x, y) &Element; SA} (SA &CircleTimes; CF)$ The convolution calculation formula is: $Corners = \max_{(x, the y) &Element; SA} (SA &CircleTimes; CF)$

最后，通过内眼角点提取算法得出内眼角点坐标。Finally, the inner corner point coordinates are obtained through the inner corner point extraction algorithm.

正确内眼角点提取算法具体描述为：The correct inner corner point extraction algorithm is specifically described as:

由于由内眼角定位算子定位的内眼角候选点通常多于两个，为此根据内眼角在眼角窗口中的位置特征筛选正确内眼角点，具体实现方法如下：Since there are usually more than two candidates for the inner corner of the eye located by the inner corner of the eye positioning operator, the correct inner corner of the eye point is selected according to the position characteristics of the inner corner of the eye in the corner of the eye window. The specific implementation method is as follows:

（1）如果只有一个候选角点，则候选角点即为正确内眼角点；(1) If there is only one candidate corner point, the candidate corner point is the correct inner corner point;

（2）如果有两个候选角点，则选择距瞳孔中心最远的角点为正确内眼角点；(2) If there are two candidate corner points, select the corner point farthest from the pupil center as the correct inner corner point;

（3）如果有三个或三个以上候选角点，则根据如下算法进行筛选：(3) If there are three or more candidate corner points, filter according to the following algorithm:

C_x=mean(T_x) C_y=mean(T_y)C_x =mean(T_x ) C_y =mean(T_y )

其中，S为候选角点集合，X_max为S中所有的点横坐标的最大值，Y_min为S中所有的点纵坐标的最小值，T为S中与点（X_max，Y_min）的横纵坐标均相差不大于5个像素的点的集合，点（C_x，C_y）即为所选正确内眼角点坐标。Among them, S is the set of candidate corner points, X_max is the maximum value of the abscissa of all points in S, Y_min is the minimum value of the ordinate of all points in S, and T is the point in S (X_max , Y_min ) A set of points whose horizontal and vertical coordinates differ by no more than 5 pixels, and the point (C_x , C_y ) is the correct coordinate of the selected inner corner of the eye.

108：将瞳孔中心的坐标和正确内眼角点坐标传入PC机的视线估计模型中，确定视线方向。108: Transfer the coordinates of the center of the pupil and the correct coordinates of the inner corner of the eye to the line-of-sight estimation model of the PC, and determine the line-of-sight direction.

该视线估计模型基于如下两个假设：把眼球前面一侧看做平面而非球面，即无论用户在看哪，瞳孔中心的空间位置始终在一个平面上；且头部保持不动。The line of sight estimation model is based on the following two assumptions: the front side of the eyeball is regarded as a plane rather than a sphere, that is, no matter where the user is looking, the spatial position of the pupil center is always on a plane; and the head remains still.

视线估计模型如图1所示，O是眼球中心，N为相机小孔成像中心点，S是屏幕上的注视点，E为瞳孔中心（所在平面为眼球近似平面），P为照片中的瞳孔中心，计算公式如下：The line-of-sight estimation model is shown in Figure 1. O is the center of the eyeball, N is the imaging center of the camera pinhole, S is the gaze point on the screen, E is the center of the pupil (where the plane is the approximate plane of the eyeball), and P is the pupil in the photo Center, the calculation formula is as follows:

(x_s,y_s)为屏幕注视点坐标，(x_p,y_p)为照片中的瞳孔坐标，H为屏幕和照片投影平面间的矩阵，p为缩放系数，求解出H，便可根据瞳孔在照片中的位置计算出注视点在屏幕上的位置。(x_s , y_s ) is the coordinates of the gaze point on the screen, (x_p , y_p ) is the pupil coordinates in the photo, H is the matrix between the screen and the projection plane of the photo, p is the zoom factor, and after solving H, we can use The position of the pupil in the photo calculates the position of the gaze point on the screen.

矩阵H计算方法如下：计算矩阵H需要几对坐标数据，即用户要依次注视屏幕上几个确定坐标的点，每个点拍摄一张照片，然后从照片中提取出瞳孔中心的坐标。这个过程称为标定。H是一个3×3的矩阵，由于缩放系数p的存在，该矩阵有8个独立的元素，为减小误差，取9个点的坐标，采用一种最大似然法来估计。The calculation method of matrix H is as follows: the calculation of matrix H requires several pairs of coordinate data, that is, the user needs to look at several points with fixed coordinates on the screen in turn, take a photo for each point, and then extract the coordinates of the pupil center from the photo. This process is called calibration. H is a 3×3 matrix. Due to the existence of the scaling factor p, the matrix has 8 independent elements. In order to reduce the error, the coordinates of 9 points are taken and estimated by a maximum likelihood method.

设M_i＝(x_pi，y_pi)^T，m_i＝(x_si，y_si)^T，i=1to 9为照片上和屏幕上的对应坐标，H的9个元素用h＝(H₁₁，H₁₂，H₁₃，H₂₁,H₂₂,H₂₃,H₃₁，H₃₂，H₃₃)^T表示，设数据的误差平均值为0协方差矩阵为则取最大似然目标函数为：Let M_i =(x_pi , y_pi )^T , m_i =(x_si , y_si )^T , i=1to 9 are the corresponding coordinates on the photo and on the screen, and the nine elements of H are represented by h=(H₁₁ , H₁₂ , H₁₃ , H₂₁ , H₂₂ , H₂₃ , H₃₁ , H₃₂ , H₃₃ )^T indicates that the mean value of the data error is 0 and the covariance matrix is Then take the maximum likelihood objective function as:

其中 ${\hat{m}}_{i} = \frac{1}{H_{31} x_{pi} + H_{32} y_{pi} + H_{33}} [\begin{matrix} H_{11} x_{pi} + H_{12} y_{pi} + H_{13} \\ H_{31} x_{pi} + H_{32} y_{pi} + H_{33} \end{matrix}] .$ 由于每个点的取样为独立的，实际计算中可取

I为单位矩阵。则上述问题实际上是一个非线性最小二乘法问题，令

最小即可求解出H。当用户头部移动时，则计算出的H矩阵将不再有效。in

{\hat{m}}_{i} = \frac{1}{h_{31} x_{p} + h_{32} {the y}_{p} + h_{33}} [\begin{matrix} h_{11} x_{p} + h_{12} {the y}_{p} + h_{13} \\ h_{31} x_{p} + h_{32} {the y}_{p} + h_{33} \end{matrix}] .

Since the sampling of each point is independent, the actual calculation can take

I is the identity matrix. Then the above problem is actually a non-linear least squares problem, let

H can be solved for the minimum. When the user's head moves, the calculated H matrix will no longer be valid.

即，通过步骤101-步骤108实现了对视线方向的确定，缩短了检测时间。That is, throughsteps 101 to 108, the line of sight direction is determined, and the detection time is shortened.

为了验证视线估计模型的准确度，本方法进行了实验验证。实验设备包括一个2048×1536像素的相机以及一台14英寸，分辨率1280×800的显示器。相机的内参数矩阵可以通过用户依次注视屏幕上几个确定坐标的点来计算，相机内参数矩阵为：In order to verify the accuracy of the line-of-sight estimation model, this method is verified experimentally. The experimental equipment includes a 2048×1536 pixel camera and a 14-inch monitor with a resolution of 1280×800. The internal parameter matrix of the camera can be calculated by the user looking at several points with fixed coordinates on the screen in turn. The internal parameter matrix of the camera is:

$A A = = [\begin{matrix} 19421942 & 00 & 10131013 \\ 00 & 19481948 & 770770 \\ 00 & 00 & 11 \end{matrix}]$

经过一组9个点的标定，本次试验测试了8组每组16个点，共128个点的计算，其中头部移动至4个不同位置，结果不到一厘米的平均误差基本令人满意，可以更好的满足于HCI设备的应用。After a set of 9-point calibration, this test tested 8 groups of 16 points in each group, a total of 128 points for calculation, in which the head moved to 4 different positions, and the average error of less than one centimeter was basically impressive. Satisfied, can be better satisfied with the application of HCI equipment.

一种人眼视线估计装置，参见图3，包括：A device for estimating line of sight of human eyes, referring to Fig. 3, comprising:

眼部特征点检测模块，用于根据人脸区域进行眼部粗定位，选取脸部图像的1/2-7/8处为眼部区域；通过混合积分投影函数确定人眼坐标位置，获取人眼精确边界；对人眼精确边界进行窗口扫描，根据经验值确定一个和眼球大小相近的框对人眼精确边界进行扫描，选取灰度值和最小的窗口为瞳孔，并将窗口中心作为瞳孔中心，并获取正确内眼角点坐标；将瞳孔中心的坐标和正确内眼角点坐标传入PC机的视线估计模型中，确定视线方向。The eye feature point detection module is used for rough positioning of the eyes according to the face area, and selects 1/2-7/8 of the face image as the eye area; determines the coordinate position of the human eye through the mixed integral projection function, and obtains the Precise boundary of the eye; scan the window of the precise boundary of the human eye, determine a frame similar to the size of the eyeball to scan the precise boundary of the human eye according to empirical values, select the window with the gray value and the smallest value as the pupil, and use the center of the window as the pupil center , and obtain the correct coordinates of the inner corner of the eye; transfer the coordinates of the center of the pupil and the correct coordinates of the inner corner of the eye into the line of sight estimation model of the PC, and determine the direction of the line of sight.

其中，本装置使用高分辨率的摄像头，最大支持2,592×1,944分辨率的图像，得到的RGB位图为12位，结合实际情况，实现人脸检测算法时使用640×480规格的图像，在实时性改进阶段，可根据实际应用情况调整参数，将初始图像的规格调整至400×300。眼部特征点检测使用的是1600×1200的图像，位图使用8位格式。Among them, this device uses a high-resolution camera that supports images with a maximum resolution of 2,592×1,944, and the obtained RGB bitmap is 12 bits. Combined with the actual situation, an image with a specification of 640×480 is used when implementing the face detection algorithm. In the permanent improvement stage, the parameters can be adjusted according to the actual application situation, and the size of the initial image can be adjusted to 400×300. The eye feature point detection uses a 1600×1200 image, and the bitmap uses an 8-bit format.

本装置利用硬件实现按位操作效率高的特点，在设计位图转换模块硬件结构时使用一种按位组合生成查表地址的方法--分布式查表算法。This device utilizes the feature of high efficiency of bit-by-bit operations realized by hardware, and uses a method of bit-by-bit combination to generate look-up table addresses—distributed look-up table algorithm when designing the hardware structure of the bitmap conversion module.

位图转灰度图计算公式为：Y=0.2990×R+0.5870×G+0.1140×BThe formula for converting bitmap to grayscale is: Y=0.2990×R+0.5870×G+0.1140×B

Y分量可表示为如下形式：The Y component can be expressed as follows:

Y=a₀X₀+a₁X₁+a₂X₂+a₃Y=a₀ X₀ +a₁ X₁ +a₂ X₂ +a₃

其中a_i(i=0,1,2)表示转换系数，a₃为常数(0.5，四舍五入)；X_i(i=0,1,2)分别表示R、G、B三个色彩分量，将其用二进制形式表示可得：Among them, a_i (i=0,1,2) represents the conversion coefficient, a₃ is a constant (0.5, rounded); Xi (_i =0,1,2) represents the three color components of R, G, and B respectively, and the It can be expressed in binary form as:

$Y Y + + {Σ Σ}_{i i = = 00}^{22} {a a}_{i i} (({Σ Σ}_{m m = = 00}^{77} {X x}_{i i,, m m} \times \times 22^{m m})) + + {a a}_{33} = = {Σ Σ}_{m m = = 00}^{77} (({Σ Σ}_{i i = = 00}^{22} {a a}_{i i} {X x}_{i i,, m m} \times \times 22^{m m})) + + {a a}_{33} = = {Σ Σ}_{m m = = 00}^{77} {PS P.S.}_{m m} \times \times 22^{m m} + + {a a}_{33}$

其中

称为部分积由于X_i,m只能取0或1，部分积只有8种可能的取值，可以通过预先计算将这些值存储在寄存器中，在进行转换时，构造向量(X_0，m,X_1，m’X_2,m)作为表地址以得到相应的部分积。为了简化表结构并节省存储空间，在查找表中仅存储PS_m×2⁷，以满足最高二进制权重位(MostSignificant Bit,MSB)的精度。其余权重位的部分积运算可在查表后通过右移实现除2操作。查找表内容如下表所示。in

It is called a partial product because X_{i, m} can only take 0 or 1, and the partial product has only 8 possible values, which can be stored in registers through pre-calculation. When converting, construct a vector (X_{0, m} , X_{1, m} 'X_{2, m} ) as the address of the table to obtain the corresponding partial product. In order to simplify the table structure and save storage space, only PS_m ×2⁷ is stored in the lookup table to meet the precision of the most significant bit (MostSignificant Bit, MSB). The partial product operation of the remaining weight bits can be divided by 2 by right shifting after looking up the table. The content of the lookup table is shown in the table below.

通过上述嵌入式软硬件协同设计方法，可以大大提高位图转换的计算速率，该部分的硬件设计如图4所示。Through the above-mentioned embedded software-hardware collaborative design method, the calculation rate of bitmap conversion can be greatly improved. The hardware design of this part is shown in Figure 4.

肤色二值图模块是由Cg-Cb和Cg-Cr两组分割器的结果进行逻辑与得到的。Cg-Cb空间的肤色区域可以表示为圆形区域，其方程形式如下The skin color binary image module is obtained by logically ANDing the results of the Cg-Cb and Cg-Cr two groups of segmenters. The skin color area in Cg-Cb space can be expressed as a circular area, and its equation is as follows

$\frac{{((Cg C g - - 107107))}^{22} + + {((Cb Cb - - 110110))}^{22}}{{12.25 12.25}^{22}} \leq \leq 11 - - - - - - ((11))$

Cg-Cr空间的肤色区域可表示为条带型，数学表达式如下所示：The skin color area in Cg-Cr space can be expressed as a strip type, and the mathematical expression is as follows:

Cr∈[260-Cg,280-Cg] （2）Cr∈[260-Cg,280-Cg] (2)

Cg∈[85,135] （3）Cg ∈ [85,135] (3)

如果Cb、Cg和Cr满足上述三个公式，则可判定当前像素属于肤色区域，否则判定为非肤色区域。由RGB色彩空间到CbCgCr空间的转换与到灰度值空间的转换相似，也是使用按位组合查表的方法，经过三级流水线后，得到Cb、Cg、Cr分量，然后进行肤色判断。If Cb, Cg and Cr satisfy the above three formulas, it can be determined that the current pixel belongs to the skin color area, otherwise it is determined as a non-skin color area. The conversion from the RGB color space to the CbCgCr space is similar to the conversion to the gray value space. It also uses the method of bitwise combination table lookup. After three-stage pipelines, the Cb, Cg, and Cr components are obtained, and then the skin color is judged.

式(1)的判定为一个圆形区域，如果单独计算这个公式，则需要用到乘法和浮点除法，计算量大，资源消耗高，本装置使用查表的方法实现，首先对Cg、Cb的坐标进行变换，将其变换到坐标原点的位置，然后判断该坐标是否坐落在圆内，圆的半径为12.5，为了进一步缩小存储空间，将判定过程集中到坐标系的第一象限内。表的大小为13×13Bits，每一位的值表示相应位置上的坐标点是否坐落在圆内。The determination of formula (1) is a circular area. If this formula is calculated separately, multiplication and floating-point division are required, which requires a large amount of calculation and high resource consumption. This device uses a look-up table method to realize it. Transform the coordinates to the position of the origin of the coordinates, and then judge whether the coordinates are located in a circle. The radius of the circle is 12.5. In order to further reduce the storage space, the judgment process is concentrated in the first quadrant of the coordinate system. The size of the table is 13×13Bits, and the value of each bit indicates whether the coordinate point at the corresponding position is located in the circle.

该部分的硬件设计如图5所示，首先将Cg、Cb分别与107、110做绝对值减法，然后通过查表的方法判定该点是否在圆内，以此判定Cg-Cb空间是否符合上述公式。Cg-Cr空间的判定相对比较简单，只需判定式(2)。经过两级流水线可以完成该过程，结果用1bit来表示，32个像素点的结果进行拼接，存储到缓存中，在适当的时候写回到DDR存储器，供后续模块使用。The hardware design of this part is shown in Figure 5. First, the absolute value subtraction of Cg and Cb from 107 and 110 is performed respectively, and then the method of looking up the table is used to determine whether the point is in the circle, so as to determine whether the Cg-Cb space conforms to the above formula. The determination of Cg-Cr space is relatively simple, only need to determine the formula (2). This process can be completed through a two-stage pipeline, and the result is represented by 1 bit. The results of 32 pixels are spliced, stored in the cache, and written back to the DDR memory at an appropriate time for use by subsequent modules.

积分图模块在完成一帧图像的获取后，开始两种积分图的计算，用以提高Haar特征值的计算速度，计算公式如下所示，其中image为灰度图：After the integral image module completes the acquisition of one frame of image, it starts the calculation of two integral images to improve the calculation speed of Haar eigenvalues. The calculation formula is as follows, where image is a grayscale image:

积分图sum(X,Y)=∑_{x≤X，y≤Y}image(x,y)Integral image sum(X,Y)=∑_{x≤X, y≤Y} image(x,y)

平方积分图sqrsum(X,Y)=∑_{x≤X，y≤Y}image(x,y)²Square integral graph sqrsum(X,Y)=∑_{x≤X, y≤Y} image(x,y)²

计算所需的数据--灰度图，通过本模块的总线接口部分获取，设置两块双口RAM：RAM_L和RAM_ L_sq，分别存储上一行的积分图的值和平方积分图的值；设置两个左累加寄存器，分别存储与当前像素同行的左侧所有像素(包括当前像素)灰度值和灰度值平方的累加和。当前位置积分图的值为当前像素的灰度值、左累加寄存器的值和上一行对应位置的值三部分求和。The data required for calculation—gray scale image, is obtained through the bus interface part of this module, and two dual-port RAMs are set: RAM_L and RAM_L_sq, which store the value of the integral image and the value of the square integral image of the previous row respectively; A left accumulating register stores the gray value and the sum of the gray value squares of all pixels on the left (including the current pixel) on the same line as the current pixel. The value of the integral map at the current position is the sum of the gray value of the current pixel, the value of the left accumulation register and the value of the corresponding position in the previous line.

两种积分图的计算使用四级流水线实现，首先，为了保持两种积分图计算的同步，积分图计算部分延迟一个周期，在此周期内进行灰度值的平方运算。在第二级中，将当前像素的值与左累加寄存器做加法，结果保存到左累加寄存器中，并把相应位置的地址送到行缓存的地址寄存器中。在第三级中，将读出的数据(上一行对应位置的值)和左累加寄存器做加法。在第四级中，将当前位置的计算结果输出并写回到行缓存中，以供下一行计算使用。该流水线结构在运行过程中，每个周期可以计算一个像素点的两种积分图。计算结果保存到窗口缓存和计算缓存中，供应AdaBoost算法使用，流水线结构如图6所示。The calculation of the two integral images is implemented using a four-stage pipeline. First, in order to maintain the synchronization of the two integral image calculations, the integral image calculation part is delayed by one cycle, and the gray value square operation is performed within this cycle. In the second stage, the value of the current pixel is added to the left accumulation register, the result is stored in the left accumulation register, and the address of the corresponding position is sent to the address register of the line buffer. In the third stage, the read data (the value of the corresponding position in the previous row) is added to the left accumulation register. In the fourth level, the calculation result of the current position is output and written back to the line buffer for the next line calculation. During the operation of the pipeline structure, two integral maps of one pixel can be calculated in each cycle. The calculation results are saved in the window cache and the calculation cache for use by the AdaBoost algorithm. The pipeline structure is shown in Figure 6.

因为本发明采用图像缩放的策略，所有Haar特征值的计算都局限在一个扫描窗口(21×21)内部进行，所以对整幅图像的两种积分图的计算结果分别对2¹⁷和2²⁵进行取模，只保留二进制形式的低17位和25位，否则需要保存27位和35位，这可以在很大程度上节省存储空间，另外，由于取模的原因，在后面的计算过程中需要进行校正，否则会出错。Because the present invention adopts the strategy of image scaling, the calculation of all Haar eigenvalues is limited to a scan window (21×21), so the calculation results of the two integral images of the entire image are respectively 2¹⁷ and 2²⁵ Modulo, only keep the lower 17 bits and 25 bits in binary form, otherwise you need to save 27 bits and 35 bits, which can save storage space to a large extent. In addition, due to the modulus, you need to Make corrections, otherwise errors will occur.

其中，弱分类器计算模块采用了三级并行的硬件架构：Among them, the weak classifier calculation module adopts a three-level parallel hardware architecture:

（1）窗口间任务级并行：四个待检窗口同时进行扫描，其中第一个窗口设定流水线切分时序和读取弱分类器信息，其余三个窗口在时序上与第一个窗口对齐，并且共享第一个窗口读出的弱分类器信息。存储结构示意图如图7所示。四个窗口的判断时间取决于计算强分类器级数最多的窗口，当四个窗口都做出判断后，本次扫描结束，同时更新四列数据以计算下一组窗口。(1) Task-level parallelism between windows: four windows to be inspected are scanned at the same time, the first window sets the timing of pipeline segmentation and reads weak classifier information, and the remaining three windows are aligned with the first window in timing , and share the weak classifier information read out by the first window. A schematic diagram of the storage structure is shown in Figure 7. The judgment time of the four windows depends on the window with the largest number of strong classifiers. When all four windows make judgments, the scan ends, and the four columns of data are updated to calculate the next set of windows.

（2）窗口内任务级并行：每个窗口内部有三条流水线同时计算弱分类器，根据两种弱分类器(矩形个数为划分依据)的个数，由两条流水线计算两个矩形的弱分类器，由第三条流水线计算三个矩形的弱分类器。(2) Task-level parallelism in the window: There are three pipelines in each window to calculate the weak classifiers at the same time. According to the number of two weak classifiers (the number of rectangles is the basis for division), two pipelines calculate the weak classifiers of two rectangles. Classifier, the weak classifier of three rectangles is calculated by the third pipeline.

（3）数据级并行：单条流水线结构划分为7级，该结构每个周期可以计算一个弱分类器。(3) Data-level parallelism: A single pipeline structure is divided into 7 levels, which can calculate a weak classifier per cycle.

其中，Nios II核处理器是用户可配置的通用32位RISC软核微处理器，II核处理器负责完成总线上各个模块间的调度，各个模块与Nios II核处理器进行通信，此外Nios II核处理器完成少量的计算任务，比如最终的人脸区域相似性合并，这部分计算量不大，但是过程复杂，使用硬件实现时，逻辑功能设计比较困难，而且不能发挥硬件设计的优势，因此本系统使用软件的方式实现。Among them, the Nios II core processor is a user-configurable general-purpose 32-bit RISC soft-core microprocessor. The II core processor is responsible for completing the scheduling between modules on the bus. Each module communicates with the Nios II core processor. In addition, the Nios II core processor The core processor completes a small amount of calculation tasks, such as the final similarity merging of face regions. This part of the calculation is not large, but the process is complicated. When using hardware, the logic function design is more difficult, and the advantages of hardware design cannot be used. Therefore, The system is realized by means of software.

本领域技术人员可以理解附图只是一个优选实施例的示意图，上述本发明实施例序号仅仅为了描述，不代表实施例的优劣。Those skilled in the art can understand that the accompanying drawing is only a schematic diagram of a preferred embodiment, and the serial numbers of the above-mentioned embodiments of the present invention are for description only, and do not represent the advantages and disadvantages of the embodiments.

以上所述仅为本发明的较佳实施例，并不用以限制本发明，凡在本发明的精神和原则之内，所作的任何修改、等同替换、改进等，均应包含在本发明的保护范围之内。The above descriptions are only preferred embodiments of the present invention, and are not intended to limit the present invention. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall be included in the protection of the present invention. within range.