KR20040042500A

Info

Publication number: KR20040042500A
Application number: KR1020020070810A
Authority: KR
Inventors: 이진수; 김현준; 유재신; 변혜란; 홍은혜
Original assignee: 엘지전자 주식회사
Priority date: 2002-11-14
Filing date: 2002-11-14
Publication date: 2004-05-20

Abstract

PURPOSE: A method and a device for detecting a face are provided to precisely detect a face area by using PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis), which represent the complex pattern of a face with a few principal component values. CONSTITUTION: A preprocessor (602) preprocesses an image input from a camera (601). An image size changer (603) changes the size of the preprocessed image. A PCA-LDA converter (604) performs the PCA-LDA conversion on each resized image. A template matching engine (605) detects the most similar target area by matching the PCA-LDA converted image against a template. A template storage (606) stores the template prepared in advance for the template matching. An image size adjuster (607) adjusts the size of the image area extracted as the template matching result.

Description

Face detection method and apparatus {FACE DETECTION BASED ON PCA-LDA}

The present invention relates to a method and apparatus for detecting a face region in video containing human faces, such as moving pictures, still pictures, or images, and more particularly to a method and apparatus that detect the face region using principal component analysis (PCA) and linear discriminant analysis (LDA), which can represent the complex pattern of a face with a few principal component values.

Techniques for automatically detecting and tracking faces in video are widely used in video conferencing, security control systems, content-based video retrieval, and similar applications.

The first task in face detection and tracking is to determine whether a face is present in the image and, if so, to find its exact location. However, accurately detecting faces in still images or video is far from easy. Faces vary in size and position; their apparent angle changes with the camera angle; and pose changes (frontal face, face turned 45 degrees, profile), occlusion, changes in facial expression, and changes in lighting all contribute to the difficulty.

Face detection algorithms are generally grouped into several categories. The first is the knowledge-based approach, which relies on rules derived from knowledge of the basic human face. It applies a few simple rules, such as the relative distances or relationships between the elements that make up a face. Its drawbacks are that knowledge about the human face is hard to convert into precisely defined rules and that fixed rules are hard to apply across varied face poses.

The second is the feature-based approach, which looks for invariant features of the face (facial elements such as the eyes, nose, and mouth; texture; skin color). Among the features that indicate a face, skin color is less sensitive to translation, rotation, and scale changes of the face, and so it has been the most widely used in recent work. However, skin color models become very ineffective when the light spectrum changes severely, so several recent hybrid methods combine color with shape, or with motion information for tracking.

The third, the template-based approach, builds several standard patterns of a face, stores them for detection, and compares them one by one with the image inside a search window. This approach has the advantage that the detection algorithm is relatively simple, but it is sensitive to face rotation, scale changes, varied lighting changes, and noise.

Finally, there are neural network based approaches. These sub-sample different regions of the image, train a neural network on face and non-face examples, and then search for faces. They show satisfactory detection performance, but they are computationally expensive, and while they perform well on frontal faces and on profile faces turned horizontally, they do not give good results for faces rotated at other angles.

An object of the present invention is to provide a face detection method and apparatus that can accurately detect a face region using principal component analysis (PCA) and linear discriminant analysis (LDA), which can represent the complex pattern of a face with a few principal component values.

A further object of the present invention is to provide a face detection method and apparatus that remove sources of face detection error caused by external factors such as lighting through a preprocessing process, generate a template for accurate face detection through a template generation process, obtain the key information for face detection through a learning process, detect the desired face region through a face region detection process, and finally detect a face of the correct size through a face size change process.

Fig. 1 is a view for explaining the preprocessing process in the present invention.

Fig. 2 is a view for explaining the face template generation process in the present invention.

Fig. 3 is a view for explaining face size conversion and the mosaic image in the present invention.

Fig. 4 is a view showing examples of face region detection results in the present invention.

Fig. 5 is a view for explaining the change of the face detection region in the present invention.

Fig. 6 is a block diagram of the face detection apparatus of the present invention.

The present invention detects a face region using principal component analysis and linear discriminant analysis, which can represent the complex pattern of a face with a few principal component values. Because the method relies on the pattern of the face, it is effective for images with camera color distortion or complex backgrounds.

The present invention comprises a preprocessing process for the input video/image, a template generation process for image detection, a process of detecting a face region using the preprocessed image and the template, a learning process for template generation and face detection, and a face size change process for estimating the exact face size from the detected face region.

The present invention is a face detection method characterized by comprising: a preprocessing process for correcting the brightness of an original image; a process of detecting a face region by selecting face regions based on their PCA and LDA values while changing the size of the preprocessed original image and matching them against a prepared template; and a process of determining the final face region by changing the size of the detected face region.

The present invention is further characterized in that, for the template matching, the original image is converted into a mosaic image and an average image is obtained to generate the template, and a learning process that obtains the functions for the PCA-LDA conversion is performed when the template is generated.

The present invention is further characterized in that the number of eigenvectors used in the PCA-LDA conversion is applied adaptively according to the mean value of the light component of the image.

The present invention is further characterized in that the preprocessing process is performed on the brightness component of the original image based on a min-max normalization technique or on a histogram equalization technique.

The present invention is further characterized in that the face region detection process fixes the size of the template, reduces the input image to multiple resolutions, and determines as the face region the region for which the difference between the PCA-LDA converted values of the resized image and of the template is smallest.

The present invention is further characterized in that, to detect the final face region, an eye/eyebrow region is extracted and the minimum face size is determined based on the position information of the eyes/eyebrows.

The present invention is also a face detection apparatus characterized by comprising: preprocessing means for correcting the light component of an input image; means for converting the size of the preprocessed input image; means for taking a target area for template matching from the resized input image and applying the PCA-LDA conversion to it; means for determining a face region by matching the converted target area against a template; template storage means that stores the information for the template matching; and means for restoring the determined face region to its original size.

Hereinafter, the face detection method and apparatus of the present invention are described with reference to the accompanying drawings.

1. Preprocessing process

The preprocessing process corrects for lighting conditions and can be expressed as Equation 1 below. Equation 1 expresses the min-max normalization method, a linear transformation that maps the original image into a newly defined data range.

In Equation 1, min₁ and max₁ are the minimum and maximum brightness values of the input image, and min₂ and max₂ are the minimum and maximum brightness values of the new range. For min-max normalization, the present invention treats approximately the top 8.5% of the luminance component of the image as reference white (226 to 255) and, conversely, approximately the bottom 8.5% as reference black (0 to 30). The figure of 8.5% was chosen as the value that gave the best performance in experiments. As a result, every image is given brightness values in the range 31 to 225, regardless of the maximum and minimum brightness of the input image. However, in the exceptional case where the mean value of the light component of the image is 70 or less, histogram equalization is used instead.
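The preprocessing rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation. Equation 1 is not reproduced in this copy of the text, so the sketch assumes the standard min-max mapping y = (x - min₁) / (max₁ - min₁) × (max₂ - min₂) + min₂ with min₂ = 31 and max₂ = 225, and a textbook histogram equalization. It does not model the reference white and reference black percentages. All function names are hypothetical, and the image is a flat list of 8-bit brightness values.

```python
def min_max_normalize(pixels, new_min=31, new_max=225):
    """Linearly map brightness from [min1, max1] to [new_min, new_max]."""
    old_min, old_max = min(pixels), max(pixels)
    if old_max == old_min:
        # Flat image: there is no contrast to stretch, so use the mid-range value.
        return [(new_min + new_max) // 2] * len(pixels)
    scale = (new_max - new_min) / (old_max - old_min)
    return [round((p - old_min) * scale + new_min) for p in pixels]


def equalize_histogram(pixels, levels=256):
    """Textbook histogram equalization for integer brightness values."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        return list(pixels)  # only one brightness level: nothing to equalize
    return [round((cdf[p] - cdf_min) / (total - cdf_min) * (levels - 1))
            for p in pixels]


def preprocess(pixels, dark_mean=70):
    """Use histogram equalization for dark images, min-max normalization otherwise."""
    mean = sum(pixels) / len(pixels)
    if mean <= dark_mean:
        return equalize_histogram(pixels)
    return min_max_normalize(pixels)
```

For example, `preprocess([100, 150, 200])` has a mean brightness of 150, so it takes the min-max branch and stretches the values to span 31 to 225, while `preprocess([10, 10, 20, 30])` falls below the threshold of 70 and is equalized.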

Fig. 1 shows the preprocessing process. 101 is an original image and 102 shows the result of applying histogram equalization to the original image (101); 103 is an original image and 104 shows the result of applying min-max normalization to the original image (103).

As Fig. 1 shows, the preprocessing process removes or minimizes in advance the sources of face detection error that the lighting conditions of the original image would otherwise cause in subsequent steps.

2. Temporary template generation

In the present invention, the temporary template is generated by selecting the image range to be used as the template, preprocessing it, and converting it into a mosaic image.

The template is a 40 × 40 image covering the area from the eyebrows to the lips. The template is preprocessed to emphasize the features of the facial elements regardless of the contrast of the image; this template preprocessing uses histogram equalization.

For template matching, the template is not used as is; a step that converts it into a 4 × 4 mosaic image is added. This compensates for the slowdown that occurs when the pixels of the image and the template are matched one to one.

Fig. 2 shows the template generation process. 201 represents the original image, 202 the 4 × 4 mosaic image, and 203 the average image obtained by applying histogram equalization to the mosaic images. This is the image used as the template for the subsequent face region detection.
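The mosaic step can be sketched as follows. The text does not define the "4 × 4 mosaic" precisely; this sketch assumes it means replacing each tile of 4 × 4 pixels by its mean brightness, so that a 40 × 40 template shrinks to 10 × 10 cells. The function name `to_mosaic` and the list-of-rows image layout are hypothetical.

```python
def to_mosaic(image, block=4):
    """Replace each block x block tile by its mean brightness.

    `image` is a list of equal-length rows. Height and width are assumed
    to be multiples of `block` (a 40 x 40 template gives 10 x 10 cells).
    """
    height, width = len(image), len(image[0])
    mosaic = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            tile = [image[y][x]
                    for y in range(top, top + block)
                    for x in range(left, left + block)]
            row.append(sum(tile) / len(tile))
        mosaic.append(row)
    return mosaic
```

Under this assumption, matching works on 100 cell values instead of 1,600 pixel values per 40 × 40 window, which is consistent with the speed motivation given in the text.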

3. Template generation and learning process

The final template used for face extraction is obtained by applying the PCA and LDA conversions to the temporary template generated above. This requires a learning process using training data, which can be described as the process of obtaining the functions for the PCA and LDA conversions. The learning process is performed when the template is generated; once the template exists, it does not need to be repeated. In other words, at detection time the input image only needs to be converted with the PCA and LDA functions obtained during learning and then compared with the template.
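To make the idea of reusing the learned conversion functions concrete, the sketch below applies an already-learned PCA basis and LDA basis to one mosaic vector. It assumes the conventional linear form, in which the vector is mean-centered, projected onto the PCA axes, and the resulting principal component values are projected onto the LDA axes. The text does not spell out this form, and the names `project`, `mean`, `pca_basis`, and `lda_basis` are hypothetical. Learning the bases is outside the scope of the sketch.

```python
def project(vector, mean, pca_basis, lda_basis):
    """Apply a learned PCA projection followed by a learned LDA projection.

    `pca_basis` is a list of PCA axes, each as long as `vector`.
    `lda_basis` is a list of LDA axes, each as long as `pca_basis`.
    """
    centered = [v - m for v, m in zip(vector, mean)]
    # Principal component (PC) values: one dot product per PCA axis.
    pc_values = [sum(w * c for w, c in zip(axis, centered))
                 for axis in pca_basis]
    # LDA features: one dot product per LDA axis, taken over the PC values.
    return [sum(w * p for w, p in zip(axis, pc_values))
            for axis in lda_basis]
```

The template and every candidate window would be passed through the same function with the same learned arguments, so their outputs can be compared directly.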

In the LDA conversion, the number of classes and the number of samples affect face region detection. Two or four samples per class are known to work well. The present invention uses 2 samples per class and 40 classes, for a total of 80 training samples. In the PCA conversion, the number of eigenvectors also affects face region detection, so the face detection accuracy was computed for different numbers of eigenvectors and the number was set to the value that gave the best performance.

Specifically, for dark images whose mean light component is 70 or less, accuracy is highest when the number of eigenvectors is 40% of all eigenvectors; in other cases, accuracy is highest when it is 30%. The present invention therefore sets the number of eigenvectors according to the mean value of the light component of the image.

Table 1 below shows the accuracy of face region detection as a function of the number of eigenvectors for normal images, and Table 2 shows the same for dark images.

Table 1
Eigenvectors used (% of total)    Accuracy (%)
20                                55
30                                98
40                                78
50                                67

Table 2
Eigenvectors used (% of total)    Accuracy (%)
20                                66
30                                58
40                                66
50                                30

As Table 2 shows, for dark images accuracy is highest (66%) when the number of eigenvectors is 40% of all eigenvectors; in other cases, as Table 1 shows, accuracy is highest (98%) when it is 30%.
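The adaptive choice of eigenvector count reduces to a small rule, sketched here. The threshold of 70 and the 40% and 30% fractions come from the text. The function name, the rounding down, and the lower bound of one eigenvector are assumptions made for the example.

```python
def num_eigenvectors(mean_brightness, total_eigenvectors, dark_mean=70):
    """Pick how many eigenvectors to keep from the image's mean brightness."""
    percent = 40 if mean_brightness <= dark_mean else 30
    # Integer arithmetic rounds down; always keep at least one eigenvector.
    return max(1, total_eigenvectors * percent // 100)
```

With a hypothetical total of 80 eigenvectors, a dark image with mean brightness 60 would keep 32 of them and a normal image with mean brightness 120 would keep 24.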

4. Face region detection

Of the two options for face region detection, the multi-scale method, which varies the size of the template, and the multi-resolution method, which fixes the template at a single size and varies the size of the image, the present invention uses the latter, which needs only one template.

First, to detect the face region, the input image is reduced to 90%, 70%, and 50% of the original size, and 4 × 4 mosaic images are made from it in the same way as for the face template. The images resized to 90%, 70%, and 50% and the template are then PCA converted. The resulting principal component (PC) values are LDA converted, and the region with the smallest difference (mean absolute error, MAE) between the converted values is selected as the face region.
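The selection rule in this step, keeping the candidate whose converted values differ least from the template's, can be sketched as follows. The sketch assumes that the PCA-LDA feature vectors of the template and of each candidate window have already been computed. The names `mean_absolute_error` and `best_match` and the `(label, features)` layout are hypothetical.

```python
def mean_absolute_error(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def best_match(candidates, template_features):
    """Return the (label, features) pair with the smallest MAE to the template.

    `candidates` is an iterable of (label, features) pairs, where `label`
    can carry any bookkeeping such as (scale, x, y) of the window.
    """
    return min(candidates,
               key=lambda item: mean_absolute_error(item[1], template_features))
```

In a full detector the candidates would be every template-sized window taken from the 90%, 70%, and 50% images; the label of the winning window then identifies both the scale and the position of the face.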

Because the image is reduced to 90%, 70%, and 50%, even a 40 × 40 template can find faces up to around 60 × 60 or 80 × 80 in size.
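The arithmetic behind this claim is that a 40 × 40 window found in an image reduced by a factor s covers 40 / s pixels on each side in the original image. The sketch below maps a window back to original coordinates, which is also the kind of restoration attributed later in the text to the image size adjuster (607). The function name and the rounding to whole pixels are assumptions.

```python
def window_in_original(x, y, scale, template_size=40):
    """Map a template-sized window found in a scaled image back to the original.

    (x, y) is the top-left corner in the scaled image and `scale` is the
    reduction factor (0.9, 0.7, or 0.5 in the text). Returns the top-left
    corner and the side length in original-image pixels.
    """
    return (round(x / scale), round(y / scale), round(template_size / scale))
```

For the three scales in the text this gives side lengths of about 44, 57, and 80 pixels, in line with the "around 60 × 60 or 80 × 80" figure quoted above.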

LDA generally outperforms PCA, but PCA performs better when training data is scarce, so the present invention detects faces by applying the LDA conversion to the PC values. Experiments showed a higher face detection rate than the conventional method that uses PCA alone.

Table 3 below compares the PCA conversion alone with the PCA conversion followed by the LDA conversion.

Table 3
Video frames    PCA    PCA + LDA
100             95     85
100             91     81
100             63     88
100             89     99
Average         85     88

Fig. 3 shows the original image and resized versions of it. 301 is the original image, 302 the image reduced to 90%, 303 the image reduced to 70%, and 304 the image reduced to 50%. The input image is reduced in this way, a 4 × 4 mosaic image is made for each size, the resized images and the template are PCA converted, the PC values are LDA converted, and the region with the smallest difference between the converted values is determined to be the face region.

Fig. 4 shows face region detection results obtained through the above process. 401 shows the face region (401a) found in an image that is dark overall, 402 shows the face region (402a) found in an image with color distortion, and 403 shows the face region (403a) found in an image with a relatively complex background.

5. Face size change

For the face region obtained as described above, the size of the smallest face region that fully contains the face is determined.

In other words, the exact face size is estimated from the detected face region. This is done by locating the eye region within the face region found after the PCA conversion. Specifically, the area of the detected face region whose brightness is 20% or less of the brightness component of all pixels is regarded as the face region and is binarized.

Next, dilation, one of the morphology operations, is applied to the binarized region so that the facial components are clearly separated. The remaining parts are then projected, and the head region is removed by removing the parts whose vertical component is larger than the horizontal component and, through labeling, the largest part. If the two regions whose center points have the greatest height are then selected from what remains, only the eyes or eyebrows are left. If both eyebrows or both eyes remain, the face size can be determined from the distance between the two eyebrows. If only one eyebrow and one eye remain, the face size can be determined from the distance between the center points of the eyebrow and the eye.
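The first two steps of this procedure, binarization of dark pixels followed by dilation, can be sketched as follows. The text does not say exactly how the 20% threshold is computed, so the sketch takes the threshold as an explicit argument and leaves that choice to the caller. The 3 × 3 square structuring element is an assumption, and the function names are hypothetical. The projection, labeling, and eye/eyebrow selection steps are not covered.

```python
def binarize(region, threshold):
    """Mark pixels at or below `threshold` (dark pixels) with 1, others with 0."""
    return [[1 if p <= threshold else 0 for p in row] for row in region]


def dilate(mask):
    """Binary dilation with a 3 x 3 square structuring element."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Turn on the pixel and its 8 neighbors, clipped at the border.
                for ny in range(max(0, y - 1), min(height, y + 2)):
                    for nx in range(max(0, x - 1), min(width, x + 2)):
                        out[ny][nx] = 1
    return out
```

Dilation thickens each dark blob by one pixel in every direction, which merges fragments of the same eye or eyebrow before the later projection and labeling steps look at them.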

Fig. 5 shows this process. 501 is the original image, 502 shows the face region (502a) obtained from it through the face detection process, and 503 shows the face region (503a) finally obtained through the face size change process. As shown in Fig. 5, the eye region is located within the face region (502a) first obtained for the original image, and the distance between the two eyes or the two eyebrows is used to draw the smallest rectangle (503a) that contains the entire area that can be judged to be the actual face; that rectangle is determined to be the face region.

Any known eye/eyebrow detection algorithm may be used to detect the eyes or eyebrows in the face region; alternatively, the procedure applied in the present invention can be used: binarization, morphology (dilation), projection, and eye/eyebrow determination.

Fig. 6 shows the face detection apparatus of the present invention, which implements the face detection method described so far.

The apparatus comprises a camera (601) for image acquisition, a preprocessor (602) that preprocesses the acquired input image, an image size changer (603) that changes the size of the preprocessed image, a PCA-LDA converter (604) that performs the PCA and LDA conversions on each resized image, a template matching engine (605) that detects the most similar target area by matching the PCA-LDA converted image against a template, a template storage (606) that holds the template prepared in advance for the template matching, and an image size adjuster (607) that adjusts the size of the face region extracted as the template matching result.

The components from the preprocessor (602) through the image size adjuster (607) are implemented in a terminal that includes the PCA-LDA based face detection means of the present invention; the camera (601) is either built into the terminal or attached through an interface.

When an image is input through image acquisition means such as the camera (601), the preprocessor (602) performs the preprocessing process described above on the input image. The image size changer (603) resizes the preprocessed image to 90%, 70%, and 50%, and target areas for template matching are taken from the resized images. The PCA-LDA converter (604) then applies the PCA and LDA conversions to each target area, using the functions learned in advance when the template was built. For that learning process, see the template generation and learning process described above.

The template matching engine (605) matches each converted target area against the template stored in the template storage (606) and determines the target area judged most similar to be the face region. The image size adjuster (607) then rescales the determined face region to undo the earlier size change, which yields the face region in the original image.

The present invention detects a face region using principal component analysis (PCA) and linear discriminant analysis (LDA). The method does not consider color information at all. Preprocessing blocks in advance the detection errors caused by the brightness of light; the face region is detected by applying the multi-resolution technique with a template obtained beforehand through learning and template generation; and the size change technique is applied to that face region to determine the final minimum face region. This enables fast and accurate detection of the face region.

The face detection method of the present invention can therefore be particularly useful for remote video communication on mobile communication terminals such as mobile phones and PDAs.

Claims

A face detection method comprising: a preprocessing process for correcting the brightness of an original image; a process of changing the size of the preprocessed original image; a process of detecting a face region by performing the PCA and LDA conversions on each region to be compared in the resized images and matching the result against a prepared template; and a process of determining the final face region by verifying and adjusting the detected face region.

The face detection method of claim 1, wherein, for the template matching, the original image is converted into a mosaic image and an average image is obtained to generate the template, and a learning process that obtains the functions for the PCA-LDA conversion is performed when the template is generated.

The face detection method of claim 2, wherein the number of eigenvectors used in the PCA-LDA conversion is applied adaptively according to the mean value of the light component of the image.

The face detection method of claim 3, wherein the number of eigenvectors is larger when the mean value of the light component of the image is dark than when it is bright.

The face detection method of claim 4, wherein the number of eigenvectors is 40% of the total number of eigenvectors when the mean value of the light component of the image is 70 or less (the light component ranging from 0 to 255), and 30% otherwise.

The face detection method of claim 1, wherein the preprocessing process is performed on the brightness component of the original image based on a min-max normalization technique or on a histogram equalization technique.

The face detection method of claim 1, wherein the face region detection process fixes the size of the template, reduces the input image to multiple resolutions, and determines as the face region the region for which the difference between the PCA-LDA converted values of the resized image and of the template is smallest.

The face detection method of claim 1, wherein, to detect the final face region, an eye/eyebrow region is extracted and the minimum face size is determined based on the position information of the eyes/eyebrows.

A face detection apparatus comprising: preprocessing means for correcting the light component of an input image; means for converting the size of the preprocessed input image; means for taking a target area for template matching from the resized input image and applying the PCA-LDA conversion to it; means for determining a face region by matching the converted target area against a template; template storage means that stores the information for the template matching; and means for restoring the determined face region to its original size.