Disclosure of Invention
The invention aims to provide a multi-camera face fusion comparison method in which video stream data from multiple cameras is received and preprocessed before face recognition, and effective data is obtained through liveness detection, screening and sorting, and quality evaluation. This solves the problem in existing face recognition technology that samples are easily affected by external conditions, which reduces face recognition efficiency.
A face fusion comparison method based on multiple cameras specifically comprises the following steps:
S1, video data stream acquisition: acquiring video stream data from multiple groups of cameras, and storing the video stream data in groups according to camera ID;
S2, picture extraction: acquiring multiple groups of RGB pictures corresponding to the multiple groups of video stream data at the same time T;
S3, face extraction: extracting faces from each group of RGB pictures to obtain face pictures, calculating the screen ratio of the face in each face picture, and setting the face picture with the largest screen ratio as the effective face picture of that group of RGB pictures (a minimal sketch follows this list);
S4, picture screening: passing the effective face pictures of all groups of RGB pictures through a picture screening algorithm to obtain the effective face pictures marked as effective data;
S5, uploading: sending the effective face pictures of the RGB pictures marked as effective data to a face comparison server.
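For concreteness, the following Python sketch illustrates the S3 selection rule; the FaceCandidate type and the detector callback are hypothetical placeholders, not part of the disclosed implementation.

```python
# Minimal sketch of step S3: per camera group, keep the face with the
# largest screen ratio. FaceCandidate and the detect callback are
# illustrative assumptions, not the disclosed implementation.
from dataclasses import dataclass
from typing import Callable, Dict, List
import numpy as np

@dataclass
class FaceCandidate:
    camera_id: str
    image: np.ndarray      # cropped face picture
    screen_ratio: float    # face area / full frame area

def select_effective_faces(
    frames: Dict[str, np.ndarray],
    detect: Callable[[str, np.ndarray], List[FaceCandidate]],
) -> Dict[str, FaceCandidate]:
    """Return, per camera group, its effective face picture (step S3)."""
    effective = {}
    for cam_id, frame in frames.items():
        faces = detect(cam_id, frame)
        if faces:
            effective[cam_id] = max(faces, key=lambda f: f.screen_ratio)
    return effective
```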
Further, the picture screening algorithm in step S4 is at least one of the following algorithms:
First, a comparison fusion algorithm: comparing the effective face pictures of every two groups of RGB pictures for similarity by a round-robin comparison method to obtain corresponding comparison scores (see the sketch following this list);
marking the effective face pictures of the two groups of RGB pictures with the highest comparison score as effective data;
Second, a screening and sorting algorithm:
similarity values: obtaining the similarity value between the effective face pictures of every two groups of RGB pictures;
screening: comparing each similarity value with a preset threshold, and discarding similarity values below the preset threshold;
sorting: sorting the remaining similarity values by score, and retaining the effective face pictures of the two groups of RGB pictures corresponding to the highest similarity value;
Third, a picture quality evaluation algorithm: performing picture quality evaluation on the effective face pictures of each group of RGB pictures, and marking the effective face pictures of the RGB pictures that pass the quality evaluation as effective data.
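For illustration only, the sketch below combines the round-robin comparison with the screening and sorting steps, assuming each group's effective face picture has already been encoded as a feature vector; the cosine-similarity scorer and the 0.85 threshold are assumptions drawn from the embodiments, not a definitive implementation.

```python
# Illustrative round-robin comparison plus screening and sorting. Assumes
# each group's effective face picture is already a feature vector; the
# cosine-similarity scorer and the 0.85 threshold are assumptions.
from itertools import combinations
from typing import Dict, Optional, Tuple
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def screen_and_sort(embeddings: Dict[str, np.ndarray],
                    threshold: float = 0.85) -> Optional[Tuple[str, str]]:
    """Return the pair of camera groups with the highest valid score."""
    # Round-robin: compare the effective faces of every two groups.
    scores = {(p, q): cosine_similarity(embeddings[p], embeddings[q])
              for p, q in combinations(embeddings, 2)}
    # Screening: discard similarity values below the preset threshold.
    kept = {pair: s for pair, s in scores.items() if s >= threshold}
    if not kept:
        return None  # nothing valid; e.g. retry at time T + 1
    # Sorting: retain the pair with the highest similarity value.
    return max(kept, key=kept.get)
```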
Further, the picture screening algorithm may be composed of one or more of the comparison fusion algorithm, the screening and sorting algorithm, and the picture quality evaluation algorithm, applied in different orders.
Further, the picture screening algorithm is composed of the comparison fusion algorithm, the screening and sorting algorithm, and the picture quality evaluation algorithm in sequence, and step S4 specifically includes the following steps:
SA, comparison fusion: comparing the effective face pictures of every two groups of RGB pictures for similarity by a round-robin comparison method to obtain corresponding comparison scores;
SB, screening and sorting:
screening: comparing each comparison score with a preset threshold, and discarding comparison scores below the preset threshold;
sorting: sorting the remaining comparison scores by score, and retaining the effective face pictures of the two groups of RGB pictures corresponding to the highest comparison score;
SC, picture quality evaluation: performing picture quality evaluation on the effective face pictures of the two groups of RGB pictures, and marking the effective face pictures of the RGB pictures that pass the quality evaluation as effective data.
Further, if the multiple groups of cameras include one or more groups of binocular cameras, each binocular camera comprising an RGB camera and an IR camera, step S2 further includes a liveness recognition step:
S001: identifying the binocular camera video stream data, and retrieving the RGB picture at time T corresponding to that group of cameras;
S002: detecting the RGB picture, and recording the position of the face in the RGB picture when a face is detected;
S003: performing face detection at the corresponding position in the IR picture according to the position of the face in the RGB picture;
S004: if a face is detected, determining that the face in the current RGB picture is in a living state, and retaining the RGB picture;
S005: if no face is detected, determining that the face in the current RGB picture is in a non-living state, and discarding the RGB picture.
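A minimal sketch of this RGB-to-IR cross-check follows; OpenCV's stock Haar cascade is used as a stand-in detector (the disclosure does not name a detector), and the IR frame is assumed to be single-channel, 8-bit, and pixel-aligned with the RGB frame.

```python
# Illustrative RGB->IR liveness cross-check (S001-S005). The Haar cascade
# is a stand-in detector; the disclosure does not specify one. Assumes a
# BGR RGB frame and a single-channel 8-bit IR frame pixel-aligned with it.
import cv2
import numpy as np

_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def is_live(rgb: np.ndarray, ir: np.ndarray) -> bool:
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
    faces = _cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return False                      # no face in the RGB picture at all
    x, y, w, h = faces[0]                 # S002: face position in the RGB picture
    roi = ir[y:y + h, x:x + w]            # S003: same position in the IR picture
    if roi.size == 0:
        return False
    ir_faces = _cascade.detectMultiScale(roi, scaleFactor=1.1, minNeighbors=3)
    return len(ir_faces) > 0              # S004/S005: live iff IR also shows a face
```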
Further, step S3 is followed by a step of determining the number of effective faces:
calculating the number of effective faces;
if the number of effective faces is greater than or equal to 2, going to step SA;
if the number of effective faces is 1, going to step S5.
Further, step SB further includes a step of determining the number of remaining comparison scores:
calculating the number of remaining comparison scores;
if the number of remaining comparison scores is greater than or equal to 1, going to step SC;
if the number of remaining comparison scores is 0, setting the time T to T + 1 and returning to step S2.
Further, the picture quality evaluation performs quality detection judgment based on a plurality of indexes and their effective ranges, where the indexes and effective ranges include:
index A: image blur, effective range (0.1, 1);
index B: face pitch angle, effective range (−20°, +20°);
index C: face mask ratio; mouth and nose C1, effective range (0.9, 1); eyes C2, effective range (0.5, 1);
index D: degree of mouth opening and closing, effective range (0.1, 1);
RGB pictures that simultaneously satisfy the effective ranges of all four indexes (image blur A, face pitch angle B, face mask ratio C, and mouth opening and closing degree D) are marked as effective data.
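A minimal sketch of this pass/fail check; the range endpoints follow the disclosure, and treating every bound as exclusive is an assumption.

```python
# Illustrative check that all four indexes fall within their effective
# ranges (the picture is marked as effective data only if all pass).
# Exclusive bounds are an assumption from the open intervals above.
def passes_quality(blur: float, pitch_deg: float, mouth_nose_ratio: float,
                   eye_ratio: float, mouth_open: float) -> bool:
    return (0.1 < blur < 1.0                  # index A: image blur
            and -20.0 < pitch_deg < 20.0      # index B: face pitch angle
            and 0.9 < mouth_nose_ratio < 1.0  # index C1: mouth/nose mask ratio
            and 0.5 < eye_ratio < 1.0         # index C2: eye mask ratio
            and 0.1 < mouth_open < 1.0)       # index D: mouth opening degree
```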
Further, the picture quality evaluation is a quality detection judgment based on a plurality of indexes, including:
index A: image blur;
index B: face pitch angle;
index C: face mask ratio (mouth and nose C1; eyes C2);
index D: degree of mouth opening and closing;
a quality evaluation score score_i is obtained by assigning corresponding weights w_i to the image blur A, the face pitch angle B, the face mask ratio C, and the mouth opening and closing degree D, and substituting them into the formula:
score_i = w1*A + w2*B + w3*(C1 + C2) + w4*D.
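A minimal sketch of this weighted score; the weights w_i are left unspecified by the disclosure, so the example values below are assumptions, and index B (an angle) is assumed to be pre-normalized into [0, 1].

```python
# Illustrative score_i = w1*A + w2*B + w3*(C1 + C2) + w4*D.
# The example weights are assumptions; the disclosure does not fix w_i.
# Index B is assumed pre-normalized to [0, 1] before weighting.
def quality_score(A: float, B: float, C1: float, C2: float, D: float,
                  weights=(0.4, 0.2, 0.2, 0.2)) -> float:
    w1, w2, w3, w4 = weights
    return w1 * A + w2 * B + w3 * (C1 + C2) + w4 * D
```

The candidate with the highest score_i would then be kept as the final effective picture.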
further, in the step S4, the similarity of the effective faces is compared by calculating the euclidean distance or the cosine distance between two groups of effective faces, where the effective faces are the face portions of the RGB picture where the face occupies the largest screen ratio.
The invention further provides a multi-camera face fusion comparison device, comprising:
a memory;
one or more processors; and
one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules comprising:
a video data stream acquisition module, configured to acquire video stream data from multiple groups of cameras and store the video stream data in groups according to camera ID;
a picture extraction module, configured to acquire multiple groups of RGB pictures corresponding to the multiple groups of video stream data at the same time T;
a face extraction module, configured to extract faces from each group of RGB pictures to obtain face pictures, calculate the screen ratio of the face in each face picture, and set the face picture with the largest screen ratio as the effective face picture of that group;
a picture screening module, configured to pass the effective face pictures of all groups of RGB pictures through the picture screening algorithm to obtain the effective face pictures marked as effective data;
an uploading module, configured to send the effective face pictures of the RGB pictures marked as effective data to the face comparison server.
Further, the cameras are binocular cameras and/or monocular cameras arranged at a plurality of different angles in a security inspection area.
The invention has the following beneficial effects:
1. Multiple cameras are used for face tracking, so that multiple target face images with different illumination, poses, and degrees of blur can be obtained within the monitored area. Through liveness detection, screening and sorting, and quality evaluation, these face images of varying quality are preprocessed and the best-quality face image is selected as effective data. This scheme effectively improves the efficiency of face image acquisition, greatly improves the face recognition rate, and has great practical value;
2. Because fusion comparison is used and the multiple cameras are processed in parallel, the videos from the multi-source cameras do not affect one another in sequence, while high real-time performance is maintained.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.
In the description and drawings, particular embodiments of the invention are disclosed in detail as representative of some of the ways in which the principles of the invention may be practiced, but it is understood that the scope of the invention is not limited thereto. On the contrary, the invention includes all changes, modifications, and equivalents coming within the spirit and scope of the appended claims.
Example 1
This embodiment provides a face fusion comparison method based on multiple binocular cameras, comprising:
1. Camera data stream acquisition: by accessing multiple groups of binocular cameras, video streams are extracted per module, each group comprising an RGB (color) camera and an IR (infrared) camera. The video streams are divided and stored by module; the number of extracted video streams is 2 × the number of modules;
2. Picture extraction from video: pictures are extracted from the two video channels of each group at the same moment, comprising an RGB picture and an IR picture (or depth picture); the number of picture groups equals the number of modules, and the number of pictures equals the number of cameras;
3. Grouped liveness detection: the pictures obtained in step 2 are cross-checked group by group to detect whether a liveness attack exists, and the RGB pictures of groups that pass liveness authentication are retained;
4. Image face extraction: face extraction is performed on the (live) RGB pictures from step 3 to obtain face pictures; when a picture contains multiple faces, the face screen ratio (the proportion of the whole frame occupied by the face picture) is used to resolve them, and the face with the largest screen ratio is taken as the effective face of the picture;
5. RGB picture comparison and fusion: a local face picture comparison library is built in; the effective face pictures of the RGB pictures from step 4 are stored in the local face library and compared with each other to obtain the effective face pictures of two groups of RGB pictures.
In step 4, if the number of effective faces extracted is n and m face images are compared at a time (in this scenario m = 2, i.e., pairwise comparison), the number of comparisons required by the algorithm follows the combination formula: the number of ways to take m (m ≤ n) elements from n different elements, denoted C(n, m), is
C(n, m) = n! / (m! × (n − m)!)
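This count can be checked directly; it matches worked examples 1 and 2 below.

```python
# Pairwise-comparison counts from C(n, m) = n! / (m! * (n - m)!).
from math import comb

assert comb(3, 2) == 3   # three faces A, B, C -> three comparisons (example 1)
assert comb(2, 2) == 1   # two faces A, C -> one comparison (example 2)
```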
example 1: a, B, C effective face pictures of the RGB pictures extracted in the step 4 are stored in the local database, A, B, C pieces of data exist in the local face bottom database, and at the moment, the following comparison is carried out:
when: score > is threshold (such as 0.85, the threshold is adjustable), and is regarded as valid face data and reserved;
when: score < threshold (such as 0.85, threshold is adjustable), and the face data is considered invalid and discarded;
| Comparison pair | Comparison score | Result |
| A vs. B | score1 (e.g., 0.92) | Valid data, retained |
| A vs. C | score2 (e.g., 0.94) | Valid data, retained |
| B vs. C | score3 (e.g., 0.84) | Below threshold, discarded |
Keeping the group of results with the highest comparison score yields picture group A and C:
| Comparison pair | Comparison score | Result |
| A vs. B | score1 (e.g., 0.92) | Valid data, but lower score; discarded |
| A vs. C | score2 (e.g., 0.94) | Valid data, retained |
Example 2: a, C effective face pictures of the RGB pictures extracted in the step 4 are stored in the local database, A, C two pieces of data exist in the local face bottom database, and at the moment, the following comparison is carried out:
when score ≥ threshold (e.g., 0.85; the threshold is adjustable), the pair is regarded as valid data and retained; proceed to step 6;
| Comparison pair | Comparison score | Result |
| A vs. C | score (e.g., 0.92) | Valid data, retained |
when score < threshold (e.g., 0.85; the threshold is adjustable), the data is regarded as invalid and discarded; the algorithm ends;
| Comparison pair | Comparison score | Result |
| A vs. C | score (e.g., 0.84) | Below threshold, discarded |
Example 3: only one effective face picture is extracted in step 4; no comparison is performed, step 6 is skipped, and the process proceeds directly to step 7.
Here, when score < threshold (e.g., 0.85; the threshold is adjustable), the two faces are considered to belong to different persons. This prevents pictures of different persons captured by multiple cameras, or faces captured at misleading angles, from being treated as the same person (false recognition for short), and thereby determines the final effective data. The method can increase the face detection volume and effectively reduce the false recognition rate;
6. Picture quality detection and selection: the effective face pictures of the two groups of RGB pictures obtained in step 5 are compared for picture quality; the quality comparison includes, but is not limited to, the following indexes. The quality of the group of data is determined according to these indexes, and the picture with the best quality is taken as the final effective picture;
parameter description:
image blur, effective parameter range (0.1, 1);
face pitch angle, effective parameter range (−20°, +20°);
face mask ratio: effective mouth and nose parameter range (0.9, 1), effective eye parameter range (0.5, 1);
degree of mouth opening and closing, effective parameter range (0.1, 1).
Parameter relationship: a picture is considered effective only when all effective ranges are satisfied simultaneously.
7. Picture face comparison: the face picture from step 6 is sent to the background face comparison service to obtain the final comparison result and complete subsequent business processing.
Example 2
This embodiment provides a face fusion comparison method based on multiple monocular cameras, comprising the following steps:
1. Camera data stream acquisition: by accessing multiple monocular cameras, video streams are extracted per module, each module comprising a single RGB (color) camera. The video streams are divided and stored by module; the number of extracted video streams equals the number of modules;
2. Picture extraction from video: RGB pictures are extracted from each group of videos at the same moment; the number of picture groups equals the number of modules;
3. Image face extraction: face extraction is performed on the RGB pictures from step 2 to obtain face pictures; when a picture contains multiple faces, the face screen ratio (the proportion of the whole frame occupied by the face picture) is used to resolve them, and the face with the largest screen ratio is taken as the effective face of the picture;
4. RGB picture comparison and fusion: a local face picture comparison library is built in; the effective face pictures of the RGB pictures from step 3 are stored in the local face library and compared with each other to obtain the effective face pictures of two groups of RGB pictures.
5. Picture quality detection and selection: the effective face pictures of the two groups of RGB pictures obtained in step 4 are compared for picture quality; the quality comparison includes, but is not limited to, the following indexes. The quality of the group of data is determined according to these indexes, and the picture with the best quality is taken as the final effective picture;
parameter description:
image blur, effective parameter range (0.1, 1);
face pitch angle, effective parameter range (−20°, +20°);
face mask ratio: effective mouth and nose parameter range (0.9, 1), effective eye parameter range (0.5, 1);
degree of mouth opening and closing, effective parameter range (0.1, 1).
Parameter relationship: a picture is considered effective only when all effective ranges are satisfied simultaneously.
6. Picture face comparison: the face picture from step 5 is sent to the background face comparison service to obtain the final comparison result and complete subsequent business processing.
Example 3
This embodiment provides a multi-camera face fusion comparison method in which the camera modules may be binocular camera modules supporting face liveness detection to resist liveness attacks, may be downgraded to monocular camera modules, or may be any free combination of binocular and monocular modules, comprising:
1. Camera data stream acquisition: video streams are extracted per module by accessing multiple groups of cameras; each binocular group is divided into an RGB (color) camera and an IR (infrared) camera, and the streams are divided and stored by module;
2. Picture extraction from video: pictures are extracted from each group of videos at the same moment;
3. Liveness recognition:
SA: identifying the binocular camera video stream data, and retrieving the RGB picture at time T corresponding to that group of cameras;
SB: detecting the RGB picture, and recording the position of the face in the RGB picture when a face is detected;
SC: performing face detection at the corresponding position in the IR picture according to the position of the face in the RGB picture;
SD: if a face is detected, determining that the face in the current RGB picture is in a living state, and retaining the RGB picture;
SE: if no face is detected, determining that the face in the current RGB picture is in a non-living state, and discarding the RGB picture.
4. Image face extraction: face extraction is performed on the RGB pictures from step 3 to obtain face pictures; when a picture contains multiple faces, the face screen ratio (the proportion of the whole frame occupied by the face picture) is used to resolve them, and the face with the largest screen ratio is taken as the effective face of the picture;
5. RGB picture comparison and fusion: a local face picture comparison library is built in; the effective face pictures of the RGB pictures from step 4 are stored in the local face library and compared with each other to obtain the effective face pictures of two groups of RGB pictures.
6. Picture quality detection and selection: the effective face pictures of the two groups of RGB pictures obtained in step 5 are compared for picture quality; the quality comparison includes, but is not limited to, the following indexes. The quality of the group of data is determined according to these indexes, and the picture with the best quality is taken as the final effective picture;
parameter description:
image blur, effective parameter range (0.1, 1);
face pitch angle, effective parameter range (−20°, +20°);
face mask ratio: effective mouth and nose parameter range (0.9, 1), effective eye parameter range (0.5, 1);
degree of mouth opening and closing, effective parameter range (0.1, 1).
Parameter relationship: a picture is considered effective only when all effective ranges are satisfied simultaneously.
7. Picture face comparison: the face picture from step 6 is sent to the background face comparison service to obtain the final comparison result and complete subsequent business processing.
Example 4
This embodiment provides a multi-camera face fusion comparison device, comprising:
a memory;
one or more processors; and
one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules comprising:
a video data stream acquisition module, configured to acquire video stream data from multiple groups of cameras and store the video stream data in groups according to camera ID;
a picture extraction module, configured to acquire multiple groups of RGB pictures corresponding to the multiple groups of video stream data at the same time T;
a face extraction module, configured to extract faces from each group of RGB pictures to obtain face pictures, calculate the screen ratio of the face in each face picture, and set the face picture with the largest screen ratio as the effective face picture of that group;
a picture screening module, configured to pass the effective face pictures of all groups of RGB pictures through the picture screening algorithm to obtain the effective face pictures marked as effective data;
an uploading module, configured to send the effective face pictures of the RGB pictures marked as effective data to the face comparison server.
The foregoing is only a preferred embodiment of the present invention and does not limit the invention in any way; any simple modification, equivalent replacement, or improvement made to the above embodiments within the spirit and principle of the present invention still falls within the protection scope of the present invention.