CN104077804B


Info

Publication number: CN104077804B
Application number: CN201410253326.5A
Authority: CN
Inventors: 刘威; 张丛喆; 汤勇; 谢佳亮
Original assignee: GUANGZHOU JIAQI INTELLIGENT TECHNOLOGY CO LTD
Current assignee: GUANGZHOU JIAQI INTELLIGENT TECHNOLOGY CO LTD
Priority date: 2014-06-09
Filing date: 2014-06-09
Publication date: 2017-03-01
Anticipated expiration: 2034-06-09
Also published as: CN104077804A

Abstract

The invention discloses a method for constructing a three-dimensional face model from multi-frame video images, comprising: performing three-dimensional reconstruction on the two-dimensional surveillance picture captured by a camera with a fixed position and viewing angle, thereby obtaining a three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture; extracting from the input video a multi-frame continuous video sequence containing the target's motion, shape, texture and colour information; performing facial feature detection, three-dimensional spatial positioning, and synchronized facial feature tracking and recognition on the multi-frame continuous video sequence, thereby obtaining its three-dimensional facial feature points; and superimposing the three-dimensional facial feature points of the sequence according to the three-dimensional space model of the surveillance picture, thereby forming a three-dimensional face mesh and generating three-dimensional face model data. The invention is simple and convenient, offers good real-time performance and high accuracy, and can be widely applied in the field of video image processing.

Description

A method for constructing a three-dimensional face model from multi-frame video images

Technical field

The present invention relates to the field of video image processing, and in particular to a method for constructing a three-dimensional face model from multi-frame video images.

Background art

At present, identity authentication technologies based on biometric features (such as fingerprints, palm prints and footprints) are widely used in the security field, and business applications based on portrait recognition are gradually spreading to different industries and fields. Traditional portrait recognition relies mainly on two-dimensional face recognition methods, including the Fisherface and Eigenface methods. However, two-dimensional face recognition has a low recognition rate and a certain error, and cannot meet the pressing needs of business applications. Compared with two-dimensional methods, portrait recognition based on a three-dimensional face model carries richer information and supports multi-angle comparison through spatial rotation; its recognition accuracy is higher, and it tends to replace two-dimensional face recognition.

Constructing the three-dimensional face model is the core of portrait recognition based on three-dimensional face models. At present there are two main construction methods: one photographs a stationary face with three-dimensional cameras from multiple angles and then stitches the results into a single three-dimensional model; the other builds the three-dimensional model by surface contour scanning. Although both methods reconstruct a three-dimensional face model to some extent, they are complicated to operate and inconvenient.

Constructing a three-dimensional face model involves processes such as feature extraction, standard model deformation, feature point localization and texture mapping. Current feature extraction, standard model deformation and texture mapping are carried out mainly on static face images. They have difficulty reflecting face parameters and attributes that involve motion trajectories (for example, facial expression distortion is hard to describe), and they cannot use similarity measurement or comparison to restore the real face as faithfully as possible; real-time performance is poor and data error is large.

In summary, the industry urgently needs a three-dimensional face model construction method that is convenient, works in real time and is highly accurate.

Summary of the invention

To solve the above technical problem, the object of the present invention is to provide a method for constructing a three-dimensional face model from multi-frame video images that is convenient, works in real time, is highly accurate and is widely applicable.

The technical solution adopted by the present invention to solve this technical problem is a method for constructing a three-dimensional face model from multi-frame video images, comprising:

A. performing three-dimensional reconstruction on the two-dimensional surveillance picture captured by a camera with a fixed position and viewing angle, thereby obtaining a three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture;

B. extracting from the input video a multi-frame continuous video sequence containing the target's motion, shape, texture and colour information;

C. performing facial feature detection, three-dimensional spatial positioning, and synchronized facial feature tracking and recognition on the multi-frame continuous video sequence, thereby obtaining the three-dimensional facial feature points of the sequence;

D. superimposing the three-dimensional facial feature points of the multi-frame continuous video sequence according to the three-dimensional space model of the camera's surveillance picture, thereby forming a three-dimensional face mesh and generating three-dimensional face model data.

Further, step A comprises:

A1. establishing a homography according to the intrinsic parameter matrix of the camera, the homography describing the relationship between the actual ground plane and the ground plane in the image captured by the camera;

A2. calculating the viewpoint origin of the camera from the camera height and two given reference lines of known length that are perpendicular to the ground plane;

A3. reconstructing the three-dimensional model of the camera view from the homography, the viewpoint origin of the camera and a given visualization model, thereby obtaining the three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture.

Further, step C comprises:

C1. selecting a single video frame from the multi-frame continuous video sequence as the current video frame;

C2. performing facial feature detection and facial feature extraction on the current video frame, thereby obtaining the facial feature points of the current video frame;

C3. performing three-dimensional spatial positioning on the facial feature points of the current video frame, and detecting the spatial information of the facial feature points, the motion trajectory of the facial features and the temporal information contained in the current video frame;

C4. performing synchronized facial feature tracking and automatic recognition on the current video frame according to the detection result, thereby determining the spatial position coordinates of each facial feature point of the current video frame as it moves;

C5. selecting the next single video frame from the multi-frame continuous video sequence as the current video frame and returning to step C2, thereby generating a three-dimensional coordinate system matrix of the facial features from the continuous motion states of the sequence at different times.

Further, step D is specifically:

performing three-dimensional face key-frame superposition on the multiple groups of three-dimensional facial feature points of the multi-frame continuous video sequence to generate a three-dimensional face model data index list, thereby establishing a structured three-dimensional face model data list, and storing the three-dimensional face model data index list.

Further, step D comprises:

D1. substituting the facial feature points of the multi-frame continuous video sequence into the three-dimensional space model of the camera's surveillance picture, thereby obtaining the spatial coordinates of the three-dimensional facial feature points of the sequence;

D2. according to the spatial coordinates of the three-dimensional facial feature points, generating a texture image using a three-dimensional image stitching algorithm and mapping the generated texture image, thereby obtaining realistic three-dimensional face model data.

Further, step D2 comprises:

D21. reconstructing a sparse set of face marker points from the multi-frame continuous video sequence according to the spatial coordinates of the three-dimensional facial feature points, and applying a thin plate spline (TPS) trial to the sparse set, element by element;

D22. applying a nonlinear transformation to the generic face model according to the result of the TPS trial, thereby obtaining a matched three-dimensional face model;

D23. obtaining the face texture feature information of the multi-frame continuous video sequence using a three-dimensional image stitching algorithm, and mapping the obtained face texture feature information onto the matched three-dimensional face model, thereby obtaining realistic three-dimensional face model data.

Further, a step E is provided after step D, step E being specifically:

retaining the specific features of the generic face model using an SFM algorithm and, by comparison with the generic face model, correcting the error between the generated three-dimensional face model data and the generic face model data; then building the final three-dimensional face model from the depth information of points using a triangle subdivision method.

Further, the step in step E of building the final three-dimensional face model from the depth information of points using the triangle subdivision method comprises:

E21. selecting from the three-dimensional face mesh, according to a preset threshold, the triangles that need subdivision, and marking the selected triangles;

E22. combining the marked triangles into n mesh blocks according to their adjacency, separating out these n mesh blocks and denoting them B = {b₁, b₂, b₃, …, b_n}, while denoting the unmarked part of the three-dimensional face mesh R_i;

E23. adjusting the weights of the four vertices of each mesh block in B_i to 0, 1/2, 1/2 and 0 respectively, thereby performing mesh interpolation subdivision on B_i;

E24. performing interpolation subdivision on the boundary of the unsubdivided R_i, so that each point inserted on the boundary lies at a midpoint;

E25. merging B and R, and judging whether every triangle of the merged mesh model has side lengths below the preset threshold; if so, taking the merged mesh model as the final three-dimensional face model; otherwise, returning to step E21.

The beneficial effects of the invention are as follows. The three-dimensional face model is built from image information captured by a single camera with a fixed position and viewing angle, which is simple to operate and very convenient. By performing facial feature detection, three-dimensional spatial positioning, and synchronized facial feature tracking and recognition on the continuous video sequence, key frames containing the face are extracted, the displacement of the face position is tracked dynamically, a three-dimensional relationship is established, and the spatial position relationship of each facial feature point is determined. This solves the inability of the prior art to synchronously track and recognize facial feature parameters and attributes of dynamic face images, such as motion trajectories, and it gives good real-time performance and high accuracy. Further, the specific features of the generic face model are retained using the SFM algorithm, and the face image is smoothed using triangle subdivision, which further improves the accuracy and realism of the face model.

Brief description of the drawings

The invention is further described below with reference to the accompanying drawings and embodiments.

Fig. 1 is a flow chart of the steps of the method of the present invention for constructing a three-dimensional face model from multi-frame video images;

Fig. 2 is a flow chart of step A of the present invention;

Fig. 3 is a flow chart of step C of the present invention;

Fig. 4 is a flow chart of step D of the present invention;

Fig. 5 is a flow chart of step D2 of the present invention;

Fig. 6 is a flow chart of the triangle subdivision method in step E of the present invention;

Fig. 7 is a schematic diagram of reconstructing the three-dimensional space model from the camera's intrinsic parameters in embodiment one;

Fig. 8 is a schematic diagram of the flow for building a three-dimensional portrait from multiple frames in embodiment one.

Detailed description

With reference to Fig. 1, a method for constructing a three-dimensional face model from multi-frame video images comprises:

With reference to Fig. 2, as a further preferred embodiment, step A comprises:

With reference to Fig. 3, as a further preferred embodiment, step C comprises:

As a further preferred embodiment, step D is specifically:

With reference to Fig. 4, as a further preferred embodiment, step D comprises:

With reference to Fig. 5, as a further preferred embodiment, step D2 comprises:

As a further preferred embodiment, a step E is provided after step D, step E being specifically:

The generic face model refers to a standard face model well known in the industry.

With reference to Fig. 6, as a further preferred embodiment, the step in step E of building the final three-dimensional face model from the depth information of points using the triangle subdivision method comprises:

The present invention is described in further detail below with reference to a specific embodiment.

Embodiment one

This embodiment describes in detail how a high-speed video acquisition and download device builds a face model according to the present invention.

The process by which the high-speed video acquisition and download device builds the face model is as follows:

(I) Three-dimensional scene reconstruction

In the high-speed video acquisition and download device, the intrinsic parameters of the camera are known, and three-dimensional scene reconstruction is performed on the picture captured by the camera. The reconstruction proceeds as follows: first, a homography H is established between the actual ground plane and the ground plane in the image; then the camera is calibrated using its actual installation height h above the ground plane and preset lines of known length perpendicular to the ground plane. The specific implementation is as follows:

(1) According to the pinhole model of the camera, the matrix M is defined as: …, from which the homography between the actual ground plane and the ground plane in the image captured by the camera can be expressed as … (1).

Here A is the intrinsic parameter matrix of the camera, r₁, r₂, r₃ are the three column vectors of the rotation matrix R, and t is the translation parameter. If more than 4 pairs of corresponding points are available between the actual ground plane and the ground plane in the image, H can be further refined through formula (1).
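As an illustration of how a ground-plane homography can be obtained from point correspondences, the following is a minimal sketch and not the patented procedure. It assumes the common normalization h33 = 1 and solves the resulting linear system by least squares through the normal equations. The function names `solve_linear`, `estimate_homography` and `apply_homography` are hypothetical and do not appear in the patent.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_homography(ground_pts, image_pts):
    """Least-squares 3x3 H (with h33 = 1) mapping ground (X, Y) to image (u, v)."""
    rows, rhs = [], []
    for (X, Y), (u, v) in zip(ground_pts, image_pts):
        rows.append([X, Y, 1.0, 0.0, 0.0, 0.0, -u * X, -u * Y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, X, Y, 1.0, -v * X, -v * Y])
        rhs.append(v)
    # Normal equations (A^T A) h = A^T b accept four or more correspondences.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(8)]
    h = solve_linear(ata, atb) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(H, pt):
    """Map a 2D point through H using homogeneous coordinates."""
    X, Y = pt
    w = H[2][0] * X + H[2][1] * Y + H[2][2]
    return ((H[0][0] * X + H[0][1] * Y + H[0][2]) / w,
            (H[1][0] * X + H[1][1] * Y + H[1][2]) / w)
```

With exactly four correspondences in general position the system is determined; additional correspondences are averaged in the least-squares sense. A production implementation would more likely normalize the coordinates and use an SVD for numerical stability.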

(2) Define the optical centre of the camera, that is, the viewpoint origin of the camera, as (x_c, y_c, h). Let …; from the spatial relationship of the camera it follows that: …

(3) Given a reference line l* perpendicular to the actual ground plane and its projection l onto the ground plane in the captured image, the spatial relationship of the camera shows that the mapped line H^T l lies in the actual ground plane and passes through the point (x_c, y_c, 0).

Therefore, according to steps (1)-(3), given the camera height h and two reference lines of known length perpendicular to the actual ground plane, x_c, y_c and K can be calculated.

(4) The three-dimensional model of the camera view is then reconstructed according to a preset visualization model.

As shown in Fig. 7, (x_c, y_c, h) is set as the centre of the user coordinate system, and the visualization model is projected onto the actual ground plane. By the spatial geometric projection relationship, the projection (x_w′, y_w′, 0) onto the actual ground plane of any point (x_w, y_w, z_w) in the user coordinate system can be calculated by formula (2):

…（2）
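The projection described here can be illustrated with elementary ray geometry. The sketch below is an assumption about what such a projection looks like and is not a reproduction of formula (2): it intersects the ray from the camera centre (x_c, y_c, h) through the point (x_w, y_w, z_w) with the plane z = 0, which gives x_w′ = x_c + s·(x_w − x_c) and y_w′ = y_c + s·(y_w − y_c) with s = h / (h − z_w). The function name `project_to_ground` is hypothetical.

```python
def project_to_ground(point, camera):
    """Project `point` onto the plane z = 0 along the ray from `camera`.

    `point` is (x_w, y_w, z_w) and `camera` is (x_c, y_c, h), both expressed
    in the same world coordinates with the ground plane at z = 0.
    """
    x_w, y_w, z_w = point
    x_c, y_c, h = camera
    if z_w >= h:
        # The ray from the camera through the point never reaches the ground.
        raise ValueError("point must lie below the camera centre")
    s = h / (h - z_w)  # ray parameter at which the height becomes zero
    return (x_c + s * (x_w - x_c), y_c + s * (y_w - y_c), 0.0)
```

For example, with the camera at (0, 0, 3) a point at (1, 1, 1.5) lies halfway down the ray, so its ground projection is twice as far from the camera foot, at (2, 2, 0).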

(5) Finally, using the homography H, the projection of the visualization model on the actual ground plane is mapped to the ground plane in the image, thereby establishing the mapping relationship and completing the reconstruction of the three-dimensional surveillance scene. After reconstruction, depth information can be calibrated for any point on the three-dimensional portrait to be built.

(II) Obtaining three-dimensional facial feature spatial data

After the three-dimensional scene reconstruction is complete, the pictures from the camera are processed. A facial feature detection method is applied to one frame f1 to extract facial features, and the collected feature point sequence is substituted into the reconstructed three-dimensional space model to obtain the three-dimensional spatial data [x_f1, y_f1, z_f1] of each feature point. The next frame f2 of the video is then read, and the same method gives the three-dimensional spatial data [x_f2, y_f2, z_f2] of the facial features in the second frame, and so on until the three-dimensional facial feature spatial data [x_fn, y_fn, z_fn] of frame fn is obtained.

(III) Three-dimensional stitching and mapping

After the three-dimensional facial feature spatial data has been obtained, the camera images from multiple angles still need to be modelled; this process can be converted into a stitching problem for three-dimensional images. Specifically: a sparse set of face marker points is reconstructed from the video; a thin plate spline (TPS) is tried on these sparse points one by one; on the basis of the TPS trials, a nonlinear transformation is applied to the generic face model to obtain a matched three-dimensional face model; finally, the face texture information from the video is mapped onto the matched three-dimensional face model, yielding a realistic three-dimensional face model.

For example, suppose a pair of stitchable three-dimensional images I1 and I2 lies within a given set of N images. I1 and I2 are first stitched into a new three-dimensional image I11; I11 is then stitched with I3 to obtain image I12, and I12 is stitched with I3 to obtain image I13. This process is repeated until no further stitching is possible, yielding a complete three-dimensional face model.
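The thin plate spline step mentioned above can be sketched as follows. This is an illustrative interpolation, not the patent's implementation: it fits a three-dimensional thin plate spline that carries generic-model landmarks `src` exactly onto reconstructed landmarks `dst`, using the kernel U(r) = r commonly used in three dimensions, plus an affine term. The names `fit_tps_3d` and `_solve` are hypothetical, and at least four non-coplanar, distinct landmarks are assumed.

```python
import math


def _solve(a, b):
    """Gaussian elimination with partial pivoting for a x = b."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("landmarks are degenerate (coplanar or repeated)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_tps_3d(src, dst):
    """Return a function warping 3D points so that src[i] maps onto dst[i]."""
    n = len(src)
    size = n + 4
    system = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            system[i][j] = math.dist(src[i], src[j])  # kernel U(r) = r
        system[i][n] = system[n][i] = 1.0              # constant term
        for k in range(3):                             # linear terms
            system[i][n + 1 + k] = system[n + 1 + k][i] = float(src[i][k])
    # One coefficient vector per output coordinate (x, y, z).
    coeffs = [_solve(system, [float(p[axis]) for p in dst] + [0.0] * 4)
              for axis in range(3)]

    def warp(p):
        out = []
        for c in coeffs:
            value = c[n] + sum(c[n + 1 + k] * p[k] for k in range(3))
            value += sum(c[i] * math.dist(p, src[i]) for i in range(n))
            out.append(value)
        return tuple(out)

    return warp
```

Applying the returned `warp` to every vertex of a generic face mesh deforms it nonlinearly so that its landmarks coincide with the measured ones; when the landmark displacement is purely affine, the spline weights vanish and only the affine term remains.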

(IV) SFM algorithm

To guarantee the precision of the model, the present invention also uses an SFM (Structure From Motion) algorithm to retain the specific features of the generic face model and, by comparison with the generic face model, to correct the error between the two faces. The specific steps are:

First, the data [x_f1, y_f1, z_f1] obtained from f1 is taken as the baseline tracking data. The motion and structural change of the facial feature points are then estimated. Next, the motion estimates are refined. Finally, the estimate is compared with the face coordinates of the next frame f2 to judge whether they fall within the estimated interval; if not, the frame is discarded and extraction continues with the next frame. Repeating this loop yields a face result whose basic shape is preserved.
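The accept-or-discard loop described above can be sketched in simplified form. The code below is not an SFM implementation: it stands in for the motion estimate with a constant-velocity prediction and accepts a frame only when every feature point lies within a tolerance of its predicted position. The function name `filter_frames`, the parameter `tol` and the constant-velocity model are all assumptions made for illustration.

```python
import math


def filter_frames(frames, tol):
    """Keep frames whose feature points stay close to their predicted positions.

    `frames` is a list of frames; each frame is a list of (x, y, z) feature
    points in a fixed order. The first frame is the baseline tracking data.
    """
    kept = [frames[0]]
    velocity = [(0.0, 0.0, 0.0)] * len(frames[0])
    for frame in frames[1:]:
        last = kept[-1]
        # Predict each point from the last accepted frame and its velocity.
        predicted = [tuple(p[k] + v[k] for k in range(3))
                     for p, v in zip(last, velocity)]
        error = max(math.dist(q, p) for q, p in zip(frame, predicted))
        if error > tol:
            continue  # outside the estimated interval: discard this frame
        # Refine the motion estimate from the newly accepted frame.
        velocity = [tuple(q[k] - p[k] for k in range(3))
                    for q, p in zip(frame, last)]
        kept.append(frame)
    return kept
```

The sketch ignores the time gap left by a discarded frame, so the velocity is treated as a per-accepted-frame displacement; a fuller treatment would scale the prediction by the elapsed frame count.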

(V) Triangle subdivision

The three-dimensional portrait is built from the depth information of points using triangle subdivision. The procedure is as follows:

Step 1, select the triangles to subdivide: let k be the longest side length of the i-th triangle; the triangle is marked when k > m, where m is a preset threshold, taken as 0.15 in the present invention. All triangles in the mesh model are traversed, and those that need splitting are marked.

Step 2, group the marked triangles: the marked triangles are combined into n blocks according to their adjacency, and these n blocks are separated out and denoted B = {b₁, b₂, b₃, …, b_n}; the unmarked part of the three-dimensional face mesh is denoted R_i.

Step 3, subdivide the separated mesh blocks: to subdivide B_i, the weights of the four vertices are adjusted to 0, 1/2, 1/2 and 0 respectively, so that a point inserted on the boundary is always a midpoint and the boundary shape stays consistent.

Step 4, adjust the boundary of R_i: since new points are inserted at the midpoints of the boundary of B_i, the same adjustment is made on the boundary of the unsubdivided R_i, so that the merged meshes coincide along the stitching boundary.

Step 5, merge R and B.

After steps 1-5, the subdivision of the R mesh is complete, and R and B have consistent division points on their common boundary; combining the two accomplishes one subdivision of the whole original mesh. The above steps are repeated until the side lengths of all triangles are below the threshold, at which point the model accuracy requirement is met.
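The stopping rule of this procedure (subdivide until every triangle side is shorter than the threshold m) can be illustrated with a simpler scheme. The sketch below uses longest-edge bisection instead of the block-wise B/R subdivision described above: it repeatedly inserts the midpoint of the longest edge and splits every triangle sharing that edge, which keeps the mesh free of cracks for the same reason the text gives, namely that inserted boundary points are always midpoints shared by both sides. The function name `subdivide` is hypothetical; the default threshold 0.15 is the value stated in the text.

```python
import math


def subdivide(vertices, triangles, threshold=0.15):
    """Bisect the longest edge until every edge is shorter than `threshold`.

    `vertices` is a list of (x, y, z) tuples and `triangles` a list of
    index triples. Returns new vertex and triangle lists.
    """
    verts = [tuple(v) for v in vertices]
    tris = [tuple(t) for t in triangles]
    while True:
        # Screening step: find the longest edge in the current mesh.
        longest, edge = 0.0, None
        for a, b, c in tris:
            for i, j in ((a, b), (b, c), (c, a)):
                d = math.dist(verts[i], verts[j])
                if d > longest:
                    longest, edge = d, (i, j)
        if longest < threshold:
            return verts, tris
        i, j = edge
        mid = len(verts)
        verts.append(tuple((p + q) / 2 for p, q in zip(verts[i], verts[j])))
        new_tris = []
        for tri in tris:
            if i in tri and j in tri:
                # Split every triangle sharing the edge at the same midpoint,
                # so neighbouring triangles agree along the split edge.
                new_tris.append(tuple(mid if v == j else v for v in tri))
                new_tris.append(tuple(mid if v == i else v for v in tri))
            else:
                new_tris.append(tri)
        tris = new_tris
```

Each split replaces one vertex of a triangle with the midpoint while keeping the vertex order, so triangle orientation and total surface area are preserved.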

Fig. 8 is a schematic diagram of an embodiment in which a three-dimensional portrait model is built by the above method.

Compared with the prior art, the present invention builds the three-dimensional face model from image information captured by a single camera with a fixed position and viewing angle, which is simple to operate and very convenient. By performing facial feature detection, three-dimensional spatial positioning, and synchronized facial feature tracking and recognition on the continuous video sequence, key frames containing the face are extracted, the displacement of the face position is tracked dynamically, a three-dimensional relationship is established, and the spatial position relationship of each facial feature point is determined. This solves the inability of the prior art to synchronously track and recognize facial feature parameters and attributes of dynamic face images, such as motion trajectories, and it gives good real-time performance and high accuracy. The specific features of the generic face model are retained using the SFM algorithm, and the face image is smoothed using triangle subdivision, which further improves the accuracy and realism of the face model.

The above describes preferred embodiments of the present invention, but the invention is not limited to these embodiments. Those of ordinary skill in the art can make various equivalent variations or substitutions without departing from the spirit of the invention, and all such equivalent variations or substitutions fall within the scope defined by the claims of this application.

Claims

1. A method for constructing a three-dimensional face model from multi-frame video images, characterized by comprising:

A. performing three-dimensional reconstruction on the two-dimensional surveillance picture captured by a camera with a fixed position and viewing angle, thereby obtaining a three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture;

B. extracting from the input video a multi-frame continuous video sequence containing the target's motion, shape, texture and colour information;

C. performing facial feature detection, three-dimensional spatial positioning, and synchronized facial feature tracking and recognition on the multi-frame continuous video sequence, thereby obtaining the three-dimensional facial feature points of the sequence;

D. superimposing the three-dimensional facial feature points of the multi-frame continuous video sequence according to the three-dimensional space model of the camera's surveillance picture, thereby forming a three-dimensional face mesh and generating three-dimensional face model data;

wherein step C comprises:

C2. performing facial feature detection and facial feature extraction on the current video frame, thereby obtaining the facial feature points of the current video frame;

C3. performing three-dimensional spatial positioning on the facial feature points of the current video frame, and detecting the spatial information of the facial feature points, the motion trajectory of the facial features and the temporal information contained in the current video frame;

C4. performing synchronized facial feature tracking and automatic recognition on the current video frame according to the detection result, thereby determining the spatial position coordinates of each facial feature point of the current video frame as it moves;

C5. selecting the next single video frame from the multi-frame continuous video sequence as the current video frame and returning to step C2, thereby generating a three-dimensional coordinate system matrix of the facial features from the continuous motion states of the sequence at different times.

2. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that step A comprises:

A1. establishing a homography according to the intrinsic parameter matrix of the camera, the homography describing the relationship between the actual ground plane and the ground plane in the image captured by the camera;

A2. calculating the viewpoint origin of the camera from the camera height and two given reference lines of known length that are perpendicular to the ground plane;

A3. reconstructing the three-dimensional model of the camera view from the homography, the viewpoint origin of the camera and a given visualization model, thereby obtaining the three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture.

3. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that step D is specifically:

performing three-dimensional face key-frame superposition on the multiple groups of three-dimensional facial feature points of the multi-frame continuous video sequence to generate a three-dimensional face model data index list, thereby establishing a structured three-dimensional face model data list, and storing the three-dimensional face model data index list.

4. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that step D comprises:

D1. substituting the facial feature points of the multi-frame continuous video sequence into the three-dimensional space model of the camera's surveillance picture, thereby obtaining the spatial coordinates of the three-dimensional facial feature points of the sequence;

D2. according to the spatial coordinates of the three-dimensional facial feature points, generating a texture image using a three-dimensional image stitching algorithm and mapping the generated texture image, thereby obtaining realistic three-dimensional face model data.

5. The method for constructing a three-dimensional face model from multi-frame video images according to claim 4, characterized in that step D2 comprises:

D21. reconstructing a sparse set of face marker points from the multi-frame continuous video sequence according to the spatial coordinates of the three-dimensional facial feature points, and applying a thin plate spline (TPS) trial to the sparse set, element by element;

D22. applying a nonlinear transformation to the generic face model according to the result of the TPS trial, thereby obtaining a matched three-dimensional face model;

D23. obtaining the face texture feature information of the multi-frame continuous video sequence using a three-dimensional image stitching algorithm, and mapping the obtained face texture feature information onto the matched three-dimensional face model, thereby obtaining realistic three-dimensional face model data.

6. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that a step E is provided after step D, step E being specifically:

retaining the specific features of the generic face model using an SFM algorithm and, by comparison with the generic face model, correcting the error between the generated three-dimensional face model data and the generic face model data; then building the final three-dimensional face model from the depth information of points using a triangle subdivision method.

7. The method for constructing a three-dimensional face model from multi-frame video images according to claim 6, characterized in that the step in step E of building the final three-dimensional face model from the depth information of points using the triangle subdivision method comprises:

E21. selecting from the three-dimensional face mesh, according to a preset threshold, the triangles that need subdivision, and marking the selected triangles;

E22. combining the marked triangles into n mesh blocks according to their adjacency, separating out these n mesh blocks and denoting them B = {b₁, b₂, b₃, …, b_n}, while denoting the unmarked part of the three-dimensional face mesh R_i;

E23. adjusting the weights of the four vertices of each mesh block in B_i to 0, 1/2, 1/2 and 0 respectively, thereby performing mesh interpolation subdivision on B_i;

E24. performing interpolation subdivision on the boundary of the unsubdivided R_i, so that each point inserted on the boundary lies at a midpoint;

E25. merging B and R, and judging whether every triangle of the merged mesh model has side lengths below the preset threshold; if so, taking the merged mesh model as the final three-dimensional face model; otherwise, returning to step E21.