Summary of the invention
The invention provides an expression interaction method based on face tracking and expression analysis. Facial expression images of a person are captured by a camera; the proposed real-time face tracking and expression analysis technique processes the captured face images to track the face and extract expression parameters; the extracted expression parameters then drive a target three-dimensional face model to perform the same expression animation. A schematic diagram of the invention is shown in Figure 1.
To achieve these goals, the present invention proposes the following technical scheme:
(1) Design the three-dimensional model of a given character and make several typical expression models of this character (performed off-line);
(2) Train three active appearance models under different side-face angles (performed off-line);
(3) If a face is present in the previous frame, use the previous frame's parameters as the initial values of the active appearance model; if tracking is lost or a face enters the picture for the first time, detect the face with the Adaboost algorithm and initialize the active appearance model with the detected face size and position;
(4) Minimize the energy function to obtain the optimal active appearance model parameters and expression parameters of the current frame, and detect the state of the eyes;
(5) Use the obtained expression parameters and eye state to drive the prepared three-dimensional model so that it generates the same expression as the performer;
(6) Update the camera data and begin the expression analysis and expression driving of the next frame. (A sketch of this per-frame loop is given after this list.)
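As a minimal illustration of steps (3) to (6), the per-frame loop might be organized as in the following Python skeleton. All helper functions here are hypothetical placeholders standing in for the components detailed in the embodiment sections below; they are not disclosed code.

```python
# Hypothetical skeleton of the per-frame loop in steps (3)-(6).
# All helper functions are illustrative placeholders, not disclosed code.
import cv2

def detect_face(frame):
    """Step (3): Adaboost face detection; returns a face box or None."""
    return None  # placeholder

def init_aam_params(face_box):
    """Step (3): initialize AAM parameters from face size and position."""
    return face_box  # placeholder

def fit_aam(frame, aam_set, prev_params):
    """Step (4): minimize the energy function; returns
    (aam_params, expression_params, eye_state, tracking_lost)."""
    return prev_params, [], "open", False  # placeholder

def drive_model(model_3d, expression_params, eye_state):
    """Step (5): drive the 3D model by linear interpolation."""
    pass  # placeholder

def run_pipeline(model_3d, aam_set):
    cap = cv2.VideoCapture(0)              # camera capture
    prev_params = None                     # AAM parameters of previous frame
    while True:
        ok, frame = cap.read()             # step (6): update camera data
        if not ok:
            break
        if prev_params is None:            # lost, or face enters first time
            face_box = detect_face(frame)
            if face_box is None:
                continue
            prev_params = init_aam_params(face_box)
        params, expr, eye_state, lost = fit_aam(frame, aam_set, prev_params)
        if lost:
            prev_params = None             # re-detect on the next frame
            continue
        drive_model(model_3d, expr, eye_state)
        prev_params = params               # warm start for the next frame
```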
The advantages of the invention are:
1. High versatility: the user can substitute any desired three-dimensional character model, and the same performer can drive different face models.
2. Robust face tracking with fast processing: several typical facial expressions can be captured accurately in real time; on a Pentium 4 2 GHz computer, a processing speed of 25 frames/s is achieved.
3. No manual interaction is required, so the method is suitable for ordinary users.
Embodiment
The invention is described in detail below with reference to the accompanying drawings. Note that the described embodiments are intended only to facilitate understanding of the invention and do not limit it in any way. The invention is described through the following embodiments: making the three-dimensional model, training the active appearance models, face tracking initialization, setting the energy function for face tracking, expression analysis, eye open/closed state detection, and three-dimensional model driving. The concrete implementation process is as follows:
1. Making the three-dimensional model
Three-dimensional model making belongs to the off-line preprocessing stage. Its purpose is to design a three-dimensional face model together with the corresponding models under 14 expression states; some of these models are shown in Figure 2. During the expression interaction, this model reproduces the performer's expression in real time. In the invention, 14 basic expressions are defined according to the characteristics of human facial action, and the models are made according to the following table.
Table 1: Modeling notes for the expression models
| No. | Expression state | Modeling notes |
| 1 | Neutral model | Expressionless; mouth slightly closed; eyes open |
| 2 | Mouth wide open | The model's mouth opens wide |
| 3 | Pout | The lips are pushed forward |
| 4 | Grin | The mouth is stretched open in a straight line |
| 5 | Smile | The corners of the mouth turn up |
| 6 | Sad | The corners of the mouth turn down |
| 7 | Left eye closed | The model's left eye is closed; everything else unchanged |
| 8 | Right eye closed | The model's right eye is closed; everything else unchanged |
| 9 | Left anger | The left eyebrow is shaped into an angry expression |
| 10 | Right anger | The right eyebrow is shaped into an angry expression |
| 11 | Left glare | The left eye opens wide |
| 12 | Right glare | The right eye opens wide |
| 13 | Left brow raised | The left eyebrow is raised |
| 14 | Right brow raised | The right eyebrow is raised |
2. Training the active appearance models
Because the performer's head pose cannot be controlled during tracking, the invention proposes a multi-angle face tracking method to strengthen the robustness of face tracking. Three active appearance models are trained under different side-face (yaw) angles, each corresponding to a different range of head rotation. During tracking, if the side-face angle of the face exceeds a certain number of degrees, the active appearance model for the new angle range is loaded, improving the accuracy of face tracking. For each active appearance model, the training process is as follows:
(21) Collect face samples under the given angle off-line, and calibrate the face shape of each sample;
(22) Normalize the shape and texture of the sample faces. The texture comprises three channels: a shape-free gray-level texture map, an x-direction gradient map, and a y-direction gradient map; the gradient maps are introduced to strengthen robustness against lighting interference;
(23) Apply PCA to the normalized shapes and textures to obtain the shape model and texture model of the active appearance model:
S = S0 + Σi pi Si,  A = A0 + Σi λi Ai
where S represents the face shape and p is the shape parameter; A is the three-channel texture image and λ is the texture parameter. In addition, the Hessian matrix needed in the iteration is computed. (A sketch of the PCA step is given below.)
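As an illustration of step (23), a PCA model can be built with numpy as in the following sketch. The layout of the training matrix (one normalized sample per row) and the retained-variance threshold are assumptions made for the example, not values from the specification.

```python
import numpy as np

def build_pca_model(samples, var_keep=0.98):
    """PCA model sketch: samples is (n_samples, dim), rows already
    normalized as in steps (21)-(22). Returns mean and basis so that
    any sample is approximated as mean + basis.T @ params."""
    mean = samples.mean(axis=0)
    X = samples - mean
    # SVD of the centered data yields the principal components.
    U, sing, Vt = np.linalg.svd(X, full_matrices=False)
    var = sing ** 2
    # Keep enough components to explain var_keep of the total variance.
    k = int(np.searchsorted(np.cumsum(var) / var.sum(), var_keep)) + 1
    return mean, Vt[:k]

# Shape model S = S0 + sum_i p_i S_i and texture model A = A0 + sum_i l_i A_i:
# S0, S_basis = build_pca_model(shape_samples)
# A0, A_basis = build_pca_model(texture_samples)  # 3-channel maps, flattened
```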
3. Face tracking initialization
When a person enters the picture for the first time, or when tracking is lost, face tracking must be initialized automatically: the position and size of the face are detected, and this information is used to initialize the parameters of the active appearance model. The invention uses Adaboost for automatic face detection. Adaboost (Adaptive Boosting) is a widely used statistical learning algorithm that has been applied successfully to face detection and face classification. Adaboost obtains its final strong classifier as a cascade of weak classifiers; the first few weak classifiers rapidly discard large numbers of non-face image regions, so that subsequent classifiers can concentrate on discriminating face-like regions.
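A minimal sketch of this initialization follows, assuming OpenCV's Haar cascade (itself a cascaded Adaboost detector) stands in for the face detector; reducing the detected box to a center position and scale is a simplification of the AAM parameter initialization.

```python
import cv2

# OpenCV's Haar cascade is a cascaded Adaboost detector: early stages
# reject most non-face regions cheaply, later stages refine the decision.
_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_init(gray):
    """Return (cx, cy, scale) to initialize the AAM, or None if no face.
    gray is a single-channel image; minSize filters tiny false positives."""
    faces = _cascade.detectMultiScale(gray, scaleFactor=1.1,
                                      minNeighbors=5, minSize=(40, 40))
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest face
    return (x + w / 2.0, y + h / 2.0, w)  # face center and size
```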
4. Setting the energy function
The face tracking process is precisely a process of minimizing the energy function value of the active appearance model, and the form of the energy function strongly influences the precision of face tracking. To improve tracking accuracy, the invention proposes a new energy function composed of three parts:
(41) Global texture difference constraint:
E1(p) = ‖ A(p) − A0 ‖²
where A(p) is the shape-free (three-channel) texture image sampled from the input frame under the shape determined by p, and A0 is the mean texture image. This energy term is consistent with the original AAM algorithm; the difference is that A is a three-channel texture image. Its physical meaning is that, by continually optimizing the parameter p, the residual between the shape-free texture image obtained from the shape and the mean texture image is minimized.
(42) Local texture difference constraint:
E2(p) = Σj∈Ωt ‖ A(Rj) − At−1(Rj) ‖²
where Ωt is the set of detected facial feature points, Rj is a 9 × 9 block centered on feature point j, and At−1 is the face texture image of the previous frame. The physical meaning of this term is that, by optimizing the parameter p, the residual between the texture of the block determined by the current feature point and the texture of the corresponding block around the previous frame's feature point is minimized. This term guarantees frame-to-frame consistency during tracking and avoids parameter jumps.
(43) Skin color region constraint:
E3(p) = Σx∈S(p) ID(x)
The face shape determined by p may depart from the face region during iteration, so this term is introduced. ID is a gray-level image whose value is 0 inside the face region and 255 in non-face regions. The face region is determined by a facial skin color model: a skin color model is trained from the face region detected in the first frame and is updated during subsequent tracking. The physical meaning of this term is to keep the iteration inside the valid face region and prevent it from straying too far from the true value.
In the parameter optimization, the energy function of the invention is the combination of the above three terms:
E(p) = E1(p) + ω2·E2(p) + ω3·E3(p)
where ω2 and ω3 are weight coefficients that adjust the influence of each term. The optimal shape parameter p is solved with the inverse compositional algorithm, and the face shape is then obtained from the shape expression.
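The following sketch shows how the combined energy might be evaluated, assuming the caller has already warped the shape-free texture and cropped the 9 × 9 feature blocks for the current parameter p; the weight values are illustrative only.

```python
import numpy as np

def total_energy(warped_tex, mean_tex, blocks_cur, blocks_prev,
                 shape_pts, skin_dist, w2=0.5, w3=0.01):
    """Sketch of E(p) = E1 + w2*E2 + w3*E3. Assumes the caller has
    already warped textures and cropped 9x9 blocks for the current p."""
    # E1: global residual between shape-free texture and mean texture
    e1 = np.sum((warped_tex - mean_tex) ** 2)
    # E2: per-feature-point 9x9 block residual against the previous frame
    e2 = sum(np.sum((c - q) ** 2) for c, q in zip(blocks_cur, blocks_prev))
    # E3: skin-color penalty, summing the 0/255 image I_D over the shape's
    # points, so a shape drifting off the face region is penalized
    xs = shape_pts[:, 0].astype(int)
    ys = shape_pts[:, 1].astype(int)
    e3 = float(np.sum(skin_dist[ys, xs]))
    return e1 + w2 * e2 + w3 * e3
```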
5. Expression analysis
To analyze the facial expression of each frame, the invention analyzes several typical expression actions after the face shape is obtained. The CANDIDE three-dimensional face model is introduced and modified to match the expressions of Table 1. The CANDIDE shape has the following form:
g = ḡ + Sσ + Aα
where ḡ is the three-dimensional mean face shape, S contains the shape variation components, and A contains the expression action components. The term ḡ + Sσ describes the face shape of a particular person, and Aα represents this person's expression action. At the first frame of face tracking, it is assumed that there is no expression action, and this person's face shape ḡ + Sσ is determined; during subsequent tracking, whenever there is an expression action, the face shape term remains unchanged. Extracting the expression parameters then amounts to minimizing the following energy function:
E = ‖ S′(p) − P(Q(g′(σ, α))) ‖²
where the prime (′) denotes a selected subset of feature points in the shape, S(p) is the face shape obtained by tracking, Q(·) represents the rotation of the three-dimensional shape model, i.e., the head pose, and P(·) represents the projection operation that projects the three-dimensional shape onto the image plane. The physical meaning of this energy function is to optimize the parameters σ and α so that the three-dimensional shape model, after rotation and projection, coincides with the shape obtained by tracking. σ is determined at the first frame and remains unchanged during tracking; only the action parameter α changes, and the expression action parameters of each frame are extracted in this way.
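A sketch of this per-frame minimization over α with scipy's least-squares solver follows, under simplifying assumptions: σ is already folded into the neutral shape at the first frame, Q(·) is a fixed 3 × 3 rotation matrix, and P(·) is an orthographic projection that drops the z coordinate.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_expression(S_tracked, g_mean, S_sigma, A_basis, Q, alpha0):
    """Minimize ||S'(p) - P(Q(g'(sigma, alpha)))||^2 over alpha.
    S_tracked: (n, 2) tracked feature points; g_mean + S_sigma: (n, 3)
    person-specific neutral shape (sigma fixed after frame 1);
    A_basis: (k, n, 3) expression action components; Q: (3, 3) rotation."""
    neutral = g_mean + S_sigma

    def residual(alpha):
        g = neutral + np.tensordot(alpha, A_basis, axes=1)  # g' = shape + A*alpha
        proj = (g @ Q.T)[:, :2]      # rotate, then orthographic projection P
        return (proj - S_tracked).ravel()

    return least_squares(residual, alpha0).x  # expression action parameters
```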
6. Eye state recognition
Because the resolution of the images captured by the camera is limited, the global AAM, although it yields a good face shape localization result, localizes the eyes with limited accuracy; the invention therefore processes the eyes further. Eye processing comprises fine localization of the eye shape and detection of the eye open/closed state. Fine localization of the eyes proceeds as follows:
(1) Train a local active appearance model of the eye region, following the process of Section 2 (performed off-line);
(2) Initialize the local AAM with the face localization result of the global AAM;
(3) Iterate the local AAM to convergence to obtain the fine localization of the eyes.
For detection of the eye open/closed state, the invention adopts LBP histogram features with an SVM linear classifier. The concrete implementation process is as follows (an illustrative sketch is given after the list):
(1) Collect a large number of open-eye and closed-eye samples, and compute the LBP histogram of each sample as the classification feature (performed off-line);
(2) Train a linear SVM classifier for eye open/closed state detection;
(3) On the basis of the fine eye localization, compute the LBP histogram of the eye-region image and use the classifier to detect the state of the eyes.
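An illustrative version of the open/closed detection using skimage's LBP and scikit-learn's linear SVM is sketched below; the LBP neighborhood settings (8 samples, radius 1, uniform patterns) are common defaults assumed here, not values given in the specification.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import LinearSVC

P, R = 8, 1  # assumed LBP neighborhood: 8 samples on a radius-1 circle

def lbp_histogram(eye_gray):
    """Normalized LBP histogram feature of a cropped eye-region image."""
    codes = local_binary_pattern(eye_gray, P, R, method="uniform")
    n_bins = P + 2  # uniform LBP yields P + 2 distinct codes
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins),
                           density=True)
    return hist

# Off-line: features from open-eye / closed-eye samples, labels 1 / 0:
# clf = LinearSVC().fit(np.stack([lbp_histogram(s) for s in samples]), labels)
# On-line, after fine eye localization:
# is_open = clf.predict([lbp_histogram(eye_crop)])[0] == 1
```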
7. Three-dimensional model driving
In the invention, the three-dimensional model is driven by linear interpolation. Combining the prepared models with the extracted expression action parameters, the displacement of each model vertex under a given expression type is:
Di = αi (Vi − V0)
where Vi is the vertex coordinate under expression type i, V0 is the vertex coordinate of the expressionless model, and αi is the intensity of expression i. The final model with expression is then:
V = V0 + Σi αi (Vi − V0)
where the expression types i correspond to Table 1. Finally, the pose parameter (the rotation matrix Q(·)) is used to rotate the model so that the head pose of the model is consistent with the performer's pose.
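The linear-interpolation drive follows directly from the formula; in this sketch the models of Table 1 are assumed to be numpy vertex arrays with identical topology, and Q is the head-pose rotation matrix.

```python
import numpy as np

def drive_vertices(V0, expr_models, alphas, Q):
    """Blend-shape drive: V = V0 + sum_i alpha_i * (V_i - V0), then rotate.
    V0: (n, 3) neutral model vertices (Table 1, No. 1);
    expr_models: list of (n, 3) vertex arrays for expression types i;
    alphas: per-expression intensities alpha_i from the tracker;
    Q: (3, 3) head-pose rotation matrix."""
    V = V0.copy()
    for a_i, V_i in zip(alphas, expr_models):
        V += a_i * (V_i - V0)   # displacement D_i = alpha_i * (V_i - V0)
    return V @ Q.T              # apply the performer's head pose
```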
The above description is intended to realize the invention and its embodiments, and the scope of the invention should therefore not be limited by this description. Those skilled in the art should appreciate that any modification or partial replacement that does not depart from the scope of the invention falls within the scope defined by the claims of the invention.