Summary of the invention
The invention provides an expression interaction method based on face tracking and expression analysis. Facial expression images of a person are captured by a camera; the proposed real-time face tracking and expression analysis technique processes the captured face images to track the face and extract expression parameters; the extracted expression parameters then drive a target three-dimensional face model to perform the same expression animation. A schematic diagram of the invention is shown in Figure 1.
To achieve these goals, the present invention proposes the following technical scheme:
(1) Design the three-dimensional model of a given character and make several typical expression models of this character (performed off-line);
(2) Train three active appearance models under different side-face angles (performed off-line);
(3) If a face is present in the previous frame, use the previous frame's parameters as the initial values of the active appearance model; if tracking is lost or a face enters the picture for the first time, detect the face with the Adaboost algorithm and initialize the active appearance model with the detected face size and position;
(4) Minimize the energy function to obtain the optimal active appearance model parameters and expression parameters of the current frame, and detect the state of the eyes;
(5) Use the obtained expression parameters and eye state to drive the prepared three-dimensional model so that it generates the same expression as the performer;
(6) Update the camera data and begin the expression analysis and expression driving of the next frame. (A sketch of this per-frame loop is given after this list.)
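As a minimal illustration of steps (3) to (6), the per-frame loop might be organized as in the following Python skeleton. All helper functions here are hypothetical placeholders standing in for the components detailed in the embodiment sections below; they are not disclosed code.

```python
# Hypothetical skeleton of the per-frame loop in steps (3)-(6).
# All helper functions are illustrative placeholders, not disclosed code.
import cv2

def detect_face(frame):
    """Step (3): Adaboost face detection; returns a face box or None."""
    return None  # placeholder

def init_aam_params(face_box):
    """Step (3): initialize AAM parameters from face size and position."""
    return face_box  # placeholder

def fit_aam(frame, aam_set, prev_params):
    """Step (4): minimize the energy function; returns
    (aam_params, expression_params, eye_state, tracking_lost)."""
    return prev_params, [], "open", False  # placeholder

def drive_model(model_3d, expression_params, eye_state):
    """Step (5): drive the 3D model by linear interpolation."""
    pass  # placeholder

def run_pipeline(model_3d, aam_set):
    cap = cv2.VideoCapture(0)              # camera capture
    prev_params = None                     # AAM parameters of previous frame
    while True:
        ok, frame = cap.read()             # step (6): update camera data
        if not ok:
            break
        if prev_params is None:            # lost, or face enters first time
            face_box = detect_face(frame)
            if face_box is None:
                continue
            prev_params = init_aam_params(face_box)
        params, expr, eye_state, lost = fit_aam(frame, aam_set, prev_params)
        if lost:
            prev_params = None             # re-detect on the next frame
            continue
        drive_model(model_3d, expr, eye_state)
        prev_params = params               # warm start for the next frame
```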
The advantages of the invention are:
1. High versatility: the user can substitute any desired three-dimensional character model, and the same performer can drive different face models.
2. Robust face tracking with fast processing: several typical facial expressions can be captured accurately in real time; on a Pentium 4 2 GHz computer, a processing speed of 25 frames/s is achieved.
3. No manual interaction is required, so the method is suitable for ordinary users.
Embodiment
The invention is described in detail below with reference to the accompanying drawings. Note that the described embodiments are intended only to facilitate understanding of the invention and do not limit it in any way. The invention is described through the following embodiments: making the three-dimensional model, training the active appearance models, face tracking initialization, setting the energy function for face tracking, expression analysis, eye open/closed state detection, and three-dimensional model driving. The concrete implementation process is as follows:
1. Making the three-dimensional model
Three-dimensional model making belongs to the off-line preprocessing stage. Its purpose is to design a three-dimensional face model together with the corresponding models under 14 expression states; some of these models are shown in Figure 2. During the expression interaction, this model reproduces the performer's expression in real time. In the invention, 14 basic expressions are defined according to the characteristics of human facial action, and the models are made according to the following table.
Table 1: Modeling notes for the expression models
| No. | Expression state | Modeling notes |
| 1 | Neutral model | Expressionless; mouth slightly closed; eyes open |
| 2 | Mouth wide open | The model's mouth opens wide |
| 3 | Pout | The lips are pushed forward |
| 4 | Grin | The mouth is stretched open in a straight line |
| 5 | Smile | The corners of the mouth turn up |
| 6 | Sad | The corners of the mouth turn down |
| 7 | Left eye closed | The model's left eye is closed; everything else unchanged |
| 8 | Right eye closed | The model's right eye is closed; everything else unchanged |
| 9 | Left anger | The left eyebrow is shaped into an angry expression |
| 10 | Right anger | The right eyebrow is shaped into an angry expression |
| 11 | Left glare | The left eye opens wide |
| 12 | Right glare | The right eye opens wide |
| 13 | Left brow raised | The left eyebrow is raised |
| 14 | Right brow raised | The right eyebrow is raised |
2. Training the active appearance models
Because the performer's head pose cannot be controlled during tracking, the invention proposes a multi-angle face tracking method to strengthen the robustness of face tracking. Three active appearance models are trained under different side-face (yaw) angles, each corresponding to a different range of head rotation. During tracking, if the side-face angle of the face exceeds a certain number of degrees, the active appearance model for the new angle range is loaded, improving the accuracy of face tracking. For each active appearance model, the training process is as follows:
(21) Collect face samples under the given angle off-line, and calibrate the face shape of each sample;
(22) Normalize the shape and texture of the sample faces. The texture comprises three channels: a shape-free gray-level texture map, an x-direction gradient map, and a y-direction gradient map; the gradient maps are introduced to strengthen robustness against lighting interference;
(23) Apply PCA to the normalized shapes and textures to obtain the shape model and texture model of the active appearance model:
S = S0 + Σi pi Si,  A = A0 + Σi λi Ai
where S represents the face shape and p is the shape parameter; A is the three-channel texture image and λ is the texture parameter. In addition, the Hessian matrix needed in the iteration is computed. (A sketch of the PCA step is given below.)
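As an illustration of step (23), a PCA model can be built with numpy as in the following sketch. The layout of the training matrix (one normalized sample per row) and the retained-variance threshold are assumptions made for the example, not values from the specification.

```python
import numpy as np

def build_pca_model(samples, var_keep=0.98):
    """PCA model sketch: samples is (n_samples, dim), rows already
    normalized as in steps (21)-(22). Returns mean and basis so that
    any sample is approximated as mean + basis.T @ params."""
    mean = samples.mean(axis=0)
    X = samples - mean
    # SVD of the centered data yields the principal components.
    U, sing, Vt = np.linalg.svd(X, full_matrices=False)
    var = sing ** 2
    # Keep enough components to explain var_keep of the total variance.
    k = int(np.searchsorted(np.cumsum(var) / var.sum(), var_keep)) + 1
    return mean, Vt[:k]

# Shape model S = S0 + sum_i p_i S_i and texture model A = A0 + sum_i l_i A_i:
# S0, S_basis = build_pca_model(shape_samples)
# A0, A_basis = build_pca_model(texture_samples)  # 3-channel maps, flattened
```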
3. Face tracking initialization
When a person enters the picture for the first time, or when tracking is lost, face tracking must be initialized automatically: the position and size of the face are detected, and this information is used to initialize the parameters of the active appearance model. The invention uses Adaboost for automatic face detection. Adaboost (Adaptive Boosting) is a widely used statistical learning algorithm that has been applied successfully to face detection and face classification. Adaboost obtains its final strong classifier as a cascade of weak classifiers; the first few weak classifiers rapidly discard large numbers of non-face image regions, so that subsequent classifiers can concentrate on discriminating face-like regions.
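A minimal sketch of this initialization follows, assuming OpenCV's Haar cascade (itself a cascaded Adaboost detector) stands in for the face detector; reducing the detected box to a center position and scale is a simplification of the AAM parameter initialization.

```python
import cv2

# OpenCV's Haar cascade is a cascaded Adaboost detector: early stages
# reject most non-face regions cheaply, later stages refine the decision.
_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_init(gray):
    """Return (cx, cy, scale) to initialize the AAM, or None if no face.
    gray is a single-channel image; minSize filters tiny false positives."""
    faces = _cascade.detectMultiScale(gray, scaleFactor=1.1,
                                      minNeighbors=5, minSize=(40, 40))
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest face
    return (x + w / 2.0, y + h / 2.0, w)  # face center and size
```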
4. Setting the energy function
The face tracking process is precisely a process of minimizing the energy function value of the active appearance model, and the form of the energy function strongly influences the precision of face tracking. To improve tracking accuracy, the invention proposes a new energy function composed of three parts:
(41) Global texture difference constraint:
E1(p) = ‖ A(p) − A0 ‖²
where A(p) is the shape-free (three-channel) texture image sampled from the input frame under the shape determined by p, and A0 is the mean texture image. This energy term is consistent with the original AAM algorithm; the difference is that A is a three-channel texture image. Its physical meaning is that, by continually optimizing the parameter p, the residual between the shape-free texture image obtained from the shape and the mean texture image is minimized.
(42) Local texture difference constraint:
E2(p) = Σj∈Ωt ‖ A(Rj) − At−1(Rj) ‖²
where Ωt is the set of detected facial feature points, Rj is a 9 × 9 block centered on feature point j, and At−1 is the face texture image of the previous frame. The physical meaning of this term is that, by optimizing the parameter p, the residual between the texture of the block determined by the current feature point and the texture of the corresponding block around the previous frame's feature point is minimized. This term guarantees frame-to-frame consistency during tracking and avoids parameter jumps.
(43) Skin color region constraint:
E3(p) = Σx∈S(p) ID(x)
The face shape determined by p may depart from the face region during iteration, so this term is introduced. ID is a gray-level image whose value is 0 inside the face region and 255 in non-face regions. The face region is determined by a facial skin color model: a skin color model is trained from the face region detected in the first frame and is updated during subsequent tracking. The physical meaning of this term is to keep the iteration inside the valid face region and prevent it from straying too far from the true value.
In the parameter optimization, the energy function of the invention is the combination of the above three terms:
E(p) = E1(p) + ω2·E2(p) + ω3·E3(p)
where ω2 and ω3 are weight coefficients that adjust the influence of each term. The optimal shape parameter p is solved with the inverse compositional algorithm, and the face shape is then obtained from the shape expression.
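The following sketch shows how the combined energy might be evaluated, assuming the caller has already warped the shape-free texture and cropped the 9 × 9 feature blocks for the current parameter p; the weight values are illustrative only.

```python
import numpy as np

def total_energy(warped_tex, mean_tex, blocks_cur, blocks_prev,
                 shape_pts, skin_dist, w2=0.5, w3=0.01):
    """Sketch of E(p) = E1 + w2*E2 + w3*E3. Assumes the caller has
    already warped textures and cropped 9x9 blocks for the current p."""
    # E1: global residual between shape-free texture and mean texture
    e1 = np.sum((warped_tex - mean_tex) ** 2)
    # E2: per-feature-point 9x9 block residual against the previous frame
    e2 = sum(np.sum((c - q) ** 2) for c, q in zip(blocks_cur, blocks_prev))
    # E3: skin-color penalty, summing the 0/255 image I_D over the shape's
    # points, so a shape drifting off the face region is penalized
    xs = shape_pts[:, 0].astype(int)
    ys = shape_pts[:, 1].astype(int)
    e3 = float(np.sum(skin_dist[ys, xs]))
    return e1 + w2 * e2 + w3 * e3
```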
5. Expression analysis
To analyze the facial expression of each frame, the invention analyzes several typical expression actions after the face shape is obtained. The CANDIDE three-dimensional face model is introduced and modified to match the expressions of Table 1. The CANDIDE shape has the following form:
g = ḡ + Sσ + Aα
where ḡ is the three-dimensional mean face shape, S contains the shape variation components, and A contains the expression action components. The term ḡ + Sσ describes the face shape of a particular person, and Aα represents this person's expression action. At the first frame of face tracking, it is assumed that there is no expression action, and this person's face shape ḡ + Sσ is determined; during subsequent tracking, whenever there is an expression action, the face shape term remains unchanged. Extracting the expression parameters then amounts to minimizing the following energy function:
E = ‖ S′(p) − P(Q(g′(σ, α))) ‖²
where the prime (′) denotes a selected subset of feature points in the shape, S(p) is the face shape obtained by tracking, Q(·) represents the rotation of the three-dimensional shape model, i.e., the head pose, and P(·) represents the projection operation that projects the three-dimensional shape onto the image plane. The physical meaning of this energy function is to optimize the parameters σ and α so that the three-dimensional shape model, after rotation and projection, coincides with the shape obtained by tracking. σ is determined at the first frame and remains unchanged during tracking; only the action parameter α changes, and the expression action parameters of each frame are extracted in this way.
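A sketch of this per-frame minimization over α with scipy's least-squares solver follows, under simplifying assumptions: σ is already folded into the neutral shape at the first frame, Q(·) is a fixed 3 × 3 rotation matrix, and P(·) is an orthographic projection that drops the z coordinate.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_expression(S_tracked, g_mean, S_sigma, A_basis, Q, alpha0):
    """Minimize ||S'(p) - P(Q(g'(sigma, alpha)))||^2 over alpha.
    S_tracked: (n, 2) tracked feature points; g_mean + S_sigma: (n, 3)
    person-specific neutral shape (sigma fixed after frame 1);
    A_basis: (k, n, 3) expression action components; Q: (3, 3) rotation."""
    neutral = g_mean + S_sigma

    def residual(alpha):
        g = neutral + np.tensordot(alpha, A_basis, axes=1)  # g' = shape + A*alpha
        proj = (g @ Q.T)[:, :2]      # rotate, then orthographic projection P
        return (proj - S_tracked).ravel()

    return least_squares(residual, alpha0).x  # expression action parameters
```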
6. Eye state recognition
Because the resolution of the images captured by the camera is limited, the global AAM, although it yields a good face shape localization result, localizes the eyes with limited accuracy; the invention therefore processes the eyes further. Eye processing comprises fine localization of the eye shape and detection of the eye open/closed state. Fine localization of the eyes proceeds as follows:
(1) Train a local active appearance model of the eye region, following the process of Section 2 (performed off-line);
(2) Initialize the local AAM with the face localization result of the global AAM;
(3) Iterate the local AAM to convergence to obtain the fine localization of the eyes.
For detection of the eye open/closed state, the invention adopts LBP histogram features with an SVM linear classifier. The concrete implementation process is as follows (an illustrative sketch is given after the list):
(1) Collect a large number of open-eye and closed-eye samples, and compute the LBP histogram of each sample as the classification feature (performed off-line);
(2) Train a linear SVM classifier for eye open/closed state detection;
(3) On the basis of the fine eye localization, compute the LBP histogram of the eye-region image and use the classifier to detect the state of the eyes.
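An illustrative version of the open/closed detection using skimage's LBP and scikit-learn's linear SVM is sketched below; the LBP neighborhood settings (8 samples, radius 1, uniform patterns) are common defaults assumed here, not values given in the specification.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import LinearSVC

P, R = 8, 1  # assumed LBP neighborhood: 8 samples on a radius-1 circle

def lbp_histogram(eye_gray):
    """Normalized LBP histogram feature of a cropped eye-region image."""
    codes = local_binary_pattern(eye_gray, P, R, method="uniform")
    n_bins = P + 2  # uniform LBP yields P + 2 distinct codes
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins),
                           density=True)
    return hist

# Off-line: features from open-eye / closed-eye samples, labels 1 / 0:
# clf = LinearSVC().fit(np.stack([lbp_histogram(s) for s in samples]), labels)
# On-line, after fine eye localization:
# is_open = clf.predict([lbp_histogram(eye_crop)])[0] == 1
```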
7. Three-dimensional model driving
In the invention, the three-dimensional model is driven by linear interpolation. Combining the prepared models with the extracted expression action parameters, the displacement of each model vertex under a given expression type is:
Di = αi (Vi − V0)
where Vi is the vertex coordinate under expression type i, V0 is the vertex coordinate of the expressionless model, and αi is the intensity of expression i. The final model with expression is then:
V = V0 + Σi αi (Vi − V0)
where the expression types i correspond to Table 1. Finally, the pose parameter (the rotation matrix Q(·)) is used to rotate the model so that the head pose of the model is consistent with the performer's pose.
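The linear-interpolation drive follows directly from the formula; in this sketch the models of Table 1 are assumed to be numpy vertex arrays with identical topology, and Q is the head-pose rotation matrix.

```python
import numpy as np

def drive_vertices(V0, expr_models, alphas, Q):
    """Blend-shape drive: V = V0 + sum_i alpha_i * (V_i - V0), then rotate.
    V0: (n, 3) neutral model vertices (Table 1, No. 1);
    expr_models: list of (n, 3) vertex arrays for expression types i;
    alphas: per-expression intensities alpha_i from the tracker;
    Q: (3, 3) head-pose rotation matrix."""
    V = V0.copy()
    for a_i, V_i in zip(alphas, expr_models):
        V += a_i * (V_i - V0)   # displacement D_i = alpha_i * (V_i - V0)
    return V @ Q.T              # apply the performer's head pose
```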
The above description is intended to realize the invention and its embodiments, and the scope of the invention should therefore not be limited by this description. Those skilled in the art should appreciate that any modification or partial replacement that does not depart from the scope of the invention falls within the scope defined by the claims of the invention.