CN112700524A - 3D character facial expression animation real-time generation method based on deep learning - Google Patents

3D character facial expression animation real-time generation method based on deep learning

Info

Publication number
CN112700524A
Authority
CN
China
Prior art keywords
animation
picture
decoder
pictures
facial
Prior art date
2021-03-25
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110316439.5A
Other languages
Chinese (zh)
Other versions
CN112700524B (en)
Inventor
赵锐
侯志迎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Yuanli Digital Technology Co., Ltd.
Original Assignee
Jiangsu Yuanli Digital Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-03-25
Filing date
2021-03-25
Publication date
2021-04-23
Application filed by Jiangsu Yuanli Digital Technology Co., Ltd.
Priority to CN202110316439.5A
Publication of CN112700524A
Application granted
Publication of CN112700524B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a deep-learning-based method for generating 3D character facial expression animation in real time, comprising the following steps: acquiring training data and performing enhancement processing on it; building a generation model comprising 1 encoder and 3 decoders, where the encoder encodes the picture data of the training data into a hidden space and the 3 decoders decode the hidden-space data into facial action pictures of actors, screen-shot pictures of animation files, and the controller values corresponding to those screen-shot pictures; training the built generation model to obtain the optimal weights of the encoder and decoders, yielding the optimal model; inputting pictures of actors into the trained generation model, where the encoder encodes the pictures into the hidden space and the corresponding decoder decodes the hidden-space data to obtain the corresponding controller values; and inputting the controller values into animation software to generate the facial movements of the model.

Description

3D character facial expression animation real-time generation method based on deep learning
Technical Field
The invention relates to the technical field of animation production, in particular to a 3D character facial expression animation real-time generation method based on deep learning.
Background
At present, approaches on the market drive the facial animation of virtual characters in real time from the facial expressions in videos, mainly using methods based on face key-point detection from computer vision. These methods have the following defects:
1. Poor generalization: if a higher-accuracy driving mode is needed, data must be re-annotated whenever the actor is changed;
2. Without annotated data, only low-precision character models can be driven.
These disadvantages mean such methods cannot meet the requirements of a 3D animated film production pipeline with high precision demands (models with 20,000 to 30,000 control points). Currently, no mature solution on the market can directly generate high-precision 3D character animation from an actor's facial performance.
Disclosure of Invention
The invention aims to provide a deep-learning-based method for generating 3D character facial expression animation in real time, which reduces up-front preparation work, applies to a wide range of scenarios, and generates animation in real time.
The invention provides the following technical scheme:
A 3D character facial expression animation real-time generation method based on deep learning comprises the following steps:
S1, acquiring training data and performing enhancement processing on it, wherein the training data comprises the animation files of the model with their corresponding controller values, facial action pictures of actors, and screen-shot pictures of the animation files with their corresponding controller values;
S2, building a generation model comprising 1 encoder and 3 decoders, wherein the encoder encodes the picture data of the training data into a hidden space, and the 3 decoders decode the hidden-space data into facial action pictures of actors, screen-shot pictures of animation files, and the controller values corresponding to the screen-shot pictures;
S3, training the built generation model to obtain the optimal weights of the encoder and decoders, yielding the optimal model;
S4, inputting pictures of actors into the trained generation model; the encoder encodes the pictures into the hidden space, and the corresponding decoder decodes the hidden-space data to obtain the corresponding controller values;
S5, inputting the controller values into animation software to generate the facial movements of the model.
Preferably, the training data enhancement processing in step S1 randomly changes the brightness of the actors' facial action pictures and the screen-shot pictures of the animation files, and performs data enhancement through rotation, displacement, noise addition, and simulated illumination changes.
Preferably, the facial action pictures of the actor and the screen-shot pictures of the animation file share the same encoder.
Preferably, in the training of the generation model in step S3, the training of the encoder, the decoder that generates the actors' facial action pictures, and the decoder that generates the screen-shot pictures of the animation files is a process in which the output reconstructs the input.
Preferably, the training method of the generative model is as follows:
q1, inputting the actor facial motion picture and the animation file screen picture into an encoder, outputting a corresponding picture through a decoder for generating the actor facial motion picture and a decoder for generating the animation file screen picture, wherein the output picture is the input picture in the process, calculating a loss function value between the input picture and the output picture through a structural similarity index at the moment, and updating the weight of the encoder and the corresponding decoder according to the loss function value;
q2, the third decoder outputs the controller value corresponding to the screen picture, the loss function is obtained by averaging the absolute value of the difference value of each controller value, and the weight of the corresponding decoder is updated according to the loss function value.
Preferably, the animation software of step S5 includes Maya or UE (Unreal Engine).
Preferably, the encoder and decoder employ convolutional neural networks.
The invention has the following beneficial effects: through this modeling approach, the facial actions of the corresponding animation model are generated from actors' facial videos and photos, without requiring paired data linking actor video frames to the animation files of the corresponding characters, which greatly reduces up-front data preparation; actors can be changed at will without any data annotation work; and estimation runs in real time, i.e., the actor's facial video can be captured in real time and converted into the facial movements of the animation model.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
fig. 1 is a schematic block diagram of the present invention.
Detailed Description
As shown in fig. 1, a method for real-time generation of 3D character facial expression animation based on deep learning includes the following steps:
S1, acquiring training data and performing enhancement processing on it, wherein the training data comprises the animation files of the model with their corresponding controller values, facial action pictures of actors, and screen-shot pictures of the animation files with their corresponding controller values;
S2, building a generation model comprising 1 encoder and 3 decoders, wherein the encoder encodes the picture data of the training data into a hidden space, and the 3 decoders decode the hidden-space data into facial action pictures of actors, screen-shot pictures of animation files, and the controller values corresponding to the screen-shot pictures;
S3, training the built generation model to obtain the optimal weights of the encoder and decoders, yielding the optimal model;
S4, inputting pictures of actors into the trained generation model; the encoder encodes the pictures into the hidden space, and the corresponding decoder decodes the hidden-space data to obtain the corresponding controller values;
S5, inputting the controller values into animation software to generate the facial movements of the model.
The first embodiment is as follows:
acquiring training data, comprising:
A. acquiring the animation files of the model and the corresponding controller values, wherein the controller is a set of rig controls that drive the facial actions of the animation model; it can be quantified as a set of values, and each set of values corresponds one-to-one to a facial action of the animation model;
B. acquiring a segment of an actor's facial action video;
C. playing the animation file with the view aligned to the model's face and performing screen-capture operations, obtaining each screen-shot picture and its corresponding controller values;
D. randomly changing the brightness of the actors' facial action pictures and the screen-shot pictures of the animation files, and performing data enhancement through rotation, displacement, noise addition, and simulated illumination changes to improve the robustness of the system; a sketch of this kind of augmentation follows this list.
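As a concrete illustration of step D, here is a minimal augmentation sketch using torchvision. The operation set (brightness/illumination jitter, rotation, displacement, noise addition) follows the text; all parameter ranges are illustrative assumptions, not values from the patent.

    import torch
    from torchvision import transforms

    class AddGaussianNoise:
        """Add zero-mean Gaussian noise to a tensor image (illustrative operation)."""
        def __init__(self, std=0.02):
            self.std = std
        def __call__(self, img):
            return (img + torch.randn_like(img) * self.std).clamp(0.0, 1.0)

    # The four enhancement operations named in step D; all ranges are assumptions.
    augment = transforms.Compose([
        transforms.ColorJitter(brightness=0.3, contrast=0.2),        # brightness / simulated illumination change
        transforms.RandomRotation(degrees=10),                       # rotation
        transforms.RandomAffine(degrees=0, translate=(0.05, 0.05)),  # displacement
        transforms.ToTensor(),
        AddGaussianNoise(std=0.02),                                  # noise addition
    ])

Applied independently each time a sample is drawn, this ensures the network rarely sees exactly the same picture twice, which is what gives the robustness the step describes.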
A generation model is built; it is trained by reconstructing its input through a neural network, and the latent vector of its hidden layer has a dimensionality-reduction effect. The generation model comprises 1 encoder and 3 decoders. The encoder encodes the picture data of the training data into a hidden space that captures the meaning of the input data. The 3 decoders decode the hidden-space data into actors' facial action pictures (hereinafter "decoder A"), screen-shot pictures of the animation files (hereinafter "decoder B"), and the controller values corresponding to the screen-shot pictures (hereinafter "decoder C"). The input data is reconstructed through the hidden space, so the hidden layer of the final trained model yields a representation of the input data that can help with data classification, visualization, and storage. The model is in effect an unsupervised learning scheme: it only needs input data, with no labels or input-output pairs (decoder C excepted, as noted below). Both the encoder and decoders use convolutional neural networks, and the actors' facial action pictures share the same encoder with the screen-shot pictures of the animation files. A minimal sketch of this layout follows.
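The paragraph above fixes the topology (one shared CNN encoder, two CNN image decoders, one controller-value decoder) but not the layer sizes. Below is a minimal PyTorch sketch under assumed sizes: 128x128 RGB inputs, a 256-dimensional hidden space, and a hypothetical controller count of 60; none of these numbers come from the patent.

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        """Shared CNN encoder: maps a 3x128x128 picture into the hidden (latent) space."""
        def __init__(self, latent_dim=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 128 -> 64
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
                nn.Flatten(),
                nn.Linear(128 * 16 * 16, latent_dim),
            )
        def forward(self, x):
            return self.net(x)

    class ImageDecoder(nn.Module):
        """Decoders A and B: reconstruct a picture from the hidden-space code."""
        def __init__(self, latent_dim=256):
            super().__init__()
            self.fc = nn.Linear(latent_dim, 128 * 16 * 16)
            self.net = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 16 -> 32
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 32 -> 64
                nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 64 -> 128
            )
        def forward(self, z):
            return self.net(self.fc(z).view(-1, 128, 16, 16))

    class ControllerDecoder(nn.Module):
        """Decoder C: regress controller values from the hidden-space code."""
        def __init__(self, latent_dim=256, n_controllers=60):  # n_controllers is an assumption
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(latent_dim, 128), nn.ReLU(),
                nn.Linear(128, n_controllers),
            )
        def forward(self, z):
            return self.net(z)

    encoder = Encoder()
    decoder_a = ImageDecoder()      # actors' facial action pictures
    decoder_b = ImageDecoder()      # screen-shot pictures of the animation files
    decoder_c = ControllerDecoder() # controller values

The single shared encoder is the design point that matters: because both picture domains are forced through the same hidden space, an actor picture can later be decoded by decoder C even though decoder C was trained only on screen-shot pictures.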
Training decoders A and B requires no labels; training decoder C requires labels, i.e., paired screen-shot pictures of the animation file and their corresponding controller values.
The built generation model is trained to obtain the optimal weights of the encoder and decoders, yielding the optimal model; the training of the encoder, the decoder that generates actors' facial action pictures, and the decoder that generates screen-shot pictures of the animation files is a process in which the output reconstructs the input.
Specifically, the training method of the generative model is as follows:
q1, inputting the actor's facial motion picture and the animation file's screenshot picture to the encoder, outputting the corresponding picture through the decoder generating the actor's facial motion picture and the decoder generating the animation file's screenshot picture, the output in the process being the input picture, at this moment, calculating the loss function value between the input and output pictures through the structural similarity index, and updating the weights of the encoder and the corresponding decoder according to the loss function;
q2, the third decoder outputs the controller value corresponding to the screen picture, the loss function is obtained by averaging the absolute value of the difference value of each controller value, and the weight of the corresponding decoder is updated according to the loss function value.
The actors' pictures are input into the trained generation model; the encoder encodes the pictures into the hidden space, and the corresponding decoder decodes the hidden-space data to obtain the corresponding controller values. The controller values are input into animation software (e.g., Maya or UE) to generate the facial movements of the model. A minimal inference sketch follows.
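Below is one plausible shape of the inference path, assuming the encoder and decoder_c from the sketches above. Streaming values to Maya over its MEL command port is an assumed integration, not something the patent specifies (Maya must first open the port, e.g. with commandPort -name ":7001" -sourceType "mel"), and the controller attribute names are hypothetical; a real setup would map each predicted value to the rig's actual controller attributes.

    import socket
    import torch

    CONTROLLER_ATTRS = ["face_ctrl.jaw_open", "face_ctrl.brow_up"]  # hypothetical names, truncated list

    @torch.no_grad()
    def drive_frame(actor_pic, maya_host="localhost", maya_port=7001):
        """Encode one actor picture, decode controller values, and send them to Maya."""
        values = decoder_c(encoder(actor_pic.unsqueeze(0)))[0].tolist()
        # One MEL setAttr command per controller value.
        mel = "".join(f'setAttr "{attr}" {val};' for attr, val in zip(CONTROLLER_ATTRS, values))
        with socket.create_connection((maya_host, maya_port)) as s:
            s.sendall(mel.encode("utf-8"))

Called once per captured video frame, this is what makes the driving real-time: a single encoder pass plus a small MLP is cheap enough to run at camera frame rates.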
In summary, through this modeling approach, the facial actions of the corresponding animation model are generated from actors' facial videos and photos, without requiring paired data linking actor video frames to the animation files of the corresponding characters, which greatly reduces up-front data preparation; actors can be changed at will without any data annotation work; and estimation runs in real time, i.e., the actor's facial video can be captured in real time and converted into the facial movements of the animation model.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

CN202110316439.5A | 2021-03-25 (priority) | 2021-03-25 (filing) | 3D character facial expression animation real-time generation method based on deep learning | Active | Granted as CN112700524B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110316439.5A (granted as CN112700524B) | 2021-03-25 | 2021-03-25 | 3D character facial expression animation real-time generation method based on deep learning

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110316439.5A (granted as CN112700524B) | 2021-03-25 | 2021-03-25 | 3D character facial expression animation real-time generation method based on deep learning

Publications (2)

Publication Number | Publication Date
CN112700524A | 2021-04-23
CN112700524B | 2021-07-02

Family

ID=75516776

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110316439.5A (Active; granted as CN112700524B) | 3D character facial expression animation real-time generation method based on deep learning | 2021-03-25 | 2021-03-25

Country Status (1)

Country | Link
CN | CN112700524B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113781616A* | 2021-11-08 | 2021-12-10 | Jiangsu Yuanli Digital Technology Co., Ltd. | Facial animation binding acceleration method based on neural network
CN114898020A* | 2022-05-26 | 2022-08-12 | 唯物(杭州)科技有限公司 | 3D character real-time face driving method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111524226A* | 2020-04-21 | 2020-08-11 | University of Science and Technology of China | Key point detection and 3D reconstruction method for caricature portraits
CN111598979A* | 2020-04-30 | 2020-08-28 | Tencent Technology (Shenzhen) Co., Ltd. | Method, device and equipment for generating facial animation of virtual character and storage medium
CN112200894A* | 2020-12-07 | 2021-01-08 | Jiangsu Yuanli Digital Technology Co., Ltd. | Automatic digital human facial expression animation migration method based on deep learning framework

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111524226A* | 2020-04-21 | 2020-08-11 | University of Science and Technology of China | Key point detection and 3D reconstruction method for caricature portraits
CN111598979A* | 2020-04-30 | 2020-08-28 | Tencent Technology (Shenzhen) Co., Ltd. | Method, device and equipment for generating facial animation of virtual character and storage medium
CN112200894A* | 2020-12-07 | 2021-01-08 | Jiangsu Yuanli Digital Technology Co., Ltd. | Automatic digital human facial expression animation migration method based on deep learning framework

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yue Yang: "Speech-driven face animation synthesis based on a deep learning network model", China Master's Theses Full-text Database, Information Science and Technology series *
Yan Yanfu et al.: "Face animation method based on deep learning and expression AU parameters", Journal of Computer-Aided Design & Computer Graphics *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113781616A* | 2021-11-08 | 2021-12-10 | Jiangsu Yuanli Digital Technology Co., Ltd. | Facial animation binding acceleration method based on neural network
CN114898020A* | 2022-05-26 | 2022-08-12 | 唯物(杭州)科技有限公司 | 3D character real-time face driving method and device, electronic equipment and storage medium
CN114898020B* | 2022-05-26 | 2024-10-18 | 唯物(杭州)科技有限公司 | 3D character real-time face driving method and device, electronic equipment and storage medium

Also Published As

Publication number | Publication date
CN112700524B | 2021-07-02

Similar Documents

Publication | Title
Mihajlovic et al. | LEAP: Learning articulated occupancy of people
CN111417988B | System and method for real-time complex character animation and interactivity
US12243140B2 | Synthesizing sequences of images for movement-based performance
US20230154089A1 | Synthesizing sequences of 3D geometries for movement-based performance
CN112700524B | 3D character facial expression animation real-time generation method based on deep learning
CN111292403B | Method for creating movable cloth doll
CN114862992A | Virtual digital human processing method, model training method and device thereof
CN114049678A | A method and system for facial motion capture based on deep learning
CN117221464A | Audio and video data processing method, device, equipment and storage medium
US20250022202A1 | Pose-aware neural inverse kinematics
CN119888023A | Audio-driven three-dimensional digital person generation method and system based on neural radiance fields
CN118488266A | A method and system for generating high-quality talking face videos based on RDDM
CN114241553B | A method for transferring human facial expressions to virtual character faces
Tu et al. | Acquiring identity and expression information from monocular face image
Liu et al. | Make-A-Character 2: Animatable 3D Character Generation From a Single Image
US20250182367A1 | Apparatus and method for realistic movement of digital human character
CN116091668B | Talking head video generation method based on emotion feature guidance
Jiang et al. | Animating arbitrary topology 3D facial model using the MPEG-4 FaceDefTables
US20250209715A1 | Generalized pose and motion generation
Chen et al. | 3D-GS Talker: 3D Gaussian Based Audio-Driven Real-Time Talking Head Generation
CN120431222A | Voice-driven 3D digital person generation method based on deep learning

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
