CN120147484A

Movatterモバイル変換

Info

Publication number: CN120147484A
Application number: CN202510622315.8A
Authority: CN
Inventors: 黄姗姗; 黄民主; 王小刚
Original assignee: Xi'an Hongyuan Video Equipment Co ltd
Current assignee: Xi'an Hongyuan Video Equipment Co ltd
Priority date: 2025-05-15
Filing date: 2025-05-15
Publication date: 2025-06-13

Abstract

The application discloses a cartoon action real-time generation method and device, and relates to the technical field of animation production, wherein the method comprises the steps of collecting action data through an action capturing engine, processing the action data according to requirements of different fields to form scene cartoon action templates applicable to various different fields, forming a model library by all scene cartoon action templates, selecting corresponding scene cartoon action templates in the model library according to the field selected by a user, obtaining real-time data input by the user, converting the real-time data into corresponding action parameters, applying the action parameters to the scene cartoon action templates, and rendering the scene cartoon action templates in sequence to generate cartoon actions. By adopting the method, the cartoon action with high expressive force can be generated in real time, the action details can be automatically optimized according to scene requirements, and the animation production efficiency and the user experience are obviously improved.

Description

Cartoon action real-time generation method and device

Technical Field

The application relates to the technical field of animation production, in particular to a cartoon action real-time generation method and device.

Background

The existing real-time cartoon animation generation technology (such as a Unity engine, a motion capture device and the like) is widely applied to game, movie and virtual character interaction, and basic motion design can be realized through a preset animation library or manual adjustment. However, there is still a significant technical gap in dynamically generating a consistent and logical cartoon action sequence based on user real-time input (e.g., voice command, gesture capture, emotion recognition, etc.). The prior art is difficult to automatically call or generate an adaptive motion model (such as walking, jumping, expression change and the like) based on real-time input data, and intelligent fusion of motion and a scene is realized, so that the animation generation efficiency is low and the flexibility is lacking. These drawbacks are manifested in several ways:

Real-time data analysis and action generated fracturing. The prior art cannot analyze multimodal data (such as voice, gestures and biological signals) input by a user in real time, and dynamically generate a high-precision animation sequence according to the multimodal data, and still relies on manual intervention or preprogrammed scripts.

The cross-domain scene adaptation capability is insufficient. The existing system is difficult to integrate the requirements of different scenes such as games, videos, education and the like, and can not automatically call an adaptive action model or scene component according to a specific application scene.

The operation complexity is high. Mainstream technologies (such as unity+motion capture+real-time rendering) still rely on cumbersome manual modeling and data mapping processes, and have high requirements on technical capabilities of users, so that popularization and application range of the technologies are limited.

Disclosure of Invention

The embodiment of the application provides a cartoon action real-time generation method and device, which are used for solving the problems of insufficient utilization of real-time multi-mode data, insufficient cross-field scene adaptation capability and high use requirement in the cartoon action real-time generation technology in the prior art.

In one aspect, an embodiment of the present application provides a method for generating cartoon actions in real time, including:

Collecting motion data by a motion capture engine;

processing the motion data according to the requirements of different fields to form scene cartoon motion templates applicable to various different fields, wherein all scene cartoon motion templates form a model library;

selecting a corresponding scenerized cartoon action template in the model library according to the field selected by the user;

acquiring real-time data input by a user, and converting the real-time data into corresponding action parameters;

Applying the action parameters in the scene cartoon action template;

rendering the plurality of scenerised cartoon action templates in sequence to generate cartoon actions.

In one possible implementation, the real-time data is voice data, and after the voice data is converted into corresponding text data, the action parameters are obtained by extracting keywords in the text data.

In one possible implementation, the real-time data is a three-dimensional motion, the three-dimensional motion includes position and movement data of a plurality of key points, and the position and movement data of each key point in the three-dimensional motion are extracted to obtain motion parameters.

In one possible implementation, the three-dimensional action is an action input by the user through the three-dimensional sensor and corresponding to a position in the cartoon action to be generated.

In one possible implementation, the three-dimensional action is an action entered by a user through a three-dimensional sensor for status adjustment of a specific location in the scenerized cartoon action template.

In one possible implementation, a UE (illusive engine) real-time rendering technique is employed to render a plurality of scenerised cartoon action templates.

On the other hand, the embodiment of the application also provides a cartoon action real-time generation device, which comprises:

the motion capture engine is used for collecting motion data;

The model library construction module is used for processing the action data according to the requirements of different fields to form scene cartoon action templates applicable to various different fields, and all scene cartoon action templates form a model library;

The template selection module is used for selecting a corresponding scene cartoon action template in the model library according to the field selected by the user;

the parameter acquisition module is used for acquiring real-time data input by a user and converting the real-time data into corresponding action parameters;

The action application module is used for applying the action parameters in the scene cartoon action template;

And the action generating module is used for rendering the plurality of scene cartoon action templates in sequence to generate cartoon actions.

The method and the device for generating the cartoon action in real time have the following advantages:

By the method, the cartoon action with high expressive force can be generated in real time, action details (such as physical collision and light shadow adaptation) can be automatically optimized according to scene requirements, and the animation production efficiency and user experience are remarkably improved. After combining with the AI (artificial intelligence) technology, the method can be expanded to the emerging fields of virtual even images, meta-universe interaction and the like, and the common Hui Hua and the intellectualization of the animation generation technology are promoted.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a cartoon action real-time generation method provided by an embodiment of the application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

Fig. 1 is a flowchart of a cartoon action real-time generation method provided by an embodiment of the application. The embodiment of the application provides a cartoon action real-time generation method, which comprises the following steps:

S100, collecting motion data through a motion capture engine.

For example, a user may build three-dimensional cartoon models in advance through a three-dimensional modeling manner, the cartoon models may simulate human, animal, plant and other images, and after the three-dimensional cartoon models are built, the user needs to set action matching points at each key position on the three-dimensional cartoon models, the action matching points are usually movable positions of the three-dimensional cartoon models, for example, the action matching points in the three-dimensional cartoon models built by the simulated human can be set at the head, the limbs, the body and the face.

The user then wears a plurality of three-dimensional sensors, such as tri-axial gyroscopes, etc., on the body, which can monitor position and movement data of specific parts of the user's body, such as the limbs, trunk, and even the face, etc., which represent the status of a plurality of key points on the user's body, i.e., the positions where the three-dimensional sensors are located. After the key points and the action matching points are in one-to-one correspondence and bound, a user can make various actions in a three-dimensional space, such as walking, running, jumping, different expressions and the like, after the three-dimensional sensor collects data, the position and movement data of each key point are extracted and applied to the corresponding action matching points of the three-dimensional cartoon model, so that each action matching point in the three-dimensional cartoon model changes state according to the position and movement data of the key point, and basic action data can be formed.

S110, processing the motion data according to the requirements of different fields to form scene cartoon motion templates applicable to various fields, wherein all scene cartoon motion templates form a model library.

For example, in different fields of games, videos, education and the like, the requirements on the three-dimensional cartoon model are different, so that corresponding scenerized cartoon action templates are required to be generated according to the requirements of different fields in order to meet the requirements of different fields.

In the processing process, the process of generating action templates in different fields through the basic action data is as follows:

1. And (5) preprocessing data.

1.1 Denoising the original motion data collected by the motion capture engine, such as filtering to eliminate sensor shake noise.

1.2 Standardizing a coordinate system and a time axis of the motion data, and ensuring compatibility of data from different sources.

1.3 Dividing continuous motion data into independent motion units, such as walking, waving and jumping.

2. And extracting characteristic parameters in the action unit.

2.1 Spatial characteristics, namely joint angle, motion trail and key point displacement.

2.2 Time characteristics, action duration, speed curve and acceleration peak value.

2.3 Physical characteristics, collision volume, center of gravity offset.

3. And (5) adjusting characteristic parameters.

3.1 Storing the feature parameters as structured data, such as JSON (JavaScript object notation) or in matrix form.

And 3.2, smoothing the action curve by adopting an interpolation algorithm, and optimizing micro-expression transition.

3.3 Adding shadow matching parameters such as real-time interaction of actions with scene light sources.

4. Binding the adjusted action parameters to a preset three-dimensional model skeleton system to generate a standardized template.

4.1 Scene element fusion and template encapsulation.

And 4.2, dynamically associating the action template with scene components, such as game special effects, teaching icons and video backgrounds according to the field scene requirements.

4.3 The package template is a callable module containing the following metadata:

Motion parameter ranges (e.g., speed threshold, angle limit);

compatibility identification (e.g., support UE engine rendering, unity plugin interface).

5.A model action is generated.

6. Template verification and iterative optimization

6.1 Simulating the performance of the environmental test templates in the target field.

And 6.2, collecting user feedback, and optimizing and adjusting a rule base by utilizing regression analysis.

And 6.3, establishing an automatic updating mechanism, and synchronizing the optimized template to a model library.

Through the flow, the basic action data can be efficiently converted into the multi-field adaptive scene template, so that 'one-time capturing and multi-scene multiplexing' is realized, and the core goal of 'cross-field intelligent adaptation' is met.

S120, selecting a corresponding scene cartoon action template in the model library according to the field selected by the user.

Illustratively, after creating the model library, the user may be engaged in the work of a particular development task. After the development task is determined, the domain to which the cartoon action belongs, i.e., the domain to which the cartoon action belongs, so that the user can select a specific domain, after which the computer will select a matching scenerised cartoon action template according to the selected domain. It should be appreciated that in the process of building the model library, only basic actions are input by the user through the action capturing engine, but in the actual development task, the cartoon actions to be generated may be complex, so that the actions of the scene cartoon action template need to be further adjusted by collecting the data input by the user so as to adapt to the requirements of the development task.

S130, acquiring real-time data input by a user, and converting the real-time data into corresponding action parameters.

Illustratively, real-time data may be classified into voice data, which may be input by a user through a microphone, and three-dimensional actions, which may be input by a three-dimensional sensor, such as a tri-axis gyroscope, etc., worn on the body of the user, according to the user's access API (application programming interface).

When the microphone is used for collecting voice data, voice recognition technology can be used for converting the voice data into corresponding text data, and then the action parameters are obtained by extracting keywords in the text data. These keywords typically contain the names of objects, such as feet, hands, etc., and the names of specific parts of the scenic cartoon action template that the user needs to control, and actions, such as rotation, up-shift, etc., that are instructions to adjust the state of these specific parts.

When a three-dimensional sensor is adopted to collect three-dimensional actions, the three-dimensional actions comprise the position and movement data of a plurality of key points, and the position and movement data of each key point in the three-dimensional actions are extracted to obtain action parameters. The user can select one of two control modes according to the requirement and adjust the scene cartoon action template by utilizing the three-dimensional action.

The first of these is similar to the method of collecting motion data by a motion capture engine. In the process, the part of the user wearing the three-dimensional sensor is the same as the part in the scenerized cartoon action model, for example, the part is a hand, the action corresponding to the part in the cartoon action to be generated, which is input by the user through the three-dimensional sensor, can be used as a three-dimensional action, and the position and movement data in the three-dimensional action are extracted, so that the action parameters can be obtained.

The second method is to input motion parameters through three-dimensional sensors worn by a user at specific positions, usually hands, and adjust the state of the scene cartoon motion template by using the motion parameters, wherein the position and movement data in the three-dimensional motion input by the user through equipment such as gloves integrated with a plurality of three-dimensional sensors are the motion parameters. Although the first mode can simulate the actions of the body of the user to adjust the scene cartoon action template, the cartoon actions and the real human actions are not completely the same, and many cartoon actions are not made by the human body, at this time, the user is required to wear gloves, and the state adjustment is performed on a specific part of the three-dimensional cartoon model in the scene cartoon action template in the form of AR (augmented reality) or VR (virtual reality), so that the part can make actions which cannot be realized by the human body.

And S140, applying the action parameters to the scene cartoon action template.

When voice data is adopted, a position corresponding to an object in a keyword in a scene cartoon action template is selected, and the position of the position is adjusted according to an action instruction, so that the effect of generating cartoon actions according to the voice data can be achieved.

When the first mode of three-dimensional actions is adopted, although the scene cartoon action templates have some basic actions, the basic actions either do not meet the requirements of actual development tasks, or can not meet all the requirements due to the fact that the quantity and the types are relatively small, so that part of actions in the scene cartoon action templates can be adjusted in a mode of inputting the three-dimensional actions by a user to meet the requirements of the actual development tasks, or actions which are not in the scene cartoon action templates, such as actions before and after the scene cartoon action templates start are connected, are regenerated, and finally generated scene cartoon action templates can be spliced to form continuous cartoon actions.

When the second mode of the three-dimensional actions is adopted, a scene cartoon action template and the three-dimensional actions are required to be displayed on a user interface, so that a user can know the relative position between the scene cartoon action template and the three-dimensional actions, and the actions are adjusted based on the relative position, so that a certain part of the scene cartoon action template is adjusted according to the three-dimensional actions of the user. For example, after the user wears a glove with 10 three-dimensional sensors in total with both hands, the 10 points are displayed on the user interface, and the positions and distances of the 10 points correspond to the actual actions of the user. After the user rotates or moves the specific part in the scene cartoon action template by adjusting the positions of the 10 points, the purpose of adjusting the state is achieved.

And S150, rendering the plurality of scene cartoon action templates in sequence to generate cartoon actions.

By way of example, after generating a plurality of scenic cartoon action templates according to a development task, the scenic cartoon action templates are usually continuous in time, such as walking, running, jumping and other actions which occur in sequence in the same scene, and after the action parameters input by a user are applied to the scenic cartoon action templates, the scenic cartoon action templates of two actions adjacent to each other in front and back are also coherent, so that the scenic cartoon action templates can be spliced in sequence to form preliminary cartoon actions, and then rendered to obtain final cartoon actions.

Specifically, a UE real-time rendering technique may be employed to render the plurality of scenerized cartoon action templates.

The embodiment of the application also provides a device for the cartoon action real-time generation method, which comprises the following modules:

the motion capture engine is used for collecting motion data;

While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. The cartoon action real-time generation method is characterized by comprising the following steps of:

Collecting motion data by a motion capture engine;

Processing the action data according to the requirements of different fields to form scene cartoon action templates applicable to various different fields, wherein all scene cartoon action templates form a model library;

selecting a corresponding scene cartoon action template in the model library according to the field selected by the user;

Applying the action parameters to the scene cartoon action template;

rendering a plurality of the scene cartoon action templates in sequence to generate cartoon actions.

2. The method for generating cartoon actions in real time according to claim 1, wherein said real-time data is voice data, said voice data is converted into corresponding text data, and said action parameters are obtained by extracting keywords from said text data.

3. The method for generating the cartoon action in real time according to claim 1, wherein the real-time data is a three-dimensional action, the three-dimensional action comprises position and movement data of a plurality of key points, and the position and movement data of each key point in the three-dimensional action are extracted to obtain the action parameters.

4. A method for generating a cartoon action in real time according to claim 3, wherein said three-dimensional action is an action corresponding to a position in the cartoon action to be generated, which is inputted by a user through a three-dimensional sensor.

5. A method for generating cartoon actions in real time according to claim 3, wherein said three-dimensional actions are actions of status adjustment of specific parts in said scenerized cartoon action template inputted by user through three-dimensional sensor.

6. The method for generating cartoon actions in real time according to claim 1, wherein a plurality of said scenerized cartoon action templates are rendered by using a UE real-time rendering technique.

7. Apparatus for applying the cartoon action real-time generation method of any one of claims 1 to 6, comprising:

the motion capture engine is used for collecting motion data;

the model library construction module is used for processing the action data according to the requirements of different fields to form scene cartoon action templates applicable to various different fields, and all the scene cartoon action templates form a model library;

the template selection module is used for selecting the corresponding scene cartoon action template in the model library according to the field selected by the user;

The action application module is used for applying the action parameters to the scene cartoon action template;

And the action generating module is used for rendering the plurality of the scene cartoon action templates in sequence to generate cartoon actions.