Technical Field
The present invention relates to the field of computer technology, and in particular to a control method for a behavior-tree-driven virtual human.
Background
Existing behavior tree algorithms can keep the behavior logic between a virtual human and the user correct, smooth, and natural, and can implement functions such as task allocation, action planning, and decision-making for virtual characters, thereby ensuring that the virtual human's behavior meets user expectations. However, existing behavior tree algorithms still suffer from the following technical problems:
1. Performance may be insufficient in complex scenes: the behavior tree algorithm executes slowly and struggles to meet the demands of large-scale scenes. When processing large-scale scenes, problems such as excessive execution time and high memory usage arise easily, making system performance unstable.
2. During algorithm execution, multiple nodes may modify state or access shared data at the same time, which easily leads to race conditions and deadlocks.
3. Support for parallel processing in multi-threaded environments is limited, and effective multi-thread scheduling and resource management mechanisms are lacking, which harms the scalability and stability of the system.
4. For behavior control, traditional methods usually program a virtual human's behavior with fixed rules or scripts, so the behavior lacks flexibility and extensibility. Previous research has attempted to address these problems; for example, some work on behavior control adopts state machines and finite state machines, but those methods require manually writing and tuning a large number of rules, which is cumbersome.
Summary of the Invention
In view of the above technical problems, this technical solution provides a control method for a behavior-tree-driven virtual human. By constructing behavior nodes and logic nodes, a hierarchical behavior tree structure is formed to describe the behavior of virtual humans, and the behavior tree can control multiple characters and multiple events. A CBT executor and thread pool management are introduced to parallelize behavior tree execution and to handle the deadlock and blocking problems that arise during processing. The CBT executor is designed around the parallelization principle of the behavior tree algorithm and can execute behavior trees efficiently in a multi-threaded environment: it decomposes a behavior tree into multiple subtrees that execute in parallel while handling node state synchronization and data sharing, so the entire behavior tree completes quickly. This effectively solves the problems above.
The present invention is realized through the following technical solution:
A control method for a behavior-tree-driven virtual human: the user's speech is captured through a voice driver, the speech in which the user describes a behavioral process is converted into an event list with timing information, and a behavior tree is constructed by matching virtual character action instructions. The goal is to reproduce the scene described by the user so that the user can watch the described virtual human scene throughout the process. The specific steps include:
Step 1: Construct nodes, including virtual human behavior nodes and logic nodes.
Step 2: Obtain the user's speech input from the driver.
Step 3: Recognize the acquired speech.
Step 4: Process the text information and build the event list.
The text is divided into three elements: subject, object, and behavior. The subject is the performer of the action; the object can be a location or an item. Four lexicons Z, X, W, and S are established to store subject titles, behaviors, location information, and times respectively.
Step 5: Construct the event set from the results obtained in the event list. The formula for the event set is:
Ai = N(z, x, w, s)
where i ∈ [1, n] and n is the total number of events. The Parallel, Mutex, and Semaphore node types are introduced to support concurrent operations, and events are added to the logic nodes of the behavior tree in the order in which they occur. If s in an event cannot be ignored and s = 1, the virtual human's current action occurs at the same time as another virtual human's action, and a Parallel node is used.
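The event-set construction and the Parallel-node rule of Step 5 can be sketched in a few lines. The event fields mirror the (z, x, w, s) tuple from Step 4; the fallback to a Sequence node for s ≠ 1 is an assumption for illustration, since the text leaves the default node type open:

```python
from collections import namedtuple

# One event Ai = N(z, x, w, s): subject, behavior, location/object, time flag
Event = namedtuple("Event", ["z", "x", "w", "s"])

def node_type_for(event):
    """Pick the logic-node type for one event: a Parallel node when the
    time flag s is present and equals 1 (simultaneous actions), otherwise
    a plain Sequence node (an assumed default, not fixed by the text)."""
    if event.s is not None and event.s == 1:
        return "Parallel"
    return "Sequence"

events = [Event("A", "walk", "door", None), Event("B", "grab", "cup", 1)]
print([node_type_for(e) for e in events])  # ['Sequence', 'Parallel']
```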
Step 6: Construct the behavior tree executor CBTEngine. The specific procedure is:
Initialize CBTEngine and create a thread pool. CBTEngine creates a fixed number of threads in advance and creates one runtime instance per behavior tree. It scans the task queues of all behavior trees; if a queue contains tasks, it assigns them to idle threads for execution. During execution, CBTEngine checks for other behavior tree tasks that may run concurrently; when necessary, Mutex and Semaphore nodes ensure that mutual-exclusion and synchronization operations execute correctly. CBTEngine also has an exception-recovery mechanism that handles errors during behavior tree task execution and notifies the relevant operations in time.
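A minimal sketch of this executor loop, built on Python's standard thread pool. The internal structure (one queue per tree, a scan-and-dispatch pass) is one plausible reading of the text, not a fixed implementation:

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue, Empty

class CBTEngine:
    """Sketch: a pre-created thread pool plus one task queue per behavior tree."""
    def __init__(self, num_threads=4):
        self.pool = ThreadPoolExecutor(max_workers=num_threads)
        self.queues = {}  # one runtime task queue per registered behavior tree

    def register_tree(self, tree_id):
        self.queues[tree_id] = Queue()

    def submit(self, tree_id, task):
        self.queues[tree_id].put(task)

    def run_once(self):
        """Scan every tree's queue and hand waiting tasks to idle threads."""
        futures = []
        for q in self.queues.values():
            try:
                while True:
                    futures.append(self.pool.submit(q.get_nowait()))
            except Empty:
                pass  # this tree's queue is drained; move to the next
        return [f.result() for f in futures]  # result() also surfaces exceptions

engine = CBTEngine(num_threads=2)
engine.register_tree("tree1")
engine.submit("tree1", lambda: "walk done")
print(engine.run_once())  # ['walk done']
```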
Step 7: Complete the control of the virtual human through the behavior tree executor.
Further, the behavior nodes described in Step 1 include behaviors such as turning, grabbing, running, walking, and lifting. The virtual human's logic nodes control whether behavior nodes are activated; the behavior nodes and logic nodes are layered into a tree structure.
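The layered node structure of Step 1 can be sketched as a minimal class hierarchy. The class names, the Status enum, and the Sequence logic node are illustrative choices, not part of the claimed method:

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class BehaviorNode:
    """Leaf node: one concrete virtual-human action (turn, grab, run, ...)."""
    def __init__(self, name, action):
        self.name = name
        self.action = action  # callable that performs the action

    def tick(self):
        return self.action()

class SequenceNode:
    """Logic node: activates children in order; stops at the first non-success."""
    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS

# Layering behavior nodes under a logic node yields the tree structure:
walk = BehaviorNode("walk", lambda: Status.SUCCESS)
grab = BehaviorNode("grab", lambda: Status.SUCCESS)
root = SequenceNode([walk, grab])
print(root.tick())  # Status.SUCCESS when every child action succeeds
```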
Further, the specific procedure of Step 3 is:
Step 3.1: Perform signal processing and feature extraction on the acquired speech.
Step 3.2: Select an acoustic model and train it.
Step 3.3: After the acoustic model produces a pronunciation sequence, the language model finds the character sequence with the highest probability among the candidate text sequences, and the corresponding Chinese words or terms are found via the dictionary.
Step 3.4: Use the decoder to find the word sequence. In the search space formed by knowledge sources such as the acoustic model, dictionary, and language model, a search algorithm finds the word sequence with the highest probability, and the text is finally output.
Further, the signal processing and feature extraction in Step 3.1 operate as follows: the Whisper model is used; the captured speech is preprocessed, the input audio is split into 30-second chunks, converted into Mel spectrograms for feature extraction, and passed to the encoder.
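The 30-second chunking in Step 3.1 can be illustrated with a small helper. The 16 kHz sample rate matches Whisper's expected input; the Mel-spectrogram conversion itself is omitted here, and zero-padding the trailing block is an assumption for the sketch:

```python
def chunk_audio(samples, sample_rate=16000, chunk_seconds=30):
    """Split raw audio samples into fixed 30-second blocks, zero-padding the
    final block, as Whisper's preprocessing does before the Mel-spectrogram
    step."""
    chunk_len = sample_rate * chunk_seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        block = samples[start:start + chunk_len]
        if len(block) < chunk_len:  # pad the trailing partial block with silence
            block = block + [0.0] * (chunk_len - len(block))
        chunks.append(block)
    return chunks

# 45 seconds of silence at 16 kHz -> two 30-second blocks of 480000 samples
blocks = chunk_audio([0.0] * (16000 * 45))
print(len(blocks), len(blocks[0]))  # 2 480000
```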
Further, the specific procedure of Step 3.2 is: the encoder's output is normalized with a normalization layer; the decoder is trained to predict the corresponding text and is mixed with special tokens that guide the model to produce text.
Further, the specific procedure of Step 4 is:
Step 4.1: Traverse the whole sentence, search for subject titles in the sentence, match them against the subject title lexicon Z, and mark the matched result as z.
Step 4.2: Search for behaviors in the same way as Step 4.1, match them against the behavior lexicon X, and mark the matched result as x.
Step 4.3: Search for the object's location information. When retrieving a non-subject title, the location information is dynamic and must be updated on every call; if it is static location information, match it against the location lexicon W and mark the matched location attribute as w.
Step 4.4: Search for time, and determine whether the sentence contains words from the time lexicon. If there is a match, match it against the information in the time lexicon S and mark the result as s; otherwise do nothing.
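Steps 4.1-4.4 amount to dictionary lookups over the sentence tokens. A toy sketch follows; the lexicon entries here are invented for illustration, since the real contents of Z, X, W, and S are not given in the text:

```python
# Toy lexicons; the real Z, X, W, S entries are not specified in the text.
Z = {"teacher", "student"}       # subject titles
X = {"walk", "grab", "lift"}     # behaviors
W = {"door", "desk"}             # static locations
S = {"simultaneously": 1}        # time words mapped to the s flag

def extract_event(tokens):
    """Return the (z, x, w, s) tuple for one sentence, per Steps 4.1-4.4;
    unmatched elements stay None."""
    z = next((t for t in tokens if t in Z), None)
    x = next((t for t in tokens if t in X), None)
    w = next((t for t in tokens if t in W), None)
    s = next((S[t] for t in tokens if t in S), None)
    return (z, x, w, s)

print(extract_event("the teacher walk to the door simultaneously".split()))
# ('teacher', 'walk', 'door', 1)
```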
Beneficial Effects
Compared with the prior art, the control method for a behavior-tree-driven virtual human proposed by the present invention has the following beneficial effects:
(1) Through speech recognition technology, this technical solution captures the user's speech via a voice driver and converts the speech in which the user describes a behavioral process into an event list with timing information. The user can describe the desired interaction scene naturally without manually writing complex control instructions, which lowers the user's learning cost and operational difficulty.
(2) By constructing different node types such as sequence nodes, selector nodes, and parallel nodes, and by constructing behavior nodes and logic nodes, this technical solution forms a hierarchical behavior tree structure that describes the virtual human's behavior and defines its behavior rules and logic. These nodes can be organized according to task priorities and dependencies to implement complex behavioral strategies.
(3) This technical solution introduces parallel computing to execute behavior trees efficiently, solving the slow execution of the behavior tree algorithm. The behavior tree controls multiple characters and multiple events, and the CBT executor parallelizes behavior tree execution and handles node state synchronization. The CBT executor is designed around the parallelization principle of the behavior tree algorithm and can execute behavior trees efficiently in a multi-threaded environment: it decomposes a behavior tree into multiple subtrees that execute in parallel while handling node state synchronization and data sharing, so the entire behavior tree completes quickly and reliability and stability are ensured. Thread pool management is also introduced to manage concurrent threads effectively, solving the race conditions and deadlocks that arise when multiple nodes modify state or access shared data at the same time.
(4) This technical solution builds a behavior tree by matching virtual character action instructions in order to reproduce the scene described by the user, so that the user can watch the described virtual human scene throughout the process, improving the user's sense of experience and immersion. A voice driver captures the user's speech, and speech recognition algorithms, behavior tree algorithms, and related technologies then reproduce the virtual scene.
(5) This technical solution generates the behavior tree by combining the constructed nodes and the event list into one complete tree. The behavior tree guides the virtual human to perform the corresponding behaviors in the interactive environment and is dynamically adjusted and updated according to external events and conditions. It can also be extended and improved with different technologies and algorithms to raise its performance; for example, by introducing a state management mechanism, the behavior tree executor can track and manage the virtual human's state to better control and adjust behavior.
Brief Description of the Drawings
Figure 1 is the overall flow chart of the present invention.
Figure 2 is the overall flow chart of the Whisper-based speech recognition method in Step 3.
Figure 3 shows the construction of the behavior tree in Step 5.
Figure 4 shows the execution architecture of the behavior tree in Step 6.
Figure 5 is a schematic diagram of behavior tree execution in Step 6.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. Without departing from the design concept of the present invention, all modifications and improvements made to the technical solution by those of ordinary skill in the art shall fall within the protection scope of the present invention.
Embodiment 1:
As shown in Figure 1, a control method for a behavior-tree-driven virtual human captures the user's speech through a voice driver, converts the speech in which the user describes a behavioral process into an event list with timing information, and constructs a behavior tree by matching virtual character action instructions, so as to reproduce the scene described by the user and let the user watch the described virtual human scene throughout the process. The method includes the following steps:
Step 1: Construct nodes, including virtual human behavior nodes and logic nodes. Behavior nodes include turning, grabbing, running, walking, lifting, and so on; logic nodes control whether behavior nodes are activated. The behavior nodes and logic nodes are layered into a tree structure.
Step 2: Obtain speech from the microphone driver.
Step 3: Recognize the acquired speech. The specific procedure is:
Step 3.1: Perform signal processing and feature extraction on the acquired speech.
The Whisper model is used: the captured speech is preprocessed, the input audio is split into 30-second chunks, converted into Mel spectrograms for feature extraction, and passed to the encoder.
Step 3.2: Select an acoustic model and train it.
The encoder's output is normalized with a normalization layer; the decoder is trained to predict the corresponding text and is mixed with special tokens that guide the model to produce text.
Step 3.3: After the acoustic model produces a pronunciation sequence, the language model finds the character sequence with the highest probability among the candidate text sequences, and the corresponding Chinese words or terms are found via the dictionary.
Step 3.4: Use the decoder to find the word sequence. In the search space formed by knowledge sources such as the acoustic model, dictionary, and language model, a search algorithm finds the word sequence with the highest probability, and the text is finally output.
Step 4: The text information serves as interaction information between virtual humans and is divided into subject, object, and behavior. The subject is the performer of the behavior; the object can be a location or an item. Four lexicons are established, denoted Z, X, W, and S: Z is the subject title lexicon, X is the behavior lexicon, W is the location lexicon, and S is the time lexicon for virtual human events.
Step 4.1: Traverse the whole sentence, search for subject titles in the sentence, match them against the subject title lexicon Z, and mark the matched result as z.
Step 4.2: Search for behaviors in the same way as Step 4.1, match them against the behavior lexicon X, and mark the matched result as x.
Step 4.3: Search for the object's location information. When retrieving a non-subject title, the location information is dynamic and must be updated on every call. If it is static location information, match it against the location lexicon W and mark the matched location attribute as w.
Step 4.4: Search for time, and determine whether the sentence contains words from the time lexicon. If there is a match, match it against the information in the time lexicon S and mark the result as s; otherwise do nothing.
Step 4.5: Traverse the whole sentence, search for the subject, behavior, location information, and time, and match them against the lexicons. If they match, mark them as z, x, w, and s respectively; otherwise no processing is needed.
Step 5: Generate the behavior tree from the event sequence, introducing the new node types Parallel, Mutex, and Semaphore to support concurrent operations.
Step 5.1: Each sentence corresponds to one event. Based on the results of Step 4, construct the event Ai = N(z, x, w, s), forming an event set Ai (where i ∈ [1, n] and n is the total number of events), and set the node types.
Step 5.1.1: Construct the Parallel node. First define the Parallel node's input parameters, such as the list of child nodes or the selection policy. Inside the Parallel node, create a counter that tracks the completion status of the child nodes. Each child node reports to the counter when its execution completes and checks the status of the other children. According to the configured policy, determine whether all children have completed: if so, return success; otherwise, keep waiting.
Step 5.1.2: Construct the Mutex node. First define the Mutex node's input parameters, such as the name or identifier of the protected resource. Inside the Mutex node, a mutual-exclusion lock ensures that only one virtual human can acquire the lock. When a virtual human tries to acquire the lock and it is already held, that virtual human must wait until the lock is released. Only the virtual human that successfully acquires the lock can perform behaviors related to that resource.
Step 5.1.3: Construct the Semaphore node. First define the Semaphore node's input parameters, such as the initial counter value and the maximum allowed concurrency. Inside the Semaphore node, a counter tracks how many virtual humans currently hold the semaphore. When a virtual human tries to acquire the semaphore and the count has reached the maximum concurrency, it must wait until another virtual human releases the semaphore. After acquiring the semaphore, a virtual human can perform behaviors related to the resource and releases the semaphore when finished.
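The Mutex and Semaphore nodes of Steps 5.1.2-5.1.3 map directly onto standard synchronization primitives. A sketch follows; the class and method names are illustrative:

```python
import threading

class MutexNode:
    """Guards a named resource so only one virtual human acts on it at a time
    (Step 5.1.2)."""
    def __init__(self, resource_name):
        self.resource_name = resource_name
        self._lock = threading.Lock()

    def run(self, action):
        with self._lock:  # blocks while another virtual human holds the lock
            return action()

class SemaphoreNode:
    """Allows at most max_concurrent virtual humans to use the resource
    (Step 5.1.3)."""
    def __init__(self, resource_name, max_concurrent):
        self.resource_name = resource_name
        self._sem = threading.Semaphore(max_concurrent)

    def run(self, action):
        with self._sem:  # blocks once max_concurrent holders exist
            return action()

door = MutexNode("door")
print(door.run(lambda: "opened"))  # 'opened'
```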
Step 5.2: Add the events to the logic nodes of the behavior tree in the order in which they occur. If s in an event cannot be ignored and s = 1, the virtual human's current action occurs at the same time as another virtual human's action, so a Parallel node is used.
Step 6: Construct the behavior tree executor (CBTEngine), including an initialization module and a thread pool management method, to achieve concurrent execution of behavior tree tasks.
Step 6.1: Initialize it and create the thread pool. At startup, CBTEngine pre-creates a fixed number of threads, creates the thread pool, and creates one runtime instance per behavior tree.
Step 6.2: Scan the task queues of all behavior trees. If a queue contains tasks, take them out, execute them, and return the results to the behavior tree.
Step 6.2.1: In the task queue of Step 6.2, if nodes in the behavior tree need to execute in parallel, the tasks are submitted to the thread pool, which automatically selects an idle thread to process each task.
Step 6.2.2: In the task queue of Step 6.2, if nodes in the behavior tree need synchronization or coordination, message passing is used to achieve it.
Step 6.2.3: In the task queue of Step 6.2, if a task fails at runtime, the exception information is returned to the behavior tree and an attempt is made to restore the node's state.
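Steps 6.2.1 and 6.2.3 — dispatching tasks to idle threads and recovering node state on error — can be sketched together. The recovery strategy here (snapshot the node state, restore it, and return the exception to the tree) is one plausible reading of the text, not the claimed implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def run_task(task, node_state):
    """Execute one behavior-tree task; on error, return the exception to the
    tree and restore the node's state (Step 6.2.3)."""
    saved = dict(node_state)          # snapshot before running
    try:
        return {"ok": task(node_state)}
    except Exception as exc:
        node_state.clear()
        node_state.update(saved)      # attempt to restore the node's state
        return {"error": repr(exc)}

pool = ThreadPoolExecutor(max_workers=2)  # idle threads pick up parallel tasks
state = {"status": "idle"}

def faulty(s):
    s["status"] = "running"
    raise RuntimeError("actuator fault")

result = pool.submit(run_task, faulty, state).result()
print(result, state)  # the error is reported and the node state is restored
```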
Step 7: Control the virtual human.
Step 7.1: Complete the control of the virtual human through the behavior tree executor.
Step 7.2: Execute the virtual human behavior nodes defined in the behavior tree to drive the virtual human to perform the corresponding actions in the scene.
Step 7.3: Achieve flow control and interactive response of the virtual human's behavior by scheduling and executing the logic nodes in the behavior tree.
This embodiment captures the user's speech through a voice driver, converts the speech in which the user describes a behavioral process into an event list with timing information, and constructs a behavior tree by matching virtual character action instructions. The goal is to reproduce the scene described by the user so that the user can watch the described virtual human scene throughout the process, improving the user's sense of experience and immersion. The key to this method is using the voice driver to capture the user's speech and then using speech recognition, behavior tree algorithms, and related technologies to reproduce the virtual scene.
Using speech recognition technology and the behavior tree algorithm, the behavioral process described by the user is automatically converted into a chronologically ordered event list, and a behavior tree is constructed by matching virtual character action instructions to reproduce the described scene. Speech recognition is the thinking tool of the whole system, converting the user's spoken description into a processable event sequence; the behavior tree algorithm is the key factor that keeps the behavior logic between the virtual human and the user correct, smooth, and natural. Users can thus describe the desired interaction scene in natural speech without manually writing complex control instructions, which lowers the learning cost and operational difficulty. This technology has broad application prospects in fields such as smart education and human-computer interaction.
The above embodiments only illustrate the technical concept and features of the present invention; their purpose is to enable those familiar with this technology to understand and implement the invention, and they do not limit its protection scope. All equivalent transformations or modifications based on the essence of the present invention shall be covered by its protection scope.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310940537.5A (granted as CN117032957B) | 2023-07-28 | 2023-07-28 | A control method of virtual human driven by behavior tree |
| Publication Number | Publication Date |
|---|---|
| CN117032957A | 2023-11-10 |
| CN117032957B | 2024-12-06 |