CN112233681A


Info

Publication number: CN112233681A
Application number: CN202011076956.1A
Authority: CN
Inventors: 彭经伟; 左声勇
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Apollo Intelligent Connectivity Beijing Technology Co Ltd
Priority date: 2020-10-10
Filing date: 2020-10-10
Publication date: 2021-01-15

Abstract

Translated from Chinese

The present application discloses a method, apparatus, electronic device and storage medium for determining false wake-up corpus, and relates to the field of speech recognition. The specific implementation is: collecting interference audio data through at least one audio collector, where the audio collection area in which the audio collector is located includes a wake-up engine associated with that audio collector; inputting the interference audio data collected by the audio collector into the associated wake-up engine; and, when the wake-up engine is successfully woken up by the input interference audio data, taking that interference audio data as false wake-up corpus for the wake-up engine. In the embodiments of the present application, the collected interference audio is transmitted directly to the corresponding wake-up engine, and whenever the wake-up engine is woken up the interference audio data is taken as false wake-up corpus. False wake-up corpus is thereby confirmed and collected automatically, without manually picking it out of local recordings, which improves the efficiency of confirming and collecting false wake-up corpus.

Description


A method, apparatus, electronic device and storage medium for determining false wake-up corpus

Technical Field

本申请涉及人工智能技术领域，尤其语音识别领域，特别涉及一种误唤醒语料确定方法、装置、电子设备和存储介质。The present application relates to the technical field of artificial intelligence, in particular to the field of speech recognition, and in particular to a method, device, electronic device and storage medium for determining false wake-up corpus.

Background

语音助手是通过智能对话与即时问答的智能交互，帮助用户解决问题的应用，随着人工智能技术的发展，语音助手已被广泛用于汽车上。Voice assistant is an application that helps users solve problems through intelligent interaction of intelligent dialogue and instant question and answer. With the development of artificial intelligence technology, voice assistant has been widely used in automobiles.

In-vehicle voice assistants are often woken up by mistake, so the corpus that falsely wakes the voice assistant needs to be collected. At present, when collecting corpus that falsely wakes the voice assistant of an in-vehicle head unit (the "car machine"), one of two approaches is typically used: an external tool (an artificial mouth or a loudspeaker) plays audio at random and the head unit captures it through a microphone, or the head unit turns on its microphone in different environments (for example while driving, in a quiet environment, or with several people talking). In both cases the captured audio data is saved locally, and the audio data (that is, the corpus) that triggered the false wake-up is then identified from the local recordings by manual selection.

Summary of the Invention

本申请实施例提供了一种误唤醒语料确定方法、装置、设备和存储介质。Embodiments of the present application provide a method, apparatus, device, and storage medium for determining false wake-up corpus.

根据第一方面，提供了一种误唤醒语料确定方法，包括：According to the first aspect, a method for determining false awakening corpus is provided, including:

通过至少一个音频采集器采集干扰音频数据；其中，音频采集器所在的音频采集区域中包括与音频采集器关联的唤醒引擎；Collect interference audio data by at least one audio collector; wherein, the audio collection area where the audio collector is located includes a wake-up engine associated with the audio collector;

将音频采集器采集的干扰音频数据输入关联的唤醒引擎；Input the interference audio data collected by the audio collector into the associated wake-up engine;

在唤醒引擎被输入的干扰音频数据成功唤醒情况下，将该干扰音频数据作为唤醒引擎的误唤醒语料。In the case that the wake-up engine is successfully woken up by the input interference audio data, the interference audio data is used as the false wake-up corpus of the wake-up engine.

根据第二方面，提供了一种误唤醒语料确定装置，包括：According to a second aspect, a device for determining false awakening corpus is provided, comprising:

语音采集模块，用于通过至少一个音频采集器采集干扰音频数据；其中，音频采集器所在的音频采集区域中包括与音频采集器关联的唤醒引擎；a voice collection module, configured to collect interference audio data through at least one audio collector; wherein, the audio collection area where the audio collector is located includes a wake-up engine associated with the audio collector;

数据输入模块，用于将音频采集器采集的干扰音频数据输入关联的唤醒引擎；The data input module is used to input the interfering audio data collected by the audio collector into the associated wake-up engine;

误唤醒语料确定模块，用于在唤醒引擎被输入的干扰音频数据成功唤醒情况下，将该干扰音频数据作为唤醒引擎的误唤醒语料。The false wake-up corpus determination module is configured to use the interference audio data as the false wake-up corpus of the wake-up engine when the wake-up engine is successfully woken up by the input interference audio data.

根据第三方面，提供了一种电子设备，包括：According to a third aspect, an electronic device is provided, comprising:

至少一个处理器；以及at least one processor; and

与至少一个处理器通信连接的存储器；其中，a memory communicatively coupled to the at least one processor; wherein,

存储器存储有可被至少一个处理器执行的指令，指令被至少一个处理器执行，以使至少一个处理器能够执行本申请任意实施例的误唤醒语料确定方法。The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the method for determining a false wake corpus of any embodiment of the present application.

根据第四方面，提供了一种存储有计算机指令的非瞬时计算机可读存储介质，计算机指令用于使计算机执行本申请任意实施例的误唤醒语料确定方法。According to a fourth aspect, a non-transitory computer-readable storage medium storing computer instructions is provided, and the computer instructions are used to cause a computer to execute the method for determining a false awakening corpus of any embodiment of the present application.

根据本申请的技术，实现了自动收集误唤醒语料的目的，提升了确定误唤醒语料的效率。According to the technology of the present application, the purpose of automatically collecting false awakening corpus is achieved, and the efficiency of determining false awakening corpus is improved.

应当理解，本部分所描述的内容并非旨在标识本公开的实施例的关键或重要特征，也不用于限制本公开的范围。本公开的其它特征将通过以下的说明书而变得容易理解。It should be understood that what is described in this section is not intended to identify key or critical features of embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become readily understood from the following description.

Brief Description of the Drawings

The accompanying drawings are provided for a better understanding of the solution and do not limit the present application. In the drawings:

FIG. 1 is a schematic flowchart of a method for determining false wake-up corpus according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of a method for determining false wake-up corpus according to an embodiment of the present application;

FIG. 3 is a schematic flowchart of a method for determining false wake-up corpus according to an embodiment of the present application;

FIG. 4 is a schematic flowchart of a method for determining false wake-up corpus according to an embodiment of the present application;

FIG. 5 is a flowchart of a method for determining false wake-up corpus according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an apparatus for determining false wake-up corpus according to an embodiment of the present application;

FIG. 7 is a block diagram of an electronic device used to implement the method for determining false wake-up corpus according to an embodiment of the present application.

Detailed Description

以下结合附图对本申请的示范性实施例-做出说明，其中包括本申请实施例的各种细节以助于理解，应当将它们认为仅仅是示范性的。因此，本领域普通技术人员应当认识到，可以对这里描述的实施例做出各种改变和修改，而不会背离本申请的范围和精神。同样，为了清楚和简明，以下的描述中省略了对公知功能和结构的描述。Exemplary embodiments of the present application are described below with reference to the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.

FIG. 1 is a schematic flowchart of a method for determining false wake-up corpus according to an embodiment of the present application. This embodiment is applicable to the case where an in-vehicle head unit determines, from the audio data collected by an audio collector, the corpus that falsely wakes the voice assistant. The method can be performed by an apparatus for determining false wake-up corpus, which is implemented in software and/or hardware and is preferably configured in an electronic device.

参见图1，误唤醒语料确定方法具体如下：Referring to Figure 1, the method for determining the false wake-up corpus is as follows:

S101、通过至少一个音频采集器采集干扰音频数据。S101. Collect interference audio data through at least one audio collector.

The audio collector is optionally a microphone. A plurality of audio collection areas (that is, sound zones) are preset in the vehicle, and each sound zone is provided with a microphone for collecting the audio data of the sound zone in which it is located. Interference audio data is audio that is not speech input by a user; optionally, it is audio played by an external tool (an artificial mouth or a loudspeaker).

S102、将音频采集器采集的干扰音频数据输入关联的唤醒引擎。S102. Input the interference audio data collected by the audio collector into the associated wake-up engine.

In the embodiments of the present application, the audio collection area in which an audio collector is located includes a wake-up engine associated with that audio collector. By way of example, sound zone 1 is provided with microphone MIC1, which is associated with wake-up engine A, and sound zone 2 is provided with microphone MIC2, which is associated with wake-up engine B. After an audio collector captures interference audio data, the captured data is therefore input directly into the wake-up engine associated with that audio collector. For example, after MIC1 captures interference audio data, the data is transmitted directly to wake-up engine A and is not transmitted to any other wake-up engine. It should be noted that, because the captured audio data is input directly into the wake-up engine, there is no need to first save the captured audio data locally and then manually pick out the false wake-up corpus from the local recordings, which keeps the subsequent determination of false wake-up corpus efficient.

S103、在唤醒引擎被输入的干扰音频数据成功唤醒情况下，将该干扰音频数据作为唤醒引擎的误唤醒语料。S103. In the case that the wake-up engine is successfully woken up by the input interference audio data, use the interference audio data as a false wake-up corpus of the wake-up engine.

In the embodiments of the present application, a wake-up monitoring mechanism is provided to monitor in real time whether each wake-up engine has been successfully woken up. Because the audio input to the wake-up engine is interference audio data and the user issues no voice command, whenever the monitoring mechanism detects that a wake-up engine has been woken up by the input interference audio data, that interference audio data is taken as false wake-up corpus for the wake-up engine.

In the embodiments of the present application, the collected interference audio is transmitted directly to the corresponding wake-up engine, and whenever the wake-up engine is woken up the interference audio data is taken as false wake-up corpus. False wake-up corpus is thereby confirmed and collected automatically. This avoids first saving all captured audio locally and then manually picking the false wake-up corpus out of the local recordings, and so improves the efficiency of confirming and collecting false wake-up corpus.
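
The flow of S101 to S103 can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the class names `WakeEngine` and `FalseWakeCollector`, the `detector` callable and the byte-string "audio" are all hypothetical stand-ins for a real wake-up engine and real microphone data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WakeEngine:
    """Hypothetical stand-in for a wake-up engine bound to one sound zone."""
    name: str
    detector: Callable[[bytes], bool]  # assumption: returns True on wake-up

    def feed(self, chunk: bytes) -> bool:
        # Report whether this chunk woke the engine.
        return self.detector(chunk)


@dataclass
class FalseWakeCollector:
    engines: Dict[str, WakeEngine]  # microphone id -> associated engine
    corpus: Dict[str, List[bytes]] = field(default_factory=dict)

    def on_audio(self, mic_id: str, chunk: bytes) -> None:
        # S102: audio goes only to the engine associated with this microphone.
        engine = self.engines[mic_id]
        # S103: no user is speaking, so any wake-up counts as a false one.
        if engine.feed(chunk):
            self.corpus.setdefault(engine.name, []).append(chunk)


collector = FalseWakeCollector({
    "MIC1": WakeEngine("A", lambda c: b"wake" in c),
    "MIC2": WakeEngine("B", lambda c: False),
})
collector.on_audio("MIC1", b"noise wake noise")
collector.on_audio("MIC2", b"noise wake noise")
```

In this toy run only engine A is woken, so only engine A accumulates corpus; engine B received the same audio from its own microphone but never reported a wake-up.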

图2是根据本申请实施例的误唤醒语料确定方法的流程示意图，本实施例是在上述实施例的基础上进行优化，参见图2，误唤醒语料确定方法具体如下：FIG. 2 is a schematic flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application. This embodiment is optimized on the basis of the above-mentioned embodiment. Referring to FIG. 2 , the method for determining a false-awakening corpus is as follows:

S201、通过至少一个音频采集器采集干扰音频数据。S201. Collect interference audio data through at least one audio collector.

其中，音频采集器所在的音频采集区域中包括与音频采集器关联的唤醒引擎。The audio collection area where the audio collector is located includes a wake-up engine associated with the audio collector.

S202、将音频采集器采集的干扰音频数据输入关联的唤醒引擎。S202. Input the interference audio data collected by the audio collector into the associated wake-up engine.

S203、在唤醒引擎被输入的干扰音频数据成功唤醒情况下，将该干扰音频数据中在成功唤醒前预设时长的干扰音频数据，作为唤醒引擎的误唤醒语料。S203. In the case that the wake-up engine is successfully woken up by the input interference audio data, the interference audio data with a preset duration before the successful wake-up in the interference audio data is used as the false wake-up corpus of the wake-up engine.

In the prior art, when false wake-up corpus is selected manually, the complete audio file recorded by the microphone is usually taken as the false wake-up corpus, so the corpus contains a large amount of irrelevant data. For example, in one false wake-up corpus (say, 8 hours of recorded data), only the last 10 seconds of audio actually woke the wake-up engine, so everything in that corpus other than the last 10 seconds is irrelevant audio. In addition, storing the complete audio file as false wake-up corpus takes up a large amount of storage space.

For this reason, the inventors propose that, when the wake-up engine is successfully woken up by the input interference audio data, the portion of that interference audio data spanning a preset duration before the successful wake-up is taken as the false wake-up corpus for the wake-up engine. Optionally, the monitoring mechanism determines the time at which the wake-up engine was woken up, and the audio data input during the preset duration before that time is taken as the false wake-up corpus. The preset duration can be set according to actual needs and is, by way of example, 10 seconds.

In the embodiments of the present application, taking only the interference audio data of a preset duration before the successful wake-up as the false wake-up corpus both reduces the irrelevant data in the corpus and saves storage space when the corpus is saved.
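
As a rough illustration of S203, the sketch below cuts the last preset duration of raw PCM audio that precedes the wake-up. The function name `tail_before_wake` and the default format (16 kHz, 16-bit, mono) are assumptions made for the example; the source does not specify an audio format.

```python
def tail_before_wake(pcm: bytes, wake_offset: int, seconds: float,
                     sample_rate: int = 16000, sample_width: int = 2,
                     channels: int = 1) -> bytes:
    """Return up to `seconds` of audio ending at byte offset `wake_offset`."""
    frame = sample_width * channels              # bytes per sample frame
    n_bytes = int(seconds * sample_rate) * frame
    end = wake_offset - (wake_offset % frame)    # align to a whole frame
    start = max(0, end - n_bytes)                # shorter recordings are kept whole
    return pcm[start:end]


# 12 seconds of silence; the engine is assumed to wake at the very end.
recording = bytes(16000 * 2 * 12)
clip = tail_before_wake(recording, len(recording), 10.0)
```

With these assumed parameters a 10-second clip is 320,000 bytes, whereas an 8-hour recording in the same format would be roughly 0.9 GB.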

图3是根据本申请实施例的误唤醒语料确定方法的流程示意图，本实施例是在上述实施例的基础上进行优化，参见图3，该误唤醒语料确定方法具体如下：FIG. 3 is a schematic flowchart of a method for determining a false awakening corpus according to an embodiment of the present application. The present embodiment is optimized on the basis of the above-mentioned embodiment. Referring to FIG. 3 , the method for determining the false awakening corpus is as follows:

S301、通过至少一个音频采集器采集干扰音频数据。S301. Collect interference audio data through at least one audio collector.

S302、将音频采集器采集的干扰音频数据输入关联的唤醒引擎。S302. Input the interference audio data collected by the audio collector into the associated wake-up engine.

S303、将干扰音频数据输入到与唤醒引擎关联的缓存单元中，其中，缓存单元被配置为存储输入的干扰音频数据中最后预设时长的干扰音频数据。S303. Input the interference audio data into a buffer unit associated with the wake-up engine, where the buffer unit is configured to store the interference audio data of the last preset duration in the input interference audio data.

In the embodiments of the present application, each wake-up engine is associated with one buffer unit, which is configured to store the last preset duration of the input interference audio data. In other words, the buffer unit uses a first-in, first-out caching mechanism: as interference audio data is input into the buffer unit associated with the wake-up engine, the buffer unit always holds the most recently input interference audio data of the preset duration. The preset duration can be set according to actual needs, for example 10 seconds.

需要说明的是，S302和S303是同步执行的，也即在将音频采集器采集的干扰音频数据输入关联的唤醒引擎的同时，同步将采集的干扰音频数据输入到该唤醒引擎关联的缓存单元中。而之所以同步存入缓存单元，是为了保证唤醒引擎被成功唤醒时，导致唤醒引擎被成功唤醒的干扰音频数据当前正在缓存单元中。It should be noted that S302 and S303 are performed synchronously, that is, while the interference audio data collected by the audio collector is input into the associated wake-up engine, the collected interference audio data is simultaneously input into the cache unit associated with the wake-up engine. . The reason why it is stored in the buffer unit synchronously is to ensure that when the wake-up engine is successfully woken up, the interfering audio data that causes the wake-up engine to be successfully woken up is currently in the buffer unit.

In an optional implementation, the buffer unit is, by way of example, an lru (Least Recently Used) buffer queue. Its capacity is calculated from the preset duration, the sampling rate and the bit depth, so once these three parameters are fixed the capacity is finite. When captured audio data is written to the lru buffer queue in real time and the buffer is full, the earliest stored audio data is deleted on a first-in, first-out basis to make room for newly input audio data.
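
The capacity calculation and the first-in, first-out eviction described above can be sketched as follows. This is an illustrative sketch only: it uses Python's `collections.deque` with `maxlen` as the bounded queue, and the names `buffer_capacity_bytes` and `RollingAudioBuffer`, the mono channel count and the 16 kHz / 16-bit defaults are assumptions, not values given in the source.

```python
from collections import deque


def buffer_capacity_bytes(seconds: float, sample_rate: int,
                          bit_depth: int, channels: int = 1) -> int:
    """Capacity = duration x sampling rate x bytes per sample x channels."""
    return int(seconds * sample_rate) * (bit_depth // 8) * channels


class RollingAudioBuffer:
    """Keeps only the most recently written bytes, up to a fixed capacity."""

    def __init__(self, seconds: float = 10.0, sample_rate: int = 16000,
                 bit_depth: int = 16) -> None:
        self.capacity = buffer_capacity_bytes(seconds, sample_rate, bit_depth)
        # A deque with maxlen silently drops the oldest items when full.
        self._bytes = deque(maxlen=self.capacity)

    def write(self, chunk: bytes) -> None:
        self._bytes.extend(chunk)

    def snapshot(self) -> bytes:
        return bytes(self._bytes)


tiny = RollingAudioBuffer(seconds=1, sample_rate=2, bit_depth=16)  # 4 bytes
tiny.write(b"abcdef")
```

A per-byte deque keeps the example short; a production buffer would more likely use a preallocated ring buffer of frames. Note that the eviction here is purely by insertion order, which matches the first-in, first-out behavior the text describes.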

S304、响应于任一唤醒引擎被成功唤醒，将该唤醒引擎关联的缓存单元中所缓存的干扰音频数据作为唤醒引擎的误唤醒语料。S304. In response to any wake-up engine being successfully woken up, use the interfering audio data buffered in the cache unit associated with the wake-up engine as the false wake-up corpus of the wake-up engine.

In the embodiments of the present application, when the wake-up monitoring mechanism detects that a wake-up engine has been successfully woken up, the interference audio data transmitted to that engine has also been saved synchronously in the lru buffer queue associated with it. The interference audio data held at that moment in the buffer unit associated with the wake-up engine can therefore be taken directly as the false wake-up corpus for that engine. By way of example, the buffer unit holds the latest 10 seconds of interference audio data; if the wake-up engine is successfully woken up at some moment, the 10 seconds of interference audio data held in the buffer unit at that moment is taken as the false wake-up corpus. It should be noted that if, in the initial stage, the buffer unit holds less than 10 seconds of interference audio data when the wake-up is detected (for example, only 6 seconds has been saved), the saved 6 seconds of interference audio data is taken as the false wake-up corpus.

本申请实施例，在任一唤醒引擎被成功唤醒情况下，从与该唤醒引擎关联的缓存单元中直接获取干扰音频数据作为误唤醒语料，可以提升获取误唤醒语料的效率。In the embodiment of the present application, when any wake-up engine is successfully woken up, the interference audio data is directly obtained from the cache unit associated with the wake-up engine as the false wake-up corpus, which can improve the efficiency of acquiring the false-awaken corpus.
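
The interplay of S302 to S304 can be sketched as below, including the initial-stage case where less than the full window has been buffered. The sketch assumes 1-second chunks and a 10-chunk window, and the names `buffers`, `saved` and `on_chunk` are hypothetical; the `woke` flag stands in for the callback from the wake-up monitoring mechanism.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

CHUNKS_PER_WINDOW = 10  # assumption: 1-second chunks, 10-second window

# One bounded queue per sound zone, i.e. per wake-up engine.
buffers: Dict[str, Deque[bytes]] = {
    zone: deque(maxlen=CHUNKS_PER_WINDOW) for zone in ("zone1", "zone2")
}
saved: List[Tuple[str, bytes]] = []


def on_chunk(zone: str, chunk: bytes, woke: bool) -> None:
    # S303: cache the chunk at the same time it is fed to the engine.
    buffers[zone].append(chunk)
    if woke:
        # S304: take whatever is buffered, even if shorter than the window.
        clip = b"".join(buffers[zone])
        buffers[zone].clear()
        saved.append((zone, clip))


# Engine in zone1 wakes after only 6 chunks: all 6 are kept.
for i in range(6):
    on_chunk("zone1", bytes([i]), woke=(i == 5))

# Engine in zone2 wakes after 12 chunks: only the last 10 are kept.
for i in range(12):
    on_chunk("zone2", bytes([i]), woke=(i == 11))
```

Clearing the queue after a wake-up is a design choice of this sketch, so that consecutive false wake-ups do not share audio.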

图4是根据本申请实施例的误唤醒语料确定方法的流程示意图，本实施例是在上述实施例的基础上进行优化，参见图4，该误唤醒语料确定的方法具体如下：FIG. 4 is a schematic flowchart of a method for determining a false awakening corpus according to an embodiment of the present application. This embodiment is optimized on the basis of the above-mentioned embodiment. Referring to FIG. 4 , the method for determining the false awakening corpus is as follows:

S401、通过至少一个音频采集器采集干扰音频数据.S401. Collect interference audio data through at least one audio collector.

S402、将音频采集器采集的干扰音频数据输入关联的唤醒引擎。S402. Input the interference audio data collected by the audio collector into the associated wake-up engine.

S403、在唤醒引擎被输入的干扰音频数据成功唤醒情况下，将该干扰音频数据作为唤醒引擎的误唤醒语料。S403. In the case that the wake-up engine is successfully woken up by the input interference audio data, use the interference audio data as the false wake-up corpus of the wake-up engine.

S404、识别误唤醒语料中包括的唤醒词，且按照唤醒词对误唤醒语料进行命名。S404. Identify the wake-up words included in the false-awakening corpus, and name the false-awakening corpus according to the wake-up words.

在得到误唤醒语料之后，可对误唤醒语料进行识别，确定误唤醒语料中包括的唤醒词，并按照唤醒词对误唤醒语料进行命名，以便后续用户可直接根据误唤醒语料的名称直观的确定哪个唤醒词导致唤醒引擎被误唤醒。而且在按照唤醒词对误唤醒语料进行命名后，可将误唤醒语料存储在车机的本地。而在一种可选的实施方式中，可通过文件写入工具完成误唤醒语料的命名和保存。After the false awakening corpus is obtained, the false awakening corpus can be identified, the wake words included in the false awakening corpus can be determined, and the false awakening corpus can be named according to the wake word, so that subsequent users can intuitively determine the name of the false awakening corpus. Which wake word caused the wake engine to be falsely woken up. Moreover, after the false wake-up corpus is named according to the wake-up word, the false wake-up corpus can be stored locally in the vehicle. In an optional implementation manner, the naming and saving of the false wake-up corpus can be completed through a file writing tool.
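
A file-writing step of this kind might look like the following sketch, which uses Python's standard `wave` module. The function name `save_false_wake`, the file-name pattern `<wake_word>_<index>.wav` and the audio format are assumptions for illustration; recognizing the wake-up word itself is outside the sketch and the word is simply passed in.

```python
import wave
from pathlib import Path


def save_false_wake(pcm: bytes, wake_word: str, out_dir: Path, index: int,
                    sample_rate: int = 16000, sample_width: int = 2) -> Path:
    """Write one false wake-up clip as a mono WAV named after its wake word."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming convention: the wake word comes first in the name.
    path = out_dir / f"{wake_word}_{index:04d}.wav"
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return path
```

For example, `save_false_wake(clip, "hello_car", Path("false_wakes"), 1)` would write `false_wakes/hello_car_0001.wav`, where `hello_car` is a made-up wake-up word.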

S405、根据命名后的误唤醒语料，确定唤醒词的误唤醒率。S405 , according to the named false awakening corpus, determine the false awakening rate of the awakening word.

Further, after the false wake-up stress test, that is, after the external tool stops playing interference audio data, the false wake-up rate of each wake-up word per unit time is determined from the names of all the false wake-up corpus stored locally. The user thereby learns which wake-up word causes the highest false wake-up rate and can avoid it when subsequently entering voice commands.
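
Assuming the stored clips are named with the wake-up word followed by an underscore and an index (a hypothetical convention, not one stated in the source), the per-hour false wake-up rate can be computed from the file names alone, as in this sketch:

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Iterable


def false_wake_rates(file_names: Iterable[str],
                     test_hours: float) -> Dict[str, float]:
    """False wake-ups per hour for each wake word, parsed from file names."""
    # "hello_car_0001.wav" -> stem "hello_car_0001" -> wake word "hello_car"
    counts = Counter(Path(name).stem.rsplit("_", 1)[0] for name in file_names)
    return {word: n / test_hours for word, n in counts.items()}


rates = false_wake_rates(
    ["hello_car_0001.wav", "hello_car_0002.wav", "hi_car_0001.wav"],
    test_hours=8.0,
)
worst = max(rates, key=rates.get)  # wake word with the highest rate
```

Here two clips over an 8-hour test give 0.25 false wake-ups per hour for the made-up word `hello_car`, and one clip gives 0.125 per hour for `hi_car`.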

S406、将唤醒引擎的误唤醒语料作为唤醒引擎中唤醒模型的负样本，对唤醒模型进行训练。S406. Use the false wake-up corpus of the wake-up engine as a negative sample of the wake-up model in the wake-up engine, and train the wake-up model.

The head unit can use the locally stored false wake-up corpus as negative samples for the wake-up model in the wake-up engine and train the wake-up model, so as to optimize the wake-up engine. It should be noted that the false wake-up corpus stored locally on the head unit can also be sent to the cloud, where the corrective training of the wake-up model is carried out.

FIG. 5 is a flowchart of a method for determining false wake-up corpus according to an embodiment of the present application. Referring to FIG. 5, the vehicle in this embodiment is, by way of example, provided with four sound zones. MIC1 to MIC4 in FIG. 5 are four audio collectors (for example, microphones) of the head unit, placed in different sound zones to collect audio data. Each microphone is associated with one wake-up engine, and four lru buffer queues are provided, so that the wake-up engine of each sound zone is associated with one lru buffer queue. A wake-up monitoring mechanism is also provided to monitor in real time whether a wake-up engine has been successfully woken up.

In operation, the external tool plays interference audio data inside the vehicle, and microphones MIC1, MIC2, MIC3 and MIC4 each collect the interference audio data of their own sound zone. Each microphone feeds the collected interference audio data into its wake-up engine and, at the same time, stores the interference audio data, limited to the preset duration, in the lru buffer queue corresponding to that wake-up engine. When the monitoring mechanism detects that a wake-up engine has been successfully woken up, a callback confirms which sound zone's wake-up engine was woken up, the lru buffer queue associated with that wake-up engine is identified, and the interference audio data currently held in that lru buffer queue is taken as the false wake-up corpus. To save the false wake-up corpus, it is popped from the lru buffer queue and passed to a file-writing tool, which saves it locally on the head unit. False wake-up corpus is thereby collected automatically.

FIG. 6 is a schematic structural diagram of an apparatus for determining false wake-up corpus according to an embodiment of the present application. This embodiment is applicable to the case where an in-vehicle head unit determines, from the audio data collected by an audio collector, the corpus that falsely wakes the voice assistant. As shown in FIG. 6, the apparatus 600 specifically includes:

a voice collection module 601, configured to collect interference audio data through at least one audio collector, where the audio collection area in which the audio collector is located includes a wake-up engine associated with the audio collector;

a data input module 602, configured to input the interference audio data collected by the audio collector into the associated wake-up engine;

a false wake-up corpus determination module 603, configured to take the interference audio data as false wake-up corpus for the wake-up engine when the wake-up engine is successfully woken up by the input interference audio data.

在上述实施例的基础上，可选的，误唤醒语料确定模块，包括：On the basis of the above-mentioned embodiment, optionally, the false wake-up corpus determination module includes:

误唤醒语料确定单元，用于在唤醒引擎被输入的干扰音频数据成功唤醒情况下，将该干扰音频数据中在成功唤醒前预设时长的干扰音频数据，作为唤醒引擎的误唤醒语料。The false wake-up corpus determination unit is used for, in the case that the wake-up engine is successfully woken up by the input interference audio data, the interference audio data with a preset duration before the successful wake-up in the interference audio data is used as the false wake-up corpus of the wake-up engine.

在上述实施例的基础上，可选的，该装置还包括：On the basis of the foregoing embodiment, optionally, the device further includes:

缓存模块，用于将干扰音频数据输入到与唤醒引擎关联的缓存单元中，其中，缓存单元被配置为存储输入的干扰音频数据中最后预设时长的干扰音频数据；a buffering module for inputting the interference audio data into a buffer unit associated with the wake-up engine, wherein the buffer unit is configured to store the interference audio data of the last preset duration in the input interference audio data;

误唤醒语料确定单元具体用于：The false awakening corpus determination unit is specifically used for:

响应于任一唤醒引擎被成功唤醒，将该唤醒引擎关联的缓存单元中所缓存的干扰音频数据作为唤醒引擎的误唤醒语料。In response to any wake-up engine being successfully woken up, the interfering audio data buffered in the buffer unit associated with the wake-up engine is used as the false wake-up corpus of the wake-up engine.

识别与命名模块，用于将该干扰音频数据作为唤醒引擎的误唤醒语料之后，识别误唤醒语料中包括的唤醒词，且按照唤醒词对误唤醒语料进行命名。The identification and naming module is used for identifying the wake-up words included in the false-awakening corpus after using the interference audio data as the false-awakening corpus of the wake-up engine, and naming the false-awakening corpus according to the wake-up word.

计算模块，用于在按照唤醒词对误唤醒语料进行命名之后，根据命名后的误唤醒语料，确定唤醒词的误唤醒率。The calculation module is used for determining the false awakening rate of the awakening word according to the named false awakening corpus after naming the false awakening corpus according to the awakening word.

训练模块，用于将该干扰音频数据作为唤醒引擎的误唤醒语料之后，将唤醒引擎的误唤醒语料作为唤醒引擎中唤醒模型的负样本，对唤醒模型进行训练。The training module is used for training the wake-up model by using the interference audio data as the false-awakening corpus of the wake-up engine, and using the false-awakening corpus of the wake-up engine as a negative sample of the wake-up model in the wake-up engine.

The apparatus 600 for determining false wake-up corpus provided by the embodiments of the present application can execute the method for determining false wake-up corpus provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to the method. For content not described in detail in this embodiment, refer to the description in any method embodiment of the present application.

根据本申请的实施例，本申请还提供了一种电子设备和一种可读存储介质。According to the embodiments of the present application, the present application further provides an electronic device and a readable storage medium.

FIG. 7 is a block diagram of an electronic device for implementing the method for determining false wake-up corpus according to an embodiment of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only, and are not intended to limit the implementations of the application described and/or claimed herein.

As shown in FIG. 7, the electronic device includes one or more processors 701, a memory 702, and interfaces for connecting the components, including a high-speed interface and a low-speed interface. The components are interconnected by different buses and may be mounted on a common motherboard or in other ways as needed. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory for displaying graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices may be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). FIG. 7 takes one processor 701 as an example.

The memory 702 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor executes the method for determining false wake-up corpus provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions, and the computer instructions are used to cause a computer to execute the method for determining false wake-up corpus provided by the present application.

As a non-transitory computer-readable storage medium, the memory 702 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for determining false wake-up corpus in the embodiments of the present application (for example, the voice acquisition module 601, the data input module 602, and the false wake-up corpus determination module 603 shown in FIG. 6). By running the non-transitory software programs, instructions, and modules stored in the memory 702, the processor 701 executes the various functional applications and data processing of the server, that is, it implements the method for determining false wake-up corpus in the above method embodiments.
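How the three modules might cooperate can be sketched as follows. This is an illustrative outline only: the class name `FalseWakeCollector`, the collector identifiers, and the modelling of a wake-up engine as a callable that returns `True` when woken are all assumptions, not interfaces defined by the application.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a wake-up engine is modelled as a function that takes raw
# audio bytes and returns True if the audio woke it up.
WakeEngine = Callable[[bytes], bool]


@dataclass
class FalseWakeCollector:
    """Routes interference audio to the engine associated with its collector."""

    engines: Dict[str, WakeEngine]                        # collector id -> engine
    corpus: Dict[str, List[bytes]] = field(default_factory=dict)

    def process(self, collector_id: str, audio: bytes) -> bool:
        # Data input step (module 602): feed the audio to the associated engine.
        woke = self.engines[collector_id](audio)
        # Determination step (module 603): interference audio that wakes the
        # engine is recorded as false wake-up corpus for that engine.
        if woke:
            self.corpus.setdefault(collector_id, []).append(audio)
        return woke


# Toy engine for illustration: "wakes" whenever the clip contains b"xiaodu".
pipeline = FalseWakeCollector(engines={"mic0": lambda clip: b"xiaodu" in clip})
# Acquisition step (module 601) is simulated with two hard-coded clips.
first = pipeline.process("mic0", b"radio chatter xiaodu-like noise")
second = pipeline.process("mic0", b"engine hum")
```

Because every clip passed in is interference audio by construction, any wake-up is a false wake-up, so no manual screening of recordings is needed in this sketch.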

The memory 702 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created through use of the electronic device that implements the method for determining false wake-up corpus of the embodiments of the present application. In addition, the memory 702 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 702 may optionally include memories located remotely from the processor 701, and these remote memories may be connected over a network to the electronic device that implements the method for determining false wake-up corpus of the embodiments of the present application. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The electronic device that implements the method for determining false wake-up corpus of the embodiments of the present application may further include an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703, and the output device 704 may be connected by a bus or in other ways; FIG. 7 takes connection by a bus as an example.

The input device 703 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device that implements the method for determining false wake-up corpus of the embodiments of the present application. Examples include a touch screen, keypad, mouse, trackpad, touchpad, pointing stick, one or more mouse buttons, trackball, joystick, and other input devices. The output device 704 may include a display device, auxiliary lighting devices (for example, LEDs), haptic feedback devices (for example, vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.

Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, magnetic disks, optical disks, memories, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein may be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer having a graphical user interface or a web browser through which a user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), the Internet, and blockchain networks.

A computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship to each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the high management difficulty and weak business scalability of traditional physical hosts and VPS services.

According to the technical solutions of the embodiments of the present application, false wake-up corpus is collected automatically, which improves the efficiency of determining false wake-up corpus.

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.

The above specific embodiments do not limit the protection scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modifications, equivalent replacements, and improvements made within the spirit and principles of the present application shall fall within the protection scope of the present application.