CN111276140A - Voice command recognition method, device, system and storage medium - Google Patents

Voice command recognition method, device, system and storage medium

Info

Publication number
CN111276140A
Authority
CN
China
Prior art keywords
voice
information
behavior
control instruction
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010060884.5A
Other languages
Chinese (zh)
Other versions
CN111276140B (en)
Inventor
宋德超
陈翀
陈向文
罗晓宇
黄智刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai, Zhuhai Lianyun Technology Co Ltd
Priority to CN202010060884.5A
Publication of CN111276140A
Application granted
Publication of CN111276140B
Status: Active
Anticipated expiration

Abstract

The invention relates to a voice command recognition method, device, system and storage medium. The method comprises: acquiring voice information together with behavior image data of the voice user captured while the voice information is uttered; performing behavior feature recognition on the voice user in the behavior image data to determine whether the voice user uttered the voice information as a control instruction; and, where the voice information is determined to have been uttered as a control instruction, generating a corresponding control instruction from the voice information, the control instruction being used to control a target device to execute a corresponding action. By performing behavior feature recognition on the behavior image data captured while the voice information is uttered, the invention can determine whether the voice user intended the voice information as a control instruction, thereby avoiding false responses by the target device.

Description

Voice command recognition method, device, system and storage medium
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to a voice recognition technology, and particularly relates to a voice command recognition method, device, system and storage medium.
Background
As living standards improve, intelligent household appliances (such as intelligent air-conditioning equipment) are becoming the choice of more and more consumers. Applying a voice control module to a household appliance makes a traditional appliance more convenient and intelligent to use, and is one of the development routes of household-appliance intellectualization.
However, an existing household appliance integrated with a voice control module may make recognition errors when recognizing human speech. For example, a user may be talking with another person, and if the exchanged speech resembles a device control sentence, the appliance may mistake it for a control command and execute it, resulting in an erroneous response.
Disclosure of Invention
The main aim of the invention is to provide a new voice command recognition method, device, system and storage medium that solve the prior-art problem in which a household appliance mistakes a user's conversation with another person for a control command (because the conversational sentence resembles an appliance control sentence) and executes it. The invention helps the appliance accurately determine whether an utterance is a control command, improving the accuracy and speed of correct responses.
The invention provides a voice command recognition method in a first aspect, which comprises the following steps: acquiring voice information and behavior image data of a voice user who sends the voice information when sending the voice information; performing behavior feature recognition on the voice user in the behavior image data to determine whether the voice user sends the voice information as a control instruction; and under the condition that the voice information is determined to be sent out as a control instruction, generating a corresponding control instruction according to the voice information, wherein the control instruction is used for controlling the target equipment to execute corresponding actions according to the voice information.
Optionally, performing behavior feature recognition on the voice user in the behavior image data to determine whether the voice user sends the voice information as a control instruction, including: dividing the behavior image data to obtain a video frame sequence; performing video analysis on the video frame sequence, and determining face orientation information when a voice user sends voice information and behavior types when the voice user sends the voice information; and judging whether the voice user sends the voice information as a control instruction or not according to the face orientation information and the behavior category.
Optionally, performing video analysis on the video frame sequence to determine face orientation information when a voice user sends out voice information, including: analyzing the video frame sequence to obtain head coordinate information of the voice user on each video frame; based on the head coordinate information of the voice user, positioning the head area of the voice user on each video frame, and acquiring head posture information and face feature information in the head area corresponding to each video frame; and determining face orientation information of the voice user when the voice information is sent by the voice user based on the head posture information and the face characteristic information on each video frame by using the trained face orientation recognition model.
Optionally, performing video analysis on the video frame sequence to determine a behavior category when a voice user sends out voice information, including: screening out video frames containing voice users from the video frame sequence; and determining the behavior type of the voice user when the voice user sends out voice information based on the video frame containing the voice user by using the trained preset behavior recognition model.
Optionally, the determining, based on the trained preset behavior recognition model and based on the video frame containing the voice user, the behavior category when the voice user sends out the voice information includes: positioning and analyzing the voice user in the video frame containing the voice user to obtain the position information of the voice user in the video frame containing the voice user; extracting human behavior feature information of the voice user in a video frame sequence containing the voice user according to the position information; and determining the behavior category of the voice user based on the human behavior feature information of the voice user in the video frame containing the voice user.
Optionally, the preset behavior recognition model is trained through the following steps: performing individual positioning analysis on the persons in the field according to the video frames to obtain position information containing the persons in the field on the corresponding video frames; acquiring human behavior characteristic information of the present person in the corresponding video frame according to the position information corresponding to the present person; and training a preset behavior recognition model based on the human behavior feature information of the on-site person in the corresponding video frame.
Optionally, in the case that the voice information is determined to have been uttered as a control instruction, generating a corresponding control instruction according to the voice information includes: preprocessing the voice information and extracting voice keywords from the processing result; recognizing control information from the voice keywords using a preset voice recognition model; and, where the voice information is determined to have been uttered as a control instruction, generating a corresponding control instruction from the control information, the control instruction being used to control the target device to execute a corresponding action.
Optionally, the pre-processing comprises one or more of: denoising, pre-emphasis, framing, windowing, and endpoint detection.
Optionally, after determining that the voice information is sent as a control instruction, and before generating a corresponding control instruction according to the control information, for controlling the target device to execute a corresponding action, the method further includes: judging whether the control information is included in a preset instruction list or not based on the preset instruction list, wherein the preset instruction list comprises a plurality of executable control information; and under the condition that the preset instruction list comprises the control information, determining to generate a corresponding control instruction according to the control information so as to control the target equipment to execute a corresponding action.
A second aspect of the present invention provides a storage medium storing one or more programs executable by one or more processors to implement the voice command recognition method described above.
A third aspect of the present invention provides a voice command recognition apparatus, comprising: comprises a processor and a memory; the memory is used for storing computer instructions, and the processor is used for operating the computer instructions stored by the memory so as to realize the voice command recognition method.
A fourth aspect of the present invention provides a voice command recognition system, the system comprising: the terminal is used for acquiring voice information and behavior image data of a voice user sending the voice information when sending the voice information; the server is in communication connection with the terminal and is used for receiving the voice information sent by the terminal and behavior image data of a voice user sending the voice information when sending the voice information; wherein the server further comprises a processor and a memory; the memory is used for storing computer instructions, and the processor is used for operating the computer instructions stored by the memory so as to realize the voice command recognition method.
The invention provides an intelligent household appliance system, which comprises the voice command recognition device or system and a household appliance connected with the device or system in a communication way, wherein the household appliance receives a control instruction sent by the device or system and executes corresponding action according to the control instruction.
Compared with the prior art, the invention has the following beneficial effects: by performing behavior feature recognition on the behavior image data of the voice user captured while the voice information is uttered, it can be determined whether the voice user uttered the voice information as a control instruction. When the behavior image data shows that the voice information was not intended as a control instruction, the target device does not respond, even if the utterance is similar or identical to the language used for man-machine interaction, because the utterance is part of a conversation between people. When the behavior image data shows that the voice information was intended as a control instruction, it can be determined that the voice user is interacting with a household appliance such as an intelligent air conditioner, and the target device responds to the voice information. The target device is thereby prevented from responding mistakenly, and the accuracy of its responses is improved.
Drawings
FIG. 1 is a diagram illustrating an application environment of a voice command recognition method according to an embodiment of the present invention;
fig. 2 is a schematic workflow diagram of a voice command recognition method according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The voice command recognition method provided by the invention can be applied in the application environment of the voice command recognition system shown in FIG. 1. The terminal 102 and the server 104 are connected through a wireless network. The terminal 102 collects voice information and the behavior image data of the voice user captured while the voice information is uttered, and transmits the collected voice information and behavior image data to the server 104. The server 104 performs behavior feature recognition on the voice user in the behavior image data to determine whether the voice user uttered the voice information as a control instruction; where the voice information is determined to have been uttered as a control instruction, a corresponding control instruction is generated from the voice information and used to control the target device to execute a corresponding action. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device, and the server 104 may be implemented as an independent server or as a server cluster formed by a plurality of servers.
The target device includes, but is not limited to, intelligent household appliances. For convenience of description, an intelligent household appliance is hereinafter referred to as a household appliance.
In this embodiment, the terminal 102 includes, but is not limited to, the collecting devices (voice and video) applied to the home appliance; the server 104 includes, but is not limited to, a signal processing device that processes the data signals uploaded by the collecting devices.
In this embodiment, the server 104 forms a data connection with the home appliance; in another embodiment, the terminal 102 may instead form the data connection with the home appliance.
In this embodiment, the home appliance includes, but is not limited to, devices such as intelligent refrigerators and intelligent air conditioners.
In another embodiment, as shown in FIG. 2, a voice command recognition method is provided, exemplified by its application to the server 104 in FIG. 1, and includes the following steps:
step S201: acquiring voice information and behavior image data of a voice user who sends the voice information when sending the voice information;
step S202: performing behavior feature recognition on the voice user in the behavior image data to determine whether the voice user sends the voice information as a control instruction; if the voice information is determined to be sent out as a control instruction, the following step S203 is executed, otherwise, the following step S204 is executed;
step S203: and generating a corresponding control instruction according to the voice information so as to control the target equipment to execute a corresponding action according to the voice information.
Step S204: no treatment is performed.
In this embodiment, the terminal 102 includes a voice collecting device and a video collecting device. When the voice collecting device collects voice information, the video collecting device simultaneously collects video data in its shooting area; that is, the video collecting device acquires behavior image data of the voice user while the voice information is being uttered. In this embodiment, the voice collecting device and the video collecting device collect their respective data continuously.
After the behavior image data is obtained, behavior feature recognition is performed on the voice user in the behavior image data to determine whether the voice user uttered the voice information as a control instruction. That is, by performing behavior feature recognition on the voice user in the behavior image data, this embodiment determines whether, at the moment the voice information was acquired, the voice user performed the behavior of issuing it as a control instruction. If so, it can be determined that the voice user uttered the voice information as a control instruction, and a corresponding control instruction can be generated from the voice information to control the target device to execute a corresponding action.
Thus, in this embodiment, behavior feature recognition on the behavior image data captured while the voice information is uttered determines whether the voice user uttered it as a control instruction. When the behavior image data shows that the voice information was not intended as a control instruction, the target device does not respond, even if the utterance is similar or identical to the language used for man-machine interaction, because the utterance is part of a conversation between people. When the behavior image data shows that the voice information was intended as a control instruction, it can be determined that the voice user is interacting with a household appliance such as an intelligent air conditioner, the target device responds to the voice information, and false responses by the target device are thereby avoided.
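The control flow of steps S201 to S204 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function and field names (`is_control_behavior`, `facing_device`, and so on) are hypothetical stand-ins for the behavior-feature recognizer and instruction generator described above.

```python
def is_control_behavior(behavior_analysis):
    """Stand-in for behavior feature recognition (S202): True when the
    analysis of the behavior image data says the user was addressing
    the device rather than another person."""
    return (behavior_analysis.get("facing_device", False)
            and behavior_analysis.get("behavior") == "voice_control")

def build_instruction(voice_text):
    """Stand-in for generating a control instruction from the voice
    information (S203)."""
    return {"action": voice_text}

def handle_utterance(voice_text, behavior_analysis):
    """Steps S201-S204: emit a control instruction only when the
    behavior analysis says the utterance was meant as a command."""
    if is_control_behavior(behavior_analysis):      # S202
        return build_instruction(voice_text)        # S203
    return None                                     # S204: no processing
```

An utterance accompanied by conversational behavior therefore yields no instruction at all, which is exactly how the method avoids false responses.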
In another embodiment, an implementation manner of the step S202 is as follows:
step S221: dividing the behavior image data to obtain a video frame sequence;
step S222: performing video analysis on the video frame sequence, and determining face orientation information when a voice user sends voice information and behavior types when the voice user sends the voice information;
step S223: judging, according to the face orientation information and the behavior category, whether the voice user uttered the voice information as a control instruction.
In this embodiment, the behavior image data is divided uniformly into, for example, continuous video frames. Video analysis is then performed on the continuous video frames to determine the face orientation information and the behavior category of the voice user at the time the voice information was uttered. This establishes where the voice user was facing and whether the user was talking to a person or interacting with a machine. Whether the voice user uttered the voice information as a control instruction is then judged from the face orientation information and the behavior category together. By jointly considering face orientation and behavior category, it can be determined whether the voice user was interacting with a household appliance such as an intelligent air conditioner, improving the accuracy of the judgment.
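Steps S221 to S223 can be sketched as an aggregation over per-frame analysis results. The majority vote and the labels `toward_device` and `voice_control` are illustrative assumptions; the patent only requires that face orientation and behavior category be combined to reach the decision.

```python
from collections import Counter

def judge_command(per_frame_orientation, per_frame_behavior):
    """S221-S223 sketch: each video frame contributes one face-orientation
    label and one behavior-category label; a simple majority vote over the
    frame sequence yields the overall result, and both conditions must
    hold for the utterance to count as a control instruction."""
    orientation = Counter(per_frame_orientation).most_common(1)[0][0]
    behavior = Counter(per_frame_behavior).most_common(1)[0][0]
    return orientation == "toward_device" and behavior == "voice_control"
```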
Therefore, in another embodiment, one implementation manner of the step S222 includes:
step S2221: analyzing the continuous video frame sequence to obtain the head coordinate information of the voice user on each video frame;
step S2222: based on the head coordinate information of the voice user, positioning the head area of the voice user on each video frame, and acquiring head posture information and face feature information in the head area corresponding to each video frame;
step S2223: and determining face orientation information of the voice user when the voice information is sent by the voice user based on the head posture information and the face characteristic information on each video frame by using the trained face orientation recognition model.
In this embodiment, video analysis is performed on the consecutive video frames one by one to obtain the head coordinate information of the voice user on each video frame, yielding a set of head coordinates; the head area of the voice user on each video frame is then located from these coordinates. The head posture information and face feature information in the head area of each video frame are obtained by clipping the head region from the corresponding frame; they are then passed to a trained face orientation recognition model, which determines the face orientation information of the voice user at the time the voice information was uttered.
Specifically, the YOLOv3 algorithm may be used to detect the voice user in the sequence of consecutive video frames and to obtain the head coordinate information of each person, from which the face orientation is determined. If two people are talking, their faces are generally oriented toward each other; if a person wants to control an air conditioner by voice, the person generally faces the intelligent appliance, such as an intelligent air conditioner.
Of course, in this embodiment, a large number of labeled face orientation samples may be collected to train the face orientation recognition model, so that the trained model can be used for face orientation detection.
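As a minimal sketch of the face-orientation decision, one could compare the head yaw angle (part of the head posture information) against the bearing from the camera to the appliance. The angle representation and the 30-degree tolerance are assumptions for illustration; the patent itself leaves the decision to the trained face orientation recognition model.

```python
def face_orientation(yaw_deg, device_bearing_deg, tolerance_deg=30.0):
    """Treat the user as facing the appliance when head yaw is within a
    tolerance of the bearing toward the appliance. Angles are in degrees;
    the difference is wrapped into [-180, 180] before comparison."""
    diff = abs((yaw_deg - device_bearing_deg + 180.0) % 360.0 - 180.0)
    return "toward_device" if diff <= tolerance_deg else "away"
```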
Specifically, in another embodiment, the implementation manner of step S222 further includes:
step S2224: screening out video frames containing voice users from the video frame sequence;
step S2225: and determining the behavior type of the voice user when the voice user sends out voice information based on the video frame containing the voice user by using the trained preset behavior recognition model.
Of course, in this embodiment, the execution sequence between step S2221 to step S2223 and step S2224 to step S2225 is not limited, and step S2221 to step S2223 may be executed first, and step S2224 to step S2225 may also be executed first.
In this embodiment, human body detection is performed on all voice users in the video frame sequence, such as: detecting the voice users in the video frame sequence by taking human as a target, so as to screen out the video frames containing the voice users; and determining the behavior type of the voice user when the voice user sends out voice information through the video frame containing the voice user by using the trained preset behavior recognition model.
Specifically, in another embodiment, one implementation manner of the step S2225 includes:
step S22251: positioning and analyzing the voice user in the video frame containing the voice user to obtain the position information of the voice user in the corresponding video frame containing the voice user;
step S22252: extracting human behavior feature information of the voice user in a video frame sequence containing the voice user according to the position information;
step S22253: and determining the behavior category of the voice user based on the human behavior characteristic information of the voice user in the video frame sequence containing the voice user.
In this embodiment, by locating the voice user on each video frame, the human behavior feature information of the voice user on the corresponding video frame can be accurately extracted, and the behavior category of the voice user can be determined by combining the extracted human behavior feature information.
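The locate-then-extract pattern of steps S22251 and S22252 can be sketched as cropping the located person region out of each frame. A real system would obtain the bounding box from a detector such as YOLOv3; here the box is assumed given, and a frame is represented as a plain 2D list of pixel values for illustration.

```python
def crop_person(frame, bbox):
    """Crop the region (x, y, w, h) of the located voice user from a
    frame given as a 2D list of pixel values (rows of columns); behavior
    features would then be extracted from this region only."""
    x, y, w, h = bbox
    return [row[x:x + w] for row in frame[y:y + h]]
```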
Of course, in another embodiment, the training method of the preset behavior recognition model includes:
performing individual positioning analysis on the persons in the field according to the continuous video frames to obtain the position information of the persons in the field on the corresponding video frames; acquiring human behavior characteristic information of the present person in the corresponding video frame according to the position information corresponding to the present person; and training a preset behavior recognition model based on the human behavior feature information of the on-site person in the corresponding video frame.
Therefore, in this embodiment, each person in the video frame can perform individual positioning analysis, so as to determine whether each person in the video frame is communicating with another person, man-machine communicating with another person, or other behaviors.
In this embodiment, the preset behavior recognition model may adopt a CNN-BiLSTM model. Specifically, after human target detection, the video frames containing people are input to the preset behavior recognition model for classification and recognition. The CNN-BiLSTM model has three classification categories: communication between people, voice control of the air conditioner, and other behaviors. The model is trained on collected human behavior feature information covering these same categories from real scenes.
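The three-class output stage of the behavior recognizer can be sketched as follows. The CNN-BiLSTM itself requires a deep-learning framework; in this illustrative sketch the per-frame class scores are assumed already computed, and they are averaged over the frame sequence before taking the argmax (one common way to pool per-frame predictions, not specified by the patent).

```python
def classify_behavior(per_frame_scores):
    """Average per-frame scores for the three classes (person-to-person
    talk, voice control, other) over time, then pick the highest class."""
    classes = ("person_to_person", "voice_control", "other")
    n = len(per_frame_scores)
    avg = [sum(frame[i] for frame in per_frame_scores) / n for i in range(3)]
    return classes[avg.index(max(avg))]
```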
Furthermore, in another embodiment, one implementation manner of the step S203 includes:
step S301: preprocessing the voice information and extracting voice keywords from a processing result;
of course, in this embodiment, the pre-processing includes, but is not limited to, one or more of the following: denoising, pre-emphasis, framing, windowing, and endpoint detection.
Step S302: identifying control information from the voice keywords through a preset voice recognition model;
Step S303: and under the condition that the voice information is determined to be sent out as a control instruction, generating a corresponding control instruction according to the control information so as to control the target equipment to execute a corresponding action.
In steps S301 to S303 of this embodiment, the voice information is preprocessed and speech feature parameters are extracted from the processed voice information; the extracted speech feature parameters are then input to the preset speech recognition model to recognize the control information. When the voice information is determined to have been uttered as a control instruction, a corresponding control instruction is generated from the control information to control the target device to execute a corresponding action. Specifically, the voice information is converted into text information and the voice keywords in the text are extracted; the semantics and attributes of the extracted keywords are then looked up in a preset dictionary database. These keywords include, but are not limited to, auxiliary words, exclamations, and verbs. The preset speech recognition model then jointly analyzes the semantics and attributes of the keywords to recognize the control information.
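Two of the preprocessing operations named in step S301 can be sketched directly. The pre-emphasis coefficient 0.97 and the frame/hop sizes are conventional choices in speech processing, not values given by the patent.

```python
def pre_emphasize(samples, alpha=0.97):
    """Pre-emphasis: boost high frequencies via y[n] = x[n] - alpha*x[n-1].
    The first sample is passed through unchanged."""
    return [samples[0]] + [samples[i] - alpha * samples[i - 1]
                           for i in range(1, len(samples))]

def frame_signal(samples, frame_len, hop):
    """Framing: split the sample stream into overlapping frames of
    frame_len samples, advancing by hop samples each time."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]
```

Windowing, denoising, and endpoint detection would follow the same per-frame pattern.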
Moreover, in this embodiment, after determining that the voice information was uttered as a control instruction, and before generating a corresponding control instruction from the control information to control the target device, the voice command recognition method further includes: judging, based on a preset instruction list containing a plurality of executable control information items, whether the control information is included in the list. If so, a corresponding control instruction is generated from the control information to control the target device to execute a corresponding action; otherwise, no processing is performed.
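The keyword lookup and the preset-instruction-list check described above can be sketched together. The dictionary contents and instruction names below are hypothetical examples; the patent only states that a preset dictionary database and a preset instruction list exist.

```python
# Hypothetical keyword dictionary and executable-instruction whitelist.
KEYWORD_DICT = {"turn on": "power_on", "cool": "mode_cool"}
INSTRUCTION_LIST = {"power_on", "mode_cool", "power_off"}

def extract_control_info(text):
    """Dictionary lookup over voice keywords found in the recognized text."""
    return [cmd for kw, cmd in KEYWORD_DICT.items() if kw in text]

def to_instructions(text):
    """Only control information present in the preset instruction list
    becomes an executable instruction; everything else is dropped."""
    return [c for c in extract_control_info(text) if c in INSTRUCTION_LIST]
```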
It should be understood that, although the steps in the flowchart of FIG. 2 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and whose order of execution is not necessarily sequential; they may be performed in turn or alternately with other steps or with the sub-steps or stages of other steps.
In another embodiment of the present invention, a storage medium is provided that stores one or more programs executable by one or more processors to implement the voice command recognition method described above.
The nouns and the implementation principle related to a storage medium in this embodiment may specifically refer to a voice command recognition method in the foregoing embodiment, and are not described herein again.
In another embodiment of the present invention, a voice command recognition apparatus is provided that includes a processor and a memory; the memory is used for storing computer instructions, and the processor is used for operating the computer instructions stored by the memory to realize the voice command recognition method.
The nouns and the implementation principle related to the voice command recognition apparatus in this embodiment may specifically refer to a voice command recognition method in the foregoing embodiment, and are not described herein again.
In another embodiment of the present invention, there is provided a voice command recognition system, as shown in fig. 1, including:
the terminal, which is used for collecting voice information and the behavior image data of the voice user at the time the voice information is issued;
the server, which is in communication connection with the terminal and is used for receiving, from the terminal, the voice information and the behavior image data of the voice user at the time the voice information is issued;
wherein the server further comprises a processor and a memory; the memory is used for storing computer instructions, and the processor is used for executing the computer instructions stored in the memory to implement the voice command recognition method described above.
For the terms and the implementation principle involved in the voice command recognition system of this embodiment, reference may be made to the voice command recognition method in the foregoing embodiments; details are not repeated here.
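As an illustration of how the terminal-server split described above could operate, the following sketch gates instruction generation on a behavior-recognition result, so that ordinary conversation does not trigger the target device. The behavior feature used ("facing the device") and all function names are assumptions for illustration; the patent does not prescribe a particular implementation.

```python
# Hypothetical end-to-end sketch: the terminal uploads the recognized voice text
# together with behavior features extracted from the behavior image data; the
# server generates a control instruction only when the behavior analysis
# indicates the utterance was intended as a command.
def behavior_indicates_command(behavior_features: dict) -> bool:
    # Stand-in for the behavior-feature recognition step, e.g. whether the
    # user was facing the device while speaking. Purely illustrative.
    return behavior_features.get("facing_device", False)

def server_handle(voice_text: str, behavior_features: dict):
    if not behavior_indicates_command(behavior_features):
        return None  # treat as ordinary conversation; avoid a false response
    return {"target_device": "ac", "command": voice_text}

print(server_handle("turn on the air conditioner", {"facing_device": True}))
print(server_handle("turn on the air conditioner", {"facing_device": False}))  # None
```

The point of the sketch is the gating order: the behavior check runs before any instruction is generated, which is what prevents the false responses mentioned in the abstract.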
In another embodiment of the present invention, an intelligent home appliance system is provided, which includes the voice command recognition apparatus or system described above and a home appliance in communication connection with the apparatus or system, wherein the home appliance receives the control instruction sent by the apparatus or system and executes the corresponding action according to the control instruction.
For the terms and the implementation principle involved in the intelligent home appliance system of this embodiment, reference may be made to the voice command recognition apparatus or the voice command recognition system in the foregoing embodiments of the present invention; details are not repeated here.
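On the appliance side, executing a received control instruction can be as simple as dispatching on its fields. The instruction format and the action names below are illustrative assumptions carried over from the earlier sketch, not part of the patent.

```python
# Hypothetical dispatch of a received control instruction on the home appliance.
# Each action maps the current appliance state to a new state.
ACTIONS = {
    "power_on": lambda state: {**state, "power": True},
    "power_off": lambda state: {**state, "power": False},
    "temp_up": lambda state: {**state, "temp": state["temp"] + 1},
}

def execute(state: dict, instruction: dict) -> dict:
    action = ACTIONS.get(instruction.get("action"))
    return action(state) if action else state  # unknown instruction: no-op

state = {"power": False, "temp": 26}
state = execute(state, {"device": "ac", "action": "power_on"})
state = execute(state, {"device": "ac", "action": "temp_up"})
print(state)  # {'power': True, 'temp': 27}
```

Treating an unrecognized instruction as a no-op mirrors the "otherwise, no processing is performed" behavior of the whitelist check on the server side.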
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, the combination should be considered to be within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (13)

CN202010060884.5A | priority 2020-01-19 | filed 2020-01-19 | Voice command recognition method, device, system and storage medium | Active | granted as CN111276140B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010060884.5A | 2020-01-19 | 2020-01-19 | Voice command recognition method, device, system and storage medium

Publications (2)

Publication Number | Publication Date
CN111276140A (en) | 2020-06-12
CN111276140B (en) | 2023-05-12

Family

ID=71003314

Family Applications (1)

Application Number | Priority Date | Filing Date | Title | Status
CN202010060884.5A (granted as CN111276140B (en)) | 2020-01-19 | 2020-01-19 | Voice command recognition method, device, system and storage medium | Active

Country Status (1)

Country | Link
CN (1) | CN111276140B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN114935903A (en) * | 2022-05-24 | 2022-08-23 | 深圳小佳科技有限公司 | Internet of things system, equipment control method and device and storage medium
CN115171692A (en) * | 2022-07-15 | 2022-10-11 | 南京地平线机器人技术有限公司 | Voice interaction method and device
CN115440219A (en) * | 2022-09-02 | 2022-12-06 | 深圳市鸿启科技有限公司 | Intelligent voice control device and method for intelligent robot

Citations (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105204628A (en) * | 2015-09-01 | 2015-12-30 | 涂悦 | Voice control method based on visual awakening
CN106354264A (en) * | 2016-09-09 | 2017-01-25 | 电子科技大学 | Real-time man-machine interaction system based on eye tracking and a working method of the real-time man-machine interaction system
CN109166579A (en) * | 2018-09-04 | 2019-01-08 | 广州市果豆科技有限责任公司 | A kind of sound control method and system of combination face information
CN109448711A (en) * | 2018-10-23 | 2019-03-08 | 珠海格力电器股份有限公司 | Voice recognition method and device and computer storage medium
CN109710080A (en) * | 2019-01-25 | 2019-05-03 | 华为技术有限公司 | A screen control and voice control method and electronic device
CN109817211A (en) * | 2019-02-14 | 2019-05-28 | 珠海格力电器股份有限公司 | Electric appliance control method and device, storage medium and electric appliance
CN110136714A (en) * | 2019-05-14 | 2019-08-16 | 北京探境科技有限公司 | Natural interaction sound control method and device
CN110335600A (en) * | 2019-07-09 | 2019-10-15 | 四川长虹电器股份有限公司 | The multi-modal exchange method and system of household appliance

Also Published As

Publication number | Publication date
CN111276140B (en) | 2023-05-12

Similar Documents

Publication | Title
CN108875833B (en) | Neural network training method, face recognition method and device
CN109729383B (en) | Double-recording video quality detection method and device, computer equipment and storage medium
CN110444198B (en) | Retrieval method, retrieval device, computer equipment and storage medium
EP3617946B1 (en) | Context acquisition method and device based on voice interaction
CN111045639B (en) | Voice input method, device, electronic equipment and storage medium
CN104966053B (en) | Face identification method and identifying system
CN105700363B (en) | A kind of awakening method and system of smart home device phonetic controller
CN112016367A (en) | Emotion recognition system and method and electronic equipment
CN110262273A (en) | Household equipment control method and device, storage medium and intelligent household system
CN111276140B (en) | Voice command recognition method, device, system and storage medium
US10991372B2 | Method and apparatus for activating device in response to detecting change in user head feature, and computer readable storage medium
CN107360157A (en) | A kind of user registering method, device and intelligent air conditioner
CN110505504B (en) | Video program processing method and device, computer equipment and storage medium
CN109448711A (en) | Voice recognition method and device and computer storage medium
US10917721B1 | Device and method of performing automatic audio focusing on multiple objects
CN114155860A (en) | Abstract recording method, apparatus, computer equipment and storage medium
CN112397052A (en) | VAD sentence-breaking test method, VAD sentence-breaking test device, computer equipment and storage medium
CN116130088A (en) | Multi-mode face diagnosis method, device and related equipment
CN119048247A (en) | Method for predicting insurance risk based on pet age and related equipment thereof
CN118660128A (en) | Conference content analysis method, system and all-in-one conference machine based on artificial intelligence
CN109345184B (en) | Node information processing method and device based on micro-expressions, computer equipment and storage medium
CN117896484A (en) | Target searching method, device, equipment and medium based on visual intercom system
CN118332265A (en) | Multi-mode identification interaction method, system and computer program product
CN117524228A (en) | Voice data processing method, device, equipment and medium
CN115909505A (en) | Control method and device of sign language recognition equipment, storage medium and electronic equipment

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
