CN110853619B - Man-machine interaction method, control device, controlled device and storage medium - Google Patents

Man-machine interaction method, control device, controlled device and storage medium

Info

Publication number
CN110853619B
Authority
CN
China
Prior art keywords
voice signal
control
voice
user
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810955004.3A
Other languages
Chinese (zh)
Other versions
CN110853619A (en)
Inventor
郭涛
杨春阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pateo Network Technology Service Co Ltd
Original Assignee
Shanghai Pateo Network Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pateo Network Technology Service Co Ltd
Priority to CN201810955004.3A
Publication of CN110853619A
Application granted
Publication of CN110853619B
Legal status: Active (current)
Anticipated expiration

Abstract

The invention belongs to the technical field of intelligent control and relates to a human-computer interaction method, a control device, a controlled device and a storage medium. The human-computer interaction method comprises the following steps: receiving a voice signal; detecting characteristics of the source of the voice signal, the characteristics including the face orientation of the user who uttered the voice signal or the relative orientation of the user and the controlled device; and judging whether the voice signal comprises a wake-up word. If the voice signal comprises the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction. If the voice signal does not comprise the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction only when the characteristics of the voice signal source conform to preset characteristics, where the preset characteristics comprise that the user's face faces the front of the controlled device/control device or that the user is located in front of the controlled device. The invention thereby effectively avoids false triggering of the controlled device and improves the accuracy of the human-computer interaction method.

Description

Man-machine interaction method, control device, controlled device and storage medium
Technical Field
The invention belongs to the technical field of intelligent control, and particularly relates to a man-machine interaction method, a control device, a controlled device and a storage medium.
Background
With the popularization of intelligent terminals and the appearance of more and more intelligent devices and smart homes, human-computer interaction has become a core function. With the development of voice recognition technology, more and more intelligent devices adopt voice control to realize human-computer interaction: when an existing voice terminal detects a voice control instruction, it responds with the control code corresponding to the detected instruction based on a pre-stored mapping between voice control instructions and control codes, which belongs to the voice assistant function in human-computer interaction. At present, most intelligent terminals have a voice assistant function and generally require a specific utterance (e.g., a wake-up word) to complete triggering, so that the voice assistant enters a state of awaiting voice input. For example, when a powered-on intelligent terminal with a voice assistant function hears the user say "Xiaoai classmate", the voice assistant service is awakened.
However, the current voice assistant triggers voice control accurately only when a wake-up word is present; without a wake-up word, it cannot reliably distinguish the intended recipient of natural-language speech, so false trigger control instructions are easily generated. For example, when a user says "watch TV", there are two possible situations: the user really wants to turn on the home TV, or the user is merely chatting with other people and happens to use the words "watch TV". In the second situation, the voice assistant is liable to turn on the TV by mistake.
In view of the above problems, those skilled in the art have sought solutions.
Disclosure of Invention
In view of this, the present invention provides a human-computer interaction method, a control device, a controlled device and a storage medium, and aims to improve the accuracy of human-computer interaction.
The invention is realized as follows:
the invention provides a human-computer interaction method, which comprises the following steps: receiving a voice signal; detecting characteristics of the voice signal source, wherein the characteristics comprise the face orientation of the user who uttered the voice signal or the relative orientation of the user and a controlled device; and judging whether the voice signal comprises a wake-up word. If the voice signal comprises the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction. If the voice signal does not comprise the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction when the characteristics of the voice signal source conform to preset characteristics, wherein the preset characteristics comprise that the user's face faces the front of the controlled device/control device or that the user is located in front of the controlled device.
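For illustration only, the decision logic just described can be sketched as follows; this is a minimal Python sketch, and the wake-word list, the data layout, and every function and attribute name are assumptions of the sketch rather than identifiers from the patent:

    from dataclasses import dataclass

    @dataclass
    class SourceFeatures:
        face_toward_device: bool  # user's face is toward the device front
        user_in_front: bool       # user is located in front of the device

    WAKE_WORDS = ("xiaoai classmate",)  # assumed wake-word list

    def contains_wake_word(transcript: str) -> bool:
        text = transcript.lower()
        return any(word in text for word in WAKE_WORDS)

    def preset_characteristics_met(features: SourceFeatures) -> bool:
        # Preset characteristics from the summary: face toward the front of
        # the controlled device/control device, or user located in front.
        return features.face_toward_device or features.user_in_front

    def handle_voice_signal(transcript: str, features: SourceFeatures):
        """Return a transcript to recognize as an instruction, or None."""
        if contains_wake_word(transcript):
            return transcript  # wake word present: recognize the instruction
        if preset_characteristics_met(features):
            return transcript  # no wake word, but the source features match
        return None            # likely ordinary chat: avoid a false trigger

An utterance without the wake word is thus recognized only when the detected source characteristics match, which is how the method avoids false triggering during ordinary conversation.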
Further, the step of detecting the characteristics of the voice signal source comprises: detecting the duration for which the user's eyeballs are focused on the controlled device/control device. The preset characteristics further comprise that this duration is greater than a threshold.
Further, before the step of judging whether the voice signal comprises a wake-up word, the method comprises: acquiring the user's face, and judging whether it matches a pre-stored specific face. When the user's face matches the pre-stored specific face, the step of judging whether the voice signal comprises the wake-up word is entered; when it does not match, the method returns to the step of receiving a voice signal.
Further, after the step of performing voice command recognition on the voice signal to obtain the voice command, the method includes: and entering a man-machine conversation mode according to the voice command, outputting corresponding conversation voice and/or performing corresponding control according to the voice command.
Further, the preset feature includes that the face of the user faces the front of the control device. After the step of performing voice command recognition on the voice signal to obtain the voice command, the method comprises the following steps: and judging whether the voice instruction comprises a control object, wherein the control object comprises at least one household device. And if the voice command does not comprise a control object, acquiring the household appliance control big data according to the current environment information and/or the current time information, and acquiring at least one household appliance and household appliance control information corresponding to each household appliance according to the household appliance control big data so as to control the corresponding household appliances according to the household appliance control information respectively. And if the voice command comprises the control object, correspondingly controlling the control object according to the voice command.
Further, if the voice command includes a control object, the step of correspondingly controlling the control object includes: the method comprises the steps of detecting a face of a user, acquiring historical control information of a control object corresponding to the face, and correspondingly controlling the control object according to the historical control information, wherein the control object comprises a television and/or a music player and/or an electric lamp.
The present invention also provides a control device, comprising: a voice signal receiving module for receiving a voice signal; a feature detection module connected to the voice signal receiving module and used for detecting the characteristics of the voice signal source, wherein the characteristics comprise the face orientation of the user who uttered the voice signal or the relative orientation of the user and the controlled device; a wake-up word recognition module connected to the voice signal receiving module and used for judging whether the voice signal comprises the wake-up word; and a voice instruction acquisition module for performing voice instruction recognition on the voice signal to obtain the voice instruction when the voice signal comprises the wake-up word, and, when the voice signal does not comprise the wake-up word, performing voice instruction recognition on the voice signal to obtain the voice instruction when the characteristics of the voice signal source conform to the preset characteristics, wherein the preset characteristics comprise that the user's face faces the front of the controlled device/control device or that the user is located in front of the controlled device.
The invention also provides a controlled device, and the controlled device comprises the control device.
The invention also provides a control device comprising a processor for executing a computer program stored in a memory to implement the steps of the human-computer interaction method described above.
The invention also provides a computer-readable storage medium, in which a computer program is stored, which, when executed by a processor, implements the steps of the above-described human-computer interaction method.
In the invention, after a voice signal is received, the characteristics of the voice signal source are detected, the characteristics comprising the face orientation of the user who uttered the voice signal or the relative orientation of the user and the controlled device. Whether the voice signal comprises a wake-up word is then judged. If the voice signal comprises the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction. If the voice signal does not comprise the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction when the characteristics of the voice signal source conform to the preset characteristics, wherein the preset characteristics comprise that the user's face faces the front of the controlled device/control device or that the user is located in front of the controlled device.
In order to make the aforementioned and other objects, features and advantages of the invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
Fig. 1 is a schematic flowchart of a human-computer interaction method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a human-computer interaction method according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a control device according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of a controlled device according to a fourth embodiment of the present invention;
fig. 5 is a schematic structural diagram of a control device according to a fifth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Example one:
fig. 1 is a schematic flowchart of a human-computer interaction method according to an embodiment of the present invention. For a clear description of the man-machine interaction method provided by the first embodiment of the present invention, please refer to fig. 1.
The man-machine interaction method provided by the embodiment of the invention comprises the following steps:
and S101, receiving a voice signal.
In an embodiment, the device/apparatus applying the man-machine interaction method provided in this embodiment is in a mute detection state before receiving a voice signal, where power consumption of the device/apparatus is extremely low, so that the device/apparatus can maintain the capability of long-time operation.
In one embodiment, in step S101, the method may further include: when the volume of the received voice signal reaches a certain threshold, the process proceeds to step S102.
And S102, detecting the characteristics of the voice signal source.
Specifically, the voice signal source includes, but is not limited to, a user who utters a voice signal. The characteristics of the source of the voice signal may include the face orientation of the user originating the voice signal or the relative orientation of the user originating the voice signal and the controlled device.
In an embodiment, the detection of the face orientation of the user who sends out the voice signal may be performed by the control apparatus or the controlled device through an image capturing apparatus, wherein the image capturing apparatus may be, but is not limited to being, integrated in the control apparatus or the controlled device.
In one embodiment, the detection of the relative position of the user sending the voice signal and the controlled device may be performed by the control device or the controlled device through an image acquisition device and/or a sound source positioning device, wherein the image acquisition device and/or the sound source positioning device may be integrated in the controlled device or the control device.
In one embodiment, the control device may perform unified control on a plurality of controlled devices, where the controlled devices may be, for example, electronic curtains, televisions, electronic doors, air conditioners, electric lamps, and the like. In other embodiments, the control device may control only one controlled device, and the control device may be integrated in the controlled device.
S103, judging whether the voice signal comprises a wake-up word.
In one embodiment, the wake-up word refers to a specific word or phrase for waking up the control device or the controlled device. The wake-up word may be the name of the device or the name of a voice recognition program in the device, such as "Tmall Genie," "Xiaoai classmate," "Voice Assistant," or the like.
In other embodiments, before step S103 (judging whether the voice signal comprises the wake-up word) is performed, the method may comprise: acquiring the user's face, and judging whether it matches a pre-stored specific face. When the user's face matches a pre-stored specific face, step S103 is entered; when it does not match, the method returns to step S101. The pre-stored specific faces may be acquired in advance by the controlled device or the control device through the image collector and stored; when a plurality of specific faces are stored, one or more appellations (e.g., a person's name, a relationship title, etc.) may be set for each specific face and associated with it.
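A minimal sketch of this optional face-matching gate follows; the record layout, the similarity callable, and the 0.8 threshold are illustrative assumptions, not values from the patent:

    from dataclasses import dataclass, field
    from typing import Callable, List, Optional

    @dataclass
    class FaceRecord:
        embedding: List[float]  # face features acquired and stored in advance
        appellations: List[str] = field(default_factory=list)  # e.g. a name

    def face_gate(captured: List[float],
                  stored: List[FaceRecord],
                  similarity: Callable[[List[float], List[float]], float],
                  threshold: float = 0.8) -> Optional[FaceRecord]:
        """Return the matching record (proceed to S103) or None (back to S101)."""
        for record in stored:
            if similarity(captured, record.embedding) >= threshold:
                return record
        return None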
And S104, if the voice signal comprises the wake-up word, performing voice instruction recognition on the voice signal to obtain the voice instruction.
In one embodiment, after the step of performing voice instruction recognition on the voice signal, the method comprises: entering a human-machine conversation mode according to the voice instruction, outputting corresponding conversation voice and/or performing corresponding control according to the voice instruction. For example, if the voice instruction is to turn on the television, the control device controls the television to turn on (or the television turns itself on), and the control device or the television can then ask "What program would you like to watch?" and make the television jump to the program once the user names it.
In other embodiments, after the user's face is acquired and matches a pre-stored specific face, steps S103 and S104 are entered, and a human-machine conversation mode is then entered according to the voice instruction. For example, when the acquired face matches a pre-stored specific face, the control device or the controlled device greets the user by voice, giving the user a more intimate experience; further, a personalized human-machine conversation can be conducted according to the appellation associated with that specific face, bringing different interaction experiences to different users.
In one embodiment, while the voice instruction is obtained from the voice signal, corresponding analysis is performed by combining multiple modalities such as voice recognition, semantic understanding, and image detection and recognition, and a learning model is established, so that a more intelligent and personalized human-machine conversation mode can be realized and user experience improved. For example, after exercising, the user utters the voice signal "So hot, help me turn on the air conditioner". Voice recognition, semantic understanding, and image analysis of the user are performed on the signal, and after processing it can be determined that the user feels hot because of exercise rather than because the room is hot, so the device can reply with the voice prompt: "You have just exercised; it is suggested to rest a while before turning on the air conditioner."
In one embodiment, when the voice signal includes only the wake-up word, the user may be actively prompted to give a voice instruction; further, when the voice signal includes only the wake-up word and no voice instruction is detected within a preset time period, a voice prompt to the user may be configured (e.g., an invitation such as "Go ahead, what would you like to say?").
And S105, if the voice signal does not comprise the wake-up word, judging whether the characteristics of the voice signal source conform to the preset characteristics; if so, executing step S104, and if not, returning to step S101.
Specifically, the preset characteristics comprise that the user's face faces the controlled device/control device, or that the user is located in front of the controlled device.
In one embodiment, the step of detecting the characteristics of the voice signal source comprises: after detecting the user's face orientation, when the user's eye features can be detected, detecting the duration for which the user's eyeballs are focused on the controlled device/control device. The preset characteristics in step S105 may further comprise that this duration is greater than a threshold. Thus, when the voice signal does not comprise the wake-up word and the characteristics of the voice signal source conform to the preset characteristics, step S104 is executed; for example, when it is detected that the user's face is toward the controlled device/control device and that the user's eyeballs have been focused on it for longer than the threshold, voice instruction recognition is performed on the voice signal to obtain the voice instruction. In this way, when the voice signal contains no wake-up word, whether to recognize it as a voice instruction can be decided from the user's face orientation and eyeball focus state, effectively avoiding false triggering of the controlled device/control device when the user speaks in natural language (e.g., chatting) and greatly improving the accuracy of the human-computer interaction method.
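The gaze-duration condition might be checked as in the sketch below, assuming the image collector reports, frame by frame, whether the eyeballs are on the device; the 1.5 s threshold and the 30 fps frame rate are assumptions, since the patent only requires "a threshold":

    FOCUS_THRESHOLD_S = 1.5  # assumed value; the patent does not fix it

    def gaze_duration_s(on_device_frames: list, frame_period_s: float = 1 / 30):
        """on_device_frames: per-frame booleans, True while the user's
        eyeballs are focused on the controlled device/control device."""
        return sum(frame_period_s for on_device in on_device_frames if on_device)

    def preset_characteristics_met(face_toward_device: bool,
                                   on_device_frames: list) -> bool:
        # Both conditions of this embodiment: face orientation plus a gaze
        # duration exceeding the threshold.
        return (face_toward_device
                and gaze_duration_s(on_device_frames) > FOCUS_THRESHOLD_S)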
In other embodiments, when the user's face is toward the front of the control device, the control device can judge whether the voice instruction (obtained from the voice signal) includes a control object, where the control object may include at least one controlled device (i.e., the control device may control at least one controlled device). Further, when the voice instruction does not include a control object, the control device may automatically control the corresponding controlled devices according to the control big data. Furthermore, when the voice instruction includes a control object, the control device can automatically control that object according to its historical control information, making the human-computer interaction method provided by the invention more intelligent.
In one embodiment, the step of detecting the characteristics of the voice signal source comprises: detecting the relative orientation of the user who uttered the voice signal and the controlled device. Thus, when the voice signal does not comprise the wake-up word and the relative orientation of the user and the controlled device conforms to the preset characteristics, the step of obtaining the voice instruction is entered. The relative orientation conforms to the preset characteristics when, for example, the user is located in front of the controlled device (e.g., in front of a television), the distance between the user and the controlled device is less than a threshold (e.g., less than 5 meters from a fan), and the like.
The human-computer interaction method provided by this embodiment of the invention detects the characteristics of the voice signal source after receiving the voice signal, and judges whether the voice signal comprises a wake-up word. If the voice signal comprises the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction. If the voice signal does not comprise the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction when the characteristics of the voice signal source conform to the preset characteristics, wherein the preset characteristics comprise that the user's face faces the front of the controlled device/control device or that the user is located in front of the controlled device. False triggering of the controlled device is thereby effectively avoided, improving the accuracy of the method.
Example two:
fig. 2 is a flowchart illustrating a human-computer interaction method according to a second embodiment of the present invention. For a clear description of the man-machine interaction method provided by the second embodiment of the present invention, please refer to fig. 2.
The human-computer interaction method provided by the second embodiment of the invention is applied to a control device and comprises the following steps:
s201, receiving a voice signal.
And S202, detecting the characteristics of the voice signal source.
In particular, the characteristics of the voice signal source may include the face orientation of the user who uttered the voice signal or the relative orientation of that user and the controlled device, where the voice signal source includes, but is not limited to, the user who uttered the voice signal. Specifically, the characteristics of the voice signal source are detected immediately after the voice signal is received, so that the characteristics at the moment the user speaks can be captured in time. This prevents the detected characteristics from becoming inaccurate because the user moves or changes posture after speaking, and thus helps ensure the accuracy of the subsequent steps.
In an embodiment, the detecting of the face orientation of the user who sends the voice signal may be performed by the control device or the controlled device through an image capturing device (e.g., a camera), for example, when the control device receives the voice signal, the image capturing device of the control device is turned on to detect the face orientation of the user. For another example, when the control device receives the voice signal, the controlled device may be controlled to turn on the image capturing device to detect the face orientation of the user.
In an embodiment, the control device or the controlled device may include an image capturing device, and may also be an image capturing device external to the control device or the controlled device.
In one embodiment, the control apparatus may perform unified management on a plurality of controlled devices, such as electronic curtains, televisions, electronic doors, air conditioners, electric lights, and the like.
S203, judging whether the voice signal comprises a wake-up word.
And S204, if the voice signal comprises the wake-up word, performing voice instruction recognition on the voice signal to obtain the voice instruction.
S205, if the voice signal does not comprise the wake-up word, judging whether the characteristics of the voice signal source conform to the preset characteristics; if so, executing step S204, and if not, returning to step S201.
Specifically, the preset characteristics comprise that the user's face faces the controlled device/control device, or that the user is located in front of the controlled device.
In an embodiment, the preset characteristics may be that the user's face faces the front of the control device; when the user's face faces the front of the control device, step S206 is executed.
S206: and judging whether the voice instruction comprises a control object.
Specifically, step S206 is after the step of performing voice instruction recognition on the voice signal to acquire the voice instruction.
In one embodiment, the control object includes at least one controlled device (e.g., a television, a smart speaker, an air conditioner, a washing machine, an electric lamp, an electronic curtain, an electronic door, a floor sweeping robot, or other devices).
And S207, if the voice command does not comprise a control object, acquiring the household appliance control big data according to the current environment information and/or the current time information, and acquiring at least one household device and household appliance control information corresponding to each household device according to the household appliance control big data so as to control the corresponding household devices according to the household appliance control information respectively.
In one embodiment, the voice instruction does not include a control object when, for example, the user gives the control device a fuzzy voice instruction (e.g., "turn on"). Specifically, after acquiring the fuzzy voice instruction, the control device may determine at least one household device according to the current environment information and/or the current time information and control it correspondingly.
In one embodiment, the current environment information includes at least one of indoor temperature information, indoor brightness information, floor cleanliness information, and the number of people in the room, but is not limited to these.
In an embodiment, the control device may obtain, from the cloud server, the household appliance control big data corresponding to the current environment information and/or the current time information. The household appliance control big data stored in the cloud server may be household appliance control data corresponding to environment and/or time information uploaded by the user through the control device, or such data uploaded by other users through other control terminals; in other words, it is either the user's own household appliance control big data stored in the cloud or the household appliance control big data commonly used by other users.
Specifically, the control device may obtain the household appliance control big data corresponding to the current environment information according to the current environment information. For example, when the illumination intensity is less than 50 lux (i.e., the light is dark), the indoor temperature is higher than 35 °C, or there is rubbish on the floor, the control device obtains the user's household appliance control big data from the cloud server according to the current environment information and accordingly turns on the electric lamp or opens the curtain, turns on the air conditioner and sets its temperature, or starts the sweeping robot.
In addition, the control device can also obtain the household appliance control big data corresponding to the current time information according to the current time information. For example, the user's household appliance control big data for the period from 5 to 9 am may be to open the curtain, turn on a music player, or start the water dispenser heating; for the period from 6 pm to 8 pm it may be to turn on the television or the computer, turn on the lamp, and the like.
In addition, the control device may further obtain the household appliance control big data corresponding to both the current environment information and the current time information. For example, if the indoor temperature is higher than 35 °C and the time is between 7 pm and 7:30 pm, the control device may, according to the user's household appliance control big data obtained for this environment and time, turn on the air conditioner and set its temperature, turn on the electric lamp, or turn on the television and switch it to a broadcast program (for example, the 7 pm news broadcast on the central television station).
In another embodiment, the control device obtains, according to the current environment information and/or the current time information, the user's household appliance control big data stored locally in the control device.
In one embodiment, the obtained household appliance control big data includes at least one household device to be controlled and the household appliance control information corresponding to each such device. Therefore, when the voice instruction is obtained but includes no control object, the human-computer interaction method provided by this embodiment can intelligently select at least one household device for corresponding control according to the current environment information and/or the current time, which not only greatly improves the accuracy of human-computer interaction but also makes the method more intelligent.
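The appliance selection of step S207 might look like the sketch below; the rule layout is an assumption, and the example conditions (illumination below 50 lux, indoor temperature above 35 °C, morning and evening time windows) simply mirror the examples given above:

    from typing import Dict, List, Tuple

    # Each rule maps an environment/time condition to one appliance and its
    # control information, mimicking entries of the control big data.
    RULES: List[dict] = [
        {"when": lambda env, hour: env["lux"] < 50,
         "appliance": "lamp", "control": "turn on"},
        {"when": lambda env, hour: env["temp_c"] > 35,
         "appliance": "air_conditioner", "control": "cool"},
        {"when": lambda env, hour: 5 <= hour < 9,
         "appliance": "curtain", "control": "open"},
        {"when": lambda env, hour: 18 <= hour < 20,
         "appliance": "tv", "control": "turn on"},
    ]

    def select_appliances(env: Dict[str, float], hour: int) -> List[Tuple[str, str]]:
        """Return (appliance, control_info) pairs for a fuzzy instruction."""
        return [(rule["appliance"], rule["control"])
                for rule in RULES if rule["when"](env, hour)]

    # select_appliances({"lux": 30, "temp_c": 36}, 19)
    # -> [("lamp", "turn on"), ("air_conditioner", "cool"), ("tv", "turn on")]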
S208: and if the voice command comprises the control object, correspondingly controlling the control object according to the voice command.
In one embodiment, when the voice instruction includes a control object but does not include control information for that object, the user's face is detected and the historical control information of the control object corresponding to that face is acquired, so that the control object is controlled correspondingly according to the historical control information. For example, when the voice instruction is to turn on the television but contains no channel or program information, the control device may acquire, according to the user's face, the program information and viewing progress information (i.e., historical control information) from the user's last viewing, so that after turning on the television the control device can make it resume the multimedia content corresponding to that program and progress.
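A sketch of this history fallback follows; the per-face history store and its keys are hypothetical, introduced only for illustration:

    from typing import Dict, Optional, Tuple

    # Hypothetical history store: (face_id, appliance) -> last control info.
    History = Dict[Tuple[str, str], dict]

    def control_with_history(instruction: dict, face_id: str,
                             history: History) -> Tuple[str, Optional[dict]]:
        """If the instruction names an object but no control info, fall back
        to the user's historical control info for that object."""
        target = instruction["object"]             # e.g. "tv"
        info = instruction.get("control_info")
        if info is None:
            info = history.get((face_id, target))  # e.g. last program watched
                                                   # and viewing progress
        return target, info

    history: History = {
        ("face_001", "tv"): {"program": "last-watched series", "progress_s": 1320},
    }
    # control_with_history({"object": "tv"}, "face_001", history)
    # -> ("tv", {"program": "last-watched series", "progress_s": 1320})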
In an embodiment, the user's face may be detected at the same time as the characteristics of the voice signal source are detected in step S202.
The human-computer interaction method provided by the second embodiment of the invention comprises the following steps. A voice signal is received, and the characteristics of its source are detected. Whether the voice signal comprises a wake-up word is judged. If the voice signal comprises the wake-up word, voice instruction recognition is performed on the voice signal to obtain the voice instruction. If the voice signal does not comprise the wake-up word, whether the characteristics of the voice signal source conform to the preset characteristics is judged; if so, the voice instruction is obtained, and if not, the method returns to the step of receiving a voice signal. After the voice instruction is obtained, whether it includes a control object is judged. If the voice instruction does not include a control object, the household appliance control big data is obtained according to the current environment information and/or the current time information, and at least one household device together with the household appliance control information corresponding to each device is obtained from that big data, so that each corresponding household device is controlled according to its control information. If the voice instruction includes a control object, the control object is controlled correspondingly according to the voice instruction. In this way, when the voice signal contains no wake-up word, the characteristics of the user who uttered it are detected, and the voice instruction is obtained only when those characteristics conform to the preset characteristics, which improves the accuracy of human-computer interaction; moreover, by judging whether the obtained voice instruction includes a control object, at least one control object can be determined from the current environment information and/or current time information when none is named, and each control object controlled correspondingly, which greatly improves the intelligence of the human-computer interaction method provided by this embodiment.
Example three:
fig. 3 is a schematic structural diagram of a control device according to a third embodiment of the present invention. For a clear description of the control device 1 provided by the third embodiment of the present invention, please refer to fig. 3.

Referring to fig. 3, a control device 1 according to a third embodiment of the present invention includes: a voice signal receiving module 101, a feature detection module 102, a wake-up recognition module 103 and a voice instruction acquisition module 104.
Specifically, the voicesignal receiving module 101 is configured to receive a voice signal.
In an embodiment, before the voice signal receiving module 101 receives the voice signal, the voice signal receiving module 101 is in a mute detection state, and the power consumption of the control device 1 is very low, so that the control device 1 maintains the capability of long-time operation.
Specifically, the feature detection module 102 is connected to the voice signal receiving module 101 and configured to detect the characteristics of the voice signal source, where the characteristics include the face orientation of the user who uttered the voice signal or the relative orientation of the user and the controlled device.
In one embodiment, the feature detection module 102 includes an image acquisition device. In other embodiments, the feature detection module 102 may include an image acquisition device and/or a sound source localization device. The image acquisition device can be used to acquire image information of the voice signal source so as to identify its characteristics. The sound source localization device can judge the direction of the voice signal source from the received voice signal.
Specifically, the wake-up recognition module 103 is connected to the voice signal receiving module 101 and is configured to judge whether the voice signal includes a wake-up word.
Specifically, the voice instruction acquisition module 104 is configured to perform voice instruction recognition on the voice signal to obtain a voice instruction when the voice signal includes the wake-up word, and, when the voice signal does not include the wake-up word, to perform voice instruction recognition on the voice signal to obtain the voice instruction when the characteristics of the voice signal source conform to the preset characteristics, where the preset characteristics include that the user's face faces the front of the controlled device/control device 1, or that the user is located in front of the controlled device.
In an embodiment, the voice instruction acquisition module 104 is further configured to judge whether the voice instruction includes a control object. If the voice instruction does not include a control object, the household appliance control big data is obtained according to the current environment information and/or the current time information, and at least one household device and the household appliance control information corresponding to each household device are obtained from that big data, so that the corresponding household devices are controlled according to their respective control information. If the voice instruction includes the control object, the control object is controlled correspondingly according to the voice instruction.
In one embodiment, when the voice instruction includes the control object but does not include control information for it, the voice instruction acquisition module 104 detects the user's face and acquires the historical control information of the control object corresponding to that face, so as to control the control object correspondingly according to the historical control information.
In the control device 1 provided by the third embodiment of the present invention, the voice signal receiving module 101 is configured to receive a voice signal. The feature detection module 102 is connected to the voice signal receiving module 101 and is configured to detect the characteristics of the voice signal source, where the characteristics include the face orientation of the user who uttered the voice signal or the relative orientation of the user and the controlled device. The wake-up recognition module 103 is connected to the voice signal receiving module 101 and is configured to judge whether the voice signal includes a wake-up word. The voice instruction acquisition module 104 is configured to perform voice instruction recognition on the voice signal to obtain a voice instruction when the voice signal includes the wake-up word, and, when the voice signal does not include the wake-up word, to do so when the characteristics of the voice signal source conform to the preset characteristics, where the preset characteristics include that the user's face faces the front of the controlled device/control device 1 or that the user is located in front of the controlled device. Therefore, with the control device 1 provided in this embodiment, when a received voice signal does not include the wake-up word during human-computer interaction, whether to perform voice instruction recognition on it can be determined from the user's face orientation or the relative orientation of the user and the controlled device, and the controlled device can then be controlled according to the voice instruction. This effectively avoids false triggering of the controlled device and/or the control device 1 when the user speaks in natural language (e.g., chatting), and greatly improves the accuracy of the control device 1 during human-computer interaction.
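The wiring of the four modules of control device 1 can be summarized in the structural sketch below; the method names are assumptions of the sketch, not the patent's claim language:

    class ControlApparatus:
        """Control device 1: four modules wired as described above."""

        def __init__(self, receiver, feature_detector, wake_recognizer, acquirer):
            self.receiver = receiver                  # voice signal receiving 101
            self.feature_detector = feature_detector  # feature detection 102
            self.wake_recognizer = wake_recognizer    # wake-up recognition 103
            self.acquirer = acquirer                  # voice instruction acquisition 104

        def step(self):
            signal = self.receiver.receive()
            features = self.feature_detector.detect(signal)
            if (self.wake_recognizer.has_wake_word(signal)
                    or features.preset_met):
                return self.acquirer.recognize(signal)  # obtain the instruction
            return None                                 # ignore non-command speech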
Example four:
fig. 4 is a schematic structural diagram of a controlled device according to a fourth embodiment of the present invention. For a clear description of the controlled device 2 provided in the fourth embodiment of the present invention, please refer to fig. 4.
Referring to fig. 4, the controlled device 2 includes a control device provided by the present invention (for example, the control device 1 provided by the third embodiment). Specifically, the control device 1 can implement the human-computer interaction method provided by the present invention (for example, the method of the first embodiment and/or the second embodiment).
Therefore, during human-computer interaction, when a received voice signal does not include the wake-up word, the controlled device 2 provided in this embodiment can determine from the user's face orientation or the relative orientation of the user and the controlled device 2 whether to perform voice instruction recognition on the voice signal, and then control itself correspondingly according to the voice instruction. This effectively avoids false triggering of the controlled device 2 when the user speaks in natural language (for example, chatting), and thus greatly improves the accuracy of the controlled device 2 during human-computer interaction.
Example five:
fig. 5 is a schematic structural diagram of a control device according to a fifth embodiment of the present invention. For clearly describing the control device provided in the fifth embodiment of the present invention, please refer to fig. 5.
The control device provided by the fifth embodiment of the present invention includes a processor A101, where the processor A101 is configured to execute the computer program A6 stored in the memory A201 to implement the steps of the human-computer interaction method described in the first embodiment or the second embodiment.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, apparatus, or program product. Thus, various aspects of the invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining software and hardware, which may be referred to herein as a "circuit," "module," or "system."
In an embodiment, the control device provided in this embodiment may include at least one processor A101 and at least one memory A201, where the at least one processor A101 may be referred to as a processing unit A1 and the at least one memory A201 as a storage unit A2. Specifically, the storage unit A2 stores a computer program A6, and when the computer program A6 is executed by the processing unit A1, the control device provided by this embodiment implements the steps of the human-computer interaction method described above, such as step S206 shown in fig. 2 (judging whether the voice instruction includes a control object) or step S105 shown in fig. 1 (judging whether the characteristics of the voice signal source conform to the preset characteristics).
In one embodiment, the control device further comprises a bus connecting the different components (e.g., the processor A101 and the memory A201). In one embodiment, the bus may represent one or more of several types of bus structures, including a memory bus or memory controller bus, a peripheral bus, and the like.
Referring to fig. 5, in an embodiment, the control device provided in this embodiment includes a plurality of memories A201 (referred to as the storage unit A2 for short), and the storage unit A2 may include, for example, a random access memory (RAM) and/or a cache memory and/or a read-only memory (ROM), and the like.
Referring to fig. 5, in an embodiment, the control device in this embodiment may further include a communication interface (e.g., an I/O interface A4), and the communication interface may be used to communicate with an external device (e.g., a computer, a smart terminal, etc.).
Referring to fig. 5, in an embodiment, the control device in this embodiment may further include a display device and/or an input device (e.g., the illustrated touch display screen A3).
Referring to fig. 5, in an embodiment, the control device provided in this embodiment may further include a network adapter A5, where the network adapter A5 may be used to communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, etc.). As shown in fig. 5, the network adapter A5 may communicate with other components of the control apparatus through wires.
The control device provided in this embodiment can implement the steps of the human-computer interaction method provided in the present invention, and for specific implementation and beneficial effects, reference may be made to the first embodiment and the second embodiment of the present invention, which will not be described herein again.
Example six:
In an embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program that, when executed by a processor, can implement the steps of the human-computer interaction method of, for example, the first embodiment or the second embodiment. Alternatively, when executed by the processor, the computer program can realize the functions of the control device and the controlled device of the above embodiments.
In this embodiment, when the computer program in the computer-readable storage medium is executed by a processor, it implements the steps of the human-computer interaction method or the functions of the control device or the controlled device; specific implementations and beneficial effects may be found in embodiments one to four of the present invention and are not repeated here.
The present invention is not limited to the above preferred embodiments, and any modifications, equivalents or improvements made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

CN201810955004.3A (priority date 2018-08-21, filing date 2018-08-21): Man-machine interaction method, control device, controlled device and storage medium. Status: Active. Granted publication: CN110853619B (en).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810955004.3A | 2018-08-21 | 2018-08-21 | CN110853619B (en): Man-machine interaction method, control device, controlled device and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810955004.3A | 2018-08-21 | 2018-08-21 | CN110853619B (en): Man-machine interaction method, control device, controlled device and storage medium

Publications (2)

Publication Number | Publication Date
CN110853619A (en) | 2020-02-28
CN110853619B (en) | 2022-11-25

Family

ID=69594558

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201810955004.3A (Active) | CN110853619B (en): Man-machine interaction method, control device, controlled device and storage medium | 2018-08-21 | 2018-08-21

Country Status (1)

Country | Link
CN | CN110853619B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113359538A (en)* | 2020-03-05 | 2021-09-07 | 东元电机股份有限公司 | Voice control robot
CN111443801B (en)* | 2020-03-25 | 2023-10-13 | 北京百度网讯科技有限公司 | Human-computer interaction methods, devices, equipment and storage media
CN111562346A (en)* | 2020-05-06 | 2020-08-21 | 江苏美的清洁电器股份有限公司 | Control method, device and equipment of dust collection station, dust collection station and storage medium
CN111739533A (en)* | 2020-07-28 | 2020-10-02 | 睿住科技有限公司 | Voice control system, method and device, storage medium and voice equipment
CN112102546A (en)* | 2020-08-07 | 2020-12-18 | 浙江大华技术股份有限公司 | Man-machine interaction control method, talkback calling method and related device
CN115086095A (en)* | 2021-03-10 | 2022-09-20 | Oppo广东移动通信有限公司 | Equipment control method and related device
CN115083402B (en)* | 2021-03-15 | 2025-08-22 | Oppo广东移动通信有限公司 | Method, device, terminal and storage medium for responding to control voice
CN113470660A (en)* | 2021-05-31 | 2021-10-01 | 翱捷科技(深圳)有限公司 | Voice wake-up threshold adjusting method and system based on router flow
CN113470658A (en)* | 2021-05-31 | 2021-10-01 | 翱捷科技(深圳)有限公司 | Intelligent earphone and voice awakening threshold value adjusting method thereof
CN113470659A (en)* | 2021-05-31 | 2021-10-01 | 翱捷科技(深圳)有限公司 | Light intensity-based voice awakening threshold value adjusting method and device
CN113421567A (en)* | 2021-08-25 | 2021-09-21 | 江西影创信息产业有限公司 | Terminal equipment control method and system based on intelligent glasses and intelligent glasses
CN114253396A (en)* | 2021-11-15 | 2022-03-29 | 青岛海尔空调电子有限公司 | Target control method, device, equipment and medium
CN114697848A (en)* | 2022-05-13 | 2022-07-01 | 成都市舒听医疗器械有限责任公司 | A voice control method, device, device and medium for a hearing aid
CN115588435A (en)* | 2022-11-08 | 2023-01-10 | 荣耀终端有限公司 | Voice wake-up method and electronic device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2009244912A (en)* | 2009-07-29 | 2009-10-22 | Victor Co Of Japan Ltd | Speech signal processing device and speech signal processing method
CN102945029A (en)* | 2012-10-31 | 2013-02-27 | 鸿富锦精密工业(深圳)有限公司 | Intelligent gateway, smart home system and intelligent control method for home appliance equipment
CN104238369A (en)* | 2014-09-02 | 2014-12-24 | 百度在线网络技术(北京)有限公司 | Intelligent household appliance control method and device
CN105700363A (en)* | 2016-01-19 | 2016-06-22 | 深圳创维-Rgb电子有限公司 | Method and system for waking up smart home equipment voice control device
CN105703978A (en)* | 2014-11-24 | 2016-06-22 | 武汉物联远科技有限公司 | Smart home control system and method
CN105912092A (en)* | 2016-04-06 | 2016-08-31 | 北京地平线机器人技术研发有限公司 | Voice waking up method and voice recognition device in man-machine interaction
WO2016167004A1 * | 2015-04-14 | 2016-10-20 | シャープ株式会社 | Voice recognition system
CN107908116A (en)* | 2017-10-20 | 2018-04-13 | 深圳市艾特智能科技有限公司 | Sound control method, intelligent domestic system, storage medium and computer equipment
CN108320742A (en)* | 2018-01-31 | 2018-07-24 | 广东美的制冷设备有限公司 | Voice interactive method, smart machine and storage medium
CN108320753A (en)* | 2018-01-22 | 2018-07-24 | 珠海格力电器股份有限公司 | Control method, device and system of electrical equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR102392113B1 (en)* | 2016-01-20 | 2022-04-29 | 삼성전자주식회사 | Electronic device and method for processing voice command thereof

Also Published As

Publication number | Publication date
CN110853619A (en) | 2020-02-28

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
