WO2003085639A1 - Controlling an apparatus based on speech - Google Patents

Controlling an apparatus based on speech

Info

Publication number
WO2003085639A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
speech
instruction
items
speech items
Prior art date
2002-04-08
Application number
PCT/IB2003/001145
Other languages
French (fr)
Inventor
Boris E. R. De Ruyter
Steffen C. Pauws
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2002-04-08
Filing date
2003-03-20
Publication date
2003-10-16
Application filed by Koninklijke Philips Electronics N.V.
Priority to AU2003212587A
Publication of WO2003085639A1

Abstract

A method of controlling an apparatus (400) based on speech comprises the steps of: receiving a series of speech items (104-108), starting with a first speech item (104) of a first user (U1) of the apparatus (400); transforming received speech items (104-114) to voice commands (212-216) corresponding to respective recognized speech items which are classified as belonging to the first user (U1) on basis of a voice profile of the first user; creating an instruction (218) for the apparatus (400) by means of combining the voice commands (212-216); and providing the instruction (218) for execution by the apparatus (400).

Description

Controlling an apparatus based on speech
The invention relates to a method of controlling an apparatus based on speech.
The invention further relates to an apparatus being arranged to be controlled on basis of speech.
The invention further relates to a consumer electronics system comprising such an apparatus.
The invention further relates to a speech control unit for controlling an apparatus on basis of speech.
Voice control as an interaction modality for products, e.g. consumer products, is becoming more mature. However, people perceive it as strange, uncomfortable or even unacceptable to talk to a product like a television. To avoid that conversations or utterances not intended for controlling the products are recognized and executed, most voice-controlled systems require the user to activate the system, resulting in a time span, also called an attention span, during which the system is active. Such an activation may be performed via voice, for instance by the user speaking a keyword like "TV". By using an anthropomorphic character a barrier for interaction is removed: it is more natural to address the character instead of the product, e.g. by saying "Bello" to a dog-like character. Moreover, a product can make effective use of one object with several appearances, chosen as a result of several state elements. For instance, a basic appearance like a sleeping animal can be used to show that the system is not yet active. A second group of appearances can be used when the system is active, e.g. awake appearances of the animal. The progress of the attention span can then, for instance, be expressed by the angle of the ears: fully raised at the beginning of the attention span, fully down at the end. Similar appearances can also express whether or not an utterance was understood: an "understanding look" versus a "puzzled look". Audible feedback can also be combined with this, like a "glad" bark if a speech item has been recognized. A user can quickly grasp the feedback on all such system elements by looking at the one appearance which represents all these elements, e.g. raised ears and an "understanding look", or lowered ears and a "puzzled look".
Once a user has started an attention span the product is in a state of accepting further speech items. These speech items will be recognized and associated with voice commands. A number of voice commands together will be combined into one instruction for the product. E.g. a first speech item is associated with "Bello", resulting in a wake-up of the television; a second speech item is associated with the word "channel" and a third speech item is associated with the word "next". The result is that the television will switch, i.e. get tuned to the next broadcasting channel. However, if another user starts talking during the attention span of the television just initiated by the first user, then the communication between the first user and the television might suffer interference. The probability is high that the television will not be able to construct the appropriate instruction matching the intention of the first user.
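To make the flow just described concrete, the following minimal Python sketch maps a sequence of voice commands collected during one attention span ("Bello", "channel", "next") onto a single instruction. It is an illustration only, not the patent's implementation; the command table and instruction names are assumptions.

```python
# Illustrative only: combine voice commands of one attention span into one instruction.
COMMAND_TABLE = {
    ("channel", "next"): "Increase_Frequency_Band",
    ("more", "sound"): "Increase_Sound_Level",
}

def combine_voice_commands(voice_commands):
    """Combine voice commands uttered in one attention span into one instruction."""
    if not voice_commands or voice_commands[0].lower() != "bello":
        return None  # attention span was never opened by the keyword
    key = tuple(cmd.lower() for cmd in voice_commands[1:])
    return COMMAND_TABLE.get(key)

print(combine_voice_commands(["Bello", "channel", "next"]))  # Increase_Frequency_Band
```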
An embodiment of the apparatus of the kind described in the opening paragraph is known from US 6,230,137. That patent discloses that the control program of the apparatus is configured in such a way that successive voice signals, i.e. speech items, can only form a control command, i.e. instruction, when the successive voice signals are input within a given time period. This means, however, that the apparatus according to the cited prior art is not capable of dealing with multiple users who provide speech items to the same apparatus within the given time period.
It is an object of the invention to provide a method of the kind described in the opening paragraph with an improved construction of the instruction on the basis of speech from a user.
The object of the invention is achieved in that the method comprises:
- receiving a series of speech items, starting with a first speech item of a first user of the apparatus;
- transforming a selection of the received speech items into voice commands corresponding to respective recognized speech items which are classified as belonging to the first user on basis of a voice profile of the first user;
- creating an instruction for the apparatus by means of combining the voice commands; and
- providing the instruction for execution by the apparatus.
An important aspect of the invention is that received speech items are classified: does this speech item belong to the first user? Preferably, speech items which do not belong to the user who started the attention span of the apparatus are ignored. The result is that only speech items of the first user are used in the further operations to create the eventual instruction for the apparatus. These operations comprise matching with a vocabulary list, i.e. a dictionary, and matching with a language model, i.e. a grammatical test. Hence, it can be seen as if the attention span is assigned to the first user.
In an embodiment of the method according to the invention, transforming the selection of received speech items comprises:
- classifying the received speech items by means of comparing the received speech items with the voice profile of the first user; and
- recognizing speech items being classified as belonging to the first user, thereby associating voice commands to respective recognized speech items.
In this embodiment the recognition step is performed after the classification step, so that only those speech items have to be recognized which have been classified as belonging to the first user. This has a positive influence on the resource usage. The inverse order is also possible: first recognizing and then classifying. In general, either the recognition or the classification step can be made conditional on the result of the other step.
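As an illustration of the classify-then-recognize ordering described above, the Python outline below only runs recognition on speech items attributed to the first user. It is a sketch under assumptions: `matches_profile` and `recognize` stand in for a real speaker-verification and speech-recognition back end.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class SpeechItem:
    audio: bytes  # raw samples of one extracted speech item

def transform(items: List[SpeechItem],
              matches_profile: Callable[[SpeechItem], bool],
              recognize: Callable[[SpeechItem], Optional[str]]) -> List[str]:
    """Classify first, then recognize only the first user's speech items."""
    voice_commands = []
    for item in items:
        if not matches_profile(item):   # classification against the first user's voice profile
            continue                    # speech items of other users are ignored
        command = recognize(item)       # recognition is performed only for classified items
        if command is not None:
            voice_commands.append(command)
    return voice_commands
```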
An embodiment of the method according to the invention further comprises:
- classifying a further selection of the received speech items by means of comparing the received speech items with a further voice profile of a further user of the apparatus;
- recognizing speech items being classified as belonging to the further user and associating further voice commands to respective further recognized speech items;
- creating a further instruction for the apparatus by means of combining the further voice commands corresponding to the further recognized speech items; and
- optionally providing the further instruction for execution by the apparatus.
Instead of ignoring the speech items for which it has been concluded that they do not correspond to the first user, in this embodiment of the method according to the invention these speech items are used for further evaluation. Again the speech items are classified. The recognized speech items which have been classified as belonging to one and the same user, i.e. the further user, are combined into the further instruction for the apparatus. Whether the further instruction will be provided to the apparatus for execution is tested first:
- a test might be checking whether the instruction requested by the first user has already been executed. As long as this is not the case, the further instruction is halted.
- another test might be checking whether the instruction requested by the first user and the further instruction are not mutually conflicting. If they are mutually conflicting, the further instruction will not be provided for execution.
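The two tests above could be gated roughly as in the following Python sketch. It is illustrative only; the conflict relation and the execution status flag are assumptions, not part of the patent.

```python
from typing import Set, Tuple

def should_provide_further_instruction(instruction: str,
                                       further_instruction: str,
                                       instruction_executed: bool,
                                       conflicts: Set[Tuple[str, str]]) -> bool:
    """Decide whether the further user's instruction may be passed on for execution."""
    if not instruction_executed:
        return False  # halt until the first user's instruction has been executed
    if (instruction, further_instruction) in conflicts:
        return False  # mutually conflicting instructions are blocked
    return True

# Example: raising the sound level does not conflict with switching channel.
conflicts = {("Increase_Sound_Level", "Decrease_Sound_Level")}
print(should_provide_further_instruction(
    "Increase_Frequency_Band", "Increase_Sound_Level", True, conflicts))  # True
```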
In an embodiment of the method according to the invention the step of creating the instruction is performed after a predetermined time interval which is started upon receiving the first speech item. An attention span for the first user might be started by the first user providing the first speech item. After a predetermined time interval the attention span should be closed. The relevant speech items processed in the meantime are combined in order to create the instruction. Alternatively, the instruction is created on the fly, but providing of the instruction is postponed until the predetermined time interval has elapsed.
In an embodiment of the method according to the invention the step of creating the instruction is performed after a further predetermined time interval during which no speech items have been classified. This alternative approach for ending an attention span, or for triggering the creation of the instruction, is based on the fact that no new valid speech items are received. The advantage of this approach is its flexibility: the duration of the attention span is determined by the received input and not by a predetermined time interval.
In an embodiment of the method according to the invention the step of creating the instruction is performed after an explicit action of the first user. Another alternative approach for ending an attention span, or for triggering the creation of the instruction, is based on an explicit action of the user, e.g. uttering a stop-word like "good-bye". The advantage of this approach is its flexibility: the duration of the attention span is determined by the explicit action and not by a predetermined time interval.
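The three ways of closing an attention span (fixed interval, inactivity interval, explicit stop-word) could be combined along the lines of this Python sketch. The timer values and the stop-word are arbitrary placeholders, not values from the patent.

```python
import time
from typing import Optional

class AttentionSpan:
    """Tracks one user's attention span; the values below are illustrative only."""
    MAX_DURATION = 10.0   # predetermined interval started at the first speech item
    MAX_SILENCE = 3.0     # further interval during which no speech items were classified
    STOP_WORDS = {"good-bye"}

    def __init__(self) -> None:
        self.started = time.monotonic()
        self.last_classified = self.started

    def on_classified_item(self) -> None:
        self.last_classified = time.monotonic()

    def closed(self, last_command: Optional[str] = None) -> bool:
        now = time.monotonic()
        return (now - self.started > self.MAX_DURATION
                or now - self.last_classified > self.MAX_SILENCE
                or (last_command is not None and last_command in self.STOP_WORDS))
```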
It is a further object of the invention to provide an apparatus of the kind described in the opening paragraph with an improved construction of the instruction on the basis of speech from a user.
This object of the invention is achieved in that the apparatus comprises:
- a speech control unit for controlling the apparatus on basis of speech, comprising:
* receiving means for receiving a series of speech items, starting with a first speech item of a first user of the apparatus;
* transforming means for transforming a selection of the received speech items into voice commands corresponding to respective recognized speech items which are classified as belonging to the first user on basis of a voice profile of the first user;
* instruction creating means for creating an instruction for the apparatus by means of combining the voice commands; and
- processing means for execution of the instruction.
An embodiment of the apparatus according to the invention is arranged to show that the first speech item has been classified as belonging to the first user. Above, with the example of "Bello", it is described that there are several means to inform the user about system elements such as the progress of an attention span and the acceptance of speech items. Preferably the apparatus is also arranged to show which user, in the case of multiple users, is providing speech items to the apparatus.
An embodiment of the apparatus according to the invention which is arranged to show that the first speech item has been classified as belonging to the first user comprises audio generating means for generating an audio signal representing the first user. By generating an audio signal comprising a representation of the name of the first user, e.g. "Hello Jack", it is clear to the first user that the apparatus is ready to receive speech items from the first user. This concept is also known as auditory greeting.
An embodiment of the apparatus according to the invention which is arranged to show that the first speech item has been classified as belonging to the first user comprises a display device for displaying a visual representation of the first user. By displaying a personalized icon or an image of the first user it is clear to the first user that the apparatus is ready to receive speech items from the first user. In other words, the apparatus is in an active state of classifying and/or recognizing speech items.
An embodiment of the apparatus according to the invention which is arranged to show that the first speech item has been classified as belonging to the first user is arranged to show a set of controllable parameters of the apparatus on basis of a preference profile of the first user. Many apparatuses have numerous controllable parameters. However, not all of these controllable parameters are of interest to each user of the apparatus. Besides that, each of the users has his own preferred default values. Hence, a user has a so-called preference profile. It is advantageous to show the default values of the controllable parameters which are of interest to the first user, i.e. the user who initiated the attention span.
Modifications of the method and variations thereof may correspond to modifications and variations of the apparatus and of the speech control unit described.
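A preference profile as described above can be represented as a simple mapping from an identified user to the parameters that should be shown, with that user's default values. The Python sketch below is only an illustration of this idea; the user names, parameters and defaults are made up.

```python
# Hypothetical preference profiles: which controllable parameters to show,
# and with which default values, once the first user has been identified.
PREFERENCE_PROFILES = {
    "Jack": {"volume": 12, "channel": 1, "brightness": 70},
    "Jill": {"volume": 8, "subtitles": "on"},
}

def parameters_to_show(user: str) -> dict:
    """Return the controllable parameters of interest to the identified user."""
    return PREFERENCE_PROFILES.get(user, {})

print(parameters_to_show("Jack"))  # {'volume': 12, 'channel': 1, 'brightness': 70}
```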
These and other aspects of the method, of the apparatus and of the speech control unit according to the invention will become apparent from and will be elucidated with respect to the implementations and embodiments described hereinafter and with reference to the accompanying drawings, wherein: Fig. 1 schematically shows the step of classification of speech items according to the invention;
Fig. 2 schematically shows the behavior of a speech control unit according to the invention;
Fig. 3 schematically shows alternative behavior of a speech control unit according to the invention; and
Fig. 4 schematically shows an embodiment of the apparatus according to the invention. Corresponding reference numerals have the same or like meaning in all of the Figs.
Fig. 1 schematically shows the step of classification of speech items 104-114 according to the invention. In Fig. 1 it is shown that two users U1 and U2 are speaking. The first user U1 generates a speech signal 100 comprising speech items 104-108. In between these speech items 104-108 the first user U1 is not speaking. The second user U2 generates a speech signal 102 comprising speech items 110-114. In between these speech items 110-114 the second user U2 is not speaking. Because the first user U1 and the second user U2 are speaking during the same attention span, the two speech signals 100 and 102 are merged into a combined speech signal 103 comprising the speech items 104-114 from both users U1 and U2. The combined speech signal 103 is received by the speech control unit 200. The speech items 104-114 are extracted from the combined speech signal 103. One approach for this extraction is dividing the combined speech signal 103 into sub-signals on the basis of detected portions with little signal level, i.e. silence. The first received speech item 104 is classified as belonging to the first user U1. This is done by comparing the received speech items with a voice profile of the first user U1 which is available in the speech control unit. For the subsequent speech items 106-114 classification tests are also performed. The speech items 106 and 108 will also be classified as belonging to the first user U1. For the further speech items 110-114 it will be concluded that these do not belong to the first user U1. Optionally, for these further speech items 110-114 it is tested whether they belong to the second user U2.
Fig. 2 schematically shows the behavior of a speech control unit 200 according to the invention. The speech control unit 200 comprises:
- receiving means 202 for receiving a series of speech items 104-114, starting with a first speech item 104 of a first user U1 of the apparatus 400;
- classification means 204 for classifying received speech items 104-114 by means of comparing the received speech items with a voice profile of the first user U1;
- recognizing means 206 for recognizing speech items 104-108 being classified as belonging to the first user U1 and associating voice commands 212-216 to respective recognized speech items 104-108;
- instruction creating means 208 for creating an instruction 218 for the apparatus 400 by means of combining the voice commands 212-216 corresponding to the recognized speech items 104-108; and
- providing means 209 for providing the instruction 218 to the control processor 210 of the apparatus 400. The control processor 210 is arranged to perform the instruction 218.
The receiving means 202 comprises a microphone and an A/D converter. The other components 204-209 of the speech control unit 200 and the control processor 210 may be implemented using one processor. Normally, both functions are performed under control of a software program product. During execution, the software program product is normally loaded into a memory, like a RAM, and executed from there. The program may be loaded from a background memory, like a ROM, hard disk, or magnetic and/or optical storage, or may be loaded via a network like the Internet. Optionally, an application-specific integrated circuit provides the disclosed functionality.
The behavior of the speech control unit 200 is as follows. From the combined speech signal 103 the speech items 104-114 are extracted. Those speech items 104-108 which correspond to the first user U1 are classified as such. The classified speech items 104-108 are also recognized and voice commands 212-216 are assigned to these speech items 104-108. The voice commands 212-216 are "Bello", "Channel" and "Next", respectively. An instruction "Increase_Frequency_Band", which is interpretable by the control processor 210, is created based on these voice commands 212-216.
Fig. 3 schematically shows alternative behavior of a speech control unit 200 according to the invention. From the combined speech signal 103 the speech items 104-114 are extracted. Those speech items 104-108 which correspond to the first user U1 are classified as such and the speech items 110-114 which correspond to the second user U2 are classified as such. The speech items 104-114 are also recognized. Voice commands 212-216 are assigned to the speech items 104-108 being classified as belonging to the first user U1 and voice commands 312-316 are assigned to the speech items 110-114 being classified as belonging to the second user U2. Alternatively, the speech items 104-114 are first recognized and then classified. The voice commands 212-216 are "Bello", "Channel" and "Next", respectively. An instruction "Increase_Frequency_Band", which is interpretable by the control processor 210, is created based on these voice commands 212-216. The voice commands 312-316 are "Bello", "More" and "Sound", respectively. A further instruction "Increase_Sound_Level", which is interpretable by the control processor 210, is created based on these voice commands 312-316. The instruction "Increase_Frequency_Band" is directly provided to the control processor 210 by means of the providing means 209. The further instruction "Increase_Sound_Level" is halted for a limited time interval. Once the control processor 210 is ready to perform the further instruction, it is provided. Optionally, the instruction "Increase_Frequency_Band" and the further instruction "Increase_Sound_Level" are compared in order to check whether the further instruction 318 should be blocked. In this case the instructions 218 and 318 are not mutually conflicting and hence the further instruction "Increase_Sound_Level" is not blocked. The result is that two instructions are performed which relate to speech items 104-114 of two different users U1 and U2 who have spoken in the attention span initiated by the first user U1.
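Putting the behavior of Figs. 1-3 together, the Python outline below sketches the whole chain: silence-based segmentation of the combined signal, per-user classification, recognition, and creation of one instruction per user. It is an illustration under stated assumptions (a placeholder silence threshold, stand-in classify/recognize/combine functions), not the unit's actual implementation.

```python
from typing import Callable, Dict, List, Optional, Sequence

SILENCE_LEVEL = 0.05  # assumed threshold for "little signal level"

def segment(signal: Sequence[float]) -> List[List[float]]:
    """Split the combined speech signal into speech items at silent portions."""
    items, current = [], []
    for sample in signal:
        if abs(sample) > SILENCE_LEVEL:
            current.append(sample)
        elif current:
            items.append(current)
            current = []
    if current:
        items.append(current)
    return items

def build_instructions(items: List[List[float]],
                       classify: Callable[[List[float]], Optional[str]],
                       recognize: Callable[[List[float]], str],
                       combine: Callable[[List[str]], str]) -> Dict[str, str]:
    """Group recognized speech items per user and combine them into instructions."""
    per_user: Dict[str, List[str]] = {}
    for item in items:
        user = classify(item)      # voice-profile comparison; None if no user matches
        if user is None:
            continue
        per_user.setdefault(user, []).append(recognize(item))
    return {user: combine(commands) for user, commands in per_user.items()}
```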
Fig. 4 schematically shows an embodiment of the apparatus 400 according to the invention. The apparatus 400 optionally comprises audio generating means 404 for generating an audio signal representing the first user U1. By generating an audio signal comprising a representation of the name of the first user U1, e.g. "Hello Jack", it is clear to the first user U1 that the apparatus is ready to receive speech items 104-108 from the first user U1. In other words, the apparatus is in an active state of classifying and/or recognizing speech items. The generating means 404 comprises a memory device for storage of a sampled audio signal, a sound generator and a loudspeaker. The apparatus also comprises a display device 402 for displaying a visual representation of the first user U1. By displaying a personalized icon or an image of the first user it is clear to the first user that the apparatus is ready to receive speech items 104-108 from the first user U1.
The speech control unit 200 according to the invention is preferably used in a multi-function consumer electronics system, like a TV, set top box, VCR or DVD player, game box, or similar device. But it may also be a consumer electronic product for domestic use such as a washing or kitchen machine, any kind of office equipment like a copying machine, a printer, various forms of computer workstations etc., electronic products for use in the medical sector or any other kind of professional use, as well as a more complex electronic information system. Whereas the term "multifunction electronic system" as used in the context of the invention may comprise a multiplicity of electronic products for domestic or professional use as well as more complex information systems, the number of individual functions to be controlled by the method would normally be limited to a reasonable level, typically in the range from 2 to 100 different functions. For a typical consumer electronic product like a TV or audio system, where only a more limited number of functions need to be controlled, e.g. 5 to 20 functions, examples of such functions may include volume control including muting, tone control, channel selection and switching from an inactive or stand-by condition to an active condition and vice versa, which could be initiated by control commands such as "louder", "softer", "mute", "bass", "treble", "change channel", "on", "off", "stand-by", etcetera.
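For such a limited function set, the voice-command vocabulary can be as small as a lookup table. The Python snippet below merely illustrates this for a handful of the example commands named above; the function names on the right-hand side are invented for the example.

```python
from typing import Optional

# Hypothetical mapping from spoken control commands to apparatus functions
# for a TV-like product with a small function set (e.g. 5 to 20 functions).
CONTROL_COMMANDS = {
    "louder": "volume_up",
    "softer": "volume_down",
    "mute": "volume_mute",
    "change channel": "channel_next",
    "on": "power_on",
    "off": "power_off",
    "stand-by": "standby",
}

def to_function(spoken: str) -> Optional[str]:
    """Map a recognized control command onto an apparatus function, if known."""
    return CONTROL_COMMANDS.get(spoken.lower())
```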
In the description it is assumed that the speech control unit 200 is located in the apparatus 400 being controlled. It will be appreciated that this is not required and that the control method according to the invention is also possible where several devices or apparatus are connected via a network (local or wide area) and the speech control unit 200 is located in a different device than the device or apparatus being controlled.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word 'comprising' does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means can be embodied by one and the same item of hardware.

Claims

CLAIMS:
1. A method of controlling an apparatus (400) based on speech, comprising:
- receiving a series of speech items (104-108), starting with a first speech item (104) of a first user (Ul) of the apparatus (400);
- transforming a selection of the received speech items (104-114) into voice commands (212-216) corresponding to respective recognized speech items which are classified as belonging to the first user (Ul) on basis of a voice profile of the first user;
- creating an instruction (218) for the apparatus (400) by means of combining the voice commands (212-216); and
- providing the instruction (218) for execution by the apparatus (400).
2. A method as claimed in Claim 1, characterized in that transforming the selection of received speech items comprises:
- classifying the received speech items (104-114) by means of comparing the received speech items (104-114) with the voice profile of the first user (Ul); and
- recognizing speech items (104-108) being classified as belonging to the first user (Ul) thereby associating voice commands (212-216) to respective recognized speech items (104-108).
3. A method as claimed in Claim 1, characterized in that the method further comprises:
- classifying a further selection of the received speech items (104-114) by means of comparing the received speech items with a further voice profile of a further user (U2) of the apparatus (400);
- recognizing speech items (110-114) being classified as belonging to the further user (U2) whereby associating further voice commands (312-316) to respective further recognized speech items (110-114);
- creating a further instruction (318) for the apparatus (400) by means of combining the further voice commands (312-316) corresponding to the further recognized speech items (110-114); and
- optionally providing the further instruction (318) for execution by the apparatus (400).
4. A method as claimed in Claim 3, characterized in that the step of providing the further instruction (318) is performed if the instruction (218) has been executed.
5. A method as claimed in Claim 3, characterized in that the step of providing the further instruction (318) is performed if the instruction (218) and the further instruction (318) are not mutually conflicting.
6. A method as claimed in Claim 1, characterized in that creating the instruction (218) is performed after a predetermined time interval which has been started at receiving the first speech item (104).
7. A method as claimed in Claim 1, characterized in that creating the instruction
(218) is performed after a further predetermined time interval during which no speech items have been classified.
8. A method as claimed in Claim 1, characterized in that creating the instruction (218) is performed after an explicit action of the first user.
9. An apparatus (400) comprising:
- a speech control unit (200) for controlling the apparatus (400) on basis of speech, comprising:
* receiving means (202) for receiving a series of speech items (104-114), starting with a first speech item (104) of a first user (Ul) of the apparatus (400);
* transforming means for transforming a selection of the received speech items (104-114) into voice commands (212-216) corresponding to respective recognized speech items which are classified as belonging to the first user (Ul) on basis of a voice profile of the first user;
* instruction creating means (208) for creating an instruction (218) for the apparatus (400) by means of combining the voice commands (212-216); and
- processing means (210) for execution of the instruction.
10. An apparatus (400) as claimed in Claim 9, characterized in being arranged to show that the first speech item (104) has been classified as belonging to the first user (Ul).
11. An apparatus (400) as claimed in Claim 10, characterized in comprising audio generating means (404) for generating an audio signal representing the first user (Ul).
12. An apparatus (400) as claimed in Claim 10, characterized in comprising a display device (402) for displaying a visual representation of the first user (Ul).
13. An apparatus (400) as claimed in Claim 10, characterized in being arranged to show a set of controllable parameters of the apparatus (400) on basis of a preference profile of the first user (Ul).
14. A consumer electronics system comprising the apparatus (400) as claimed in Claim 9.
15. A speech control unit (200) for controlling an apparatus (400) on basis of speech, comprising:
- receiving means (202) for receiving a series of speech items (104-114), starting with a first speech item (104) of a first user (Ul) of the apparatus (400);
- transforming means for transforming a selection of the received speech items (104-114) into voice commands (212-216) corresponding to respective recognized speech items which are classified as belonging to the first user (Ul) on basis of a voice profile of the first user;
- instruction creating means (208) for creating an instruction (218) for the apparatus (400) by means of combining the voice commands (212-216); and
- providing means (209) for providing the instruction to the apparatus (400).
PCT/IB2003/001145 | 2002-04-08 | 2003-03-20 | Controlling an apparatus based on speech | WO2003085639A1 (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
AU2003212587A (AU2003212587A1, en) | 2002-04-08 | 2003-03-20 | Controlling an apparatus based on speech

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
EP02076399.1 | 2002-04-08
EP02076399 | 2002-04-08

Publications (1)

Publication Number | Publication Date
WO2003085639A1 (en) | 2003-10-16

Family

ID=28685937

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
PCT/IB2003/001145 (WO2003085639A1, en) | Controlling an apparatus based on speech | 2002-04-08 | 2003-03-20

Country Status (2)

Country | Link
AU (1) | AU2003212587A1 (en)
WO (1) | WO2003085639A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5606643A (en) * | 1994-04-12 | 1997-02-25 | Xerox Corporation | Real-time audio recording system for automatic speaker indexing
EP0911808A1 (en) * | 1997-10-23 | 1999-04-28 | Sony International (Europe) GmbH | Speech interface in a home network environment
WO2000041065A1 (en) * | 1999-01-06 | 2000-07-13 | Koninklijke Philips Electronics N.V. | Speech input device with attention span
US20020007278A1 (en) * | 2000-07-11 | 2002-01-17 | Michael Traynor | Speech activated network appliance system
EP1189206A2 (en) * | 2000-09-19 | 2002-03-20 | Thomson Licensing S.A. | Voice control of electronic devices

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"BIT PATTERN VOICE RECOGNITION CONTROLLER", IBM TECHNICAL DISCLOSURE BULLETIN, IBM CORP. NEW YORK, US, vol. 39, no. 4, 1 April 1996 (1996-04-01), pages 1 - 3, XP000587400, ISSN: 0018-8689*
LEUNG H C ET AL: "Interactive speech and language systems for telecommunications applications at NYNEX", INTERACTIVE VOICE TECHNOLOGY FOR TELECOMMUNICATIONS APPLICATIONS, 1994., SECOND IEEE WORKSHOP ON KYOTO, JAPAN 26-27 SEPT. 1994, NEW YORK, NY, USA,IEEE, 26 September 1994 (1994-09-26), pages 49 - 54, XP010124371, ISBN: 0-7803-2074-3*

Also Published As

Publication number | Publication date
AU2003212587A1 (en) | 2003-10-20

Similar Documents

Publication | Publication Date | Title
JP4837917B2 (en) Device control based on voice
CN112201246B (en)Intelligent control method and device based on voice, electronic equipment and storage medium
WO2017012511A1 (en)Voice control method and device, and projector apparatus
CN102568478A (en)Video play control method and system based on voice recognition
EP1346342A1 (en)Speechdriven setting of a language of interaction
CN107609034A (en)A kind of audio frequency playing method of intelligent sound box, audio playing apparatus and storage medium
CN106601242A (en)Executing method and device of operation event and terminal
US10424292B1 (en)System for recognizing and responding to environmental noises
CN108492826B (en)Audio processing method and device, intelligent equipment and medium
WO2003107327A1 (en)Controlling an apparatus based on speech
CA2345434C (en)System and method for concurrent presentation of multiple audio information sources
EP1316944B1 (en)Sound signal recognition system and method, and dialog control system and method using it
CN110197663A (en)A kind of control method, device and electronic equipment
CN107948854B (en)Operation audio generation method and device, terminal and computer readable medium
CN110660393B (en)Voice interaction method, device, equipment and storage medium
CN109658924B (en)Session message processing method and device and intelligent equipment
WO2003085639A1 (en)Controlling an apparatus based on speech
CN111045641B (en)Electronic terminal and voice recognition method
JP2001042887A (en)Method for training automatic speech recognizing device
JP2001134291A (en)Method and device for speech recognition
CN112786031A (en)Man-machine conversation method and system
KR102124396B1 (en)Display apparatus, Method for controlling display apparatus and Method for controlling display apparatus in Voice recognition system thereof
KR102089593B1 (en)Display apparatus, Method for controlling display apparatus and Method for controlling display apparatus in Voice recognition system thereof
CN111028832B (en)Microphone mute mode control method and device, storage medium and electronic equipment
Santos: A Review of Voice User Interfaces for Interactive TV

Legal Events

Date | Code | Title | Description
AK | Designated states

Kind code of ref document: A1

Designated state(s):AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL | Designated countries for regional patents

Kind code of ref document: A1

Designated state(s):GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 | EP: the EPO has been informed by WIPO that EP was designated in this application
122 | EP: PCT application non-entry in European phase
NENP | Non-entry into the national phase

Ref country code: JP

WWW | WIPO information: withdrawn in national office

Country of ref document: JP

