CN101516005A - Speech recognition channel selecting system, method and channel switching device - Google Patents

Speech recognition channel selecting system, method and channel switching device

Publication number
CN101516005A
Authority
CN
China
Prior art keywords
voice
channel
recognition
input
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008100654170A
Other languages
Chinese (zh)
Inventor
吴治国
张勤伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CNA2008100654170A
Priority to PCT/CN2009/070380 (WO2009103226A1)
Publication of CN101516005A
Legal status: Pending

Abstract

Translated from Chinese

本发明提供一种语音识别频道选择系统、方法及频道转换装置,该方法包括:控制器接收用户的语音输入信号;频道转换装置根据输入的语音信号及识别词表识别出待匹配名称;根据待匹配名称与匹配表进行匹配得出需要切换的频道;切换到需要切换的频道。本发明避免了在控制器上进行语音识别操作复杂和成本高的问题,使得用户在操作起来十分方便,并且充分利用频道转换装置的性能,节省了控制的成本。通过频道转换装置识别出待匹配名称,不需要在网络中设置专门的语音识别服务器,防止响应时间过长,避免了由于网络传输数据丢失的问题,并且节约了构建网络的成本。

Figure 200810065417

The present invention provides a speech recognition channel selection system and method and a channel switching device. The method comprises: a controller receives a user's voice input signal; the channel switching device recognizes a name to be matched according to the input voice signal and a recognition vocabulary; the name to be matched is matched against a matching table to obtain the channel to switch to; and the device switches to that channel. The invention avoids the complexity and high cost of performing speech recognition on the controller, makes operation convenient for the user, and makes full use of the capabilities of the channel switching device, which saves control costs. Because the channel switching device itself recognizes the name to be matched, no dedicated speech recognition server needs to be set up in the network; this prevents long response times, avoids data loss during network transmission, and saves the cost of building the network.


Description

Speech recognition channel selection system, method, and channel switching device
Technical field
The present invention relates to the field of communication technology, and in particular to a system, device, and method for selecting channels by speech recognition.
Background art
With the development of information technology and broadcast television technology, services such as cable digital TV and IPTV have grown rapidly in recent years. As set-top boxes (STBs), such as IP set-top boxes and digital set-top boxes, become increasingly market-driven, fully featured set-top boxes have gradually replaced traditional VCD and DVD players. Meanwhile, advances in automatic speech recognition have made it possible to select set-top box channels by voice, and this technology has become a focus of industry research and development.
Conventional speech recognition channel selection takes one of two forms. In the first, a speech recognition processor is added to the remote control; during recognition, speech data determined from a voice template that the user imports and downloads is matched against the speech data the user inputs in order to switch channels. In the second, a dedicated speech recognition server is set up in the network.
In the course of realizing the present invention, the inventors found that the conventional approaches have at least the following shortcomings. When a speech recognition processor is added to the remote control, every voice template update requires the user to download it to the remote control manually, which is complicated and inconvenient, and the added processor also raises the cost of the remote control. When a dedicated speech recognition server is set up in the network, the voice signal must be uploaded to the network for recognition, so the response time is longer and the likelihood of packet loss rises because the data crosses the network twice, uplink and downlink; the dedicated server also increases the cost of building the network.
Summary of the invention
In view of this, it is necessary to provide a speech recognition channel selection method that is easy to operate and cost-effective.
A speech recognition channel selection system that is easy to operate and cost-effective is also provided.
A channel switching device that is easy to operate and cost-effective is also provided.
A speech recognition channel selection method comprises the following steps:
A controller receives a user's voice input signal;
A channel switching device recognizes a name to be matched according to the input voice signal and a recognition vocabulary;
The name to be matched is matched against a matching table to obtain the channel to switch to;
Switch to the channel obtained by the matching.
A speech recognition channel selection system comprises a controller and a channel switching processing device that communicates with the controller;
The controller is configured to receive a user's voice input signal;
The channel switching processing device is configured to recognize a name to be matched according to the voice input signal and a recognition vocabulary, match the name to be matched against a matching table to obtain the channel to switch to, and switch to that channel.
A channel switching device comprises:
A receiving module, configured to receive the user's voice input signal sent by a controller;
A recognition processing module, configured to recognize a name to be matched according to the voice input signal and a recognition vocabulary;
A match query module, configured to match the name to be matched against a matching table to obtain the channel to switch to;
A channel switching control module, configured to switch to the channel obtained by the matching.
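The four modules listed above can be sketched as a single object, which may help in seeing how they hand data to one another. This is only an illustrative sketch: the class name, the method names, and the idea of passing the recognizer in as a callable are assumptions, not part of the patent text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ChannelSwitchingDevice:
    """Hypothetical composition of the four modules named in the text."""
    # Recognition processing module: maps a voice signal to a name to be matched.
    recognize: Callable[[bytes], Optional[str]]
    # Matching table: recognized name -> channels that name can refer to.
    matching_table: Dict[str, List[str]]
    current_channel: Optional[str] = None
    received: List[bytes] = field(default_factory=list)

    def receive(self, voice_signal: bytes) -> None:
        """Receiving module: accept a voice input signal from the controller."""
        self.received.append(voice_signal)

    def query(self, name: str) -> List[str]:
        """Match query module: look the name up in the matching table."""
        return self.matching_table.get(name, [])

    def switch(self, channel: str) -> None:
        """Channel switching control module: tune to the given channel."""
        self.current_channel = channel

    def handle(self, voice_signal: bytes) -> Optional[str]:
        """Run receive -> recognize -> match -> switch for one utterance."""
        self.receive(voice_signal)
        name = self.recognize(voice_signal)
        if name is None:
            return None  # nothing recognized, channel unchanged
        channels = self.query(name)
        if len(channels) == 1:
            self.switch(channels[0])
        return self.current_channel
```

The recognizer is injected so that the sketch stays independent of any particular speech recognition technique; the channel names used to exercise it would be made-up values.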
Compared with the prior art, in the embodiments of the invention the controller receives the user's voice input signal, and the channel switching device recognizes the name to be matched from that signal, matches it against the matching table to obtain the channel to switch to, and switches to that channel. This avoids the complexity and high cost of performing speech recognition on the controller, makes operation convenient for the user, and makes full use of the capabilities of the channel switching device, which saves control costs. Because the channel switching device recognizes the name to be matched, no dedicated speech recognition server needs to be set up in the network; this prevents long response times, avoids data loss during network transmission, and saves the cost of building the network.
Description of drawings
Fig. 1 is a schematic structural diagram of the speech recognition channel switching system according to an embodiment of the invention.
Fig. 2 is a schematic structural diagram of the controller according to an embodiment of the invention.
Fig. 3 is a schematic structural diagram of the channel switching processing device according to an embodiment of the invention.
Fig. 4 is a flow chart of the speech recognition channel selection method according to an embodiment of the invention.
Fig. 5 is a flow chart of the channel and program list update method according to an embodiment of the invention.
Fig. 6 is a flow chart of the recognition vocabulary and matching table update method according to an embodiment of the invention.
Embodiment
Referring to Fig. 1, the speech recognition channel switching system 100 of an embodiment of the invention comprises a controller 102, a channel switching device 104, and an electronic program guide (EPG) server 106. The controller 102 receives the user's voice input signal. The channel switching device 104 recognizes the name to be matched according to the input voice signal and the recognition vocabulary, matches the name to be matched against the matching table to obtain the channel to switch to, and switches to that channel. The EPG server 106 provides the latest matching table and/or the latest recognition vocabulary; the channel switching device 104 can update its matching table from the latest matching table and/or update its recognition vocabulary from the latest recognition vocabulary. The controller 102 may be an external controller of the system, a handset (HS), or a remote control; this embodiment uses a remote control as the example. The channel switching device 104 may be a personal computer (PC), a set-top box (STB), a notebook computer (NB), a handset (HS), a game player (GP), an optical disc drive (ODD), or the like; this embodiment is described using an STB as the example.
Referring also to Fig. 2, in this embodiment the controller 102 comprises a voice signal receiving module 202, a voice signal processing module 204, an input module 210, a controller receiving module 212, and a sending module 216.
The voice signal receiving module 202 receives the user's voice input signal. In this embodiment, it may be a microphone on the remote control.
The voice signal processing module 204 processes the user's voice input signal. It further comprises a speech conversion unit 206 and a speech coding unit 208. The speech conversion unit 206 converts the voice signal into a digital signal; in this embodiment it may be an A/D conversion circuit. The speech coding unit 208 encodes the digital signal produced by the speech conversion unit 206; the encoding may be compression coding, either lossy or lossless. The user's voice can be captured and processed in different ways. In this embodiment, it is sampled at 16 kHz and quantized with 16-bit or 8-bit precision, and the sampled and processed voice signal is encoded in PCM (Pulse Code Modulation) format.
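The sampling and quantization figures above can be made concrete with a short sketch. It assumes mono audio already available as floating-point samples in the range [-1.0, 1.0] and signed integer quantization; the function names are illustrative only and are not taken from the patent.

```python
import struct

SAMPLE_RATE_HZ = 16_000  # sampling rate given in the text


def quantize(samples, bits=16):
    """Quantize floats in [-1.0, 1.0] to signed integers of the given width."""
    full_scale = 2 ** (bits - 1) - 1  # 32767 for 16-bit, 127 for 8-bit
    quantized = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # guard against overshoot
        quantized.append(int(round(clipped * full_scale)))
    return quantized


def pack_pcm16(samples):
    """Pack 16-bit quantized samples as little-endian PCM bytes."""
    return struct.pack("<%dh" % len(samples), *samples)


def pcm_bitrate(sample_rate_hz, bits):
    """Raw mono PCM bit rate in bits per second."""
    return sample_rate_hz * bits
```

At 16 kHz and 16 bits, uncompressed mono PCM needs 16,000 × 16 = 256,000 bit/s, and 8-bit quantization halves that to 128,000 bit/s. This is the data rate the wireless link between controller and device has to sustain if no compression coding is applied.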
The input module 210 receives instructions entered by the user, such as a voice activation instruction that controls the channel switching device to activate voice input. In this embodiment, the input module 210 may be a keyboard or a touch screen.
The controller receiving module 212 receives signals sent by the channel switching device 104, including returned command signals and notification messages.
The sending module 216 sends the encoded voice signal and the operation signals entered by the user. In this embodiment, the sending module 216 may be a wireless communication device such as infrared or Bluetooth, using for example Bluetooth 2.0, ZigBee, or a high-speed infrared protocol, that is, a high-speed wireless technology able to transmit PCM (Pulse Code Modulation) speech data in real time. The sending module 216 further comprises an operation signal sending unit 218 and a voice signal sending unit 214. The operation signal sending unit 218 sends the operation signals entered by the user, for example keyboard and touch-screen input. The voice signal sending unit 214 sends the voice signal entered by the user, which may be the A/D-converted digital signal or the compression-coded signal.
Referring also to Fig. 3, in this embodiment the channel switching device 104 (STB) comprises a receiving module 302, a mute control module 308, a speech selection module 310, a recognition processing module 312, a sending module 322, a rejection prompt module 324, a storage module 326, a match query module 336, a channel switching control module 338, and an update module 340.
The receiving module 302 receives the user's voice input signal and the user's operation control commands sent by the controller. In this embodiment, the user input signal comprises the voice input signal and the operation control commands; if all input is by voice, the control command signal may be omitted. The voice input signal is a digital speech signal that has undergone analog-to-digital (A/D) conversion. The receiving module 302 further comprises an operation signal receiving unit 304 and a voice signal receiving unit 306. The operation signal receiving unit 304 receives the user's operation control commands, for example the voice activation command. The voice signal receiving unit 306 receives the user's voice input signal.
The mute control module 308 sets the channel switching device to the mute state according to the voice activation instruction entered by the user, and switches it back to the non-mute state after voice capture.
The speech selection module 310 selects the acoustic model corresponding to the speech selection signal entered by the user.
The recognition processing module 312 recognizes the name to be matched according to the input voice signal and the recognition vocabulary. It comprises a voice activity detection unit 314, a speech feature extraction unit 316, a speech recognition unit 318, and a speech judgment unit 320.
The voice activity detection unit 314 detects the start point and end point of the actual speech segment. In this embodiment, it uses a robust endpoint detection algorithm to find these points so that the actual speech segment can be separated from the non-speech segments of the input voice signal.
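The text does not say which endpoint detection algorithm is used. As a stand-in, the sketch below uses a simple short-time energy threshold, which is one common baseline rather than the patent's method. The frame length, threshold, and minimum run length are illustrative assumptions.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_endpoints(samples, frame_len=160, threshold=1e-3, min_frames=2):
    """Return (start_sample, end_sample) of the speech segment, or None.

    A frame counts as speech when its energy exceeds `threshold`. The start
    is the first run of `min_frames` speech frames and the end is the last
    such run, so isolated noisy frames are ignored.
    """
    active = [e > threshold for e in frame_energies(samples, frame_len)]
    start = None
    for i in range(len(active) - min_frames + 1):
        if all(active[i:i + min_frames]):
            start = i
            break
    if start is None:
        return None  # no speech found
    end = start + min_frames
    for i in range(len(active) - min_frames, start - 1, -1):
        if all(active[i:i + min_frames]):
            end = i + min_frames  # exclusive frame index
            break
    return start * frame_len, end * frame_len
```

With 16 kHz audio, a frame length of 160 samples corresponds to 10 ms. A production detector would typically adapt the threshold to the background noise level, which this sketch does not attempt.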
The speech feature extraction unit 316 extracts speech features from the voice signal. In this embodiment, it processes the voice signal passed on by the voice activity detection unit 314 and extracts speech feature data. The feature type may be MFCC (Mel-Frequency Cepstral Coefficients), PLP (Perceptual Linear Prediction), or LPCC (Linear Prediction Cepstral Coefficients) features. To improve noise robustness, cepstral mean subtraction can be applied during feature extraction. Because MFCC features exploit the perceptual characteristics of the human ear and are comparatively robust to noise, MFCC is the preferred feature. A voice signal is short-time stationary and adjacent speech frames are correlated, so appending first-order differences, or first- and second-order differences, of the MFCC features can improve recognition accuracy.
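Cepstral mean subtraction and the difference features can be illustrated on frames of already-computed cepstral coefficients. The sketch assumes the static coefficients are given as lists of floats and uses a two-frame central difference with replicated edges, which is one simple formulation among several; the patent does not specify the exact formula.

```python
def cepstral_mean_subtract(frames):
    """Subtract the per-coefficient mean over the utterance from every frame."""
    if not frames:
        return []
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def deltas(frames):
    """First-order difference from neighbouring frames, edges replicated."""
    n = len(frames)
    out = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return out


def add_dynamic_features(frames):
    """Append first- and second-order differences to each static frame."""
    first = deltas(frames)
    second = deltas(first)
    return [s + d1 + d2 for s, d1, d2 in zip(frames, first, second)]
```

Subtracting the mean removes any constant offset in the cepstral domain, which is how a fixed channel or microphone response shows up. The difference features describe how the spectrum changes from frame to frame, which is the inter-frame correlation the paragraph refers to.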
The speech recognition unit 318 calculates the acoustic distance of the input speech feature data relative to each entry according to the acoustic model and the recognition vocabulary. In this embodiment, it obtains the shortest cumulative acoustic distance for each isolated word from the acoustic model data and the isolated-word vocabulary data, and takes the isolated word whose shortest cumulative distance is smallest as the first-choice recognition result for the utterance. The acoustic models used for recognition include continuous HMM (Hidden Markov Model) and discrete HMM models. The speech recognition unit 318 can also offer several candidate results for the user to choose from, ranked by shortest cumulative acoustic distance.
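One way to read "shortest cumulative acoustic distance" for a discrete HMM is the negative log-probability of the best state path found by the Viterbi algorithm. The sketch below takes that reading, which is an interpretation rather than something the text states. It assumes the observations are already vector-quantized symbol indices, the sequence is non-empty, and each word has its own small model; all names are hypothetical.

```python
import math


def _neg_log(p):
    """Negative log-probability; impossible events get infinite cost."""
    return -math.log(p) if p > 0 else math.inf


def viterbi_distance(observations, start_p, trans_p, emit_p):
    """Cost of the best state path through a discrete HMM (lower is closer).

    observations: non-empty list of symbol indices
    start_p[s], trans_p[prev][s], emit_p[s][symbol]: model probabilities
    """
    n_states = len(start_p)
    cost = [
        _neg_log(start_p[s]) + _neg_log(emit_p[s][observations[0]])
        for s in range(n_states)
    ]
    for symbol in observations[1:]:
        cost = [
            min(cost[p] + _neg_log(trans_p[p][s]) for p in range(n_states))
            + _neg_log(emit_p[s][symbol])
            for s in range(n_states)
        ]
    return min(cost)


def rank_words(observations, word_models, n_best=3):
    """Return up to n_best (word, distance) pairs, closest first."""
    scored = [
        (viterbi_distance(observations, *model), word)
        for word, model in word_models.items()
    ]
    scored.sort()
    return [(word, distance) for distance, word in scored[:n_best]]
```

The first element of the ranked list plays the role of the first-choice result, and the remaining elements are the candidates that could be offered to the user.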
The speech judgment unit 320 judges whether the acoustic distance of the speech feature data relative to an entry is less than a threshold. If it is, the unit determines the channel name corresponding to the current utterance according to the recognition vocabulary and the matching table.
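The threshold test can be sketched as a small decision function. It assumes the recognizer supplies (word, distance) candidates; the prompt text and the function name are hypothetical.

```python
REJECT_PROMPT = "Speech not recognized, please try again."  # hypothetical wording


def judge(candidates, threshold):
    """Accept the closest candidate only if its distance is below threshold.

    candidates: list of (word, distance) pairs.
    Returns (word, None) on acceptance or (None, prompt) on rejection.
    """
    if not candidates:
        return None, REJECT_PROMPT
    word, distance = min(candidates, key=lambda c: c[1])
    if distance < threshold:
        return word, None
    return None, REJECT_PROMPT  # distance >= threshold is treated as non-speech
```

Note that a distance exactly equal to the threshold is rejected, matching the "less than" wording of the text.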
The sending module 322 sends recognition processing signals to the controller 102; once recognition processing is complete, the controller 102 can stop capturing the user's voice input signal. In this embodiment, the sending module 322 may also transmit signals wirelessly, for example by Bluetooth or infrared.
The rejection prompt module 324 prompts the user to re-enter speech when the recognition result is non-speech. The prompt may be a message, an on-screen display, or a sound; this embodiment prompts the user by displaying text on the screen.
The storage module 326 stores data such as the channel and program list, the recognition vocabulary, the acoustic model, and the matching table. In this embodiment, it comprises a channel and program list storage unit 328, a recognition vocabulary storage unit 330, an acoustic model storage unit 332, and a matching table storage unit 334.
The channel and program list storage unit 328 stores the channel and program correspondence table. In this embodiment, each entry of the table holds the name of a live TV channel and the name of the program currently airing on that channel. The table can be updated from the EPG server 106; the update period may be set to one day or one week, and the specific interval can follow the EPG server update interval of the IPTV or cable digital TV system.
The recognition vocabulary storage unit 330 stores the recognition vocabulary. In this embodiment, the recognition vocabulary also includes an isolated-word vocabulary used for isolated-word speech recognition.
The acoustic model storage unit 332 stores the acoustic models to be matched. In this embodiment, it holds the parameters of an HMM-based acoustic model built by bilingual hybrid modeling. The parameters of the bilingual hybrid acoustic model are speaker-independent. The model parameters must be trained by a trainer on pre-annotated corpus data; only then can the trained parameters be fixed in the acoustic model parameter storage for isolated-word recognition. The acoustic model parameters comprise the state parameters of the hidden Markov model and the probability distribution functions of the observation feature vectors output by each state.
The matching table storage unit 334 stores the matching table, which records the correspondence between the channel named in the user's voice input and the channel the user needs to switch to.
The match query module 336 matches the name to be matched against the matching table to obtain the channel to switch to. In this embodiment, the recognized isolated word serves as the query keyword, and the query first searches the channel and program table for entries that match the keyword.
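A keyword query over the channel and program table might look like the following sketch. The two-field entry layout follows the description of the table (channel name plus current program name), but the case-insensitive substring matching rule is an assumption, since the text does not define what counts as a match.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One row of the channel and program table."""
    channel_name: str
    program_name: str


def query_entries(keyword: str, table: List[Entry]) -> List[Entry]:
    """Return entries whose channel or program name contains the keyword."""
    key = keyword.strip().lower()
    if not key:
        return []
    return [
        entry for entry in table
        if key in entry.channel_name.lower() or key in entry.program_name.lower()
    ]
```

Because a keyword can match either field, a spoken program name such as "news" can return several channels, which is why the result may contain more than one entry.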
The channel switching control module 338 switches to the channel obtained by the query. If matching entries exist and the query returns a single entry, the module controls the set-top box to switch live TV to the channel identified by the channel name attribute of that entry. If the query returns several entries, the module controls the TV screen to display the channel name attribute of each entry and prompts the user to select one channel with the remote control; once the user has made a selection, it controls the TV to switch to the selected channel.
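The single-result and multiple-result branches can be sketched as follows. The `choose` callback stands in for the on-screen prompt and remote-control selection; it and the function name are illustrative assumptions.

```python
def decide_switch(matches, choose):
    """Pick the channel to switch to from a list of matching channel names.

    matches: channel names returned by the query.
    choose: callback that shows the options and returns the user's pick.
    Returns the channel name, or None when no valid channel results.
    """
    if not matches:
        return None  # no matching entry, nothing to switch to
    if len(matches) == 1:
        return matches[0]  # single entry: switch directly
    picked = choose(matches)  # several entries: ask the user
    return picked if picked in matches else None
```

Validating the user's pick against the list guards against a stale or mistaken selection reaching the tuner.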
The update module 340 updates the matching table and/or the recognition vocabulary from the EPG server. It further comprises an update timing unit 342 and an update control unit 344. The update timing unit 342 records the time of each update and triggers an update when the update time arrives or is exceeded. In this embodiment, the channel and program list may be set to update daily, and the recognition vocabulary and matching table may be set to update every minute. The update control unit 344 controls the update of the matching table and/or the recognition vocabulary when the update time condition is met.
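The timing and control units can be sketched as a timer that reports when an update is due and a loop that runs the corresponding action. Times are plain seconds supplied by the caller so the example stays deterministic; the class and function names are hypothetical.

```python
class UpdateTimer:
    """Records the last update time and reports when the next one is due."""

    def __init__(self, interval_s, now=0.0):
        self.interval_s = interval_s
        self.last_update = now

    def due(self, now):
        """True when the interval has elapsed or been exceeded."""
        return now - self.last_update >= self.interval_s

    def mark_updated(self, now):
        self.last_update = now


def run_due_updates(timers, actions, now):
    """Run the action for every due timer and return the names triggered."""
    triggered = []
    for name, timer in timers.items():
        if timer.due(now):
            actions[name]()  # e.g. download from the EPG server
            timer.mark_updated(now)
            triggered.append(name)
    return triggered
```

With the intervals from the text, the channel and program list timer would use 86,400 seconds and the recognition vocabulary and matching table timer 60 seconds.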
In the embodiments of the invention, the controller receives the user's voice input signal, and the channel switching device recognizes the name to be matched from that signal, matches it against the matching table to obtain the channel to switch to, and switches to that channel. This avoids the complexity and high cost of performing speech recognition on the controller, makes operation convenient for the user, and makes full use of the capabilities of the channel switching device, which saves control costs. Because the channel switching device recognizes the name to be matched, no dedicated speech recognition server needs to be set up in the network; this prevents long response times, avoids data loss during network transmission, and saves the cost of building the network. Extracting only the actual speech segment improves recognition accuracy. The mute control module mutes the set-top box during voice input, which prevents the sound of the TV program from interfering with the user's speech. The update module automatically updates the channel and program list, the recognition vocabulary, and the matching table from the EPG server, sparing the user the inconvenience of manual updates.
Referring also to Fig. 4, the speech recognition channel selection method of an embodiment of the invention comprises the following steps:
Step 402: the controller receives a voice activation instruction entered by the user. In this embodiment, the voice activation instruction may be a key signal that the user enters through an input device such as a keyboard or touch screen.
Step 404: the controller sends a start-speech-recognition control command to the channel switching device. In this embodiment, for example, the remote control sends the command to the set-top box over a wireless link such as Bluetooth, a high-speed infrared protocol, or ZigBee.
Step 406: the channel switching device is set to the mute state.
Step 408: the channel switching device sends a start-voice-capture control command to the controller. If the mute function is not used, the above steps may be omitted; this is not described further.
Step 410: the controller receives the user's voice input signal, and captures and processes it. In this embodiment, an A/D converter converts the analog voice signal into a digital speech signal, which is sent to the channel switching device wirelessly.
Step 412: the channel switching device detects the start point and end point of the actual speech segment, which are then used to recognize the name to be matched. In this embodiment, voice activity detection uses a robust endpoint detection algorithm to find the start and end points so that the actual speech segment can be separated from the non-speech segments of the input voice signal.
Step 414: the channel switching device sends a stop-voice-capture control signal to the controller. Once recognition processing is complete, the controller can stop capturing the user's voice input signal. In this embodiment, the signal may likewise be transmitted wirelessly, for example over Bluetooth, a high-speed infrared protocol, or ZigBee.
Step 416: the controller stops capturing and processing the voice signal in response to the stop-voice-capture control signal from the channel switching device.
Step 418: the signal of the actual speech segment between the start point and the end point is sent to the speech feature extraction unit. Steps 418 and 414 need not be performed in any particular order, and step 418 may also be performed before step 416; this is not described further.
Step 420: the speech feature extraction unit extracts speech features from the input voice signal. In this embodiment, if the actual speech segment was detected in an earlier step, features need only be extracted from that segment. The feature type may be MFCC, PLP, or LPCC features. To improve noise robustness, cepstral mean subtraction can be applied during feature extraction. Because MFCC features exploit the perceptual characteristics of the human ear and are comparatively robust to noise, MFCC is the preferred feature. A voice signal is short-time stationary and adjacent speech frames are correlated, so appending first-order differences, or first- and second-order differences, of the MFCC features can improve recognition accuracy.
Step 422: the acoustic distance of the input speech feature data relative to each entry is calculated according to the acoustic model and the recognition vocabulary. In this embodiment, the recognizer obtains the shortest cumulative acoustic distance for each isolated word from the acoustic model data and the isolated-word vocabulary data, and takes the isolated word whose shortest cumulative distance is smallest as the first-choice recognition result for the utterance. The acoustic models used for recognition include continuous HMM and discrete HMM models. The recognizer can also offer several candidate results for the user to choose from, ranked by shortest cumulative acoustic distance. This embodiment uses the parameters of an HMM-based acoustic model built by bilingual hybrid modeling. The parameters of the bilingual hybrid acoustic model are speaker-independent. The model parameters must be trained by a trainer on pre-annotated corpus data; only then can the trained parameters be fixed in the acoustic model parameter storage for isolated-word recognition. The acoustic model parameters comprise the HMM state parameters and the probability distribution functions of the observation feature vectors output by each state. Before this step, the method may also include selecting, according to a speech selection signal entered by the user, the acoustic model corresponding to that signal.
Step 424: judge whether the acoustic distance of the speech feature data relative to each entry is less than the threshold. If the acoustic distance is not less than the threshold, go to step 426; if it is less than the threshold, go to step 428.
Step 426: if the acoustic distance of the speech feature data relative to the entry is greater than or equal to the threshold, the recognition result is non-speech and the user is prompted to re-enter speech. The prompt may be a message, an on-screen display, or a sound; this embodiment prompts the user by displaying text on the screen. After step 426, this recognition process ends.
Step 428: if the acoustic distance of the speech feature data relative to the entry is less than the threshold, the channel name corresponding to the current utterance is determined according to the recognition vocabulary and the matching table. In this embodiment, the shortest cumulative acoustic distance for each isolated word is obtained from the acoustic model data and the isolated-word vocabulary data, and the isolated word whose shortest cumulative distance is smallest is taken as the first-choice recognition result for the utterance. The acoustic models used for recognition include continuous HMM and discrete HMM models. Several candidate results can also be offered for the user to choose from, ranked by shortest cumulative acoustic distance.
Step 430: switch to the required channel according to the recognized channel name. If matching entries exist and the query returns a single entry, the set-top box is controlled to switch live TV to the channel identified by the channel name attribute of that entry. If the query returns several entries, the TV screen is controlled to display the channel name attribute of each entry and the user is prompted to select one channel with the remote control; once the user has made a selection, the TV is controlled to switch to the selected channel.
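The handshake between controller and channel switching device in steps 402 to 416 can be viewed as a small state machine on the controller side. The sketch below is one possible reading; the state names, event names, and action strings are all hypothetical and are not defined in the patent.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WAITING_FOR_START = auto()
    CAPTURING = auto()


# (current state, event) -> (next state, action taken by the controller)
TRANSITIONS = {
    # Steps 402-404: user activates voice, controller notifies the device.
    (State.IDLE, "activate_voice"):
        (State.WAITING_FOR_START, "send start-recognition command"),
    # Steps 408-410: device asks for capture, controller streams voice data.
    (State.WAITING_FOR_START, "start_capture"):
        (State.CAPTURING, "capture and send voice"),
    # Steps 414-416: device signals stop, controller stops capturing.
    (State.CAPTURING, "stop_capture"):
        (State.IDLE, "stop capturing"),
}


def step(state, event):
    """Return (next_state, action); unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, None))
```

Ignoring unexpected events keeps the controller from starting capture before the device has had the chance to mute itself (step 406).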
Referring also to Fig. 5, the channel and program list update method of an embodiment of the invention comprises the following steps:
Step 502: check whether the channel and program list meets the configured update condition. The update condition can be set according to the user's needs; for example, the update interval of the channel and program list can be set to one day. If the update condition is met, go to step 504; otherwise return to step 502.
Step 504: the channel switching device downloads the latest channel and program list data from the EPG server and updates the channel and program list.
The source of this update may be the EPG server, or alternatively a local network, an optical disc, or the like.
Referring also to Fig. 6, the recognition vocabulary and matching table update method of an embodiment of the invention comprises the following steps:
Step 602: check whether the recognition vocabulary and matching table meet the configured update condition. The update condition can be set according to the user's needs; for example, the update interval of the recognition vocabulary and matching table can be set to one minute. If the update condition is met, go to step 604; otherwise return to step 602.
Step 604: update the local recognition vocabulary and matching table according to the channel and program list.
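Step 604 derives the recognition vocabulary and matching table from the channel and program list. The text does not give the data layout, so the sketch below assumes the list is a sequence of (channel name, program name) pairs and that both channel names and program names become recognizable words; these choices are illustrative.

```python
def build_tables(channel_program_list):
    """Derive (vocabulary, matching_table) from (channel, program) pairs.

    vocabulary: sorted list of every name the recognizer should accept.
    matching_table: name -> list of channels that name can refer to.
    """
    matching_table = {}
    for channel, program in channel_program_list:
        for name in (channel, program):
            if not name:
                continue  # skip channels with no current program name
            channels = matching_table.setdefault(name, [])
            if channel not in channels:
                channels.append(channel)
    vocabulary = sorted(matching_table)
    return vocabulary, matching_table
```

Rebuilding both structures from the same list keeps them consistent: every word in the vocabulary has at least one channel in the matching table.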
A person of ordinary skill in the art will understand that all or some of the steps of the above methods can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium such as RAM, ROM, or an optical disc.
The embodiment of the invention receives the user's voice input signal by controller, identify title to be matched by the channel switch device according to the voice input signal of described input, mate the channel that draws the needs switching according to described title to be matched and matching list, and switch to the described channel that need to switch, avoided the complicated and high problem of cost at the enterprising lang sound of controller identifying operation, make the user operate very convenient, and make full use of the performance of channel switch device, saved the cost of control.Identify title to be matched by the channel switch device, special speech recognition server need be set in network, prevent that the response time is long, avoided because the problem that transmitted data on network is lost, and saved the cost of building network.The embodiment of the invention is by intercepting actual speech section, and the accuracy rate of speech recognition is improved, and has removed the interference of noise.During by quiet control unit control phonetic entry, set-top box is quiet, prevent the sound of televising interference to user speech.From EPG server more new channel and listing automatically, identification vocabulary and matching list have avoided that the user is manual affected to bring unhandy drawback by update module.
In sum, more than be preferred embodiment of the present invention only, be not to be used to limit protection scope of the present invention.Within the spirit and principles in the present invention all, any modification of being done, be equal to replacement, improvement etc., all should be included within protection scope of the present invention.

Claims (20)

1. A voice recognition channel selection method, characterized in that the method comprises: a controller receiving a user's voice input signal; a channel switching device recognizing a name to be matched according to the input voice signal and a recognition vocabulary; matching the name to be matched against a matching table to obtain the channel to be switched to; and switching to the channel to be switched to.

2. The voice recognition channel selection method according to claim 1, characterized in that the method further comprises: receiving a voice-activation instruction input by the user, the instruction being used to control the channel switching device to activate voice and to set the channel switching device to a mute state.

3. The voice recognition channel selection method according to claim 1, characterized in that the channel switching device recognizing the name to be matched according to the input voice signal comprises: collecting and processing the voice signal input by the user, detecting the start point and end point of the actual speech segment, and recognizing the name to be matched according to the start point and end point of the actual speech segment.

4. The voice recognition channel selection method according to claim 1, characterized in that the channel switching device recognizing the name to be matched according to the input voice signal comprises: performing speech feature extraction on the voice signal; calculating, according to an acoustic model and the recognition vocabulary, the acoustic distance of the speech feature data relative to the entries in the recognition vocabulary; and, if the acoustic distance of the speech feature data relative to an entry is less than a threshold, calculating the channel name corresponding to the current speech according to the recognition vocabulary and the matching table.

5. The voice recognition channel selection method according to claim 4, characterized in that the method further comprises: if the acoustic distance of the speech feature data relative to the entry is greater than or equal to the threshold, prompting the user to re-input the speech.

6. The voice recognition channel selection method according to claim 5, characterized in that the user is prompted to re-input the speech by displaying on the television screen that the speech currently input by the user cannot be recognized and prompting the user to input it again.

7. The voice recognition channel selection method according to claim 1, characterized in that the method further comprises: the channel switching device sending a stop-voice-collection control signal to the controller, and the controller stopping the collection and processing of voice signals under the control of the stop-voice-collection control signal.

8. The voice recognition channel selection method according to claim 1, characterized in that the method further comprises: the channel switching device updating the matching table and/or the recognition vocabulary according to an electronic program guide (EPG) server.

9. The voice recognition channel selection method according to claim 1, characterized in that the method further comprises: selecting, according to a language selection signal input by the user, an acoustic model corresponding to the language selection signal.

10. The voice recognition channel selection method according to claim 1, characterized in that the controller communicates with the channel switching device through a wireless transmission protocol.

11. The voice recognition channel selection method according to claim 10, characterized in that the wireless transmission protocol comprises one or more of: a high-speed infrared protocol, a Bluetooth transmission protocol, and a ZigBee transmission protocol.

12. A voice recognition channel selection system, characterized in that the system comprises: a controller, configured to communicate with a channel switching processing device; the controller being configured to receive a user's voice input signal; and the channel switching processing device being configured to recognize a name to be matched according to the input voice input signal and a recognition vocabulary, to match the name to be matched against a matching table to obtain the channel to be switched to, and to switch to the channel to be switched to.

13. The voice recognition channel selection system according to claim 2, characterized in that the system further comprises: an electronic program guide (EPG) server, configured to provide a matching table to be updated and/or the latest recognition vocabulary, the channel switching device updating the matching table according to the matching table to be updated, and/or updating the recognition vocabulary according to the latest recognition vocabulary.

14. A channel switching device, characterized in that the device comprises: a receiving module, configured to receive a user's voice input signal sent by a controller; a recognition processing module, configured to recognize a name to be matched according to the input voice input signal and a recognition vocabulary; a query matching module, configured to match the name to be matched against a matching table to obtain the channel to be switched to; and a channel switching control module, configured to switch to the channel to be switched to.

15. The channel switching device according to claim 14, characterized in that the device further comprises: a mute control module, configured to set the channel switching device to a mute state according to a voice-activation instruction input by the user.

16. The channel switching device according to claim 14, characterized in that the recognition processing module further comprises: a voice activity detection unit, configured to detect the start point and end point of the actual speech segment.

17. The channel switching device according to claim 14, characterized in that the recognition processing module further comprises: a speech feature extraction unit, configured to perform speech feature extraction on the voice signal; a speech recognition unit, configured to calculate, according to an acoustic model and the recognition vocabulary, the acoustic distance of the input speech feature data relative to the entries in the recognition vocabulary; and a speech judging unit, configured to judge whether the acoustic distance of the speech feature data relative to an entry is less than a threshold and, if it is, to calculate the channel name corresponding to the current speech according to the recognition vocabulary and the matching table.

18. The channel switching device according to claim 17, characterized in that the device further comprises: a rejection prompt module, configured to prompt the user to re-input the speech when the recognition result is non-speech.

19. The channel switching device according to claim 14, characterized in that the device further comprises: an update module, configured to update the matching table and/or the recognition vocabulary according to an electronic program guide (EPG) server.

20. The channel switching device according to claim 14, characterized in that the device further comprises: a language selection module, configured to select, according to a language selection signal input by the user, an acoustic model corresponding to the language selection signal.
CNA2008100654170A | 2008-02-23 | 2008-02-23 | Speech recognition channel selecting system, method and channel switching device | Pending | CN101516005A (en)

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
CNA2008100654170A (CN101516005A (en)) | 2008-02-23 | 2008-02-23 | Speech recognition channel selecting system, method and channel switching device
PCT/CN2009/070380 (WO2009103226A1 (en)) | 2008-02-23 | 2009-02-09 | A voice recognition channel selection system, a voice recognition channel selection method and a channel switching device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CNA2008100654170A (CN101516005A (en)) | 2008-02-23 | 2008-02-23 | Speech recognition channel selecting system, method and channel switching device

Publications (1)

Publication Number | Publication Date
CN101516005A (en) | 2009-08-26

Family

ID=40985065

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CNA2008100654170A (CN101516005A (en), Pending) | Speech recognition channel selecting system, method and channel switching device | 2008-02-23 | 2008-02-23

Country Status (2)

Country | Link
CN (1) | CN101516005A (en)
WO (1) | WO2009103226A1 (en)

Also Published As

Publication number | Publication date
WO2009103226A1 (en) | 2009-08-27

Legal Events

DateCodeTitleDescription
C06Publication
PB01Publication
C10Entry into substantive examination
SE01Entry into force of request for substantive examination
C12Rejection of a patent application after its publication
RJ01Rejection of invention patent application after publication

Application publication date:20090826

