CN101807398B - Speech recognition device and operating method thereof - Google Patents

Speech recognition device and operating method thereof
Info

Publication number
CN101807398B
Authority
CN
China
Prior art keywords
voice
training
model
host
recognition device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009100063762A
Other languages
Chinese (zh)
Other versions
CN101807398A (en)
Inventor
沈欣懋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aten International Co Ltd
Original Assignee
Aten International Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aten International Co Ltd
Priority to CN2009100063762A
Publication of CN101807398A
Application granted
Publication of CN101807398B
Expired - Fee Related
Anticipated expiration

Abstract

A speech recognition device comprises a human-machine interface, a voice input interface, a voice transcoding unit, a voice database and a device processing unit. The human-machine interface enumerates the speech recognition device to a host as an operating device. The voice input interface captures an analog command voice. The voice transcoding unit converts the analog command voice into a digital command voice. The voice database includes a plurality of model feature values and a plurality of model device codes, and the model feature values correspond to the model device codes. The device processing unit compares a command feature value of the digital command voice with the model feature values and outputs the corresponding model device code to the host.

Description

Translated from Chinese
Speech recognition device and operating method thereof

Technical field

The present invention relates to a speech recognition device and an operating method thereof, and in particular to a portable, personalized and intelligent speech recognition device and an operating method thereof.

Background

In recent years, speech recognition systems have been widely adopted across industries because they are easy to use, improve efficiency and reduce cost. For example, a user speaks a command such as "copy" into a microphone, and speech software matches it to the corresponding device code, such as the device code representing the copy function. Spoken input can thus replace keyboard operation during recognition, which is quite convenient.

Generally, before speech recognition can be performed, speech software must be installed on the host and voice training must be carried out. After the speech software calculates the training feature values of the training voice, it stores them in the host. To perform speech recognition, the user speaks a command voice into the microphone. The speech software in the host calculates the command feature value of the command voice and compares it with the training feature values stored in the host to find the matching voice feature value. The device code corresponding to that voice feature value is then output.

However, because the speech software must be installed on the host and the training feature values are also stored in the host, the software must be reinstalled and voice training repeated if the host is damaged or the user moves to another host. In addition, the microphone used to input the command voice is not necessarily the same one each time, and each microphone captures audio somewhat differently. If recognition is performed with a microphone that differs too much from the one used during voice training, voice training must be repeated to achieve a high recognition rate, which is quite inconvenient.

Summary of the invention

The present invention relates to a speech recognition device and an operating method thereof, in which the calculated training feature values are stored in the speech recognition device. The user can therefore carry the speech recognition device around and does not need to repeat voice training even after switching to a different host, which saves time and is quite convenient.

According to a first aspect of the present invention, a speech recognition device is provided. The speech recognition device includes a human-machine interface (HID interface), a voice input interface, a voice transcoding unit, a voice database and a device processing unit. The human-machine interface enumerates the speech recognition device to a host as an operating device. The voice input interface captures an analog command voice. The voice transcoding unit converts the analog command voice into a digital command voice. The voice database includes a plurality of model feature values and a plurality of model device codes, and the model feature values correspond to the model device codes. The device processing unit compares the command feature value of the digital command voice with the model feature values and outputs the corresponding model device code to the host.

According to a second aspect of the present invention, an operating method of a speech recognition device is provided. The operating method includes the following steps: enumerating the speech recognition device to a host as an operating device, wherein the speech recognition device stores a voice database, the voice database includes a plurality of model feature values and a plurality of model device codes, and the model feature values correspond to the model device codes; capturing an analog command voice; converting the analog command voice into a digital command voice; comparing the command feature value of the digital command voice with the model feature values; and outputting the corresponding model device code to the host.

According to a third aspect of the present invention, a speech recognition device is provided. The speech recognition device includes a mass storage interface, a voice input interface, a voice transcoding unit, a voice database and a device processing unit. The mass storage interface is electrically connected to a host and enumerates the speech recognition device to the host as a mass storage device. The voice input interface captures an analog command voice. The voice transcoding unit converts the analog command voice into a digital command voice. The voice database includes a plurality of model feature values and a plurality of model device codes, and the model feature values correspond to the model device codes. The device processing unit transmits the voice database and an application program to the host. The host loads the application program and compares the command feature value of the digital command voice with the model feature values in order to transmit the corresponding model device code.

According to a fourth aspect of the present invention, an operating method of a speech recognition device is provided. The operating method includes the following steps: enumerating the speech recognition device to a host as a mass storage device, wherein the voice capture device stores an application program, a plurality of model feature values and a plurality of model device codes, and the model feature values correspond to the model device codes; transmitting the voice database and the application program to the host; loading the application program on the host; capturing an analog command voice; converting the analog command voice into a digital command voice; comparing, on the host, the command feature value of the digital command voice with the model feature values; and transmitting, from the host, the corresponding model device code.

To make the above content of the present invention easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings:

Brief description of the drawings

FIG. 1 is a functional block diagram of a speech recognition device according to a first embodiment of the present invention.

FIG. 2 is an operation flowchart of the speech recognition device according to the first embodiment of the present invention.

FIG. 3 is a functional block diagram of a speech recognition device according to a second embodiment of the present invention.

FIG. 4 is an operation flowchart of the speech recognition device according to the second embodiment of the present invention.

FIG. 5 is a schematic diagram of the voice training window of the second embodiment.

FIG. 6 is a functional block diagram of a speech recognition device according to a third embodiment of the present invention.

FIG. 7 is an operation flowchart of the speech recognition device according to the third embodiment of the present invention.

FIG. 8 is a functional block diagram of a speech recognition device according to a fourth embodiment of the present invention.

FIG. 9 is an operation flowchart of the speech recognition device according to the fourth embodiment of the present invention.

FIG. 10 is a functional block diagram of a speech recognition device according to another embodiment of the present invention.

[Description of main reference numerals]

100, 300, 600, 800, 900: speech recognition device

102, 902: human-machine interface

104, 616: voice input interface

106, 606: voice transcoding unit

108, 608: storage unit

110, 610: device processing unit

112, 612: voice database

114, 614: host

302, 602, 802: application program

304, 604: mass storage interface

308, 904: composite device

K1: training device code

S1: analog command voice

S2: digital command voice

T1: analog training voice

T2: digital training voice

W: voice training window

W11, W12, W13: voice fields

W21, W22, W23: device code fields

Detailed description

In the speech recognition device and operating method of the present invention, the calculated training feature values are stored in a voice database, and the voice database is stored in the speech recognition device. The user can therefore carry the speech recognition device around and does not need to repeat voice training even after switching to a different host, which saves time and is quite convenient. Several embodiments are described below by way of example. These embodiments are only some of the implementations within the spirit of the invention, and their text and drawings do not limit the scope of protection sought.

Please refer to FIG. 1, which shows a functional block diagram of a speech recognition device according to a first embodiment of the present invention. The speech recognition device 100, for example a microphone, includes a human-machine interface 102, a voice input interface 104, a voice transcoding unit 106, a storage unit 108 and a device processing unit 110.

The human-machine interface 102 enumerates the speech recognition device to the host 114 as an operating device, such as a keyboard or a mouse. The human-machine interface 102 may be a Universal Serial Bus (USB) interface or a PS/2 interface; in the first embodiment, a USB interface is used as the example. The voice input interface 104 captures the analog command voice S1. The voice transcoding unit 106 converts the analog command voice S1 into the digital command voice S2.

The storage unit 108 stores the voice database 112. The voice database 112 includes a plurality of model feature values (not shown) and a plurality of model device codes (not shown). The model feature values correspond to the model device codes, and each model device code is a keyboard code (not shown) or a mouse code (not shown). For example, the model feature values include the feature value of the spoken word "copy" and the feature value of the spoken word "delete", and the model device codes include the model device code corresponding to the "copy" model feature value and the model device code corresponding to the "delete" model feature value.
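The pairing of model feature values with model device codes can be pictured with a small Python sketch. The patent does not describe a storage layout, so everything here is an assumption for illustration: the names `VoiceDatabase`, `add`, `device_code_for` and `keyboard_report` are hypothetical, the feature numbers are made-up placeholders, a "keyboard code" is assumed to be an 8-byte USB HID boot keyboard report, and "copy" is assumed to map to Ctrl+C.

```python
from dataclasses import dataclass, field

# Assumed encoding of a "keyboard code": an 8-byte USB HID boot keyboard report
# (modifier byte, reserved byte, then up to six key usage IDs).
LEFT_CTRL = 0x01      # modifier bit for Left Control
USAGE_C = 0x06        # HID usage ID for the "c" key
USAGE_DELETE = 0x4C   # HID usage ID for Delete (forward)


def keyboard_report(modifiers, usage):
    """Build an 8-byte report that presses one key with the given modifiers."""
    return bytes([modifiers, 0x00, usage, 0, 0, 0, 0, 0])


@dataclass
class VoiceDatabase:
    """Hypothetical layout: each entry pairs a model feature value with a device code."""
    entries: list = field(default_factory=list)

    def add(self, label, feature, device_code):
        self.entries.append((label, tuple(feature), device_code))

    def device_code_for(self, label):
        for name, _feature, code in self.entries:
            if name == label:
                return code
        return None


db = VoiceDatabase()
db.add("copy", [0.9, 0.1, 0.3], keyboard_report(LEFT_CTRL, USAGE_C))    # assumed Ctrl+C
db.add("delete", [0.2, 0.8, 0.5], keyboard_report(0x00, USAGE_DELETE))  # Delete key
```

Storing a ready-to-send report next to each feature value is only one possible design; it would let the device forward the code to the host without any translation step.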

The device processing unit 110 calculates the command feature value (not shown) of the digital command voice S2, compares it with the model feature values, and identifies the model feature value that is similar to the command feature value. Once a similar model feature value has been found, the device processing unit 110 outputs the model device code corresponding to it to the host 114.
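The comparison step can be sketched as a nearest-match search. The patent only says that a "similar" model feature value is found, so the Euclidean distance, the `max_distance` threshold and the function name `closest_model` are assumptions chosen for illustration, not the patented method.

```python
import math


def closest_model(command_feature, models, max_distance=0.5):
    """Return the device code of the most similar model feature, or None.

    `models` is a list of (model_feature, model_device_code) pairs.
    Similarity is assumed to be Euclidean distance; anything farther than
    `max_distance` is treated as "no similar model found".
    """
    best_code, best_dist = None, math.inf
    for feature, code in models:
        dist = math.dist(command_feature, feature)
        if dist < best_dist:
            best_code, best_dist = code, dist
    return best_code if best_dist <= max_distance else None


models = [((0.9, 0.1), "COPY"), ((0.2, 0.8), "DELETE")]
match = closest_model((0.85, 0.15), models)
```

A rejection threshold is included because, without one, any sound at all would be mapped to some device code and sent to the host.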

The comparison function and the model device code transmission function of the device processing unit 110 described above can be implemented as firmware. The speech recognition device 100 of the first embodiment therefore does not require any additional application program to be installed when performing speech recognition. As long as the user carries the speech recognition device 100, speech recognition can be performed anywhere without concern for whether the host has speech recognition software installed, which is quite convenient.

Because the speech recognition device 100 is portable, the user can carry it around, and the voice database 112 with it. Whichever host the user switches to, speech recognition can be performed using the model feature values stored in the voice database 112 of the speech recognition device 100, without repeating voice training.

Moreover, because the speech recognition device 100 itself includes the voice input interface 104 and the voice transcoding unit 106, the analog command voice does not need to be input through another voice capture device such as a separate microphone. This eliminates the drop in recognition rate caused by differences between voice capture devices.

Please refer to FIG. 2, which shows an operation flowchart of the speech recognition device according to the first embodiment of the present invention. The operating method includes the following steps. First, in step S202, the speech recognition device 100 is electrically connected to the host 114, and the human-machine interface 102 enumerates the speech recognition device 100 to the host 114 as an operating device. Next, in step S204, the voice input interface 104 captures the analog command voice S1, for example when the user says "copy" to the voice input interface 104.

Next, in step S206, the voice transcoding unit 106 converts the analog command voice S1 into the digital command voice S2.
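As a rough picture of what step S206 involves, the sketch below samples a continuous signal and quantizes it to signed integers. The patent does not give a sampling rate or bit depth, so the 8 kHz rate, the 16-bit depth and the function name `digitize` are assumptions.

```python
import math


def digitize(analog, num_samples, sample_rate=8000, bits=16):
    """Sample `analog(t)` (expected range -1.0..1.0) and quantize each sample.

    Values outside the range are clipped, as a real converter would saturate.
    """
    full_scale = 2 ** (bits - 1) - 1
    samples = []
    for i in range(num_samples):
        value = max(-1.0, min(1.0, analog(i / sample_rate)))
        samples.append(round(value * full_scale))
    return samples


# A 440 Hz tone standing in for the analog command voice S1.
s2 = digitize(lambda t: math.sin(2 * math.pi * 440 * t), num_samples=80)
```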

Next, in step S208, the device processing unit 110 calculates the command feature value of the digital command voice S2.
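The patent does not name the algorithm used to compute a feature value. Purely as a stand-in, the sketch below reduces a digitized utterance to a short fixed-length vector of per-segment RMS energies. The function name `feature_value` and the choice of four segments are assumptions, and a practical recognizer would use a far richer feature.

```python
import math


def feature_value(samples, segments=4):
    """Summarize `samples` as the RMS energy of `segments` consecutive chunks.

    Samples left over after the last full chunk are ignored; missing chunks
    (input shorter than `segments`) contribute 0.0.
    """
    size = max(1, len(samples) // segments)
    feature = []
    for k in range(segments):
        chunk = samples[k * size:(k + 1) * size]
        if chunk:
            feature.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
        else:
            feature.append(0.0)
    return feature


command_feature = feature_value([3, 3, 4, 4, 0, 0, 5, 5])
```

Whatever the real algorithm is, the same function would need to be applied to both training and command voices so that the resulting values are comparable.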

Then, in step S210, the device processing unit 110 compares the command feature value with the model feature values and identifies the model feature value that is similar to the command feature value. For example, after comparing the command feature value with the "copy" model feature value and the "delete" model feature value in the voice database 112, the device processing unit 110 finds that the "copy" model feature value is similar to the command feature value.

Then, in step S212, the device processing unit 110 outputs to the host 114, through the human-machine interface 102, the model device code in the voice database 112 that corresponds to the "copy" model feature value.

Please refer to FIG. 3, which shows a functional block diagram of a speech recognition device according to a second embodiment of the present invention. The second embodiment differs from the first in that the storage unit 108 of the speech recognition device 300 also stores an application program 302, which is used for voice training. In addition, the speech recognition device 300 includes a mass storage interface 304, through which the application program 302 is transmitted to the host 114 so that the host 114 can load it and perform voice training. Elements that are the same retain the same reference numerals and are not described again here.

The mass storage interface 304, for example a Universal Serial Bus interface, is electrically connected to the host 114 and enumerates the speech recognition device 300 to the host 114 as a mass storage device. The mass storage device is, for example, an optical drive holding an application disc, or a USB flash drive. In the second embodiment, enumeration as an optical drive is used as the example.

In addition, in the speech recognition device 300 of the second embodiment, the voice input interface 104 also captures an analog training voice T1, and the voice transcoding unit 106 also converts the analog training voice T1 into a digital training voice T2. The device processing unit 110 transmits the application program 302 to the host 114 through the mass storage interface and calculates the training feature value (not shown) of the digital training voice T2. Preferably, the speech algorithm used to calculate the training feature value is the same as the one used to calculate the command feature value.

After the device processing unit 110 transmits the application program 302 to the host, the processing unit (not shown) of the host 114 loads the application program 302. The host 114 then also captures a training device code K1, which is a keyboard code or a mouse code. The training device code K1 corresponds to the training feature value.

The application program of the second embodiment supports voice training and device code input. After the device processing unit 110 calculates the training feature value of the digital training voice T2, the training feature value is stored in the voice database 112 in the speech recognition device 300 so that it becomes one of the model feature values. The device processing unit 110 also stores the corresponding device code, that is, the training device code K1 corresponding to the training feature value, in the voice database 112 in the speech recognition device 300 so that the training device code K1 becomes one of the model device codes.

In addition, because the speech recognition device 300 includes both the human-machine interface 102 and the mass storage interface 304, it is enumerated to the host 114 as a composite device 308 comprising an operating device and a mass storage device. Besides cooperating with the host 114 to perform voice training, the device processing unit 110 of the speech recognition device 300 can therefore also perform speech recognition and output model device codes to the host 114 through the human-machine interface 102. In other words, the speech recognition device 300 provides both a voice training function and a speech recognition function.

Please refer to FIG. 4, which shows an operation flowchart of the speech recognition device according to the second embodiment of the present invention. The operating method includes the following steps. First, in step S402, the speech recognition device 300 is electrically connected to the host 114. The mass storage interface 304 enumerates the speech recognition device 300 to the host 114 as a mass storage device, and the human-machine interface 102 enumerates the speech recognition device 300 to the host 114 as an operating device, so that the speech recognition device 300 becomes a composite device 308 comprising an operating device and a mass storage device.

Next, in step S404, the speech recognition device 300 transmits the application program 302 to the host 114 for the host 114 to load. Because the speech recognition device 300 is enumerated as an optical drive, its storage unit 108 can store an autorun configuration file, for example a file named "autorun.inf". The file "autorun.inf" records the path and command for executing the application program 302. When the host 114 finds the file "autorun.inf" in the speech recognition device 300, it automatically executes the application program 302 that the file points to. In other words, the application program 302 is loaded by the host 114 automatically, without the user having to configure or click anything.
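For readers unfamiliar with the file, the sketch below generates a minimal "autorun.inf" as text. The `[autorun]` section with `open` and `label` entries is the standard Windows format, but the patent does not show the file's contents, so the executable name `VoiceTrainer.exe`, the label and the helper `make_autorun_inf` are hypothetical. Whether a given host actually honors the file depends on its operating system and AutoRun settings.

```python
def make_autorun_inf(executable, label=None):
    """Return the text of a minimal autorun.inf that launches `executable`."""
    lines = ["[autorun]", f"open={executable}"]
    if label:
        lines.append(f"label={label}")   # volume name shown by the host
    return "\r\n".join(lines) + "\r\n"    # Windows-style line endings


# Hypothetical file name standing in for application program 302.
autorun_text = make_autorun_inf("VoiceTrainer.exe", label="Voice Training")
```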

Voice training can also be performed when the speech recognition device 300 is enumerated as a USB flash drive. In that case, the user selects the application program 302 in the storage unit 108, and the host 114 loads the application program 302 after a double-click or a press of the Enter key on the keyboard. The application program 302 is started in this way as well.

Next, please refer to FIG. 5, which shows a schematic diagram of the voice training window of the second embodiment. In step S406, the host 114 loads the application program 302 sent by the device processing unit 110 and opens the voice training window W. The voice training window W includes a plurality of voice fields, for example voice fields W11, W12 and W13, and a plurality of device code fields, for example device code fields W21, W22 and W23. The voice fields record the digital training voice T2, and the device code fields record the training device code K1.

Then, in step S408, the voice input interface 104 captures the analog training voice T1, for example when the user says "copy" to the speech recognition device 300.

Then, in step S410, the voice transcoding unit 106 converts the analog training voice T1 into the digital training voice T2. After the conversion is complete, the host 114 may record an indication that the digital training voice T2 has been converted in one of the voice fields, for example voice field W11. The indication takes the form of, for example, a file name or a symbol.

Then, in step S412, the device processing unit 110 calculates the training feature value of the digital training voice T2. Alternatively, the training feature value of the digital training voice T2 can be calculated by the host 114, in which case the application program 302 also includes the function of calculating voice feature values. Preferably, the same speech algorithm is used regardless of whether the feature value is calculated by the host 114 or by the device processing unit 110.

Then, in step S414, the host 114 captures the training device code K1, which corresponds to the training feature value. The training device code K1 is captured by the host 114 when, for example, the user presses a key on a keyboard (not shown) connected to the host 114. After capturing the training device code K1, the host 114 records it in one of the device code fields, for example device code field W21. It is recorded as, for example, a symbol or the code number of the training device code K1.

Before pressing a key on the keyboard connected to the host 114, the user can click the device code field W21 in the voice training window W to notify the host 114 that the user is about to input the training device code K1 corresponding to the digital training voice T2. Preferably, the voice training window W also provides a confirmation button (not shown). The method proceeds to step S416 only after step S414 is complete and the user has pressed this confirmation button. Alternatively, the method can proceed to step S416 as soon as the analog training voice T1 and the training device code K1 have been input, without any confirmation by the user.

Then, in step S416, the host 114 may send a signal (not shown) indicating that training is complete to the speech recognition device 300. After receiving this signal, the speech recognition device 300 stores the training feature value in the voice database 112 so that it becomes one of the model feature values, and stores the training device code K1, received through the human-machine interface 102 or the mass storage interface 304, in the voice database 112 so that it becomes one of the model device codes.
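The hand-off in steps S412 to S416 can be sketched as a small state holder on the device side. The class and method names (`RecognitionDevice`, `on_training_voice`, `on_training_complete`) are hypothetical, and the host-to-device signalling is reduced to plain method calls. The patent does not describe the actual message format.

```python
class RecognitionDevice:
    """Hypothetical device-side bookkeeping for one training round."""

    def __init__(self):
        self.voice_database = []     # list of (model_feature, model_device_code)
        self.pending_feature = None  # training feature waiting for its code

    def on_training_voice(self, training_feature):
        """S412: keep the training feature value until training is confirmed."""
        self.pending_feature = tuple(training_feature)

    def on_training_complete(self, training_device_code):
        """S416: the host signals completion and supplies training device code K1."""
        if self.pending_feature is None:
            raise RuntimeError("no training voice has been captured yet")
        # The pair now becomes one model feature value and one model device code.
        self.voice_database.append((self.pending_feature, training_device_code))
        self.pending_feature = None


device = RecognitionDevice()
device.on_training_voice([0.9, 0.1, 0.3])
device.on_training_complete("K1")
```

Committing the feature value and the device code together keeps the database from ever holding a feature value without a matching code.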

Because the speech recognition device 300 is a composite device 308 that also includes an operating device, it can perform speech recognition at any time after the voice training operation ends, for example by using the operating method of the first embodiment shown in FIG. 2.

Please refer to FIG. 6, which is a functional block diagram of a speech recognition device according to a third embodiment of the present invention. The third embodiment differs from the first in that the storage unit 608 of the speech recognition device 600 stores an application program 602, which the host 614 loads in order to perform the speech recognition function. In addition, the speech recognition device 600 has no human-machine interface 102; a mass storage interface 604 takes its place, and the application program 602 is transmitted to the host 614 through this mass storage interface 604 for the host 614 to load. Elements identical to those described earlier keep the same reference numerals and are not described again.

The speech recognition device 600, for example a microphone, includes a mass storage interface 604, a voice input interface 616, a voice transcoding unit 606, a storage unit 608, and a device processing unit 610.

The mass storage interface 604, for example a Universal Serial Bus (USB) standard interface, enumerates the speech recognition device 600 to the host 614 as a mass storage device. The mass storage device is, for example, an optical drive holding an application disc, or a flash drive; the third embodiment is described using an optical drive as the example.

The voice input interface 616 captures the analog command voice S1. The voice transcoding unit 606 converts the analog command voice S1 into the digital command voice S2. The storage unit 608 stores the application program 602 and a speech database 612, and the speech database 612 includes several model feature values (not shown) and several model device codes (not shown). The model feature values correspond to the model device codes, and each model device code is a keyboard code (not shown) or a mouse code (not shown). For example, the model feature values include the feature value of the spoken word "copy" and the feature value of the spoken word "delete", and the model device codes include the model device code corresponding to the "copy" model feature value and the model device code corresponding to the "delete" model feature value.
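One possible in-memory layout for such a speech database is sketched below in Python. Everything named here is an assumption made for illustration: `CodeKind`, `ModelEntry`, `SPEECH_DATABASE`, `device_code_for`, the three-component feature vectors, and the key names used as device codes do not come from the text, which only states that each model feature value corresponds to a keyboard or mouse code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class CodeKind(Enum):
    """A model device code is either a keyboard code or a mouse code."""
    KEYBOARD = "keyboard"
    MOUSE = "mouse"


@dataclass(frozen=True)
class ModelEntry:
    label: str                    # spoken word, kept only for readability
    feature: Tuple[float, ...]    # model feature value (made-up numbers)
    kind: CodeKind
    device_code: Tuple[str, ...]  # hypothetical symbolic key names


# Illustrative contents matching the "copy" / "delete" example.
SPEECH_DATABASE = (
    ModelEntry("copy", (0.9, 0.1, 0.3), CodeKind.KEYBOARD, ("CTRL", "C")),
    ModelEntry("delete", (0.2, 0.8, 0.5), CodeKind.KEYBOARD, ("DELETE",)),
)


def device_code_for(label: str) -> Optional[Tuple[str, ...]]:
    """Return the device code stored for a label, or None if it is untrained."""
    for entry in SPEECH_DATABASE:
        if entry.label == label:
            return entry.device_code
    return None
```

Storing the feature value and the device code in one record keeps the correspondence explicit, so a match on the feature side yields the code directly.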

The device processing unit 610 transmits the speech database 612 and the application program 602 to the host 614 for the host 614 to load. Once both have been transmitted, the host 614 loads the application program 602. The host 614 then computes the command feature value of the digital command voice S2, finds among the model feature values the one that is similar to the command feature value, and transmits the model device code that corresponds to that similar model feature value.

More specifically, the speech recognition device 100 of the first embodiment and the speech recognition device 300 of the second embodiment perform speech recognition using firmware written into the device processing unit 110, whereas in the third embodiment the host 614 loads the application program 602 to perform speech recognition for the speech recognition device 600. The speech recognition operation of the speech recognition device of the present invention can therefore be carried out in several ways and is not limited to what the embodiments describe.

In addition, in the third embodiment the host performs the speech recognition operation. When the processing unit (not shown) of the host processes data faster than the device processing unit 610, the speech recognition device 600 of the third embodiment can be chosen to shorten the time needed for speech recognition.

Please refer to FIG. 7, which is a flowchart of the operation of the speech recognition device according to the third embodiment of the present invention. The operating method includes the following steps.

First, in step S702, the speech recognition device 600 is electrically connected to the host 614, and the mass storage interface 604 enumerates the speech recognition device 600 to the host 614 as a mass storage device.

Next, in step S704, the device processing unit 610 transmits the speech database 612 and the application program 602 to the host 614.

Next, in step S706, the host 614 loads the application program 602 received from the device processing unit 610.

Then, in step S708, the voice input interface 616 captures the analog command voice S1, for example when the user says "copy" into the voice input interface 616.

Then, in step S710, the voice transcoding unit 606 converts the analog command voice S1 into the digital command voice S2. Then, in step S712, the command feature value of the digital command voice S2 is computed.
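Steps S710 and S712 can be illustrated with a toy Python sketch. The text does not specify the conversion format or the feature algorithm, so everything here is an assumption: the function names `quantize` and `frame_energy_feature`, the 16-bit signed sample format, and the use of per-frame mean energy as the "feature value" are stand-ins chosen only because they are simple.

```python
from typing import List, Sequence


def quantize(samples: Sequence[float], bits: int = 16) -> List[int]:
    """Map analog amplitudes in [-1.0, 1.0] to signed integers (cf. step S710).

    Amplitudes outside the range are clipped before scaling.
    """
    max_level = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * max_level) for s in samples]


def frame_energy_feature(pcm: Sequence[int], frame_size: int = 4) -> List[float]:
    """Toy feature value: mean energy of each complete frame (cf. step S712).

    A trailing frame shorter than frame_size is dropped.
    """
    features = []
    for start in range(0, len(pcm) - frame_size + 1, frame_size):
        frame = pcm[start:start + frame_size]
        features.append(sum(x * x for x in frame) / frame_size)
    return features
```

Whatever algorithm is actually used, running the same function on command voices and on training voices is what makes their feature values comparable.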

Then, in step S714, the host 614 finds, among the model feature values, the one that is similar to the command feature value. For example, the host 614 compares the command feature value with the "copy" model feature value and the "delete" model feature value in the speech database 612 and finds that the "copy" model feature value is closer to the command feature value.

Then, in step S716, the host 614 transmits the model device code that corresponds to the similar model feature value, that is, the model device code corresponding to the "copy" model feature value. The destination is, for example, an application opened on the host 614, such as the WORD word processor, which then performs a copy operation on text.
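The comparison in steps S714 and S716 can be sketched in Python as a nearest-match lookup. The text only says the host picks the model feature value that is "similar" or "closer"; using Euclidean distance, the optional `max_distance` cutoff, the function name `recognize`, and the sample vectors and code strings are all assumptions made for this illustration.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

# label -> (model feature value, model device code); values are made up.
Models = Dict[str, Tuple[Sequence[float], str]]


def recognize(command_feature: Sequence[float],
              models: Models,
              max_distance: float = math.inf) -> Optional[Tuple[str, str]]:
    """Return (label, device code) of the closest model, or None if none is close enough."""
    best: Optional[Tuple[str, str]] = None
    best_distance = math.inf
    for label, (feature, device_code) in models.items():
        distance = math.dist(command_feature, feature)  # Euclidean distance
        if distance < best_distance:
            best, best_distance = (label, device_code), distance
    if best is None or best_distance > max_distance:
        return None  # no trained command is similar enough
    return best


models = {
    "copy": ((0.9, 0.1), "CTRL+C"),
    "delete": ((0.2, 0.8), "DELETE"),
}
match = recognize((0.85, 0.15), models)
```

Here `match` holds the "copy" entry, whose device code would then be passed to the target application. The cutoff is a design choice outside the text: without it, any utterance, however unrelated, would trigger the nearest trained command.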

In addition, please refer to FIG. 8, which is a functional block diagram of a speech recognition device according to a fourth embodiment of the present invention. The fourth embodiment differs from the third in that the application program 802 of the speech recognition device 800 provides a voice training function in addition to the speech recognition function of the application program 602. More specifically, after loading the application program, the host 614 can perform voice training as well as speech recognition. In other words, the speech recognition device 800 provides both voice training and speech recognition. Elements identical to those described earlier keep the same reference numerals and are not described again.

Please refer to FIG. 9, which is a flowchart of the operation of the speech recognition device according to the fourth embodiment of the present invention. The operating method includes the following steps. First, in step S902, the speech recognition device 800 is electrically connected to the host 614, and the mass storage interface 604 enumerates the speech recognition device 800 to the host 614 as a mass storage device. Next, in step S904, the speech recognition device 800 transmits the application program 802 to the host 614 for the host 614 to load. Because the speech recognition device 800 is enumerated as an optical drive, its storage unit 608 may store an autorun configuration file, for example a file named "autorun.inf". It is executed as disclosed in step S404 of the second embodiment, which is not repeated here.

Next, as shown in FIG. 5, in step S906 the host 614 loads the application program 802 received from the device processing unit 610 and opens the voice training window W. The voice training process is as disclosed in step S406 of the second embodiment and is not repeated here. Then, in step S908, the voice input interface 616 captures the analog training voice T1, for example when the user says "copy" to the speech recognition device 800. Then, in step S910, the voice transcoding unit 606 converts the analog training voice T1 into the digital training voice T2.

Then, in step S912, the device processing unit 610 computes the training feature value of the digital training voice T2. Alternatively, the host 614 may compute it, in which case the application program 802 also includes the function for computing speech feature values. Preferably, the same speech algorithm is used whether the feature value is computed by the host 614 or by the device processing unit 610.

Then, in step S914, the host 614 captures the training device code K1, which corresponds to the training feature value. The training device code K1 is captured by the host 614 when, for example, the user presses a key on a keyboard (not shown) connected to the host 614.

Then, in step S916, the host 614 may send a signal (not shown) indicating that training is complete to the speech recognition device 800. On receiving this signal, the speech recognition device 800 stores the training feature value in the speech database 612, so that the training feature value becomes one of the model feature values, and stores the training device code K1 in the speech database 612 through the mass storage interface 604, so that the training device code K1 becomes one of the model device codes.

In the fourth embodiment, when the speech recognition device 800 performs voice training, the training device code K1 is stored in its speech database 612 through the mass storage interface 604. In other implementations, however, the training device code K1 may instead be stored in the speech database 612 through a human-machine interface. Please refer to FIG. 10, which is a functional block diagram of a speech recognition device according to another embodiment of the present invention. The speech recognition device 900 differs from the speech recognition device 800 in that it can be enumerated as a composite device 904 that includes a human-machine interface 902 and the mass storage interface 604. Elements in FIG. 10 that are identical to those in FIG. 8 keep the same reference numerals and are not described again. In this way, the training device code K1 can also be stored in the speech database 612 of the speech recognition device 900 through the human-machine interface 902.

In addition, although the speech recognition device in the above embodiments is described using a microphone as the example, in other embodiments it may also be a keyboard, a mouse, a mobile phone, or the like; its range of application is not limited by the embodiments of the present invention.

The speech recognition device and its operating method disclosed in the above embodiments have a number of advantages, some of which are listed below:

(1). The speech database and the application program are stored in the speech recognition device. The user can carry the device around and does not need to repeat voice training even after switching to a different host, which saves time and is convenient.

(2). The device processing unit 110 of the speech recognition device 100 has firmware that provides the speech recognition function, so speech recognition can be carried out without the host loading any software. In other words, the speech recognition device 100 can perform speech recognition without storing an application program.

(3). The speech recognition function of the speech recognition devices 600 and 800 can be performed by the host after it loads the application program 602 or 802. When the host processes data quickly, this shortens speech recognition time. The speech recognition device of the present invention thus has several implementations that can be matched to different environments to make speech recognition more efficient.

(4). The speech recognition devices 300, 800, and 900 each provide both speech recognition and voice training.

(5). The speech recognition device of the above embodiments can be enumerated as an optical drive, so that the application program is transmitted to the host automatically and the host loads it automatically. This saves the user the time of launching the application program manually.

In summary, although the present invention has been disclosed above by way of several preferred embodiments, they are not intended to limit it. Those of ordinary skill in the art to which the present invention pertains can make various changes and modifications without departing from its spirit and scope. The scope of protection of the present invention is therefore defined by the appended claims.

Claims (8)

Translated from Chinese
1. A speech recognition device, comprising:
a human-machine interface for enumerating the speech recognition device to a host as an operating device;
a voice input interface for capturing an analog command voice;
a voice transcoding unit for converting the analog command voice into a digital command voice;
a speech database comprising a plurality of model feature values and a plurality of model device codes, the plurality of model feature values corresponding to the plurality of model device codes; and
a device processing unit for comparing a command feature value of the digital command voice with the plurality of model feature values and outputting the corresponding model device code to the host;
a mass storage interface for enumerating the speech recognition device to the host as a mass storage device;
wherein the voice input interface is further used to capture an analog training voice, the voice transcoding unit is used to convert the analog training voice into a digital training voice, the speech recognition device stores an application program, the device processing unit is further used to transmit the application program to the host through the mass storage interface, and the host is further used to load the application program and capture a training device code, the training device code corresponding to a training feature value of the digital training voice;
wherein the device processing unit is further used to store the training feature value in the speech database, so that the training feature value becomes one of the plurality of model feature values, and to store the training device code in the speech database, so that the training device code becomes one of the plurality of model device codes.

2. The speech recognition device according to claim 1, wherein the device processing unit finds, among the plurality of model feature values, a model feature value similar to the command feature value, and the device processing unit is further used to output to the host the model device code that, among the plurality of model device codes, corresponds to the similar model feature value.

3. A method of operating a speech recognition device, comprising:
enumerating the speech recognition device to a host as an operating device, wherein the speech recognition device includes a plurality of model feature values and a plurality of model device codes, the plurality of model feature values corresponding to the plurality of model device codes;
capturing an analog command voice;
converting the analog command voice into a digital command voice;
comparing a command feature value of the digital command voice with the plurality of model feature values; and
outputting the corresponding model device code to the host;
enumerating the speech recognition device as a mass storage device;
transmitting an application program stored in the speech recognition device to the host;
loading, by the host, the application program;
capturing an analog training voice;
converting the analog training voice into a digital training voice;
capturing, by the host, a training device code, the training device code corresponding to a training feature value of the digital training voice; and
storing the training feature value, so that the training feature value becomes one of the plurality of model feature values, and storing the training device code, so that the training device code becomes one of the plurality of model device codes.

4. A speech recognition device, comprising:
a mass storage interface for enumerating the speech recognition device to a host as a mass storage device;
a voice input interface for capturing an analog command voice;
a voice transcoding unit for converting the analog command voice into a digital command voice;
a speech database comprising a plurality of model feature values and a plurality of model device codes, the plurality of model feature values corresponding to the plurality of model device codes; and
a device processing unit for transmitting the speech database and an application program to the host;
wherein the host is used to load the application program, compare a command feature value of the digital command voice with the plurality of model feature values, and transmit the corresponding model device code.

5. The speech recognition device according to claim 4, wherein the host is used to find, among the plurality of model feature values, a model feature value similar to the command feature value, and to transmit the model device code that, among the plurality of model device codes, corresponds to the similar model feature value.

6. The speech recognition device according to claim 4, wherein the voice input interface is further used to capture an analog training voice, the voice transcoding unit is used to convert the analog training voice into a digital training voice, and the host is further used to capture a training device code, the training device code corresponding to a training feature value of the digital training voice;
wherein the device processing unit is further used to store the training feature value in the speech database, so that the training feature value becomes one of the plurality of model feature values, and to store the training device code in the speech database, so that the training device code becomes one of the plurality of model device codes.

7. A method of operating a speech recognition device, comprising:
enumerating the speech recognition device to a host as a mass storage device, wherein the speech recognition device stores an application program, a plurality of model feature values, and a plurality of model device codes, the plurality of model feature values corresponding to the plurality of model device codes;
transmitting the plurality of model feature values, the plurality of model device codes, and the application program to the host;
loading, by the host, the application program;
capturing an analog command voice;
converting the analog command voice into a digital command voice;
comparing, by the host, a command feature value of the digital command voice with the plurality of model feature values; and
transmitting, by the host, the corresponding model device code.

8. The operating method according to claim 7, further comprising:
capturing an analog training voice;
converting the analog training voice into a digital training voice;
capturing, by the host, a training device code, the training device code corresponding to a training feature value of the digital training voice; and
storing the training feature value, so that the training feature value becomes one of the plurality of model feature values, and storing the training device code, so that the training device code becomes one of the plurality of model device codes.
CN2009100063762A | 2009-02-16 | 2009-02-16 | Speech recognition device and operating method thereof | Expired - Fee Related | CN101807398B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN2009100063762A, CN101807398B (en) | 2009-02-16 | 2009-02-16 | Speech recognition device and operating method thereof

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN2009100063762A, CN101807398B (en) | 2009-02-16 | 2009-02-16 | Speech recognition device and operating method thereof

Publications (2)

Publication Number | Publication Date
CN101807398A (en) | 2010-08-18
CN101807398B (en) | 2011-12-21

Family

ID=42609167

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN2009100063762A (Expired - Fee Related), CN101807398B (en) | Speech recognition device and operating method thereof | 2009-02-16 | 2009-02-16

Country Status (1)

Country | Link
CN (1) | CN101807398B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR102022318B1 (en)* | 2012-01-11 | 2019-09-18 | 삼성전자 주식회사 | Method and apparatus for performing user function by voice recognition
CN103869948B (en)* | 2012-12-14 | 2019-01-15 | 联想(北京)有限公司 | Voice command processing method and electronic equipment
CN106356057A (en)* | 2016-08-24 | 2017-01-25 | 安徽咪鼠科技有限公司 | Speech recognition system based on semantic understanding of computer application scenario
CN107889085A (en)* | 2016-09-30 | 2018-04-06 | 亚旭电脑股份有限公司 | Method for inputting voice signal into intelligent device, electronic device and computer
CN106409297A (en)* | 2016-10-18 | 2017-02-15 | 安徽天达网络科技有限公司 | Voice recognition method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN2428838Y (en)* | 2000-06-23 | 2001-05-02 | 陈振文 | Voice recognition device
CN2447894Y (en)* | 2000-11-02 | 2001-09-12 | 陈振文 | A voice control identification device
CN2724146Y (en)* | 2004-08-27 | 2005-09-07 | 中国科学院自动化研究所 | Non specific person independent word sound identifying device
CN1716413A (en)* | 2004-07-02 | 2006-01-04 | 深圳市朗科科技有限公司 | Vehicle carried speech identification audio-video playing device and method

Also Published As

Publication number | Publication date
CN101807398A (en) | 2010-08-18

Legal Events

Date | Code | Title | Description
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C14 | Grant of patent or utility model
GR01 | Patent grant
CF01 | Termination of patent right due to non-payment of annual fee | Termination of patent right due to non-payment of annual fee

Granted publication date: 2011-12-21

