CN102867512A

Info

Publication number: CN102867512A
Application number: CN2011101847596A
Authority: CN
Inventors: 余喆
Original assignee: Individual
Current assignee: Individual
Priority date: 2011-07-04
Filing date: 2011-07-04
Publication date: 2013-01-09

Abstract

The invention discloses a method and a device for recognizing a natural speech, and relates to a speech recognition technology, so as to solve the problem of low speech recognition success ratio because of a keyword mode. The method comprises the steps as follows: obtaining pinyin corresponding to a speech message input by a user; carrying out word segmentation treatment on the pinyin by a dictionary set in advance to obtain a segmented word pinyin string; searching a word to be recognized corresponding to the word pinyin string from the dictionary, and searching a target information database to obtain the target information which has the highest matching degree with the word to be recognized according to the word to be recognized, wherein the dictionary is used for storing a target word to be recognized by speech and the pinyin corresponding to the target word. The technical scheme provided by the embodiment of the invention can be applied to information service systems for navigation, song requesting, linkman inquiry and the like.

Description

Natural speech recognition method and device

Technical field

The present invention relates to speech recognition technology, and in particular to a natural speech recognition method and device.

Background art

In the field of speech recognition, different languages call for different recognition techniques. In English, for example, every word in an utterance to be recognized is composed of letters from a 26-letter alphabet, so a speech recognition system only needs to identify the letters in the utterance to recover the text corresponding to the speech.

The greatest difference between Chinese and English is the much larger number of Chinese characters. The total number of Chinese characters now exceeds 80,000, of which about 3,500 are in common use. Faced with such a large character set, conventional speech recognition is keyword-based: the system must match the user's speech content, word by word from beginning to end, against text entries pre-stored in a vocabulary. Only when the speech content matches a stored text entry exactly can the system identify the meaning of the user's speech and recognition succeed; otherwise recognition fails.

In real life, however, language is expressed in many ways: different people, or the same person at different times, describe the same thing differently. For example, "mother" can be expressed with several different words, such as "old mother" and "mommy". To improve the success rate and accuracy of recognition, the vocabulary of the speech recognition system must store as many expressions of the same thing as possible. This makes the vocabulary very large and hard to maintain, and the large vocabulary in turn slows recognition. Moreover, because expressions vary widely and keep changing over time, the vocabulary can never exhaust all expressions of the same thing, so the success rate of keyword-based speech recognition is low.

Chinese patents with application numbers CN00130067.9, CN03123123.3 and CN03138149.9 also disclose technical solutions related to speech recognition. However, those solutions can only perform speech synthesis or convert speech into text; they cannot recognize the text information into which the speech is converted. Moreover, they were designed for English speech recognition. As the analysis above shows, English and Chinese differ greatly in the number of words and in language characteristics, so even if those solutions were applied to Chinese speech recognition they could not recognize speech effectively, and the success rate would be low. The Chinese patent with application number CN99813093.1 discloses an interactive user interface using speech recognition and natural language processing. Although it can recognize the text information converted from speech, it too was designed for English and must take factors such as grammar into account during recognition, so it still cannot be applied effectively to Chinese speech recognition.

Summary of the invention

To solve the above technical problem, embodiments of the invention provide a natural speech recognition method and device that can improve both the speed and the success rate of Chinese speech recognition.

A natural speech recognition method comprises: obtaining the pinyin corresponding to speech information input by a user; performing word segmentation on the pinyin using a preset dictionary to obtain segmented word pinyin strings; looking up in the dictionary the words to be recognized that correspond to the word pinyin strings; and searching a target information database according to the words to be recognized to obtain, from the target information database, the target information with the highest matching degree to the words to be recognized; wherein the dictionary stores the target words for speech recognition and the pinyin corresponding to each target word.

A natural speech recognition device comprises:

a first acquiring unit, configured to obtain the pinyin corresponding to speech information input by a user;

a word segmentation unit, configured to perform word segmentation on the pinyin obtained by the first acquiring unit using a preset dictionary, obtaining segmented word pinyin strings;

a second acquiring unit, configured to look up in the dictionary the words to be recognized that correspond to the word pinyin strings obtained by the word segmentation unit;

a search unit, configured to search a target information database according to the words to be recognized obtained by the second acquiring unit, and to obtain from the target information database the target information with the highest matching degree to the words to be recognized;

wherein the dictionary stores the target words for speech recognition and the pinyin corresponding to each target word.

With the natural speech recognition method and device provided by the embodiments of the invention, information is matched according to the words to be recognized that correspond to the word pinyin strings, and the entry in the target information database with the highest matching degree to those words is taken as the recognition result for the speech information. Target information can be obtained without an exact match of the whole utterance, which raises the recognition success rate and solves the prior-art problem that recognition by exact matching of the speech information fails whenever the wording differs. Because recognition is done by word matching, only the target words need to be stored in the dictionary and only standard information in the target information database; there is no need to store many differently worded texts for the same thing. The dictionary and the target information database are therefore small and quick to search, which raises recognition speed and solves the prior-art problem that a vocabulary holding many different expressions of the same thing is large, hard to search and slow. Unlike English speech recognition technology, this solution addresses the characteristics of Chinese, namely a large number of characters and words that run together in a sentence without pauses: it segments the sentence into words by pinyin and recognizes speech by looking up the segmented words to be recognized, giving a higher success rate and speed for Chinese speech recognition.

Description of drawings

To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a first flowchart of the natural speech recognition method provided by an embodiment of the invention;

Fig. 2 is a first flowchart of step 104 of the natural speech recognition method shown in Fig. 1;

Fig. 3 is a second flowchart of step 104 of the natural speech recognition method shown in Fig. 1;

Fig. 4 is a second flowchart of the natural speech recognition method provided by an embodiment of the invention;

Fig. 5 is a first structural diagram of the natural speech recognition device provided by an embodiment of the invention;

Fig. 6 is a second structural diagram of the natural speech recognition device provided by an embodiment of the invention;

Fig. 7 is a third structural diagram of the natural speech recognition device provided by an embodiment of the invention;

Fig. 8 is a fourth structural diagram of the natural speech recognition device provided by an embodiment of the invention;

Fig. 9 is a structural diagram of the search unit in the natural speech recognition device shown in Fig. 5;

Fig. 10 is a fifth structural diagram of the natural speech recognition device provided by an embodiment of the invention;

Fig. 11 is a sixth structural diagram of the natural speech recognition device provided by an embodiment of the invention;

Fig. 12 is a seventh structural diagram of the natural speech recognition device provided by an embodiment of the invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.

To solve the problem that keyword-based speech recognition has a low success rate, embodiments of the invention provide a natural speech recognition method and device.

As shown in Fig. 1, the natural speech recognition method provided by an embodiment of the invention comprises:

Step 101: obtain the pinyin corresponding to the speech information input by the user.

To widen the applicability of the method so that it can recognize speech from users of different regions and accents, step 101 in this embodiment may use speaker-independent speech recognition technology to parse the speech information input by the user and obtain the corresponding pinyin.

Step 102: perform word segmentation on the pinyin obtained in step 101 using a preset dictionary, obtaining segmented word pinyin strings.

The dictionary stores the target words for speech recognition and the pinyin corresponding to each target word.

In this embodiment, the target words stored in the dictionary may be words of broad scope: they may be collected from information encountered in daily life and work, for example by extracting words from each day's news reports, to form the dictionary. The target words may instead be words of narrow scope: they may be obtained from the information stored in the target information database to form the dictionary. The target information database stores the information to be recognized; for example, if the method is applied to car navigation, the target information database stores geographic location information and/or destination names. Note that, whether of broad or narrow scope, every target word in the dictionary is unique; no target word is repeated.

Speech recognition is generally used in a specific field, such as navigation, song requests or contact lookup. To reduce redundant target words in the dictionary, save storage space and speed up recognition, the embodiment preferably sets the target words in the dictionary to narrow-scope words derived from the target information database, although it is not limited to this arrangement. As those skilled in the art will appreciate, for each field in which this recognition technology is applied, practitioners in that field can set up the target information database appropriately according to the characteristics of the field.

In this embodiment, step 102 may search the dictionary with the pinyin obtained in step 101, matching the pinyin, in order of appearance, against the pinyin of the target words in the dictionary. When a word pinyin string is found that exactly matches the pinyin of a target word, that string is split off from the pinyin, and the search is repeated until the end of the pinyin is reached, which completes the word segmentation.

Step 103: look up in the dictionary the words to be recognized that correspond to the word pinyin strings obtained in step 102.
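Steps 102 and 103 can be pictured with a minimal Python sketch. It rests on several assumptions that are not in the source: the pinyin arrives as a list of syllables, the dictionary is a mapping from a tuple of syllables to the target words sharing that pinyin, the longest candidate is tried first, and a syllable that starts no target word is skipped. The names `DICTIONARY`, `segment_pinyin` and `lookup_words` and the sample entries are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical dictionary: pinyin syllable tuple -> target words with that pinyin.
DICTIONARY: Dict[Tuple[str, ...], List[str]] = {
    ("tian", "an", "men"): ["天安门"],
    ("guang", "chang"): ["广场"],
    ("qu",): ["去"],
}


def segment_pinyin(
    syllables: List[str],
    dictionary: Dict[Tuple[str, ...], List[str]],
) -> List[Tuple[str, ...]]:
    """Step 102 (sketch): split a syllable sequence into word pinyin strings."""
    max_len = max((len(key) for key in dictionary), default=0)
    segments: List[Tuple[str, ...]] = []
    i = 0
    while i < len(syllables):
        matched = False
        # Assumption: try the longest candidate first. The source only
        # requires an exact match with the pinyin of some target word.
        for length in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = tuple(syllables[i:i + length])
            if candidate in dictionary:
                segments.append(candidate)
                i += length
                matched = True
                break
        if not matched:
            i += 1  # assumption: skip a syllable that starts no target word
    return segments


def lookup_words(
    segments: List[Tuple[str, ...]],
    dictionary: Dict[Tuple[str, ...], List[str]],
) -> List[List[str]]:
    """Step 103 (sketch): map each word pinyin string to its candidate words."""
    return [dictionary[segment] for segment in segments]
```

Each segment maps to a list of words because one pinyin string may correspond to more than one target word.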

Step 104: search the target information database according to the words to be recognized, and obtain from the target information database the target information with the highest matching degree to the words to be recognized.

In this embodiment, step 104 may obtain the target information with the highest matching degree to the words to be recognized by either of two methods, which are introduced in turn below:

1. Weight coefficient judgment method

In this embodiment, the dictionary may also store, for each target word, a weight level n, together with the weight level range N, where n and N are integers, N ≥ 2 and n ∈ [1, N]. A target word of level n is more important in the text information than a target word of level n+1. The relationship between importance and weight level may of course be reversed, and those skilled in the art may define it as needed; this embodiment uses the former convention. In that case, before step 104 the method further comprises obtaining from the dictionary the weight level of each word to be recognized.

Specifically, the weight level range N of the words in the dictionary and the weight level n of each word may be set in advance. For example, the weight level range of the target words in the dictionary may be set to 3, where weight level 1 is the highest and weight level 3 the lowest, and the weight level of each target word is then set according to how exclusive and how widely known the word is. For instance, when the target word is a place name, its weight level is set to 3; when the target word is a non-geographic proper noun (such as "Little Fat Sheep"), its weight level is set to 1. Those skilled in the art may of course classify the target words into weight levels by other rules, which are not enumerated here. After step 102 has segmented the text information into words, the weight level attribute of each word is obtained from the dictionary.

In this case, as shown in Fig. 2, step 104 may comprise:

Step 1041: search the target information database according to the words to be recognized, and obtain from it the information set formed by the entries that match any one or more of the words to be recognized.

Step 1042: according to the weight levels of the words to be recognized, process each entry in the information set obtained in step 1041 to obtain a weight coefficient for each entry.

In this embodiment, step 1042 may use a weighted average algorithm to obtain the weight coefficient of each entry; other algorithms may also be used and are not enumerated here.

Step 1043: select from the information set obtained in step 1041 the entry with the highest weight coefficient as the target information.
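Steps 1041 to 1043 can be sketched as follows. The source names a weighted average algorithm without giving a formula, so the scoring here is an assumption: a word of level n in a range N gets the numeric weight N - n + 1, and the coefficient of an entry is the weight of the words it contains divided by the total weight of all words to be recognized. Substring containment as the matching test, the function names and the sample database are likewise hypothetical.

```python
from typing import Dict, List, Optional


def level_to_weight(level: int, level_range: int) -> int:
    """Level 1 is the most important, so it gets the largest numeric weight."""
    return level_range - level + 1


def pick_target(
    words: List[str],
    levels: Dict[str, int],
    level_range: int,
    database: List[str],
) -> Optional[str]:
    # Step 1041: keep the entries that match at least one word.
    candidates = [entry for entry in database if any(w in entry for w in words)]
    if not candidates:
        return None

    total = sum(level_to_weight(levels[w], level_range) for w in words)

    def coefficient(entry: str) -> float:
        # Step 1042: weighted share of the words that this entry matches.
        matched = sum(
            level_to_weight(levels[w], level_range) for w in words if w in entry
        )
        return matched / total

    # Step 1043: the entry with the highest coefficient is the target.
    return max(candidates, key=coefficient)


# Made-up database entries and weight levels, for illustration only.
database = ["小肥羊中关村店", "小肥羊王府井店", "中关村图书大厦"]
levels = {"小肥羊": 1, "中关村": 3}
best = pick_target(["中关村", "小肥羊"], levels, 3, database)
```

With these made-up inputs the entry that contains both words scores highest; when several entries tie, `max` keeps the first one in database order.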

Note that, to ensure the accuracy of the target information obtained in step 104 and improve recognition quality, the words to be recognized obtained in step 103 should in this embodiment include at least one word of weight level 1. If none of the words to be recognized has weight level 1, then before step 104 the method further comprises performing word segmentation again on the pinyin obtained in step 101 so as to obtain at least one word of weight level 1. Step 104 is then replaced by: searching the target information database according to the words to be recognized after re-segmentation, and obtaining from the target information database the target information with the highest matching degree to those words.

Further, the natural speech recognition method provided by the embodiment may also comprise adding to the dictionary at least one word of the highest weight level obtained by the re-segmentation, together with the pinyin string corresponding to that word.

Note that the division of weight levels given above is only one concrete example. In practice the ordering of the weight levels may be set by other rules; for example, when the weight level range is 3, weight level 3 may be set as the highest and weight level 1 as the lowest. Such variations can readily be conceived by those skilled in the art without creative effort and are not enumerated here.

2. Nested search method

As shown in Fig. 3, step 104 may comprise:

Step 1044: sort the words to be recognized obtained in step 103.

In this embodiment, step 1044 may sort the words in the order in which they appear in the text information. Preferably, to speed up the search, step 1044 first identifies the keywords among the words in the text information and then sorts the words in the order keyword, rear auxiliary word, front auxiliary word.

A keyword is a word with a proper, specific referent; a rear auxiliary word is a word located after the keyword in the text information, and a front auxiliary word is a word located before the keyword in the text information.

In this embodiment, a keyword table may be set in advance according to the information stored in the target information database. After the words to be recognized are obtained, each of them is looked up in the keyword table, and the words that match keywords stored in the table are taken as the keywords contained in the text information.

Note that if the lookup shows that the words to be recognized contain no keyword, step 1044 sorts them in their order of appearance. If the lookup shows that they contain two or more keywords, the rear auxiliary words are the non-keywords that follow the first keyword, and step 1044 still sorts in the order keyword, rear auxiliary word, front auxiliary word.

Note also that if, in step 103, a single word pinyin string matches two or more candidate words in the dictionary, step 1044 sorts those candidate words as a single set.

By sorting the words in the text information in the order keyword, rear auxiliary word, front auxiliary word, the embodiment lets the subsequent steps search and match in that order with the key information first, which can markedly shorten the time spent on word lookup and matching and speed up recognition.
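The ordering rule of step 1044 can be sketched in a few lines of Python. The function name `order_words` and the sample words are hypothetical, and the keyword table is assumed to be a plain set; the sketch keeps the original order of appearance inside each of the three groups.

```python
from typing import List, Set


def order_words(words: List[str], keywords: Set[str]) -> List[str]:
    """Sort words as: keywords, rear auxiliary words, front auxiliary words."""
    key_positions = [i for i, word in enumerate(words) if word in keywords]
    if not key_positions:
        return list(words)  # no keyword: keep the order of appearance

    first = key_positions[0]
    keys = [words[i] for i in key_positions]
    # Rear auxiliary words: non-keywords located after the first keyword.
    rear = [w for i, w in enumerate(words) if i > first and w not in keywords]
    # Front auxiliary words: everything located before the first keyword.
    front = [w for i, w in enumerate(words) if i < first]
    return keys + rear + front


# Made-up example: one keyword surrounded by auxiliary words.
ordered = order_words(["附近", "小肥羊", "中关村", "分店"], {"小肥羊"})
```

Placing the keywords first means the most selective lookup runs against the full database, and the later, less selective words only filter an already small candidate set.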

Step 1045: according to the ordering from step 1044, take the first word from the words to be recognized and obtain from the target information database the entries that match the first word.

Step 1046: take the second word from the words to be recognized and obtain, from the information set formed by the entries matching the first word, the entries that match the second word.

And so on, until step 1047: take the last word from the words to be recognized and obtain, from the information set formed by the entries matching the preceding word, the target information that matches the last word.

Note that in steps 1045 to 1047, if no entry matching the current word is found, the match result for the current word may be set to the entries matched by the preceding word. If the current word is the first word, its match result is all the information contained in the target information database.
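Steps 1045 to 1047, including the fallback rule for a word that matches nothing, can be sketched as a loop that keeps narrowing a candidate set. Substring containment as the matching test, the function name `nested_search` and the sample database are assumptions made for illustration.

```python
from typing import List


def nested_search(ordered_words: List[str], database: List[str]) -> List[str]:
    """Narrow the candidate entries word by word, in the given order."""
    # Before the first word, the whole database is the candidate set, which
    # also covers the case where the first word matches nothing.
    current = list(database)
    for word in ordered_words:
        narrowed = [entry for entry in current if word in entry]
        if narrowed:
            current = narrowed
        # Otherwise keep the result of the preceding word (fallback rule).
    return current


# Made-up database entries, for illustration only.
database = ["小肥羊中关村店", "小肥羊王府井店", "中关村图书大厦"]
result = nested_search(["小肥羊", "中关村", "附近"], database)
```

In this made-up run the first word keeps two entries, the second word narrows them to one, and the third word matches nothing, so the single entry from the second word is kept.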

To give those skilled in the art a deeper understanding of the nested search method described above, its implementation is illustrated below with a concrete example:

Either the weight coefficient judgment method or the nested search method described above can accurately find the target information with the highest matching degree to the words contained in the text information, thereby recognizing the speech information input by the user. In practice, other methods may also be used to obtain that target information; they are not enumerated here.

Further, if two or more pieces of target information are selected in step 104, then to improve recognition accuracy, as shown in Fig. 4, the method may further comprise after step 104:

Step 105: obtain a target information selection instruction sent by the user, or the user's speech recognition statistics.

Specifically, after step 104 the two or more selected pieces of target information may be displayed to the user, and step 105 receives the target information selection instruction that the user sends by voice, key press, text input or similar means.

Alternatively, the natural speech recognition method provided by the embodiment may keep statistics on each speech recognition performed by users. The statistics may cover a specific individual user or a specific group of users. They may be counts or frequencies of how often one or more pieces of target information have been recognized for a user, or statistics on the target information most recently recognized for multiple users, or other statistics related to speech recognition, which are not enumerated here.

Step 106: select the preferred target information from the two or more pieces of target information according to the target information selection instruction or the speech recognition statistics.
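Steps 105 and 106 can be sketched as below, assuming the statistics are simple per-target recognition counts held in a `collections.Counter` and that an explicit user choice, when present, is given as an index into the candidate list. The function name `choose_preferred` and the sample data are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def choose_preferred(
    candidates: List[str],
    history: Counter,
    user_choice: Optional[int] = None,
) -> str:
    """Pick one target from several equally good candidates."""
    if user_choice is not None:
        # Step 105, first option: the user selected a candidate explicitly.
        return candidates[user_choice]
    # Step 105, second option: fall back to recognition statistics.
    # A Counter returns 0 for targets that were never recognized before.
    return max(candidates, key=lambda target: history[target])


# Made-up recognition counts, for illustration only.
history = Counter({"Branch A": 3, "Branch B": 5})
preferred = choose_preferred(["Branch A", "Branch B"], history)
```

The explicit choice takes precedence over the statistics in this sketch; the source presents the two as alternatives without ranking them.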

Optionally, to further shorten recognition time and raise recognition speed, before step 104 the method in this embodiment may also comprise searching a spoken-word dictionary according to the words to be recognized and, according to the search result, deleting the spoken words from the words to be recognized. The spoken-word dictionary stores spoken words, that is, words that carry no substantive meaning relevant to the speech information input by the user.

In this embodiment, the spoken-word dictionary may be built in advance by statistical methods and may contain the filler words people use in everyday speech, for example "I think", "I want", "may I ask", "being", "right", "can" and "how". The spoken words in the dictionary are not enumerated further here.

Further, so that the method can serve users whose pronunciation is unclear or non-standard and achieve a higher success rate and accuracy, the method, on the basis of the technical solutions shown in Figs. 1 to 4, may also comprise performing fuzzy-sound matching on the pinyin obtained in step 101 to obtain fuzzy-matched pinyin. Step 102 then becomes: perform word segmentation on the fuzzy-matched pinyin using the preset dictionary, obtaining segmented word pinyin strings.

Specifically, a pinyin fuzzy matching table may be set in advance to define the matching rules, for example z=zh, c=ch, s=sh, l=n, f=h, r=l, an=ang, en=eng, in=ing, ian=iang, uan=uang and iong=ing (further rules are not enumerated here). The pinyin obtained in step 101 is fuzzy-matched according to these rules.

Fuzzy matching of the pinyin addresses recognition failures and errors caused by a user's lisp or inaccurate pronunciation, and thus further improves the success rate and accuracy of the natural speech recognition method provided by the embodiment.
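One way to apply such a fuzzy matching table is to reduce every syllable to a canonical form, so that syllables the table treats as equivalent compare equal. The sketch below is an assumption about how the rules could be applied, not the method of the source: it splits each syllable into an initial and a final, then maps each part to one representative of its equivalence class. The table contents follow the example rules in the text; all names are hypothetical.

```python
from typing import List, Tuple

# Equivalence classes from the example rules, mapped to one representative.
INITIAL_CLASSES = {"zh": "z", "ch": "c", "sh": "s", "n": "l", "r": "l", "h": "f"}
FINAL_CLASSES = {
    "ang": "an", "eng": "en", "ing": "in",
    "iang": "ian", "uang": "uan", "iong": "in",
}
# Two-letter initials come first so "zh" is not read as "z" plus "h...".
INITIALS = [
    "zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l", "g",
    "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w",
]


def split_syllable(syllable: str) -> Tuple[str, str]:
    """Split a pinyin syllable into (initial, final); initial may be empty."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable


def canonical(syllable: str) -> str:
    """Map a syllable to the representative of its fuzzy-sound class."""
    initial, final = split_syllable(syllable)
    return INITIAL_CLASSES.get(initial, initial) + FINAL_CLASSES.get(final, final)


def fuzzy_equal(first: List[str], second: List[str]) -> bool:
    """True if two syllable sequences match under the fuzzy rules."""
    return len(first) == len(second) and all(
        canonical(a) == canonical(b) for a, b in zip(first, second)
    )
```

If both the dictionary pinyin and the recognized pinyin are canonicalized in this way, the segmentation of step 102 can stay an exact-match lookup.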

With the natural speech recognition method provided by the embodiment of the invention, information is matched according to the words to be recognized that correspond to the word pinyin strings, and the entry in the target information database with the highest matching degree to those words is taken as the recognition result for the speech information. Target information can be obtained without an exact match of the whole utterance, which raises the recognition success rate and solves the prior-art problem that recognition by exact matching of the speech information fails whenever the wording differs. Because recognition is done by word matching, only the target words need to be stored in the dictionary and only standard information in the target information database; there is no need to store many differently worded texts for the same thing. The dictionary and the target information database are therefore small and quick to search, which raises recognition speed and solves the prior-art problem that a vocabulary holding many different expressions of the same thing is large, hard to search and slow. Unlike English speech recognition technology, this solution addresses the characteristics of Chinese, namely a large number of characters and words that run together in a sentence without pauses: it segments the sentence into words by pinyin and recognizes speech by looking up the segmented words to be recognized, giving a higher success rate and speed for Chinese speech recognition.

As shown in Fig. 5, an embodiment of the invention also provides a natural speech recognition device, comprising:

a first acquiring unit 501, configured to obtain the pinyin corresponding to speech information input by a user;

a word segmentation unit 502, configured to perform word segmentation on the pinyin obtained by the first acquiring unit 501 using a preset dictionary, obtaining segmented word pinyin strings;

a second acquiring unit 503, configured to look up in the dictionary the words to be recognized that correspond to the word pinyin strings obtained by the word segmentation unit 502;

a search unit 504, configured to search a target information database according to the words to be recognized obtained by the second acquiring unit 503, and to obtain from the target information database the target information with the highest matching degree to the words to be recognized.
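The four units of Fig. 5 form a simple pipeline. A minimal sketch, assuming each unit can be modelled as a callable and that the speech information arrives as raw bytes, might look like this; the class name, field names and type signatures are hypothetical and only mirror the unit numbering in the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class NaturalSpeechRecognizer:
    # First acquiring unit 501: speech information -> pinyin syllables.
    acquire_pinyin: Callable[[bytes], List[str]]
    # Word segmentation unit 502: pinyin -> word pinyin strings.
    segment: Callable[[List[str]], List[Tuple[str, ...]]]
    # Second acquiring unit 503: word pinyin strings -> words to be recognized.
    lookup_words: Callable[[List[Tuple[str, ...]]], List[str]]
    # Search unit 504: words to be recognized -> best-matching target information.
    search: Callable[[List[str]], str]

    def recognize(self, audio: bytes) -> str:
        """Run the units in the order 501 -> 502 -> 503 -> 504."""
        pinyin = self.acquire_pinyin(audio)
        segments = self.segment(pinyin)
        words = self.lookup_words(segments)
        return self.search(words)
```

Keeping each unit behind a callable lets the optional units described later, such as re-segmentation or a different search strategy, be swapped in without changing the pipeline itself.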

Further, as shown in Figure 6, the natural speech recognition device provided by the embodiment of the invention may also comprise:

A third acquiring unit 505, configured to, if the dictionary is also used to store the weight level n corresponding to each target word and the weight level range N, obtain from the dictionary the weight levels corresponding to the words to be recognized obtained by the second acquiring unit 503, where n and N are integers, N ≥ 2, n ∈ [1, N], and a target word of level n is more important in the text information than a target word of level n+1. Of course, the relationship between importance and weight level n may also be reversed; those skilled in the art may define it as needed, and this embodiment uses the former as its example;

In that case, the search unit 504 may comprise:

A search subunit 5041, configured to search the target information database according to the words to be recognized obtained by the second acquiring unit 503, and to obtain from the target information database an information set formed by the information matching any one or more of the words to be recognized;

A first obtaining subunit 5042, configured to process each piece of information in the information set obtained by the search subunit 5041 according to the weight levels obtained by the third acquiring unit 505 for the words to be recognized, to obtain a weight coefficient for each piece of information;

A second obtaining subunit 5043, configured to select, from the information set obtained by the search subunit 5041, the information with the highest weight coefficient obtained by the first obtaining subunit 5042 as the target information.
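A possible reading of the three subunits is sketched below. The text above does not fix how the weight coefficient is computed, so the scoring rule used here (a matched word of level n contributes N - n + 1) is purely an assumption, as are the sample data and the names `candidate_set`, `weight_coefficient`, and `pick_target`.

```python
N = 3  # assumed weight level range

# Hypothetical weight levels of the words to be recognized (1 = most important).
WORD_LEVELS = {"北京": 1, "大学": 2, "附近": 3}

TARGET_DB = ["北京大学", "北京医院", "清华大学", "上海图书馆"]


def candidate_set(words, database):
    """Search subunit 5041: entries matching any one or more of the words."""
    return [info for info in database if any(w in info for w in words)]


def weight_coefficient(info, word_levels, n_levels):
    """First obtaining subunit 5042, with an assumed scoring rule:
    each matched word of level n contributes n_levels - n + 1."""
    return sum(n_levels - n + 1 for w, n in word_levels.items() if w in info)


def pick_target(word_levels, database, n_levels):
    """Second obtaining subunit 5043: entry with the highest weight coefficient."""
    candidates = candidate_set(word_levels, database)
    if not candidates:
        return None
    return max(candidates,
               key=lambda info: weight_coefficient(info, word_levels, n_levels))


print(pick_target(WORD_LEVELS, TARGET_DB, N))
```

With this rule an entry that matches a level-1 word always outranks one that matches only a level-2 word, which is consistent with level n being more important than level n+1.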

Further, as shown in Figure 7, the natural speech recognition device provided by the embodiment of the invention may also comprise:

A re-segmentation unit 506, configured to, if none of the words to be recognized obtained by the second acquiring unit 503 has weight level 1, perform word segmentation again on the pinyin obtained by the first acquiring unit 501, so as to obtain at least one word of weight level 1;

The search unit 504 may also be configured to search the target information database according to the words to be recognized after re-segmentation by the re-segmentation unit 506, and to obtain from the target information database the target information with the highest degree of matching to the words to be recognized.

Further, as shown in Figure 8, the natural speech recognition device provided by the embodiment of the invention may also comprise:

An updating unit 507, configured to add to the dictionary the at least one word of weight level 1 obtained by the re-segmentation unit 506, together with the pinyin string corresponding to that word.
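The interplay of the re-segmentation unit 506 and the updating unit 507 might look like the sketch below. The text above does not say where a new level-1 word comes from during re-segmentation; the fallback lexicon `FALLBACK` (for example, names taken from the target information database) is an assumption, as are the simplified span lookup and every name in the code.

```python
# Hypothetical dictionary: pinyin -> (word, weight level).
dictionary = {"fu jin": ("附近", 3), "de": ("的", 3)}

# Hypothetical fallback lexicon consulted only during re-segmentation.
FALLBACK = {"tian an men": "天安门"}


def spans(syllables):
    """Yield every contiguous syllable span, longest first."""
    n = len(syllables)
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            yield " ".join(syllables[start:start + length])


def lookup(syllables, dictionary):
    """Simplified lookup: entries whose pinyin occurs as a span of the input."""
    return [dictionary[s] for s in spans(syllables) if s in dictionary]


def resegment_and_update(syllables, dictionary, fallback):
    words = lookup(syllables, dictionary)
    if any(level == 1 for _, level in words):
        return words                  # a level-1 word already exists
    for span in spans(syllables):     # re-segmentation unit 506 (assumed strategy)
        if span in fallback:
            entry = (fallback[span], 1)
            dictionary[span] = entry  # updating unit 507 stores word and pinyin
            words.append(entry)
    return words


print(resegment_and_update("tian an men fu jin".split(), dictionary, FALLBACK))
```

After the call, the dictionary contains the new level-1 word, so a later request with the same pinyin no longer needs re-segmentation.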

Further, as shown in Figure 9, the search unit 504 may also comprise:

A sorting subunit 5044, configured to sort the words to be recognized;

A third obtaining subunit 5045, configured to take the first word from the words to be recognized according to the order produced by the sorting subunit 5044, and to obtain from the target information database the information matching the first word;

A fourth obtaining subunit 5046, configured to take the second word from the words to be recognized, and to obtain the information matching the second word from the information set formed by the information matching the first word;

And so on, up to a fifth obtaining subunit 5047, configured to take the last word from the words to be recognized, and to obtain the target information matching the last word from the information set formed by the information matching the word immediately preceding the last word.
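The step-by-step narrowing performed by subunits 5044 to 5047 can be sketched as a loop in which each word filters the result of the previous word. The sort order used here (keyword, then the words after it, then the words before it) follows the order stated in claim 6; the sample data and the names `sort_words` and `narrow` are assumptions, and the keyword is taken as given rather than detected.

```python
def sort_words(words, keyword):
    """Sorting subunit 5044: keyword, then words after it, then words before it."""
    k = words.index(keyword)
    return [keyword] + words[k + 1:] + words[:k]


def narrow(sorted_words, database):
    """Subunits 5045 to 5047: each word searches only the previous result set."""
    candidates = list(database)
    for word in sorted_words:
        candidates = [info for info in candidates if word in info]
    return candidates


# Hypothetical target information database.
TARGET_DB = ["海淀区北京大学东门", "海淀区北京大学西门", "朝阳区北京医院东门"]

ordered = sort_words(["海淀区", "北京大学", "东门"], keyword="北京大学")
print(ordered)
print(narrow(ordered, TARGET_DB))
```

Searching with the keyword first shrinks the candidate set as early as possible, so the later words are compared against far fewer entries than the full database.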

Further, as shown in Figure 10, the natural speech recognition device provided by the embodiment of the invention may also comprise:

A deletion unit 508, configured to look up a spoken-word dictionary according to the words to be recognized obtained by the second acquiring unit 503 and, according to the lookup result, delete the spoken words from the words to be recognized, where the spoken-word dictionary stores spoken words, and the spoken words do not include text information that carries substantive meaning in the speech information input by the user.
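A minimal sketch of the deletion unit 508 is a set-membership filter. The contents of the spoken-word dictionary below and the name `drop_spoken` are assumptions chosen for the example.

```python
# Hypothetical spoken-word dictionary: filler words without substantive meaning.
SPOKEN_WORDS = {"我", "要", "去", "请", "帮", "一下"}


def drop_spoken(words, spoken=SPOKEN_WORDS):
    """Deletion unit 508: keep only words not found in the spoken-word dictionary."""
    return [w for w in words if w not in spoken]


print(drop_spoken(["我", "要", "去", "北京", "大学"]))
```

Removing these words before the database search keeps them from influencing the matching degree.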

Further, as shown in Figure 11, the natural speech recognition device provided by the embodiment of the invention may also comprise:

A fourth acquiring unit 509, configured to, if the search unit 504 finds two or more pieces of target information, obtain a target information selection instruction sent by the user or the user's speech recognition statistics;

A selection unit 5010, configured to select preferred target information from the two or more pieces of target information found by the search unit 504, according to the target information selection instruction or the speech recognition statistics obtained by the fourth acquiring unit 509.
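One way to picture the selection unit 5010 is shown below. It assumes that the selection instruction is an index into the list of candidates and that the speech recognition statistics are a list of the user's previously recognized targets; both representations, and the name `choose_preferred`, are hypothetical.

```python
from collections import Counter


def choose_preferred(targets, user_choice=None, history=None):
    """Selection unit 5010: prefer an explicit user choice, else usage statistics."""
    if len(targets) < 2:
        return targets[0] if targets else None
    if user_choice is not None and 0 <= user_choice < len(targets):
        return targets[user_choice]            # selection instruction from the user
    if history:
        counts = Counter(history)              # assumed form of the statistics
        return max(targets, key=lambda t: counts[t])
    return None                                # nothing to decide with


found = ["北京大学", "北京医院"]
print(choose_preferred(found, user_choice=1))
print(choose_preferred(found, history=["北京医院", "北京大学", "北京大学"]))
```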

Further, as shown in Figure 12, the natural speech recognition device provided by the embodiment of the invention may also comprise:

A fuzzy processing unit 5011, configured to perform fuzzy-sound matching on the pinyin obtained by the first acquiring unit 501, to obtain the fuzzy-matched pinyin;

The word segmentation processing unit 502 may also be configured to perform word segmentation, using the preset dictionary, on the fuzzy-matched pinyin obtained by the fuzzy processing unit 5011, to obtain the segmented word pinyin strings.
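Fuzzy-sound matching could be approximated by mapping commonly confused initials and finals to one canonical form before the dictionary is consulted. The confusion pairs below (zh/z, ch/c, sh/s, ang/an, eng/en, ing/in) are an assumption and are not listed in the text above; the names `normalize` and `fuzzy_key` are likewise hypothetical.

```python
# Assumed confusion pairs: (fuzzy form, canonical form).
INITIALS = (("zh", "z"), ("ch", "c"), ("sh", "s"))
FINALS = (("ang", "an"), ("eng", "en"), ("ing", "in"))


def normalize(syllable):
    """Map one pinyin syllable to its canonical fuzzy form."""
    for fuzzy, canon in INITIALS:
        if syllable.startswith(fuzzy):
            syllable = canon + syllable[len(fuzzy):]
            break
    for fuzzy, canon in FINALS:
        if syllable.endswith(fuzzy):
            syllable = syllable[:-len(fuzzy)] + canon
            break
    return syllable


def fuzzy_key(pinyin):
    """Fuzzy processing unit 5011: normalize every syllable of a pinyin string."""
    return " ".join(normalize(s) for s in pinyin.split())


# The dictionary is keyed by normalized pinyin so both spellings hit the same entry.
FUZZY_DICT = {fuzzy_key(p): w for p, w in {"shang hai": "上海"}.items()}

print(FUZZY_DICT.get(fuzzy_key("sang hai")))
```

Normalizing both the dictionary keys and the input means the segmentation step itself does not need to change; it simply runs on the fuzzy-matched pinyin.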

For the specific implementation of the natural speech recognition device provided by the embodiment of the invention, refer to the description of the natural speech recognition method provided by the embodiment of the invention; it is not repeated here.

The natural speech recognition device provided by the embodiment of the invention performs information matching according to the words to be recognized that correspond to the word pinyin strings, and takes the information in the target information database that has the highest degree of matching to those words as the recognition result for the speech information. The target information can be obtained without an exact match of the whole speech information, which improves the success rate of speech recognition. This solves the problem in the prior art that recognition by exact matching of the speech information fails whenever the user's wording differs from the stored text, so that the recognition success rate is low. Because the technical scheme provided by the embodiment of the invention recognizes speech by word matching, only the target words need to be stored in the dictionary and only standard information in the target information database; there is no need to store many differently worded text entries for the same thing. The dictionary and the target information database are therefore smaller and easier to search, which in turn improves recognition speed. This solves the problem in the prior art that many differently worded text entries for the same thing must be stored in the vocabulary, which makes the vocabulary large, hard to search, and slow to use for speech recognition. The technical scheme provided by the embodiment of the invention also differs from English speech recognition technology: it addresses the characteristics of Chinese, namely a large character set and sentences in which words run together without pauses, by segmenting the words of a sentence according to pinyin and recognizing speech by looking up the segmented words to be recognized, and thus achieves a higher success rate and recognition speed for Chinese speech recognition.

The natural speech recognition method and device provided by the embodiments of the invention can be applied in information service systems such as navigation, song requesting, and contact lookup.

The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims

1. A natural speech recognition method, characterized by comprising:

Obtaining the pinyin corresponding to the speech information input by a user;

Performing word segmentation on the pinyin using a preset dictionary, to obtain the segmented word pinyin strings;

Looking up, in the dictionary, the words to be recognized corresponding to the word pinyin strings;

Searching a target information database according to the words to be recognized, and obtaining from the target information database the target information with the highest degree of matching to the words to be recognized;

2. The method according to claim 1, characterized in that the method further comprises:

If the dictionary is also used to store the weight level n corresponding to each target word and the weight level range N, obtaining from the dictionary the weight levels corresponding to the words to be recognized, where n and N are integers, N ≥ 2, n ∈ [1, N], and a target word of level n is more important in the text information than a target word of level n+1;

In that case, searching the target information database according to the words to be recognized and obtaining from the target information database the target information with the highest degree of matching to the words to be recognized comprises:

Searching the target information database according to the words to be recognized, and obtaining from the target information database an information set formed by the information matching any one or more of the words to be recognized;

Processing each piece of information in the information set according to the weight levels corresponding to the words to be recognized, to obtain a weight coefficient for each piece of information;

Selecting from the information set the information with the highest weight coefficient as the target information.

3. The method according to claim 2, characterized in that the method further comprises:

If none of the words to be recognized has weight level 1, performing word segmentation on the pinyin again, so as to obtain at least one word of weight level 1;

In that case, searching the target information database according to the words to be recognized and obtaining from the target information database the target information with the highest degree of matching to the words to be recognized is:

Searching the target information database according to the words to be recognized after re-segmentation, and obtaining from the target information database the target information with the highest degree of matching to the words to be recognized.

4. The method according to claim 3, characterized in that the method further comprises:

Adding the at least one word of weight level 1 and the pinyin string corresponding to that word to the dictionary.

5. The method according to claim 1, characterized in that searching the target information database according to the words to be recognized and obtaining from the target information database the target information with the highest degree of matching to the words to be recognized comprises:

Sorting the words to be recognized;

According to the result of the sorting, taking the first word from the words to be recognized, and obtaining from the target information database the information matching the first word;

Taking the second word from the words to be recognized, and obtaining the information matching the second word from the information set formed by the information matching the first word;

And so on, until taking the last word from the words to be recognized, and obtaining the target information matching the last word from the information set formed by the information matching the word immediately preceding the last word.

6. The method according to claim 5, characterized in that sorting the words to be recognized comprises:

Obtaining the keyword among the words to be recognized;

Sorting the words to be recognized in the order of keyword, rear auxiliary words, and front auxiliary words;

Wherein the rear auxiliary words are the words positioned after the keyword among the words to be recognized, and the front auxiliary words are the words positioned before the keyword among the words to be recognized.

7. The method according to claim 6, characterized in that, if the words to be recognized include two or more keywords, the rear auxiliary words are the non-keyword words that follow the first keyword among the words to be recognized.

8. The method according to claim 1, characterized in that the method further comprises:

Looking up a spoken-word dictionary according to the words to be recognized and, according to the lookup result, deleting the spoken words from the words to be recognized, where the spoken-word dictionary stores spoken words, and the spoken words do not include text information that carries substantive meaning in the speech information input by the user.

9. The method according to claim 1, characterized in that the method further comprises:

If two or more pieces of target information are found, obtaining a target information selection instruction sent by the user or the user's speech recognition statistics;

Selecting preferred target information from the two or more pieces of target information according to the target information selection instruction or the speech recognition statistics.

10. The method according to any one of claims 1-9, characterized in that the method further comprises:

Performing fuzzy-sound matching on the pinyin, to obtain the fuzzy-matched pinyin;

In that case, performing word segmentation on the pinyin using the preset dictionary to obtain the segmented word pinyin strings is:

Performing word segmentation on the fuzzy-matched pinyin using the preset dictionary, to obtain the segmented word pinyin strings.

11. A natural speech recognition device, characterized by comprising:

12. The device according to claim 11, characterized in that the device further comprises:

A third acquiring unit, configured to, if the dictionary is also used to store the weight level n corresponding to each target word and the weight level range N, obtain from the dictionary the weight levels corresponding to the words to be recognized obtained by the second acquiring unit, where n and N are integers, N ≥ 2, n ∈ [1, N], and a target word of level n is more important in the text information than a target word of level n+1;

In that case, the search unit comprises:

A search subunit, configured to search the target information database according to the words to be recognized obtained by the second acquiring unit, and to obtain from the target information database an information set formed by the information matching any one or more of the words to be recognized;

A first obtaining subunit, configured to process each piece of information in the information set obtained by the search subunit according to the weight levels obtained by the third acquiring unit for the words to be recognized, to obtain a weight coefficient for each piece of information;

A second obtaining subunit, configured to select, from the information set obtained by the search subunit, the information with the highest weight coefficient obtained by the first obtaining subunit as the target information.

13. The device according to claim 12, characterized in that the device further comprises:

A re-segmentation unit, configured to, if none of the words to be recognized obtained by the second acquiring unit has weight level 1, perform word segmentation again on the pinyin obtained by the first acquiring unit, so as to obtain at least one word of weight level 1;

The search unit is also configured to search the target information database according to the words to be recognized after re-segmentation by the re-segmentation unit, and to obtain from the target information database the target information with the highest degree of matching to the words to be recognized.

14. The device according to claim 13, characterized in that the device further comprises:

An updating unit, configured to add to the dictionary the at least one word of weight level 1 obtained by the re-segmentation unit, together with the pinyin string corresponding to that word.

15. The device according to claim 11, characterized in that the search unit further comprises:

A sorting subunit, configured to sort the words to be recognized;

A third obtaining subunit, configured to take the first word from the words to be recognized according to the order produced by the sorting subunit, and to obtain from the target information database the information matching the first word;

A fourth obtaining subunit, configured to take the second word from the words to be recognized, and to obtain the information matching the second word from the information set formed by the information matching the first word;

And so on, up to a fifth obtaining subunit, configured to take the last word from the words to be recognized, and to obtain the target information matching the last word from the information set formed by the information matching the word immediately preceding the last word.

16. The device according to claim 11, characterized in that the device further comprises:

A deletion unit, configured to look up a spoken-word dictionary according to the words to be recognized obtained by the second acquiring unit and, according to the lookup result, delete the spoken words from the words to be recognized, where the spoken-word dictionary stores spoken words, and the spoken words do not include text information that carries substantive meaning in the speech information input by the user.

17. The device according to claim 11, characterized in that the device further comprises:

A fourth acquiring unit, configured to, if the search unit finds two or more pieces of target information, obtain a target information selection instruction sent by the user or the user's speech recognition statistics;

A selection unit, configured to select preferred target information from the two or more pieces of target information found by the search unit, according to the target information selection instruction or the speech recognition statistics obtained by the fourth acquiring unit.

18. The device according to any one of claims 11-17, characterized in that the device further comprises:

A fuzzy processing unit, configured to perform fuzzy-sound matching on the pinyin obtained by the first acquiring unit, to obtain the fuzzy-matched pinyin;

The word segmentation processing unit is also configured to perform word segmentation, using the preset dictionary, on the fuzzy-matched pinyin obtained by the fuzzy processing unit, to obtain the segmented word pinyin strings.