Summary of the Invention
The purpose of the application is to overcome the deficiencies of the prior art by providing a voice control method and device based on offline voiceprint recognition and speech recognition, which achieve voice-based control of electronic equipment offline and reduce, as far as possible, the influence of external conditions on speech recognition.
To achieve the above purpose, the application first proposes a voice control method based on offline voiceprint recognition and speech recognition, comprising the following steps: receiving wake word speech, and extracting a first speech feature and a first voiceprint feature of the wake word speech; checking whether the extracted first speech feature and first voiceprint feature match the wake word speech template and the voiceprint template respectively, terminating if either does not match, and otherwise obtaining a first voiceprint code corresponding to the first voiceprint feature; receiving command word speech, and extracting a second voiceprint feature of the command word speech; checking whether the second voiceprint feature matches the voiceprint template, terminating if it does not match, and otherwise obtaining a second voiceprint code corresponding to the second voiceprint feature; checking whether the first voiceprint code and the second voiceprint code are identical, terminating if they differ, and otherwise extracting a second speech feature of the command word speech; checking whether the extracted second speech feature matches the command word speech template, terminating if it does not match, and otherwise obtaining a speech code of the second speech feature and generating a corresponding control instruction based on that speech code. The wake word speech template, the command word speech template, and the voiceprint template are stored locally.
In a preferred embodiment of the above method, the wake word speech template and the command word speech template are generated by training on speech collected in advance.
In a preferred embodiment of the above method, the voiceprint template is generated by training on speech of at least one user collected in advance.
In a preferred embodiment of the above method, the correspondence between speech and speech codes is user-defined.
Further, in the above preferred embodiment, the correspondence between speech and speech codes is stored locally.
In a preferred embodiment of the above method, the wake word speech template and the command word speech template are produced by training on collected speech that is dynamically updated.
In a preferred embodiment of the above method, the voiceprint template is produced by training on dynamically updated speech of at least one designated person.
Secondly, the application also proposes a voice control device based on offline voiceprint recognition and speech recognition, comprising the following modules: a first receiving module, for receiving wake word speech and extracting a first speech feature and a first voiceprint feature of the wake word speech; a first checking module, for checking whether the extracted first speech feature and first voiceprint feature match the wake word speech template and the voiceprint template respectively, terminating if either does not match, and otherwise obtaining a first voiceprint code corresponding to the first voiceprint feature; a second receiving module, for receiving command word speech and extracting a second voiceprint feature of the command word speech; a second checking module, for checking whether the second voiceprint feature matches the voiceprint template, terminating if it does not match, and otherwise obtaining a second voiceprint code corresponding to the second voiceprint feature; a third checking module, for checking whether the first voiceprint code and the second voiceprint code are identical, terminating if they differ, and otherwise extracting a second speech feature of the command word speech; and an instruction generation module, for checking whether the extracted second speech feature matches the command word speech template, terminating if it does not match, and otherwise obtaining a speech code of the second speech feature and generating a corresponding control instruction based on that speech code. The wake word speech template, the command word speech template, and the voiceprint template are stored locally.
In a preferred embodiment of the above device, the wake word speech template and the command word speech template are generated by training on speech collected in advance.
In a preferred embodiment of the above device, the voiceprint template is generated by training on speech of at least one user collected in advance.
In a preferred embodiment of the above device, the correspondence between speech and speech codes is user-defined.
Further, in the above preferred embodiment, the correspondence between speech and speech codes is stored locally.
In a preferred embodiment of the above device, the wake word speech template and the command word speech template are produced by training on collected speech that is dynamically updated.
In a preferred embodiment of the above device, the voiceprint template is produced by training on dynamically updated speech of at least one designated person.
Finally, the application also discloses a computer-readable storage medium on which computer instructions are stored; when executed by a processor, the instructions implement the steps of any of the foregoing methods.
The beneficial effects of the application are as follows: by means of the local speech templates and voiceprint template, the identity of the speaker and the content of the speech are easily confirmed, which improves the ease of use of voice-controlled electronic equipment.
Detailed Description
The concept, specific structure, and technical effects of the application are described clearly and completely below with reference to the embodiments and the drawings, so that the purpose, solutions, and effects of the application can be fully understood. It should be noted that, where no conflict arises, the embodiments of the application and the features of those embodiments may be combined with one another. The same reference numerals used throughout the drawings denote the same or similar parts.
Herein, unless expressly stated otherwise, wake word speech refers to sound uttered by a user who has access rights to the electronic equipment, and it serves both to verify the user's identity and to start the control flow of the electronic equipment. Only when the wake word speech satisfies certain conditions will the relevant equipment go on to receive other voice instructions. Correspondingly, command word speech refers to speech with a specific practical meaning that the user utters, after the relevant wake word speech has been confirmed, in order to issue an instruction to the electronic equipment.
Fig. 1 shows a flowchart of one embodiment of the voice control method based on offline voiceprint recognition and speech recognition. The method comprises the following steps: receiving wake word speech, and extracting a first speech feature and a first voiceprint feature of the wake word speech; checking whether the extracted first speech feature and first voiceprint feature match the wake word speech template and the voiceprint template respectively, terminating if either does not match, and otherwise obtaining a first voiceprint code corresponding to the first voiceprint feature; receiving command word speech, and extracting a second voiceprint feature of the command word speech; checking whether the second voiceprint feature matches the voiceprint template, terminating if it does not match, and otherwise obtaining a second voiceprint code corresponding to the second voiceprint feature; checking whether the first voiceprint code and the second voiceprint code are identical, terminating if they differ, and otherwise extracting a second speech feature of the command word speech; checking whether the extracted second speech feature matches the command word speech template, terminating if it does not match, and otherwise obtaining a speech code of the second speech feature and generating a corresponding control instruction based on that speech code. As shown in the schematic diagram of Fig. 2, the wake word speech template, the command word speech template, and the voiceprint template are stored locally. Whenever a mismatch occurs (whether a speech feature or a voiceprint feature fails to match the corresponding locally stored template), the method flow is forcibly terminated and returns to the stage of waiting for the user to input the wake word speech again.
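The sequence of checks in the method can be sketched as follows. This is only an illustrative sketch, not the implementation of the application: the names LocalTemplates, control, speech_of, and voiceprint_of are hypothetical, the templates are reduced to plain lookup tables, and feature extraction is passed in as callables because the application leaves the algorithms open.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Set


@dataclass
class LocalTemplates:
    """Hypothetical stand-ins for the locally stored templates."""
    wake_words: Set[Any]             # wake word speech template
    commands: Dict[Any, int]         # command word speech template: speech feature -> speech code
    voiceprints: Dict[Any, str]      # voiceprint template: voiceprint feature -> voiceprint code
    instructions: Dict[int, str]     # speech code -> control instruction


def control(wake: Any, command: Any, t: LocalTemplates,
            speech_of: Callable[[Any], Any],
            voiceprint_of: Callable[[Any], Any]) -> Optional[str]:
    """Return a control instruction, or None as soon as any check fails."""
    # Wake word: the speech feature and the voiceprint feature must both match.
    if speech_of(wake) not in t.wake_words:
        return None
    first_code = t.voiceprints.get(voiceprint_of(wake))
    if first_code is None:
        return None
    # Command word: the voiceprint must match the voiceprint template.
    second_code = t.voiceprints.get(voiceprint_of(command))
    if second_code is None:
        return None
    # Both utterances must come from the same speaker.
    if first_code != second_code:
        return None
    # Only now is the command's speech feature extracted and matched.
    speech_code = t.commands.get(speech_of(command))
    if speech_code is None:
        return None
    return t.instructions.get(speech_code)
```

In this toy form an utterance can be modelled as a (speaker, text) pair, with speech_of returning the text and voiceprint_of returning the speaker; a None result corresponds to returning to the wait-for-wake-word stage.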
The first voiceprint feature and the second voiceprint feature are both spectrogram feature parameters of the voice, formed from physical quantities of the collected speech (such as timbre, duration, intensity, and pitch) on the basis of the stability of the human voice. Further, in one embodiment of the application, the voiceprint template is formed by extracting the voiceprint features of multiple users who have access rights to the electronic equipment, and by grouping and numbering those voiceprint features according to each user's access rights. The voiceprint features may be obtained by analyzing the user's voice with algorithms conventional in the art, and the application places no limitation on this.
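As one possible illustration of matching a voiceprint feature against a template whose entries are numbered by access group, the sketch below uses cosine similarity with a fixed threshold. The similarity measure, the threshold of 0.9, the code format such as "A-01", and the function names are all assumptions made for illustration; the application itself does not prescribe a matching algorithm.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def voiceprint_code(feature: List[float],
                    template: Dict[str, List[float]],
                    threshold: float = 0.9) -> Optional[str]:
    """Return the code of the best-matching enrolled voiceprint, or None."""
    best_code, best_score = None, threshold
    for code, reference in template.items():
        score = cosine(feature, reference)
        if score >= best_score:
            best_code, best_score = code, score
    return best_code
```

Here the dictionary keys play the role of voiceprint codes, so two utterances from the same enrolled user resolve to the same code and can be compared directly.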
Similarly, the first speech feature and the second speech feature are characteristic parameters formed from the collected speech according to the words, phonemes, tones, and so on of a specific language. These characteristic parameters are matched against the characteristic parameters of the multiple groups of speech labeled with specific meanings in the speech templates (namely the wake word speech template and the command word speech template) in order to determine the specific meaning of the speech uttered by the user. The speech features may likewise be obtained by analyzing the user's voice with algorithms conventional in the art, and the application places no limitation on this.
To reduce the computational load of the system, in one embodiment of the application only the first speech feature is extracted after the wake word speech is received. The first voiceprint feature of the wake word speech is extracted only when the first speech feature matches one of the multiple groups of characteristic parameters recorded in the wake word speech template; if the first speech feature matches none of the characteristic parameters in the wake word speech template, the user is prompted to utter the wake word again so that matching can be retried. The relevant matching decisions (such as matching the first speech feature against the wake word speech template, matching the first voiceprint feature against the voiceprint template, and matching the second speech feature against the command word speech template) may be implemented with matching algorithms conventional in the art, and the application places no limitation on this.
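The deferred extraction described here can be sketched as below. The function name check_wake_word and the extractor callables are hypothetical, and the templates are again reduced to a set and a dictionary purely for illustration.

```python
from typing import Any, Callable, Dict, Optional, Set


def check_wake_word(audio: Any,
                    wake_template: Set[Any],
                    voiceprint_template: Dict[Any, str],
                    extract_speech: Callable[[Any], Any],
                    extract_voiceprint: Callable[[Any], Any]) -> Optional[str]:
    """Return the first voiceprint code, or None if the wake word is rejected."""
    # Speech-feature extraction always runs first.
    speech_feature = extract_speech(audio)
    if speech_feature not in wake_template:
        # No voiceprint extraction is performed; the caller would prompt
        # the user to utter the wake word again.
        return None
    # Voiceprint extraction runs only after the speech feature matched.
    voiceprint_feature = extract_voiceprint(audio)
    return voiceprint_template.get(voiceprint_feature)
```

The saving comes from skipping voiceprint extraction entirely for audio that is not the wake word, which is presumably the common case for a device that listens continuously.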
In one embodiment of the application, the wake word speech template and the command word speech template are generated by training on speech collected in advance. Specifically, the user may input the wake word speech and the command word speech multiple times in advance, and the wake word speech template and the command word speech template are refined through training, thereby improving the accuracy of speech recognition.
Similarly, in one embodiment of the application, the voiceprint template is generated by training on speech of at least one user collected in advance. Correspondingly, one or several users with access rights to the electronic equipment input the wake word speech and the command word speech multiple times in advance, and the voiceprint template is refined through training, thereby improving the accuracy of voiceprint recognition.
With reference to the schematic diagram in Fig. 3 of the user-defined correspondence between speech and speech codes, in one embodiment of the application the correspondence between speech and speech codes can be set freely according to the actual electronic equipment and the language the user uses. Because the user can define the correspondence between speech and speech codes, the specific instruction issued to the electronic equipment is independent of the particular language in which the user utters the command word speech. For example, by modifying the characteristic parameters of the speech labeled with specific meanings in the command word speech template, speech in English or in Chinese can be associated with a designated speech code, so that command word speech uttered in English or in Chinese is received and converted into the corresponding control instruction.
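A minimal sketch of such a user-defined correspondence is given below. The phrases, the code values 0x01 and 0x02, and the instruction names are invented for illustration, and recognized text stands in for the speech feature that a real system would match.

```python
from typing import Dict, Optional

# Hypothetical user-defined table: phrases in either language share one speech code.
SPEECH_TO_CODE: Dict[str, int] = {
    "turn on the light": 0x01,
    "开灯": 0x01,
    "turn off the light": 0x02,
    "关灯": 0x02,
}

# Hypothetical table from speech code to control instruction.
CODE_TO_INSTRUCTION: Dict[int, str] = {
    0x01: "LIGHT_ON",
    0x02: "LIGHT_OFF",
}


def instruction_for(recognized: str) -> Optional[str]:
    """Map recognized command speech to a control instruction via its speech code."""
    code = SPEECH_TO_CODE.get(recognized)
    if code is None:
        return None  # no matching entry in the command template
    return CODE_TO_INSTRUCTION.get(code)
```

Because the control instruction depends only on the speech code, adding a further language only requires new entries in the first table.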
Further, in the above embodiment of the application, the correspondence between speech and speech codes is also stored locally, so that voice-based control of the electronic equipment can be achieved without a network connection.
In one embodiment of the application, the wake word speech template and the command word speech template are produced by training on collected speech that is dynamically updated. By regularly updating the wake word speech and the command word speech, the user can improve the security of the electronic equipment and prevent it from being misused by persons without access rights.
Similarly, in one embodiment of the application, the voiceprint template is produced by training on dynamically updated speech of at least one designated person, so that the user's voiceprint features are kept up to date (in particular for users whose voice is changing, for example users in puberty or users who have just undergone laryngeal surgery).
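One simple way to keep a stored voiceprint current is to blend each newly verified sample into it. The sketch below uses an exponential moving average over a feature vector; this update rule, the rate parameter, and the function name update_template are assumptions chosen for illustration, since the application does not specify how the dynamic update is performed.

```python
from typing import List


def update_template(template: List[float],
                    new_sample: List[float],
                    rate: float = 0.2) -> List[float]:
    """Blend a new feature vector into the stored template.

    rate is the weight given to the new sample (0 keeps the old template,
    1 replaces it entirely).
    """
    if len(template) != len(new_sample):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return [(1.0 - rate) * old + rate * new
            for old, new in zip(template, new_sample)]
```

A small rate lets the template follow gradual changes, such as a voice changing during puberty, without being overwritten by a single unusual sample.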
Fig. 4 shows a module structure diagram of one embodiment of the voice control device based on offline voiceprint recognition and speech recognition. The device comprises the following modules: a first receiving module, for receiving wake word speech and extracting a first speech feature and a first voiceprint feature of the wake word speech; a first checking module, for checking whether the extracted first speech feature and first voiceprint feature match the wake word speech template and the voiceprint template respectively, terminating if either does not match, and otherwise obtaining a first voiceprint code corresponding to the first voiceprint feature; a second receiving module, for receiving command word speech and extracting a second voiceprint feature of the command word speech; a second checking module, for checking whether the second voiceprint feature matches the voiceprint template, terminating if it does not match, and otherwise obtaining a second voiceprint code corresponding to the second voiceprint feature; a third checking module, for checking whether the first voiceprint code and the second voiceprint code are identical, terminating if they differ, and otherwise extracting a second speech feature of the command word speech; and an instruction generation module, for checking whether the extracted second speech feature matches the command word speech template, terminating if it does not match, and otherwise obtaining a speech code of the second speech feature and generating a corresponding control instruction based on that speech code. As shown in the schematic diagram of Fig. 2, the wake word speech template, the command word speech template, and the voiceprint template are stored locally. Whenever a mismatch occurs (whether a speech feature or a voiceprint feature fails to match the corresponding locally stored template), the device returns to the state of waiting for the user to input the wake word speech again.
To reduce the computational load of the system, in one embodiment of the application the first receiving module extracts only the first speech feature after the wake word speech is received. The first receiving module extracts the first voiceprint feature of the wake word speech only when the first checking module determines that the first speech feature matches one of the multiple groups of characteristic parameters recorded in the wake word speech template; if the first checking module determines that the first speech feature matches none of the characteristic parameters in the wake word speech template, the first receiving module prompts the user to utter the wake word again so that matching can be retried. The relevant matching decisions (such as matching the first speech feature against the wake word speech template, matching the first voiceprint feature against the voiceprint template, and matching the second speech feature against the command word speech template) may be implemented with matching algorithms conventional in the art, and the application places no limitation on this.
In one embodiment of the application, the wake word speech template and the command word speech template are generated by training on speech collected in advance. Specifically, the user may input the wake word speech and the command word speech multiple times in advance, and the wake word speech template and the command word speech template are refined through training, thereby improving the accuracy of speech recognition.
Similarly, in one embodiment of the application, the voiceprint template is generated by training on speech of at least one user collected in advance. Correspondingly, one or several users with access rights to the electronic equipment input the wake word speech and the command word speech multiple times in advance, and the voiceprint template is refined through training, thereby improving the accuracy of voiceprint recognition.
With reference to the schematic diagram in Fig. 3 of the user-defined correspondence between speech and speech codes, in one embodiment of the application the instruction generation module can set the correspondence between speech and speech codes according to the actual electronic equipment and the language the user uses. Because the user can define the correspondence between speech and speech codes, the specific instruction issued to the electronic equipment is independent of the particular language in which the user utters the command word speech. For example, by modifying the characteristic parameters of the speech labeled with specific meanings in the command word speech template, speech in English or in Chinese can be associated with a designated speech code, so that command word speech uttered in English or in Chinese is received and converted into the corresponding control instruction.
Further, in the above embodiment of the application, the correspondence between speech and speech codes is also stored locally, so that voice-based control of the electronic equipment can be achieved without a network connection.
In one embodiment of the application, the wake word speech template and the command word speech template are produced by training on collected speech that is dynamically updated. By regularly updating the wake word speech and the command word speech, the user can improve the security of the electronic equipment and prevent it from being misused by persons without access rights.
Similarly, in one embodiment of the application, the voiceprint template is produced by training on dynamically updated speech of at least one designated person, so that the user's voiceprint features are kept up to date (in particular for users whose voice is changing, for example users in puberty or users who have just undergone laryngeal surgery).
Although the description of the application is quite detailed and several embodiments are described in particular, it is not intended to be limited to any of these details or embodiments or to any particular embodiment. Rather, it should be construed, with reference to the appended claims and in view of the prior art, as providing the broadest possible interpretation of those claims, so as to effectively cover the intended scope of the application. Furthermore, the application has been described above in terms of embodiments foreseeable by the inventor for the purpose of providing a useful description, and insubstantial modifications of the application that are not presently foreseen may nonetheless represent equivalents of the application.