CN106098068B - Voiceprint recognition method and device - Google Patents


Info

Publication number
CN106098068B
Authority
CN
China
Prior art keywords
voice messaging
character
verifying
voice
registration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610416650.3A
Other languages
Chinese (zh)
Other versions
CN106098068A (en
Inventor
李为
钱柄桦
金星明
李科
吴富章
吴永坚
黄飞跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610416650.3A
Publication of CN106098068A
Priority to PCT/CN2017/087911 (WO2017215558A1)
Application granted
Publication of CN106098068B
Legal status: Active
Anticipated expiration

Abstract

An embodiment of the invention discloses a voiceprint recognition method and device. The method comprises: obtaining verification voice information produced when a verifying user reads a first character string aloud; performing speech recognition on the verification voice information to obtain the speech segments it contains that respectively correspond to a plurality of characters in the first character string; extracting the voiceprint features of the speech segment corresponding to each character; training, according to the voiceprint features of the speech segment corresponding to each character and in combination with the preset universal background model for that character, to obtain the feature vector corresponding to each character in the verification voice information; and calculating a similarity score between the feature vector of each character in the verification voice information and the feature vector of the same character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining the verifying user to be the registered user corresponding to the registration voice information. The invention can effectively improve voiceprint recognition accuracy.

Description

Voiceprint recognition method and device
Technical field
The present invention relates to the field of speech recognition technology, and in particular to a voiceprint recognition method and device.
Background
Voiceprint recognition is a biometric identification method comprising two stages: user registration and user identification. In the registration stage, speech is mapped to a user model through a series of processing steps. In the identification stage, a speech sample of unknown identity is matched against the model for similarity, and a judgment is then made as to whether the identity behind the unknown speech is the same as the identity behind the registered speech. Existing voiceprint modeling methods usually model at the text-independent level to describe a speaker's identity characteristics. With text-independent modeling, however, recognition accuracy is low when the user reads different content aloud, and it is difficult to meet requirements.
Summary of the invention
In view of this, embodiments of the present invention provide a voiceprint recognition method and device that can effectively improve voiceprint recognition accuracy.
To solve the above technical problem, an embodiment of the present invention provides a voiceprint recognition method, the method comprising:
obtaining verification voice information produced when a verifying user reads a first character string aloud;
performing speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to a plurality of characters in the first character string;
extracting the voiceprint features of the speech segment corresponding to each character;
training, according to the voiceprint features of the speech segment corresponding to each character and in combination with the preset universal background model for that character, to obtain the feature vector corresponding to each character in the verification voice information;
calculating a similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the same character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining the verifying user to be the registered user corresponding to the registration voice information.
Correspondingly, an embodiment of the present invention also provides a voiceprint recognition device, the device comprising:
a voice acquisition module, for obtaining verification voice information produced when a verifying user reads a first character string aloud;
a speech segment recognition module, for performing speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to a plurality of characters in the first character string;
a voiceprint feature extraction module, for extracting the voiceprint features of the speech segment corresponding to each character in the verification voice information;
a feature model training module, for training, according to the voiceprint features of the speech segment corresponding to each character and in combination with the preset universal background model for that character, to obtain the feature vector corresponding to each character in the verification voice information;
a similarity judgment module, for calculating a similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the same character in preset registration voice information;
a user identification module, for determining the verifying user to be the registered user corresponding to the registration voice information if the similarity score reaches a preset verification threshold.
This embodiment obtains the voiceprint features of the speech segment corresponding to each character in the verifying user's verification voice information, trains the feature vector for each character in combination with the preset UBM for that character, and compares the feature vector of each character in the verification voice information with the feature vector of the same character in the registration voice information to determine the identity of the verifying user. Because the user feature vectors being compared each correspond to a specific character, this approach takes full account of the voiceprint features a user exhibits when reading different characters aloud, and can therefore effectively improve voiceprint recognition accuracy.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic overview of the stages of the voiceprint recognition method in an embodiment of the present invention;
Fig. 2 is a flow diagram of a voiceprint recognition method in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the principle of recognizing the speech segments corresponding to a plurality of characters from voice information in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the principle of obtaining the feature vector corresponding to each character from voice information in an embodiment of the present invention;
Fig. 5 is a flow diagram of the voiceprint registration of a registered user in an embodiment of the present invention;
Fig. 6 is a flow diagram of the voiceprint recognition method in another embodiment of the present invention;
Fig. 7 is a structural diagram of a voiceprint recognition device in an embodiment of the present invention;
Fig. 8 is a structural diagram of the speech segment recognition module in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiments of the present invention provide a voiceprint recognition method and device. They can be applied in any scenario or equipment that needs to identify a user of unknown identity. The characters in the character strings used for voiceprint recognition may be Arabic numerals, English letters, characters of other languages, and so on. To simplify the description, the embodiments of the present invention use Arabic numerals as the example characters.
The voiceprint recognition method in the embodiments of the present invention can be divided into two stages, as shown in Fig. 1:
1) The voiceprint registration stage of the registered user
In the voiceprint registration stage, the registered user reads a registration character string aloud (the second character string mentioned below). The voiceprint recognition device collects the registration voice information produced while the registered user reads the registration character string, and then performs speech recognition on it to obtain the speech segments it contains that respectively correspond to a plurality of characters in the registration character string. It then performs voiceprint feature extraction and voiceprint model training on the speech segment of each character: according to the voiceprint features of each character's speech segment, and in combination with the preset universal background model (Universal Background Model, UBM, i.e. GMM-UBM) for that character, it trains the feature vector corresponding to each character in the registration voice information. In the voiceprint registration stage, the voiceprint recognition device can thus store in its model library, for each registered user, the feature vectors of the characters in the registration voice information that user read aloud.
For example, if the registration character string is the digit string 0185851, which contains the four digits "0", "1", "5" and "8", the voiceprint recognition device performs voiceprint feature extraction and voiceprint model training on the speech segment of each character in the registration voice information, obtains the voiceprint features of the speech segments for "0", "1", "5" and "8", and then, in combination with the preset UBM for each character, trains the feature vector for each character in the registration voice information: one feature vector each for the digits "0", "1", "5" and "8".
2) The identification stage of the verifying user
In the identification stage, the verifying user, i.e. the user of unknown identity, reads a verification character string aloud (the first character string mentioned below; the second character string and the first character string share at least one identical character). The voiceprint recognition device collects the verification voice information produced while the verifying user reads the verification character string, and then performs speech recognition on it to obtain the speech segments it contains that respectively correspond to a plurality of characters in the verification character string. It then performs voiceprint feature extraction and voiceprint model training on the speech segment of each character: according to the voiceprint features of each character's speech segment, and in combination with the preset UBM for that character, it trains the feature vector corresponding to each character in the verification voice information. Finally, it calculates a similarity score between the feature vector of each character in the verification voice information and the feature vector of the same character in the preset registration voice information. If the similarity score reaches a preset verification threshold, the verifying user is determined to be the registered user corresponding to the registration voice information.
For example, if the verification character string is the digit string 85851510, the voiceprint recognition device performs voiceprint feature extraction and voiceprint model training on the speech segment of each character in the verification voice information produced when the verifying user reads it aloud, obtaining the GMMs corresponding to "0", "1", "5" and "8". In combination with the preset UBM for each character, it can then compute the feature vectors of the verifying user's verification voice information: one feature vector each for the digits "0", "1", "5" and "8". It then calculates the similarity scores between the feature vectors of "0", "1", "5" and "8" in the verification voice information and the feature vectors of "0", "1", "5" and "8", respectively, in the registration voice information. If the similarity score reaches the preset verification threshold, the verifying user is determined to be the registered user corresponding to the registration voice information.
It should be pointed out that the voiceprint registration stage of the registered user and the identification stage of the verifying user can be implemented in the same device or in different devices. For example, the voiceprint registration stage can be carried out on a first device, which then sends the feature vectors of the characters in the registration voice information to a second device, so that the identification stage of the verifying user can be carried out on the second device.
The two processes are described in detail below through specific embodiments.
Fig. 2 is a flow diagram of a voiceprint recognition method in an embodiment of the present invention. As shown in the figure, the voiceprint recognition process in this embodiment may include:
S201: obtain the verification voice information produced when the verifying user reads the first character string aloud.
The verifying user is a user of unknown identity whose identity needs to be verified by the voiceprint recognition device. The first character string is the character string used to authenticate the verifying user. It may be randomly generated, or it may be a preset fixed character string, for example one that is at least partly identical to the second character string corresponding to the pre-generated registration voice information. Specifically, the character string may contain m characters, of which n are mutually different, where m and n are positive integers and m ≥ n.
For example, the first character string "12358948" has 8 characters in total and contains 7 mutually different characters: "1", "2", "3", "4", "5", "8" and "9".
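The relationship between m and n can be checked with a few lines of Python. This is only an illustrative sketch; the function name `describe_prompt` is hypothetical and does not come from the patent.

```python
from collections import Counter

def describe_prompt(prompt):
    """Return (m, n, distinct): total length, number of distinct
    characters, and the sorted list of distinct characters."""
    counts = Counter(prompt)
    return len(prompt), len(counts), sorted(counts)

m, n, distinct = describe_prompt("12358948")
print(m, n, distinct)
```

For the example string above this gives m = 8 and n = 7, since "8" is the only repeated character.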
In an optional embodiment, the voiceprint recognition device can generate and display the first character string so that the verifying user reads it aloud from the display.
S202: perform speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to a plurality of characters in the first character string.
As shown in Fig. 3, the voiceprint recognition device can use speech recognition and sound-intensity filtering to divide the verification voice information into the speech segments corresponding to the individual characters. Optionally, it can also discard invalid speech segments so that they take no part in subsequent processing.
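The intensity-filtering part of this step might look like the following minimal sketch. It assumes that a speech recognizer has already produced per-character boundaries as `(char, start, end)` sample indices; that input format, the RMS energy measure and the `min_rms` threshold are all assumptions made for illustration, not details given in the patent.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def filter_segments(signal, alignments, min_rms=0.01):
    """Cut the signal into per-character segments and drop quiet ones.

    alignments: list of (char, start, end) sample indices, assumed to
    come from a speech recognizer (hypothetical input format).
    min_rms: illustrative intensity threshold.
    """
    kept = []
    for char, start, end in alignments:
        segment = signal[start:end]
        if rms(segment) >= min_rms:
            kept.append((char, segment))
    return kept

# Toy signal: silence, a loud stretch labeled "8", a near-silent stretch labeled "5".
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.001] * 4
alignments = [("8", 4, 8), ("5", 8, 12)]
print([char for char, _ in filter_segments(signal, alignments)])
```

In this toy input only the segment labeled "8" passes the threshold; the near-silent segment labeled "5" is treated as invalid and discarded.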
S203: extract the voiceprint features of the speech segment corresponding to each character.
Specifically, the voiceprint recognition device can extract the MFCC (Mel Frequency Cepstrum Coefficient) or PLP (Perceptual Linear Predictive) coefficients of the speech segment corresponding to each character as the voiceprint features of that speech segment.
S204: train, according to the voiceprint features of the speech segment corresponding to each character and in combination with the preset universal background model for that character, to obtain the feature vector corresponding to each character in the verification voice information.
The universal background model (UBM) in the embodiments of the present invention is a Gaussian mixture model trained jointly on speech segments of a specific digit spoken by a large number of speakers. It characterizes the distribution of the speech for that digit in feature space. Because the training data come from a large number of speakers, it does not characterize any particular speaker; it is identity-independent and can therefore be regarded as a universal background model. As an illustration, the UBM can be trained on speech samples from more than 1000 speakers, with a duration of more than 20 hours and a relatively balanced frequency of occurrence for each character. The mathematical expression of the UBM is:
$$P(x) = \sum_{i=1}^{C} a_i \, N(x \mid \mu_i, \Sigma_i) \qquad \text{formula (1)}$$
where P(x) is the probability distribution of the UBM; C is the number of Gaussian components in the UBM, over which the sum is taken; a_i is the weight of the i-th Gaussian component; μ_i is the mean of the i-th Gaussian component; Σ_i is the variance of the i-th Gaussian component; N(·) denotes the Gaussian distribution; and x is the input sample, i.e. the voiceprint feature.
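Formula (1) can be evaluated directly, as in the sketch below. It assumes diagonal covariances (each Σ_i stored as a list of per-dimension variances), which is a simplification introduced here; the patent does not say whether the covariances are full or diagonal. The function names are hypothetical.

```python
import math

def gaussian_diag(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def ubm_density(x, weights, means, variances):
    """P(x) = sum_i a_i * N(x | mu_i, Sigma_i), as in formula (1)."""
    return sum(a * gaussian_diag(x, mu, var)
               for a, mu, var in zip(weights, means, variances))

# Toy two-component UBM over 2-dimensional features.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
print(ubm_density([0.0, 0.0], weights, means, variances))
```

In practice the evaluation would usually be done in the log domain for numerical stability; the direct form is kept here so that it mirrors formula (1) term by term.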
The voiceprint recognition device can take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset universal background model for that character. That is, after the voiceprint features of each character's speech segment are substituted into formula (1) as input samples, the parameters of the preset universal background model for that character are adjusted repeatedly until the posterior probability P(x) is maximized; the feature vector of that character in the verification voice information is then determined from the parameters that maximize P(x).
Because extensive experiments and papers have shown that the means of the Gaussian components of a UBM can be used to distinguish speakers' identity information, we define the mean supervector of the UBM, which by the usual convention is the concatenation of the means μ_1, …, μ_C of its C Gaussian components.
The voiceprint recognition device can thus take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model for that character. That is, after the voiceprint features of each character's speech segment are substituted into formula (1) as input samples, the mean supervector is adjusted repeatedly until the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) is taken as the feature vector of that character in the verification voice information.
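The patent text does not spell out the update rule used for this adjustment. One common formulation in the GMM-UBM literature is relevance MAP adaptation of the component means, sketched below as an assumption rather than as the patent's own procedure. The relevance factor of 16, the diagonal covariances and all function names are illustrative choices.

```python
import math

def gaussian_diag(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at x."""
    log_p = sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
                for xi, m, v in zip(x, mean, var))
    return math.exp(log_p)

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """One pass of relevance-MAP mean adaptation (assumed formulation).

    Returns the adapted mean supervector: the adapted component means
    concatenated in component order.
    """
    C, D = len(weights), len(means[0])
    occupancy = [0.0] * C                         # soft counts n_i
    first_order = [[0.0] * D for _ in range(C)]   # sum_t gamma_i(t) * x_t
    for x in frames:
        likes = [a * gaussian_diag(x, mu, var)
                 for a, mu, var in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # frame too far from every component to assign
        for i in range(C):
            gamma = likes[i] / total              # posterior of component i
            occupancy[i] += gamma
            for d in range(D):
                first_order[i][d] += gamma * x[d]
    supervector = []
    for i in range(C):
        alpha = occupancy[i] / (occupancy[i] + relevance)
        for d in range(D):
            data_mean = (first_order[i][d] / occupancy[i]
                         if occupancy[i] > 0.0 else means[i][d])
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[i][d])
    return supervector

# With no frames the supervector is just the concatenated UBM means.
print(map_adapt_means([], [0.5, 0.5], [[0.0], [4.0]], [[1.0], [1.0]]))
```

The more frames a component explains, the further its mean moves from the UBM prior toward the observed data, which is why a character's adapted supervector carries speaker-specific information.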
In another optional embodiment, to mitigate the slow convergence caused by the high dimensionality of the supervector, we use probabilistic principal component analysis (PPCA) to restrict the variation of the mean supervector to a subspace. The voiceprint recognition device can take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data, use the maximum a posteriori algorithm to adjust the mean supervector of the preset universal background model for that character, and combine this with a preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information. In a specific implementation, the mean supervector of the preset universal background model for the character can be adjusted using the following formula so that the posterior probability of the adjusted universal background model for that character is maximized:
M = m + Tω
where M is the mean supervector of the adjusted universal background model for a character, m is the mean supervector of the universal background model for that character before adjustment, T is the preset supervector subspace matrix, and ω is the feature vector of that character in the verification voice information. After the voiceprint features of each character's speech segment in the verification voice information are substituted into formula (1) as input samples, repeatedly adjusting ω adjusts the mean supervector in formula (1) until the posterior probability P(x) is maximized, and the ω that maximizes P(x) is taken as the feature vector of that character in the verification voice information. The supervector subspace matrix T is determined from the correlations between the dimensions of the mean supervector of the Gaussian mixture model.
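The sketch below illustrates only the forward relation M = m + Tω, to make the shapes concrete: a long supervector is controlled by a short vector ω. Estimating ω from speech, and training T, are not shown. The function name and the toy values are hypothetical.

```python
def adapted_supervector(m, T, omega):
    """Return M = m + T * omega using plain Python lists.

    m:     UBM mean supervector, length S
    T:     subspace matrix, S rows of length R (R much smaller than S)
    omega: low-dimensional feature vector, length R
    """
    if len(T) != len(m) or any(len(row) != len(omega) for row in T):
        raise ValueError("shape mismatch between m, T and omega")
    return [mi + sum(t * w for t, w in zip(row, omega))
            for mi, row in zip(m, T)]

m = [0.0, 4.0, -2.0, 1.0]                  # toy supervector, S = 4
T = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, -1.0]]  # toy T, R = 2
omega = [0.2, -0.1]
print(adapted_supervector(m, T, omega))
```

Because only the R entries of ω are free, the search for the best adapted model runs over far fewer parameters than adjusting every entry of the supervector directly.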
S205: calculate a similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the same character in the preset registration voice information; if the similarity score reaches a preset verification threshold, determine the verifying user to be the registered user corresponding to the registration voice information.
Specifically, the voiceprint recognition device can obtain the registered user's registration voice information in the voiceprint registration stage and, through voiceprint feature extraction and voiceprint model training similar to those of this embodiment, obtain the feature vector corresponding to the speech segment of each character in the registration voice information. The registration voice information can be that produced when the registered user reads a second character string aloud, as obtained by the voiceprint recognition device. The second character string and the first character string share at least one identical character, i.e. the second character string corresponding to the registration voice information is at least partly identical to the first character string. In an optional embodiment, the voiceprint recognition device can instead obtain the feature vectors of the characters in the registration voice information from outside: after the registered user records the registration voice information on another device, that device or a server obtains the feature vector for the speech segment of each character through voiceprint feature extraction and voiceprint model training, and the voiceprint recognition device retrieves these feature vectors from that device or server so that, in the identification stage of the verifying user, it can compare them with the feature vectors of the corresponding characters in the verification voice information.
In a specific implementation, the similarity score is a value that measures how similar the two feature vectors of the same character are, obtained when the voiceprint recognition device compares the feature vector of each character in the verification voice information with the feature vector of the same character in the preset registration voice information. In an optional embodiment, the cosine distance between the feature vector of each character in the verification voice information and the feature vector of the same character in the preset registration voice information can be computed as the similarity score. That is, the similarity score between a character's feature vector in the verification voice information and its feature vector in the registration voice information is calculated by the following formula:
$$\mathrm{score}_i = \frac{\langle \omega_i(\mathrm{tar}), \omega_i(\mathrm{test}) \rangle}{\lVert \omega_i(\mathrm{tar}) \rVert \, \lVert \omega_i(\mathrm{test}) \rVert}$$
where the subscript i denotes the i-th character shared by the verification voice information and the registration voice information, ω_i(tar) is the feature vector of that character in the verification voice information, and ω_i(test) is the feature vector of that character in the registration voice information. If the verification voice information and the registration voice information share multiple characters, the similarity scores calculated for the individual characters by the above formula can be averaged; if the mean similarity score reaches the corresponding preset verification threshold, the verifying user is determined to be the registered user corresponding to the registration voice information. If there are multiple registered users, such as registered users A, B and C shown in Fig. 1, the feature vector of a character of the verifying user can be compared with the feature vector of the same character of each registered user. When the feature vector of a registered user's character has the highest similarity score with the feature vector of that character in the verification speech, and the similarity reaches the preset verification threshold, that registered user is taken as the identification result for the verifying user.
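The scoring and decision logic described above can be sketched as follows. Feature vectors are stored per character in dictionaries; that layout, the function names and the threshold value of 0.7 are assumptions made for illustration only.

```python
import math

def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_score(verify, enrolled):
    """Average cosine score over the characters present in both dicts."""
    shared = sorted(set(verify) & set(enrolled))
    if not shared:
        return None
    return sum(cosine_score(verify[c], enrolled[c]) for c in shared) / len(shared)

def identify(verify, model_library, threshold=0.7):
    """Return (user, score) for the best-scoring registered user, or
    (None, score) if the best score does not reach the threshold."""
    best_user, best = None, None
    for user, enrolled in model_library.items():
        score = mean_score(verify, enrolled)
        if score is not None and (best is None or score > best):
            best_user, best = user, score
    if best is not None and best >= threshold:
        return best_user, best
    return None, best

# Toy per-character feature vectors for one verifying user and two registered users.
verify = {"0": [1.0, 0.0], "1": [0.0, 1.0]}
model_library = {
    "A": {"0": [1.0, 0.0], "1": [0.0, 2.0]},
    "B": {"0": [0.0, 1.0], "1": [1.0, 0.0]},
}
print(identify(verify, model_library))
```

Only characters shared by both utterances contribute to the score, which matches the requirement that the two character strings have at least one character in common.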
In an optional embodiment, if the same character occurs more than once in the verification voice information, for example 0, 1, 5 and 8 each occurring twice as in the verification voice information shown in Figure 2, the feature vectors obtained from the two speech segments of character 0 can each be scored against the feature vector of character 0 in the preset registration voice information, and the average of the two similarity scores taken as the similarity score between the feature vector of character 0 in this verification voice information and the feature vector of character 0 in the preset registration voice information; the other characters are handled likewise.
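Averaging over repeated occurrences of a character might be sketched as below. The input format, a list of `(char, vector)` pairs with one entry per spoken character, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def per_character_scores(occurrences, enrolled):
    """Score every occurrence against the enrolled vector of the same
    character, then average the scores of repeated characters.

    occurrences: list of (char, vector), one entry per spoken character.
    enrolled:    dict mapping char -> enrolled feature vector.
    """
    grouped = defaultdict(list)
    for char, vec in occurrences:
        if char in enrolled:
            grouped[char].append(cosine_score(vec, enrolled[char]))
    return {char: sum(scores) / len(scores) for char, scores in grouped.items()}

# "0" is spoken twice; its two scores are averaged into one.
occurrences = [("0", [1.0, 0.0]), ("0", [0.0, 1.0]), ("1", [0.0, 1.0])]
enrolled = {"0": [1.0, 0.0], "1": [0.0, 1.0]}
print(per_character_scores(occurrences, enrolled))
```

Averaging per character first keeps a frequently repeated digit from dominating the overall mean across characters.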
It should be pointed out that measuring the mode of the similarity between two feature vectors there are also very much, the above is only this hairsA kind of embodiment of bright offer, those skilled in the art may not need creative labor on the basis of scheme disclosed by the inventionThe similarity point of more feature vectors for calculating verifying voice messaging and registering the character shared in voice messaging is obtained dynamiclySeveral modes, the present invention is without exhaustion.
To the corresponding sound bite of character each in the verifying voice messaging of, the present embodiment by obtaining verifying userVocal print feature is verified the corresponding feature vector of each character in voice messaging in conjunction with the UBM training of preset respective symbols,And by will verify the corresponding feature vector of each character in voice messaging with register the features of respective symbols in voice messaging toAmount carries out similarity-rough set, so that it is determined that the user identity of verifying user, which to the user characteristics vector that compares withSpecific character is corresponding, fully takes into account vocal print feature when user reads aloud kinds of characters, so as to effectively improve Application on Voiceprint Recognition standardTrue rate.
Fig. 5 is a flow diagram of the voiceprint registration of a registered user in an embodiment of the present invention. As shown in the figure, the voiceprint registration process in this embodiment may include:
S501: obtain the registration voice information produced when the registered user reads a second character string aloud, the second character string sharing at least one identical character with the first character string.
The registered user is a user whose identity has been established as legitimate. The second character string is the character string used to collect the registered user's voiceprint feature vectors; it may be randomly generated, or it may be a preset fixed character string. Specifically, the second character string may likewise contain m characters, of which n are mutually different, where m and n are positive integers and m ≥ n.
In an optional embodiment, the voiceprint recognition device can generate and display the second character string so that the registered user reads it aloud from the display.
S502, to it is described registration voice messaging carry out speech recognition obtain it is described registration voice messaging in include respectively withThe corresponding sound bite of multiple characters in second character string;
Voice print identification device can be filtered by speech recognition and intensity of sound, and the verifying voice messaging is dividedTo the corresponding sound bite of multiple characters, invalid voice segment can also optionally be weeded out, be not involved in subsequent processedJourney.
S503 extracts the vocal print feature of the corresponding sound bite of each character in registration voice messaging.
Specifically, voice print identification device can extract the MFCC (Mel in the corresponding sound bite of each characterFrequency Cepstrum Coefficient, mel cepstrum coefficients) or PLP (Perceptual LinearPredictive perceives linear predictor coefficient), the vocal print feature as sound bite corresponding to each character.
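For orientation, MFCC extraction warps the frequency axis onto the mel scale before the cepstrum is taken. One widely used conversion is shown below; this is an assumption for illustration, since the embodiment does not state which mel variant it uses.

```latex
\mathrm{mel}(f) = 2595 \, \log_{10}\!\left(1 + \frac{f}{700}\right)
```

Here f is the frequency in Hz; under this convention 1000 Hz maps to roughly 1000 mel.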
S504, according to the voiceprint features of the speech segment corresponding to each character in the registration voice information, and in combination with the preset universal background model (UBM) of the corresponding character, train to obtain the feature vector corresponding to each character in the registration voice information.
For the expression of the UBM, refer to the embodiment above. This step of the voiceprint registration process is similar to step S204 of the voiceprint recognition process. The voiceprint recognition device may take the voiceprint features of the speech segment corresponding to each character in the registration voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset UBM of the corresponding character. That is, after the voiceprint features of the speech segment corresponding to each character in the registration voice information are substituted into formula (1) as input samples, the parameters of the UBM of the corresponding character are adjusted iteratively so that the posterior probability P(x) is maximized, and the feature vector of the corresponding character in the registration voice information is determined from the parameters that maximize P(x).
Since the mean of each Gaussian component in the UBM can be used to distinguish the identity of the speaker, the voiceprint recognition device may take the voiceprint features of the speech segment corresponding to each character in the registration voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset UBM of the corresponding character. That is, after the voiceprint features are substituted into formula (1) as input samples, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) is taken as the feature vector of the corresponding character in the registration voice information.
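One common way to realize MAP adjustment of UBM means is relevance-MAP adaptation, sketched below for a one-dimensional Gaussian mixture. This is a textbook formulation substituted for illustration and is not necessarily the procedure the embodiment uses; the relevance factor `r`, the toy UBM parameters, and the function names are all assumptions.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def map_adapt_means(samples, weights, means, variances, r=16.0):
    """Relevance-MAP update of GMM means; weights and variances stay fixed."""
    n = [0.0] * len(means)       # soft count of samples per component
    first = [0.0] * len(means)   # posterior-weighted sum of samples
    for x in samples:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        for k, like in enumerate(likes):
            post = like / total
            n[k] += post
            first[k] += post * x
    adapted = []
    for k, m in enumerate(means):
        alpha = n[k] / (n[k] + r)                  # data-dependent mixing weight
        data_mean = first[k] / n[k] if n[k] > 0 else m
        adapted.append(alpha * data_mean + (1 - alpha) * m)
    return adapted

# Toy per-character UBM with two components (hypothetical values).
ubm_means = [0.0, 5.0]
supervector = map_adapt_means([0.8, 1.1, 0.9, 5.2], [0.5, 0.5],
                              ubm_means, [1.0, 1.0], r=2.0)
```

In this one-dimensional toy case the list of adapted means is itself the mean supervector; with multidimensional features the adapted mean vectors would be concatenated. Each adapted mean moves from the UBM mean toward the data assigned to that component, and moves further when more data supports it.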
In another alternative embodiment, the mean supervector of the preset UBM of the corresponding character may be adjusted using the following formula so that the posterior probability of the adjusted UBM of that character is maximized:
M = m + Tω, where M is the mean supervector of the UBM of a given character after adjustment, m is the mean supervector of the UBM of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector of the corresponding character in the registration voice information. After the voiceprint features of the speech segment corresponding to each character in the registration voice information are substituted into formula (1) as input samples, the mean supervector in formula (1) can be adjusted by iteratively adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) is taken as the feature vector of the corresponding character in the registration voice information.
Fig. 6 is a schematic flowchart of the voiceprint recognition method in another embodiment of the present invention. As shown in the figure, the voiceprint recognition method in this embodiment may include the following flow:
S601, randomly generate the first character string and display it.
S602, obtain the verification voice information produced when the verifying user reads the first character string aloud.
S603, identify the valid speech segments and invalid speech segments in the verification voice information.
Specifically, the verification speech may be divided according to sound intensity, and speech segments with low sound intensity are treated as invalid speech segments (for example, silent segments and impulse noise).
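The intensity-based division described for S603 can be sketched as follows. This is a minimal illustration, assuming the audio is already available as a list of float samples; the frame length, the threshold value, and the function names are hypothetical choices that do not come from the embodiment.

```python
import math

def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's RMS level."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return [math.sqrt(sum(s * s for s in f) / len(f)) for f in frames if f]

def label_frames(samples, frame_len=4, threshold=0.1):
    """Mark a frame as valid (True) when its RMS level reaches the threshold.

    frame_len and threshold are illustrative values (assumptions).
    """
    return [rms >= threshold for rms in frame_rms(samples, frame_len)]

# A quiet frame, a loud frame, then another quiet frame.
audio = [0.01, -0.02, 0.01, 0.0,
         0.5, -0.6, 0.4, -0.5,
         0.0, 0.01, -0.01, 0.02]
labels = label_frames(audio)
```

Only the frames labelled valid would be passed on to speech recognition in S604; a practical system would use overlapping frames and a threshold tuned to the recording conditions.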
S604, perform speech recognition on the valid speech segments to obtain the speech segments that respectively correspond to multiple characters in the first character string.
Speech recognition yields the speech segments that respectively correspond to multiple characters in the first character string.
S605, determine that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
To effectively prevent voiceprint recognition from being carried out with a registered user's voice information after it has been illegally copied, a different first character string may be randomly generated each time. This step judges whether the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string. If it is not, voiceprint recognition may be judged to have failed; if it is, the subsequent flow is executed.
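The random challenge and the order check can be sketched as follows. The function names, the digit alphabet, and the choice of strict sequence equality as the consistency test are assumptions made for illustration.

```python
import random

def generate_challenge(length=8, alphabet="0123456789", rng=None):
    """Return a random character string for the user to read aloud."""
    rng = rng or random.SystemRandom()
    return "".join(rng.choice(alphabet) for _ in range(length))

def order_matches(recognized_chars, challenge):
    """True when the recognized characters follow the order of the challenge string."""
    return list(recognized_chars) == list(challenge)

challenge = generate_challenge()
ok = order_matches(list(challenge), challenge)

# A recording made for a different string fails the check.
replayed = order_matches(list(reversed("12358948")), "12358948")
```

Because a fresh string is drawn for every attempt, a previously recorded utterance is unlikely to contain the characters in the required order.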
S606, extract the voiceprint features of the speech segment corresponding to each character.
Specifically, the voiceprint recognition device may extract the MFCC (Mel Frequency Cepstrum Coefficient) or PLP (Perceptual Linear Predictive coefficient) of the speech segment corresponding to each character as the voiceprint features of that segment.
S607, take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data, and use the maximum a posteriori algorithm to adjust the mean supervector of the preset UBM of the corresponding character, thereby estimating the feature vector corresponding to each character in the verification voice information.
Since a large number of experiments and papers have demonstrated that the mean of each Gaussian component in the UBM can be used to distinguish the identity of the speaker, the voiceprint recognition device may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset UBM of the corresponding character. That is, after the voiceprint features are substituted into formula (1) as input samples, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) is taken as the feature vector of the corresponding character in the verification voice information.
In another alternative embodiment, to mitigate the slow convergence caused by the high dimensionality of the supervector, the voiceprint recognition device may adjust the mean supervector of the preset UBM of the corresponding character using the following formula so that the posterior probability of the adjusted UBM of that character is maximized:
M = m + Tω, where M is the mean supervector of the UBM of a given character after adjustment, m is the mean supervector of the UBM of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector of the corresponding character in the verification voice information. After the voiceprint features of the speech segment corresponding to each character in the verification voice information are substituted into formula (1) as input samples, the mean supervector in formula (1) can be adjusted by iteratively adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) is taken as the feature vector of the corresponding character in the verification voice information.
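A small numeric sketch of M = m + Tω may help show why the subspace form converges faster: only the low-dimensional ω is adjusted, while m and T stay fixed. The dimensions and values below are toy assumptions, and the function name is hypothetical.

```python
def adjusted_supervector(m, T, omega):
    """Compute M = m + T * omega using plain Python lists.

    m: UBM mean supervector (length D)
    T: subspace matrix given as D rows of R columns
    omega: low-dimensional feature vector (length R)
    """
    if len(T) != len(m) or any(len(row) != len(omega) for row in T):
        raise ValueError("dimension mismatch")
    return [mi + sum(t * w for t, w in zip(row, omega))
            for mi, row in zip(m, T)]

# Toy values: D = 4 supervector dimensions, R = 2 subspace dimensions.
m = [1.0, 2.0, 3.0, 4.0]
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
omega = [0.2, -0.4]
M = adjusted_supervector(m, T, omega)
```

Here two numbers in ω determine all four entries of M; in a real system D is typically far larger than R, so searching over ω is a much smaller problem than searching over M directly.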
S608, calculate the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector of the corresponding character in the preset registration voice information; if the similarity score reaches a preset verification threshold, determine that the verifying user is the registered user corresponding to the registration voice information.
In this embodiment, the voiceprint recognition device may calculate the cosine distance between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in the preset registration voice information, and use it as the similarity score. That is, the similarity score between the feature vector of a given character in the verification voice information and its feature vector in the registration voice information is calculated by the standard cosine formula: score_i = (ωi(tar) · ωi(test)) / (‖ωi(tar)‖ ‖ωi(test)‖),
where the subscript i denotes the i-th character shared by the verification voice information and the registration voice information, ωi(tar) denotes the feature vector of that character in the verification voice information, and ωi(test) denotes the feature vector of that character in the registration voice information. If the verification voice information and the registration voice information contain multiple identical characters, the similarity scores calculated for each character by the formula above can be averaged; if the mean similarity score reaches the corresponding preset verification threshold, the verifying user is determined to be the registered user corresponding to the registration voice information. If there are multiple registered users, such as registered users A, B and C shown in Fig. 1, the feature vector of a character of the verifying user can be compared with the feature vector of the corresponding character of each registered user; when the feature vector of the corresponding character of some registered user has the highest similarity score with the feature vector of that character in the verification speech, and the score reaches the preset verification threshold, that registered user is taken as the identification result for the verifying user.
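The per-character scoring and thresholding can be sketched as follows. The dictionary layout (character mapped to feature vector), the threshold of 0.8, and the function names are assumptions for illustration; the feature vectors are toy values.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def verify(verif_vectors, reg_vectors, threshold=0.8):
    """Average the scores over shared characters and compare with the threshold."""
    shared = sorted(set(verif_vectors) & set(reg_vectors))
    if not shared:
        return False, 0.0
    mean_score = sum(cosine_similarity(verif_vectors[c], reg_vectors[c])
                     for c in shared) / len(shared)
    return mean_score >= threshold, mean_score

registered = {"1": [1.0, 0.0], "2": [0.0, 1.0], "3": [1.0, 1.0]}
claimed = {"1": [0.9, 0.1], "2": [0.1, 0.9], "7": [0.3, 0.3]}
accepted, score = verify(claimed, registered)
```

Only the characters "1" and "2" are shared by both strings, so only they contribute to the mean; the character "7" is ignored because no registered vector exists for it.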
Thus, in this embodiment, the feature vector corresponding to each character in the verification voice information is compared for similarity with the feature vector of the corresponding character in the registration voice information, and this is combined with a check of the temporal order of the speech segments, which further ensures the accuracy of the identity determined for the verifying user.
Fig. 7 is a schematic structural diagram of a voiceprint recognition device in an embodiment of the present invention. As shown in the figure, the voiceprint recognition device in this embodiment may include:
Voice acquisition module 710, configured to obtain the verification voice information produced when the verifying user reads the first character string aloud.
The verifying user is a user of unknown identity whose identity needs to be verified by the voiceprint recognition device. The first character string is the character string used for the verifying user's identity verification; it may be randomly generated or may be a preset fixed character string, for example a character string at least partly identical to the second character string corresponding to the pre-generated registration voice information. Specifically, the character string may include m characters, of which n are mutually different, where m and n are positive integers and m ≥ n.
For example, the first character string is “12358948”, which has 8 characters in total and contains 7 mutually different characters: “1”, “2”, “3”, “4”, “5”, “8” and “9”.
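The counts m and n for this example string can be checked directly; the variable names below are illustrative.

```python
first_string = "12358948"

m = len(first_string)                  # total number of characters
distinct = sorted(set(first_string))   # mutually different characters
n = len(distinct)                      # number of distinct characters
```

The character "8" occurs twice, which is why n is one less than m here.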
Speech segment recognition module 720, configured to perform speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to multiple characters in the first character string.
As shown in Fig. 3, the speech segment recognition module 720 can, through speech recognition and sound-intensity filtering, divide the verification voice information into speech segments corresponding to multiple characters. Optionally, invalid speech segments may also be removed so that they do not take part in subsequent processing.
In an alternative embodiment, as shown in Fig. 8, the speech segment recognition module may further include:
Valid segment recognition unit 721, configured to identify the valid speech segments and invalid speech segments in the verification voice information.
Specifically, the valid segment recognition unit 721 may divide the verification speech according to sound intensity and treat speech segments with low sound intensity as invalid speech segments (for example, silent segments and impulse noise).
Speech recognition unit 722, configured to perform speech recognition on the valid speech segments to obtain the speech segments that respectively correspond to multiple characters in the first character string.
Voiceprint feature extraction module 730, configured to extract the voiceprint features of the speech segment corresponding to each character in the verification voice information.
Specifically, the voiceprint feature extraction module 730 may extract the MFCC (Mel Frequency Cepstrum Coefficient) or PLP (Perceptual Linear Predictive coefficient) of the speech segment corresponding to each character as the voiceprint features of that segment.
Feature model training module 740, configured to train, according to the voiceprint features of the speech segment corresponding to each character and in combination with the preset UBM of the corresponding character, the feature vector corresponding to each character in the verification voice information.
The feature model training module 740 may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset UBM of the corresponding character. That is, after the voiceprint features are substituted into formula (1) as input samples, the parameters of the UBM of the corresponding character are adjusted iteratively so that the posterior probability P(x) is maximized, and the feature model training module 740 determines the feature vector of the corresponding character in the verification voice information from the parameters that maximize P(x).
Since a large number of experiments and papers have demonstrated that the mean of each Gaussian component in the UBM can be used to distinguish the identity of the speaker, we define the mean supervector of the UBM by concatenating the mean vectors of its Gaussian components.
Accordingly, the feature model training module 740 may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset UBM of the corresponding character. That is, after the voiceprint features are substituted into formula (1) as input samples, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the feature model training module 740 takes the mean supervector that maximizes P(x) as the feature vector of the corresponding character in the verification voice information.
In another alternative embodiment, to mitigate the slow convergence caused by the high dimensionality of the supervector, we use probabilistic principal component analysis (PPCA) to restrict the range of variation of the mean supervector to a subspace. The feature model training module 740 may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data, use the maximum a posteriori algorithm to adjust the mean supervector of the preset UBM of the corresponding character, and combine this with a preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information. In a specific implementation, the feature model training module 740 may adjust the mean supervector of the preset UBM of the corresponding character using the following formula so that the posterior probability of the adjusted UBM of that character is maximized:
M = m + Tω, where M is the mean supervector of the UBM of a given character after adjustment, m is the mean supervector of the UBM of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector of the corresponding character in the verification voice information. After the voiceprint features of the speech segment corresponding to each character in the verification voice information are substituted into formula (1) as input samples, the mean supervector in formula (1) can be adjusted by iteratively adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) is taken as the feature vector of the corresponding character in the verification voice information. The supervector subspace matrix T is determined from the correlation between the dimensions of the mean supervector of the Gaussian mixture model.
Similarity judgment module 750, configured to calculate the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector of the corresponding character in the preset registration voice information.
Specifically, in the voiceprint registration stage the voiceprint recognition device may obtain the registration voice information of the registered user and, through the speech segment recognition module 720, the voiceprint feature extraction module 730 and the feature model training module 740, obtain the feature vector corresponding to the speech segment of each character in the registration voice information. The registration voice information may be the voice information, obtained by the voiceprint recognition device, that is produced when the registered user reads a second character string aloud. The second character string has at least one character in common with the first character string; that is, the second character string corresponding to the registration voice information is at least partly identical to the first character string. Further, in an alternative embodiment, the voiceprint recognition device may also obtain the feature vectors of the corresponding characters in the registration voice information from an external source. That is, after the registered user enters the registration voice information through another device, that device or a server obtains the feature vector corresponding to the speech segment of each character through voiceprint feature extraction and voiceprint model training, and the voiceprint recognition device obtains these feature vectors from that device or server, so that in the identification stage the similarity judgment module 750 can compare them with the feature vector of each character in the verification voice information.
In a specific implementation, the similarity score is the value that measures how similar the two feature vectors of the same character are, obtained after the voiceprint recognition device compares the feature vector of each character in the verification voice information with the feature vector of the corresponding character in the preset registration voice information. In an alternative embodiment, the similarity judgment module 750 may calculate the cosine distance between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in the preset registration voice information, and use it as the similarity score. That is, the similarity score between the feature vector of a given character in the verification voice information and its feature vector in the registration voice information is calculated by the standard cosine formula: score_i = (ωi(tar) · ωi(test)) / (‖ωi(tar)‖ ‖ωi(test)‖),
where the subscript i denotes the i-th character shared by the verification voice information and the registration voice information, ωi(tar) denotes the feature vector of that character in the verification voice information, and ωi(test) denotes the feature vector of that character in the registration voice information. In an alternative embodiment, the same character may occur more than once in the verification voice information; for example, in the verification voice information shown in Fig. 2, the characters 0, 1, 5 and 8 each occur twice. In that case, the two speech segments of character 0 are each processed into a feature vector, each is scored against the feature vector of character 0 in the preset registration voice information, and the average of the two similarity scores is taken as the similarity score between the feature vector of character 0 in this verification voice information and the feature vector of character 0 in the preset registration voice information, and so on for the other characters.
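Averaging the scores of a character that is spoken more than once can be sketched as follows. The input layout, a list of (character, feature vector) pairs in spoken order, and the function names are assumptions; the vectors are toy values.

```python
import math
from collections import defaultdict

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def per_character_scores(segments, reg_vectors):
    """Score every occurrence, then average the scores of repeated characters."""
    collected = defaultdict(list)
    for char, vec in segments:
        if char in reg_vectors:
            collected[char].append(cosine_similarity(vec, reg_vectors[char]))
    return {char: sum(s) / len(s) for char, s in collected.items()}

reg = {"0": [1.0, 0.0], "5": [0.0, 1.0]}
spoken = [("0", [1.0, 0.0]), ("5", [0.0, 2.0]), ("0", [0.0, 1.0])]
scores = per_character_scores(spoken, reg)
```

The character "0" is spoken twice with scores 1.0 and 0.0, so its averaged score is 0.5, while "5" occurs once and keeps its single score.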
It should be noted that there are many other ways to measure the similarity between two feature vectors; the above is only one embodiment provided by the present invention. On the basis of the scheme disclosed by the present invention, those skilled in the art can, without creative effort, derive further ways of calculating the similarity score between the feature vectors of the characters shared by the verification voice information and the registration voice information, which the present invention does not enumerate exhaustively.
User identification module 760, configured to determine, if the similarity score reaches the preset verification threshold, that the verifying user is the registered user corresponding to the registration voice information.
If the verification voice information and the registration voice information contain multiple identical characters, the user identification module 760 may average the similarity scores of the individual characters calculated by the similarity judgment module 750; if the mean similarity score reaches the corresponding preset verification threshold, the verifying user is determined to be the registered user corresponding to the registration voice information. If there are multiple registered users, such as registered users A, B and C shown in Fig. 1, the user identification module 760 may compare the feature vector of a character of the verifying user with the feature vector of the corresponding character of each registered user; when the feature vector of the corresponding character of some registered user has the highest similarity score with the feature vector of that character in the verification speech, and the score reaches the preset verification threshold, that registered user is taken as the identification result for the verifying user.
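The selection among several registered users can be sketched as follows, assuming a mean similarity score has already been computed for each registered user. The user labels follow the A, B, C example in the text; the score values, the threshold, and the function name are hypothetical.

```python
def identify(mean_scores, threshold=0.8):
    """Return the best-scoring registered user if the score reaches the threshold."""
    if not mean_scores:
        return None
    best_user = max(mean_scores, key=mean_scores.get)
    return best_user if mean_scores[best_user] >= threshold else None

result = identify({"A": 0.62, "B": 0.91, "C": 0.85})
rejected = identify({"A": 0.62, "B": 0.55, "C": 0.41})
```

Both conditions from the text are applied: the user must have the highest score, and that score must also reach the threshold, so a verifying user who matches no one well is rejected rather than assigned to the nearest registered user.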
Further, in an alternative embodiment, the voice acquisition module 710 is also configured to obtain the registration voice information produced when a registered user reads a second character string aloud, where the second character string has at least one character in common with the first character string;
the speech segment recognition module 720 is also configured to perform speech recognition on the registration voice information to obtain the speech segments contained in it that respectively correspond to multiple characters in the second character string;
the voiceprint feature extraction module 730 is also configured to extract the voiceprint features of the speech segment corresponding to each character in the registration voice information;
the feature model training module 740 is also configured to train, according to the voiceprint features of the speech segment corresponding to each character in the registration voice information and in combination with the preset UBM of the corresponding character, the feature vector corresponding to each character in the registration voice information.
In an alternative embodiment, the voiceprint recognition device may further include:
Character order determination module 770, configured to determine that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
To effectively prevent voiceprint recognition from being carried out with a registered user's voice information after it has been illegally copied, a different first character string may be randomly generated each time. This step judges whether the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string. If it is not, voiceprint recognition may be judged to have failed; if it is, the voiceprint feature extraction module 730 or the feature model training module 740 may be notified to perform feature extraction and voiceprint training on the verification voice information.
In an alternative embodiment, the voiceprint recognition device may further include:
Character string display module 700, configured to randomly generate the first character string and display it.
Thus, in this embodiment, the voiceprint features of the speech segment corresponding to each character in the verifying user's verification voice information are obtained and combined with the preset UBM of the corresponding character to train the feature vector corresponding to each character in the verification voice information. The feature vector of each character in the verification voice information is then compared for similarity with the feature vector of the corresponding character in the registration voice information to determine the identity of the verifying user. Because the user feature vectors being compared correspond to specific characters, this approach fully takes into account the voiceprint characteristics of the user when reading different characters aloud, and can therefore effectively improve voiceprint recognition accuracy.
In an actual test example with training samples from 1,000 people and 290,000 trials (about 10,000 identity-matched trials and about 280,000 non-matched trials), a recall of 79.8% was achieved at an error rate of one in a thousand, and the equal error rate (EER) was 3.39%. Compared with the traditional text-independent modeling method, voiceprint recognition performance improved by more than 40%.
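For readers unfamiliar with these metrics, the sketch below shows how error rates and an approximate EER can be computed from trial scores. The score lists are made-up toy data unrelated to the reported test, and the threshold sweep is one simple approximation chosen for illustration; the function names are hypothetical.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false_accept_rate, false_reject_rate) at the given threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def approximate_eer(genuine, impostor):
    """Sweep observed scores as thresholds; pick the point where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

# Toy scores: identity-matched (genuine) and non-matched (impostor) trials.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, eer_threshold = approximate_eer(genuine, impostor)
```

Recall at a given threshold is one minus the false reject rate, so a figure such as recall at a fixed error rate corresponds to choosing the threshold that yields that false accept rate and reading off the recall there.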
Those of ordinary skill in the art will appreciate that all or part of the flow of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the flows of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above disclosure is only the preferred embodiments of the present invention and certainly cannot be used to limit the scope of the claims of the present invention; equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope of the present invention.

Claims (16)

CN201610416650.3A | 2016-06-12 | 2016-06-12 | A kind of method for recognizing sound-groove and device | Active | CN106098068B (en)

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
CN201610416650.3A, CN106098068B (en) | 2016-06-12 | 2016-06-12 | A kind of method for recognizing sound-groove and device
PCT/CN2017/087911, WO2017215558A1 (en) | 2016-06-12 | 2017-06-12 | Voiceprint recognition method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201610416650.3A, CN106098068B (en) | 2016-06-12 | 2016-06-12 | A kind of method for recognizing sound-groove and device

Publications (2)

Publication Number | Publication Date
CN106098068A (en) | 2016-11-09
CN106098068B (en) | 2019-07-16

Family

ID=57846666

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201610416650.3A (Active, CN106098068B (en)) | A kind of method for recognizing sound-groove and device | 2016-06-12 | 2016-06-12

Country Status (2)

Country | Link
CN (1) | CN106098068B (en)
WO (1) | WO2017215558A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11335352B2 (en)* | 2017-09-29 | 2022-05-17 | Tencent Technology (Shenzhen) Company Limited | Voice identity feature extractor and classifier training
US12130899B2 (en)* | 2019-07-29 | 2024-10-29 | Huawei Technologies Co., Ltd. | Voiceprint recognition method and device

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106098068B (en)* | 2016-06-12 | 2019-07-16 | 腾讯科技(深圳)有限公司 | A kind of method for recognizing sound-groove and device
US11283631B2 (en) | 2017-01-03 | 2022-03-22 | Nokia Technologies Oy | Apparatus, method and computer program product for authentication
CN108447471B (en)* | 2017-02-15 | 2021-09-10 | 腾讯科技(深圳)有限公司 | Speech recognition method and speech recognition device
CN107610708B (en)* | 2017-06-09 | 2018-06-19 | 平安科技(深圳)有限公司 | Identify the method and apparatus of vocal print
CN109102812B (en)* | 2017-06-21 | 2021-08-31 | 北京搜狗科技发展有限公司 | Voiceprint recognition method and system and electronic equipment
CN107492379B (en)* | 2017-06-30 | 2021-09-21 | 百度在线网络技术(北京)有限公司 | Voiceprint creating and registering method and device
CN107248410A (en)* | 2017-07-19 | 2017-10-13 | 浙江联运知慧科技有限公司 | The method that Application on Voiceprint Recognition dustbin opens the door
CN109559759B (en)* | 2017-09-27 | 2021-10-08 | 华硕电脑股份有限公司 | Electronic device with incremental registration unit and method thereof
CN107886943A (en)* | 2017-11-21 | 2018-04-06 | 广州势必可赢网络科技有限公司 | Voiceprint recognition method and device
CN108154588B (en)* | 2017-12-29 | 2020-11-27 | 深圳市艾特智能科技有限公司 | Unlocking method and system, readable storage medium and intelligent device
CN110047491A (en)* | 2018-01-16 | 2019-07-23 | 中国科学院声学研究所 | A kind of relevant method for distinguishing speek person of random digit password and device
CN108269590A (en)* | 2018-01-17 | 2018-07-10 | 广州势必可赢网络科技有限公司 | Vocal cord recovery scoring method and device
CN108447489B (en)* | 2018-04-17 | 2020-05-22 | 清华大学 | A continuous voiceprint authentication method and system with feedback
CN109147767B (en)* | 2018-08-16 | 2024-06-21 | 平安科技(深圳)有限公司 | Method, device, computer equipment and storage medium for recognizing numbers in voice
CN110875044B (en)* | 2018-08-30 | 2022-05-03 | 中国科学院声学研究所 | A speaker recognition method based on word correlation score calculation
CN109117622B (en)* | 2018-09-19 | 2020-09-01 | 北京容联易通信息技术有限公司 | Identity authentication method based on audio fingerprints
CN109257362A (en)* | 2018-10-11 | 2019-01-22 | 平安科技(深圳)有限公司 | Method, apparatus, computer equipment and the storage medium of voice print verification
CN111199729B (en)* | 2018-11-19 | 2023-09-26 | 阿里巴巴集团控股有限公司 | Voiceprint recognition method and voiceprint recognition device
CN109473107B (en)* | 2018-12-03 | 2020-12-22 | 厦门快商通信息技术有限公司 | Text semi-correlation voiceprint recognition method and system
CN111669350A (en)* | 2019-03-05 | 2020-09-15 | 阿里巴巴集团控股有限公司 | Identity verification method, verification information generation method, payment method and payment device
CN110738998A (en)* | 2019-09-11 | 2020-01-31 | 深圳壹账通智能科技有限公司 | Voice-based personal credit evaluation method, device, terminal and storage medium
CN110517695A (en)* | 2019-09-11 | 2019-11-29 | 国微集团(深圳)有限公司 | Verification method and device based on vocal print
CN110971763B (en)* | 2019-12-10 | 2021-01-26 | Oppo广东移动通信有限公司 | Arrival reminding method and device, storage medium and electronic equipment
CN110956732A (en)* | 2019-12-19 | 2020-04-03 | 重庆特斯联智慧科技股份有限公司 | Safety entrance guard based on thing networking
CN111081256A (en)* | 2019-12-31 | 2020-04-28 | 苏州思必驰信息科技有限公司 | Digital string voiceprint password verification method and system
CN111081260A (en)* | 2019-12-31 | 2020-04-28 | 苏州思必驰信息科技有限公司 | Method and system for identifying voiceprint of awakening word
CN111597531A (en)* | 2020-04-07 | 2020-08-28 | 北京捷通华声科技股份有限公司 | Identity authentication method and device, electronic equipment and readable storage medium
CN111613230A (en)* | 2020-06-24 | 2020-09-01 | 泰康保险集团股份有限公司 | Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium
CN112037815B (en)* | 2020-08-28 | 2024-09-06 | 中移(杭州)信息技术有限公司 | Audio fingerprint extraction method, server and storage medium
CN112487384B (en)* | 2020-11-25 | 2024-12-03 | 华为技术有限公司 | Identity verification method and system
CN112435673B (en)* | 2020-12-15 | 2024-05-14 | 北京声智科技有限公司 | Model training method and electronic terminal
CN112820299B (en)* | 2020-12-29 | 2021-09-14 | 马上消费金融股份有限公司 | Voiceprint recognition model training method and device and related equipment
CN113113022A (en)* | 2021-04-15 | 2021-07-13 | 吉林大学 | Method for automatically identifying identity based on voiceprint information of speaker
CN113570754B (en)* | 2021-07-01 | 2022-04-29 | 汉王科技股份有限公司 | Voiceprint lock control method and device and electronic equipment
CN114357417B (en)* | 2021-12-31 | 2025-04-08 | 中国科学院声学研究所东海研究站 | A self-learning dynamic voiceprint identity authentication method based on unknown corpus
CN114782141A (en)* | 2022-05-07 | 2022-07-22 | 中国工商银行股份有限公司 | Product interaction method and device based on 5G message, electronic equipment and medium
CN115019808B (en)* | 2022-06-01 | 2025-07-11 | 科大讯飞股份有限公司 | Voiceprint extraction method, device, equipment and readable storage medium
CN115602177A (en)* | 2022-10-12 | 2023-01-13 | 中国电信股份有限公司(Cn) | Voiceprint recognition method and device and computer-readable storage medium
EP4602455A1 (en)* | 2022-10-14 | 2025-08-20 | Qualcomm Incorporated | Voice-based user authentication
CN115641852A (en)* | 2022-10-18 | 2023-01-24 | 中国电信股份有限公司 | Voiceprint recognition method and device, electronic equipment and computer readable storage medium
CN115550075B (en)* | 2022-12-01 | 2023-05-09 | 中网道科技集团股份有限公司 | Anti-counterfeiting processing method and equipment for community correction object public welfare activity data
CN116530944B (en)* | 2023-07-06 | 2023-10-20 | 荣耀终端有限公司 | Sound processing method and electronic equipment
CN116978368B (en)* | 2023-09-25 | 2023-12-15 | 腾讯科技(深圳)有限公司 | Wake-up word detection method and related device
CN120279894A (en)* | 2025-05-14 | 2025-07-08 | 广东公信智能会议股份有限公司 | Man-machine interaction method and system for voice recognition

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN102254559A (en)* | 2010-05-20 | 2011-11-23 | 盛乐信息技术(上海)有限公司 | Identity authentication system and method based on vocal print
CN102314877A (en)* | 2010-07-08 | 2012-01-11 | 盛乐信息技术(上海)有限公司 | Voiceprint identification method for character content prompt
CN102737634A (en)* | 2012-05-29 | 2012-10-17 | 百度在线网络技术(北京)有限公司 | Authentication method and device based on voice
CN104282303A (en)* | 2013-07-09 | 2015-01-14 | 威盛电子股份有限公司 | Method and electronic device for speech recognition using voiceprint recognition
CN104901808A (en)* | 2015-04-14 | 2015-09-09 | 时代亿宝(北京)科技有限公司 | Voiceprint authentication system and method based on time type dynamic password

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR100406307B1 (en)* | 2001-08-09 | 2003-11-19 | 삼성전자주식회사 | Voice recognition method and system based on voice registration method and system
CN101997689B (en)* | 2010-11-19 | 2012-08-08 | 吉林大学 | USB (universal serial bus) identity authentication method based on voiceprint recognition and system thereof
CN102163427B (en)* | 2010-12-20 | 2012-09-12 | 北京邮电大学 | Method for detecting audio exceptional event based on environmental model
CN102238189B (en)* | 2011-08-01 | 2013-12-11 | 安徽科大讯飞信息科技股份有限公司 | Voiceprint password authentication method and system
CN103679452A (en)* | 2013-06-20 | 2014-03-26 | 腾讯科技(深圳)有限公司 | Payment authentication method, device thereof and system thereof
CN104064189A (en)* | 2014-06-26 | 2014-09-24 | 厦门天聪智能软件有限公司 | Vocal print dynamic password modeling and verification method
CN104575504A (en)* | 2014-12-24 | 2015-04-29 | 上海师范大学 | Method for personalized television voice wake-up by voiceprint and voice identification
CN105096121B (en)* | 2015-06-25 | 2017-07-25 | 百度在线网络技术(北京)有限公司 | voiceprint authentication method and device
CN105656887A (en)* | 2015-12-30 | 2016-06-08 | 百度在线网络技术(北京)有限公司 | Artificial intelligence-based voiceprint authentication method and device
CN106098068B (en)* | 2016-06-12 | 2019-07-16 | 腾讯科技(深圳)有限公司 | A kind of method for recognizing sound-groove and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11335352B2 (en)* | 2017-09-29 | 2022-05-17 | Tencent Technology (Shenzhen) Company Limited | Voice identity feature extractor and classifier training
US12112757B2 (en) | 2017-09-29 | 2024-10-08 | Tencent Technology (Shenzhen) Company Limited | Voice identity feature extractor and classifier training
US12130899B2 (en)* | 2019-07-29 | 2024-10-29 | Huawei Technologies Co., Ltd. | Voiceprint recognition method and device

Also Published As

Publication number | Publication date
WO2017215558A1 (en) | 2017-12-21
CN106098068A (en) | 2016-11-09

Similar Documents

Publication | Publication Date | Title
CN106098068B (en) | | A kind of method for recognizing sound-groove and device
CN106057206B (en) | | Sound-groove model training method, method for recognizing sound-groove and device
CN107104803B (en) | | User identity authentication method based on digital password and voiceprint joint confirmation
CN111402862B (en) | | Speech recognition method, device, storage medium and equipment
Meyer et al. | | Anonymizing speech with generative adversarial networks to preserve speaker privacy
CN109243465A (en) | | Voiceprint authentication method, device, computer equipment and storage medium
Mansour et al. | | Voice recognition using dynamic time warping and mel-frequency cepstral coefficients algorithms
CN104765996B (en) | | Voiceprint password authentication method and system
WO2016150032A1 (en) | | Artificial intelligence based voiceprint login method and device
CN112712809B (en) | | Voice detection method and device, electronic equipment and storage medium
KR101988165B1 (en) | | Method and system for improving the accuracy of speech recognition technology based on text data analysis for deaf students
CN106782603A (en) | | Intelligent sound evaluating method and system
CN107346568A (en) | | The authentication method and device of a kind of gate control system
Beigi | | Challenges of LargeScale Speaker Recognition
CN111613230A (en) | | Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium
Saquib et al. | | A survey on automatic speaker recognition systems
CN110111798A (en) | | A kind of method and terminal identifying speaker
CN110364180A (en) | | A kind of examination system and method based on audio-video processing
CN114220419A (en) | | A voice evaluation method, device, medium and equipment
Mandalapu et al. | | Multilingual voice impersonation dataset and evaluation
Jokinen et al. | | Variation in Spoken North Sami Language.
CN105976819A (en) | | Rnorm score normalization based speaker verification method
CN106128464B (en) | | UBM divides the method for building up of word model, vocal print feature generation method and device
Lin et al. | | Self-supervised acoustic word embedding learning via correspondence transformer encoder
CN117636873A (en) | | Voice recognition method, device, equipment and storage medium

Legal Events

Date | Code | Title | Description
 | C06 | Publication |
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | SE01 | Entry into force of request for substantive examination |
 | GR01 | Patent grant |
 | GR01 | Patent grant |
 | TR01 | Transfer of patent right |

Effective date of registration: 2023-07-12

Address after: 518057 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 floors

Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.

Address before: 2, 518000, East 403 room, SEG science and Technology Park, Zhenxing Road, Shenzhen, Guangdong, Futian District

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

