CN104183240A - Vocal print feature fusion method and device - Google Patents

Vocal print feature fusion method and device

Info

Publication number
CN104183240A
Authority
CN
China
Prior art keywords
voiceprint feature
distance
vocal print
average
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410408952.7A
Other languages
Chinese (zh)
Inventor
刘镝
张云勇
张尼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd
Priority to CN201410408952.7A
Publication of CN104183240A
Legal status: Pending

Abstract

Translated from Chinese

The invention discloses a voiceprint feature fusion method and device. The method comprises: among a user's multiple voiceprint feature vectors, calculating the average KL distance between every two of them, where the average KL distance of two voiceprint feature vectors is the KL distance between the probability distributions of the first and the second voiceprint feature vector, plus the KL distance between the probability distributions of the second and the first voiceprint feature vector, divided by 2; and selecting the two voiceprint features with the largest average KL distance for fusion. The invention uses the average KL distance between voiceprint feature vectors to represent the effective amount of information gained by fusing two features, so that the two features with the largest separability information are selected for fusion more accurately, which better supports personal identity authentication and improves authentication accuracy.

Description

Voiceprint feature fusion method and device
Technical field
The present invention relates to the field of communications, and in particular to a voiceprint feature fusion method and device.
Background
With the continuous evolution of cloud computing technology, a large number of cloud platforms have emerged, such as Amazon's AWS (Amazon Web Services) and, in China, Alibaba Cloud and Wo Cloud. The computing power of these platforms is widely used in production systems such as the 12306 train ticket booking website and Alibaba's Taobao platform. These platforms store massive amounts of user data in their database clusters. The sheer data volume increases the management and maintenance burden on the cloud platform's database administrators (Database Administrator, DBA for short). In addition, the IDC data center that hosts the cloud platform's databases is often physically distant from the administrators' office area. To manage and maintain the databases more conveniently, cloud platform DBAs often map the database management system (DBMS) onto the public network and log in through a public IP address to carry out management and operations work. However, this approach has the following defect:
Because the cloud platform databases carry a large amount of data, the DBA needs to monitor their state at all times. When DBA staff are away from the office area, they cannot log in to the DBMS in real time from the computer terminals there, and so cannot maintain and control the databases in real time.
To address the above defects, a mobile client system deeply customized for cloud platform DBAs can be designed. To ensure that DBAs log in securely from the remote client and to prevent account theft, the industry has designed high-strength login authentication schemes based on biometric recognition. Biometric recognition uses unique human biological characteristics, such as fingerprints, faces and voices, to authenticate the real user, and is more secure than the traditional approach of entering a user name and password. Voiceprint recognition is a particularly suitable choice for a mobile client: every mobile phone already has a voiceprint collector (the microphone), so users do not need to replace their devices, which saves cost. As for the problem of stolen biometric features, if an attacker has secretly recorded a user's speech, the system can prescribe the utterance to be spoken at login, which prevents the attacker from impersonating the user with the recording.
However, traditional single-mode voiceprint authentication algorithms still have shortcomings, mainly because a single voiceprint feature extraction method can degrade system performance. Feature vectors collected with a single feature extraction method cannot fully represent the characteristics of the original biometric sample or fully reflect its separability information (discriminatory information), so the system's recognition accuracy declines.
To overcome this problem, the idea of information fusion has been introduced into voiceprint recognition, giving voiceprint recognition fusion techniques. A fusion scheme, such as feature-level fusion of voiceprint features, combines the features, and the fused separability information serves as the key feature for identifying a person, so that the system authenticates identity better. The resulting problem is that the industry uses many voiceprint features for voiceprint recognition, such as MFCC, residual phase, LPCC, MVDR and MLSF. To maximize the authentication accuracy of the DBA mobile client and to prevent genuine DBA users from being misjudged when they log in, a difficult question arises when two or more voiceprint features are available: how to choose the two features whose fusion yields the maximum separability information, so that the authentication accuracy of the final system is maximized.
Summary of the invention
The technical problem to be solved by the present invention is to provide a voiceprint feature fusion method and device that select the two features with the largest separability information for fusion more accurately, so as to better realize personal identity authentication and improve authentication accuracy.
To solve the above technical problem, the invention provides a voiceprint feature fusion method, comprising:
Among a user's multiple voiceprint feature vectors, calculating the average KL distance between every two voiceprint feature vectors of the user, where the average KL distance of two voiceprint feature vectors is the KL distance between the probability distributions of the first and the second voiceprint feature vector, plus the KL distance between the probability distributions of the second and the first voiceprint feature vector, divided by 2;
Selecting the two voiceprint features with the largest average KL distance for fusion.
Further, before the step of calculating the average KL distance between any two voiceprint feature vectors of the user, the method also comprises:
Extracting two or more voiceprint feature vectors for the user.
Further, extracting two or more voiceprint feature vectors for the user comprises:
Collecting the user's voice signal with a sensor, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.
Further, calculating the average KL distance between any two voiceprint feature vectors of the user comprises:
Obtaining the two voiceprint feature vectors, and calculating the mean and covariance of the distribution of each;
Constructing the probability distributions of the two voiceprint feature vector spaces from those means and covariances;
Calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
Further, the probability distributions of the two voiceprint feature vectors are Gaussian distributions.
To solve the above technical problem, the invention also provides a voiceprint feature fusion device, comprising:
A fusion information calculation module, configured to calculate, among a user's multiple voiceprint feature vectors, the average KL distance between every two voiceprint feature vectors of the user, where the average KL distance of two voiceprint feature vectors is the KL distance between the probability distributions of the first and the second voiceprint feature vector, plus the KL distance between the probability distributions of the second and the first voiceprint feature vector, divided by 2;
A voiceprint feature fusion module, configured to select the two voiceprint feature vectors with the largest average KL distance for fusion.
Further, the device also comprises a voiceprint feature extraction module, configured to extract two or more voiceprint feature vectors for the user.
Further, the voiceprint feature extraction module extracting two or more voiceprint feature vectors for the user comprises:
Collecting the user's voice signal with the same sensor or with different sensors, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.
Further, the fusion information calculation module calculating the average KL distance between any two voiceprint feature vectors of the user comprises:
Obtaining the two voiceprint feature vectors, and calculating the mean and covariance of the distribution of each;
Constructing the probability distributions of the two voiceprint feature vector spaces from those means and covariances;
Calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
Further, the probability distributions of the two voiceprint feature vectors are Gaussian distributions.
Compared with the prior art, the voiceprint feature fusion method and device provided by the embodiments of the invention use the average KL distance between voiceprint feature vectors to represent the effective amount of information gained by fusing two features. Feature-level fusion is then performed on the two voiceprint features whose correlation is as small as possible, which yields as much separability information as possible. The information content of the feature-level fusion algorithm is measured more accurately, the two features with the largest separability information are selected more accurately, and the fused separability information serves as the key feature for identifying a person, so that the system authenticates identity better and the accuracy of the authentication algorithm improves.
Brief description of the drawings
Fig. 1 is a flowchart of the voiceprint feature fusion method in an embodiment;
Fig. 2 is a structural diagram of the voiceprint feature fusion device in an embodiment.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be noted that, provided they do not conflict, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.
Embodiment:
The average Kullback-Leibler distance is first introduced here.
The traditional Kullback-Leibler distance, KL distance for short, is widely used to measure the "distance" (similarity) between two random distributions P(x) and Q(x). Its properties are as follows:
A. Non-negativity:
$$D_{KL}(Q\|P) \ge 0, \quad \forall P(x), Q(x) \tag{1}$$
B. Asymmetry (in general):
$$D_{KL}(P\|Q) \ne D_{KL}(Q\|P) \tag{2}$$
C. Identity:
$$D_{KL}(P\|Q) = 0 \ \text{ or } \ D_{KL}(Q\|P) = 0 \ \text{ if and only if } \ P(x) = Q(x) \tag{3}$$
Here $D_{KL}(Q\|P)$ denotes the KL distance between the probability distributions Q(x) and P(x), and $D_{KL}(P\|Q)$ the KL distance between P(x) and Q(x). If the KL distance between P(x) and Q(x) is large, the correlation between the two distributions is relatively small; otherwise the correlation is large. For conventional voiceprint fusion algorithms, the smaller the correlation between the distributions of two voiceprint feature spaces, the more information is gained by fusing them, so these two features can be chosen to implement the user's voiceprint fusion authentication scheme. The KL distance between voiceprint feature vectors can therefore represent the effective amount of information gained by fusing two features. However, the KL distance is asymmetric, i.e. $D_{KL}(P\|Q) \ne D_{KL}(Q\|P)$, so it cannot be used directly to measure the fusion information or to select voiceprint features.
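The text lists the properties of the KL distance without writing out the distance itself. For reference, the standard definition is given below; its continuous form matches the integrand used in the Gaussian derivation later in this description. The discrete form and the symbol $\mathcal{X}$ are added here for illustration and do not appear in the original.

```latex
D_{KL}(P\|Q) = \int P(x)\,\bigl(\log P(x) - \log Q(x)\bigr)\,dx
\qquad\text{or, for discrete } x \in \mathcal{X},\qquad
D_{KL}(P\|Q) = \sum_{x \in \mathcal{X}} P(x)\,\log\frac{P(x)}{Q(x)}
```

Swapping P and Q changes which distribution weights the log-ratio, which is why the two directions generally give different values.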
To measure the information content of a feature-level fusion algorithm more accurately, and to select the two features with the largest separability information more accurately, this scheme proposes the average Kullback-Leibler distance, average KL distance for short. This distance is symmetric, which overcomes the deviation in the computed information caused by the asymmetry of the traditional KL distance. The average KL distance $D_{Aver\_KL}(P\|Q)$ between two random distributions P(x) and Q(x) is:
$$D_{Aver\_KL}(P\|Q) = \frac{D_{KL}(P\|Q) + D_{KL}(Q\|P)}{2} \tag{4}$$
The properties of this distance include:
A. Non-negativity:
$$D_{Aver\_KL}(P\|Q) \ge 0, \quad \forall P(x), Q(x) \tag{5}$$
B. Symmetry:
$$D_{Aver\_KL}(P\|Q) = D_{Aver\_KL}(Q\|P) \tag{6}$$
C. Identity:
$$D_{Aver\_KL}(P\|Q) = 0 \ \text{ or } \ D_{Aver\_KL}(Q\|P) = 0 \ \text{ if and only if } \ P(x) = Q(x) \tag{7}$$
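The properties listed above can be checked numerically. The following is a minimal sketch for discrete distributions with full support, using natural logarithms; the function names and the two example distributions are made up for illustration and are not part of the patent.

```python
import numpy as np


def kl_distance(p, q):
    """D_KL(P||Q) in nats for two discrete distributions with no zero entries."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


def average_kl_distance(p, q):
    """Average KL distance: (D_KL(P||Q) + D_KL(Q||P)) / 2."""
    return 0.5 * (kl_distance(p, q) + kl_distance(q, p))


# Two hypothetical distributions over three outcomes (illustrative values only).
P = [0.7, 0.2, 0.1]
Q = [0.1, 0.3, 0.6]

d_pq = kl_distance(P, Q)           # one direction
d_qp = kl_distance(Q, P)           # the other direction, generally different
d_avg_pq = average_kl_distance(P, Q)
d_avg_qp = average_kl_distance(Q, P)
```

For these inputs the two plain KL values differ, while the averaged value is the same in both directions, which is the property the selection step relies on.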
As shown in Fig. 1, this embodiment provides a method that performs voiceprint feature fusion based on the average KL distance between voiceprint features. It calculates the average KL distance between two feature vectors for feature-level voiceprint fusion, and comprises the following steps:
S101: among a user's multiple voiceprint feature vectors, calculating the average KL distance between every two voiceprint feature vectors of the user;
S102: selecting the two voiceprint feature vectors with the largest average KL distance for fusion.
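Steps S101 and S102 amount to an exhaustive search over unordered pairs of feature types. A minimal sketch follows; the function `select_fusion_pair`, the pluggable `pair_distance` callable and the toy distance values are all hypothetical and serve only to illustrate the selection logic.

```python
from itertools import combinations


def select_fusion_pair(feature_names, pair_distance):
    """Return the pair of feature names with the largest pairwise distance.

    pair_distance(a, b) is assumed to be symmetric, as the average KL
    distance is, so each unordered pair is evaluated once.
    """
    pairs = list(combinations(feature_names, 2))
    if not pairs:
        raise ValueError("at least two feature types are required")
    return max(pairs, key=lambda pair: pair_distance(*pair))


# Made-up average KL distances, purely for illustration.
toy_distances = {
    frozenset(("MFCC", "LPCC")): 0.8,
    frozenset(("MFCC", "MLSF")): 2.1,
    frozenset(("LPCC", "MLSF")): 1.3,
}

best_pair = select_fusion_pair(
    ["MFCC", "LPCC", "MLSF"],
    lambda a, b: toy_distances[frozenset((a, b))],
)
```

With k feature types the search evaluates k(k-1)/2 pairs, which stays small for the handful of feature types named in this description.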
Before step S101, the method also comprises: extracting two or more voiceprint feature vectors for the user.
Specifically:
The user's voice signal is collected with the same sensor or with two different sensors, and two or more different voiceprint feature algorithms are applied to the collected voice signal to extract different voiceprint feature vectors.
Different feature extraction algorithms yield different voiceprint vectors, for example MFCC (Mel-frequency cepstral coefficients), residual phase, LPCC (linear prediction cepstral coefficients) and MLSF (Mel linear spectral function). In short, many feature extraction algorithms are available for extracting voiceprint features.
In addition, the user's voice signal can be collected either by the same sensor or by different sensors. For example, one signal is collected with an ordinary mobile phone microphone and another with a professional microphone (for example, one designed for professional acoustic signal capture); the MFCC and LPCC feature extraction algorithms commonly used in the industry are then applied to the signals collected by the two devices respectively, yielding two kinds of voiceprint feature vectors.
Step S101 specifically comprises:
S101a: obtaining the two voiceprint feature vectors, and calculating the mean and covariance of the distribution of each;
In one application example, for a given speaker P, any two voiceprint features A and B are selected. Suppose A and B have $n_A$ and $n_B$ voiceprint feature vectors respectively, $X_A = \{x_{Ai},\ i = 1, \dots, n_A\}$ and $X_B = \{x_{Bi},\ i = 1, \dots, n_B\}$, with vector dimensions $f_A \times 1$ and $f_B \times 1$. The means of the two feature vector distributions are
$$\mu_A = E_A[X] = \frac{1}{n_A}\sum_{i=1}^{n_A} x_{Ai}, \tag{8}$$
$$\mu_B = E_B[X] = \frac{1}{n_B}\sum_{i=1}^{n_B} x_{Bi}. \tag{9}$$
The covariances of the two feature vector distributions are
$$\Sigma_A = E_A\left[(X-\mu_A)(X-\mu_A)^{t}\right] = \frac{1}{n_A-1}\sum_{i=1}^{n_A}(x_{Ai}-\mu_A)(x_{Ai}-\mu_A)^{t}, \tag{10}$$
$$\Sigma_B = E_B\left[(X-\mu_B)(X-\mu_B)^{t}\right] = \frac{1}{n_B-1}\sum_{i=1}^{n_B}(x_{Bi}-\mu_B)(x_{Bi}-\mu_B)^{t}. \tag{11}$$
where $\Sigma_A$ and $\Sigma_B$ have dimensions $f_A \times f_A$ and $f_B \times f_B$ respectively.
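The sample mean and the unbiased sample covariance described in step S101a can be sketched with NumPy as follows. The layout (one feature vector per row) and the synthetic data are assumptions made for illustration; real inputs would be the extracted voiceprint feature vectors.

```python
import numpy as np


def mean_and_covariance(X):
    """Sample mean and unbiased covariance of feature vectors stored as rows.

    X has shape (n, f): n feature vectors of dimension f.
    Returns mu with shape (f,) and sigma with shape (f, f).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n < 2:
        raise ValueError("at least two feature vectors are required")
    mu = X.mean(axis=0)
    centered = X - mu
    # Sum of outer products (x_i - mu)(x_i - mu)^t, normalized by n - 1.
    sigma = centered.T @ centered / (n - 1)
    return mu, sigma


# Synthetic stand-in for n_A = 200 feature vectors of dimension f_A = 3.
rng = np.random.default_rng(0)
X_A = rng.normal(size=(200, 3))
mu_A, sigma_A = mean_and_covariance(X_A)
```

The result agrees with `np.cov(X_A, rowvar=False)`, which also normalizes by n - 1 by default.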
S101b: constructing the probability distributions of the two voiceprint feature vector spaces from the means and covariances of the two voiceprint feature vector distributions;
As a preferred approach, the probability distributions of the two voiceprint feature vector spaces are modeled as Gaussian distributions, for two reasons: (1) the Gaussian distribution models many naturally occurring distributions well; (2) with this model the standard deviation at the entropy extremum can be solved for, and an upper bound can be placed on the average KL distance.
Under the Gaussian model, the probability distributions corresponding to the two voiceprint feature vector spaces A and B are:
$$p_A(x) = \frac{1}{\sqrt{|2\pi\Sigma_A|}} \exp\left[-\frac{1}{2}(x-\mu_A)^{t}\,\Sigma_A^{-1}\,(x-\mu_A)\right], \tag{12}$$
$$p_B(x) = \frac{1}{\sqrt{|2\pi\Sigma_B|}} \exp\left[-\frac{1}{2}(x-\mu_B)^{t}\,\Sigma_B^{-1}\,(x-\mu_B)\right]. \tag{13}$$
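A multivariate Gaussian density of this kind is usually evaluated in log form for numerical stability. The sketch below assumes the standard normalization with the square root of the determinant of 2πΣ; the function name is hypothetical.

```python
import numpy as np


def gaussian_log_density(x, mu, sigma):
    """Natural log of the multivariate Gaussian density N(x; mu, sigma)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    diff = x - mu
    # log|2*pi*Sigma| computed stably; the sign must be positive for a valid covariance.
    sign, logdet = np.linalg.slogdet(2.0 * np.pi * sigma)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Mahalanobis term (x - mu)^t Sigma^{-1} (x - mu) without forming the inverse.
    mahalanobis = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (logdet + mahalanobis)


# Standard normal in one and two dimensions, evaluated at the mean.
log_p_1d = gaussian_log_density([0.0], [0.0], [[1.0]])
log_p_2d = gaussian_log_density([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice for stability when the estimated covariance is poorly conditioned.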
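S101c: calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
Again taking the Gaussian distribution as an example, the average KL distance between the two features is calculated from the probability distributions of the two voiceprint feature vector spaces above:
$$
\begin{aligned}
D_{Aver\_KL}(p_A\|p_B)
&= \frac{1}{2}\left\{\int p_A(x)\bigl(\log p_A(x) - \log p_B(x)\bigr)\,dx + \int p_B(x)\bigl(\log p_B(x) - \log p_A(x)\bigr)\,dx\right\} \\
&= \frac{1}{2}\Bigl\{-\frac{\log e}{2}\Bigl(\ln|2\pi\Sigma_A| - \ln|2\pi\Sigma_B| + f - E_A\bigl[(x-\mu_B)^{t}\Sigma_B^{-1}(x-\mu_B)\bigr]\Bigr) \\
&\qquad\; -\frac{\log e}{2}\Bigl(\ln|2\pi\Sigma_B| - \ln|2\pi\Sigma_A| + f - E_B\bigl[(x-\mu_A)^{t}\Sigma_A^{-1}(x-\mu_A)\bigr]\Bigr)\Bigr\} \\
&= \frac{1}{2}\Bigl\{\frac{\log e}{2}\Bigl(\ln\frac{|2\pi\Sigma_B|}{|2\pi\Sigma_A|} + \mathrm{trace}\bigl((\Sigma_A + (\mu_A-\mu_B)(\mu_A-\mu_B)^{t})\Sigma_B^{-1} - I\bigr)\Bigr) \\
&\qquad\; +\frac{\log e}{2}\Bigl(\ln\frac{|2\pi\Sigma_A|}{|2\pi\Sigma_B|} + \mathrm{trace}\bigl((\Sigma_B + (\mu_B-\mu_A)(\mu_B-\mu_A)^{t})\Sigma_A^{-1} - I\bigr)\Bigr)\Bigr\}.
\end{aligned}
\tag{14}
$$
Here $f = f_A = f_B$ is the common dimension of the two feature vector spaces (the KL distance is only defined when both distributions live on the same space), $I$ is the $f \times f$ identity matrix, and the factor $\log e$ converts natural logarithms to the base of $\log$.
According to the above formula, the distance $D_{Aver\_KL}(p_A\|p_B)$ between two voiceprint features of user P can be calculated; this is the feature-level fusion information. Feature selection is then performed on this basis: the two voiceprint features with the smallest correlation (that is, the largest average KL distance) are chosen for fusion, which yields better system authentication accuracy and enables secure login for the DBA mobile client.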
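For Gaussian models the average KL distance has a closed form, so no numerical integration is needed. The sketch below uses the standard closed-form KL divergence between two Gaussians, measured in nats (natural logarithms, so the base-conversion factor is 1). It assumes both feature types have the same dimension; the function names are hypothetical.

```python
import numpy as np


def gaussian_kl(mu_a, sigma_a, mu_b, sigma_b):
    """D_KL(p_A || p_B) in nats for two Gaussians of the same dimension."""
    mu_a = np.atleast_1d(np.asarray(mu_a, dtype=float))
    mu_b = np.atleast_1d(np.asarray(mu_b, dtype=float))
    sigma_a = np.atleast_2d(np.asarray(sigma_a, dtype=float))
    sigma_b = np.atleast_2d(np.asarray(sigma_b, dtype=float))
    if mu_a.shape != mu_b.shape:
        raise ValueError("both Gaussians must have the same dimension")
    f = mu_a.shape[0]
    diff = mu_a - mu_b
    _, logdet_a = np.linalg.slogdet(sigma_a)
    _, logdet_b = np.linalg.slogdet(sigma_b)
    # trace(Sigma_B^{-1} (Sigma_A + diff diff^t)) covers the covariance and mean terms.
    trace_term = np.trace(np.linalg.solve(sigma_b, sigma_a + np.outer(diff, diff)))
    return 0.5 * (logdet_b - logdet_a + trace_term - f)


def gaussian_average_kl(mu_a, sigma_a, mu_b, sigma_b):
    """Average of the two directed KL distances; the log-determinant terms cancel."""
    return 0.5 * (gaussian_kl(mu_a, sigma_a, mu_b, sigma_b)
                  + gaussian_kl(mu_b, sigma_b, mu_a, sigma_a))


# One-dimensional check: N(0, 1) against N(1, 4).
d_ab = gaussian_kl([0.0], [[1.0]], [1.0], [[4.0]])
d_ba = gaussian_kl([1.0], [[4.0]], [0.0], [[1.0]])
d_avg = gaussian_average_kl([0.0], [[1.0]], [1.0], [[4.0]])
```

Because a constant positive factor does not change which pair has the largest distance, the choice of logarithm base has no effect on the selection in step S102.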
As shown in Fig. 2, this embodiment provides a voiceprint feature fusion device, comprising:
A fusion information calculation module, configured to calculate, among a user's multiple voiceprint feature vectors, the average KL distance between every two voiceprint feature vectors of the user;
A voiceprint feature fusion module, configured to select the two voiceprint feature vectors with the largest average KL distance for fusion.
The device also comprises:
A voiceprint feature extraction module, configured to extract two or more voiceprint feature vectors for the user.
The voiceprint feature extraction module extracting two or more voiceprint feature vectors for the user comprises:
Collecting the user's voice signal with the same sensor or with different sensors, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.
The fusion information calculation module calculating the average KL distance between any two voiceprint feature vectors of the user comprises:
Obtaining the two voiceprint feature vectors, and calculating the mean and covariance of the distribution of each;
Constructing the probability distributions of the two voiceprint feature vector spaces from those means and covariances;
Calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
The probability distributions of the two voiceprint feature vector spaces are Gaussian distributions. The specific calculation is the same as for the method of Fig. 1 and is not repeated here.
It should also be noted that the method and device provided by this embodiment are not limited to DBA mobile clients. They apply equally to other terminals or systems that require security authentication; for example, online banking login pages and PC authentication systems can use the voiceprint feature fusion method and device provided in this embodiment.
As can be seen from the above embodiment, compared with the prior art, the voiceprint feature fusion method and device provided here use the average KL distance between voiceprint feature vectors to represent the effective amount of information gained by fusing two features. Feature-level fusion is then performed on the two voiceprint features whose correlation is as small as possible, which yields as much separability information as possible. The information content of the feature-level fusion algorithm is measured more accurately, the two features with the largest separability information are selected more accurately, and the fused separability information serves as the key feature for identifying a person, so that the system authenticates identity better and the accuracy of the authentication algorithm improves.
Those of ordinary skill in the art will appreciate that all or part of the steps in the above method can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Alternatively, all or part of the steps of the above embodiment can be implemented with one or more integrated circuits. Accordingly, each module or unit in the above embodiment can be implemented in hardware or as a software functional module. The present invention is not limited to any particular combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Other embodiments are possible in accordance with the summary of the invention. Those of ordinary skill in the art may make corresponding changes and variations according to the invention without departing from its spirit and essence; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (10)

Translated from Chinese

1. A voiceprint feature fusion method, comprising:
among a user's multiple voiceprint feature vectors, calculating the average KL distance between every two voiceprint feature vectors of the user, wherein the average KL distance of two voiceprint feature vectors is the KL distance between the probability distributions of the first and the second voiceprint feature vector, plus the KL distance between the probability distributions of the second and the first voiceprint feature vector, divided by 2;
selecting the two voiceprint features with the largest average KL distance for fusion.

2. The method of claim 1, wherein:
before the step of calculating the average KL distance between any two voiceprint feature vectors of the user, the method further comprises:
extracting two or more voiceprint feature vectors for the user.

3. The method of claim 2, wherein:
extracting two or more voiceprint feature vectors for the user comprises:
collecting the user's voice signal with a sensor, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.

4. The method of claim 2, wherein:
calculating the average KL distance between any two voiceprint feature vectors of the user comprises:
obtaining the two voiceprint feature vectors, and calculating the mean and covariance of the distribution of each;
constructing the probability distributions of the two voiceprint feature vector spaces from the means and covariances of the two voiceprint feature vector distributions;
calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.

5. The method of claim 4, wherein:
the probability distributions of the two voiceprint feature vectors are Gaussian distributions.

6. A voiceprint feature fusion device, comprising:
a fusion information calculation module, configured to calculate, among a user's multiple voiceprint feature vectors, the average KL distance between every two voiceprint feature vectors of the user, wherein the average KL distance of two voiceprint feature vectors is the KL distance between the probability distributions of the first and the second voiceprint feature vector, plus the KL distance between the probability distributions of the second and the first voiceprint feature vector, divided by 2;
a voiceprint feature fusion module, configured to select the two voiceprint feature vectors with the largest average KL distance for fusion.

7. The device of claim 6, wherein:
the device further comprises a voiceprint feature extraction module, configured to extract two or more voiceprint feature vectors for the user.

8. The device of claim 7, wherein:
the voiceprint feature extraction module extracting two or more voiceprint feature vectors for the user comprises:
collecting the user's voice signal with the same sensor or with different sensors, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.

9. The device of claim 7, wherein:
the fusion information calculation module calculating the average KL distance between any two voiceprint feature vectors of the user comprises:
obtaining the two voiceprint feature vectors, and calculating the mean and covariance of the distribution of each;
constructing the probability distributions of the two voiceprint feature vector spaces from the means and covariances of the two voiceprint feature vector distributions;
calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.

10. The device of claim 9, wherein:
the probability distributions of the two voiceprint feature vectors are Gaussian distributions.
CN201410408952.7A | 2014-08-19 | 2014-08-19 | Vocal print feature fusion method and device | Pending | CN104183240A (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201410408952.7A | CN104183240A (en) | 2014-08-19 | 2014-08-19 | Vocal print feature fusion method and device

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201410408952.7A | CN104183240A (en) | 2014-08-19 | 2014-08-19 | Vocal print feature fusion method and device

Publications (1)

Publication Number | Publication Date
CN104183240A (en) | 2014-12-03

Family

ID=51964230

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201410408952.7A | Pending | CN104183240A (en) | 2014-08-19 | 2014-08-19 | Vocal print feature fusion method and device

Country Status (1)

Country | Link
CN (1) | CN104183240A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105788600A (en) * | 2014-12-26 | 2016-07-20 | 联想(北京)有限公司 | Voiceprint identification method and electronic device
CN108231082A (en) * | 2017-12-29 | 2018-06-29 | 广州势必可赢网络科技有限公司 | Updating method and device for self-learning voiceprint recognition
CN108922556A (en) * | 2018-07-16 | 2018-11-30 | 百度在线网络技术(北京)有限公司 | Sound processing method, device and equipment
CN109801634A (en) * | 2019-01-31 | 2019-05-24 | 北京声智科技有限公司 | A kind of fusion method and device of vocal print feature
CN110489659A (en) * | 2019-07-18 | 2019-11-22 | 平安科技(深圳)有限公司 | Data matching method and device
CN111081221A (en) * | 2019-12-23 | 2020-04-28 | 合肥讯飞数码科技有限公司 | Training data selection method, device, electronic device and computer storage medium
CN115394303A (en) * | 2022-07-29 | 2022-11-25 | 深圳市声扬科技有限公司 | Training method, extraction method and electronic equipment of voiceprint extraction model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20030182119A1 (en) * | 2001-12-13 | 2003-09-25 | Junqua Jean-Claude | Speaker authentication system and method
CN1758263A (en) * | 2005-10-31 | 2006-04-12 | 浙江大学 | Multi-model ID recognition method based on scoring difference weight compromised

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20030182119A1 (en) * | 2001-12-13 | 2003-09-25 | Junqua Jean-Claude | Speaker authentication system and method
CN1758263A (en) * | 2005-10-31 | 2006-04-12 | 浙江大学 | Multi-model ID recognition method based on scoring difference weight compromised

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
DI LIU ETC: "Features selection for fusion of speaker verification via Maximum Kullback-Leibler Distance", 《ICSP2010 PROCEEDINGS》*
刘镝 等: "一种基于关系度量融合框架的说话人认证特征级融合算法", 《自动化学报》*
刘镝: "基于关系度量融合框架的特征级融合说话人认证算法", 《中国博士学位论文全文数据库》*
石峰 等: "《信息论基础》", 31 July 2002, 武汉大学出版社*

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105788600A (en) * | 2014-12-26 | 2016-07-20 | 联想(北京)有限公司 | Voiceprint identification method and electronic device
CN105788600B (en) * | 2014-12-26 | 2019-07-26 | 联想(北京)有限公司 | Method for recognizing sound-groove and electronic equipment
CN108231082A (en) * | 2017-12-29 | 2018-06-29 | 广州势必可赢网络科技有限公司 | Updating method and device for self-learning voiceprint recognition
CN108922556A (en) * | 2018-07-16 | 2018-11-30 | 百度在线网络技术(北京)有限公司 | Sound processing method, device and equipment
CN108922556B (en) * | 2018-07-16 | 2019-08-27 | 百度在线网络技术(北京)有限公司 | Sound processing method, device and equipment
CN109801634A (en) * | 2019-01-31 | 2019-05-24 | 北京声智科技有限公司 | A kind of fusion method and device of vocal print feature
WO2020155584A1 (en) * | 2019-01-31 | 2020-08-06 | 北京声智科技有限公司 | Method and device for fusing voiceprint features, voice recognition method and system, and storage medium
CN110489659A (en) * | 2019-07-18 | 2019-11-22 | 平安科技(深圳)有限公司 | Data matching method and device
CN111081221A (en) * | 2019-12-23 | 2020-04-28 | 合肥讯飞数码科技有限公司 | Training data selection method, device, electronic device and computer storage medium
CN111081221B (en) * | 2019-12-23 | 2022-10-14 | 合肥讯飞数码科技有限公司 | Training data selection method and device, electronic equipment and computer storage medium
CN115394303A (en) * | 2022-07-29 | 2022-11-25 | 深圳市声扬科技有限公司 | Training method, extraction method and electronic equipment of voiceprint extraction model
CN115394303B (en) * | 2022-07-29 | 2025-05-06 | 深圳市声扬科技有限公司 | Voiceprint extraction model training method, extraction method and electronic device

Similar Documents

Publication | Publication Date | Title
CN104183240A (en) | | Vocal print feature fusion method and device
US11423131B2 (en) | | Systems and methods for improving KBA identity authentication questions
TWI592820B (en) | | Man-machine recognition method and system
US10484426B2 (en) | | Auto-generated synthetic identities for simulating population dynamics to detect fraudulent activity
JP6726359B2 (en) | | Identity recognition method and device
US20150356316A1 (en) | | System, method and program for managing a repository of authenticated personal data
CN111883140A (en) | | Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition
CN106572097A (en) | | Mobile device-based mixed identity authentication method
CN112152961B (en) | | Malicious encrypted traffic identification method and device
CN109993527A (en) | | An architecture method of a personal credit information system based on blockchain
CN103973453A (en) | | Vocal print secret key generating method and device and logging-in method and system based on vocal print secret key
KR102079303B1 (en) | | Voice recognition otp authentication method using machine learning and system thereof
US10939291B1 (en) | | Systems and methods for photo recognition-based identity authentication
GB2519571A (en) | | Audiovisual associative authentication method and related system
CN115511596A (en) | | Credit investigation, verification, evaluation and management method and system for aid decision
CN115526425A (en) | | Financial data prediction system and method based on block chain and big data
CN109905388A (en) | | A method and system for processing domain name credit based on blockchain
CN106657164A (en) | | Composite identity recognition algorithm for real name authentication, and identity recognition system for real name authentication
Gehrmann et al. | | Metadata filtering for user-friendly centralized biometric authentication
CN118658484A (en) | | Audio recognition method and related device
CN108984773B (en) | | Method and system for verifying blacklist multidimensional information under data missing condition, readable storage medium and device
CN115296820B (en) | | Speech perception hash authentication method based on intelligent contract
CN115618311A (en) | | Identity recognition method and device, electronic equipment and storage medium
Chen et al. | | Similarity fusion scheme for cover song identification
JP2015022593A (en) | | Authentication system, client terminal, authentication server, terminal program, and server program

Legal Events

Date | Code | Title | Description
 | C06 | Publication | 
 | PB01 | Publication | 
 | C10 | Entry into substantive examination | 
 | SE01 | Entry into force of request for substantive examination | 
 | RJ01 | Rejection of invention patent application after publication | 
 | RJ01 | Rejection of invention patent application after publication | 

Application publication date: 2014-12-03

