Background art
With the continuous evolution of cloud computing technology, a large number of cloud platforms have emerged, such as Amazon's AWS (Amazon Web Services) and, in China, Alibaba Cloud and Wo Cloud. The powerful computing capacity of these cloud platforms is widely used across the national economy, for example by the 12306 train-ticket booking website and by Alibaba's Taobao platform. These cloud platforms store massive amounts of user data in their database clusters. The data volume of a cloud platform is very large, which substantially increases the management and maintenance workload of the cloud platform database administrator (Database Administrator, DBA for short). Moreover, the IDC data center that houses a cloud platform's databases is often physically distant from the administrators' office area. To make it more convenient to maintain and manage the cloud platform database, cloud platform DBAs often map the database management system (DBMS) onto the public network and log in to that address through a public IP to carry out management and operations work on the cloud platform database. However, this approach has the following defects:
Because the cloud platform database carries a large amount of data, the DBA needs to monitor the state of the database at all times. When the DBA is away from the office area, he or she cannot log in to the database management system in real time from a computer terminal in the office area, and therefore cannot maintain and control the database in real time.
To address the above two defects, a mobile client system deeply customized for cloud platform DBAs can be designed. To ensure that a DBA can log in securely from the remote client and to prevent account theft, the industry has designed a high-strength authentication and login scheme based on biometric recognition. Biometric recognition uses biological characteristics that are unique to a person, such as fingerprints, the face or the voice, to authenticate the real user, and is more secure than the traditional approach of entering a user name and password. Voiceprint recognition is a fairly ideal choice for a mobile client. Every mobile phone already has a voiceprint collector, the microphone (mic), so users do not need to replace their devices, which saves money. As for the problem of stolen biometric features: if an attacker has illicitly recorded a user's speech, the system can specify the content to be spoken at login authentication, which prevents the attacker from using the illicit recording to impersonate the user.
However, traditional single-mode voiceprint algorithms still have defects, mainly because relying on a single voiceprint feature extraction method degrades system performance. The feature vectors collected by a single feature extraction method cannot fully represent the characteristics of the original biometric sample and cannot fully reflect its discriminatory information (separability information), so the recognition accuracy of the system falls.
To overcome the above problems, the idea of information fusion has been introduced into the field of voiceprint feature recognition, giving rise to voiceprint recognition fusion technology. A fusion method, such as a feature-level fusion scheme based on voiceprint features, is used to integrate these features, and the discriminatory information obtained after fusion serves as the key feature for identifying a person, so that the system performs identity recognition better. The problem that follows is that the industry uses many different voiceprint features for voiceprint recognition, such as MFCC, residual phase, LPCC, MVDR and MLSF. To maximize the authentication accuracy of the DBA mobile client, and to prevent a genuine DBA user from being misjudged when logging in through the mobile client, it becomes a difficult problem, when two or more voiceprint features are available for fusion, to choose the two voiceprint features whose fusion yields the most discriminatory information and thereby maximizes the authentication accuracy of the final system algorithm.
Summary of the invention
The technical problem to be solved by the present invention is to provide a voiceprint feature fusion method and device that select, more accurately, two features carrying a large amount of discriminatory information for fusion, so as to perform identity recognition better and improve authentication accuracy.
To solve the above technical problem, the present invention provides a voiceprint feature fusion method, comprising:
Among a user's multiple kinds of voiceprint feature vectors, calculating the average KL distance between every two kinds of the user's voiceprint feature vectors; wherein the average KL distance of two kinds of voiceprint feature vectors is: the KL distance between the probability distributions of the first and the second voiceprint feature vector, added to the KL distance between the probability distributions of the second and the first voiceprint feature vector, with the sum divided by 2;
Selecting the two kinds of voiceprint features with the largest average KL distance for fusion.
Further, before the step of calculating the average KL distance between every two kinds of the user's voiceprint feature vectors, the method also comprises:
Extracting two or more kinds of voiceprint feature vectors for the user.
Further, extracting two or more kinds of voiceprint feature vectors for the user comprises:
Collecting the user's voice signal with a sensor, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.
Further, calculating the average KL distance between every two kinds of the user's voiceprint feature vectors comprises:
Obtaining any two kinds of voiceprint feature vectors, and calculating the mean and covariance of the distribution of each kind;
Constructing the probability distributions of the two voiceprint feature vector spaces from the means and covariances of the two kinds of voiceprint feature vectors;
Calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
Further, the probability distributions of the two kinds of voiceprint feature vectors are Gaussian distributions.
To solve the above technical problem, the present invention also provides a voiceprint feature fusion device, comprising:
A fusion information computing module, configured to calculate, among a user's multiple kinds of voiceprint feature vectors, the average KL distance between every two kinds of the user's voiceprint feature vectors; wherein the average KL distance of two kinds of voiceprint feature vectors is: the KL distance between the probability distributions of the first and the second voiceprint feature vector, added to the KL distance between the probability distributions of the second and the first voiceprint feature vector, with the sum divided by 2;
A voiceprint feature fusion module, configured to select the two kinds of voiceprint feature vectors with the largest average KL distance for fusion.
Further, the device also comprises: a voiceprint feature extraction module, configured to extract two or more kinds of voiceprint feature vectors for the user.
Further, the voiceprint feature extraction module being configured to extract two or more kinds of voiceprint feature vectors for the user comprises:
Collecting the user's voice signal with the same sensor or with different sensors, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.
Further, the fusion information computing module being configured to calculate the average KL distance between every two kinds of the user's voiceprint feature vectors comprises:
Obtaining any two kinds of voiceprint feature vectors, and calculating the mean and covariance of the distribution of each kind;
Constructing the probability distributions of the two voiceprint feature vector spaces from the means and covariances of the two kinds of voiceprint feature vectors;
Calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
Further, the probability distributions of the two kinds of voiceprint feature vectors are Gaussian distributions.
Compared with the prior art, the voiceprint feature fusion method and device provided by the embodiments of the present invention use the average KL distance between voiceprint feature vectors to represent the effective information content of fusing two features, so that the two voiceprint features with the smallest possible correlation (and hence the largest obtainable discriminatory information) are selected for feature-level fusion. The information content of the feature-level fusion algorithm is measured more accurately, and two features with large discriminatory information are selected more accurately for fusion. The discriminatory information after fusion serves as the key feature for identifying a person, so that the system performs identity recognition better and the accuracy of the authentication algorithm improves.
Embodiments
The average Kullback-Leibler distance is introduced first:
The traditional Kullback-Leibler distance, KL distance for short, is widely used to measure the "distance" (degree of similarity) between two probability distributions P(x) and Q(x). Its properties are as follows:
A. Non-negativity:
$D_{KL}(P\|Q) \ge 0$ (1)
B. Asymmetry (in general):
$D_{KL}(P\|Q) \ne D_{KL}(Q\|P)$ (2)
C. Identity:
$D_{KL}(P\|Q) = 0$ or $D_{KL}(Q\|P) = 0$
if and only if the probability distributions satisfy P(x) = Q(x) (3)
Here, $D_{KL}(Q\|P)$ denotes the KL distance between the probability distributions Q(x) and P(x), taken in that order, and $D_{KL}(P\|Q)$ denotes the KL distance between P(x) and Q(x), taken in that order. If the KL distance between the two probability distributions P(x) and Q(x) is large, the correlation between the two distributions is relatively small; otherwise, the correlation between them is large. For a conventional voiceprint fusion algorithm, the smaller the correlation between the distributions of two voiceprint feature spaces, the more information is obtained when they are fused, so these two features can be chosen to implement the user's voiceprint fusion authentication scheme. The KL distance between voiceprint feature vectors can therefore be used to represent the effective information content of fusing two features. However, because the KL distance is asymmetric, i.e. $D_{KL}(P\|Q) \ne D_{KL}(Q\|P)$, it cannot be used directly to measure the amount of fused information, nor directly for voiceprint feature selection.
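The asymmetry that motivates the next step can be seen on a small numerical case. The following is a minimal sketch for discrete distributions, assuming the usual definition $D_{KL}(P\|Q) = \sum_x P(x)\ln\frac{P(x)}{Q(x)}$; the function name `kl_distance` and the two example distributions are illustrative assumptions, not taken from the embodiment.

```python
import numpy as np


def kl_distance(p, q):
    """D_KL(P||Q) = sum_x P(x) * ln(P(x) / Q(x)) for discrete distributions.

    Assumes Q(x) > 0 wherever P(x) > 0.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with P(x) = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


# Two made-up distributions over three outcomes (illustrative values only).
P = [0.8, 0.1, 0.1]
Q = [0.4, 0.3, 0.3]

d_pq = kl_distance(P, Q)  # distance taken in the order P, Q
d_qp = kl_distance(Q, P)  # distance taken in the order Q, P
print(d_pq, d_qp)         # the two values differ, illustrating property B
```

Both values are non-negative, but they are not equal, so ranking feature pairs by one direction alone would depend on an arbitrary choice of order.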
To measure the information content of a feature-level fusion algorithm more accurately, and to select more accurately two features with large discriminatory information for fusion, this scheme proposes the concept of the Average Kullback-Leibler distance, average KL distance for short. This distance is symmetric, which overcomes the shortcoming that the traditional KL distance lacks symmetry and therefore biases the computed information content. The average KL distance $D_{aver\_KL}(P\|Q)$ between two probability distributions P(x) and Q(x) is:
$D_{aver\_KL}(P\|Q) = \frac{1}{2}\left[D_{KL}(P\|Q) + D_{KL}(Q\|P)\right]$ (4)
The properties of this distance include:
A. Non-negativity:
$D_{aver\_KL}(P\|Q) \ge 0$ (5)
B. Symmetry:
$D_{aver\_KL}(P\|Q) = D_{aver\_KL}(Q\|P)$ (6)
C. Identity:
$D_{aver\_KL}(P\|Q) = 0$ or $D_{aver\_KL}(Q\|P) = 0$
if and only if the probability distributions satisfy P(x) = Q(x) (7)
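The three properties can be checked numerically on a discrete example. This is a minimal sketch under the assumption that the average KL distance is half the sum of the two directed KL distances, as defined in the summary; `kl_distance`, `average_kl_distance` and the example distributions are hypothetical names and values chosen for illustration.

```python
import numpy as np


def kl_distance(p, q):
    """Directed KL distance D_KL(P||Q) for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # assumes Q(x) > 0 wherever P(x) > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def average_kl_distance(p, q):
    """Average KL distance: (D_KL(P||Q) + D_KL(Q||P)) / 2."""
    return 0.5 * (kl_distance(p, q) + kl_distance(q, p))


# Made-up distributions, for illustration only.
P = [0.8, 0.1, 0.1]
Q = [0.4, 0.3, 0.3]

d_forward = average_kl_distance(P, Q)
d_backward = average_kl_distance(Q, P)
d_self = average_kl_distance(P, P)
print(d_forward, d_backward, d_self)
```

Swapping the arguments leaves the value unchanged, and the distance of a distribution to itself is zero, which is what makes the quantity usable as a ranking score for feature pairs.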
As shown in Figure 1, this embodiment provides a method that performs voiceprint feature fusion based on the average KL distance between voiceprint features. The average KL distance of two feature vectors is calculated for feature-level voiceprint fusion, and the method specifically comprises the following steps:
S101: Among a user's multiple kinds of voiceprint feature vectors, calculate the average KL distance between every two kinds of the user's voiceprint feature vectors;
S102: Select the two kinds of voiceprint feature vectors with the largest average KL distance for fusion.
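Steps S101 and S102 amount to scoring every unordered pair of feature types and keeping the pair with the highest score. The sketch below shows only that selection logic; the function name `select_pair_for_fusion`, the feature labels and the distance values are made-up placeholders, and the distance itself is passed in as a callable so that any implementation of the average KL distance can be plugged in.

```python
from itertools import combinations


def select_pair_for_fusion(feature_names, avg_kl):
    """Score every unordered pair (S101) and return the best pair (S102).

    avg_kl is a callable taking two feature names and returning their
    average KL distance.
    """
    scores = {pair: avg_kl(*pair) for pair in combinations(feature_names, 2)}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical distances, invented purely to exercise the selection logic.
toy_distances = {
    frozenset({"MFCC", "LPCC"}): 0.9,
    frozenset({"MFCC", "residual_phase"}): 2.4,
    frozenset({"LPCC", "residual_phase"}): 1.7,
}

best_pair, all_scores = select_pair_for_fusion(
    ["MFCC", "LPCC", "residual_phase"],
    lambda a, b: toy_distances[frozenset({a, b})],
)
print(best_pair)
```

With k feature types there are k(k-1)/2 pairs to score, which stays small for the handful of voiceprint features typically considered.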
Before step S101, the method also comprises: extracting two or more kinds of voiceprint feature vectors for the user.
This specifically comprises:
Collecting the user's voice signal with the same sensor or with two different sensors, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.
Different feature extraction algorithms yield different voiceprint vectors, for example MFCC (Mel-frequency cepstral coefficients), residual phase, LPCC (linear prediction cepstral coefficients) and MLSF (Mel line spectral frequencies). In short, many feature extraction algorithms are available for extracting voiceprint features.
In addition, the user's voice signal may be collected by the same sensor or by different sensors. For example, a first signal is collected with an ordinary mobile phone mic and a second with a professional mic (for example, a microphone for professional acoustic signal capture). The MFCC (Mel-frequency cepstral coefficients) feature extraction algorithm and the LPCC (linear prediction cepstral coefficients) feature extraction algorithm, both commonly used in the industry, are then applied respectively to the signals collected by the two devices, yielding two kinds of voiceprint feature vectors.
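As one concrete illustration of a feature extraction algorithm, the following sketch computes LPCC-style coefficients for a single frame: linear prediction coefficients by the autocorrelation method, followed by the standard recursion from predictor coefficients to cepstral coefficients. It is an assumption-laden sketch rather than the extraction procedure of the embodiment: it omits pre-emphasis, windowing and framing, the function names `lpc_coefficients` and `lpc_to_lpcc` are hypothetical, and the test frame is a synthetic decaying exponential rather than speech.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC.

    Returns predictor coefficients a[0..order-1] such that
    s[n] is approximated by sum_k a[k-1] * s[n-k], k = 1..order.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r[0..order]
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])


def lpc_to_lpcc(a):
    """Cepstral coefficients c_1..c_p of the all-pole model 1 / (1 - sum_k a_k z^-k).

    Recursion: c_n = a_n + sum_{k=1}^{n-1} (k / n) * c_k * a_{n-k}.
    """
    p = len(a)
    c = np.zeros(p)
    for n in range(1, p + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c


# Synthetic frame: impulse response of a one-pole filter with pole at 0.5.
frame = 0.5 ** np.arange(64)
a_est = lpc_coefficients(frame, order=2)         # close to [0.5, 0.0]
c_est = lpc_to_lpcc(np.array([0.5, 0.0, 0.0]))   # 0.5**n / n for n = 1, 2, 3
print(a_est, c_est)
```

Applying such a routine to every frame of an utterance gives one feature vector per frame; a second algorithm, such as MFCC from an audio library, would give the second set of vectors for the same speaker.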
Step S101 specifically comprises:
S101a: Obtain any two kinds of voiceprint feature vectors, and calculate the mean and covariance of the distribution of each kind;
In one application example, for a given speaker P, two voiceprint features A and B are selected arbitrarily. Let A and B have $n_a$ and $n_b$ voiceprint feature vectors respectively, $X_a = \{x_{ai}\},\ i = 1, \dots, n_a$ and $X_b = \{x_{bi}\},\ i = 1, \dots, n_b$, with vector dimensions $f_a \times 1$ and $f_b \times 1$ respectively. The means of the two feature-vector distributions are respectively
$\mu_a = \frac{1}{n_a}\sum_{i=1}^{n_a} x_{ai}, \qquad \mu_b = \frac{1}{n_b}\sum_{i=1}^{n_b} x_{bi}$
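The covariances of the two feature-vector distributions, taken about the respective means $\mu_a$ and $\mu_b$, are
$\Sigma_a = \frac{1}{n_a}\sum_{i=1}^{n_a} (x_{ai} - \mu_a)(x_{ai} - \mu_a)^T, \qquad \Sigma_b = \frac{1}{n_b}\sum_{i=1}^{n_b} (x_{bi} - \mu_b)(x_{bi} - \mu_b)^T$
where the dimensions of $\Sigma_a$ and $\Sigma_b$ are $f_a \times f_a$ and $f_b \times f_b$ respectively.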
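Step S101a can be sketched in a few lines of NumPy. The sketch assumes each feature type is stored as an array with one feature vector per row, and it normalizes the covariance by the number of vectors n (the maximum-likelihood form); the choice of normalization, the function name `distribution_stats`, and the synthetic data standing in for real voiceprint features are all assumptions.

```python
import numpy as np


def distribution_stats(X):
    """Mean and covariance of a set of feature vectors.

    X has shape (n, f): n feature vectors of dimension f, one per row.
    Returns mu with shape (f,) and sigma with shape (f, f).
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    centered = X - mu
    sigma = centered.T @ centered / X.shape[0]  # 1/n normalisation (an assumption)
    return mu, sigma


# Synthetic stand-ins: n_a = 200 vectors with f_a = 13, n_b = 150 with f_b = 12.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(200, 13))
X_b = rng.normal(size=(150, 12))

mu_a, sigma_a = distribution_stats(X_a)
mu_b, sigma_b = distribution_stats(X_b)
print(mu_a.shape, sigma_a.shape, mu_b.shape, sigma_b.shape)
```

The covariance is only invertible when n is comfortably larger than f, which matters for any later step that needs the inverse or determinant of the covariance matrix.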
S101b: Construct the probability distributions of the two voiceprint feature vector spaces from the means and covariances of the two kinds of voiceprint feature vectors;
Preferably, the probability distributions of the two voiceprint feature vector spaces are modeled as Gaussian distributions, for two reasons: (1) the Gaussian distribution reflects real distributions in the natural world well; (2) with this model, the standard deviation at which the entropy reaches its extremum can be solved for, and an upper bound can be placed on the average KL distance.
Under the Gaussian model, the probability distributions corresponding to the two voiceprint feature vector spaces A and B are respectively:
$p_a(x) = \frac{1}{(2\pi)^{f_a/2}\,|\Sigma_a|^{1/2}} \exp\!\left(-\frac{1}{2}(x - \mu_a)^T \Sigma_a^{-1} (x - \mu_a)\right)$
$p_b(x) = \frac{1}{(2\pi)^{f_b/2}\,|\Sigma_b|^{1/2}} \exp\!\left(-\frac{1}{2}(x - \mu_b)^T \Sigma_b^{-1} (x - \mu_b)\right)$
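Evaluating the Gaussian model for a given feature vector can be sketched as follows. The sketch works with the logarithm of the multivariate normal density, which is numerically safer than the density itself in higher dimensions; the function name `gaussian_log_density` is a hypothetical choice, and the inputs used in the example are simple test values rather than voiceprint data.

```python
import numpy as np


def gaussian_log_density(x, mu, sigma):
    """Log of the multivariate Gaussian density N(x; mu, sigma).

    x and mu have shape (f,); sigma has shape (f, f) and must be
    positive definite.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    f = mu.shape[0]
    diff = x - mu
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Mahalanobis term (x - mu)^T sigma^{-1} (x - mu), without forming the inverse
    maha = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (f * np.log(2.0 * np.pi) + logdet + maha)


# Standard 2-D Gaussian evaluated at its mean: log density is -ln(2*pi).
log_p = gaussian_log_density([0.0, 0.0], [0.0, 0.0], np.eye(2))
print(log_p)
```

Using `slogdet` and `solve` avoids computing the determinant and the explicit inverse directly, both of which can underflow or lose precision for larger feature dimensions.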
S101c: Calculate the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
Again taking the Gaussian distribution as an example, the average KL distance of the two features is calculated from the probability distributions $p_a$ and $p_b$ of the two voiceprint feature vector spaces above:
$D_{aver\_KL}(p_a\|p_b) = \frac{1}{2}\left[D_{KL}(p_a\|p_b) + D_{KL}(p_b\|p_a)\right]$
From the above formula, the distance $D_{aver\_KL}(p_a\|p_b)$ between the two voiceprint features of user P can be calculated; this is the feature-level fusion information content. Feature selection is then carried out on this basis: the two voiceprint features with the smallest correlation (that is, the largest average KL distance) are selected for fusion, which yields better system authentication accuracy and secures login for the DBA mobile client.
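For Gaussian models the directed KL distance has a well-known closed form, $D_{KL}(p\|q) = \frac{1}{2}\left[\mathrm{tr}(\Sigma_q^{-1}\Sigma_p) + (\mu_q - \mu_p)^T \Sigma_q^{-1} (\mu_q - \mu_p) - f + \ln\frac{|\Sigma_q|}{|\Sigma_p|}\right]$, and the average KL distance follows by averaging the two directions. The sketch below implements this under one important assumption: both feature types have the same dimension f, since the closed form is only defined for distributions over the same space. How differing dimensions $f_a \ne f_b$ are handled is not spelled out in the text and is left outside this sketch. The function names are hypothetical.

```python
import numpy as np


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form D_KL(p||q) for p = N(mu_p, sigma_p), q = N(mu_q, sigma_q).

    Both Gaussians must have the same dimension f (an assumption of this sketch).
    """
    mu_p, mu_q = np.asarray(mu_p, dtype=float), np.asarray(mu_q, dtype=float)
    sigma_p, sigma_q = np.asarray(sigma_p, dtype=float), np.asarray(sigma_q, dtype=float)
    f = mu_p.shape[0]
    diff = mu_q - mu_p
    sigma_q_inv = np.linalg.inv(sigma_q)
    _, logdet_p = np.linalg.slogdet(sigma_p)
    _, logdet_q = np.linalg.slogdet(sigma_q)
    return 0.5 * (
        np.trace(sigma_q_inv @ sigma_p)
        + diff @ sigma_q_inv @ diff
        - f
        + logdet_q
        - logdet_p
    )


def gaussian_average_kl(mu_a, sigma_a, mu_b, sigma_b):
    """Average KL distance: half the sum of the two directed KL distances."""
    return 0.5 * (
        gaussian_kl(mu_a, sigma_a, mu_b, sigma_b)
        + gaussian_kl(mu_b, sigma_b, mu_a, sigma_a)
    )


# 1-D check: N(0, 1) against N(0, 4) gives (1/4) * (1/4 + 4 - 2) = 0.5625.
d_ab = gaussian_average_kl([0.0], [[1.0]], [0.0], [[4.0]])
d_ba = gaussian_average_kl([0.0], [[4.0]], [0.0], [[1.0]])
print(d_ab, d_ba)
```

Note that the log-determinant terms cancel when the two directions are averaged, so the average KL distance depends only on the trace and mean-difference terms.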
As shown in Figure 2, this embodiment provides a voiceprint feature fusion device, comprising:
A fusion information computing module, configured to calculate, among a user's multiple kinds of voiceprint feature vectors, the average KL distance between every two kinds of the user's voiceprint feature vectors;
A voiceprint feature fusion module, configured to select the two kinds of voiceprint feature vectors with the largest average KL distance for fusion.
The device also comprises:
A voiceprint feature extraction module, configured to extract two or more kinds of voiceprint feature vectors for the user.
The voiceprint feature extraction module being configured to extract two or more kinds of voiceprint feature vectors for the user comprises:
Collecting the user's voice signal with the same sensor or with different sensors, and applying two or more different voiceprint feature algorithms to the collected voice signal to extract different voiceprint feature vectors.
The fusion information computing module being configured to calculate the average KL distance between every two kinds of the user's voiceprint feature vectors comprises:
Obtaining any two kinds of voiceprint feature vectors, and calculating the mean and covariance of the distribution of each kind;
Constructing the probability distributions of the two voiceprint feature vector spaces from the means and covariances of the two kinds of voiceprint feature vectors;
Calculating the average KL distance between the two voiceprint features from the probability distributions of the two voiceprint feature vector spaces.
The probability distributions of the two voiceprint feature vector spaces are Gaussian distributions. The specific calculation is the same as that of the method shown in Figure 1 and is not repeated here.
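The division of the device into modules can be sketched as two small classes. Everything here is illustrative: the class names are hypothetical, the features are scalar (one value per frame) so that the 1-D Gaussian form of the average KL distance suffices, the data are deterministic synthetic arrays, and fusion is shown as simple concatenation of the two selected features per frame, which is a common form of feature-level fusion but is an assumption, since the text does not specify the fusion operation.

```python
from itertools import combinations

import numpy as np


def average_kl_1d(x, y):
    """Average KL distance between 1-D Gaussians fitted to samples x and y."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    return 0.25 * (vx / vy + vy / vx + (mx - my) ** 2 * (1 / vx + 1 / vy) - 2.0)


class FusionInformationModule:
    """Computes the average KL distance for every pair of feature types."""

    def __init__(self, distance):
        self.distance = distance

    def compute(self, features):
        return {
            (a, b): self.distance(features[a], features[b])
            for a, b in combinations(features, 2)
        }


class FeatureFusionModule:
    """Selects the pair with the largest distance and fuses it."""

    def fuse(self, features, distances):
        a, b = max(distances, key=distances.get)
        # Feature-level fusion by per-frame concatenation (an assumption).
        return (a, b), np.column_stack([features[a], features[b]])


# Deterministic synthetic features with 101 frames each (not real voiceprints).
base = np.linspace(-1.0, 1.0, 101)
features = {
    "feat_A": base,
    "feat_B": base + 0.5,
    "feat_C": 2.0 * base + 3.0,
}

distances = FusionInformationModule(average_kl_1d).compute(features)
chosen, fused = FeatureFusionModule().fuse(features, distances)
print(chosen, fused.shape)
```

Passing the distance function into the computing module keeps the selection logic independent of the particular distribution model, so a multivariate Gaussian version could be substituted without changing the fusion module.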
In addition, it should be noted that the method and device provided by this embodiment are not limited to DBA mobile clients; they are equally applicable to other terminals or systems that require security authentication. For example, an online banking login page or a PC authentication system can use the voiceprint feature fusion method and device provided in this embodiment.
As can be seen from the above embodiments, compared with the prior art, the voiceprint feature fusion method and device provided in the above embodiments use the average KL distance between voiceprint feature vectors to represent the effective information content of fusing two features, so that the two voiceprint features with the smallest possible correlation (and hence the largest obtainable discriminatory information) are selected for feature-level fusion. The information content of the feature-level fusion algorithm is measured more accurately, and two features with large discriminatory information are selected more accurately for fusion. The discriminatory information after fusion serves as the key feature for identifying a person, so that the system performs identity recognition better and the accuracy of the authentication algorithm improves.
A person of ordinary skill in the art will appreciate that all or some of the steps of the above method can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory (ROM), a magnetic disk or an optical disc. Alternatively, all or some of the steps of the above embodiments can be implemented with one or more integrated circuits. Accordingly, each module or unit in the above embodiments can be implemented in hardware or as a software functional module. The present invention is not limited to any particular combination of hardware and software.
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit its scope of protection. Various other embodiments are possible in accordance with the summary of the invention. Without departing from the spirit and essence of the present invention, a person of ordinary skill in the art may make various corresponding changes and variations according to the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.