CN102737645A - Algorithm for estimating pitch period of voice signal - Google Patents

Algorithm for estimating pitch period of voice signal

Info

Publication number
CN102737645A
Authority
CN
China
Prior art keywords
voice signal
pitch period
evaluation method
average magnitude
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012101969831A
Other languages
Chinese (zh)
Inventor
管晏
付斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Tianyu Information Industry Co Ltd
Original Assignee
Wuhan Tianyu Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Tianyu Information Industry Co Ltd
Priority to CN2012101969831A
Publication of CN102737645A
Legal status: Pending


Abstract

The invention discloses an algorithm for estimating the pitch period of a voice signal and relates to the field of voice signal processing. The algorithm comprises the following steps: 1, denoising a noisy voice signal with an adaptive filter; 2, computing the autocorrelation function and the circular average magnitude difference function of the denoised voice signal; and 3, obtaining a weighted-square feature through the formula $K(k) = [\alpha R(k)]^2 / [D(k) + \beta\gamma]^2$, where α, β and γ are constants greater than 1, R(k) is the autocorrelation function, and D(k) is the circular average magnitude difference function. The algorithm detects the pitch period effectively at low signal-to-noise ratios, reduces extraction errors as well as frequency-doubling and frequency-halving errors, improves pitch estimation accuracy when the amplitude or frequency of the voice signal changes markedly, and is robust.

Description

A pitch period estimation algorithm for voice signals
Technical field
The present invention relates to the field of voice signal processing, and in particular to a pitch period estimation algorithm for voice signals.
Background art
Speech signal analysis is the prerequisite and foundation of speech signal processing. Only once the parameters that characterize the essential features of a speech signal have been extracted can they be used for efficient speech synthesis, speech recognition, speech compression coding and similar processing. Among these parameters, the pitch period is one of the most important. The pitch period is the period of vocal cord vibration during voiced speech, and estimating it is called pitch detection. The goal of pitch detection is to extract a pitch period contour that matches the vocal cord vibration frequency exactly or as closely as possible, and it plays a key role in speech processing.
A speech signal can be regarded as a dynamic, non-stationary random process, and accurate pitch detection is difficult in practice for several reasons. Speech waveforms and vocal cord vibration cover a wide frequency range and are highly complex. The vocal tract is highly variable and its characteristics differ from person to person. The pitch range is very wide, and even the same speaker produces different pitch periods in different emotional states; the pitch period is also affected by the tone of the words spoken. In particular, the beginning and end of an utterance lack the periodicity of vocal cord vibration, so for some transition frames between voiced and unvoiced sounds it is hard to decide whether the frame is periodic or aperiodic. Even when the speech signal is quasi-periodic, its formant structure and noise sometimes affect the peaks and the zero-crossing rate, which makes it hard to locate the start and end of a pitch period accurately. Finally, the pitch varies over a wide range, from about 50 Hz for a low male voice to about 500 Hz for a high female or child's voice, a span of roughly three octaves.
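The 50 Hz to 500 Hz pitch range quoted above translates directly into the range of lags (in samples) that a time-domain pitch detector has to search. The following is a minimal sketch of that arithmetic; the 8 kHz sampling rate and the helper name `lag_search_range` are assumptions introduced here for illustration and do not come from the patent.

```python
import math

def lag_search_range(sample_rate_hz, f0_min_hz=50.0, f0_max_hz=500.0):
    """Return (min_lag, max_lag) in samples for a pitch range given in Hz.

    The highest pitch gives the shortest period (smallest lag) and the
    lowest pitch gives the longest period (largest lag).
    """
    min_lag = math.floor(sample_rate_hz / f0_max_hz)
    max_lag = math.ceil(sample_rate_hz / f0_min_hz)
    return min_lag, max_lag

# 8 kHz is an assumed sampling rate, not a value taken from the patent.
print(lag_search_range(8000))
```

At the assumed 8 kHz rate the search covers lags of 16 to 160 samples, so an analysis frame has to be longer than 160 samples to contain at least one full period of the lowest pitch.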
Among existing pitch detection methods, the ACF (autocorrelation function) method and the AMDF (average magnitude difference function) method are the most classical. The ACF method computes the autocorrelation function of the speech signal and estimates the pitch from the large peaks that the ACF curve shows at integer multiples of the pitch period; as the signal-to-noise ratio (SNR) falls, however, it often produces frequency-doubling or frequency-halving errors. The AMDF method computes the average magnitude difference function of the speech signal and estimates the pitch from the valleys that the AMDF curve shows at integer multiples of the pitch period; its detection accuracy drops markedly when the amplitude or frequency of the speech signal changes significantly.
Summary of the invention
In view of the defects of the prior art, the object of the present invention is to provide a pitch period estimation algorithm for voice signals that detects the pitch period effectively in low-SNR environments, reduces extraction errors, reduces frequency-doubling and frequency-halving errors, improves pitch estimation accuracy when the amplitude or frequency of the voice signal changes markedly, and is more robust.
To achieve this object, the invention adopts the following technical scheme: a pitch period estimation method for a voice signal, comprising the steps of: S1. passing the noisy voice signal through an adaptive filter for noise reduction; S2. computing the autocorrelation function and the circular average magnitude difference function of the denoised voice signal; S3. obtaining the weighted-square feature through the formula
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2}$$
where α, β and γ are constants greater than 1, R(k) is said autocorrelation function, and D(k) is said circular average magnitude difference function.
On the basis of the above technical scheme, said adaptive filter is a least mean square (LMS) adaptive filter.
On the basis of the above technical scheme, in S2 the circular average magnitude difference function is
$$D(k) = \sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right|, \quad k = 0, 1, \ldots, N-1$$
where N is the length of the speech analysis frame, Sω(n) is the windowed speech after noise reduction, and mod(n+k, N) denotes the remainder of n+k modulo N.
On the basis of the above technical scheme, when said circular average magnitude difference function is computed, every sample point in the current windowed speech frame is used exactly once, and the number of difference terms in the sum is the same for every k.
On the basis of the above technical scheme, the autocorrelation function of said Sω(n) is
$$R(k) = \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)$$
where N is the length of the speech analysis frame and k is the lag.
On the basis of the above technical scheme, said autocorrelation function R(k) shows peaks at integer multiples of the pitch period, and the pitch is estimated from the first peak other than R(0).
On the basis of the above technical scheme, said autocorrelation function shows a peak at the pitch period, and the average magnitude difference function shows a valley at the pitch period.
On the basis of the above technical scheme, in S3,
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2} = \frac{\left[\alpha \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)\right]^2}{\left[\sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right| + \beta\gamma\right]^2}$$
R(k) and K(k) have the same period, the waveforms of R(k) and D(k) become sharper after weighting and squaring, and dividing one by the other gives the weighted-square feature.
The beneficial effects of the present invention are as follows: the pitch period estimation algorithm effectively suppresses the influence of formants, detects the pitch period effectively in low-SNR environments, locates the pitch period more accurately, reduces extraction errors, and improves pitch estimation accuracy, while keeping the algorithmic complexity low.
Description of drawings
Fig. 1 is a flowchart of the pitch period estimation algorithm for voice signals according to the present invention;
Fig. 2 is a schematic diagram of the LMS adaptive filter used in an embodiment of the invention.
Embodiment
The present invention is described in further detail below with reference to the drawings and an embodiment.
As shown in Fig. 1, the pitch period estimation algorithm for voice signals according to the present invention comprises the following steps:
S1. Pass the noisy voice signal through an adaptive filter for noise reduction.
In this embodiment, the noisy voice signal is enhanced by an LMS (least mean square) adaptive filter in order to recover the original speech signal as cleanly as possible. The LMS adaptive filter is a class of adaptive system with feedback: it automatically adjusts the filter parameters at the current instant using the parameters obtained at the previous instant, so that the mean square error between the filter output and the desired signal is minimized and optimal filtering is achieved. Other adaptive filters can of course be used in other embodiments.
Fig. 2 shows the schematic diagram of the LMS adaptive filter used in this embodiment. X(n) denotes the input signal vector at time n, D(n) the desired signal, W(n) the weight vector of the adaptive filter, Y(n) the output signal, and E(n) the error signal. The input signal vector at time n is
$$X(n) = [x(n), x(n-1), \ldots, x(n-M+1)]^T$$
where M is the order of the adaptive filter.
The output signal is $Y(n) = X(n)^T W(n)$ and the error signal is $E(n) = D(n) - Y(n)$. The weight vector of the adaptive filter is updated iteratively as:
$$W(n+1) = W(n) + \mu E(n) X(n) \qquad \text{(Formula 1)}$$
In Formula 1, μ is the convergence factor of the adaptive filter: the weight vector at the next instant is obtained by adding to the current weight vector the input vector scaled by μ and by the error signal. The convergence factor μ is the step size, and the filtering result is very sensitive to it. The choice of μ determines the convergence speed of the algorithm: when μ is small, convergence is slow but the steady-state misadjustment is small; when μ is large, convergence is fast but the steady-state misadjustment is large. The convergence factor therefore has a decisive influence on the performance of the algorithm. The filter order M also directly affects the performance of the adaptive filter. When the mean square error E[E²(n)] reaches its minimum, the filter has automatically adjusted to the optimal weight vector W(n) for the external environment, so that Y(n) is the best approximation of D(n).
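The LMS recursion of Formula 1 can be sketched in a few lines of plain Python. This is an illustrative sketch, not the patent's implementation: the function name `lms_filter`, the zero initial weights, the zero padding before the first sample, and the toy signals are all assumptions. The patent does not say how the desired signal D(n) is obtained for noisy speech, so the example below only checks the recursion on a simple identification task where D(n) is known.

```python
import random

def lms_filter(x, d, order, mu):
    """Run the LMS recursion of Formula 1 over input x and desired signal d.

    Returns (y, e, w): output signal, error signal and final weight vector.
    """
    w = [0.0] * order                      # W(0): start from all-zero weights
    y_out, e_out = [], []
    for n in range(len(x)):
        # X(n) = [x(n), x(n-1), ..., x(n-M+1)]^T, zero-padded before n = 0
        x_vec = [x[n - i] if n - i >= 0 else 0.0 for i in range(order)]
        y = sum(wi * xi for wi, xi in zip(w, x_vec))        # Y(n) = X(n)^T W(n)
        e = d[n] - y                                        # E(n) = D(n) - Y(n)
        w = [wi + mu * e * xi for wi, xi in zip(w, x_vec)]  # Formula 1
        y_out.append(y)
        e_out.append(e)
    return y_out, e_out, w

# Toy check: the desired signal is the input passed through a known 2-tap FIR
# filter, so the adapted weights should converge towards those taps.
random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
true_taps = [0.5, -0.3]
d = [true_taps[0] * x[n] + (true_taps[1] * x[n - 1] if n >= 1 else 0.0)
     for n in range(len(x))]
y, e, w = lms_filter(x, d, order=2, mu=0.05)
print([round(wi, 3) for wi in w])
```

Raising `mu` in this sketch speeds up convergence and lowering it slows convergence, which matches the trade-off described for the convergence factor μ.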
S2. Compute the ACF and the CAMDF (circular average magnitude difference function) of the denoised voice signal.
The CAMDF is defined as:
$$D(k) = \sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right|, \quad k = 0, 1, \ldots, N-1 \qquad \text{(Formula 2)}$$
where Sω(n) is the windowed speech after noise reduction, N is the length of the speech analysis frame, and mod(n+k, N) denotes the remainder of n+k modulo N. D(0) = 0, and within its domain D(k) is symmetric about k = N/2, that is, D(k) = D(N-k). In addition, for a strictly periodic signal with minimum period T, the CAMDF has the following properties:
D(aT) ≤ D(aT+b), where 0 ≤ aT+b ≤ N/2, 0 < b < T, a = 0, 1, 2, ...;
k = aT is a local minimum of D(k), where 0 ≤ aT ≤ N/2, a = 0, 1, 2, ...;
D(aT) ≤ D(aT+T), where 0 ≤ aT < aT+T ≤ N/2, a = 0, 1, 2, ....
When the magnitude difference function D(k) is computed, every sample point in the current windowed speech frame is used exactly once, and the number of difference terms in the sum is the same for every k. The symmetry of the CAMDF and the fact that its valley values increase successively can also be used to overcome pitch-period doubling, which makes pitch detection considerably easier: the overall trend of the curve is level, so valleys are easier to detect; the estimated pitch period can be located in a single pass, which simplifies the detection procedure; and because the same sample points are used for every D(k), the magnitude difference function better reflects the differences between values of k.
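The CAMDF definition and two of its stated properties, D(0) = 0 and D(k) = D(N-k), can be checked numerically. The sketch below is illustrative only: the function name `camdf`, the synthetic two-harmonic test signal, the period of 20 samples, the frame length of 200 samples and the use of a rectangular window are assumptions, not values from the patent.

```python
import math

def camdf(frame):
    """Circular AMDF (Formula 2): D(k) for k = 0 .. N-1."""
    n_len = len(frame)
    return [
        sum(abs(frame[(n + k) % n_len] - frame[n]) for n in range(n_len))
        for k in range(n_len)
    ]

# Synthetic, strictly periodic frame: period T = 20 samples, N = 200 samples
# (an integer number of periods, rectangular window). These values are
# illustrative assumptions, not taken from the patent.
T, N = 20, 200
frame = [math.sin(2 * math.pi * n / T) + 0.5 * math.sin(4 * math.pi * n / T)
         for n in range(N)]
d = camdf(frame)

print(d[0])                                            # D(0)
print(max(abs(d[k] - d[N - k]) for k in range(1, N)))  # symmetry residual
print(d[T], d[T // 2])                                 # value at k = T and at k = T/2
```

Because the index wraps around modulo N, every D(k) is a sum of exactly N difference terms, which is the property the text contrasts with the shrinking sums of the ordinary AMDF. In this idealized frame the valley at k = T is essentially zero; with a tapering window or real speech it would only be a local minimum.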
The ACF expresses the degree of correlation between the values of a random signal at any two different instants, and the autocorrelation function of a periodic signal has the same period as the signal. The ACF R(k) of the denoised, windowed speech Sω(n) is:
$$R(k) = \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k) \qquad \text{(Formula 3)}$$
where N is the length of the speech analysis frame and k is the lag. The autocorrelation function R(k) of a speech signal shows peaks at integer multiples of the pitch period, and the pitch is usually estimated from the first peak other than R(0).
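A short sketch of Formula 3 and of peak-based pitch estimation follows. It is not the patent's procedure: picking the largest R(k) inside a bounded lag range is one common way to realize "the first peak other than R(0)", and the names `acf` and `pitch_lag_from_acf`, the lag bounds and the synthetic test frame are assumptions made for illustration.

```python
import math

def acf(frame):
    """Short-time autocorrelation (Formula 3): R(k) for k = 0 .. N-1."""
    n_len = len(frame)
    return [
        sum(frame[n] * frame[n + k] for n in range(n_len - k))
        for k in range(n_len)
    ]

def pitch_lag_from_acf(r, min_lag, max_lag):
    """Return the lag of the largest R(k) inside [min_lag, max_lag]."""
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])

# Synthetic periodic frame with period T = 20 samples (assumed values).
T, N = 20, 200
frame = [math.sin(2 * math.pi * n / T) + 0.5 * math.sin(4 * math.pi * n / T)
         for n in range(N)]
r = acf(frame)
lag = pitch_lag_from_acf(r, min_lag=5, max_lag=100)
print(lag)
```

Note that the sum in Formula 3 has only N-k terms, so the peaks at 2T, 3T and so on are progressively lower than the peak at T. Excluding very small lags through `min_lag` keeps the search away from the main lobe around R(0).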
S3. Obtain the weighted-square feature of the ACF/CAMDF through the formula
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2}$$
where α, β and γ are constants greater than 1, R(k) is said autocorrelation function, and D(k) is said circular average magnitude difference function, given by Formula 3 and Formula 2 respectively. Written out in full, the formula is:
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2} = \frac{\left[\alpha \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)\right]^2}{\left[\sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right| + \beta\gamma\right]^2} \qquad \text{(Formula 4)}$$
As step S2 shows, the ACF method looks for the position of the highest peak, whereas the CAMDF method looks for the position of the deepest valley: the ACF shows a peak at the pitch period and the CAMDF shows a valley there. The sharper the first peak of R(k), or the more pronounced the global valley of D(k), the more accurate the pitch period estimate. Formula 4 shows that R(k) and K(k) have the same period. Weighting and squaring R(k) sharpens its peaks, and weighting and squaring D(k) makes its valleys more pronounced, so the weighted-square feature obtained by dividing one by the other has especially prominent peaks at integer multiples of the pitch period. Because the pitch period is located from a peak, this weighted-square feature locates the pitch period more accurately.
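The combination in Formula 4 can be sketched as follows. This is an illustrative sketch under stated assumptions: the extracted formula does not make the exact form of the constant term written as βγ unambiguous, so the code uses a single positive parameter `offset` in its place, and the values `alpha=2.0` and `offset=4.0`, the function names, the lag bounds and the synthetic frame are all hypothetical.

```python
import math

def acf(frame):
    """Short-time autocorrelation (Formula 3)."""
    n_len = len(frame)
    return [sum(frame[n] * frame[n + k] for n in range(n_len - k))
            for k in range(n_len)]

def camdf(frame):
    """Circular AMDF (Formula 2)."""
    n_len = len(frame)
    return [sum(abs(frame[(n + k) % n_len] - frame[n]) for n in range(n_len))
            for k in range(n_len)]

def weighted_square_feature(frame, alpha=2.0, offset=4.0):
    """K(k) = [alpha * R(k)]^2 / [D(k) + offset]^2 for k = 0 .. N-1.

    `offset` is an assumed stand-in for the constant term of Formula 4.
    """
    r, d = acf(frame), camdf(frame)
    return [(alpha * rk) ** 2 / (dk + offset) ** 2 for rk, dk in zip(r, d)]

# Synthetic periodic frame with period T = 20 samples (assumed values).
T, N = 20, 200
frame = [math.sin(2 * math.pi * n / T) + 0.5 * math.sin(4 * math.pi * n / T)
         for n in range(N)]
k_feat = weighted_square_feature(frame)
lag = max(range(5, 101), key=lambda k: k_feat[k])   # highest peak of K(k)
print(lag)
print(k_feat[T - 1] / k_feat[T])                    # how fast K falls off next to the peak
```

The positive offset keeps the denominator away from zero where D(k) has a deep valley. One consequence of squaring is that the sign of R(k) is lost; in this sketch the lags where R(k) is strongly negative are still suppressed because D(k) is large there.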
The present invention is not limited to the above embodiment. Those of ordinary skill in the art can make improvements and refinements without departing from the principle of the invention, and such improvements and refinements are also regarded as falling within the scope of protection of the invention. Content not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (8)

1. A pitch period estimation method for a voice signal, characterized in that it comprises the following steps:
S1. passing the noisy voice signal through an adaptive filter for noise reduction;
S2. computing the autocorrelation function and the circular average magnitude difference function of the denoised voice signal;
S3. obtaining the weighted-square feature through the formula
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2}$$
where α, β and γ are constants greater than 1, R(k) is said autocorrelation function, and D(k) is said circular average magnitude difference function.
2. The pitch period estimation method for a voice signal according to claim 1, characterized in that said adaptive filter is a least mean square (LMS) adaptive filter.
3. The pitch period estimation method for a voice signal according to claim 1, characterized in that in S2 the circular average magnitude difference function is
$$D(k) = \sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right|, \quad k = 0, 1, \ldots, N-1$$
where N is the length of the speech analysis frame, Sω(n) is the windowed speech after noise reduction, and mod(n+k, N) denotes the remainder of n+k modulo N.
4. The pitch period estimation method for a voice signal according to claim 3, characterized in that when said circular average magnitude difference function is computed, every sample point in the current windowed speech frame is used exactly once, and the number of difference terms in the sum is the same for every k.
5. The pitch period estimation method for a voice signal according to claim 3, characterized in that the autocorrelation function of said Sω(n) is
$$R(k) = \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)$$
where N is the length of the speech analysis frame and k is the lag.
6. The pitch period estimation method for a voice signal according to claim 5, characterized in that said autocorrelation function R(k) shows peaks at integer multiples of the pitch period, and the pitch is estimated from the first peak other than R(0).
7. The pitch period estimation method for a voice signal according to claim 5, characterized in that said autocorrelation function shows a peak at the pitch period, and the average magnitude difference function shows a valley at the pitch period.
8. The pitch period estimation method for a voice signal according to claim 7, characterized in that in S3,
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2} = \frac{\left[\alpha \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)\right]^2}{\left[\sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right| + \beta\gamma\right]^2}$$
R(k) and K(k) have the same period, the waveforms of R(k) and D(k) become sharper after weighting and squaring, and dividing one by the other gives the weighted-square feature.
CN2012101969831A | 2012-06-15 | 2012-06-15 | Algorithm for estimating pitch period of voice signal | Pending | CN102737645A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN2012101969831A (CN102737645A (en)) | 2012-06-15 | 2012-06-15 | Algorithm for estimating pitch period of voice signal

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN2012101969831A (CN102737645A (en)) | 2012-06-15 | 2012-06-15 | Algorithm for estimating pitch period of voice signal

Publications (1)

Publication Number | Publication Date
CN102737645A (en) | 2012-10-17

Family

ID=46993014

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN2012101969831A (Pending, CN102737645A (en)) | Algorithm for estimating pitch period of voice signal | 2012-06-15 | 2012-06-15

Country Status (1)

Country | Link
CN (1) | CN102737645A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104217731A (en) * | 2014-08-28 | 2014-12-17 | 东南大学 | Quick solo music score recognizing method
CN105679331A (en) * | 2015-12-30 | 2016-06-15 | 广东工业大学 | Sound-breath signal separating and synthesizing method and system
CN106205638A (en) * | 2016-06-16 | 2016-12-07 | 清华大学 | A kind of double-deck fundamental tone feature extracting method towards audio event detection
CN108831504A (en) * | 2018-06-13 | 2018-11-16 | 西安蜂语信息科技有限公司 | Determination method, apparatus, computer equipment and the storage medium of pitch period
CN110390953A (en) * | 2019-07-25 | 2019-10-29 | 腾讯科技(深圳)有限公司 | It utters long and high-pitched sounds detection method, device, terminal and the storage medium of voice signal

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN1851806A (en) * | 2006-05-30 | 2006-10-25 | 北京中星微电子有限公司 | Adaptive microphone array system and its voice signal processing method
JP2008209547A (en) * | 2007-02-26 | 2008-09-11 | National Institute Of Advanced Industrial & Technology | Pitch estimation apparatus, pitch estimation method and program
CN101673550A (en) * | 2008-09-09 | 2010-03-17 | 联芯科技有限公司 | Spectral gain calculating method and device and noise suppression system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李晋: "改进的基音检测算法", 《计算机工程与应用》, vol. 03, no. 03, 19 January 2011 (2011-01-19)*

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104217731A (en) * | 2014-08-28 | 2014-12-17 | 东南大学 | Quick solo music score recognizing method
CN105679331A (en) * | 2015-12-30 | 2016-06-15 | 广东工业大学 | Sound-breath signal separating and synthesizing method and system
CN106205638A (en) * | 2016-06-16 | 2016-12-07 | 清华大学 | A kind of double-deck fundamental tone feature extracting method towards audio event detection
CN106205638B (en) * | 2016-06-16 | 2019-11-08 | 清华大学 | A two-layer pitch feature extraction method for audio event detection
CN108831504A (en) * | 2018-06-13 | 2018-11-16 | 西安蜂语信息科技有限公司 | Determination method, apparatus, computer equipment and the storage medium of pitch period
CN108831504B (en) * | 2018-06-13 | 2020-12-04 | 西安蜂语信息科技有限公司 | Method and device for determining pitch period, computer equipment and storage medium
CN110390953A (en) * | 2019-07-25 | 2019-10-29 | 腾讯科技(深圳)有限公司 | It utters long and high-pitched sounds detection method, device, terminal and the storage medium of voice signal
CN110390953B (en) * | 2019-07-25 | 2023-11-17 | 腾讯科技(深圳)有限公司 | Method, device, terminal and storage medium for detecting howling voice signal

Similar Documents

Publication | Title
Boersma | Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
CN103117067B (en) | Voice endpoint detection method under low signal-to-noise ratio
JP5992427B2 (en) | Method and apparatus for estimating a pattern related to pitch and / or fundamental frequency in a signal
US9454976B2 (en) | Efficient discrimination of voiced and unvoiced sounds
US20170287507A1 (en) | Pitch detection algorithm based on pwvt
KR20100049601A (en) | Cyclic signal processing method, cyclic signal conversion method, cyclic signal processing device, and cyclic signal analysis method
CN102737645A (en) | Algorithm for estimating pitch period of voice signal
CN101586997A (en) | Method for calculating guy cable vibrating base frequency
CN106991998A (en) | The detection method of sound end under noise circumstance
CN106328168A (en) | Voice signal similarity detection method
CN107371116A (en) | A kind of detection method of uttering long and high-pitched sounds based on interframe spectrum flatness deviation
KR20150014492A (en) | Method and apparatus for detecting correctness of pitch period
CN107564512A (en) | Voice activity detection method and device
CN110379438B (en) | Method and system for detecting and extracting fundamental frequency of voice signal
JP5325130B2 (en) | LPC analysis device, LPC analysis method, speech analysis / synthesis device, speech analysis / synthesis method, and program
CN103839544B (en) | Voice-activation detecting method and device
CN104036785A (en) | Speech signal processing method, speech signal processing device and speech signal analyzing system
Upadhya | Pitch detection in time and frequency domain
Wu et al. | Speech endpoint detection based on EMD and improved spectral subtraction
CN108830232B (en) | Voice signal period segmentation method based on multi-scale nonlinear energy operator
JP4760179B2 (en) | Voice feature amount calculation apparatus and program
Zhao et al. | A New Pitch Estimation Method Based on AMDF.
Upadhya et al. | Pitch estimation using autocorrelation method and AMDF
Hasan et al. | An efficient pitch estimation method using windowless and normalized autocorrelation functions in noisy environments
Guangyu et al. | Improving AMDF for pitch period detection

Legal Events

Code | Title | Description
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
C12 | Rejection of a patent application after its publication |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20121017

