EP1355298A2 - Code Excitation linear prediction encoder and decoder - Google Patents

Code Excitation linear prediction encoder and decoder

Publication number
EP1355298A2
Authority
EP
European Patent Office
Prior art keywords
excitation
signal
codebook
code
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP03013629A
Other languages
German (de)
French (fr)
Other versions
EP1355298A3 (en)
EP1355298B1 (en)
Inventor
Kenichiro Hosoda
Hiromi Aoyaki
Hiroshi Katsuragawa
Yoshihiro Ariyama
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oki Electric Industry Co Ltd
Original Assignee
Oki Electric Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oki Electric Industry Co Ltd
Priority claimed from PCT/JP1993/000776 (WO1994029965A1)
Priority claimed from EP93913500A (EP0654909A4)
Publication of EP1355298A2
Publication of EP1355298A3
Application granted
Publication of EP1355298B1
Anticipated expiration
Expired - Lifetime

Abstract

A code excitation linear predictive (CELP) coding or decoding apparatus is provided in which a code vector supplied by a stochastic codebook (108) is converted adaptively in accordance with vocal tract analysis information (LPC), so that high-quality reproduced speech is obtained at a low coding rate. Further, to obtain a similar effect, a pulse-like excitation codebook formed of isolated impulses is provided in addition to the adaptive excitation codebook (107) and the stochastic excitation codebook (108), so that either the stochastic excitation codebook or the pulse-like excitation codebook is selectively used, and a vocal tract parameter is provided as a line spectral pair (LSP) parameter.

Description

TECHNICAL FIELD OF THE INVENTION
This invention relates to an encoder and a decoder based on the code excitation linear predictive coding (CELP) system.
BACKGROUND OF THE INVENTION
Conventionally, code excitation linear predictive coding (CELP) and its modification, vector sum excitation linear predictive coding (VSELP), have been used as highly efficient coding systems for speech signals, including audible signals, in the field of digital mobile communication systems. A coding apparatus that uses CELP is disclosed in, for example, N.S. Jayant and J.H. Chen, "Speech Coding with Time-varying Bit Allocation to Excitation and LPC Parameters", Proc. ICASSP, pp. 65-68, 1989.
The fundamental construction of a speech coding system is to obtain vocal tract parameters representing vocal tract properties and excitation source parameters representing excitation source information. In recent CELP systems, the excitation signal serving as excitation source information is encoded by means of both adaptive excitation codevectors, which contribute to excitation signals of statistically strong periodicity, and stochastic excitation codevectors, which contribute to random excitation signals of statistically weak periodicity. These codevectors are stored in codebooks, and an optimum adaptive excitation codevector and an optimum stochastic excitation codevector are searched for in the respective codebooks so that the weighted error power sum between an input speech vector and a synthetic speech vector is minimized. Then, whether the system is a forward-type coding system, which obtains vocal tract parameters from the input speech vector, or a backward-type coding system, which obtains vocal tract parameters from synthetic speech vectors, at least the excitation source parameters, that is, the adaptive excitation code and stochastic excitation code information, are transmitted.
It is known that the CELP system described above yields high-quality regenerated speech signals at coding rates of 6 kbit/s to 8 kbit/s.
However, some communication systems require a lower coding rate, for example 4 kbit/s or less. At such a lower coding rate, regardless of whether the system is of the forward type, which transmits both vocal tract parameters and excitation source parameters, or of the backward type, which transmits excitation source parameters, fewer coded bits are assigned to the excitation source parameters, and so fewer adaptive excitation codevectors can be stored in the adaptive excitation codebook and fewer stochastic excitation codevectors in the stochastic excitation codebook. Consequently, the quality of the regenerated speech signal inevitably degrades at such a lower coding rate.
Besides, the adaptive excitation codebook is adaptively renewed by codevectors synthesized from the optimum adaptive excitation codevectors and stochastic excitation codevectors; accordingly, the adaptive excitation codevectors can be regarded as being formed on the basis of the stochastic excitation codevectors. Therefore, current CELP coding has poor tracking capability for a voice signal of strong periodicity, and consequently the generated speech signal lacks clearness.
A speech coding and decoding system that attempts to realise a higher compression of speech information is described in EP 476614. Here, a sparse adaptive codebook is used in association with a time-reversed perceptual weighting filter.
SUMMARY OF THE INVENTION
The present invention addresses the foregoing problems, and an object of the present invention is to provide a code excitation linear predictive coding encoder and decoder which can provide a high-quality regenerated speech signal even when pulse-like noise components are contained in the input speech vectors.
Another object of the present invention is to provide a code excitation linear predictive coding encoder and decoder which can provide a high-quality regenerated speech signal even when a lower coding rate is employed.
According to the present invention, there is provided a code excitation linear predictive coding apparatus which uses, as speech excitation source information, excitation signals in the form of an excitation codebook, wherein the apparatus is provided with a codevector conversion circuit which converts the frequency characteristics of fixed codevectors, such as stochastic excitation codevectors output from the excitation codebook, into predetermined frequency characteristics when the excitation codevectors are output. The primary reason for providing the codevector conversion circuit is as follows. Conventionally, the frequency characteristics of an excitation signal are modelled as theoretically "white"; in practice, however, examination shows that they are not white but are close to the frequency characteristics of the input speech vectors. Therefore, the nearer the frequency characteristics of the fixed codevectors are set to those of the input speech vectors, the higher the quality of the synthetic speech vector; moreover, the effective frequency components of the excitation codevectors become much larger than the quantization error vector, so that a masking effect on the quantization error vector is obtained. As information representing the frequency characteristics of the codevector conversion circuit, LPC (linear predictive coefficient) parameters and optimum adaptive excitation code information, which represents pitch predictive information (including VQ gains), are used. Thus, the codevector conversion circuit controls the frequency characteristics of the stochastic excitation codevectors and the like in accordance with this information.
Further, the present invention provides a code excitation linear predictive decoding apparatus having a codevector conversion circuit which brings the frequency characteristics of the fixed codevector close to those of the input speech vector, in correspondence with the respective code excitation linear predictive coding system.
In the codevector conversion circuit, the impulse response of a filter whose transfer function H(Z) is determined from the vocal tract parameters by the following formula (1),
$$H(Z) = \frac{1 - \sum_{j=1}^{p} A^{j} a_{j} Z^{-j}}{1 - \sum_{j=1}^{p} B^{j} a_{j} Z^{-j}} \qquad (1)$$
or the impulse response of a filter determined from the excitation pitch lag by the following formula (2),
$$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}} \qquad (2)$$
or the impulse response of the cascade connection of the filters represented by formulas (1) and (2), is convolved with the stochastic excitation codevectors, and thereafter the adaptive excitation codevectors are added to produce the excitation codevectors. Here, a_j (j = 1 to p) represents an LPC parameter and p represents the order of the LPC analysis. A, B and ε are constants determined in the ranges 0 < A < 1, 0 < B < 1 and 0 < ε ≤ 1, respectively, and L represents the pitch lag.
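As a reading aid, formulas (1) and (2) can be restated in the time domain. This is a sketch rather than part of the original disclosure: x[n] and y[n] are names assumed here for the samples of the input (stochastic) codevector and of the converted codevector, and samples at negative indices are taken to be zero.

```latex
% Formula (1): pole-zero filter built from the LPC parameters a_j
y[n] = x[n] - \sum_{j=1}^{p} A^{j} a_{j}\, x[n-j] + \sum_{j=1}^{p} B^{j} a_{j}\, y[n-j]

% Formula (2): one-tap recursion at the pitch lag L
y[n] = x[n] + \varepsilon\, y[n-L]
```

When A = B the numerator and denominator of formula (1) cancel and the filter passes the codevector unchanged, which suggests that the shaping comes from the difference between the two constants.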
Further, the present invention provides a code excitation linear predictive coding or decoding apparatus having, as excitation codebooks, an adaptive excitation codebook and a stochastic excitation codebook, in which a pulse-like excitation codebook storing pulse-like excitation codevectors consisting of isolated impulses is provided in addition to the adaptive and stochastic excitation codebooks, so that the CELP coding has good tracking capability for a speech signal of strong periodicity. Thus, a clear regenerated speech signal can be obtained.
Further, in the code excitation linear predictive coding apparatus, excitation codevectors from either the stochastic excitation codebook or the pulse-like excitation codebook are selectively used, and this selection information is transmitted to the code excitation linear predictive decoder. In the decoder, the excitation codevectors from the stochastic excitation codebook or the pulse-like excitation codebook are selected in accordance with the information transmitted from the coding apparatus.
In addition, in each of the above-described code excitation linear predictive encoders, the vocal tract parameters are output as LSP (line spectral pair) parameters, and these LSP parameters are used for speech regeneration in the code excitation linear predictive decoder, so that the regenerated speech quality at the lower coding rate is improved from the viewpoint of the vocal tract parameters. The reasons for using LSP parameters as the vocal tract parameters are that the interpolation characteristics with respect to the frequency characteristics of the vocal tract are improved, that LSP parameters cause less distortion of the vocal tract spectrum than LPC parameters even when they are coded with fewer bits, and that efficient coding can be obtained in combination with vector quantization.
BRIEF DESCRIPTION OF THE DRAWING
  • Fig. 1 is a block diagram of a code excitation linear predictive encoder (coding apparatus) according to first and second embodiments of the present invention.
  • Fig. 2 is a block diagram of a code excitation linear predictive decoder corresponding to the code excitation linear predictive encoder shown in Fig. 1.
  • Fig. 3 is a block diagram of a code excitation linear predictive encoder (coding apparatus) according to a third embodiment of the invention.
  • Fig. 4 is a block diagram of a code excitation linear predictive decoder corresponding to the code excitation linear predictive encoder shown in Fig. 3.
  • Fig. 5 is a detailed block diagram of a codevector conversion circuit shown in Figs. 3 and 4.
BEST MODE FOR CARRYING OUT THE INVENTION
Preferred embodiments of the code excitation linear predictive coding apparatus (encoder) and the code excitation linear predictive decoding apparatus (decoder) according to the present invention will be described with reference to the attached drawings.
Referring to Fig. 1, which shows a code excitation linear predictive encoder (coding apparatus) according to a first embodiment of the present invention, an input speech vector S, input frame by frame from an input terminal 101, is first transmitted to a vocal tract analysis circuit 102 to obtain a vocal tract parameter a_j (linear predictive coefficient).
An LPC (linear predictive coefficient) quantization circuit 103 quantizes the vocal tract predictive parameter a_j and transmits its code Ic (quantized LPC code) to an LPC inverse-quantization circuit 104 and a multiplex circuit 106.
The LPC inverse-quantization circuit 104 converts the LPC code Ic into a vocal tract predictive parameter a_qj and transmits it to a synthesis filter 105.
Then, an adaptive excitation codevector e_ai (i = 1 to n) is output from an adaptive excitation codebook 107 and, similarly, a stochastic excitation codevector e_sl (l = 1 to m) is output from a stochastic excitation codebook 108. Likewise, excitation gains β_k and γ_k (k = 1 to r) are output from a VQ gain codebook 110.
A codevector conversion circuit 109, which has the impulse response of the filter transfer function H(Z) represented by the following formula (3), convolves that impulse response with the stochastic excitation codevector e_sl from the stochastic excitation codebook 108 and outputs a converted stochastic excitation codevector e_scl.
[Formula (3): equation image (Figure 00100001) not reproduced in this text]
wherein a_qj represents the output of the LPC inverse-quantization circuit 104 and p represents the vocal tract analysis order.
The adaptive excitation codevector e_ai is multiplied by the gain β_k by means of a multiplier 113 to produce a vector e_aik, while the converted stochastic excitation codevector e_scl is multiplied by the gain γ_k by means of a multiplier 114 to produce a vector e_sclk.
An adder 115 adds the components of the vector e_aik and the vector e_sclk and produces an excitation codevector e.
The synthesis filter 105 calculates a synthetic speech vector S_w corresponding to the excitation codevector e and transmits it to a subtracter 116.
The subtracter 116 subtracts the synthesized speech vector S_w from the input speech vector S, and the resulting error vector e_r between S_w and S is transmitted to a perceptual weighting filter 111.
The perceptual weighting filter 111 transmits a perceptual weighting error vector e_w corresponding to the error vector e_r to a perceptual weighting error calculation circuit 112.
The perceptual weighting error calculation circuit 112 calculates the mean square value of the components of the perceptual weighting error vector e_w and determines the excitation codevector (i.e., the combination of i, l and k) that minimizes the mean square error power of e_w for the current input speech vector. The indexes Ia, Is and Ig of the respective codebooks at that point are transmitted to the adaptive excitation codebook 107, the stochastic excitation codebook 108, the VQ gain codebook 110 and the multiplex circuit 106.
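The closed-loop search of the first embodiment can be sketched as an exhaustive loop over the index triple (i, l, k). The sketch below makes several assumptions that are not in the original text: the function names are hypothetical, the stochastic codevectors are taken to have already passed through the codevector conversion circuit 109, the synthesis filter starts from a zero state, and the perceptual weighting filter 111 is omitted for brevity.

```python
from itertools import product


def synthesize(e, aq):
    """All-pole synthesis 1 / (1 - sum_j aq[j-1] Z^-j), zero initial state."""
    s = []
    for n in range(len(e)):
        acc = e[n]
        for j in range(1, len(aq) + 1):
            if n - j >= 0:
                acc += aq[j - 1] * s[n - j]
        s.append(acc)
    return s


def search_excitation(target, adaptive_cb, converted_stochastic_cb, gain_cb, aq):
    """Return (mean square error, i, l, k) of the best codebook combination."""
    best = None
    candidates = product(enumerate(adaptive_cb),
                         enumerate(converted_stochastic_cb),
                         enumerate(gain_cb))
    for (i, e_a), (l, e_sc), (k, (beta, gamma)) in candidates:
        # Excitation e = beta * e_a + gamma * e_sc (multipliers 113, 114, adder 115).
        e = [beta * u + gamma * v for u, v in zip(e_a, e_sc)]
        s_w = synthesize(e, aq)
        # Unweighted mean square error (assumption: no perceptual weighting).
        err = sum((t - s) ** 2 for t, s in zip(target, s_w)) / len(target)
        if best is None or err < best[0]:
            best = (err, i, l, k)
    return best
```

The cost of this joint search grows with the product n·m·r of the codebook sizes, which is one reason practical coders often search the codebooks one after another, as the third embodiment does.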
The adaptive excitation codebook 107 outputs an optimum adaptive excitation codevector e_a0 assigned by the index Ia, the stochastic excitation codebook 108 outputs an optimum stochastic excitation codevector e_s0 assigned by the index Is, and the VQ gain codebook 110 outputs optimum VQ gains β_0 and γ_0 assigned by the index Ig. The codevector conversion circuit 109 converts the stochastic codevector e_s0, which has been output from the stochastic excitation codebook in accordance with the index Is, into an optimum converted stochastic excitation codevector e_sc0 and then outputs it to the multiplier 114.
The optimum excitation codevector e_opt composed of e_a0, e_sc0, β_0 and γ_0 is transmitted to the adaptive excitation codebook 107 and updates the content of the adaptive excitation codebook 107.
The multiplex circuit 106 multiplexes Ic, Ia, Is and Ig into a total code C and transmits it to the receiver through an output terminal 117.
Fig. 2 is a block diagram of a code excitation linear predictive decoder corresponding to the code excitation linear predictive encoder.
In Fig. 2, the total code C from an input terminal 201 is separated by a demultiplex circuit 212 into the LPC code Ic, the adaptive excitation code index Ia, the stochastic excitation code index Is and the VQ gain code index Ig, which are transmitted, respectively, to an LPC inverse quantization circuit 202, an adaptive excitation codebook 204, a stochastic excitation codebook 205 and a VQ gain codebook 207.
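The multiplexing of Ic, Ia, Is and Ig into the total code C, and its separation in the decoder, can be illustrated with simple bit packing. The field widths below are hypothetical values chosen only for the example; the text does not state how many bits each index occupies.

```python
# Hypothetical field widths in bits; the source does not specify them.
FIELD_BITS = (("Ic", 10), ("Ia", 7), ("Is", 7), ("Ig", 6))


def multiplex(indices):
    """Pack the four indexes into one integer code C, Ic in the top bits."""
    code = 0
    for name, bits in FIELD_BITS:
        value = indices[name]
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
        code = (code << bits) | value
    return code


def demultiplex(code):
    """Split the total code C back into its four indexes."""
    indices = {}
    for name, bits in reversed(FIELD_BITS):
        indices[name] = code & ((1 << bits) - 1)
        code >>= bits
    return indices
```

With these assumed widths one subframe occupies 30 bits, and `demultiplex(multiplex(x))` returns `x` for any in-range set of indexes.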
The LPC inverse quantization circuit 202 converts the LPC code Ic into a vocal tract predictive parameter a_j and transmits it to a synthesis filter 203. The adaptive excitation codebook 204 outputs an adaptive excitation codevector e_a assigned by the index Ia, the stochastic excitation codebook 205 outputs a stochastic excitation codevector e_s assigned by the index Is, and the VQ gain codebook 207 outputs excitation gains β and γ assigned by the index Ig.
A codevector conversion circuit 206 converts the vector e_s into a vector e_sc and outputs it, in the same manner as in the aforementioned code excitation linear predictive coding apparatus (encoder).
The adaptive excitation codevector e_a is multiplied by the gain β by means of a multiplier 208, and the vector e_sc is multiplied by the gain γ by means of a multiplier 209. The multiplied vectors are added by an adder 210, and the final excitation codevector e for the synthesis filter is obtained.
The synthesis filter 203 calculates a synthesized speech vector S corresponding to the excitation codevector e and outputs it to an output terminal 211. At the same time, the content of the adaptive excitation codebook 204 is updated with the vector e.
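One decoder subframe (gain scaling, addition, synthesis filtering and the adaptive codebook update) might look as follows. This is a sketch under assumptions: the function name is hypothetical, the adaptive codebook is modelled as a fixed-length buffer of past excitation samples, and the synthesis filter memory is carried between subframes as a list holding the most recent output last.

```python
def decode_subframe(e_a, e_sc, beta, gamma, a, filt_mem, past_exc):
    """Return (speech, new filter memory, updated past-excitation buffer).

    a        : LPC parameters a_1..a_p of the synthesis filter 1/(1 - sum a_j Z^-j)
    filt_mem : last p synthesized samples, oldest first (len(filt_mem) == len(a))
    past_exc : non-empty buffer of past excitation samples (assumed model of
               the adaptive excitation codebook)
    """
    # Multipliers 208 and 209 followed by adder 210.
    e = [beta * u + gamma * v for u, v in zip(e_a, e_sc)]

    # Synthesis filter 203, continuing from the previous subframe's state.
    mem = list(filt_mem)
    speech = []
    for x in e:
        acc = x + sum(a[j] * mem[-1 - j] for j in range(len(a)))
        speech.append(acc)
        mem.append(acc)
        mem.pop(0)

    # Adaptive codebook update: append e and keep the buffer length fixed.
    new_past = (list(past_exc) + e)[-len(past_exc):]
    return speech, mem, new_past
```

Because the encoder performs the same update with its optimum excitation codevector, both sides keep identical adaptive codebook contents as long as the transmitted code arrives without error.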
The code excitation linear predictive encoder according to the second embodiment of the invention will be explained with reference to Fig. 1 again.
The encoder according to the second embodiment has the same construction as that of the first embodiment except for the codevector conversion circuit 109; therefore, only the operation of the codevector conversion circuit 109 will be explained here.
The codevector conversion circuit 109, which has the impulse response of the filter transfer function H(Z) shown by the following formula (4), convolves that impulse response with the vector e_sl to produce the vector e_scl.
$$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}} \qquad (4)$$
where ε ≤ 1.0, and L is the pitch lag obtained from the index of the adaptive excitation code.
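Formula (4) corresponds to the recursion y[n] = x[n] + ε·y[n−L], which adds a decaying copy of the codevector at every multiple of the pitch lag. A minimal sketch follows; the function name and the default value of ε are assumptions, and samples before the start of the subframe are taken as zero.

```python
def pitch_filter(x, lag, eps=0.5):
    """Apply H(Z) = 1 / (1 - eps * Z^-lag) to the codevector x."""
    if lag <= 0:
        raise ValueError("pitch lag must be a positive number of samples")
    y = []
    for n, value in enumerate(x):
        feedback = eps * y[n - lag] if n >= lag else 0.0
        y.append(value + feedback)
    return y
```

For a unit impulse and a lag of two samples, the output repeats the impulse every two samples with amplitudes 1, ε, ε², and so on, which is how the conversion imposes pitch periodicity on an otherwise noise-like stochastic codevector.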
Incidentally, in a shift-type adaptive excitation codebook, the index of the adaptive excitation code corresponds to the pitch lag as shown below.
[Figure 00150001: correspondence between the adaptive excitation code index and the pitch lag; image not reproduced in this text]
The convolution processing of the aforementioned code excitation linear predictive coding apparatus (encoder) is represented by the following formula (5), where e_sl is the stochastic excitation codevector output from the stochastic excitation codebook, e_scl is the stochastic excitation codevector after conversion, h is the impulse response of the conversion circuit, and * denotes convolution:
$$e_{scl} = e_{sl} * h \qquad (5)$$
wherein:
• e_scl = [x_0, x_1, ..., x_{n-1}], e_sl = [y_0, y_1, ..., y_{n-1}],
• h = [h_0, h_1, ..., h_{n-1}] (the brackets [ ] denote column vectors),
• x, y and h are elements, and n is the subframe length (or frame length).
For the impulse response of the codevector conversion circuit, a transfer function composed of the vocal tract parameters or a transfer function composed of the pitch lag can be used; alternatively, the two transfer functions can be cascaded to form the impulse response.
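Formula (5) can be read as a convolution truncated to the subframe length n, so that x_i = Σ_{k=0}^{i} h_k · y_{i−k}. The sketch below follows the element names used above (x for e_scl, y for e_sl); the function name is hypothetical and the truncation to n samples is an interpretation of the vector lengths given in the text.

```python
def convolve_truncated(e_sl, h):
    """Return e_scl = e_sl * h, keeping only the first len(e_sl) samples."""
    n = len(e_sl)
    e_scl = []
    for i in range(n):
        # Only taps h[0..i] can reach sample i, and h may be shorter than n.
        last_tap = min(i, len(h) - 1)
        e_scl.append(sum(h[k] * e_sl[i - k] for k in range(last_tap + 1)))
    return e_scl
```

Convolving with the first n samples of a filter's impulse response gives the same result, within the subframe, as running the recursive filter directly from a zero state, so either form can be used for the conversion.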
Fig. 3 is a block diagram of a code excitation linear predictive encoder according to the third embodiment of the invention. In Fig. 3, this encoder is primarily composed of an input speech processing portion 301, an optimum synthesized speech search portion 302 and a multiplex circuit 303.
The input speech processing portion 301 has an LSP parameter analysis circuit 311, an LSP parameter coding circuit 312, an LSP parameter decoding circuit 313, an LPC conversion circuit 314, a perceptual weighting filter 315, a synthesis filter zero input response generation circuit 316, a perceptual weighting filter zero input response generation circuit 317, and subtracters 318 and 319. When an input vector is given, it obtains the speech parameter to be transmitted to the decoder and the target speech vector for the synthesized speech vector formed by local reproduction.
In this encoder, a digitized discrete input speech vector series is stored for a time corresponding to the analysis frame length used to obtain a vocal tract parameter; this analysis frame is divided into several subframes, which are processed by the input speech processing portion 301.
The input speech vector is given to the LSP parameter analysis circuit 311, analyzed there, and converted to an LSP parameter as the vocal tract parameter. This LSP parameter is coded (for example, vector quantized) by the LSP parameter coding circuit 312, given to the multiplex circuit 303 and transmitted to the code excitation linear predictive decoder. The coded LSP parameter is decoded (for example, by inverse vector quantization) by the LSP parameter decoding circuit 313 and converted to LPC by the LPC conversion circuit 314. The LPC thus obtained is used as the tap coefficients of the perceptual weighting filter 315, the synthesis filter zero input response generation circuit 316, the perceptual weighting filter zero input response generation circuit 317 and a synthesis filter 329, which will be described below, and is also given to a codevector conversion circuit 328. In other words, it is the quantized LSP parameter that is converted into LPC.
Next, the operation for forming, from the input speech vector, the target speech vector for the locally reproduced synthesized speech vector will be explained.
The input speech vector described above is given to the perceptual weighting filter 315 and, after weighting in consideration of human perceptual characteristics, is given to the subtracter 318 as the minuend. Further, a zero input response vector of the synthesis filter 329 is given to the other input of the subtracter 318. Thus, a speech vector from which the influence of the synthesis filter 329 in the immediately preceding analysis frame is excluded is given to the subtracter 319. Further, a zero input response vector of the perceptual weighting filter 315 is given to the other input of the subtracter 319. Thus, a speech vector from which the influence of the perceptual weighting filter 315 in the immediately preceding analysis frame is excluded is given to a subtracter 330.
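The target-vector computation can be sketched as two subtractions of zero input responses, that is, of the output a filter still produces from its stored state when its input is zero. The function names are hypothetical, both filters are modelled here as simple all-pole recursions for illustration, and the exact filter structures in circuits 316 and 317 are not specified at this level in the text.

```python
def zero_input_response(a, memory, length):
    """Ringing of 1 / (1 - sum a_j Z^-j) from `memory` with zero input.

    memory holds the last len(a) filter outputs, oldest first.
    """
    mem = list(memory)
    out = []
    for _ in range(length):
        acc = sum(a[j] * mem[-1 - j] for j in range(len(a)))
        out.append(acc)
        mem.append(acc)
        mem.pop(0)
    return out


def target_vector(weighted_speech, zir_synthesis, zir_weighting):
    """Subtracters 318 and 319: remove both zero input responses in turn."""
    after_318 = [s - z for s, z in zip(weighted_speech, zir_synthesis)]
    return [s - z for s, z in zip(after_318, zir_weighting)]
```

Removing the ringing left over from the preceding frame lets the codebook search run every candidate through the synthesis filter from a zero state, since the contribution of past samples has already been accounted for in the target.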
The optimum synthesized speech search portion 302 searches for the excitation source parameters with which the locally reproduced synthesized speech vector is most similar to the target speech vector, and is composed of an adaptive excitation codebook 320, a stochastic excitation codebook 321, a pulse-like excitation codebook 322, a VQ gain codebook 323, VQ gain controllers 324 and 327, an adder 325, a fixed codebook selection switch 326, a codevector conversion circuit 328, a synthesis filter 329, a subtracter 330, an error power sum calculation circuit 331 and a code selection circuit 332.
The adaptive excitation codebook 320, the stochastic excitation codebook 321 and the pulse-like excitation codebook 322 store, respectively, adaptive excitation codevectors, stochastic excitation codevectors and pulse-like excitation codevectors, each of which is a waveform code relating to an excitation signal; the VQ gain codebook 323 stores VQ gain codes relating to the adaptive excitation codevector and the fixed codevector (a general term covering the stochastic excitation codevector and the pulse-like excitation codevector).
The adaptive excitation code vector contributes to voiced speech signals of statistically strong periodicity, while the stochastic excitation codevector contributes to unvoiced speech signals of statistically weak periodicity. The adaptive excitation codevectors of the adaptive excitation codebook 320 are adaptively updated as described below.
The pulse-like excitation codevector is a waveform excitation codevector consisting of a unit impulse and is considered to contribute to the steady portion of voiced speech signals of strong periodicity.
The VQ gain code is, for example, vector-quantized; one component of the vector relates to the VQ gain for the adaptive excitation code vector and the other component relates to the VQ gain for the fixed code vector.
The pulse-like excitation code vector is a simple periodic signal that could be generated by a pulse signal generating circuit; preferably, however, it is stored as codes and read out from the codebook 322, as in this encoder, for the following reasons. First, it is then easy to synchronize the excitation vector with the output of the adaptive excitation codebook 320. Second, by giving the pulse-like excitation codebook the same construction as the stochastic excitation codebook 321, the pulse-like excitation codevector search can use the same processing as the stochastic excitation codevector search.
By using these codebooks, optimum codes are obtained such that the locally synthesized speech vector is most similar to the target speech vector, and their indices are given to the multiplex circuit 303 and transmitted to the code excitation linear predictive decoder.
In the search for optimum codes, including the selection between the stochastic excitation code vector and the pulse-like excitation code vector described above, this encoder searches the adaptive excitation code, the stochastic excitation code, the pulse-like excitation code and the VQ gain code in that order.
When searching for the optimum adaptive excitation code vector, the outputs of the stochastic excitation codebook 321 and the pulse-like excitation codebook 322 are set to zero (0), and the VQ gain controller 324 applies a suitable VQ gain value ("1", for example). In this state, the adaptive excitation codebook 320 outputs all of the stored adaptive excitation code vectors, sequentially or in parallel, and gives each of them as an excitation code vector to the synthesis filter 329 through the VQ gain controller 324 and the adder 325. The synthesis filter 329 performs convolution on the excitation code vector, using as tap coefficients the LPC given from the LPC conversion circuit 314, and synthesized speech vectors, each synthesized with only an adaptive excitation code vector as the excitation source signal, are obtained for all the adaptive excitation code vectors.
The subtracter 330 obtains, for every adaptive excitation code vector, the error vector between the target speech vector and the synthesized speech vector that reflects only that adaptive excitation code vector, and gives it to the error power sum calculation circuit 331. The error power sum calculation circuit 331 obtains the sum of squares (error power sum) of the error vector for every adaptive excitation code vector and gives it to the code selection circuit 332. The code selection circuit 332 determines the adaptive excitation code vector that minimizes the error power sum.
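The adaptive codebook search can be sketched as a loop over candidate pitch lags with the gain fixed at 1. Two points are assumptions for illustration: the adaptive codevector for a lag is built by reading the last `lag` samples of the past excitation and repeating them periodically when the lag is shorter than the subframe, and `synth` is any callable standing in for the synthesis filter 329.

```python
def adaptive_codevector(past_exc, lag, n):
    """Build an n-sample codevector from the last `lag` past excitation samples."""
    if not 0 < lag <= len(past_exc):
        raise ValueError("lag must be between 1 and the buffer length")
    segment = past_exc[-lag:]
    return [segment[i % lag] for i in range(n)]


def search_adaptive(target, past_exc, lags, synth):
    """Return (best lag, error power sum) over the candidate lags."""
    best_lag, best_err = None, float("inf")
    for lag in lags:
        candidate = synth(adaptive_codevector(past_exc, lag, len(target)))
        # Error power sum, as computed by circuit 331.
        err = sum((t - c) ** 2 for t, c in zip(target, candidate))
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag, best_err
```

A call such as `search_adaptive(target, past_exc, range(20, 148), synth)` would scan an assumed lag range; the range itself is not given in the text.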
Next, the search for the optimum stochastic excitation code vector is carried out. In this search, the fixed codebook selection switch 326 is set to the side of the stochastic excitation codebook 321, and the output of the adaptive excitation codebook is set to zero (0) or to the previously obtained optimum adaptive excitation code vector. In this state, the stochastic excitation codebook 321 outputs all the stored stochastic excitation code vectors, sequentially or in parallel, and inputs them into the codevector conversion circuit 328 through the fixed codebook selection switch 326 and the VQ gain controller 324.
The codevector conversion circuit 328 converts the frequency characteristics of each input stochastic excitation code vector so that they approach the frequency characteristics of the input speech vector, in correspondence with the time length of the stochastic excitation code vector. Each stochastic excitation code vector whose frequency characteristics have been converted in this way is given, as an excitation code vector, to the synthesis filter 329. Thereafter, processing proceeds in the same way as in the search for the optimum adaptive excitation code vector, and the code selection circuit 332 determines the optimum stochastic excitation code vector.
After the search for the optimum stochastic excitation code vector is finished as described above, the search for the optimum pulse-like excitation code vector is carried out. In this search, the fixed codebook selection switch 326 is set to the side of the pulse-like excitation codebook 322, and the output of the adaptive excitation codebook 320 is set to zero (0) or to the previously obtained optimum adaptive excitation code vector. In this state, the pulse-like excitation codebook 322 outputs all the stored pulse-like excitation code vectors, sequentially or in parallel. The subsequent processing is substantially the same as in the search for the optimum stochastic excitation code vector, and a more detailed explanation is therefore omitted.
When the optimum pulse-like excitation code vector has been determined as described above, the code selection circuit 332 compares the error power sum of the code vector selected in the stochastic excitation code vector search with the error power sum of the code vector selected in the pulse-like excitation code vector search, takes the one with the smaller error power sum, and thereby determines the fixed code to be transmitted to the code excitation linear predictive decoder.
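The choice between the two fixed codebooks reduces to comparing the best error power sum found in each. In the sketch below the names are hypothetical, `synth` stands in for the whole path through the codevector conversion circuit 328 and the synthesis filter 329, and ties are resolved in favour of the stochastic codebook, which is an arbitrary assumption.

```python
def best_in_codebook(target, codebook, synth):
    """Return (index, error power sum) of the best codevector in one codebook."""
    errors = [sum((t - s) ** 2 for t, s in zip(target, synth(cv)))
              for cv in codebook]
    index = min(range(len(errors)), key=errors.__getitem__)
    return index, errors[index]


def select_fixed_code(target, stochastic_cb, pulse_cb, synth):
    """Return (selected codebook name, index, error power sum)."""
    s_index, s_err = best_in_codebook(target, stochastic_cb, synth)
    p_index, p_err = best_in_codebook(target, pulse_cb, synth)
    if p_err < s_err:
        return "pulse", p_index, p_err
    return "stochastic", s_index, s_err
```

The returned codebook name plays the role of the fixed codebook selection information that is multiplexed and sent to the decoder alongside the index.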
      Thereafter, a searching of an optimum VQ gaincodeis carried out. At the searching of this VQ gain code,an optimum (selected) adaptive excitation code vectoris transmitted from theadaptive excitation codebook320, and the fixedcodebook selection switch 326 isswitched to either the selectedstochastic excitationcodebook 321 or pulse-like excitation codebook 322, andan optimum (selected) fixed code vector is outputtedfrom the selectedfixed codebook 321 or 322. AVQ gaincodebook 323 is composed of VQ gain for an adaptive excitation code vector and VQ gain for the fixed codevector. The VQ gain for the adaptive excitation codevector is given to aVQ gain controller 324 and the VQgain for the fixed code vector is given to aVQ gaincontroller 327. Thus, both the VQ gain-controlledoptimum adaptive excitation code vector and the optimumfixed code vector, which have been processed withrespect to a frequency characteristic operation and VQgain control, are added by anadder 325 and then givento a synthesis filter as an excitation code vector.This processing is carried out sequentially or inparallel, relative to all the VQ gain codes in theVQgain codebook 323.
      After an optimum adaptive excitation code, an optimum fixed code and an optimum VQ gain code are selected, the code selection circuit 332 gives the indexes of these codes to a multiplex circuit 303, together with fixed codebook selection switching information indicating which of the stochastic excitation code vector and the pulse-like excitation code vector was actually selected. The multiplex circuit 303 multiplexes these indexes with the LSP parameter given from the LSP parameter coding circuit 312 and transmits the result to the code excitation linear predictive decoder. Incidentally, when vector quantization is used as the VQ gain coding method, the transmitted index is the vector number.
      The coding process described above is repeated for each subframe, and the coded speech information is transmitted in turn to the code excitation linear predictive decoder.
      Fig. 5 shows in detail the specific structure of the code vector conversion circuit 328. In Fig. 5, the code vector conversion circuit 328 has two cascaded filters 328a and 328b, and a pitch lag decision circuit 328c.
      The fixed code vector is given to a first filter 328a. An impulse response H1(Z) of the first filter 328a is set as shown by formula (6), by which frequency conversion processing is carried out on the fixed code vector.

      $$H_1(Z) = \frac{1 - \sum_{j=1}^{p} A^{j} a_j Z^{-j}}{1 - \sum_{j=1}^{p} B^{j} a_j Z^{-j}} \qquad (6)$$

      wherein $a_j$ (j is 1 to p) is a tap coefficient relative to a synthesis filter 329 which is supplied from the LPC conversion circuit 324, and p is the vocal tract analysis order. Further, A and B are constants which are determined in the ranges 0 < A ≤ 1 and 0 < B ≤ 1.
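      Applying formula (6) to a fixed code vector amounts to running a pole-zero difference equation, y[n] = x[n] - Σ A^j a_j x[n-j] + Σ B^j a_j y[n-j]. The sketch below is one way to write it, assuming zero filter state at the start of the vector; the function name `first_filter` is hypothetical.

```python
def first_filter(x, lpc, A, B):
    """Filter x through (1 - sum A^j a_j z^-j) / (1 - sum B^j a_j z^-j).

    lpc holds a_1..a_p. Zero initial state is an assumption of this sketch.
    """
    num = [(A ** j) * a for j, a in enumerate(lpc, start=1)]  # A^j * a_j
    den = [(B ** j) * a for j, a in enumerate(lpc, start=1)]  # B^j * a_j
    y = []
    for n in range(len(x)):
        acc = x[n]
        for j in range(1, len(lpc) + 1):
            if n - j >= 0:
                acc -= num[j - 1] * x[n - j]  # zero (numerator) part
                acc += den[j - 1] * y[n - j]  # pole (denominator) part
        y.append(acc)
    return y


print(first_filter([1.0, 0.0, 0.0], [0.5], 1.0, 0.5))  # [1.0, -0.25, -0.0625]
```

      When A equals B the numerator and denominator cancel and the filter passes the vector through unchanged (up to rounding), so the shaping effect comes from the difference between the two constants.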
      The code vector whose frequency characteristics have been processed by the first filter 328a is transmitted to the second filter 328b. The pitch lag decision circuit 328c obtains a pitch lag L from the index of the optimum adaptive excitation code relative to the adaptive excitation codebook 320 and then gives the pitch lag L to the second filter 328b. An impulse response H2(Z) of the second filter 328b is determined as shown by formula (7), by which a frequency conversion is carried out on the input fixed code vector.

      $$H_2(Z) = \frac{1}{1 - \varepsilon Z^{-L}} \qquad (7)$$

      wherein ε is a constant determined in the range 0 < ε ≤ 1. An output of the second filter 328b is given to the VQ gain controller 327 shown in Fig. 3.
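      Formula (7) corresponds to the recursion y[n] = x[n] + ε·y[n-L], which repeats the input at multiples of the pitch lag with decaying amplitude. A minimal sketch follows, assuming an integer lag and zero state before the start of the vector; `pitch_filter` is a hypothetical name.

```python
def pitch_filter(x, lag, eps):
    """Apply 1 / (1 - eps * z^-lag) to x with zero initial state (assumed)."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    y = []
    for n, v in enumerate(x):
        feedback = eps * y[n - lag] if n >= lag else 0.0
        y.append(v + feedback)
    return y


print(pitch_filter([1.0, 0, 0, 0, 0, 0, 0], 3, 0.5))
# [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25]
```

      A single input pulse therefore reappears every L samples, which is how the fixed code vector acquires the periodicity of the adaptive excitation.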
      By the code vector conversion circuit 328 as described above, the frequency characteristics of the input fixed code vector can be made closer to the frequency characteristics of the input speech vector, in accordance with a time length of the fixed code vector.
      Accordingly, the code excited linear predictive coding apparatus (encoder) can provide a high quality regenerated speech signal.
      Next, a code excitation linear predictive decoder corresponding to the code excitation linear predictive coding apparatus (encoder) shown in Fig. 3 will be described with reference to the accompanying drawing.
      Fig. 4 is a block diagram of a code excitation linear predictive decoder which corresponds to the code excitation linear predictive coding apparatus (encoder) shown in Fig. 3. In Fig. 4, the code excitation linear predictive decoder has a demultiplex circuit 440, LSP parameter decoding circuit 441, LPC conversion circuit 442, adaptive excitation codebook 443, stochastic excitation codebook 444, pulse-like excitation codebook 445, VQ gain codebook 446, VQ gain controller 447, VQ gain controller 449, fixed codebook selection switch 448, code vector conversion circuit 450, adder 451 and synthesis filter 452.
      The coded speech information given from the code excitation linear predictive encoder is input to the demultiplex circuit 440. The demultiplex circuit 440 separates the coded speech information into the LSP parameter code, the index of the optimum adaptive excitation code, the index of the optimum fixed code, the index of the optimum VQ gain code and the fixed code selection switch information.
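      The multiplexing on the encoder side and the separation described above can be illustrated with a simple bit-packing sketch. The field names and, in particular, the bit widths in `FIELDS` are hypothetical values chosen for illustration; this section does not specify a bit allocation.

```python
# Hypothetical field layout: (name, bit width). Not taken from the patent.
FIELDS = [("lsp", 10), ("adaptive", 7), ("fixed", 7), ("gain", 6), ("switch", 1)]


def multiplex(values):
    """Pack the per-subframe codes into one integer, first field in the top bits."""
    word = 0
    for name, width in FIELDS:
        v = values[name]
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | v
    return word


def demultiplex(word):
    """Recover the individual codes from a packed integer."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


codes = {"lsp": 513, "adaptive": 42, "fixed": 99, "gain": 17, "switch": 1}
assert demultiplex(multiplex(codes)) == codes
```

      The one-bit `switch` field stands in for the fixed code selection switch information that tells the decoder which fixed codebook the fixed code index refers to.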
      Then, the LSP parameter code is given to the LSP parameter decoding circuit 441 and the index of the optimum adaptive excitation code is given to the adaptive excitation codebook 443. Further, the index of the optimum VQ gain code is given to the VQ gain codebook 446 and the fixed codebook selection switch information is given to the fixed codebook selection switch 448.
      The index of the optimum fixed code is given to the pulse-like excitation codebook 445 or the stochastic excitation codebook 444, as determined by the fixed code selection switching information. The adaptive excitation codebook outputs an adaptive excitation code vector which is determined by the given index, and this adaptive excitation code vector is VQ gain-controlled through the VQ gain controller 447 and given to an adder 451. Further, the adaptive excitation codebook 443 gives the adaptive excitation code vector to a code vector conversion circuit 450.
      The stochastic excitation codebook 444 or pulse-like excitation codebook 445 gives a stochastic excitation code vector or pulse-like excitation code vector, which corresponds to the given index, to the code vector conversion circuit 450 through the fixed codebook selection switch 448.
      The code vector conversion circuit 450 operates so that the frequency characteristics become closer to the frequency characteristics of the input speech vector in accordance with the index of the LPC and adaptive excitation code vector. The specific structure of the code vector conversion circuit 450 is the same as that shown in Fig. 5. Thus, the frequency-processed fixed code vector is VQ gain-controlled by a VQ gain controller and then given to the adder 451.
      The adder 451 adds the given adaptive excitation code vector and the fixed code vector together, and the added vector is taken as the excitation code vector, which is then given to a synthesis filter 452. The synthesis filter 452 outputs a synthesized speech vector.
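      The decoder path just described (codebook selection, conversion, gain control, addition and synthesis) can be sketched for one subframe as below. The function `decode_subframe`, its parameter layout, the all-pole filter form 1/(1 - Σ a_j Z^-j) and the zero filter state at the start of the subframe are assumptions of this sketch; a practical decoder would carry filter state across subframes.

```python
def decode_subframe(params, stochastic_cb, pulse_cb, adaptive_vec, gain_cb, lpc,
                    convert=lambda v: v):
    """Rebuild one subframe. `convert` stands in for the code vector conversion circuit."""
    # The switch information selects which fixed codebook the index refers to.
    codebook = pulse_cb if params["switch"] else stochastic_cb
    fixed = convert(codebook[params["fixed"]])
    g_a, g_f = gain_cb[params["gain"]]
    # Adder: gain-controlled adaptive vector plus gain-controlled fixed vector.
    excitation = [g_a * a + g_f * f for a, f in zip(adaptive_vec, fixed)]
    # Synthesis filter (assumed all-pole form, zero initial state).
    speech = []
    for n, e in enumerate(excitation):
        acc = e
        for j, a in enumerate(lpc, start=1):
            if n - j >= 0:
                acc += a * speech[n - j]
        speech.append(acc)
    return excitation, speech


excitation, speech = decode_subframe(
    {"switch": 1, "fixed": 0, "gain": 0},
    stochastic_cb=[[9.0, 9.0, 9.0, 9.0]],
    pulse_cb=[[0.0, 1.0, 0.0, 0.0]],
    adaptive_vec=[1.0, 0.0, 0.0, 0.0],
    gain_cb=[(0.5, 2.0)],
    lpc=[0.5],
)
print(excitation)  # [0.5, 2.0, 0.0, 0.0]
print(speech)      # [0.5, 2.25, 1.125, 0.5625]
```

      The returned excitation is also what an adaptive codebook would be renewed from for the next subframe, although that update is left out here.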
      The code excitation linear predictive decoder conducts the above-described processes every time a decoded speech vector is given or, in other words, for each subframe.
      Important features of the present invention are that the LSP parameter is used and transmitted as a vocal tract parameter; a pulse-like excitation codebook is provided for giving an excitation source parameter; and a frequency characteristic of the fixed code vector is controlled. These features can be independently provided to each of the coding apparatus and decoding apparatus without loss of the advantages and effects thereof.
      In addition, the coding apparatus and decoding apparatus described above relate primarily to the forward-type code excitation linear predictive encoder and decoder, respectively, but the present invention is not limited thereto and is also applicable to backward-type code excitation linear predictive encoders and decoders.
      The above-described encoder and decoder were intentionally designed to solve the problems arising from low rate coding of 4-bit/s or less. However, more favorable sound reproduction can be realized if they are adapted to encoders and decoders for high rate coding. If a higher coding rate is allowable, the stochastic excitation codebook and the pulse-like excitation codebook can both be operated together effectively, rather than selectively operating either the stochastic excitation codebook or the pulse-like excitation codebook.
      INDUSTRIAL APPLICABILITY
      According to the present invention, it is considered that the frequency characteristic of the actual excitation code vector is relatively close to that of the input speech vector; in order to bring the frequency characteristic of the excitation code vector closer to that of the input speech vector, the stochastic excitation code vector is convolved with a specific impulse response. Thereafter, an adaptive excitation code vector is added to produce the excitation code vector. As a result, an excitation code vector well adapted to the input speech vector can be obtained with a small number of vectors and, at the same time, quantization error can be masked by the conversion operation on the excitation code vector, thereby improving reproduction quality.
      Further, in addition to the adaptive excitation codebook and stochastic excitation codebook, a pulse-like excitation codebook is provided which stores pulse-like excitation code vectors composed of unit impulses and, accordingly, rapid tracking of a speech signal having periodicity can be realized, and a clear pulse-like excitation code vector can be formed at a steady portion of the speech signal.
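      As a small illustration of a codebook made of unit impulses, the sketch below builds one vector per pulse position. The layout is an assumption: this section does not say how many impulses each vector contains or where they are placed, and `build_pulse_codebook` is a hypothetical name.

```python
def build_pulse_codebook(length):
    """Return `length` vectors, each with a single unit impulse at a different position.

    One impulse per vector is an assumption made for illustration only.
    """
    return [[1.0 if n == pos else 0.0 for n in range(length)]
            for pos in range(length)]


for vector in build_pulse_codebook(3):
    print(vector)
# [1.0, 0.0, 0.0]
# [0.0, 1.0, 0.0]
# [0.0, 0.0, 1.0]
```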
      Besides, since the pulse-like excitation code vector and the stochastic excitation code vector are switched over, the apparatus of the present invention can be adapted to low rate coding, and favorably reproduced speech can be realized at, for example, a transitional period of the speech in which random signals and pulse-like signals are present together.
      In addition, according to the code excitation linear coding apparatus and decoding apparatus, an excitation code vector is selected and used from either the stochastic excitation codebook or the pulse-like excitation codebook and, therefore, favorable reproduced speech can be realized even when the number of coded bits of the excitation source parameter is small.
      Further, the LSP parameter, which gives less distortion to the vocal tract vector than LPC when coded with a smaller number of code bits, is used as the vocal tract parameter for sound synthesis and, therefore, reproduction quality at a lower coding rate can be improved from a vocal tract parameter viewpoint.

      Claims (11)

      1. A speech coding apparatus comprising:
        an adaptively-renewable first codebook means (107) for selectively outputting a first signal;
        a first gain controller means (110, 113) for controlling a value of the first signal and outputting a second signal;
        a second codebook means (108) for selectively outputting a third signal;
        a signal conversion circuit means (109) for converting the third signal into a frequency characteristic and outputting a fourth signal;
        a second gain controller means (110, 114) for controlling a value of the fourth signal and outputting a fifth signal;
        an adder means (115) for adding the second signal and the fifth signal and thereby obtaining an excitation signal for use in speech synthesis; wherein
        the first codebook means (107) is adaptively renewed on the basis of the excitation signal; and
        the signal conversion circuit means (109) is arranged to generate an impulse response of a transfer function which is determined in accordance with pitch information relative to the first signal and to obtain the fourth signal by convolving the third signal with this impulse response.
      2. A speech decoding apparatus comprising:
        an adaptively-renewable first codebook means (204) for selectively outputting a first signal;
        a first gain controller means (207, 208) for controlling a value of the first signal and outputting a second signal;
        a second codebook means (205) for selectively outputting a third signal;
        a signal conversion circuit means (206) for converting the third signal into a frequency characteristic and outputting a fourth signal;
        a second gain controller means (207, 209) for controlling a value of the fourth signal and outputting a fifth signal;
        an adder means (210) for adding the second signal and the fifth signal and thereby obtaining an excitation signal for use in speech synthesis; wherein
        the first codebook means (204) is adaptively renewed on the basis of the excitation signal; and
        the signal conversion circuit means (206) is arranged to generate an impulse response of a transfer function which is determined in accordance with pitch information relative to the first signal and to obtain the fourth signal by convolving the third signal with this impulse response.
      3. A code excitation linear predictive coding apparatus which uses an excitation signal of an excitation codebook (108) as an excitation source information of a speech signal, the apparatus being characterised in that it comprises:-
        a code vector conversion circuit means (109) for converting an excitation code vector selected from the excitation codebook (108) into a frequency characteristic which is determined at the time of output of said excitation code vector, said frequency characteristic serving as the input of a synthesis filter (105).
      4. A code excitation linear predictive decoding apparatus which uses an excitation signal of an excitation codebook (205) as an excitation source information of a speech signal, the apparatus being characterised in that it comprises:-
        a code vector conversion circuit means (206) for converting an excitation code vector selected from the excitation codebook (205) into a frequency characteristic which is determined at the time of output of said excitation code vector, said frequency characteristic serving as the input of a synthesis filter (203).
      5. A coding or decoding apparatus according to claims 3 or 4, wherein the code vector conversion circuit means (109, 206) generates an impulse response of a transfer function which is determined in accordance with a vocal tract parameter of a speech signal input, and convolutionally computes the excitation code vector with the impulse response.
      6. A coding or decoding apparatus according to claims 1, 2 or 5, wherein the impulse response of the transfer function is represented by:-
        Figure 00350001
        where $a_j$ are linear predictive coefficients; p is a vocal tract analysis order; and A and B are within the range: 0 < A < 1 and 0 < B < 1.
      7. A coding or decoding apparatus according to claims 3 or 4, wherein the code vector conversion circuit means (109, 206) generates an impulse response of a transfer function which is determined in accordance with an excited pitch lag, and convolutionally computes the excitation code vector with the impulse response.
      8. A coding or decoding apparatus according to claims 1, 2 or 7, wherein the impulse response of the transfer function which is determined in accordance with the excited pitch lag is represented by:-
        $$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}}$$
        where ε is a constant within the range 0 < ε ≤ 1; and L is pitch lag signal.
      9. A coding or decoding apparatus according to claim 1 or 2, wherein the codebook vector conversion circuit means (109, 206) convolutionally computes the excitation code vector with the impulse response of the transfer function which is determined in accordance with transfer functions represented by:-
        Figure 00350002
        and
        $$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}}$$
        where $a_j$ are linear predictive coefficients; p is a vocal tract analysis order; A, B and ε are within the range: 0 < A < 1, 0 < B < 1 and 0 < ε ≤ 1; and L is pitch lag signal.
      10. A coding or decoding apparatus according to claim 3 or 4, wherein the impulse response of the transfer function is determined in accordance with transfer functions represented by:-
        Figure 00360001
        and
        $$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}}$$
        where $a_j$ are linear predictive coefficients; p is a vocal tract analysis order; A, B and ε are within the range: 0 < A < 1, 0 < B < 1 and 0 < ε ≤ 1; and L is pitch lag signal.
      11. A coding or decoding apparatus according to any one of claims 1 to 4 wherein said excitation code book or second codebook means is a pulse-like excitation code book (322).
      EP03013629A | 1993-06-10 | 1993-06-10 | Code Excitation linear prediction encoder and decoder | Expired - Lifetime | EP1355298B1 (en)

      Applications Claiming Priority (2)

      Application Number | Priority Date | Filing Date | Title
      PCT/JP1993/000776, WO1994029965A1 (en) | 1993-06-10 | 1993-06-10 | Code excitation linear prediction encoder and decoder
      EP93913500A, EP0654909A4 (en) | 1993-06-10 | 1993-06-10 | Code excitation linear prediction encoder and decoder.

      Related Parent Applications (1)

      Application Number | Title | Priority Date | Filing Date
      EP93913500A, Division, EP0654909A4 (en) | Code excitation linear prediction encoder and decoder. | 1993-06-10 | 1993-06-10

      Publications (3)

      Publication Number | Publication Date
      EP1355298A2 (en) | 2003-10-22
      EP1355298A3 (en) | 2004-02-04
      EP1355298B1 (en) | 2007-02-21

      Family

      ID=28459643

      Family Applications (1)

      Application Number | Title | Priority Date | Filing Date
      EP03013629A, Expired - Lifetime, EP1355298B1 (en) | Code Excitation linear prediction encoder and decoder | 1993-06-10 | 1993-06-10

      Country Status (1)

      Country | Link
      EP (1) | EP1355298B1 (en)

      Cited By (4)

      * Cited by examiner, † Cited by third party
      Publication number | Priority date | Publication date | Assignee | Title
      WO2009015944A1 (en)* | 2007-07-30 | 2009-02-05 | Global Ip Solutions (Gips) Ab | A low-delay audio coder
      RU2462769C2 (en)* | 2006-10-24 | 2012-09-27 | Войсэйдж Корпорейшн | Method and device to code transition frames in voice signals
      US8463615B2 (en) | 2007-07-30 | 2013-06-11 | Google Inc. | Low-delay audio coder
      CN111818519A (en)* | 2020-07-16 | 2020-10-23 | 郑州信大捷安信息技术股份有限公司 | End-to-end voice encryption and decryption method and system

      Family Cites Families (2)

      * Cited by examiner, † Cited by third party
      Publication number | Priority date | Publication date | Assignee | Title
      JPH0451199A (en)* | 1990-06-18 | 1992-02-19 | Fujitsu Ltd | Sound encoding/decoding system
      CA2051304C (en)* | 1990-09-18 | 1996-03-05 | Tomohiko Taniguchi | Speech coding and decoding system

      Cited By (7)

      * Cited by examiner, † Cited by third party
      Publication number | Priority date | Publication date | Assignee | Title
      RU2462769C2 (en)* | 2006-10-24 | 2012-09-27 | Войсэйдж Корпорейшн | Method and device to code transition frames in voice signals
      US8401843B2 (en) | 2006-10-24 | 2013-03-19 | Voiceage Corporation | Method and device for coding transition frames in speech signals
      WO2009015944A1 (en)* | 2007-07-30 | 2009-02-05 | Global Ip Solutions (Gips) Ab | A low-delay audio coder
      EP2023339A1 (en)* | 2007-07-30 | 2009-02-11 | Global IP Solutions (GIPS) AB | A low-delay audio coder
      US8463615B2 (en) | 2007-07-30 | 2013-06-11 | Google Inc. | Low-delay audio coder
      CN111818519A (en)* | 2020-07-16 | 2020-10-23 | 郑州信大捷安信息技术股份有限公司 | End-to-end voice encryption and decryption method and system
      CN111818519B (en)* | 2020-07-16 | 2022-02-11 | 郑州信大捷安信息技术股份有限公司 | End-to-end voice encryption and decryption method and system

      Also Published As

      Publication number | Publication date
      EP1355298A3 (en) | 2004-02-04
      EP1355298B1 (en) | 2007-02-21

      Similar Documents

      Publication | Title
      US5727122A (en) | Code excitation linear predictive (CELP) encoder and decoder and code excitation linear predictive coding method
      US8364473B2 (en) | Method and apparatus for receiving an encoded speech signal based on codebooks
      US5729655A (en) | Method and apparatus for speech compression using multi-mode code excited linear predictive coding
      US5142584A (en) | Speech coding/decoding method having an excitation signal
      US5778334A (en) | Speech coders with speech-mode dependent pitch lag code allocation patterns minimizing pitch predictive distortion
      US5140638A (en) | Speech coding system and a method of encoding speech
      EP1221694B1 (en) | Voice encoder/decoder
      US6023672A (en) | Speech coder
      KR20010024935A (en) | Speech coding
      EP1162604B1 (en) | High quality speech coder at low bit rates
      US5659659A (en) | Speech compressor using trellis encoding and linear prediction
      EP1005022B1 (en) | Speech encoding method and speech encoding system
      CA2090205C (en) | Speech coding system
      US5797119A (en) | Comb filter speech coding with preselected excitation code vectors
      EP1355298B1 (en) | Code Excitation linear prediction encoder and decoder
      US5884252A (en) | Method of and apparatus for coding speech signal
      EP0855699B1 (en) | Multipulse-excited speech coder/decoder
      US7076424B2 (en) | Speech coder/decoder
      WO1994029965A1 (en) | Code excitation linear prediction encoder and decoder

      Legal Events

      Code | Title | Description
      PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012
      17P | Request for examination filed | Effective date: 20030707
      AC | Divisional application: reference to earlier application | Ref document number: 0654909; Country of ref document: EP; Kind code of ref document: P
      AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): DE FR GB SE
      PUAL | Search report despatched | Free format text: ORIGINAL CODE: 0009013
      AK | Designated contracting states | Kind code of ref document: A3; Designated state(s): DE FR GB SE
      AKX | Designation fees paid | Designated state(s): DE FR GB SE
      17Q | First examination report despatched | Effective date: 20050502
      GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1
      GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3
      GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210
      AC | Divisional application: reference to earlier application | Ref document number: 0654909; Country of ref document: EP; Kind code of ref document: P
      AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): DE FR GB SE
      REG | Reference to a national code | Ref country code: GB; Ref legal event code: FG4D
      REF | Corresponds to: | Ref document number: 69334115; Country of ref document: DE; Date of ref document: 20070405; Kind code of ref document: P
      PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20070521
      ET | Fr: translation filed |
      PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261
      STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT
      26N | No opposition filed | Effective date: 20071122
      PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 20120607; Year of fee payment: 20
      PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 20120606; Year of fee payment: 20
      PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR; Payment date: 20120619; Year of fee payment: 20
      REG | Reference to a national code | Ref country code: DE; Ref legal event code: R071; Ref document number: 69334115; Country of ref document: DE
      REG | Reference to a national code | Ref country code: GB; Ref legal event code: PE20; Expiry date: 20130609
      PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION; Effective date: 20130609
      PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE; Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION; Effective date: 20130611

