EP1355298A2


Info

Publication number: EP1355298A2
Application number: EP03013629A
Authority: EP
Inventors: Kenichiro Hosoda; Hiromi Aoyaki; Hiroshi Katsuragawa; Yoshihiro Ariyama
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1993-06-10
Filing date: 1993-06-10
Publication date: 2003-10-22
Anticipated expiration: 2013-06-10
Also published as: EP1355298A3; EP1355298B1

Abstract

A code excitation linear predictive (CELP) coding or decoding apparatus is provided in which a code vector supplied by a stochastic codebook (108) is converted adaptively in accordance with vocal tract analysis information (LPC), so that high-quality reproduced speech is obtained at a low coding rate. Further, to obtain a similar effect, a pulse-like excitation codebook formed of isolated impulses is provided in addition to the adaptive excitation codebook (107) and the stochastic excitation codebook (108), so that either the stochastic excitation codebook or the pulse-like excitation codebook is selectively used, and the vocal tract parameter is provided as a linear spectrum pair parameter.

Description

TECHNICAL FIELD OF THE INVENTION

This invention relates to an encoder and a decoder based on the code excitation linear predictive coding (CELP) system.

BACKGROUND OF THE INVENTION

Conventionally, code excitation linear predictive coding and its modification, the vector sum excitation linear predictive coding system (VSELP), have been used as highly efficient coding systems for speech signals, including audible signals, in the field of digital mobile communication systems. A coding apparatus which uses code excitation linear predictive coding (CELP) is disclosed in, for example, N.S. Jayant and J.H. Chen, "Speech Coding with Time-varying Bit Allocation to Excitation and LPC Parameters", Proc. ICASSP, pp. 65-68, 1989.

The fundamental construction of a coding system for speech signals is to obtain vocal tract parameters representing vocal tract properties and excitation source parameters representing excitation source information. In recent CELP systems, the excitation signal serving as excitation source information is encoded by means of both adaptive excitation codevectors, which contribute to the statistically strongly periodic excitation signal, and stochastic excitation codevectors, which contribute to the statistically less periodic, random excitation signal. The coded excitation signals are stored in codebooks, and an optimum adaptive excitation codevector and an optimum stochastic excitation codevector are found in each codebook so that the weighted error power sum between the input speech vector and the synthetic speech vector becomes minimum. Then, whether the system is a forward-type coding system, which obtains vocal tract parameters from the input speech vector, or a backward-type coding system, which obtains vocal tract parameters from synthetic speech vectors, at least the excitation source parameters, that is, the adaptive excitation code and stochastic excitation code information, are transmitted.

By utilizing the code excitation linear predictive (CELP) system as described above, it is known that high-quality regenerated speech signals are obtained at a coding rate of 6 kbit/s to 8 kbit/s.

However, some communication systems require a lower coding rate, for example 4 kbit/s or less. At such a lower coding rate, regardless of whether the system is the forward type, which transmits both vocal tract parameters and excitation source parameters, or the backward type, which transmits excitation source parameters, the number of coded bits assigned to the excitation source parameters is smaller, and the number of adaptive excitation codevectors stored in the adaptive excitation codebook and the number of stochastic excitation codevectors stored in the stochastic excitation codebook become smaller. Consequently, the quality of the regenerated speech signal inevitably degrades at the lower coding rate described above.

Besides, the adaptive excitation codebook is adaptively renewed by the synthetic codevectors of the optimum adaptive excitation codevectors and stochastic excitation codevectors and, accordingly, the adaptive excitation codevectors can be regarded as being formed on the basis of the stochastic excitation codevectors. Therefore, current CELP coding has a poor tracking capability for a voice signal having strong periodicity. Consequently, the generated speech signal lacks clearness.

A speech coding and decoding system that attempts to realise a higher compression of speech information is described in EP 476614. Here, a sparse adaptive codebook is used in association with a time-reversed perceptual weighting filter.

SUMMARY OF THE INVENTION

The present invention addresses the foregoing problems, and an object of the present invention is to provide a code excitation linear predictive coding encoder and decoder which can provide a high-quality regenerated speech signal even when pulse-like noise components are contained in the input speech vectors.

Another object of the present invention is to provide a code excitation linear predictive coding encoder and decoder which can provide a high-quality regenerated speech signal even when a lower coding rate is employed.

According to the present invention, there is provided a code excitation linear predictive coding apparatus which uses, as speech excitation source information, excitation signals in the form of excitation codebooks, wherein the apparatus is provided with a codevector conversion circuit which converts the frequency characteristics of fixed codevectors, such as stochastic excitation codevectors output from the excitation codebook, into predetermined frequency characteristics at the time the excitation codevectors are output. The primary reason for providing the codevector conversion circuit is as follows. Conventionally, the frequency characteristics of an excitation signal are modelled as theoretically "white"; examination shows, however, that they are not actually white but are close to the frequency characteristics of the input speech vectors. Therefore, the closer the frequency characteristics of the fixed codevectors are set to those of the input speech vectors, the higher the quality of the synthetic speech vector obtained. Moreover, the effective frequency components of the excitation codevectors become much larger than the quantization error vectors, so that a masking effect on the quantization error vector can be obtained. As information representing the frequency characteristics of the codevector conversion circuit, LPC (linear predictive coefficient) parameters and optimum adaptive excitation code information, which represents pitch predictive information (including VQ gains), are used. Thus, the codevector conversion circuit controls the frequency characteristics of the stochastic excitation codevectors and the like in accordance with this information.

Further, in the present invention, there is provided a code excitation linear predictive decoding apparatus which has a codevector conversion circuit that brings the frequency characteristics of the fixed codevector close to those of the input speech vector, in correspondence with the respective code excitation linear predictive coding system.

In the codevector conversion circuit, an impulse response determined by the following formula (1) as the filter transfer function H(Z) according to the vocal tract parameters,

$$H(Z) = \frac{1 - \sum_{j=1}^{p} A^{j} a_{j} Z^{-j}}{1 - \sum_{j=1}^{p} B^{j} a_{j} Z^{-j}} \quad (1)$$

or an impulse response determined by the following formula (2) in accordance with the excitation pitch lag,

$$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}} \quad (2)$$

or the impulse response of a cascade connection of the filters represented by formulas (1) and (2), is used to perform a convolution on the stochastic excitation codevectors, and thereafter the adaptive excitation codevectors are added to produce the excitation codevectors. Here, a_j (j = 1 to p) represents an LPC parameter and p represents the order of LPC analysis. A, B and ε are constants determined in the ranges 0<A<1, 0<B<1 and 0<ε≤1, respectively, and L represents the pitch lag.

Further, the present invention provides a code excitation linear predictive coding or decoding apparatus which is provided, as excitation codebooks, with an adaptive excitation codebook and a stochastic excitation codebook, and in which a pulse-like excitation codebook storing pulse-like excitation codevectors consisting of isolated impulses is provided in addition to the adaptive excitation codebook and the stochastic excitation codebook, so that the CELP coding has a good tracking capability for a speech signal having strong periodicity. Thus, a clear regenerated speech signal can be obtained.

Further, in the code excitation linear predictive coding apparatus, excitation codevectors from the stochastic excitation codebook or the pulse-like excitation codebook are selectively used, and this selection information is transmitted to the code excitation linear predictive decoder apparatus. In the code excitation linear predictive decoder apparatus, the excitation codevectors from the stochastic excitation codebook or the pulse-like excitation codebook are selected in accordance with the information transmitted from the code excitation linear predictive coding apparatus.

In addition, in each of the above-described code excitation linear predictive encoders, the vocal tract parameters are output as LSP (linear spectral pair) parameters, and these LSP parameters are utilized for speech regeneration in the code excitation linear predictive decoder, so that the regenerated speech quality at the lower coding rate can be improved from the viewpoint of the vocal tract parameters. The reasons for using LSP parameters as the vocal tract parameters are that the interpolation characteristics relative to the frequency characteristics of the vocal tract are improved, that the LSP parameters introduce less distortion into the vocal tract spectrum than LPC parameters even when coded with a smaller number of code bits, and that effective coding can be obtained in combination with vector quantization.

BRIEF DESCRIPTION OF THE DRAWING

Fig. 1 is a block diagram of a code excitation linear predictive encoder (coding apparatus) according to first and second embodiments of the present invention.

Fig. 2 is a block diagram of a code excitation linear predictive decoder corresponding to the code excitation linear predictive encoder shown in Fig. 1.

Fig. 3 is a block diagram of a code excitation linear predictive encoder (coding apparatus) according to a third embodiment of the invention.

Fig. 4 is a block diagram of a code excitation linear predictive decoder corresponding to the code excitation linear predictive encoder shown in Fig. 3.

Fig. 5 is a detailed block diagram of a codevector conversion circuit shown in Figs. 3 and 4.

BEST MODE FOR CARRYING OUT THE INVENTION

Preferred embodiments of the code excitation linear predictive coding apparatus (encoder) and the code excitation linear predictive decoding apparatus (decoder) according to the present invention will be described with reference to the attached drawings.

Referring to Fig. 1, which shows a code excitation linear predictive encoder (coding apparatus) according to a first embodiment of the present invention, an input speech vector S, which is inputted frame by frame from an input terminal 101, is first transmitted to a vocal tract analysis circuit 102 to obtain a vocal tract parameter aj (linear predictive coefficient).

An LPC (linear predictive coefficient) quantization circuit 103 quantizes the vocal tract predictive parameter aj and transmits its code Ic (quantized LPC code) to an LPC inverse-quantization circuit 104 and a multiplex circuit 106.

The LPC inverse-quantization circuit 104 converts the LPC code Ic into a vocal tract predictive parameter aqj and transmits it to a synthesis filter 105.

Then, an adaptive excitation codevector e ai (i=1 to n) is outputted from an adaptive excitation codebook 107 and, similarly, a stochastic excitation codevector e sl (l=1 to m) is outputted from a stochastic excitation codebook 108. Likewise, excitation gains βk and γk (k=1 to r) are outputted from a VQ gain codebook 110.

A codevector conversion circuit 109, which has the impulse response of the filter transfer function H(Z) represented by the following formula (3), performs a convolution with the stochastic excitation codevector e sl from the stochastic excitation codebook 108, and outputs a converted stochastic excitation codevector e scl.

wherein aqj represents an output of the LPC inverse-quantization circuit 104 and p represents the vocal tract analysis order.
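Formula (3) itself does not appear in this text. The following is a minimal sketch only, assuming that formula (3) has the same form as formula (1) of the Summary with the quantized coefficients aqj substituted for aj, and that the filter starts from a zero state. The function name and the constants A and B passed to it are illustrative, not taken from the specification.

```python
def conversion_impulse_response(aq, A, B, n):
    """First n samples of the impulse response of
    H(Z) = (1 - sum_j A^j aq_j Z^-j) / (1 - sum_j B^j aq_j Z^-j),
    assumed form of formula (3); aq holds aq_1..aq_p."""
    p = len(aq)
    # Numerator taps: 1, -A^1 aq_1, ..., -A^p aq_p
    num = [1.0] + [-(A ** j) * aq[j - 1] for j in range(1, p + 1)]
    # Feedback taps of the denominator: B^j aq_j
    fb = [(B ** j) * aq[j - 1] for j in range(1, p + 1)]
    h = []
    for k in range(n):
        acc = num[k] if k <= p else 0.0      # response to the unit impulse
        for j in range(1, min(k, p) + 1):    # recursive (all-pole) part
            acc += fb[j - 1] * h[k - j]
        h.append(acc)
    return h
```

When A equals B the numerator and denominator cancel and the response reduces to a unit impulse, which is a convenient sanity check for the sign conventions assumed here.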

The adaptive excitation codevector e ai is multiplied by the gain βk by means of a multiplier 113 to produce a vector e aik and, on the other hand, the converted stochastic excitation codevector e scl is multiplied by the gain γk by means of a multiplier 114 to produce a vector e sclk.

An adder 115 adds the components of vector e aik and vector e sclk and produces an excitation codevector e.
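The gain scaling and addition performed by multipliers 113 and 114 and adder 115 can be sketched as follows. The function name and the list representation of the vectors are assumptions made for illustration.

```python
def build_excitation(e_a, e_sc, beta, gamma):
    """Excitation codevector e = beta * e_a + gamma * e_sc, element by element."""
    if len(e_a) != len(e_sc):
        raise ValueError("codevectors must have the same length")
    return [beta * x + gamma * y for x, y in zip(e_a, e_sc)]
```

For example, `build_excitation([1.0, 2.0], [0.5, -1.0], 2.0, 4.0)` returns `[4.0, 0.0]`.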

The synthesis filter 105 calculates a synthetic speech vector Sw corresponding to the excitation codevector e and transmits it to a subtracter 116.

The subtracter 116 performs the subtraction between the synthesized speech vector Sw and the input speech vector S, and the resulting error vector er is transmitted to a perceptual weighting filter 111.

The perceptual weighting filter 111 transmits a perceptually weighted error vector ew corresponding to the error vector er to a perceptual weighting error calculation circuit 112.

The perceptual weighting error calculation circuit 112 calculates the mean square value of the components of the perceptually weighted error vector ew, and determines the excitation codevector (i.e., the combination of i, l and k) that minimizes the mean square error power of ew for the input speech vector at the present time. The indexes Ia, Is and Ig of each codebook at this moment are transmitted to the adaptive excitation codebook 107, the stochastic excitation codebook 108, the VQ gain codebook 110 and the multiplex circuit 106.
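The search over the combination of i, l and k can be illustrated with a simplified sketch. It assumes an exhaustive joint search, an all-pole synthesis filter 1/(1 - Σ aj Z^-j) with zero initial state, and, for brevity, an unweighted error (the perceptual weighting filter 111 is omitted). All names are hypothetical.

```python
from itertools import product

def synthesize(e, a):
    """All-pole synthesis: s[k] = e[k] + sum_j a[j-1] * s[k-j], zero initial state."""
    s = []
    for k, x in enumerate(e):
        acc = x
        for j, aj in enumerate(a, start=1):
            if k - j >= 0:
                acc += aj * s[k - j]
        s.append(acc)
    return s

def search_codebooks(target, adaptive_cb, converted_cb, gain_cb, a):
    """Return (error, i, l, k) minimizing the mean square error against target."""
    best = None
    for (i, e_a), (l, e_sc), (k, (beta, gamma)) in product(
            enumerate(adaptive_cb), enumerate(converted_cb), enumerate(gain_cb)):
        e = [beta * x + gamma * y for x, y in zip(e_a, e_sc)]
        s = synthesize(e, a)
        err = sum((t - v) ** 2 for t, v in zip(target, s)) / len(target)
        if best is None or err < best[0]:   # keep the first minimum found
            best = (err, i, l, k)
    return best
```

The returned indexes play the role of Ia, Is and Ig; practical coders usually replace the joint search with a sequential one to limit computation.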

The adaptive excitation codebook 107 outputs an optimum adaptive excitation codevector ea0 assigned by index Ia, the stochastic excitation codebook 108 outputs an optimum stochastic excitation codevector es0 assigned by index Is, and the VQ gain codebook 110 outputs optimum VQ gains β₀ and γ₀ assigned by index Ig. The codevector conversion circuit 109 converts the stochastic codevector es0, which has been transmitted from the stochastic excitation codebook in accordance with the index Is, into an optimum converted stochastic excitation codevector e sc0 and then outputs it to the multiplier 114.

The optimum excitation codevector e opt composed of ea₀, esc₀, β₀ and γ₀ is transmitted to the adaptive excitation codebook 107 and updates the content of the adaptive excitation codebook 107.
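The specification does not detail how the adaptive excitation codebook is updated. One common arrangement, shown here purely as an assumption, is a shift-type codebook that keeps a fixed-length buffer of past excitation samples and reads a codevector starting a given lag back from the end of the buffer. Function names and the periodic repetition for short lags are illustrative.

```python
def update_adaptive_buffer(history, e_opt):
    """Shift the newest excitation into the buffer, keeping its length fixed."""
    n = len(e_opt)
    if n > len(history):
        raise ValueError("excitation longer than the history buffer")
    return history[n:] + list(e_opt)

def adaptive_codevector(history, lag, n):
    """Codevector of length n read 'lag' samples back; repeats when lag < n."""
    if not 1 <= lag <= len(history):
        raise ValueError("lag outside the buffer")
    start = len(history) - lag
    return [history[start + (k % lag)] for k in range(n)]
```

With a buffer `[1, 2, 3, 4, 5, 6]` and a new excitation `[7, 8]`, the updated buffer is `[3, 4, 5, 6, 7, 8]`, and a lag of 2 then yields the codevector `[7, 8, 7, 8]` for a length of 4.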

The multiplex circuit 106 multiplexes Ic, Ia, Is and Ig into a total code C and transmits it to the receiver through an output terminal 117.

Fig. 2 is a block diagram of a code excitation linear predictive decoder corresponding to the code excitation linear predictive encoder.

In Fig. 2, the total code C from an input terminal 201 is separated by a demultiplex circuit 212 into the LPC code Ic, the adaptive excitation code index Ia, the stochastic excitation code index Is and the VQ gain code index Ig, which are transmitted, respectively, to an LPC inverse-quantization circuit 202, an adaptive excitation codebook 204, a stochastic excitation codebook 205 and a VQ gain codebook 207.

The LPC inverse-quantization circuit 202 converts the LPC code Ic into the vocal tract predictive parameter aj and transmits it to a synthesis filter 203. The adaptive excitation codebook 204 outputs an adaptive excitation codevector ea assigned by the index Ia, the stochastic excitation codebook 205 outputs a stochastic excitation codevector es assigned by the index Is, and the VQ gain codebook 207 outputs excitation gains β and γ assigned by the index Ig.

A codevector conversion circuit 206 converts the vector es into a vector e sc and outputs it in the same manner as in the aforementioned code excitation linear predictive coding apparatus (encoder).

The adaptive excitation codevector ea is multiplied by the gain β by means of a multiplier 208, and the vector e sc is multiplied by the gain γ by means of a multiplier 209. These multiplied vector components are added by an adder 210, and the final excitation codevector e for the synthesis filter is obtained.

The synthesis filter 203 calculates a synthesized speech vector S corresponding to the excitation codevector e and outputs it to an output terminal 211. At the same time, the content of the adaptive excitation codebook 204 is updated by the vector e.

The code excitation linear predictive encoder according to the second embodiment of the invention will be explained with reference to Fig. 1 again.

The code excitation linear predictive encoder according to the second embodiment has the same construction as that of the first embodiment except for the codevector conversion circuit 109 and, therefore, only the operation of the codevector conversion circuit 109 will be explained here.

The codevector conversion circuit 109, which has the impulse response of the filter transfer function H(Z) shown by the following formula (4), performs a convolution with the vector e sl and produces the vector e scl.

$$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}} \quad (4)$$

where ε is a constant satisfying ε ≤ 1.0, and L is the pitch lag obtained from the index of the adaptive excitation code.
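In the time domain, formula (4) corresponds to the recursion y[k] = x[k] + ε·y[k-L]. A minimal sketch, assuming a zero initial filter state and an integer pitch lag (the function name is illustrative):

```python
def pitch_filter(x, eps, lag):
    """Apply H(Z) = 1 / (1 - eps * Z^-lag) to x, starting from a zero state."""
    y = []
    for k, v in enumerate(x):
        feedback = eps * y[k - lag] if k >= lag else 0.0
        y.append(v + feedback)
    return y
```

Applied to a unit impulse with ε = 0.5 and L = 3, the output is a decaying pulse train with non-zero samples 1, 0.5 and 0.25 at positions 0, 3 and 6, which shows how the filter imposes the pitch periodicity on the stochastic codevector.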

Incidentally, in a shift-type adaptive excitation codebook, the index of the adaptive excitation code corresponds to the pitch lag.

The convolutional processing of the aforementioned code excitation linear predictive coding apparatus (encoder) is represented by the following formula (5), provided that e sl is the output stochastic excitation codevector of the stochastic excitation codebook, e scl is the stochastic excitation codevector after the conversion, and h is the impulse response of the conversion circuit.

$$e_{scl} = e_{sl} * h \quad (5)$$

wherein:

$$e_{scl} = [x_0, x_1, \ldots, x_{n-1}], \quad e_{sl} = [y_0, y_1, \ldots, y_{n-1}], \quad h = [h_0, h_1, \ldots, h_{n-1}]$$

(the bracket [ ] denotes a column vector), x, y and h are elements, * denotes convolution, and n is the subframe length (or frame length).
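Formula (5) can be read as a convolution truncated to the subframe length, so that each output element is x_k = Σ_{i=0..k} y_i·h_{k-i}. The truncation to n samples is an assumption based on the vector lengths given above; the function name is illustrative.

```python
def convolve_truncated(e_sl, h):
    """x[k] = sum_{i=0..k} e_sl[i] * h[k-i], for k = 0 .. len(e_sl)-1."""
    n = len(e_sl)
    return [
        sum(e_sl[i] * h[k - i] for i in range(k + 1) if k - i < len(h))
        for k in range(n)
    ]
```

For example, `convolve_truncated([1, 2, 3], [1, 0.5, 0.25])` returns `[1, 2.5, 4.25]`.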

A transfer function composed of the vocal tract parameters, or a transfer function composed of the pitch lag, can be used for the impulse response of the codevector conversion circuit; alternatively, the two transfer functions can be cascaded to form the impulse response.

Fig. 3 is a block diagram of a code excitation linear predictive encoder according to the third embodiment of the invention. In Fig. 3, this code excitation linear predictive encoder is primarily composed of an input speech processing portion 301, an optimum synthesized speech search portion 302 and a multiplex circuit 303.

The input speech processing portion 301 has an LSP parameter analysis circuit 311, an LSP parameter coding circuit 312, an LSP parameter decoding circuit 313, an LPC conversion circuit 314, a perceptual weighting filter 315, a synthesis filter zero-input response generation circuit 316, a perceptual weighting filter zero-input response generation circuit 317, and subtracters 318 and 319. When an input vector is given, a speech parameter to be transmitted to the decoder is obtained, and a target speech vector is formed for the synthesized speech vector produced by local reproduction.

In the code excitation linear predictive encoder, a digitized discrete input speech vector series is stored for a time corresponding to the analysis frame length used to obtain the vocal tract parameters, and this analysis frame is divided into several subframes and processed by the input speech processing portion 301.

The input speech vector is given to the LSP parameter analysis circuit 311, analyzed there, and converted to LSP parameters as vocal tract parameters. The LSP parameters are coded (for example, vector-quantized) by the LSP parameter coding circuit 312, given to the multiplex circuit 303 and transmitted to the code excitation linear predictive decoder. The coded LSP parameters are decoded (for example, inverse vector-quantized) by the LSP parameter decoding circuit 313 and converted to LPC by the LPC conversion circuit 314. The LPC thus obtained is used as tap coefficients for the perceptual weighting filter 315, the synthesis filter zero-input response generation circuit 316, the perceptual weighting filter zero-input response generation circuit 317 and a synthesis filter 329, which will be described later, and is also given to a codevector conversion circuit 328. That is, it is the quantized LSP parameters that are converted into LPC.

Next, the operation for forming, from the input speech vector, a target speech vector for the locally reproduced synthesized speech vector will be explained.

The input speech vector described above is given to the perceptual weighting filter 315 and, after weighting in consideration of human perceptual characteristics, is given to a subtracter 318. Further, a zero-input response vector relating to the synthesis filter 329 is given to the other input of the subtracter 318. Thus, a speech vector from which the influence of the synthesis filter 329 in the immediately preceding analysis frame is excluded is given to the subtracter 319. Further, a zero-input response vector relating to the perceptual weighting filter 315 is given to the other input of the subtracter 319. Thus, a speech vector from which the influence of the perceptual weighting filter 315 in the immediately preceding analysis frame is excluded is given to a subtracter 330.

The optimum synthesized speech search portion 302 serves to search for the excitation source parameters with which the synthesized speech vector in the local reproduction is most similar to the target speech vector, and is composed of an adaptive excitation codebook 320, a stochastic excitation codebook 321, a pulse-like excitation codebook 322, a VQ gain codebook 323, VQ gain controllers 324 and 327, an adder 325, a fixed codebook selection switch 326, a codevector conversion circuit 328, a synthesis filter 329, a subtracter 330, an error power sum computing circuit 331 and a code selection circuit 332.

The adaptive excitation codebook 320, the stochastic excitation codebook 321 and the pulse-like excitation codebook 322 store, respectively, adaptive excitation codevectors, stochastic excitation codevectors and pulse-like excitation codevectors, which are waveform codes relating to the excitation signal, and the VQ gain codebook 323 stores VQ gain codes relating to the adaptive excitation codevector and the fixed codevector (a general term for the stochastic excitation codevector and the pulse-like excitation codevector).

The adaptive excitation code vector contributes to the voiced speech signal having statistical periodicity, while the stochastic excitation codevector contributes to the unvoiced speech signal having statistically less periodicity. The adaptive excitation codevector of the adaptive excitation codebook 320 is adaptively updated as described later.

The pulse-like excitation codevector is a waveform excitation codevector consisting of unit impulses and is considered to contribute to the steady portion of the voiced speech signal having strong periodicity.

The VQ gain code is, for example, vector-quantized; one component of the vector relates to the VQ gain for the adaptive excitation code vector and the other component relates to the VQ gain for the fixed code vector.

The pulse-like excitation code vector is a simple periodic signal which could be generated by a pulse signal generating circuit, but it is preferably stored as codes and read out from the codebook 322, as in this code excitation linear predictive encoder, for the following reasons. Namely, it is easy to synchronize the excitation vector with the output from the adaptive excitation codebook 320. In addition, by giving the pulse-like excitation codebook the same construction as the codebook 321, the same processing used for searching the stochastic excitation codebook can be used for the pulse-like excitation codevector search.

The various codebooks described above are used to obtain optimum codes such that the locally synthesized speech vector becomes most similar to the target speech vector, and their indices are given to the multiplex circuit 303 and transmitted to the code excitation linear predictive decoder.

In the search for the optimum codes, including the selection of either the stochastic excitation code vector or the pulse-like excitation code vector as described above, this code excitation linear predictive encoder searches the adaptive excitation code, the stochastic excitation code, the pulse-like excitation code and the VQ gain code in turn.

When searching for the optimum adaptive excitation code vector, the outputs from the stochastic excitation codebook 321 and the pulse-like excitation codebook 322 are set to zero (0), and the VQ gain controller 324 multiplies by a suitable gain value ("1", for example). In this state, the adaptive excitation codebook 320 outputs all of the stored adaptive excitation code vectors sequentially or in parallel, and gives them as excitation code vectors to the synthesis filter 329 through the VQ gain controller 324 and the adder 325. The synthesis filter 329 carries out a convolution on the excitation code vector, using as tap coefficients the LPC given from the LPC conversion circuit 314, and synthesized speech vectors, synthesized with only the adaptive excitation code vector as the excitation source signal, are obtained for all the adaptive excitation code vectors.

The subtracter 330 obtains, for all of the adaptive excitation code vectors, the error vector between the target speech vector and the synthesized speech vector to which only the adaptive excitation code vector contributes, and gives it to an error power sum calculation circuit 331. The error power sum calculation circuit 331 obtains the square sum (error power sum) of the error vector for all the adaptive excitation code vectors and gives it to a code selection circuit 332. The code selection circuit 332 determines the adaptive excitation code vector that minimizes the error power sum.

Next, the search for the optimum stochastic excitation code vector is carried out. In this search, the fixed codebook selection switch 326 is set to the side of the stochastic excitation codebook 321, and the output from the adaptive excitation codebook is set to zero (0) or to the previously obtained optimum adaptive excitation code vector. In this state, the stochastic excitation codebook 321 outputs, sequentially or in parallel, all the stored stochastic excitation code vectors and inputs them into the codevector conversion circuit 328 through the fixed codebook selection switch 326 and VQ controller 324.

The codevector conversion circuit 328 converts the frequency characteristics of the inputted stochastic excitation code vector so that they approach the frequency characteristics of the input speech vector over the time length of the stochastic excitation code vector. Each stochastic excitation code vector whose frequency characteristics have been converted in this way is given, as an excitation code vector, to the synthesis filter 329. Thereafter, it is processed in the same manner as in the search for the optimum adaptive excitation code vector, and the code selection circuit 332 determines the optimum stochastic excitation code vector.

After the search for the optimum stochastic excitation code vector is finished as described above, the search for the optimum pulse-like excitation code vector is carried out. In this search, the fixed codebook selection switch 326 is set to the side of the pulse-like excitation codebook 322, and the output from the adaptive excitation codebook 320 is set to zero (0) or to the previously obtained optimum adaptive excitation code vector. In this state, the pulse-like excitation codebook 322 outputs, sequentially or in parallel, all the stored pulse-like excitation code vectors. The subsequent processing is substantially the same as in the search for the optimum stochastic excitation code vector and, accordingly, a more detailed explanation is omitted.

When the optimum pulse-like excitation code vector has been determined as described above, the code selection circuit 332 compares the error power sum of the code vector selected in the stochastic excitation code vector search with the error power sum of the code vector selected in the pulse-like excitation code vector search, takes the one with the smaller error power sum, and thereby determines the fixed code to be transmitted to the code excitation linear predictive decoder.
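The comparison made by the code selection circuit 332 can be sketched as below. The inputs are assumed to be the error power sums already computed for every stochastic and pulse-like candidate; the function name, the string labels and the tie-break in favour of the stochastic codebook are illustrative choices, not taken from the specification.

```python
def select_fixed_code(stochastic_errors, pulse_errors):
    """Return (codebook, index, error) for the fixed code with the smaller error power sum."""
    s_idx = min(range(len(stochastic_errors)), key=stochastic_errors.__getitem__)
    p_idx = min(range(len(pulse_errors)), key=pulse_errors.__getitem__)
    if pulse_errors[p_idx] < stochastic_errors[s_idx]:
        return ("pulse", p_idx, pulse_errors[p_idx])
    return ("stochastic", s_idx, stochastic_errors[s_idx])
```

The first element of the result corresponds to the fixed codebook selection switching information, and the second to the fixed code index sent to the decoder.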

Thereafter, the search for the optimum VQ gain code is carried out. In this search, the optimum (selected) adaptive excitation code vector is outputted from the adaptive excitation codebook 320, the fixed codebook selection switch 326 is switched to the selected one of the stochastic excitation codebook 321 and the pulse-like excitation codebook 322, and the optimum (selected) fixed code vector is outputted from the selected fixed codebook 321 or 322. The VQ gain codebook 323 holds a VQ gain for the adaptive excitation code vector and a VQ gain for the fixed code vector. The VQ gain for the adaptive excitation code vector is given to the VQ gain controller 324 and the VQ gain for the fixed code vector is given to the VQ gain controller 327. Thus, the gain-controlled optimum adaptive excitation code vector and the optimum fixed code vector, which has undergone the frequency characteristic conversion and VQ gain control, are added by the adder 325 and then given to the synthesis filter as an excitation code vector. This processing is carried out, sequentially or in parallel, for all the VQ gain codes in the VQ gain codebook 323.

After the optimum adaptive excitation code, optimum fixed code and optimum VQ gain code are selected, the code selection circuit 332 gives the indexes of these codes to the multiplex circuit 303, together with fixed codebook selection switching information indicating which of the stochastic excitation code vector and the pulse-like excitation code vector was actually selected. The multiplex circuit 303 multiplexes these indexes with the LSP parameter code given from the LSP parameter coding circuit 312 and transmits the result to the code excitation linear predictive decoder. Incidentally, when vector quantization is used as the VQ gain coding method, the transmitted index is the vector number.

The coding processing described above is repeated for each subframe, and the coded speech information is transmitted in turn to the code excitation linear predictive decoder.

Fig. 5 shows in detail the specific structure of the codevector conversion circuit 328. In Fig. 5, the codevector conversion circuit 328 has two cascaded filters 328a and 328b, and a pitch lag decision circuit 328c.

The fixed code vector is given to the first filter 328a. The transfer function H1(Z) of the first filter 328a is set as shown by formula (6), by which the frequency conversion processing is carried out on the fixed code vector.

$$H_1(Z) = \frac{1 - \sum_{j=1}^{p} A^{j} a_{j} Z^{-j}}{1 - \sum_{j=1}^{p} B^{j} a_{j} Z^{-j}} \quad (6)$$

wherein a_j (j = 1 to p) is a tap coefficient of the synthesis filter 329 supplied from the LPC conversion circuit 314, and p is the vocal tract analysis order. Further, A and B are constants determined in the ranges 0<A≤1 and 0<B≤1.

The code vector whose frequency characteristics have been processed by the first filter 328a is transmitted to the second filter 328b. The pitch lag decision circuit 328c obtains a pitch lag L from the index of the optimum adaptive excitation code of the adaptive excitation codebook 320 and gives the pitch lag L to the second filter 328b. The transfer function H2(Z) of the second filter 328b is determined as shown by formula (7), by which a frequency conversion is carried out on the inputted fixed code vector.

$$H_2(Z) = \frac{1}{1 - \varepsilon Z^{-L}} \quad (7)$$

wherein ε is a constant determined in the range 0<ε≤1. The output of the second filter 328b is given to the VQ gain controller 327 shown in Fig. 3.
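The cascade of filters 328a and 328b can be sketched with a small direct-form recursive filter applied twice. This is an illustration only: it assumes zero initial filter states, an integer pitch lag, and hypothetical function names; the constants A, B and ε are supplied by the caller.

```python
def iir_filter(x, b, feedback):
    """y[k] = sum_j b[j] * x[k-j] + sum_(lag, c) c * y[k-lag]; zero initial state.
    'feedback' maps a delay (in samples) to its coefficient."""
    y = []
    for k in range(len(x)):
        acc = sum(bj * x[k - j] for j, bj in enumerate(b) if k - j >= 0)
        acc += sum(c * y[k - lag] for lag, c in feedback.items() if k - lag >= 0)
        y.append(acc)
    return y

def convert_fixed_codevector(c, a, A, B, eps, lag):
    """Apply H1(Z) of formula (6), then H2(Z) of formula (7), to fixed codevector c."""
    b1 = [1.0] + [-(A ** j) * aj for j, aj in enumerate(a, start=1)]   # numerator of H1
    fb1 = {j: (B ** j) * aj for j, aj in enumerate(a, start=1)}        # denominator of H1
    stage1 = iir_filter(c, b1, fb1)
    return iir_filter(stage1, [1.0], {lag: eps})                       # H2: pitch filter
```

With A equal to B the first stage passes the codevector unchanged, so a unit impulse filtered with ε = 0.5 and L = 2 yields samples 1, 0.5 and 0.25 at positions 0, 2 and 4.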

By the codevector conversion circuit 328 as described above, the frequency characteristics of the inputted fixed code vector can be made closer to the frequency characteristics of the input speech vector, in accordance with a time length of the fixed code vector.

Accordingly, the code excitation linear predictive coding apparatus (encoder) can provide a high quality regenerated speech signal.

Next, a code excitation linear predictive decoder corresponding to the code excitation linear predictive coding apparatus (encoder) shown in Fig. 3 will be described with reference to the accompanying drawing.

Fig. 4 is a block diagram of a code excitation linear predictive decoder which corresponds to the code excitation linear predictive coding apparatus (encoder) shown in Fig. 3. In Fig. 4, the code excitation linear predictive decoder has a demultiplex circuit 440, an LSP parameter decoding circuit 441, an LPC conversion circuit 442, an adaptive excitation codebook 443, a stochastic excitation codebook 444, a pulse-like excitation codebook 445, a VQ gain codebook 446, a VQ gain controller 447, a VQ gain controller 449, a fixed codebook selection switch 448, a code vector conversion circuit 450, an adder 451 and a synthesis filter 452.

The coded speech information given from the code excitation linear predictive encoder is inputted to the demultiplex circuit 440. The demultiplex circuit 440 separates the coded speech information into the LSP parameter code, the index of the optimum adaptive excitation code, the index of the optimum fixed code, the index of the optimum VQ gain code and the fixed codebook selection switch information.
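The separation performed by the demultiplex circuit 440 can be pictured as unpacking fixed-width fields from one coded word. The sketch below is purely hypothetical: the source gives no bit allocation, so the field widths, the field order, the names `FIELDS`, `SubframeCodes`, `multiplex` and `demultiplex`, and the simplification of carrying every field in a single word are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical field widths in bits; the source does not specify them.
FIELDS = (
    ("lsp_code", 24),
    ("adaptive_index", 7),
    ("fixed_index", 9),
    ("gain_index", 7),
    ("fixed_codebook_switch", 1),  # 0: stochastic, 1: pulse-like (assumed)
)

SubframeCodes = namedtuple("SubframeCodes", [name for name, _ in FIELDS])


def multiplex(codes):
    """Pack the code fields into one integer, first field in the top bits."""
    word = 0
    for name, width in FIELDS:
        value = getattr(codes, name)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def demultiplex(word):
    """Recover the code fields from a word produced by multiplex()."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return SubframeCodes(**values)
```

Packing and unpacking with the same field table keeps the encoder and decoder in agreement about which bits carry which index, which is the only property this sketch is meant to show.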

Then, the LSP parameter code is given to the LSP parameter decoding circuit 441 and the index of the optimum adaptive excitation code is given to the adaptive excitation codebook 443. Further, the index of the optimum VQ gain code is given to the VQ gain codebook 446 and the fixed codebook selection switch information is given to the fixed codebook selection switch 448.

The index of the optimum fixed code is given to the pulse-like excitation codebook 445 or the stochastic excitation codebook 444, as determined by the fixed codebook selection switching information. The adaptive excitation codebook 443 outputs an adaptive excitation code vector which is determined by the given index, and this adaptive excitation code vector is VQ gain-controlled through the VQ gain controller 447 and given to an adder 451. Further, the adaptive excitation codebook 443 gives the adaptive excitation code vector to a code vector conversion circuit 450.

The stochastic excitation codebook 444 or the pulse-like excitation codebook 445 gives a stochastic excitation code vector or a pulse-like excitation code vector, which corresponds to the given index, to the code vector conversion circuit 450 through the fixed codebook selection switch 448.
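The role of the fixed codebook selection switch 448 can be sketched as a lookup that first picks a codebook and then reads the indexed vector from it. In this minimal Python sketch the function name `fixed_code_vector`, the boolean encoding of the switch information, and the representation of each codebook as a list of vectors are assumptions for illustration.

```python
def fixed_code_vector(index, use_pulse_codebook, stochastic_codebook, pulse_codebook):
    """Return the fixed code vector selected by the switch information.

    index               : index of the optimum fixed code
    use_pulse_codebook  : True selects the pulse-like codebook (assumed encoding)
    stochastic_codebook : list of stochastic excitation code vectors
    pulse_codebook      : list of pulse-like excitation code vectors
    """
    codebook = pulse_codebook if use_pulse_codebook else stochastic_codebook
    if not 0 <= index < len(codebook):
        raise IndexError("fixed code index out of range for the selected codebook")
    return list(codebook[index])  # copy so later filtering cannot alter the codebook
```

Because a single index is shared by both codebooks, only the one-bit switch information needs to be transmitted in addition to the index itself.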

The code vector conversion circuit 450 operates so that the frequency characteristics of the fixed code vector become closer to the frequency characteristics of the input speech vector, in accordance with the LPC and the index of the adaptive excitation code vector. The specific structure of the code vector conversion circuit 450 is the same as the structure shown in Fig. 5. The frequency-processed fixed code vector is then VQ gain-controlled by a VQ gain controller and given to the adder 451.

The adder 451 adds the given adaptive excitation code vector and the fixed code vector together, and the sum is taken as the excitation code vector, which is then given to a synthesis filter 452. The synthesis filter 452 outputs a synthesized speech vector.
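The last two decoder stages, the adder 451 and the synthesis filter 452, can be sketched together for one subframe. This Python sketch is an illustration under assumptions: the all-pole form 1 / (1 - Σ a_j Z^-j) is inferred from the sign convention of formula (6) rather than stated for the synthesis filter, the filter memory is taken as zero at the start of the subframe, and the name `decode_subframe` is hypothetical.

```python
def decode_subframe(adaptive_vec, fixed_vec, gain_adaptive, gain_fixed, a):
    """Form the excitation code vector and synthesize one subframe of speech.

    adaptive_vec  : adaptive excitation code vector
    fixed_vec     : frequency-processed fixed code vector
    gain_adaptive : gain applied to the adaptive vector
    gain_fixed    : gain applied to the fixed vector
    a             : tap coefficients a_1..a_p of the synthesis filter
    """
    if len(adaptive_vec) != len(fixed_vec):
        raise ValueError("code vectors must have the same length")
    # Adder: gain-controlled adaptive vector plus gain-controlled fixed vector.
    excitation = [gain_adaptive * ca + gain_fixed * cf
                  for ca, cf in zip(adaptive_vec, fixed_vec)]
    # Synthesis filter (assumed form): s[n] = e[n] + sum_j a_j * s[n - j].
    speech = []
    for n, e in enumerate(excitation):
        s = e
        for j in range(1, len(a) + 1):
            if n - j >= 0:
                s += a[j - 1] * speech[n - j]
        speech.append(s)
    return excitation, speech
```

A practical decoder would carry the filter memory from one subframe to the next; the zero-memory simplification here keeps the example short.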

The code excitation linear predictive decoder carries out the above-described processes each time a decoded speech vector is given or, in other words, for each subframe.

Important features of the present invention are that the LSP parameter is used and transmitted as a vocal tract parameter; that a pulse-like excitation codebook is provided for giving an excitation source parameter; and that a frequency characteristic of the fixed code vector is controlled. These features can be provided independently in each of the coding apparatus and decoding apparatus without loss of their advantages and effects.

In addition, the coding apparatus and decoding apparatus described above relate primarily to the forward-type code excitation linear predictive encoder and decoder, respectively, but the present invention is not limited thereto and is also applicable to a backward-type code excitation linear predictive encoder and decoder, respectively.

The above-described encoder and decoder were designed to solve the problems arising from low rate coding of 4-bit/s or less. However, more favorable sound reproduction can be realized if they are adapted to encoders and decoders for high rate coding. If a higher coding rate is allowable, the stochastic excitation codebook and the pulse-like excitation codebook can both be used together effectively, rather than selectively operating either the stochastic excitation codebook or the pulse-like excitation codebook.

INDUSTRIAL APPLICABILITY

The present invention proceeds from the consideration that the frequency characteristic of an actual excitation code vector is relatively close to that of the input speech vector. In order to bring the frequency characteristic of the excitation code vector still closer to that of the input speech vector, the stochastic excitation code vector is convolved with a specific impulse response. Thereafter, an adaptive excitation code vector is added to produce the excitation code vector. Therefore, an excitation code vector which is well matched to the input speech vector can be obtained from a small number of vectors and, at the same time, quantization error can be masked by the conversion operation on the excitation code vector, thereby improving reproduction quality.
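The convolution mentioned here and the recursive filtering of formula (7) are two views of the same operation over one vector. The Python sketch below checks this numerically for the pitch-lag filter; the helper names `recursive_filter`, `impulse_response` and `convolve`, the sample values, and the zero initial state are assumptions made for illustration.

```python
def recursive_filter(x, L, eps):
    """Direct recursion for 1 / (1 - eps * Z^-L) with zero initial state."""
    y = []
    for n, xn in enumerate(x):
        y.append(xn + (eps * y[n - L] if n >= L else 0.0))
    return y


def impulse_response(L, eps, length):
    """Impulse response of the same filter, truncated to the vector length."""
    return recursive_filter([1.0] + [0.0] * (length - 1), L, eps)


def convolve(x, h):
    """Causal convolution truncated to len(x): y[n] = sum_k h[k] * x[n - k]."""
    return [sum(h[k] * x[n - k] for k in range(n + 1)) for n in range(len(x))]


code_vector = [0.3, -1.0, 0.7, 0.2, -0.4, 0.9]  # arbitrary example values
h = impulse_response(L=2, eps=0.5, length=len(code_vector))
by_convolution = convolve(code_vector, h)
by_recursion = recursive_filter(code_vector, L=2, eps=0.5)
```

Within the length of one vector the two results agree to rounding error, so an implementation may choose whichever form is cheaper.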

Further, in addition to the adaptive excitation codebook and the stochastic excitation codebook, a pulse-like excitation codebook is provided which stores pulse-like excitation code vectors composed of unit impulses. Accordingly, rapid tracking of a speech signal having periodicity can be realized, and a clear pulse-like excitation code vector can be formed at a steady portion of the speech signal.
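One simple way to picture such a codebook is as a set of vectors that each contain a single unit impulse at a different position. This is only a hypothetical construction: the source states that the vectors are composed of unit impulses but does not give the codebook contents, so the one-impulse-per-position layout and the name `build_pulse_codebook` are assumptions.

```python
def build_pulse_codebook(vector_length):
    """Build a codebook whose k-th vector has one unit impulse at position k.

    vector_length : number of samples in each code vector (the subframe length)
    """
    return [
        [1.0 if n == k else 0.0 for n in range(vector_length)]
        for k in range(vector_length)
    ]
```

With this layout the index of a code vector directly encodes the impulse position, and periodicity is supplied afterwards by the pitch-lag filter of formula (7).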

Besides, since the pulse-like excitation code vector and the stochastic excitation code vector are switched over, the apparatus of the present invention can be adapted to low rate coding, and favorably reproduced speech can be realized, for example, during a transitional period of the speech in which random signals and pulse-like signals occur together.

In addition, according to the code excitation linear predictive coding apparatus and decoding apparatus, an excitation code vector is selected from either the stochastic excitation codebook or the pulse-like excitation codebook and, therefore, favorably reproduced speech can be realized even when the number of coded bits for the excitation source parameter is small.

Further, the LSP parameter is used as the vocal tract parameter for sound synthesis. The LSP parameter gives less distortion to the vocal tract vector than LPC when it is coded with a smaller number of code bits and, therefore, reproduction quality at a lower coding rate can be improved from a vocal tract parameter viewpoint.

Claims

A speech coding apparatus comprising:
an adaptively-renewable first codebook means (107) for selectively outputting a first signal;
a first gain controller means (110, 113) for controlling a value of the first signal and outputting a second signal;
a second codebook means (108) for selectively outputting a third signal;
a signal conversion circuit means (109) for converting the third signal into a frequency characteristic and outputting a fourth signal;
a second gain controller means (110, 114) for controlling a value of the fourth signal and outputting a fifth signal;
an adder means (115) for adding the second signal and the fifth signal and thereby obtaining an excitation signal for use in speech synthesis; wherein
the first codebook means (107) is adaptively renewed on the basis of the excitation signal; and
the signal conversion circuit means (109) is arranged to generate an impulse response of a transfer function which is determined in accordance with pitch information relative to the first signal and to obtain the fourth signal by convolving the third signal with this impulse response.
A speech decoding apparatus comprising:
an adaptively-renewable first codebook means (204) for selectively outputting a first signal;
a first gain controller means (207, 208) for controlling a value of the first signal and outputting a second signal;
a second codebook means (205) for selectively outputting a third signal;
a signal conversion circuit means (206) for converting the third signal into a frequency characteristic and outputting a fourth signal;
a second gain controller means (207, 209) for controlling a value of the fourth signal and outputting a fifth signal;
an adder means (210) for adding the second signal and the fifth signal and thereby obtaining an excitation signal for use in speech synthesis; wherein
the first codebook means (204) is adaptively renewed on the basis of the excitation signal; and
the signal conversion circuit means (206) is arranged to generate an impulse response of a transfer function which is determined in accordance with pitch information relative to the first signal and to obtain the fourth signal by convolving the third signal with this impulse response.
A code excitation linear predictive coding apparatus which uses an excitation signal of an excitation codebook (108) as an excitation source information of a speech signal, the apparatus being characterised in that it comprises:-
a code vector conversion circuit means (109) for converting an excitation code vector selected from the excitation codebook (108) into a frequency characteristic which is determined at the time of output of said excitation code vector, said frequency characteristic serving as the input of a synthesis filter (105).
A code excitation linear predictive decoding apparatus which uses an excitation signal of an excitation codebook (205) as an excitation source information of a speech signal, the apparatus being characterised in that it comprises:-
a code vector conversion circuit means (206) for converting an excitation code vector selected from the excitation codebook (205) into a frequency characteristic which is determined at the time of output of said excitation code vector, said frequency characteristic serving as the input of a synthesis filter (203).
A coding or decoding apparatus according to claims 3 or 4, wherein the code vector conversion circuit means (109, 206) generates an impulse response of a transfer function which is determined in accordance with a vocal tract parameter of a speech signal input, and convolutionally computes the excitation code vector with the impulse response.
A coding or decoding apparatus according to claims 1, 2 or 5, wherein the impulse response of the transfer function is represented by:-
$$H(Z) = \frac{1 - \sum_{j=1}^{p} A^{j} a_j Z^{-j}}{1 - \sum_{j=1}^{p} B^{j} a_j Z^{-j}}$$
where a_j are linear predictive coefficients; p is a vocal tract analysis order; and A and B are within the range: 0 < A < 1 and 0 < B < 1.
A coding or decoding apparatus according to claims 3 or 4, wherein the code vector conversion circuit means (109, 206) generates an impulse response of a transfer function which is determined in accordance with an excited pitch lag, and convolutionally computes the excitation code vector with the impulse response.
A coding or decoding apparatus according to claims 1, 2 or 7, wherein the impulse response of the transfer function which is determined in accordance with the excited pitch lag is represented by:-
$$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}}$$
where ε is a constant within the range 0 < ε ≤ 1; and L is pitch lag signal.
A coding or decoding apparatus according to claim 1 or 2, wherein the codebook vector conversion circuit means (109, 206) convolutionally computes the excitation code vector with the impulse response of the transfer function which is determined in accordance with transfer functions represented by:-
$$H(Z) = \frac{1 - \sum_{j=1}^{p} A^{j} a_j Z^{-j}}{1 - \sum_{j=1}^{p} B^{j} a_j Z^{-j}}$$
and
$$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}}$$
where a_j are linear predictive coefficients; p is a vocal tract analysis order; A, B and ε are within the range: 0 < A < 1, 0 < B < 1 and 0 < ε ≤ 1; and L is pitch lag signal.
A coding or decoding apparatus according to claim 3 or 4, wherein the impulse response of the transfer function is determined in accordance with transfer functions represented by:-
$$H(Z) = \frac{1 - \sum_{j=1}^{p} A^{j} a_j Z^{-j}}{1 - \sum_{j=1}^{p} B^{j} a_j Z^{-j}}$$
and
$$H(Z) = \frac{1}{1 - \varepsilon Z^{-L}}$$
where a_j are linear predictive coefficients; p is a vocal tract analysis order; A, B and ε are within the range: 0 < A < 1, 0 < B < 1 and 0 < ε ≤ 1; and L is pitch lag signal.
A coding or decoding apparatus according to any one of claims 1 to 4, wherein said excitation code book or second codebook means is a pulse-like excitation code book (322).