CN101944362B - Integer wavelet transform-based audio lossless compression encoding and decoding method - Google Patents

Integer wavelet transform-based audio lossless compression encoding and decoding method

Info

Publication number
CN101944362B
Authority
CN
China
Prior art keywords
signal
frame
information
module
integer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010281033XA
Other languages
Chinese (zh)
Other versions
CN101944362A (en)
Inventor
吴玺宏
曲天书
迟惠生
高懿
何文欣
张搏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201010281033XA
Publication of CN101944362A
Application granted
Publication of CN101944362B
Legal status: Expired - Fee Related
Anticipated expiration


Abstract

The invention discloses a lossless audio compression encoding and decoding method and belongs to the field of source encoding and decoding. In the method, a signal is framed adaptively according to the correlation between adjacent frames, so that each resulting frame groups samples with similar signal characteristics. This lets the encoder achieve higher compression efficiency and benefits the subsequent integer wavelet transform and linear predictive coding. Lossless compression coding requires that the signal be reconstructed exactly, and the integer lifting wavelet transform guarantees this perfect-reconstruction property. Compared with the prior art, introducing the correlation-based adaptive framing module and the integer lifting wavelet decorrelation module removes more of the redundancy in the original signal, so the generated compressed data contains less redundant information. The method can therefore greatly improve the compression ratio at low computational complexity.

Description

A lossless audio compression encoding and decoding method based on the integer wavelet transform
Technical field
The invention belongs to the field of source encoding and decoding, and specifically relates to a lossless audio compression encoding and decoding method.
Background art
With the arrival of the digital age, the digitization of audio signals has brought great convenience but has also generated massive amounts of audio data. This poses a serious challenge for the storage and transmission of audio signals and has become one of the bottlenecks that hinder people from obtaining and using multimedia information. To address this problem, audio data must be compressed and then stored and transmitted in compressed form. Practice has shown that compressing multimedia data is both necessary and feasible: multimedia data such as sound and images contain considerable redundancy, that is, the data are strongly correlated, so compression can be achieved by removing the redundant information (removing the correlation between data) while retaining the useful audio information. Developing efficient audio coding methods and storing and transmitting audio in compressed form is therefore the inevitable choice. Moreover, as demands on audio quality rise, how to compress audio data at as high a ratio as possible while retaining all of the audio information, and thus deliver truly transparent sound quality, has become a major topic in audio compression coding.
As early as the 1970s, broadcasting organizations in countries such as Britain and Japan began to study lossy compression coding of digital audio. After some 40 years of development, many excellent lossy audio coding standards have emerged, with MP3, AAC and WMA as representative examples. In many cases these formats achieve good subjective sound quality together with a very high compression ratio. When they encounter music with a wide frequency dynamic range, such as large-scale symphonic works, however, the sound quality after lossy coding is less than satisfactory. In addition, in audio editing, re-encoding lossy-compressed audio (converting between two lossy formats) loses further information and introduces greater distortion. To solve these problems and meet more demanding sound quality requirements, lossless compression coding must be used.
At present, research on and application of lossless audio compression coding remain scarce compared with lossy compression coding. Lossless compression has received less attention because its compression ratio rarely exceeds 3:1, whereas lossy algorithms can reach 12:1 or higher. For a lossy algorithm, however, the higher the compression ratio, the poorer the resulting audio quality; lossy compression is the only option when the lowest possible data rate has been fixed as the requirement. Music lovers, meanwhile, want to download high-fidelity stereo audio from the Internet to get the best listening experience. Online music services therefore offer highly compressed audio for consumers to browse and choose from, while listeners devoted to CD-quality audio want a losslessly compressed copy of the original signal, a copy that suffers no signal loss whichever compression algorithm produced it. Besides online audio downloads, lossless audio coding can also be applied in professional settings such as archiving, mixing, studio work and programme production of high-definition audio. In these cases lossless compression avoids the signal loss that repeated editing causes under lossy coding.
From the viewpoint of information theory, an audio signal is an information source, and the data describing the source is the sum of its information content (information entropy) and its redundancy. Nearly all lossless audio compression schemes rest on the same idea: first remove redundancy from the signal, that is, remove only the redundant part of the data without reducing the information content of the source, and then encode the result with an efficient coding scheme. Audio signals contain several kinds of redundancy, chiefly the non-uniform distribution of signal amplitudes, the correlation between adjacent samples, and the correlation between periods. The central question for a lossless compression algorithm is therefore how to remove the redundancy in the audio signal effectively. Well-known lossless audio formats currently include FLAC (Free Lossless Audio Codec), WavPack, TAK (Tom's Audio Kompressor), APE (Monkey's Audio), OFR (OptimFROG), ALAC (Apple Lossless Audio Codec), WMAL (Windows Media Audio Lossless), Shorten, LA (Lossless Audio), TTA (True Audio), LPAC (Lossless Predictive Audio Coder), RAL (RealAudio Lossless) and MPEG-ALS. These algorithms mainly use one of two approaches to decorrelate the signal before lossless coding: time-domain linear predictive coding (LPC), or transform-domain techniques such as IntMDCT (Integer Modified Discrete Cosine Transform). The goal of lossless compression is to remove the redundancy in the data while allowing perfect reconstruction of the original audio signal. Linear predictive coding reduces redundancy further and is particularly effective for stationary signals. In general, stationary audio signals contain more redundancy, while inharmonic (noise-like) signals contain less. The magnitude of a given sample is related to that of its neighbours; generally the current sample is close to the preceding one, and this is all the more true for low-frequency signals.
Mainstream linear predictive coding methods concentrate their effort on the decorrelation stage: the aim is to hand the entropy coding module data that is better suited to entropy coding, so that the entropy coder compresses it well. The basic principle of a linear predictive coder is to exploit the correlation of the audio signal, predicting the current sample x[n] from past samples x[n-1], x[n-2], ...; the more past samples are used, the higher the prediction accuracy. The difference between the current sample and its predicted value (the prediction error) is then encoded. Because the dynamic range of the prediction error is much smaller than that of the original signal, fewer bits are needed to encode it even if the quantization step of the original signal is retained, and the bit rate falls. For a sound whose amplitude varies gently, for example, the prediction error varies between zero and very small values, the mean of e[n] is much smaller than that of x[n], and adjacent samples of the prediction error e[n] are essentially uncorrelated, with a flat spectrum. Fewer data bits therefore suffice to represent its actual value. The most commonly used entropy code is the Rice code: its encoding and decoding procedures are simple, and the encoder does not need to know the prior distribution of the signal, so it is widely used in lossless audio compression. Data that compresses well under Rice coding generally has three properties: first, small amplitudes, since a smaller amplitude can be represented with fewer bits; second, little correlation between data values; and third, a distribution as close as possible to geometric. When linear predictive coding alone is used for decorrelation, however, the redundancy of the original audio signal is not removed completely, that is, decorrelation is incomplete. The prediction error data fed to the entropy coding module still contains redundant information, and adjacent samples of the error signal remain correlated to some degree, so further processing is possible.
Summary of the invention
The object of the invention is to provide a lossless audio compression encoding and decoding method. Using a framing strategy based on the correlation coefficient, the method frames the signal adaptively according to the correlation between adjacent frames, so that the samples within a frame are strongly correlated. Each resulting frame groups signal segments with similar characteristics, which lets the encoder achieve better compression efficiency and benefits the subsequent integer wavelet transform and linear predictive coding. To keep the residual amplitude as small as possible, linear prediction must be as accurate as possible, and linear predictive coding predicts well for strongly correlated signals. The method therefore uses a wavelet transform to split the signal into subbands: because the correlation of a narrowband signal is stronger than that of the full-band signal, the correlation between samples is easier to remove after the wavelet transform. Lossless compression coding requires exact reconstruction of the signal, so an integer lifting wavelet transform is adopted to guarantee perfect reconstruction. With the correlation-based adaptive framing module and the integer lifting wavelet decorrelation module in place, the redundancy in the original signal is removed more thoroughly and the generated compressed data contains less redundant information, so a considerable gain in compression ratio is obtained at very small cost in computational complexity.
The invention comprises the correlation-based adaptive framing technique, the decorrelation technique based on the lifting integer wavelet transform, and the other related techniques involved in encoding and decoding. It provides a lossless audio encoding and decoding system with a higher compression ratio than decorrelation by linear prediction alone.
A lossless audio codec system according to the method of the invention can be divided into two parts, an encoder subsystem and a decoder subsystem:
The encoder subsystem comprises:
Framing module: performs adaptive framing of the input audio signal;
Integer wavelet transform module: splits each audio frame produced by framing into subbands;
Linear predictive coding module: applies linear prediction to the signal in each subband to remove the correlation between adjacent samples;
Entropy coding module: performs lossless source coding of the residual signal output by the linear predictive coding module;
Bitstream formation module: assembles the entropy-coded stream, frame length information, wavelet level information, LPC parameters and codebook information produced by the modules above into a bitstream and writes it to a file in a defined format.
The decoder subsystem comprises:
Bitstream separation module: separates the bitstream of the compressed audio file, according to the defined format, into the entropy-coded stream, frame length information, wavelet level information, LPC parameters and codebook information;
Entropy decoding module: decodes the entropy-coded stream to regenerate the residual signal exactly;
LPC reconstruction module: reconstructs the subband signals of the wavelet transform from the LPC parameters in the side information and the residual signal;
Integer lifting wavelet reconstruction module: synthesizes the subband signals of the wavelet decomposition back into a complete audio frame;
Frame merging module: merges the reconstructed audio frames into a single PCM audio stream and writes the WAVE file header, producing the decompressed WAVE file.
The lossless encoding and decoding method for audio signals according to the invention is implemented as follows:
Lossless encoding proceeds as follows. The audio file is first divided into frames according to the framing strategy, and the framing information (the frame length information) is placed in the side information for transmission. Each frame is processed separately: a wavelet transform first yields an approximation signal and detail signals, the number of wavelet decomposition levels (the wavelet level information) is determined by an adaptive rule, and the number of levels is likewise placed in the side information. The approximation and detail signals pass through the linear prediction module, which yields the residual signal and the LPC parameters. The residual signal is entropy coded to give the entropy-coded stream, and the LPC parameters and the entropy-coding codebook information are placed in the side information. Finally, the individual streams (the side information and the entropy-coded stream) are multiplexed into the final compressed bitstream.
Lossless decoding is simply the inverse of encoding. The side information is decoded first, and the entropy-coding codebook, LPC parameters, level information and frame length information are separated from it. The entropy decoding module performs entropy decoding according to the codebook information and recovers the LPC prediction residual from the entropy-coded stream. The LPC reconstruction module uses the LPC parameters to recover the approximation and detail signals of the wavelet decomposition from the residual, and the integer lifting wavelet reconstruction module then reconstructs each frame from the approximation and detail signals according to the wavelet level information. Finally, the frames are concatenated in order according to the framing information, recovering the original audio file without loss.
The lossless audio codec system according to the method of the invention comprises an encoder subsystem and a decoder subsystem. The key techniques used throughout the system are correlation-based adaptive framing, the integer lifting wavelet transform, adaptive linear predictive coding, and Rice entropy coding for geometrically distributed data. Each is described in turn below:
1. Correlation-based adaptive framing
The term "frame" comes from imaging, where it refers to dividing a continuous moving picture into individual still pictures; a comic strip is a good example. In digital audio, "frame" means that after the analog signal has been converted to a digital signal, the digital signal is divided into many small segments, each of which is called a frame. Audio signals contain a considerable number of abrupt transitions. If frames of fixed length are used, the correlation among the samples within each frame suffers, and the compression ratio drops.
The invention merges strongly correlated signal segments into one frame according to the correlation coefficient of adjacent frames. This improves the compaction achieved by both the wavelet transform and linear prediction and yields higher compression efficiency. Working in units of the minimum frame length, the correlation coefficient between the current frame and the previous frame is computed. If the coefficient is below a threshold, the two frames are marked as uncorrelated and each forms a frame of its own. If the coefficient is above the threshold, the current and previous frames are considered correlated, and adjacent correlated frames are merged successively. The merged frame may not exceed the configured maximum frame length; when the merged length would exceed this maximum, a new frame is started. With this framing strategy, signal segments with consistent characteristics are processed within a single frame.
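The merging rule described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the use of the Pearson correlation coefficient, the handling of constant blocks and of a shorter final block, and the choice to close a frame before it would exceed the maximum length are all assumptions, since the text does not fix these details.

```python
import math

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sample blocks."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant block is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)

def adaptive_frames(samples, min_len, max_len, threshold):
    """Return the list of frame lengths chosen for `samples`.

    The signal is cut into blocks of `min_len` samples; a block joins the
    current frame when it is correlated with the previous block and the
    merged frame would not exceed `max_len`.
    """
    blocks = [samples[i:i + min_len] for i in range(0, len(samples), min_len)]
    lengths = []
    current = 0
    for i, block in enumerate(blocks):
        if current == 0:          # first block opens the first frame
            current = len(block)
            continue
        prev = blocks[i - 1]
        same_size = len(block) == len(prev)  # a short tail block is never merged
        correlated = same_size and correlation(prev, block) > threshold
        if correlated and current + len(block) <= max_len:
            current += len(block)
        else:
            lengths.append(current)
            current = len(block)
    if current:
        lengths.append(current)
    return lengths
```

The returned frame lengths correspond to the frame length information that the encoder places in the side information, so the decoder can cut the stream at the same positions.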
2. Integer wavelet transform
An integer wavelet transform maps integers to integers: the input signal is integer valued, the wavelet coefficients after the transform are also integers, and the original signal can be recovered exactly by the inverse transform. The coefficients produced by a conventional wavelet transform are floating-point numbers, which is computationally expensive and rules out lossless compression of the data. If the wavelet transform is computed with the lifting scheme, adding a rounding (quantization) operation inside the lifting steps yields an integer-to-integer wavelet transform. Integer wavelet transforms are widely used in image compression, where they enable low-complexity embedded coding that scales from lossy to lossless, but they have not yet been applied well to lossless compression of audio signals.
With conventional transforms, whether the fast Fourier transform or the wavelet transform, the input signal is integer valued but the transform coefficients are floating-point numbers; the computer introduces rounding errors when processing them, so lossless compression cannot be achieved. Now consider adding a quantization operation in each lifting step: if the input vector x is integer valued, the output y is also integer valued, and x can be recovered exactly from y. Note that quantization here plays a different role from quantization in data compression: it causes no information loss and serves only to produce integer output. Because it includes quantization, the integer wavelet transform is a nonlinear transform, which makes it more complicated to analyse. In practice, with a suitable choice of quantization operation, the integer wavelet transform can be treated approximately as a linear transform to simplify the analysis.
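To make the integer-to-integer property concrete, here is a minimal sketch of one lifting level. The text does not name the wavelet filter it uses, so the integer Haar wavelet (the S-transform) is assumed purely for illustration, and the function names are hypothetical. The rounding is a floor division inside the update step, and the inverse undoes each step in reverse order with the same rounding.

```python
def forward_lifting(x):
    """One level of integer Haar lifting; len(x) must be even.

    Returns (approx, detail), both lists of integers.
    """
    even, odd = x[0::2], x[1::2]
    # Predict step: each odd sample is predicted by its even neighbour.
    detail = [o - e for e, o in zip(even, odd)]
    # Update step with rounding: d >> 1 is floor(d / 2) for Python integers.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail

def inverse_lifting(approx, detail):
    """Exactly invert forward_lifting by undoing the steps in reverse order."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    x = [0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x
```

Because the decoder recomputes the same rounded term `d >> 1` from the detail coefficients before subtracting it, the rounding never loses information, which is the point made in the paragraph above.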
From the viewpoint of multiresolution analysis or band-pass filtering, wavelet decomposition is not limited to the single-level decomposition described above. The approximation signal from one level can be decomposed again to remove more of its correlation. Because different signals are distributed differently in frequency, the number of decomposition levels affects the compression result. The method of the invention therefore selects the number of levels adaptively according to the compression achieved after decomposition, so that the compression result is optimal, and records the optimal number of levels in the side information.
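The adaptive choice of decomposition depth can be sketched as a search over candidate levels. Everything specific here is an assumption for illustration: the integer Haar lifting step stands in for the unspecified wavelet, and the cost function (total coefficient magnitude) is a hypothetical proxy for the actual compressed size that the method optimizes.

```python
def lift(x):
    """One integer Haar lifting level (assumed filter); len(x) must be even."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail

def decompose(x, levels):
    """Decompose the approximation `levels` times.

    Returns [approx, coarsest detail, ..., finest detail].
    """
    bands = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = lift(approx)
        bands.insert(0, detail)
    return [approx] + bands

def cost(bands):
    """Hypothetical cost proxy: total magnitude of all coefficients."""
    return sum(abs(c) for band in bands for c in band)

def best_level(x, max_levels):
    """Pick the level count with the lowest cost; ties go to the shallower level.

    len(x) must be divisible by 2 ** max_levels.
    """
    return min(range(max_levels + 1), key=lambda lv: cost(decompose(x, lv)))
```

A real encoder would replace `cost` with the size of the entropy-coded output for each candidate and write the winning level into the side information.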
3. Adaptive linear predictive coding
The higher the prediction accuracy of a lossless audio encoder, the higher its coding efficiency. Most algorithms remove redundancy with some form of improved linear predictor, applying the predictor to each frame of data to produce a prediction error sequence. The predictor parameters represent the redundancy removed from the signal, and the predictor parameters together with the prediction error represent each frame of the signal.
The basic principle of a linear predictor is to exploit the correlation of the audio signal, predicting the current sample x[n] from past samples x[n-1], x[n-2], ...; the more past samples are used, the higher the prediction accuracy. The difference between the current sample and its predicted value (the prediction error) is then encoded. Because the dynamic range of the prediction error is much smaller than that of the original signal, fewer bits are needed to encode it even if the quantization step of the original signal is retained, and the bit rate falls. The approach is particularly effective for stationary audio signals. For a sound whose amplitude varies gently, for example, the prediction error varies between zero and very small values. If the predictor works well, the prediction error e[n] is uncorrelated and has a flat spectrum. Likewise, the mean of e[n] is smaller than that of x[n], so fewer data bits suffice to represent its actual value.
Linear predictors are widely used in speech and audio signal processing. In most cases an FIR filter is used, and the coefficients of the prediction filter A(z) are determined by minimizing the mean squared prediction error. If the quantizer is ignored, the FIR prediction coefficients can be obtained by solving a set of linear equations. When an FIR filter is used in lossless audio compression, the coefficients are computed by a deterministic procedure and then quantized, and the decoder uses the same coefficients to rebuild x[n] from e[n]. Because the original signal must be reconstructed exactly, the prediction coefficients (the LPC parameters) must be quantized and encoded as part of the lossless audio stream. To let the predictor adapt to changes in the signal, a new set of prediction coefficients is usually determined for each frame produced by framing.
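The encoder and decoder sides of such a predictor can be sketched as below. This is an illustration under stated assumptions, not the patented predictor: the coefficients are taken as already quantized integers scaled by 2**shift (how they are derived and quantized is left out), the first `order` samples are passed through unpredicted, and all function and parameter names are hypothetical.

```python
def predict(x, n, coeffs, shift):
    """Integer prediction of x[n] from x[n-1], x[n-2], ...

    `coeffs` are quantized coefficients scaled by 2**shift (an assumption).
    The floor rounding of >> is identical in encoder and decoder.
    """
    acc = sum(c * x[n - 1 - k] for k, c in enumerate(coeffs))
    return acc >> shift

def lpc_residual(x, coeffs, shift):
    """Encoder side: e[n] = x[n] - prediction, after `order` warm-up samples."""
    order = len(coeffs)
    residual = list(x[:order])  # warm-up samples are stored unchanged
    for n in range(order, len(x)):
        residual.append(x[n] - predict(x, n, coeffs, shift))
    return residual

def lpc_reconstruct(residual, coeffs, shift):
    """Decoder side: rebuild x[n] from e[n] with the same coefficients."""
    order = len(coeffs)
    x = list(residual[:order])
    for n in range(order, len(residual)):
        x.append(residual[n] + predict(x, n, coeffs, shift))
    return x
```

With `coeffs = [8, -4]` and `shift = 2`, the predictor is 2x[n-1] - x[n-2], so a linear ramp such as 3, 5, 7, 9, 11 leaves a residual that is zero after the two warm-up samples. Reconstruction is exact for any input because the decoder forms the same integer prediction from samples it has already rebuilt.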
4. Rice entropy coding for geometrically distributed data
Information theory is the theoretical foundation of data compression. Source coding theory addresses two main questions: (1) the theoretical limit of data compression, and (2) the basic approaches to data compression. By the principles of information theory one can find optimal data compression codes, and the theoretical limit of data compression is the information entropy, the average information content of the source (a measure of uncertainty). If no information may be lost during coding, that is, if the information entropy must be preserved, the information-preserving code is called an entropy code. Entropy encoding is a class of lossless coding that compresses a data stream using its statistics, without regard to semantics; it works from the probability distribution of the messages and removes the redundancy in the error signal without losing information. Commonly used entropy coding methods include run-length encoding (RLE), Shannon coding, Huffman coding and arithmetic coding. Entropy coding is a lossless form of source coding: its role here is to remove the redundant information in the prediction error signal without losing any data. Because the residual signal follows a geometric distribution, Rice coding is used to encode it.
Rice coding is a Huffman code for a Laplacian-distributed source and has a single parameter k. In practice, the prediction error signal produced by the decorrelation stage for each channel is approximately Laplacian distributed. A Rice code word consists of three parts: (1) a sign bit, (2) the k low-order bits, and (3) the remaining high-order bits. The first part gives the sign of e[n]. The second part contains the k least significant bits of the binary representation of |e[n]|. The third part consists of N consecutive zeros, where N is the value represented by the remaining most significant bits of |e[n]|, followed by a 1 inserted as a separator.
To Rice-code an integer n, the coding steps are:
(1) a sign bit (1 for positive, 0 for negative);
(2) floor(|n| / 2^k) consecutive zeros;
(3) a separator bit 1;
(4) the k least significant bits of |n|.
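The four coding steps can be sketched directly. This is a minimal illustration that represents the bitstream as a Python string of "0" and "1" characters for readability; the function names are hypothetical, and zero is encoded here with the positive sign bit, a choice the text leaves open.

```python
def rice_encode(n, k):
    """Encode integer n with Rice parameter k as a string of bits."""
    sign = "1" if n >= 0 else "0"          # step 1: sign bit, 1 = positive
    mag = abs(n)
    unary = "0" * (mag >> k) + "1"         # steps 2 and 3: zeros, then separator
    low = format(mag & ((1 << k) - 1), "0{}b".format(k)) if k > 0 else ""
    return sign + unary + low              # step 4: k low-order bits

def rice_decode(bits, k):
    """Decode one code word from the start of `bits`.

    Returns (value, number_of_bits_consumed).
    """
    positive = bits[0] == "1"
    pos = 1
    quotient = 0
    while bits[pos] == "0":                # count the run of zeros
        quotient += 1
        pos += 1
    pos += 1                               # skip the separator bit
    low = int(bits[pos:pos + k], 2) if k > 0 else 0
    mag = (quotient << k) | low
    return (mag if positive else -mag), pos + k
```

For example, with k = 2 the value 11 becomes the sign bit 1, two zeros for floor(11 / 4) = 2, the separator 1, and the low bits 11, giving `100111`. Small residuals produce short code words, which is why the earlier decorrelation stages aim to keep the residual amplitude low.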
We ran two groups of experiments comparing the lossless compression algorithm described here with two lossless coding formats, MPEG ALS (RM22) and FLAC.
In the first group, we selected music in 13 different styles for lossless compression, to show that the encoder achieves good compression performance on audio signals of different sound characteristics.
Comparison of compression results for audio files of different styles
Figure BSA00000268754400071
In the second group, we selected a set of audio signals with different sampling rates and quantization precisions to test the proposed coding scheme, to show that it performs well across combinations of sampling rate and quantization precision.
Comparison of compression results for audio files with different sampling rates and quantization precisions
Figure BSA00000268754400072
These two groups of experiments show that the method of the invention achieves good compression on music files of different styles, and that on audio files with different sampling rates and quantization precisions it achieves results comparable to current mainstream lossless audio compression software.
Compared with the prior art, the advantages of the invention are:
1. The system frames the signal adaptively according to the correlation between adjacent frames, so that the samples within a frame are strongly correlated, which benefits the subsequent wavelet decomposition and LPC prediction.
2. The system uses the integer lifting wavelet transform, avoiding the errors caused by truncating the processed data and the filter coefficients.
3. The system takes the best compression result as its objective function and adapts the number of integer lifting wavelet levels to each signal.
4. LPC prediction is applied to the signal after wavelet decomposition, increasing the compaction of the decomposed signal.
5. The residual signal is encoded with a Rice code, which suits geometric distributions, compressing the signal further.
Description of the drawings
Fig. 1: block diagram of the encoder structure;
Fig. 2: block diagram of the decoder structure;
Fig. 3: framing strategy;
Fig. 4: lifting wavelet decomposition;
Fig. 5: lifting wavelet reconstruction;
Fig. 6: integer wavelet transform and its inverse;
Fig. 7: encoding diagram of the forward predictor;
Fig. 8: decoding diagram of the forward predictor.
Embodiment
The preferred embodiment of the invention is described in more detail below with reference to the accompanying drawings, explaining how the technical scheme of the invention is implemented:
The lossless audio codec system according to the method of the invention comprises an encoder subsystem and a decoder subsystem. The block diagrams of the system are shown in Fig. 1 and Fig. 2: Fig. 1 is the block diagram of the lossless audio compression encoder subsystem, and Fig. 2 is the block diagram of the lossless audio compression decoder subsystem. The system structure is described in detail below with reference to the drawings.
1. Overall scheme:
Encoder: the encoder block diagram is shown in Fig. 1. The audio file is first divided into frames according to the framing strategy, and the framing information is included in the side information. Each frame is then processed separately: a wavelet transform first yields the approximation signal and the detail signal of the frame; the number of wavelet decomposition levels is obtained by an adaptive rule and is likewise included in the side information. The approximation and detail signals pass through the linear prediction module; the residual signal obtained there is entropy coded, and the LPC parameters are included in the side information. Finally, the individual code streams are multiplexed to form the final compressed bit stream.
Decoder: the decoder block diagram is shown in Fig. 2. It is simply the inverse of the encoder. The side information is decoded first and the entropy-coding codebook is separated from it; the LPC prediction residual is decoded; the LPC parameters are used to recover the approximation and detail signals of the wavelet decomposition; each frame is then reconstructed according to the wavelet decomposition level information; finally, the frames are joined in order according to the framing information, which yields the original audio file without loss.
2. Framing strategy:
Audio signals contain a considerable number of abrupt transitions. If a fixed frame length is used for framing, the correlation among the samples within each frame is badly affected, which lowers the compression ratio. This scheme merges strongly correlated signals into one frame according to the correlation coefficient of adjacent frames. As a result, the compactness of both the wavelet transform and the linear prediction improves, and higher compression efficiency is obtained. The scheme therefore adopts the framing strategy shown in Fig. 3, so that signals with consistent characteristics are processed within one frame.
First, taking the minimum frame length as the unit, the correlation coefficient between the current frame and the previous frame is calculated. If the coefficient is less than the threshold, the current frame is marked as uncorrelated with the previous frame and forms a frame of its own. If the coefficient is greater than the threshold, the current frame and the previous frame are regarded as correlated frames. Adjacent correlated frames are merged in turn, but the merged frame length must not exceed the configured maximum frame length.
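The merging rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which correlation coefficient is used, so the sketch assumes the Pearson coefficient between consecutive minimum-length blocks; it treats a coefficient exactly equal to the threshold as uncorrelated; and the names `correlation`, `adaptive_frames`, `min_len`, `max_len` and `threshold` are hypothetical.

```python
import math

def correlation(a, b):
    """Pearson correlation of two equal-length blocks (assumed measure; 0.0 if a block is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def adaptive_frames(signal, min_len, max_len, threshold):
    """Return the list of frame lengths chosen for `signal`."""
    # Cut the signal into blocks of the minimum frame length (the last may be shorter).
    blocks = [signal[i:i + min_len] for i in range(0, len(signal), min_len)]
    if not blocks:
        return []
    lengths = [len(blocks[0])]
    for prev, cur in zip(blocks, blocks[1:]):
        # A short trailing block is never merged in this sketch.
        correlated = len(cur) == min_len and correlation(prev, cur) > threshold
        if correlated and lengths[-1] + len(cur) <= max_len:
            lengths[-1] += len(cur)      # merge into the current frame
        else:
            lengths.append(len(cur))     # start a new frame
    return lengths
```

The returned frame lengths are what an encoder of this kind would have to carry as framing information in the side information so that the decoder can split the stream the same way.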
3. Integer lifting wavelet transform:
Lifting scheme: Sweldens et al. have shown that any existing wavelet transform can be realized as a cascade of a finite number of lifting steps; in addition, the lifting framework can be used to implement integer wavelet transforms. Fig. 4 shows the most basic lifting scheme for realizing a wavelet transform, which comprises three basic steps: split, predict and update.
Split: $x[n]$ is separated by downsampling into the odd sequence $x_o[n] = x[2n+1]$ and the even sequence $x_e[n] = x[2n]$. This separation is also known as the Lazy wavelet transform.
Predict: in general, the odd and even sequences are strongly correlated, so the even set can be used to predict the odd set ($P$ is the prediction operator). This removes the redundancy between the data and yields the high-frequency detail of the signal.
$$d[n] = x_o[n] - P(x_e[n]) \tag{1}$$
If the signal is locally smooth, the prediction residual is very small.
Update: this step yields the low-frequency information of $x[n]$, that is, the general outline of the signal, also called the approximation signal. The corresponding operation adds the prediction residual $d[n]$ to the even sequence $x_e[n]$ through the update operator $U$:
$$s[n] = x_e[n] + U(d[n]) \tag{2}$$
The above is the decomposition algorithm of the lifting scheme. The reconstruction algorithm follows from the properties of the lifting matrix and comprises three steps, undo update, undo predict and merge, as shown in Fig. 5.
Undo update:
$$x_e[n] = s[n] - U(d[n]) \tag{3}$$
Undo predict:
$$x_o[n] = d[n] + P(x_e[n]) \tag{4}$$
Merge:
$$x[n] = \mathrm{Merge}(x[2n+1], x[2n]) \tag{5}$$
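Equations (1) to (5) can be traced with a short Python sketch. The operators are passed in as functions so that the structure stays generic; the Haar-like choice used here ($P(x_e) = x_e$, $U(d) = d/2$) is an assumption made for illustration and is not taken from the text, and all function names are hypothetical. The sketch assumes an even-length input.

```python
def lifting_forward(x, predict, update):
    """One lifting step on an even-length list: split, predict, update."""
    xe, xo = x[0::2], x[1::2]                      # split: x[2n] and x[2n+1]
    d = [o - p for o, p in zip(xo, predict(xe))]   # (1) d = xo - P(xe)
    s = [e + u for e, u in zip(xe, update(d))]     # (2) s = xe + U(d)
    return s, d

def lifting_inverse(s, d, predict, update):
    """Undo update, undo predict, then merge."""
    xe = [a - u for a, u in zip(s, update(d))]     # (3) xe = s - U(d)
    xo = [b + p for b, p in zip(d, predict(xe))]   # (4) xo = d + P(xe)
    x = [0] * (len(xe) + len(xo))
    x[0::2], x[1::2] = xe, xo                      # (5) interleave even and odd samples
    return x

# Haar-like operators, chosen only for illustration.
def haar_predict(xe):
    return list(xe)

def haar_update(d):
    return [v / 2 for v in d]
```

Because the inverse applies the same operators to the same inputs in reverse order, reconstruction works for any choice of `predict` and `update`; note, though, that `haar_update` produces non-integer values, which is the problem the quantized lifting steps are meant to remove.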
Constructing different wavelet transforms comes down to choosing different prediction and update operators. These operators can be constant coefficients (simple multiplication) or the impulse responses of filters (convolution). The prediction operator corresponds to the scaling function in the traditional wavelet transform, and different prediction (interpolation) operators give different scaling functions. Likewise, the update operator corresponds to the wavelet function in the traditional wavelet transform.
With traditional transform methods, whether the fast Fourier transform or the wavelet transform, the input signal is integer-valued but the resulting transform coefficients are floating-point numbers. The computer introduces rounding errors when processing them, so lossless compression of the data cannot be achieved.
By contrast, when the wavelet transform is computed with the lifting scheme, a quantization operation can be added inside each lifting step. Because the lifting matrix is fully invertible, an integer-to-integer wavelet transform can be realized. The specific form is given by the following formulas ($Q$ is the quantization operator):
$$y(i) = x(i) + Q[\alpha\, x(j)], \quad y(k) = x(k), \; k \neq i \tag{6}$$
$$x(i) = y(i) - Q[\alpha\, y(j)], \quad x(k) = y(k), \; k \neq i \tag{7}$$
From the quantized lifting steps above, the integer wavelet transform and its inverse can be implemented as shown in Fig. 6. Every lifting step includes a quantization operation, which guarantees that the output of every step is an integer. First, any existing first-generation wavelet can be realized by a cascade of a finite number of lifting structures. Second, four additional lifting steps can make the gain factor equal to 1, so that the final outputs LP and BP are also integers. Because every step is a reversible integer transform, the original signal can be recovered exactly from LP and BP in the inverse transform.
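A minimal Python sketch of quantized lifting steps in the sense of equations (6) and (7) follows. It assumes that $Q$ is the floor operation and uses an integer Haar-type pair of steps (predict with the even sample, update with half the detail); the patent does not specify the filters at this point, so this choice and the function names are illustrative assumptions. The input is assumed to have even length.

```python
def int_lifting_forward(x):
    """Integer-to-integer lifting on an even-length list of ints."""
    xe, xo = x[0::2], x[1::2]
    # Predict step: the detail is already an integer, so Q changes nothing here.
    d = [o - e for o, e in zip(xo, xe)]
    # Update step with quantization: Q[d / 2] taken as floor(d / 2).
    s = [e + (v // 2) for e, v in zip(xe, d)]
    return s, d

def int_lifting_inverse(s, d):
    """Exact inverse: the same quantized terms are subtracted again."""
    xe = [a - (v // 2) for a, v in zip(s, d)]
    xo = [v + e for v, e in zip(d, xe)]
    x = [0] * (len(xe) + len(xo))
    x[0::2], x[1::2] = xe, xo
    return x
```

The rounding inside `v // 2` loses information about `d / 2`, yet the transform stays invertible, because the decoder recomputes exactly the same rounded value from `d` and subtracts it.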
From the viewpoint of multiresolution analysis or band-pass filtering, wavelet decomposition is not limited to the single level described above: the approximation signal obtained from one level can be decomposed again to remove more of its correlation. Because different signals are distributed differently in frequency, the number of decomposition levels affects the compression result. The present invention therefore decomposes the signal with several different numbers of levels, selects the number of levels that gives the highest compression ratio, and records this best decomposition level in the side information.
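The level selection can be sketched as a simple search. The following Python example is an assumption-laden illustration: a real encoder of this kind would compare the actual compressed sizes after LPC and entropy coding, whereas `proxy_cost` here is a hypothetical stand-in (the total magnitude of all coefficients), and the integer Haar step is chosen only to keep the example self-contained.

```python
def haar_int_step(x):
    """One integer Haar lifting step on an even-length list: (approximation, detail)."""
    xe, xo = x[0::2], x[1::2]
    d = [o - e for o, e in zip(xo, xe)]
    s = [e + v // 2 for e, v in zip(xe, d)]
    return s, d

def decompose(x, levels):
    """Decompose the approximation `levels` times: [approx, detail_L, ..., detail_1]."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, d = haar_int_step(approx)
        details.insert(0, d)
    return [approx] + details

def proxy_cost(bands):
    """Hypothetical stand-in for the compressed size of the sub-bands."""
    return sum(abs(v) for band in bands for v in band)

def best_level(x, max_levels, cost=proxy_cost):
    """Return (levels, cost) with the lowest cost, or None if no level is feasible."""
    best = None
    for levels in range(1, max_levels + 1):
        if len(x) % (1 << levels) != 0:
            break                       # frame length must be divisible by 2**levels
        c = cost(decompose(x, levels))
        if best is None or c < best[1]:
            best = (levels, c)
    return best
```

Ties are resolved in favour of fewer levels in this sketch, which keeps the decoder's work smaller; the chosen level is the value that would be written to the side information.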
4. LPC prediction:
This scheme uses forward prediction, which assumes that the value of a discrete-time signal at the current instant can be predicted by a linear combination of the values at the preceding $K$ instants:
$$\hat{x}(n) = \sum_{k=1}^{K} h_k\, x(n-k) \tag{8}$$
where $K$ is the order of the predictor. If the prediction is close to the current signal, the variance of the residual is closer to zero than that of the original signal, which effectively reduces the code length. The structure of the forward predictor is shown in Fig. 7, and the corresponding decoding process is shown in Fig. 8.
The key problem of LPC prediction is how to obtain the predictor coefficients from the input signal. With a predictor of order $p$, the Yule-Walker equations give:
$$
\begin{bmatrix}
r_x(0) & r_x(1) & r_x(2) & \cdots & r_x(p) \\
r_x(1) & r_x(0) & r_x(1) & \cdots & r_x(p-1) \\
r_x(2) & r_x(1) & r_x(0) & \cdots & r_x(p-2) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
r_x(p) & r_x(p-1) & r_x(p-2) & \cdots & r_x(0)
\end{bmatrix}
\begin{bmatrix} 1 \\ -a_1 \\ -a_2 \\ \vdots \\ -a_p \end{bmatrix}
=
\begin{bmatrix} \sigma^2 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\tag{9}
$$
where $r_x(i)$, $i = 0, 1, 2, \ldots, p$, are the sample autocorrelation values, which can be estimated from the autocorrelation function of the $p$ samples preceding the current sample in the predictor; $a_1, a_2, a_3, \ldots, a_p$ are the predictor coefficients; and $\sigma^2$ is the minimum forward prediction error power. Given the autocorrelation matrix, the predictor coefficients that minimize the error power are the ones sought. This scheme uses the classical Levinson-Durbin algorithm to solve for them.
Let the reflection coefficient be $a_m(m) = k_m$. The Levinson-Durbin algorithm gives the following recursion for the coefficients:
$$k_m = \frac{-\sum_{k=1}^{m-1} a_{m-1}(k)\, r_x(m-k) + r_x(m)}{\rho_{m-1}} \tag{10}$$
$$a_m(k) = a_{m-1}(k) - k_m\, a_{m-1}(m-k) \tag{11}$$
$$\rho_m = \rho_{m-1}\,(1 - k_m^2) \tag{12}$$
The algorithm obtains the coefficients of every order recursively, so a suitable predictor order and the corresponding coefficients can be selected by comparison. Because $k_m^2 \ge 0$, $\rho_m \le \rho_{m-1}$ always holds; that is, the prediction error power decreases, or at worst stays the same, from one order to the next as the iteration proceeds.
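The recursion (10) to (12) translates almost line by line into Python. The sketch below assumes the usual starting value $\rho_0 = r_x(0)$, which the text does not state explicitly, and assumes $r_x(0) > 0$ so that no division by zero occurs; the function name and return layout are hypothetical.

```python
def levinson_durbin(r, p):
    """Run the Levinson-Durbin recursion up to order p.

    r   : autocorrelation values r[0], ..., r[p]
    Returns (a, rho), where a[k - 1] holds a_p(k) for k = 1..p in the
    convention x_hat(n) = sum_k a(k) * x(n - k), and rho is the final
    prediction error power.
    """
    a = []           # coefficients of the previous order, a_{m-1}(1..m-1)
    rho = r[0]       # assumed initial error power rho_0 = r_x(0)
    for m in range(1, p + 1):
        # (10): reflection coefficient k_m
        num = r[m] - sum(a[k - 1] * r[m - k] for k in range(1, m))
        k_m = num / rho
        # (11): update lower-order coefficients, then append a_m(m) = k_m
        a = [a[k - 1] - k_m * a[m - k - 1] for k in range(1, m)] + [k_m]
        # (12): error power of order m
        rho *= 1 - k_m * k_m
    return a, rho
```

For an autocorrelation sequence $r_x(i) = 0.5^i$, which corresponds to a first-order process, the second reflection coefficient comes out as zero, so raising the order beyond 1 brings no gain; comparing `rho` across orders is one way to pick the predictor order.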
Because the predictor coefficients computed by the above optimality criterion are all floating-point numbers, the predictor reflection coefficients kept in the side information are quantized fixed-point numbers. This gives up some optimality but guarantees that the signal can be reconstructed exactly.
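Why quantized coefficients preserve exact reconstruction can be seen in a small Python sketch. It is only an illustration under assumptions: the text stores quantized reflection coefficients, whereas the sketch quantizes direct-form predictor coefficients for brevity; the first $K$ samples are passed through unpredicted; and `frac_bits` and all function names are hypothetical.

```python
def quantize_coeffs(a, frac_bits):
    """Round floating-point coefficients to integers scaled by 2**frac_bits."""
    return [int(round(c * (1 << frac_bits))) for c in a]

def predict(history, q, frac_bits):
    """Integer prediction from the K most recent samples (history[-1] is x(n-1))."""
    acc = sum(c * history[-k] for k, c in enumerate(q, start=1))
    return acc >> frac_bits          # floor division by 2**frac_bits

def lpc_encode(x, q, frac_bits):
    """Residual of an integer signal; the first K samples are stored verbatim."""
    K = len(q)
    res = list(x[:K])
    for n in range(K, len(x)):
        res.append(x[n] - predict(x[n - K:n], q, frac_bits))
    return res

def lpc_decode(res, q, frac_bits):
    """Rebuild the signal by adding the same integer prediction back."""
    K = len(q)
    x = list(res[:K])
    for n in range(K, len(res)):
        x.append(res[n] + predict(x[n - K:n], q, frac_bits))
    return x
```

Encoder and decoder compute the prediction from identical integer inputs with identical integer arithmetic, so the predictions match bit for bit and the residual restores every sample exactly, however coarse the quantization is.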
5. Entropy coding:
Because the residual follows a certain distribution, entropy coding is generally applied to it. This scheme encodes the residual signal with the Rice code. Rice coding suits data whose distribution is approximately geometric, with distribution function:
$$\Pr\{r_\theta = p\} = (1-\theta)\,\theta^{p}, \quad \theta \in (0,1) \tag{13}$$
However, Rice coding requires the data to be non-negative. To meet this requirement, the residual signal is first mapped so that negative values become positive:
$$r_i^{*} = \begin{cases} 2r_i, & r_i \ge 0 \\ -2r_i - 1, & r_i < 0 \end{cases} \tag{14}$$
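The sign mapping and its inverse can be written directly in Python. The sketch assumes that a residual of zero maps to zero, so that non-negative residuals land on even codes and negative residuals on odd codes; the function names are hypothetical.

```python
def to_unsigned(r):
    """Map a signed residual to a non-negative integer: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * r if r >= 0 else -2 * r - 1

def to_signed(u):
    """Inverse mapping: even codes are non-negative residuals, odd codes negative ones."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2
```

Small magnitudes of either sign stay small after the mapping, which keeps the mapped values close to the geometric shape that the Rice code expects.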
For a datum $p$, the number is first divided by $M$ (in general $M = 2^n$), which gives a quotient $q$ and the corresponding remainder $r$, that is
$$q = \left\lfloor \frac{p}{M} \right\rfloor \tag{15}$$
$$r = p - q \cdot M \tag{16}$$
The datum is then coded as follows: $q$ bits of 1 represent the quotient; one bit of 0 serves as the flag separating the quotient from the remainder; and $\log_2 M$ bits (that is, $n$ bits when $M = 2^n$) represent the remainder $r$.
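A Rice coder for a single value fits in a few lines of Python. The sketch assumes $M = 2^n$ and a fixed-width remainder of $n$ bits, as in the usual form of the Rice code, and represents the bit stream as a string of '0' and '1' characters purely for readability; the function names are hypothetical.

```python
def rice_encode(p, n):
    """Rice code of a non-negative integer p with parameter M = 2**n."""
    q = p >> n                        # quotient p // M
    r = p & ((1 << n) - 1)            # remainder p - q * M
    remainder_bits = format(r, "b").zfill(n) if n > 0 else ""
    return "1" * q + "0" + remainder_bits

def rice_decode(bits, n, pos=0):
    """Decode one value starting at bits[pos]; returns (value, next position)."""
    q = 0
    while bits[pos] == "1":           # count the unary quotient
        q += 1
        pos += 1
    pos += 1                          # skip the '0' separator
    r = int(bits[pos:pos + n], 2) if n > 0 else 0
    return (q << n) | r, pos + n
```

For example, with $n = 2$ the value 9 has quotient 2 and remainder 1 and is coded as `11001`. Values much larger than $M$ produce long runs of ones, which is why the choice of $M$ matters for the code length.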
A suitable value of $M$ in the above is determined as follows:
$$u_N = \frac{1}{N} \sum_{i=1}^{N} |r_i| \tag{17}$$
[Equation (18) is available only as an image in the original publication.]
where the constant $c_1 = 0.97$.
Although a specific embodiment of the present invention and the accompanying drawings are disclosed for the purpose of illustration, in order to help in understanding and implementing the invention, those skilled in the art will appreciate that various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims. The invention should therefore not be limited to what is disclosed in the preferred embodiment and the drawings.

Claims (8)

1. A lossless audio compression encoding method based on integer wavelet transform, comprising the steps of:
1) a framing module merges strongly correlated signals into one frame according to the correlation coefficients of adjacent frames, frames the input audio signal, and includes the framing information in the side information; wherein the framing module first calculates, in units of the minimum frame length, the correlation coefficient between the current frame and the previous frame; if this coefficient is less than a set threshold, the current frame is made a separate frame; otherwise the current frame and the previous frame are marked as correlated frames, and adjacent correlated frames are then merged in turn to form one frame;
2) an integer wavelet transform module performs a wavelet transform on each frame after framing to obtain an approximation signal, a detail signal and decomposition-level information, and includes the decomposition-level information in the side information;
3) a linear predictive coding module performs linear prediction on the approximation signal and the detail signal to obtain a residual signal and LPC parameters, and includes the LPC parameters in the side information;
4) an entropy coding module entropy codes the residual signal to obtain an entropy-coded stream, and includes the codebook information of the entropy coding in the side information;
5) a bit stream forming module multiplexes the side information and the entropy-coded stream to form the final compressed bit stream.
2. The method according to claim 1, characterized in that the integer wavelet transform module is an integer lifting wavelet transform module.
3. The method according to claim 2, characterized in that the integer lifting wavelet transform module is an integer wavelet transform module with four lifting transforms.
4. The method according to claim 1 or 2, characterized in that the decomposition-level information is determined by an adaptive level selection method.
5. The method according to claim 1, characterized in that a maximum frame length threshold is set, and when the length of the merged frame reaches the set maximum frame length threshold, a new frame is started.
6. The method according to claim 1, characterized in that the entropy coding module entropy codes the residual signal with the Rice code.
7. A lossless audio compression decoding method based on integer wavelet transform, used to decode an audio signal encoded with the encoding method according to claim 1, comprising the steps of:
1) a bit stream separation module decodes the side information from the compressed bit stream and separates from it the entropy-coding codebook, the LPC parameters, the decomposition-level information and the framing information;
2) an entropy decoding module entropy decodes the compressed bit stream according to the codebook information of the entropy coding to obtain the residual signal;
3) an LPC reconstruction module uses the LPC parameters to recover the approximation signal and the detail signal of the wavelet decomposition from the residual signal;
4) an integer wavelet reconstruction module reconstructs the approximation signal and the detail signal according to the decomposition-level information to obtain the signal of each frame;
5) a frame merging module joins the frames in order according to the framing information to obtain the original audio signal.
8. The method according to claim 7, characterized in that the integer wavelet reconstruction module is an integer lifting wavelet reconstruction module.
Application number: CN201010281033XA
Priority date: 2010-09-14
Filing date: 2010-09-14
Title: Integer wavelet transform-based audio lossless compression encoding and decoding method
Legal status: Expired - Fee Related

Publications (2)

CN101944362A (en), published 2011-01-12
CN101944362B (en), published 2012-05-30

Family ID: 43436323
Country: CN




Legal Events

C06: Publication
PB01: Publication
C10: Entry into substantive examination
SE01: Entry into force of request for substantive examination
C14: Grant of patent or utility model
GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee

Granted publication date: 2012-05-30

