CN103943113B

Info

Publication number: CN103943113B
Application number: CN201410151551.8A
Authority: CN
Inventors: 王子亮; 陈凤
Original assignee: Fujian Star Net eVideo Information Systems Co Ltd
Current assignee: Fujian Star Net eVideo Information Systems Co Ltd
Priority date: 2014-04-15
Filing date: 2014-04-15
Publication date: 2017-11-07
Anticipated expiration: 2034-04-15
Also published as: CN103943113A

Abstract

The present invention provides a method for removing the accompaniment from a song, comprising the steps of: obtaining an accompaniment audio signal and a song audio signal; applying an FFT to the song audio signal and to the accompaniment audio signal to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal; enhancing the accompaniment audio signal amplitude spectrum; and subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, then applying an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal. The invention also provides a device for removing the accompaniment from a song. The invention improves the purity of the separated singing voice, the separation executes efficiently, and the algorithm is simple and easy to implement.

Description

A method and device for removing the accompaniment from a song

Technical field

The present invention relates to the field of audio signal processing, and in particular to a method and device for removing the accompaniment from a song.

Background art

Singing voice separation systems are widely used in fields such as automatic lyrics recognition and correction. Automatic lyrics recognition usually requires the system input to be solo singing, that is, singing without accompaniment. This is unrealistic for almost all songs, because most songs contain both the singing voice and instrumental accompaniment.

There is still little research on separating the singing voice from music. Picking one sound out of a mixture is easy for people but very difficult for a machine. Speech separation has been widely studied, but music is an extremely complex signal: it mixes the singing voice with the signals of several different instruments, and the instrument sounds are correlated with the singing voice, so blind speech signal separation techniques can hardly isolate a pure singing voice.

The master's thesis 《Song separation based on time frequency analysis》 proposes singing voice separation based on time-frequency (TF) analysis. Its separation process depends mainly on pitch features. In many cases the fundamentals and overtones of the singing voice and of the instruments overlap, so pitch alone rarely yields the complete TF information of the singing voice, and the singing voice often cannot be separated from the accompaniment. This method also suffers from a complex algorithm, a heavy computational load, and low execution efficiency.

Summary of the invention

The present invention provides a method for removing the accompaniment from a song that improves the purity of the separated singing voice, executes efficiently, and uses a simple, easily implemented algorithm.

A method for removing the accompaniment from a song, comprising the steps of:

obtaining an accompaniment audio signal and a song audio signal;

pre-processing the song audio signal and the accompaniment audio signal separately and applying an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;

enhancing the accompaniment audio signal amplitude spectrum;

subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and applying an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal.

The present invention also provides a device for removing the accompaniment from a song, the device comprising:

an accompaniment audio signal and song audio signal acquisition module, for obtaining the accompaniment audio signal and the song audio signal;

a pre-processing and FFT module, for pre-processing the song audio signal and the accompaniment audio signal separately and applying an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;

an accompaniment amplitude spectrum enhancement module, for enhancing the accompaniment audio signal amplitude spectrum;

a spectral subtraction and inverse FFT module, for subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and applying an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal.

The beneficial effects of the present invention are as follows. The invention applies an FFT to the song audio signal and the accompaniment audio signal to obtain their amplitude spectra, enhances the accompaniment audio signal amplitude spectrum, subtracts the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and applies an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal. This improves the purity of the separated singing voice, the separation executes efficiently, and the algorithm is simple and easy to implement.

Brief description of the drawings

Fig. 1 is a flow chart of a method for removing the accompaniment from a song in an embodiment of the present invention;

Fig. 2 is a functional block diagram of a device for removing the accompaniment from a song in an embodiment of the present invention;

Fig. 3 is the time-domain waveform of the song audio of the example song 《Meet》 in an embodiment of the present invention;

Fig. 4 is the time-domain waveform of the accompaniment audio of the example song 《Meet》 in an embodiment of the present invention;

Fig. 5 is the time-domain waveform of the audio of the example song 《Meet》 after accompaniment removal in an embodiment of the present invention.

Description of main reference numerals:

10: accompaniment audio signal and song audio signal acquisition module; 20: pre-processing and FFT module; 30: accompaniment amplitude spectrum enhancement module; 40: spectral subtraction and inverse FFT module.

Detailed description of the embodiments

The present invention subtracts the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, thereby improving the purity of the separated singing voice while keeping the separation efficient.

To describe the technical content, structural features, objects, and effects of the present invention in detail, the embodiments are explained below with reference to the accompanying drawings.

Referring to Fig. 1, a flow chart of a method for removing the accompaniment from a song according to this embodiment is shown. The method comprises the steps of:

S1: obtain an accompaniment audio signal and a song audio signal;

S2: pre-process the song audio signal and the accompaniment audio signal separately and apply an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;

S3: enhance the accompaniment audio signal amplitude spectrum;

S4: subtract the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and apply an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal.

By enhancing the accompaniment audio signal amplitude spectrum, subtracting the enhanced spectrum from the song audio signal amplitude spectrum, and applying an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal, the present invention improves the purity of the separated singing voice; the algorithm of this embodiment is also simple and efficient to execute.

In the present embodiment, step S1, "obtain an accompaniment audio signal and a song audio signal", obtains both signals from a stereo song audio, specifically:

the left-channel signal of the stereo song audio is inverted to obtain a left inverted signal;

the left inverted signal is added to the right-channel signal to obtain the accompaniment audio signal;

and the right-channel signal of the stereo song audio is used as the song audio signal from which the accompaniment is to be removed.

The stereo song audio comprises a left-channel signal and a right-channel signal. The left-channel signal is a mix of the vocal and the left-channel accompaniment, and the right-channel signal is a mix of the vocal and the right-channel accompaniment.
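The channel arithmetic of step S1 can be sketched in a few lines of Python. This is only an illustration, not the patent's implementation: the function name `split_stereo` and the toy sample values are hypothetical, and the sketch assumes the vocal is mixed identically into both channels, so that inverting one channel and adding the other cancels it.

```python
def split_stereo(left, right):
    """Return (accompaniment, song) from the two channels of a stereo track.

    Assumption: the vocal is identical in both channels, so adding the
    inverted left channel to the right channel cancels the vocal.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    # Invert the left channel, then add the right channel sample by sample.
    accompaniment = [r + (-l) for l, r in zip(left, right)]
    # The right channel (vocal + right accompaniment) is the song signal.
    song = list(right)
    return accompaniment, song


# Toy data: the vocal is the same in both channels, the accompaniment differs.
vocal = [0.5, -0.25, 0.125]
left_acc = [0.25, 0.5, -0.5]
right_acc = [-0.5, 0.25, 0.75]
left = [v + a for v, a in zip(vocal, left_acc)]
right = [v + a for v, a in zip(vocal, right_acc)]

accompaniment, song = split_stereo(left, right)
```

In this toy case the vocal disappears from `accompaniment`, which holds only the difference between the right and left accompaniment parts, while `song` still contains the vocal.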

In another embodiment, step S1, "obtain an accompaniment audio signal and a song audio signal", obtains both signals from a stereo song audio, specifically:

the right-channel signal of the stereo song audio is inverted to obtain a right inverted signal;

the right inverted signal is added to the left-channel signal to obtain the accompaniment audio signal;

and the left-channel signal of the stereo song audio is used as the song audio signal from which the accompaniment is to be removed.

In yet another embodiment, step S1 may be implemented by other methods. It is only necessary to determine whether a song audio and a corresponding accompaniment audio are available; if they are, the next step can proceed.

In the present embodiment, to simplify processing of the song audio signal, the pre-processing of the song audio signal and the accompaniment audio signal in step S2 is implemented as follows:

Step S20: normalize the song audio signal and the accompaniment audio signal separately. The normalization finds the maximum absolute value of the song audio signal and of the accompaniment audio signal, and divides each signal by its own maximum absolute value;

the normalized song audio signal and accompaniment audio signal are each divided into N frames, where N is a positive integer. Each song frame and accompaniment frame contains 1024 samples, and every two adjacent song frames or accompaniment frames share 512 overlapping samples.

After normalization, the amplitudes of the song audio signal and the accompaniment audio signal are limited to the range -1 to +1, which simplifies subsequent processing. The signals are divided into frames with 512 overlapping samples between adjacent frames so that the transition from frame to frame is smooth.
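The normalization and framing described above can be sketched as follows. The helper names `normalize` and `split_frames`, the synthetic test signal, and the decision to drop a trailing partial frame are assumptions made for illustration; the text does not say how the end of the signal is handled.

```python
FRAME_SIZE = 1024  # samples per frame, as in the embodiment
HOP_SIZE = 512     # adjacent frames share 1024 - 512 = 512 samples


def normalize(signal):
    """Divide every sample by the largest absolute value in the signal."""
    peak = max(abs(x) for x in signal)
    if peak == 0:
        return list(signal)  # all-zero input: nothing to scale (assumption)
    return [x / peak for x in signal]


def split_frames(signal, frame_size=FRAME_SIZE, hop=HOP_SIZE):
    """Cut the signal into overlapping frames.

    Assumption: a trailing partial frame is dropped rather than padded.
    """
    if len(signal) < frame_size:
        return []
    count = (len(signal) - frame_size) // hop + 1
    return [signal[n * hop:n * hop + frame_size] for n in range(count)]


# Synthetic integer signal with values between -20 and 20.
samples = [((7 * t) % 41) - 20 for t in range(2560)]
normalized = normalize(samples)
frames = split_frames(normalized)
```

With 2560 samples this yields four frames of 1024 samples, and the second half of each frame equals the first half of the next one.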

In the present embodiment, to reduce the spectral leakage caused by the subsequent transform to the frequency domain, a Hanning window is applied to each song frame and accompaniment frame before the FFT of step S2 is performed on the song audio signal and the accompaniment audio signal.
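A minimal sketch of windowing one frame and taking its amplitude spectrum and phase is given below, using only the Python standard library. The periodic form of the Hann window, the recursive radix-2 FFT, and the names `hann_window`, `fft`, and `analyze_frame` are assumptions for illustration; the text only states that a Hanning window and a 1024-point FFT are used.

```python
import cmath
import math


def hann_window(size):
    """Periodic Hann window: w[t] = 0.5 - 0.5 * cos(2*pi*t/size)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * t / size) for t in range(size)]


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def analyze_frame(frame, window):
    """Window one frame and return (amplitude, phase) for bins 0..len/2."""
    spectrum = fft([x * w for x, w in zip(frame, window)])
    half = spectrum[:len(frame) // 2 + 1]
    return [abs(c) for c in half], [cmath.phase(c) for c in half]


window = hann_window(1024)
# Test frame: a cosine that completes exactly 8 cycles in 1024 samples.
frame = [math.cos(2 * math.pi * 8 * t / 1024) for t in range(1024)]
amplitude, phase = analyze_frame(frame, window)
```

Only bins 0 to 512 are kept, matching the index range i = 0, 1, 2, …, 512 used in the later steps; the remaining bins follow from conjugate symmetry.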

In the present embodiment, the enhancement of the accompaniment audio signal amplitude spectrum in step S3 is implemented as follows:

Step S30: traverse all frames of the accompaniment audio signal amplitude spectrum M_n(i) (i = 0, 1, 2, …, 512; n = 0, 1, 2, …, N-1). For each point, find the maximum over the corresponding amplitude spectrum points of 2m+1 frames in total, namely the current frame, the m frames before it, and the m frames after it, and use this maximum as the new value of the corresponding point of the current frame, where m is a preset positive integer.

For example, m is 2 in one embodiment. All frames of the accompaniment audio signal amplitude spectrum are traversed, excluding the first 2 frames and the last 2 frames. For each current frame, the current frame, the 2 frames before it, and the 2 frames after it are compared and the result is assigned. If the current frame is frame 2, the 2 preceding frames are frames 0 and 1 and the 2 following frames are frames 3 and 4. These 5 frames are traversed point by point from point 0 to point 512; for each point, the maximum over the 5 frames is found and assigned to the corresponding point of the current frame. If, say, the maximum of point 0 over the 5 frames is the value in frame 3, then the value from frame 3 is assigned to point 0 of the current frame, frame 2. Points 1 to 512 of the 5 frames are then compared in turn and assigned to the corresponding points of the current frame. Next, frame 3 becomes the current frame, with frames 1 and 2 before it and frames 4 and 5 after it, and the same comparison and assignment are carried out. The formula is M_n(i) = max(MM_{n-2}(i), MM_{n-1}(i), MM_n(i), MM_{n+1}(i), MM_{n+2}(i)), i = 0, 1, 2, …, 512, n = 2, 3, 4, …, N-3, where MM_n(i) = M_n(i), i = 0, 1, 2, …, 512, n = 0, 1, 2, …, N-1, and MM_n(i) is a cached copy of the accompaniment audio signal amplitude spectrum. In other embodiments, m can be set to a positive integer other than 2, such as 1, 3, or 4.
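The sliding maximum of step S30 can be sketched as below. The function name `enhance_accompaniment` and the small toy spectra are hypothetical; the copy `cache` plays the role of MM_n(i), so that every maximum is taken over the original values rather than over already enhanced frames.

```python
def enhance_accompaniment(spectra, m=2):
    """Replace each bin of frames m..N-m-1 by its maximum over 2m+1 frames.

    `spectra` is a list of N frames, each a list of amplitude values.
    The first and last m frames are left unchanged, as in the embodiment.
    """
    cache = [list(frame) for frame in spectra]     # MM: copy of the originals
    enhanced = [list(frame) for frame in spectra]  # M: values to be replaced
    for n in range(m, len(spectra) - m):
        neighbours = cache[n - m:n + m + 1]        # frames n-m .. n+m
        enhanced[n] = [max(column) for column in zip(*neighbours)]
    return enhanced


# Six frames with three bins each (real frames would have 513 bins).
spectra = [
    [1, 9, 2],
    [4, 1, 2],
    [2, 3, 8],
    [7, 1, 1],
    [3, 2, 5],
    [6, 4, 0],
]
enhanced = enhance_accompaniment(spectra, m=2)
```

With N = 6 and m = 2, only frames 2 and 3 are replaced, which matches the index range n = 2, 3, 4, …, N-3 in the formula.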

The enhancement step strengthens the accompaniment audio signal amplitude spectrum, so that the spectral subtraction and inverse FFT step can remove more of the accompaniment component from the song audio signal.

In the present embodiment, step S4 specifically includes:

S41: according to the formula [formula not reproduced in this text] (i = 0, 1, 2, …, 512; n = 0, 1, 2, …, N-1),

all song audio frames are traversed, each frame being traversed from point 0 to point 512, and the corresponding amplitude spectrum of the enhanced accompaniment audio frame is subtracted from the amplitude spectrum of the song audio frame, giving the amplitude spectra of all frames of the accompaniment-removed audio. Here S_n(i) is the song audio signal amplitude spectrum, M_n(i) is the enhanced accompaniment audio signal amplitude spectrum, Y_n(i) is the amplitude spectrum of the accompaniment-removed audio signal, a is the signal-to-noise ratio adjustment factor, and b is the accompaniment adjustment factor;

in the present embodiment a is 2 and b is 4; a and b can be set to other values in other embodiments. Increasing a can improve the signal-to-noise ratio of the accompaniment-removed audio signal, and increasing b increases the amount of accompaniment removed.

S42: according to the formula k_n(i) = Y_n(i)/S_n(i) (i = 0, 1, 2, …, 512; n = 0, 1, 2, …, N-1), the amplitude spectrum of each accompaniment-removed audio frame is divided by the corresponding amplitude spectrum of the song audio frame to obtain the proportionality coefficient k_n(i);

the real and imaginary parts of the FFT of every song audio frame are each multiplied by the corresponding proportionality coefficient k_n(i), which gives the FFT real and imaginary parts of points 0 to 512 of the accompaniment-removed audio frame. By the symmetry of the FFT, the sample values in the two symmetric halves of the spectrum are complex conjugates of each other, that is, their real parts are equal and their imaginary parts are opposite, which gives the FFT real and imaginary parts of points 513 to 1023. A 1024-point inverse FFT is then performed;

the frames obtained from the inverse FFT are spliced together, taking the inter-frame overlap into account, to obtain the accompaniment-removed audio signal.
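Step S42 and the conjugate-symmetric reconstruction can be sketched as follows, again with the standard library only. The names `inverse_fft` and `rebuild_frame`, the guard against a zero song amplitude, and the eight-point toy spectrum are assumptions for illustration; the values in `output_amplitude` merely stand in for the result of S41, whose formula is not reproduced here.

```python
import cmath


def inverse_fft(bins):
    """Recursive radix-2 inverse FFT; len(bins) must be a power of two."""
    def transform(values):
        n = len(values)
        if n == 1:
            return list(values)
        even = transform(values[0::2])
        odd = transform(values[1::2])
        out = [0j] * n
        for k in range(n // 2):
            twiddle = cmath.exp(2j * cmath.pi * k / n) * odd[k]
            out[k] = even[k] + twiddle
            out[k + n // 2] = even[k] - twiddle
        return out
    return [value / len(bins) for value in transform(bins)]


def rebuild_frame(song_half, song_amplitude, output_amplitude):
    """Scale the song bins by k = Y/S, mirror them, and return a real frame.

    `song_half` holds the complex FFT bins 0..FN/2 of one song frame.
    """
    scaled = []
    for c, s, y in zip(song_half, song_amplitude, output_amplitude):
        k = y / s if s > 0 else 0.0  # zero-amplitude guard is an assumption
        scaled.append(c * k)         # scales real and imaginary parts alike
    # Bins FN/2+1 .. FN-1 are the complex conjugates of bins FN/2-1 .. 1.
    mirrored = [c.conjugate() for c in scaled[-2:0:-1]]
    return [value.real for value in inverse_fft(scaled + mirrored)]


# Eight-point toy frame: bins 0..4 of a real signal's spectrum.
song_half = [4 + 0j, 0 - 8j, 0j, 0j, 0j]
song_amplitude = [abs(c) for c in song_half]
output_amplitude = [2.0, 4.0, 0.0, 0.0, 0.0]  # stand-in for the result of S41
frame = rebuild_frame(song_half, song_amplitude, output_amplitude)
```

Because k_n(i) is real, multiplying the real and imaginary parts by it changes only the amplitude of each bin and keeps the phase of the song audio signal, which is how the phase enters the inverse transform.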

Referring to Fig. 2, a functional block diagram of a device for removing the accompaniment from a song, also provided by the present invention, is shown. The device comprises an accompaniment audio signal and song audio signal acquisition module 10, a pre-processing and FFT module 20, an accompaniment amplitude spectrum enhancement module 30, and a spectral subtraction and inverse FFT module 40;

the pre-processing and FFT module is used to pre-process the song audio signal and the accompaniment audio signal separately and apply an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;

the accompaniment amplitude spectrum enhancement module is used to enhance the amplitude spectrum of the accompaniment audio signal.

By enhancing the amplitude spectrum of the accompaniment audio signal in the accompaniment amplitude spectrum enhancement module, the present invention allows the spectral subtraction and inverse FFT module to remove more of the accompaniment component from the song audio, thereby improving the purity of the separated singing voice.

In the present embodiment, the accompaniment audio signal and song audio signal acquisition module comprises an accompaniment audio signal acquiring unit and a song audio signal acquiring unit.

The accompaniment audio signal acquiring unit is used to invert the left-channel signal of the stereo song audio to obtain a left inverted signal, and to add the left inverted signal to the right-channel signal to obtain the accompaniment audio signal.

The song audio signal acquiring unit is used to take the right-channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.

In another embodiment, the accompaniment audio signal and song audio signal acquisition module comprises an accompaniment audio signal acquiring unit and a song audio signal acquiring unit.

The accompaniment audio signal acquiring unit is used to invert the right-channel signal of the stereo song audio to obtain a right inverted signal, and to add the right inverted signal to the left-channel signal to obtain the accompaniment audio signal.

The song audio signal acquiring unit is used to take the left-channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.

In yet another embodiment, the accompaniment audio signal and song audio signal acquisition module may be implemented by other methods. It is only necessary to determine whether a song audio and a corresponding accompaniment audio are available; if they are, the next step can proceed.

In the above embodiment, the pre-processing and FFT module further comprises a normalization unit, a framing unit, and a windowing unit;

the normalization unit is used to normalize the song audio signal and the accompaniment audio signal separately, where the normalization finds the maximum absolute value of the song audio signal and of the accompaniment audio signal, and divides each signal by its own maximum absolute value;

the framing unit is used to divide the normalized song audio signal and accompaniment audio signal into N frames each, where N is a positive integer, each song frame and accompaniment frame contains 1024 samples, and every two adjacent song frames or accompaniment frames share 512 overlapping samples.

The windowing unit is used to apply a Hanning window to each song frame and accompaniment frame.

In the above embodiment, the accompaniment amplitude spectrum enhancement module is used to traverse all frames of the accompaniment audio signal amplitude spectrum and, for each point, to find the maximum over the corresponding amplitude spectrum points of 2m+1 frames in total, namely the current frame, the m frames before it, and the m frames after it, and to use this maximum as the new value of the corresponding point of the current frame, where m is a preset positive integer.

The spectral subtraction and inverse FFT module comprises a spectral subtraction unit, an inverse FFT unit, and a concatenation unit;

the spectral subtraction unit is used to subtract the enhanced accompaniment audio signal amplitude spectrum from the amplitude spectrum of the song audio signal to obtain the amplitude spectrum of the accompaniment-removed audio signal. The formula is: [formula not reproduced in this text] (i = 0, 1, 2, …, 512; n = 0, 1, 2, …, N-1), where S_n(i) is the song audio signal amplitude spectrum, M_n(i) is the enhanced accompaniment audio signal amplitude spectrum, Y_n(i) is the amplitude spectrum of the accompaniment-removed audio signal, a is the signal-to-noise ratio adjustment factor, and b is the accompaniment adjustment factor;

the inverse FFT unit is used to apply an inverse FFT to the amplitude spectrum of the accompaniment-removed audio signal. Specifically, according to the formula k_n(i) = Y_n(i)/S_n(i) (i = 0, 1, 2, …, 512; n = 0, 1, 2, …, N-1), the amplitude spectrum of the accompaniment-removed audio signal is divided by the song audio signal amplitude spectrum to obtain the proportionality coefficient k_n(i); the real and imaginary parts of the FFT of the song audio signal are then each multiplied by k_n(i), and a 1024-point inverse FFT is performed;

the concatenation unit is used to splice together the frames obtained from the inverse FFT to obtain the accompaniment-removed audio signal.
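One common way to splice overlapping frames is overlap-add, sketched below. The text only says that the frames are spliced together with attention to the inter-frame overlap, so summing the overlapping samples is an assumption here, as are the function name `overlap_add` and the toy frames.

```python
def overlap_add(frames, hop=512):
    """Splice frames into one signal, summing samples where frames overlap.

    Assumption: overlapping samples are added; the source does not state
    how the overlap is resolved.
    """
    if not frames:
        return []
    frame_size = len(frames[0])
    output = [0.0] * (hop * (len(frames) - 1) + frame_size)
    for n, frame in enumerate(frames):
        start = n * hop  # frame n begins hop samples after frame n-1
        for t, value in enumerate(frame):
            output[start + t] += value
    return output


# Three 4-sample frames with a hop of 2 (the embodiment uses 1024 and 512).
frames = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0], [3.0, 3.0, 3.0, 3.0]]
signal = overlap_add(frames, hop=2)
```

If the frames were windowed with a periodic Hann window at 50% overlap, the shifted windows sum to one, so plain summation restores the original scale away from the first and last half frame.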

In summary, the method and device of the present invention for removing the accompaniment from a song enhance the accompaniment audio signal amplitude spectrum, subtract the enhanced accompaniment audio signal amplitude spectrum from the amplitude spectrum of the song audio signal, and apply an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio. This improves the purity of the separated singing voice, and the algorithm of this embodiment is simple and efficient to execute.

Example

The present invention is illustrated below with a specific example of removing the accompaniment from a song.

The song is 《Meet》 by Sun Yanzi; the audio format is two-channel stereo.

The left channel of the stereo song 《Meet》 is inverted to obtain an inverted signal; the inverted signal is added to the right-channel signal of the stereo song audio to obtain the accompaniment audio signal of 《Meet》; and the right-channel signal of the stereo song audio is used as the song audio signal of 《Meet》.

These steps produce 2 audio files: Meet_original_singer.wav and Meet_accompaniment.wav.

The audio data of Meet_original_singer.wav and Meet_accompaniment.wav are read and pre-processed, and a 1024-point FFT is performed, giving the song audio signal amplitude spectrum of Meet_original_singer and the accompaniment audio signal amplitude spectrum of Meet_accompaniment. The accompaniment audio signal amplitude spectrum of Meet_accompaniment is then enhanced according to the following formula:

M_n(i) = max(MM_{n-2}(i), MM_{n-1}(i), MM_n(i), MM_{n+1}(i), MM_{n+2}(i)), i = 0, 1, 2, …, 512, n = 2, 3, 4, …, N-3, where MM_n(i) = M_n(i), i = 0, 1, 2, …, 512, n = 0, 1, 2, …, N-1, is the cached copy of the accompaniment audio signal amplitude spectrum and N is the number of frames.

According to the formula [formula not reproduced in this text] (i = 0, 1, 2, …, 512; n = 0, 1, 2, …, N-1), the enhanced accompaniment audio signal amplitude spectrum of Meet_accompaniment is subtracted from the song audio signal amplitude spectrum of Meet_original_singer to obtain the amplitude spectrum of the accompaniment-removed audio signal, where a is 2 and b is 4.

According to the formula k_n(i) = Y_n(i)/S_n(i) (i = 0, 1, 2, …, 512; n = 0, 1, 2, …, N-1), the amplitude spectrum of the accompaniment-removed audio signal is divided by the song audio signal amplitude spectrum of Meet_original_singer to obtain the proportionality coefficient k_n(i);

the real and imaginary parts of the FFT of the song audio signal of Meet_original_singer are each multiplied by k_n(i), and a 1024-point inverse FFT is performed;

the frames obtained from the inverse FFT are spliced together to obtain the accompaniment-removed audio Meet_voice.wav.

Referring to Fig. 3 to Fig. 5, these are the time-domain waveforms of the song audio, the accompaniment audio, and the accompaniment-removed audio of the song 《Meet》, respectively. When Meet_voice.wav is played in an audio player, the accompaniment is heard to be almost completely removed; although the amplitude of the vocal is reduced, its sound quality is close to that of the vocal in the original audio.

The foregoing describes only embodiments of the present invention and does not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims

1. A method for removing the accompaniment from a song, characterised in that it comprises the steps of:

obtaining an accompaniment audio signal and a song audio signal;

pre-processing the song audio signal and the accompaniment audio signal separately and applying an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;

enhancing the accompaniment audio signal amplitude spectrum, wherein the enhancing specifically comprises the step of: traversing all frames of the accompaniment audio signal amplitude spectrum and, for each point, finding the maximum over the corresponding amplitude spectrum points of 2m+1 frames in total, namely the current frame, the m frames before it, and the m frames after it, and using this maximum as the new value of the corresponding point of the current frame, where m is a preset positive integer;

subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and applying an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal.

2. The method for removing the accompaniment from a song according to claim 1, characterised in that the step of obtaining an accompaniment audio signal and a song audio signal obtains both signals from a stereo song audio, specifically:

inverting the left-channel or right-channel signal of the stereo song audio to obtain a left or right inverted signal;

adding the left or right inverted signal to the right-channel or left-channel signal, respectively, to obtain the accompaniment audio signal;

and using the right-channel or left-channel signal of the stereo song audio, respectively, as the song audio signal from which the accompaniment is to be removed.

3. The method for removing the accompaniment from a song according to claim 1, characterised in that the pre-processing of the song audio signal and the accompaniment audio signal is implemented by the steps of:

normalizing the song audio signal and the accompaniment audio signal separately, wherein the normalization finds the maximum absolute value of the song audio signal and of the accompaniment audio signal, and divides each signal by its own maximum absolute value;

dividing the normalized song audio signal and accompaniment audio signal into N frames each, where N is a positive integer;

applying a Hanning window to each song frame and accompaniment frame.

4. The method for removing the accompaniment from a song according to claim 1, characterised in that the step of subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum and applying an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal specifically comprises the steps of:

according to the formula [formula not reproduced in this text], subtracting the enhanced accompaniment audio signal amplitude spectrum from the amplitude spectrum of the song audio signal to obtain the amplitude spectrum of the accompaniment-removed audio signal, where S_n(i) is the song audio signal amplitude spectrum, M_n(i) is the enhanced accompaniment audio signal amplitude spectrum, Y_n(i) is the amplitude spectrum of the accompaniment-removed audio signal, FN is the number of FFT points, a is the signal-to-noise ratio adjustment factor, b is the accompaniment adjustment factor, and N is a positive integer;

according to the formula k_n(i) = Y_n(i)/S_n(i) (i = 0, 1, 2, …, FN/2; n = 0, 1, 2, …, N-1), dividing the amplitude spectrum of the accompaniment-removed audio signal by the song audio signal amplitude spectrum to obtain the proportionality coefficient k_n(i);

multiplying the real and imaginary parts of the FFT of the song audio signal each by the proportionality coefficient k_n(i), and performing an FN-point inverse FFT;

splicing together the frames obtained from the inverse FFT to obtain the accompaniment-removed audio signal.

5. A device for removing the accompaniment from a song, characterised in that it comprises:

an accompaniment audio signal and song audio signal acquisition module, for obtaining an accompaniment audio signal and a song audio signal;

a pre-processing and FFT module, for pre-processing the song audio signal and the accompaniment audio signal separately and applying an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;

an accompaniment amplitude spectrum enhancement module, for enhancing the accompaniment audio signal amplitude spectrum, specifically for traversing all frames of the accompaniment audio signal amplitude spectrum and, for each point, finding the maximum over the corresponding amplitude spectrum points of 2m+1 frames in total, namely the current frame, the m frames before it, and the m frames after it, and using this maximum as the new value of the corresponding point of the current frame, where m is a preset positive integer;

a spectral subtraction and inverse FFT module, for subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and applying an inverse FFT together with the phase of the song audio signal to obtain the accompaniment-removed audio signal.

6. The device for removing the accompaniment from a song according to claim 5, characterised in that the accompaniment audio signal and song audio signal acquisition module comprises an accompaniment audio signal acquiring unit and a song audio signal acquiring unit;

the accompaniment audio signal acquiring unit is used to invert the left-channel or right-channel signal of the stereo song audio to obtain a left or right inverted signal, and to add the left or right inverted signal to the right-channel or left-channel signal, respectively, to obtain the accompaniment audio signal;

the song audio signal acquiring unit is used to take the right-channel or left-channel signal of the stereo song audio, respectively, as the song audio signal from which the accompaniment is to be removed.

7. The device for removing the accompaniment from a song according to claim 5, characterised in that the pre-processing and FFT module further comprises a normalization unit, a framing unit, and a windowing unit;

the normalization unit is used to normalize the song audio signal and the accompaniment audio signal separately, wherein the normalization finds the maximum absolute value of the song audio signal and of the accompaniment audio signal, and divides each signal by its own maximum absolute value;

the framing unit is used to divide the normalized song audio signal and accompaniment audio signal into N frames each, where N is a positive integer;

the windowing unit is used to apply a Hanning window to each song frame and accompaniment frame.

8. The device for removing the accompaniment from a song according to claim 5, characterised in that the spectral subtraction and inverse FFT module comprises a spectral subtraction unit, an inverse FFT unit, and a concatenation unit;

the spectral subtraction unit is used to subtract the enhanced accompaniment audio signal amplitude spectrum from the amplitude spectrum of the song audio signal to obtain the amplitude spectrum of the accompaniment-removed audio signal, the formula being: [formula not reproduced in this text], where S_n(i) is the song audio signal amplitude spectrum, M_n(i) is the enhanced accompaniment audio signal amplitude spectrum, Y_n(i) is the amplitude spectrum of the accompaniment-removed audio signal, FN is the number of FFT points, a is the signal-to-noise ratio adjustment factor, b is the accompaniment adjustment factor, and N is a positive integer;

the inverse FFT unit is used to apply an inverse FFT to the amplitude spectrum of the accompaniment-removed audio signal; specifically, according to the formula k_n(i) = Y_n(i)/S_n(i) (i = 0, 1, 2, …, FN/2; n = 0, 1, 2, …, N-1), the amplitude spectrum of the accompaniment-removed audio signal is divided by the song audio signal amplitude spectrum to obtain the proportionality coefficient k_n(i); the real and imaginary parts of the FFT of the song audio signal are then each multiplied by k_n(i), and an FN-point inverse FFT is performed;

the concatenation unit is used to splice together the frames obtained from the inverse FFT to obtain the accompaniment-removed audio signal.