Summary of the invention
In view of this, an object of the embodiments of the present invention is to provide a method, device and system for detecting borborygmus in a noisy environment, which can improve the accuracy of identifying the borborygmus signal from an acquired borborygmus mixed signal in a noisy environment.
In a first aspect, an embodiment of the present invention provides a method for detecting borborygmus in a noisy environment, including:
acquiring a borborygmus mixed signal of a current user through a sensor, wherein the borborygmus mixed signal includes a borborygmus signal and an environmental interference signal;
converting the borborygmus mixed signal into a digital signal;
extracting the time-frequency feature of the digital signal;
inputting the time-frequency feature of the digital signal into a trained convolutional neural network for processing, and detecting the time points at which borborygmus occurs, thereby distinguishing the borborygmus signal from the environmental interference signal; wherein the training process of the convolutional neural network includes:
acquiring a borborygmus sample signal and at least one group of interference sample signals respectively through the sensor;
converting the borborygmus sample signal and the at least one group of interference sample signals into digital sample signals;
extracting the time-frequency feature of the digital sample signals;
labelling the digital sample signals in the time-frequency domain; the signal labels include a borborygmus label marking the time points at which the borborygmus signal occurs in the borborygmus sample signal, and an interference label marking the time points at which the interference signal occurs in the interference sample signal;
extracting, from the digital sample signals and according to the borborygmus label and the interference label, borborygmus label signals and each group of interference label signals as training samples;
using the time-frequency features of the digital sample signals corresponding to the training samples as training data and the borborygmus label and the interference label as supervision information, training a convolutional neural network for distinguishing the borborygmus signal from the various interference signals.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, wherein extracting the time-frequency feature of the digital sample signal includes:
framing and windowing the time-ordered digital sample signal;
performing a fast Fourier transform on the windowed digital sample signal and extracting the power spectrum;
filtering the power spectrum with a Gammatone filter bank; the Gammatone filter bank implements a linear transformation whose impulse response is expressed as:
$$g_i(t) = A\,t^{n-1}\exp(-2\pi b_i t)\cos(2\pi f_i t + \varphi_i),\quad t \ge 0,\ 1 \le i \le N,$$
where $A$ is a constant that adjusts the scale, $n$ is the filter order, $b_i$ is the decay rate, $f_i$ is the center frequency, $\varphi_i$ is the phase, and $N$ is the number of filters; for the i-th filter, $b_i = 1.019\,\mathrm{ERB}(f_i)$, where the equivalent rectangular bandwidth $\mathrm{ERB}(f_i)$ is expressed as:
performing a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain Gammatone cepstral coefficients;
using the Gammatone cepstral coefficients as the time-frequency feature of the digital sample signal.
With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, wherein labelling the digital sample signals in the time-frequency domain includes:
in the time-frequency domain, examining the digital sample signal corresponding to each time point; wherein the digital sample signal corresponding to each time point is a signal frame obtained after the framing and windowing;
when the signal frame at the current time point contains a borborygmus signal, setting a borborygmus label for the signal frame; when the signal frame at the current time point contains an interference signal, setting an interference label for the signal frame; wherein the borborygmus label and the interference label are represented by multi-dimensional vectors;
extracting, from the digital sample signals and according to the borborygmus label and the interference label, borborygmus label signals and each group of interference label signals as training samples includes: according to the order of the signal frames for which the borborygmus label and the interference label have been set, extracting borborygmus label signals and each group of interference label signals from the digital sample signals as training samples.
With reference to the first aspect or the second possible implementation of the first aspect, an embodiment of the present invention provides a third possible implementation of the first aspect, wherein: the convolutional neural network includes an input layer, multiple hidden layers, a fully connected layer and an output layer. The hidden layers and the fully connected layer each have their own parameters, which include weights and biases;
the training process of the convolutional neural network uses gradient descent, and specifically includes:
randomly initializing the convolutional neural network;
starting training: randomly shuffling the order of the training samples and the signal labels, and each time randomly drawing J training samples as input samples to form a sample subset, and drawing the signal labels corresponding to those input samples to form a label subset; training on all input samples of one sample subset constitutes one round of training, and training on all sample subsets constitutes one training pass;
during one round of training, all input samples in the sample subset undergo forward propagation; after passing through the convolutional neural network, the result at the output layer is compared with the corresponding signal label, and the square of the difference between the output and the corresponding signal label is computed as the squared error; the squared errors between the outputs and the signal labels of all input samples are thus obtained;
during one round of training, back-propagation and parameter update are performed using the squared error, including: starting from the output layer and passing backwards through each layer in turn to obtain the equivalent error on each layer; using the equivalent error on each layer to compute the gradient of the parameters of that layer; and updating the parameters of that layer with the gradient;
during one training pass, when the last round of training is completed, computing the mean of all the squared errors and using this mean error to judge whether the convolutional neural network has converged; when the mean error settles at a set stability threshold, determining that the convolutional neural network has converged; if the convolutional neural network has converged, stopping training; otherwise starting a new training pass, and stopping training once the number of passes or the training duration reaches a set threshold;
after training stops, taking the current convolutional neural network as the trained convolutional neural network.
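The stopping rule described above can be sketched as follows. This is only an illustration: `run_pass`, `error_threshold` and `max_passes` are hypothetical names, and "the mean error settles at the stability threshold" is interpreted here, as an assumption, as the mean squared error of a pass falling to or below the threshold.

```python
from typing import Callable, List

def train_until_converged(run_pass: Callable[[], List[float]],
                          error_threshold: float,
                          max_passes: int) -> int:
    """Repeat training passes until the mean squared error is small enough.

    run_pass is a hypothetical callback: it trains once on every sample
    subset and returns the squared error of each input sample in that pass.
    Returns the number of passes that were run.
    """
    for pass_index in range(1, max_passes + 1):
        squared_errors = run_pass()
        mean_error = sum(squared_errors) / len(squared_errors)
        if mean_error <= error_threshold:   # treated here as "converged"
            return pass_index
    return max_passes                        # pass budget exhausted
```

A limit on training duration could be added in the same loop by checking elapsed time next to the pass counter.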
With reference to the third possible implementation of the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, wherein the forward propagation of the input samples specifically includes:
the input layer of the convolutional neural network operates on the input sample, and each subsequent layer of the convolutional neural network operates on the output of the previous layer;
in the convolutional neural network, the output of layer $l$ is
$$x^{l} = f(u^{l}) \tag{1}$$
where $f(\cdot)$ is the activation function, $u^{l} = W^{l}x^{l-1} + b^{l}$, $x^{l-1}$ is the output of layer $l-1$ and hence the input of layer $l$, and $W^{l}$ and $b^{l}$ are the weights and bias of layer $l$, respectively; the activation function is the sigmoid function or the hyperbolic tangent function;
computing the squared error between the output and the corresponding signal label includes: for each input sample, computing the squared error between the output obtained at the output layer of the convolutional neural network and the corresponding signal label; the squared error function of the j-th input sample is:
where $K$ is the dimension of the output and of the signal label, $y_k^{j}$ is the k-th dimension of the output of the j-th sample after passing through the convolutional neural network, and $t_k^{j}$ is the k-th dimension of the signal label corresponding to the j-th sample.
With reference to the fourth possible implementation of the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein the back-propagation and parameter update specifically include:
propagating the squared error between the output and the signal label backwards from the output layer, layer by layer, to each layer of the convolutional neural network to obtain the equivalent error on each layer; the equivalent error is the rate of change of the squared error with respect to the parameters of that layer, computed as:
where $E$ is the squared error of the output and $b$ is a parameter of the convolutional neural network;
the equivalent error on the output layer is:
where $L$ denotes the output layer and the operator $\circ$ denotes element-wise multiplication; $y^{L}$ is the output of the output layer and $t^{L}$ is the signal label of the output layer;
the equivalent error on the other layers is:
using the equivalent error $\delta^{l}$ on each layer, the gradients of the parameters on that layer are computed; the gradients of the weights and of the bias are, respectively:
where $\eta$ is the learning rate, and different learning rates may be set for different parameters;
the parameters of each layer are updated using the gradients of the parameters on that layer: the new parameter is obtained by adding the gradient of the parameter to the original parameter of that layer.
With reference to the fourth possible implementation of the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein, during one training pass, when the last round of training is completed, the mean of all the squared errors is computed, the mean error function being:
where $J$ is the number of samples in one training pass;
when the mean error $E_J$ settles at the set stability threshold, the convolutional neural network is determined to have converged;
if the convolutional neural network has converged, training stops; otherwise a new training pass is started and the parameters of the convolutional neural network are updated so as to gradually minimize $E_J$, bringing the outputs of the convolutional neural network close to the corresponding signal labels.
In a second aspect, an embodiment of the present invention further provides a device for detecting borborygmus in a noisy environment, including:
a convolutional neural network training module, configured to train the convolutional neural network, the training process including: acquiring a borborygmus sample signal and at least one group of interference sample signals respectively through a sensor; converting the borborygmus sample signal and the at least one group of interference sample signals into digital sample signals; extracting the time-frequency feature of the digital sample signals; labelling the digital sample signals in the time-frequency domain, the signal labels including a borborygmus label marking the time points at which the borborygmus signal occurs in the borborygmus sample signal and an interference label marking the time points at which the interference signal occurs in the interference sample signal; extracting, from the digital sample signals and according to the borborygmus label and the interference label, borborygmus label signals and each group of interference label signals as training samples; and, using the time-frequency features of the digital sample signals corresponding to the training samples as training data and the borborygmus label and the interference label as supervision information, training a convolutional neural network for distinguishing the borborygmus signal from the various interference signals;
a signal acquisition module, configured to acquire a borborygmus mixed signal of a current user through the sensor, wherein the borborygmus mixed signal includes a borborygmus signal and an environmental interference signal;
a signal conversion module, configured to convert the borborygmus mixed signal into a digital signal and to extract the time-frequency feature of the digital signal;
a borborygmus detection module, configured to input the time-frequency feature of the digital signal into the convolutional neural network trained by the convolutional neural network training module for processing, and to detect the time points at which borborygmus occurs, thereby distinguishing the borborygmus signal from the environmental interference signal.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation of the second aspect, wherein the convolutional neural network training module includes:
a signal windowing unit, configured to frame and window the time-ordered digital sample signal;
a Fourier transform unit, configured to perform a fast Fourier transform on the windowed digital sample signal and extract the power spectrum;
a Gammatone filter bank, configured to implement a linear transformation that filters the power spectrum; the impulse response of the Gammatone filter bank is expressed as:
$$g_i(t) = A\,t^{n-1}\exp(-2\pi b_i t)\cos(2\pi f_i t + \varphi_i),\quad t \ge 0,\ 1 \le i \le N,$$
where $A$ is a constant that adjusts the scale, $n$ is the filter order, $b_i$ is the decay rate, $f_i$ is the center frequency, $\varphi_i$ is the phase, and $N$ is the number of filters; for the i-th filter, $b_i = 1.019\,\mathrm{ERB}(f_i)$, where the equivalent rectangular bandwidth $\mathrm{ERB}(f_i)$ is expressed as:
a discrete cosine transform unit, configured to perform a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain Gammatone cepstral coefficients, and to use the Gammatone cepstral coefficients as the time-frequency feature of the digital sample signal.
In a third aspect, an embodiment of the present invention further provides a system for detecting borborygmus in a noisy environment, including the borborygmus detection device provided in the second aspect and a sensor;
the sensor is configured to acquire a borborygmus sample signal and at least one group of interference sample signals during neural network training; to acquire the borborygmus mixed signal of the current user during borborygmus detection, wherein the borborygmus mixed signal includes a borborygmus signal and an environmental interference signal; and to send the acquired signals to the borborygmus detection device.
The method, device and system for detecting borborygmus in a noisy environment provided by the embodiments of the present invention exploit the differences between borborygmus signals and environmental interference signals in their time-frequency characteristics to train a convolutional neural network that distinguishes borborygmus from interfering sounds, so that borborygmus can be detected in a noisy environment, which helps improve the accuracy of borborygmus detection.
To make the above objects, features and advantages of the present invention more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Detailed description of the invention
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
At present, the acquisition and identification of borborygmus depend on manual auscultation by a doctor, and auscultation of borborygmus generally requires a relatively long time and a relatively quiet environment. Combining sensor-based signal acquisition with computer-assisted analysis can provide strong support for diagnosis and treatment, but interference signals in real environments, such as noisy speech, have waveform characteristics similar to those of the borborygmus signal and easily cause serious interference with the acquisition and identification of borborygmus. An algorithm for detecting borborygmus in a noisy environment is therefore of great significance.
Convolutional neural networks are suitable for image and speech processing and are commonly used to model complex, highly nonlinear dependencies. Their main characteristic is that supervised training can extract feature combinations with discriminative properties, and these combinations are stored in multi-level convolution kernels. Furthermore, under the joint action of multiple convolutional layers, features can be abstracted and combined further, which makes the representational capacity of the multi-level convolution kernels richer and extends the applicability of what the convolutional neural network learns from a limited set of samples.
Using the differences between the borborygmus signal and the noise signal in their time-frequency characteristics, a convolutional neural network is trained to distinguish borborygmus from noise, so that borborygmus can be detected under noise interference. The convolutional neural network can automatically extract from the features the key information for distinguishing borborygmus from speech; the learned result is expressed in the time-frequency domain in a form close to that of the features, which makes it convenient to analyze and adjust.
Based on this, the present invention provides a method, device and system for detecting borborygmus in a noisy environment, which can filter out noise in the environment, extract and identify borborygmus more clearly, and complete the detection of borborygmus in a noisy environment.
To facilitate understanding of this embodiment, the method for detecting borborygmus in a noisy environment disclosed in the embodiment of the present invention is first described in detail. Fig. 1 shows a flowchart of the method for detecting borborygmus in a noisy environment provided by the embodiment of the present invention. As shown in Fig. 1, the detection method includes:
Step S101: acquire a borborygmus mixed signal of a current user through a sensor, wherein the borborygmus mixed signal includes a borborygmus signal and an environmental interference signal;
Step S102: convert the borborygmus mixed signal into a digital signal;
Step S103: extract the time-frequency feature of the digital signal;
Step S104: input the time-frequency feature of the digital signal into the trained convolutional neural network for processing, and detect the time points at which borborygmus occurs, thereby distinguishing the borborygmus signal from the environmental interference signal.
The specific method of training the convolutional neural network is shown in Fig. 2 and includes the following steps.
Step S201: acquire a borborygmus sample signal and at least one group of interference sample signals respectively through the sensor.
Step S202: convert the borborygmus sample signal and the at least one group of interference sample signals into digital sample signals.
Step S203: extract the time-frequency feature of the digital sample signals. The time-frequency feature used in the embodiment of the present invention is the Gammatone cepstral coefficient. The specific steps for extracting the Gammatone cepstral coefficients of a digital sample signal include: framing and windowing the time-ordered digital sample signal; each frame of the digital sample signal is first zero-padded to $N$ points, where $N = 2^{i}$, $i$ is an integer and $i \ge 8$; each frame is then windowed or pre-emphasized, the window function being a Hamming window or a Hanning window.
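The framing, zero-padding and windowing just described can be sketched as follows. The frame length and hop size are hypothetical parameters (the text does not specify them), and the Hamming window is applied over the full padded length of N points, following the order given in the text.

```python
import math
from typing import List

def padded_length(frame_len: int, min_exp: int = 8) -> int:
    """Smallest N = 2**i with i >= min_exp and N >= frame_len."""
    i = min_exp
    while 2 ** i < frame_len:
        i += 1
    return 2 ** i

def hamming(n_points: int) -> List[float]:
    """Hamming window: 0.54 - 0.46 * cos(2*pi*k / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n_points - 1))
            for k in range(n_points)]

def frame_and_window(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split into frames, zero-pad each frame to N points, then apply the window."""
    n = padded_length(frame_len)
    window = hamming(n)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] + [0.0] * (n - frame_len)
        frames.append([s * w for s, w in zip(frame, window)])
    return frames
```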
A fast Fourier transform is performed on the windowed digital sample signal, and the power spectrum is extracted;
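Because each frame has N = 2^i points, a radix-2 fast Fourier transform applies directly. The sketch below is a plain recursive implementation for illustration; the 1/N scaling of the power spectrum is an assumption, since the text does not state a normalization.

```python
import cmath
from typing import List

def fft(x: List[complex]) -> List[complex]:
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def power_spectrum(frame: List[float]) -> List[float]:
    """|X[k]|^2 / N for the non-negative frequency bins k = 0..N/2."""
    n = len(frame)
    spectrum = fft([complex(s) for s in frame])
    return [abs(spectrum[k]) ** 2 / n for k in range(n // 2 + 1)]
```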
The power spectrum is filtered with a Gammatone filter bank; the Gammatone filter bank implements a linear transformation whose impulse response is expressed as:
$$g_i(t) = A\,t^{n-1}\exp(-2\pi b_i t)\cos(2\pi f_i t + \varphi_i),\quad t \ge 0,\ 1 \le i \le N,$$
where $A$ is a constant that adjusts the scale, $n$ is the filter order, $b_i$ is the decay rate, $f_i$ is the center frequency, $\varphi_i$ is the phase, and $N$ is the number of filters; for the i-th filter, $b_i = 1.019\,\mathrm{ERB}(f_i)$, where the equivalent rectangular bandwidth $\mathrm{ERB}(f_i)$ is expressed as:
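The expression for the equivalent rectangular bandwidth is not reproduced in this text. The sketch below assumes the widely used Glasberg-Moore form, ERB(f) = 24.7 (4.37 f / 1000 + 1) with f in Hz, which is the form usually paired with the factor 1.019; this choice, the default order of 4 and the function names are assumptions and should be checked against the original formula.

```python
import math

def erb(center_hz: float) -> float:
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore form, assumed)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)

def gammatone_impulse(t: float, center_hz: float, order: int = 4,
                      amplitude: float = 1.0, phase: float = 0.0) -> float:
    """g_i(t) = A t^(n-1) exp(-2 pi b_i t) cos(2 pi f_i t + phi_i) for t >= 0."""
    if t < 0:
        return 0.0
    b = 1.019 * erb(center_hz)            # decay rate b_i as given in the text
    return (amplitude * t ** (order - 1)
            * math.exp(-2 * math.pi * b * t)
            * math.cos(2 * math.pi * center_hz * t + phase))
```

Sampling `gammatone_impulse` at the signal's sampling rate for each center frequency yields one filter of the bank.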
A discrete cosine transform is performed on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank to obtain the Gammatone cepstral coefficients. Gammatone cepstral coefficients take the auditory characteristics of the human ear into account and are an auditory filtering feature: resolution is high at low frequencies and suitably compressed at high frequencies.
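The discrete cosine transform step can be sketched per frame as below. The unnormalized DCT-II is used as an assumption (the text does not name a DCT type or scaling), and `n_coeffs`, the number of coefficients kept, is a hypothetical parameter.

```python
import math
from typing import List

def dct_ii(values: List[float]) -> List[float]:
    """Unnormalized DCT-II: X[k] = sum_m x[m] * cos(pi * k * (m + 0.5) / M)."""
    m_total = len(values)
    return [sum(v * math.cos(math.pi * k * (m + 0.5) / m_total)
                for m, v in enumerate(values))
            for k in range(m_total)]

def cepstral_coefficients(band_outputs: List[float], n_coeffs: int) -> List[float]:
    """Keep the first n_coeffs DCT coefficients of one frame's filter-bank outputs."""
    return dct_ii(band_outputs)[:n_coeffs]
```

Applying `cepstral_coefficients` to every frame and stacking the results gives the coefficient matrix used as the time-frequency feature.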
It should be noted that, through the above steps, the Gammatone cepstral coefficients C(j) corresponding to the borborygmus sample signal c(t) and the Gammatone cepstral coefficients S(j) corresponding to the speech sample signal are obtained respectively; both sets of cepstral coefficients are used as training data for the convolutional neural network. Likewise, the Gammatone cepstral coefficients obtained by the same steps from the borborygmus mixed signal acquired in the noisy environment to be detected can be used as the feature for detecting when borborygmus occurs.
Step S204: label the digital sample signals in the time-frequency domain. The signal labels include a borborygmus label marking the time points at which the borborygmus signal occurs in the borborygmus sample signal, and an interference label marking the time points at which the interference signal occurs in the interference sample signal. This step specifically includes:
in the time-frequency domain, examining the digital sample signal corresponding to each time point; wherein the digital sample signal corresponding to each time point is a signal frame obtained after framing and windowing;
when the signal frame at the current time point contains a borborygmus signal, setting a borborygmus label for the signal frame; when the signal frame at the current time point contains an interference signal, setting an interference label for the signal frame.
The borborygmus label and the interference label are represented by multi-dimensional vectors. If there is only one group of interference signals, for example when the interference is a speech signal, the signal label can be a two-dimensional vector: at a given time t, [1,0] indicates that borborygmus occurs at that time and [0,1] indicates that speech occurs at that time. Note that the time index t here is no longer the time index of the sampled signal but the temporal index of the Gammatone cepstral coefficients obtained in step S203, i.e. the position in time order of the t-th frame of coefficients. If there are several groups of interference signals, a multi-class problem has to be solved; the dimension of the label vector can be increased, keeping the correspondence between the element values of the vector and the classification result.
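The label vectors described above are one-hot vectors. A minimal sketch, in which the class names and their order are assumptions chosen to match the [1,0] / [0,1] example:

```python
from typing import List, Sequence

CLASSES = ("borborygmus", "speech")        # assumed class order

def make_label(kind: str, classes: Sequence[str] = CLASSES) -> List[int]:
    """One-hot label vector: 1 at the position of `kind`, 0 elsewhere."""
    return [1 if name == kind else 0 for name in classes]
```

Adding a further interference class only lengthens the tuple of class names, which raises the label dimension while keeping the position-to-class correspondence.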
Step S205: according to the borborygmus label and the interference label, extract borborygmus label signals and each group of interference label signals from the digital sample signals as training samples. Specifically: according to the order of the signal frames for which the borborygmus label and the interference label have been set, borborygmus label signals and each group of interference label signals are extracted from the digital sample signals as training samples. A training sample takes the form of a matrix of d consecutive frames of Gammatone cepstral coefficients, and the frame at the center of this matrix contains the cepstral coefficients of borborygmus or speech. After all training samples have been extracted, the Gammatone cepstral coefficients that were not extracted are considered to contain neither borborygmus nor speech and are not used for training the convolutional neural network. The extracted samples form a training sample set whose internal ordering only reflects the order of extraction and no longer corresponds to specific time points. Correspondingly, from the labels of the consecutive frames, only the labels marking the occurrence of borborygmus or speech are extracted to form a label set, and the remaining labels are not used. Two classes of signal samples and their corresponding labels are thus available for training the convolutional neural network. Likewise, a sample set can be extracted from the borborygmus mixed signal acquired in a noisy environment.
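Extracting d-frame samples centered on labelled frames might look like the sketch below. It assumes d is odd, uses `None` as a hypothetical marker for frames that carry no label, and skips labelled frames too close to the edges to have full context; none of these details are specified in the text.

```python
from typing import List, Optional, Tuple

Frame = List[float]

def extract_samples(coeffs: List[Frame], labels: List[Optional[List[int]]],
                    d: int) -> Tuple[List[List[Frame]], List[List[int]]]:
    """Cut d consecutive frames around every labelled frame (d assumed odd)."""
    half = d // 2
    samples, targets = [], []
    for t, label in enumerate(labels):
        if label is None:                       # frame holds neither sound
            continue
        if t - half < 0 or t + half >= len(coeffs):
            continue                            # not enough context at the edge
        samples.append(coeffs[t - half:t + half + 1])
        targets.append(label)
    return samples, targets
```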
Step S206: using the time-frequency features of the digital sample signals corresponding to the training samples as training data and the borborygmus label and the interference label as supervision information, train a convolutional neural network for distinguishing the borborygmus signal from the various interference signals.
The structure of the convolutional neural network is shown in Fig. 4 and includes an input layer, multiple hidden layers, a fully connected layer and an output layer. The hidden layers and the fully connected layer each have their own parameters, which include weights and biases. The hidden layers of the convolutional neural network comprise two convolutional layers and two down-sampling layers arranged alternately, and each convolutional layer and down-sampling layer has its own weights and bias. A convolutional layer computes the convolution of a convolution kernel with its input: each evaluation produces one output value from one block of the input, and the complete output is obtained by moving the kernel across the whole input. The convolution kernel constitutes the weights of the convolutional layer. A down-sampling layer compresses its input according to a designed scale factor.
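The traversal of a convolution kernel over the input can be sketched as a "valid" two-dimensional convolution, in which the kernel only visits positions where it fits entirely inside the input. The function name and the nested-list representation are illustrative assumptions.

```python
from typing import List

Matrix = List[List[float]]

def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """'Valid' 2-D convolution: one output value per full block of the input."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    flipped = [row[::-1] for row in kernel[::-1]]   # convolution flips the kernel
    return [[sum(image[r + a][c + b] * flipped[a][b]
                 for a in range(kh) for b in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]
```

An input of size H x W and a kernel of size h x w give an output of size (H - h + 1) x (W - w + 1).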
The training process of the convolutional neural network is shown in Fig. 3 and specifically includes:
Step S2061: randomly initialize the convolutional neural network. Besides initializing the weights and biases of the convolutional neural network, it is more important to set the depth of the network and the number of convolution kernels. This embodiment uses a typical configuration; as the complexity of the learning task and the number of training samples increase, the depth of the network and the number of convolution kernels can be increased appropriately. The size of the convolution kernels is also an important factor: it is recommended that the side length of the convolution kernels in the first convolutional layer generally exceed the time span of the sample, which helps the convolutional neural network learn feature representations with a global meaning;
Step S2062: start training. The order of the training samples and the signal labels is randomly shuffled, and each time J training samples are randomly drawn as input samples to form a sample subset, and the signal labels corresponding to those input samples are drawn to form a label subset; training on all input samples of one sample subset constitutes one round of training, and training on all sample subsets constitutes one training pass;
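Shuffling the samples together with their labels and cutting them into subsets of J items can be sketched as follows; `make_subsets` and its parameters are hypothetical names, and the last subset may hold fewer than J items when the sample count is not a multiple of J.

```python
import random
from typing import List, Sequence, Tuple

def make_subsets(samples: Sequence, labels: Sequence, subset_size: int,
                 rng: random.Random) -> List[Tuple[list, list]]:
    """Shuffle samples and labels in step, then cut them into subsets of J items."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    subsets = []
    for start in range(0, len(order), subset_size):
        chosen = order[start:start + subset_size]
        subsets.append(([samples[i] for i in chosen], [labels[i] for i in chosen]))
    return subsets
```

Shuffling one index list and applying it to both sequences keeps every sample paired with its own label.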
Step S2063: during one round of training, all input samples in the sample subset undergo forward propagation; after passing through the convolutional neural network, the result at the output layer is compared with the corresponding signal label, and the square of the difference between the output and the corresponding signal label is computed as the squared error; the squared errors between the outputs and the signal labels of all input samples are thus obtained;
The squared-error cost function is defined as:
where $J$ is the number of samples in one training pass, $K$ is the dimension of the output and of the label, $y_k^{j}$ is the k-th dimension of the output of the j-th sample after passing through the convolutional neural network, and $t_k^{j}$ is the k-th dimension of the label corresponding to the j-th sample. The goal of training is to update the parameters of the network so that the network output comes closer to the labels, that is, to minimize $E_J$. For a single sample, the error function of the j-th sample is:
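The two expressions referred to above are not reproduced in this text. A form consistent with the surrounding definitions of $J$, $K$, $y_k^{j}$ and $t_k^{j}$ is given below; the factor $\tfrac{1}{2}$ is an assumption (it is the usual convention and makes the derivative of the error equal to $y - t$), and the original may additionally divide by $J$ where a mean error is intended.

```latex
E_J = \frac{1}{2}\sum_{j=1}^{J}\sum_{k=1}^{K}\left(t_k^{j} - y_k^{j}\right)^{2},
\qquad
E^{j} = \frac{1}{2}\sum_{k=1}^{K}\left(t_k^{j} - y_k^{j}\right)^{2}.
```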
The output of layer $l$ of the neural network is defined as
$$x^{l} = f(u^{l}), \quad \text{where } u^{l} = W^{l}x^{l-1} + b^{l},$$
Here $f(\cdot)$ is the activation function, $x^{l-1}$ is the output of layer $l-1$, which is also the input of layer $l$, and $W^{l}$ and $b^{l}$ are the weights and bias of layer $l$, respectively. Many activation functions are possible; the sigmoid function or the hyperbolic tangent function is usually used. The sigmoid function compresses the output to [0,1], and the hyperbolic tangent function compresses the output to [-1,1]. Normalizing the training data to a distribution with zero mean and unit variance can improve convergence during stochastic gradient descent. Forward propagation is realized in this way: each layer operates on the output of the previous layer and produces its output through the nonlinear activation function, so that the information of the sample is passed on layer by layer; the final output is the prediction of whether the input sample is borborygmus or speech.
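For a fully connected layer, the layer output defined above and the normalization of the training data can be sketched as follows; the function names and the use of nested lists for the weight matrix are illustrative assumptions.

```python
import math
from typing import Callable, List

def sigmoid(u: float) -> float:
    return 1.0 / (1.0 + math.exp(-u))

def layer_forward(weights: List[List[float]], bias: List[float],
                  x_prev: List[float],
                  activation: Callable[[float], float] = sigmoid) -> List[float]:
    """x^l = f(u^l) with u^l = W^l x^(l-1) + b^l for one fully connected layer."""
    return [activation(sum(w * x for w, x in zip(row, x_prev)) + b)
            for row, b in zip(weights, bias)]

def standardize(values: List[float]) -> List[float]:
    """Shift to zero mean and scale to unit variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]
```

Passing `math.tanh` as `activation` gives the hyperbolic tangent variant.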
Step S2064: during one round of training, perform back-propagation and parameter update using the squared error, including: starting from the output layer and passing backwards through each layer in turn to obtain the equivalent error on each layer; using the equivalent error on each layer to compute the gradient of the parameters of that layer; and updating the parameters of that layer with the gradient;
The back-propagation and parameter update process includes:
The rate of change of the error with respect to the parameters of the neural network is defined as:
The back-propagation on the output layer is then:
where layer $L$ is the output layer and the operator $\circ$ denotes element-wise multiplication; the back-propagation on the other layers is:
From the rate of change of the error $\delta^{l}$ on each layer, the gradient of each weight and bias can be obtained:
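The expressions referred to in this passage are not reproduced in this text. For a fully connected layer with $u^{l} = W^{l}x^{l-1} + b^{l}$ and a squared error carrying the conventional factor $\tfrac{1}{2}$ (an assumption), the standard back-propagation relations consistent with the symbols used here are:

```latex
\begin{aligned}
\delta &\equiv \frac{\partial E}{\partial b}
        = \frac{\partial E}{\partial u}\,\frac{\partial u}{\partial b}
        = \frac{\partial E}{\partial u},
        \qquad\text{since } \frac{\partial u}{\partial b} = 1, \\
\delta^{L} &= f'(u^{L}) \circ \left(y^{L} - t^{L}\right), \\
\delta^{l} &= \left(W^{l+1}\right)^{\mathsf T}\delta^{l+1} \circ f'(u^{l}), \\
\Delta W^{l} &= -\eta\,\delta^{l}\left(x^{l-1}\right)^{\mathsf T},
\qquad
\Delta b^{l} = -\eta\,\delta^{l}.
\end{aligned}
```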
Here $\eta$ is the learning rate, and different learning rates can be set for different parameters. When the parameters are updated by gradient descent, the gradient of a parameter is added to the original parameter to obtain the new parameter.
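One gradient-descent step for a small network with one hidden layer can be sketched as below. It assumes sigmoid activations, a squared error with the conventional factor 1/2 and a single learning rate for all parameters; the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]

def sigmoid(u: float) -> float:
    return 1.0 / (1.0 + math.exp(-u))

def forward(w: Matrix, b: Vector, x: Vector) -> Vector:
    return [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def sgd_step(w1: Matrix, b1: Vector, w2: Matrix, b2: Vector,
             x: Vector, t: Vector, eta: float) -> float:
    """One in-place gradient-descent step; returns the error before the update."""
    h = forward(w1, b1, x)                       # hidden-layer output
    y = forward(w2, b2, h)                       # network output
    error = 0.5 * sum((tk - yk) ** 2 for tk, yk in zip(t, y))
    # Output layer: delta = f'(u) * (y - t); for the sigmoid, f'(u) = y * (1 - y).
    d2 = [yk * (1 - yk) * (yk - tk) for yk, tk in zip(y, t)]
    # Hidden layer: delta = (W2^T d2) * f'(u), computed before W2 is changed.
    d1 = [hj * (1 - hj) * sum(w2[k][j] * d2[k] for k in range(len(d2)))
          for j, hj in enumerate(h)]
    for layer_w, layer_b, delta, inp in ((w2, b2, d2, h), (w1, b1, d1, x)):
        for r, dr in enumerate(delta):
            layer_b[r] += -eta * dr              # add the step -eta * dE/db
            for c, xc in enumerate(inp):
                layer_w[r][c] += -eta * dr * xc  # add the step -eta * dE/dW
    return error
```

Calling `sgd_step` repeatedly on the same sample should make the returned error shrink, which is a quick sanity check on the signs.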
The output of a convolutional layer is the combined result of convolving multiple inputs, and can be expressed as
$$x_j^{l} = f\Big( \sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l} \Big),$$
where $x_j^{l}$ denotes the j-th output on layer l, $M_j$ denotes the input set, $x_i^{l-1}$ denotes one particular input in the input set, $k_{ij}^{l}$ denotes the weights on layer l that connect this input to the j-th output, and $b_j^{l}$ denotes the corresponding bias. A convolutional layer is connected to down-sampling layers before and after it, and back propagation and parameter updating on the convolutional layer are the inverse of the down-sampling process. In the embodiment of the present invention, the weight of a down-sampling layer is denoted by $\beta$ and the down-sampling factor by n; the down-sampling process takes a weighted average of each n × n block. When the error rate of change is propagated back through the down-sampling layer, it only needs to be multiplied once by the weight used in forward propagation to obtain the error rate of change on the preceding convolutional layer. From the back propagation described above, the error rate of change on the convolutional layer is
$$\delta_j^{l} = \beta_j^{l+1} \left( f'(u_j^{l}) \circ \mathrm{up}(\delta_j^{l+1}) \right),$$
where up(·) denotes the up-sampling operation, which copies the value at each point into a block of the same size as the block that was down-sampled. This operation can be expressed as a Kronecker product:
$$\mathrm{up}(x) = x \otimes 1_{n \times n},$$
where n is the factor used in the down-sampling calculation. The rate of change of the error with respect to the bias on this convolutional layer is then
$$\frac{\partial E}{\partial b_j} = \sum_{u,v} (\delta_j^{l})_{uv},$$
where u, v denote the block positions at which down-sampling is carried out during forward propagation. The rate of change of the error with respect to the convolution kernel is
$$\frac{\partial E}{\partial k_{ij}^{l}} = \sum_{u,v} (\delta_j^{l})_{uv}\, (p_i^{l-1})_{uv},$$
where $(p_i^{l-1})_{uv}$ is the block of $x_i^{l-1}$ that is multiplied element by element with the convolution kernel $k_{ij}^{l}$. The rates of change of the error with respect to the parameters obtained in this way are substituted into the formulas of the back propagation process to calculate the gradients of the parameters, and the parameters are then updated.
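The up-sampling step and the gradients on a convolutional layer can be sketched in NumPy as follows. This is an illustration under stated assumptions: a hyperbolic tangent activation, a single input map and a single output map, and a kernel applied without flipping (cross-correlation), so that each output element (u, v) is produced by the input block starting at (u, v). The function names are hypothetical.

```python
import numpy as np

def upsample(delta, n):
    """up(x) as a Kronecker product: copy each entry into an n-by-n block."""
    return np.kron(delta, np.ones((n, n)))

def conv_layer_delta(delta_next, beta_next, conv_output, n):
    """Error rate of change on a tanh convolutional map followed by n-by-n down-sampling."""
    f_prime = 1.0 - conv_output ** 2  # tanh derivative, computed from the layer output
    return beta_next * f_prime * upsample(delta_next, n)

def conv_gradients(delta, layer_input, kernel_shape):
    """Gradients of the error with respect to the bias and to one kernel."""
    grad_b = delta.sum()              # sum of delta over all positions (u, v)
    kh, kw = kernel_shape
    grad_k = np.zeros(kernel_shape)
    for u in range(delta.shape[0]):
        for v in range(delta.shape[1]):
            # Input block that produced output element (u, v), weighted by its delta.
            grad_k += delta[u, v] * layer_input[u:u + kh, v:v + kw]
    return grad_b, grad_k

# Usage with made-up values: a 2x2 error map is expanded to the 4x4 convolutional map.
delta_next = np.array([[1.0, 2.0], [3.0, 4.0]])
delta = conv_layer_delta(delta_next, 0.5, np.zeros((4, 4)), 2)
grad_b, grad_k = conv_gradients(delta, np.arange(25.0).reshape(5, 5), (2, 2))
```

If the layer applies a true convolution with a flipped kernel, the kernel gradient computed here would have to be rotated by 180 degrees.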
The output of a down-sampling layer is
$$x_j^{l} = f\left( \beta_j^{l}\, \mathrm{down}(x_j^{l-1}) + b_j^{l} \right),$$
where down(·) denotes the down-sampling calculation, which, under the control of the down-sampling factor n, compresses the input to 1/n of its original size in both dimensions simultaneously. When layer l+1 is a convolutional layer, the kernel matrix $k_j^{l+1}$ can be reversed in both row and column order and fully convolved with $\delta_j^{l+1}$; multiplying the result of the full convolution element by element with $f'(u_j^{l})$ gives $\delta_j^{l}$. Full convolution means convolving after zero padding at the boundary positions, which yields a $\delta_j^{l}$ of the same size as the output of the down-sampling layer. From $\delta_j^{l}$, the rates of change of the error with respect to the parameters on the down-sampling layer can be obtained:
$$\frac{\partial E}{\partial b_j} = \sum_{u,v} (\delta_j^{l})_{uv}, \qquad \frac{\partial E}{\partial \beta_j} = \sum_{u,v} \left( \delta_j^{l} \circ \mathrm{down}(x_j^{l-1}) \right)_{uv},$$
and the parameters can then be updated.
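The down-sampling calculation and the full convolution used in its back propagation can be sketched as follows. This is an illustration only: `down` sums each block and leaves any averaging to the weight β, the following convolutional layer is assumed to apply a true (flipped-kernel) convolution, and the activation is assumed to be the hyperbolic tangent. All function names are hypothetical.

```python
import numpy as np

def down(x, n):
    """Sum each non-overlapping n-by-n block, shrinking both dimensions by a factor of n."""
    h, w = x.shape[0] // n, x.shape[1] // n
    return x[:h * n, :w * n].reshape(h, n, w, n).sum(axis=(1, 3))

def conv2(a, k, mode="valid"):
    """True 2-D convolution (kernel flipped); 'full' zero-pads the borders first."""
    kh, kw = k.shape
    if mode == "full":
        a = np.pad(a, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    flipped = k[::-1, ::-1]
    out = np.empty((a.shape[0] - kh + 1, a.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(a[i:i + kh, j:j + kw] * flipped)
    return out

def subsample_delta(delta_next, kernel_next, output):
    """Error rate of change on a tanh down-sampling map feeding a convolutional layer."""
    back = conv2(delta_next, np.rot90(kernel_next, 2), mode="full")
    return (1.0 - output ** 2) * back

def subsample_gradients(delta, layer_input, n):
    """Gradients of the error with respect to the bias and the weight beta."""
    return delta.sum(), np.sum(delta * down(layer_input, n))

# Usage with made-up sizes: a 3x3 error map and a 2x2 kernel give a 4x4 error map.
rng = np.random.default_rng(2)
delta = subsample_delta(rng.normal(size=(3, 3)), rng.normal(size=(2, 2)), np.zeros((4, 4)))
grad_b, grad_beta = subsample_gradients(delta, rng.normal(size=(8, 8)), 2)
```

Zero padding by one less than the kernel size on every side is what makes the back-propagated map the same size as the output of the down-sampling layer.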
Step S2065: in one training pass, when the last round of training has been completed, the mean of all the square errors is calculated, and this mean error is used to judge whether the convolutional neural network has converged. When the mean error approaches the set stable threshold, the convolutional neural network is determined to have converged, and training stops. Otherwise, the process returns to step S2062 and a new training pass starts, until the number of training passes or the training duration reaches a set threshold, at which point training stops.
The choice of convergence condition is not unique. The stable threshold of the mean error can be determined according to the needs of the specific application, and the time spent training the neural network can also be controlled by setting the number of training passes.
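The stopping logic can be sketched as a short loop. In this sketch the convergence test is simplified to "the mean square error of a pass is no greater than the threshold", which is only one possible reading of the condition; `run_round` is a hypothetical callback that performs one round of training and returns its square error, and `max_passes` is assumed to be at least 1.

```python
def train_until_converged(run_round, rounds_per_pass, threshold, max_passes):
    """Repeat training passes until the mean error reaches the threshold or the cap."""
    mean_error = float("inf")
    for passes in range(1, max_passes + 1):
        errors = [run_round() for _ in range(rounds_per_pass)]
        mean_error = sum(errors) / len(errors)  # mean of all square errors in this pass
        if mean_error <= threshold:
            return passes, mean_error, True     # converged: stop training
    return max_passes, mean_error, False        # pass limit reached: stop anyway

# Usage with a stand-in for real training: the error halves on every round.
state = {"error": 1.0}

def fake_round():
    state["error"] *= 0.5
    return state["error"]

passes, final_error, converged = train_until_converged(fake_round, 2, 0.05, 10)
```

A limit on training duration could be added in the same way by checking elapsed time at the end of each pass.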
Step S2066: after training stops, the current convolutional neural network is taken as the trained convolutional neural network.
In other embodiments, other time-frequency features, such as the amplitude spectrum or the power spectrum, may also be used. The specific processing methods are common knowledge and are not repeated here.
Corresponding to the borborygmus detection method in a noisy environment described above, an embodiment of the present invention further provides a borborygmus detection device for a noisy environment. As shown in Fig. 5, the borborygmus detection device includes the following modules:
A convolutional neural network training module 501, configured to train the convolutional neural network. The specific training process is the same as the training process of the convolutional neural network in the borborygmus detection method and is not repeated here.
A signal acquisition module 502, configured to acquire the borborygmus mixed signal of the current user through a sensor.
A signal conversion module 503, configured to convert the borborygmus mixed signal into a digital signal and to extract the time-frequency feature of the digital signal.
A borborygmus detection module 504, configured to input the time-frequency feature of the digital signal into the convolutional neural network trained by the convolutional neural network training module for processing, and to detect the time points at which borborygmus occurs, thereby distinguishing the borborygmus signal from the environmental interference signal.
The convolutional neural network training module 501 includes:
A signal windowing unit, configured to frame and window the time-ordered digital sample signal.
A Fourier transform unit, configured to perform a fast Fourier transform on the windowed digital sample signal and extract the power spectrum.
A Gammatone filter bank, configured to implement a linear transformation that filters the power spectrum. The specific implementation has been described in the borborygmus detection method above and is not repeated here.
A discrete cosine transform unit, configured to perform a discrete cosine transform on the coefficient matrix of the power spectrum filtered by the Gammatone filter bank, to obtain the Gammatone cepstral coefficients.
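The chain formed by these four units can be sketched in NumPy as follows. This is not the implementation of the embodiment. The Hamming window, the frame length and hop, the linearly spaced centre frequencies, and the filter bank weights (a common closed-form approximation of the Gammatone magnitude response with bandwidths taken from the equivalent rectangular bandwidth) are all assumptions made for illustration, and all function names are hypothetical. No logarithmic or root compression is applied before the discrete cosine transform because the text does not mention one.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    count = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(count)[:, None]
    return signal[idx] * np.hamming(frame_len)

def power_spectrum(frames, n_fft):
    """Power spectrum of each frame from the fast Fourier transform."""
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft

def gammatone_weights(n_filters, n_fft, sample_rate, order=4):
    """Approximate Gammatone magnitude responses, one row per filter (assumed form)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    centers = np.linspace(50.0, sample_rate / 2.0, n_filters)  # assumed spacing
    bandwidth = 1.019 * 24.7 * (4.37 * centers / 1000.0 + 1.0)  # from the ERB formula
    ratio = (freqs[None, :] - centers[:, None]) / bandwidth[:, None]
    return (1.0 + ratio ** 2) ** (-order / 2.0)

def dct2(x):
    """Unnormalized type-II discrete cosine transform along the last axis."""
    n = x.shape[-1]
    k = np.arange(n)
    basis = np.cos(np.pi * k[:, None] * (2 * k[None, :] + 1) / (2 * n))
    return x @ basis.T

def gfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=16):
    """Framing and windowing, FFT power spectrum, Gammatone filtering, then DCT."""
    frames = frame_signal(signal, frame_len, hop)
    spectrum = power_spectrum(frames, frame_len)
    filtered = spectrum @ gammatone_weights(n_filters, frame_len, sample_rate).T
    return dct2(filtered)

# Usage with a synthetic 300 Hz tone sampled at 4 kHz (made-up values).
sample_rate = 4000
tone = np.sin(2 * np.pi * 300 * np.arange(1024) / sample_rate)
features = gfcc(tone, sample_rate)  # one row of cepstral coefficients per frame
```

Applying the filter bank as a matrix product reflects the description of the filter bank as a linear transformation of the power spectrum.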
A further embodiment of the present invention provides a borborygmus detection system for a noisy environment. As shown in Fig. 6, the system includes the borborygmus detection device 62 of the above embodiment and a sensor 64. The sensor 64 is configured to acquire the borborygmus sample signal and at least one group of interference sample signals during training of the neural network, to acquire the borborygmus mixed signal of the current user during borborygmus detection, where the borborygmus mixed signal includes a borborygmus signal and an environmental interference signal, and to send the acquired signals to the borborygmus detection device. The borborygmus detection device 62 may adopt the structure shown in Fig. 5.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and device described above may be found in the corresponding processes of the foregoing method embodiment and are not repeated here.
The borborygmus detection method, device and system for a noisy environment provided by the embodiments of the present invention are suitable for detecting borborygmus in a noisy environment. By exploiting the difference between the time-frequency characteristics of the borborygmus signal and those of the environmental interference signal, the borborygmus signal can be identified quickly and accurately among multiple interference signals.
The above is only a specific embodiment of the present invention, and the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.