CN102509546B

Info

Publication number: CN102509546B
Application number: CN201110359005XA
Authority: CN
Inventors: 谭政
Original assignee: Beijing Telesound Electronics Co Ltd
Current assignee: Beijing Telesound Electronics Co Ltd
Priority date: 2011-11-11
Filing date: 2011-11-11
Publication date: 2013-07-10
Anticipated expiration: 2031-11-11
Also published as: CN102509546A

Abstract

The invention relates to the technical field of speech recognition and discloses a noise reduction and abnormal sound detection method applied to rail transit. The method comprises the following steps: S1, acquiring sound signals within a rail transit monitoring range; S2, converting the sound signals from analog signals to digital signals; S3, extracting feature values of the digital signals and judging from those values whether the sound signal is abnormal; and S4, constructing a filter, performing noise reduction with that filter on the sound signals output by step S2, and outputting the processed sound signals. Using speech recognition techniques, the invention effectively suppresses the noise component of sound signals within the rail transit monitoring range while conveniently detecting abnormal sounds in that range.

Description

Noise reduction and abnormal sound detection method applied to rail transit

Technical field

The present invention relates to the field of speech recognition technology, and in particular to a noise reduction and abnormal sound detection method applied to rail transit.

Background art

With the rapid development of rail transit, it has become an important means of daily travel for city residents. Boardings and transfers are frequent and passenger flow is large, which has become a major problem for the safe operation of rail transit. The video monitoring systems introduced since then give the operating departments more comprehensive on-site monitoring information to a certain extent, and with this information the operating departments can follow the security situation inside stations in real time. Because these monitoring systems have no audio monitoring function, however, video alone is of little help for certain special events, such as sudden crowd gatherings, explosions, crying and calls for help.

Speech recognition technology has produced many major achievements from its rise to the present, and this provides a theoretical basis for recognizing incidents such as screams, calls for help and explosions through abnormal sound recognition. In recent years the audio pickup function has received growing attention, and users place higher demands on monitoring systems: a system should let them not only "see" but also "hear clearly". Because of environmental noise, audio pickup often fails to deliver good listening quality. To "hear clearly", an effective noise reduction and abnormal sound detection method is urgently needed.

Summary of the invention

(1) Technical problem to be solved

The technical problem to be solved by the present invention is how to provide a method that effectively suppresses the noise component of sound signals picked up within a rail transit monitoring range and, at the same time, conveniently detects abnormal sounds within that range.

(2) Technical solution

To solve the above technical problem, the present invention provides a noise reduction and abnormal sound detection method applied to rail transit, comprising the following steps:

S1, acquiring a sound signal within the rail transit monitoring area;

S2, converting the sound signal from an analog signal to a digital signal;

S3, extracting feature values of the digital signal and judging from the feature values whether the sound signal is an abnormal sound signal;

S4, constructing a filter, using the filter to perform noise reduction on the sound signal output by step S2, and outputting the processed sound signal.

Preferably, the step of constructing the filter in step S4 comprises:

S41, windowing the digital signal output by step S2;

S42, applying a Fourier transform to the windowed signal to obtain its amplitude-frequency representation;

S43, computing the energy information of each windowed signal frame from the Fourier transform coefficients;

S44, calculating the a priori signal-to-noise ratio of the signal frame;

S45, constructing the filter from the energy information and the a priori signal-to-noise ratio.
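Steps S41 to S45 can be sketched for a single frame as follows. The text does not say which a priori SNR estimator or which filter form is used, so this sketch assumes a maximum-likelihood estimate, max(posterior SNR - 1, 0), and a Wiener gain. The names `dft`, `wiener_gains` and `noise_power` are hypothetical, and a rectangular window stands in for the window described in the text to keep the demo short.

```python
import cmath
import math

def dft(frame):
    """Naive O(N^2) discrete Fourier transform; adequate for a short demo frame."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
            for k in range(n)]

def wiener_gains(frame, window, noise_power):
    """Per-bin gains for one frame (assumed Wiener form, not taken from the source)."""
    windowed = [x * w for x, w in zip(frame, window)]      # S41: windowing
    spectrum = dft(windowed)                               # S42: Fourier transform
    power = [abs(c) ** 2 for c in spectrum]                # S43: per-bin energy
    # S44: a priori SNR, assumed maximum-likelihood estimate max(posterior - 1, 0)
    prior_snr = [max(p / n - 1.0, 0.0) for p, n in zip(power, noise_power)]
    # S45: assumed Wiener gain xi / (1 + xi) built from the a priori SNR
    return [xi / (1.0 + xi) for xi in prior_snr]

N = 8
frame = [math.sin(2 * math.pi * t / N) for t in range(N)]  # tone in bin 1
window = [1.0] * N                 # rectangular placeholder window
noise_power = [0.01] * N           # assumed known noise power per bin
gains = wiener_gains(frame, window, noise_power)
```

Bins that carry the tone receive a gain close to 1, while bins holding only noise-level energy are attenuated toward 0. In a real system `noise_power` would have to be estimated, for example from frames judged to contain no speech.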

Preferably, the window function used for windowing in step S41 is a modified Hamming window, constructed as follows: first, compute an (L+1)-point Hamming window and square each point; second, sum the squares of all L+1 points; finally, normalize each squared value by that sum and take the first L normalized values as the modified Hamming window.
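The window construction described above can be sketched in a few lines. The text does not give the Hamming coefficients, so the usual 0.54 and 0.46 values and the symmetric form are assumed here; the function name `modified_hamming` is hypothetical, and L is assumed to be at least 1.

```python
import math

def modified_hamming(L):
    """Return the L-point window built from a squared, normalized (L+1)-point Hamming window."""
    n_points = L + 1
    # Standard symmetric Hamming window over L+1 points (assumed coefficients 0.54 / 0.46).
    hamming = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
               for n in range(n_points)]
    squares = [w * w for w in hamming]           # square each point
    total = sum(squares)                         # sum of squares over all L+1 points
    normalized = [s / total for s in squares]    # normalize each square by the sum
    return normalized[:L]                        # keep the first L normalized values

window = modified_hamming(8)
```

Because the last of the L+1 normalized values is dropped, the returned values sum to slightly less than 1.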

Preferably, before step S1 the method further comprises: building a sample library containing a certain number of sound signals from within the rail transit monitoring area, extracting the feature values of each sound signal in the sample library, forming feature vectors from those feature values, and performing a clustering operation on the feature vectors to generate a sample library feature model.

Preferably, the clustering operation is specifically: modeling with a Gaussian mixture model or a hidden Markov model.
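As a rough illustration of turning a labeled sample library into a feature model, the sketch below reduces each sound class to the mean of its feature vectors. This is a deliberate simplification and not a Gaussian mixture model or hidden Markov model as named in the text; it only shows the data flow from sample vectors to one reference vector per class. The function name `build_feature_model` and the sample values are hypothetical.

```python
def build_feature_model(samples):
    """Map each class label to the mean of its feature vectors (simplified stand-in for GMM/HMM)."""
    model = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        # Component-wise mean over all feature vectors of this class.
        model[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return model

# Hypothetical two-dimensional feature vectors for each class.
samples = {
    "normal": [[0.0, 1.0], [2.0, 3.0]],
    "abnormal": [[8.0, 8.0], [10.0, 12.0]],
}
model = build_feature_model(samples)
```

A mixture model would keep several weighted components with covariances per class instead of a single mean, which matters when one class covers acoustically different sounds such as screams and explosions.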

Preferably, in step S3 the step of judging from the feature values whether the sound signal is an abnormal sound signal is specifically: calculating the Euclidean distance between the feature values of the digital signal and, respectively, the normal sound feature vector and the abnormal sound feature vector in the sample library feature model; the extracted digital signal is deemed to be of the same sound type as the sample library entry with the smallest distance, which determines whether the acquired signal is an abnormal sound signal.
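The minimum-distance decision described above amounts to a nearest-reference-vector lookup. In this sketch the model is assumed to be a dictionary holding one reference vector per sound type; the function name `classify` and the numeric values are hypothetical.

```python
import math

def classify(feature, model):
    """Return the label whose reference vector has the smallest Euclidean distance to `feature`."""
    return min(model, key=lambda label: math.dist(feature, model[label]))

# Hypothetical reference vectors for the two sound types.
model = {"normal": [0.10, 0.20], "abnormal": [0.60, 0.90]}
label = classify([0.55, 0.80], model)
is_abnormal = (label == "abnormal")
```

Features with very different numeric ranges would normally be scaled before the distance is computed, otherwise the largest-valued feature dominates the decision.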

Preferably, the feature value is any one or a combination of: short-time zero-crossing rate, short-time energy, the relationship between signal energy and signal frequency distribution, energy gradient value, and Mel cepstrum coefficients.
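Two of the listed features are simple enough to sketch directly. The text does not define them, so common textbook definitions are assumed: short-time energy as the sum of squared samples in a frame, and short-time zero-crossing rate as the fraction of adjacent sample pairs whose signs differ. The function names are hypothetical, and a frame is assumed to hold at least two samples.

```python
def short_time_energy(frame):
    """Sum of squared samples in the frame (assumed definition)."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs with differing sign (assumed definition)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

frame = [0.5, -0.5, 0.5, -0.5, 0.5]    # toy frame that changes sign at every sample
features = [zero_crossing_rate(frame), short_time_energy(frame)]
```

The resulting list plays the role of the feature vector for one frame; the remaining features in the list above would be appended to it in the same way.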

(3) Beneficial effects

Using speech recognition technology, the present invention provides a method that effectively suppresses the noise component of sound signals picked up within a rail transit monitoring range and, at the same time, conveniently detects abnormal sounds within that range. The invention not only brings speech recognition technology into the security monitoring field, which can improve to a certain extent the response speed of rail operating personnel to emergencies, but also applies speech enhancement technology to that field: noise reduction with the constructed filter noticeably suppresses the noise component of the sound signal and thereby improves the audio pickup quality.

Description of drawings

Fig. 1 is a flowchart of the method of the embodiment of the invention;

Fig. 2 is a flowchart of the implementation of the noise reduction method of the present invention.

Detailed description of the embodiments

The noise reduction and abnormal sound detection method applied to rail transit proposed by the present invention is described in detail below with reference to the drawings and an embodiment.

As shown in Figs. 1 and 2, the method of the embodiment of the invention comprises the following steps:

S0, building a sample library containing a certain number of sound signals from within the rail transit monitoring area, extracting the feature values of each sound signal in the sample library, forming feature vectors from those feature values, and performing a clustering operation on the feature vectors to generate a sample library feature model. The clustering operation is specifically: modeling with a Gaussian mixture model (GMM); other models, such as a hidden Markov model (HMM), may also be used;

S1, acquiring a sound signal within the rail transit monitoring area, where the size of the area is determined by the pickup range of the sound acquisition equipment used;

S3, extracting feature values of the digital signal and calculating the Euclidean distance between those feature values and the sample library feature model. Specifically, the Euclidean distance from the extracted feature values to the normal sound feature vector and to the abnormal sound feature vector in the sample library feature model is calculated; the extracted digital signal is deemed to be of the same type as the sample library sound with the smallest distance, which determines whether the acquired signal is an abnormal sound;

S4, constructing a filter, using the filter to perform noise reduction on the sound signal, and outputting the clean speech signal.

The step of constructing the filter in step S4 comprises:

S41, windowing the digital signal output by step S2. The window function used is a modified Hamming window, constructed as follows: first, compute an (L+1)-point Hamming window and square each point; second, sum the squares of all L+1 points; finally, normalize each squared value by that sum and take the first L normalized values as the modified Hamming window. Experiments show that constructing the window function this way improves the accuracy of noise spectrum estimation to a certain extent and thereby effectively improves the filtering performance of the filter;

S45, constructing the filter from the energy information and the a priori signal-to-noise ratio; the method of constructing a filter from the energy information and the a priori signal-to-noise ratio is prior art. The filter output by step S45 is used to perform noise reduction on the digital audio signal output by S2, and the resulting audio signal is the target signal after noise reduction.

The advantage of the present invention is that it brings speech recognition technology into the security monitoring field and can, to a certain extent, solve the difficulty intelligent video monitoring systems have in monitoring abnormal sound events and reducing audio noise in rail transit.

The above embodiment only illustrates the present invention and does not limit it. A person of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.

Claims

1. A noise reduction and abnormal sound detection method applied to rail transit, characterized by comprising the following steps:

S4, constructing a filter, using the filter to perform noise reduction on the sound signal output by step S2, and outputting the processed sound signal;

The step of constructing the filter in step S4 comprises:

S41, windowing the digital signal output by step S2;

S45, constructing the filter model from the energy information and the a priori signal-to-noise ratio;

The window function used for windowing in step S41 is a modified Hamming window, constructed as follows: first, compute an (L+1)-point Hamming window and square each point; second, sum the squares of all L+1 points; finally, normalize each squared value by that sum and take the first L normalized values as the modified Hamming window.

2. The method of claim 1, characterized in that, before step S1, the method further comprises: building a sample library containing a certain number of sound signals from within the rail transit monitoring area, extracting the feature values of each sound signal in the sample library, forming feature vectors from those feature values, and performing a clustering operation on the feature vectors to generate a sample library feature model.

3. The method of claim 2, characterized in that the clustering operation is specifically: modeling with a Gaussian mixture model or a hidden Markov model.

4. The method of claim 2, characterized in that, in step S3, the step of judging from the feature values whether the sound signal is an abnormal sound signal is specifically: calculating the Euclidean distance between the feature values of the digital signal and, respectively, the normal sound feature vector and the abnormal sound feature vector in the sample library feature model; the extracted digital signal is deemed to be of the same sound type as the sample library entry with the smallest distance, which determines whether the acquired signal is an abnormal sound signal.

5. The method of any one of claims 1 to 4, characterized in that the feature value is any one or a combination of: short-time zero-crossing rate, short-time energy, the relationship between signal energy and signal frequency distribution, energy gradient value, and Mel cepstrum coefficients.