Mobile robot pavement material identification method based on sound characteristics
Technical Field
The invention relates to a mobile robot pavement material identification method based on sound characteristics, and belongs to the technical field of mobile robot road condition identification and autonomous navigation.
Background
Intelligent products are now appearing in every aspect of daily life and can take over a variety of tasks from people; one of the most widespread applications is the intelligent mobile robot. Intelligent mobile robot technology has matured considerably, and its functions and range of applications continue to expand. Mobile robots have been used successfully in many fields, such as education, scientific research, medical care, agriculture, industry and logistics.
Autonomous navigation is one of the basic functions required of an intelligent mobile robot, and accurate knowledge of the current pavement material during outdoor autonomous driving is very important for realizing it. Researchers at home and abroad have carried out a great deal of work on pavement material identification, and real-time, comprehensive, accurate, rapid and reliable identification remains the key difficulty of current research. Existing work on pavement material classification falls mainly into three groups according to the sensor used: laser radar, cameras, and other types of sensors (acceleration sensors, optical sensors and the like). The methods based on laser radar and cameras recognize the road surface environment by simulating human vision. However, when the mobile robot encounters pavements of different materials with similar colors and textures, or special conditions such as low-visibility haze, a correct judgment is difficult to make visually. Most of the methods based on other sensors focus on identifying road parameters, including the adhesion coefficient, road unevenness, and road softness or hardness, and an effective pavement material identification method is still lacking.
In order to identify the pavement material accurately and effectively, the invention explores a novel pavement material identification method for mobile robots based on sound characteristics. Approaching the problem from the perspective of hearing, the method classifies and recognizes the pavement material by analyzing and judging its sound information, and can thereby effectively compensate for the shortcomings of the visual approach.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention aims to provide a mobile robot pavement material identification method based on sound characteristics so as to improve the autonomous navigation capability of the mobile robot. The method classifies common pavements into categories, collects pavement material sound data, establishes an effective pavement material sound data set, and preprocesses the original sound data; the preprocessing comprises filtering and noise reduction, framing and windowing, endpoint detection and similar operations, and yields the effective segments of the pavement material sound. Frequency-domain features are then extracted from the sound data of the different pavement materials, a neural network is built, and the sound features are used as the input signals of the network for feature learning, thereby realizing classification and identification of the different pavement materials. The invention enriches the technology and theory of pavement material identification and has theoretical significance and application value.
In order to achieve the purpose of the invention and solve the problems existing in the prior art, the invention adopts the technical scheme that: a mobile robot pavement material identification method based on sound characteristics comprises the following steps:
Step 1, collecting and establishing a pavement material sound data set: the pavement materials are divided into 10 categories, namely grassland pavement, ceramic tile pavement, asphalt pavement, natural stone pavement, cement pavement, rubber pavement, floor tile pavement, wood board pavement, cobblestone pavement and gravel pavement; pavement material sound data are collected with a sound collector according to the rule that every 50 knocks on each category of pavement form a group, and a pavement material sound data set $S = \{S_i \mid 1 \le i \le 10\}$ is established, where $i$ is the pavement material category and $S_i$ is the sound data of the $i$-th category of pavement material;
Step 2, preprocessing the pavement material sound data: segmentation, filtering and noise reduction, framing and windowing, and endpoint detection are applied to improve the quality of the pavement material sound data and remove the influence of noise, specifically comprising the following substeps:
(a) Segmentation: because the pavement material sound is recorded as 50 consecutive knocks, the original data must be further divided so that the sound of each knock forms a single piece of pavement material sound data; the sound data of the $i$-th category is $S_i = \{s_{ij} \mid 1 \le j \le 50\}$, where $j$ is the knock number corresponding to the pavement material sound data $s_{ij}$;
(b) Filtering and noise reduction: the collected pavement material sound data are filtered through a high-pass filter, described by formula (1),
$$\tilde{s}_{ij}(t) = s_{ij}(t) - \alpha\, s_{ij}(t-1) \qquad (1)$$
where $\alpha$ is the filter coefficient, $s_{ij}(t)$ is the original data at the current time $t$, and $\tilde{s}_{ij}(t)$ is the pavement material sound data at time $t$ after filtering and noise reduction; the filter balances the frequency spectrum, improves the signal-to-noise ratio (SNR), suppresses the proportion of low-frequency information and relatively increases the proportion of useful high-frequency components;
(c) Framing and windowing: taken as a whole, the sound of knocking on the pavement is time-varying and non-stationary, but the recorded sound data can be regarded as stationary over a short time; to satisfy the conditions of the Fourier transform, the data are therefore divided into short-time stationary frames, each 10 ms to 30 ms long; to ensure that the characteristic parameters change continuously and smoothly, adjacent frames partially overlap, and the offset between the starts of two adjacent frames is called the frame shift, which takes a value of 1/4 to 1/2 of the frame length; the framed pavement material sound data $f_{ij}$ are then windowed to reduce the influence of framing, giving the windowed pavement material sound data $c_{ij} = f_{ij} \times w$, where the window function $w$ is a Hamming window;
(d) Endpoint detection: the start and end positions of the effective pavement material sound data are determined by setting different volume thresholds; first, the maximum volume $\mathit{vmax}_{ij}$ and the minimum volume $\mathit{vmin}_{ij}$ of the windowed pavement material sound data $c_{ij}$ are calculated and subtracted to obtain the variation range $\mathit{vdif}_{ij} = \mathit{vmax}_{ij} - \mathit{vmin}_{ij}$; taking the minimum volume as the baseline, three different thresholds $\mathit{vmin}_{ij} + \mathit{vdif}_{ij} \times 0.1$, $\mathit{vmin}_{ij} + \mathit{vdif}_{ij} \times 0.01$ and $\mathit{vmin}_{ij} + \mathit{vdif}_{ij} \times 0.05$ are used to judge the endpoint positions of the pavement material sound data; the start point is recorded when the volume exceeds the threshold, and each subsequent moment is checked against the threshold until the volume falls below it, at which point the end point is recorded; the data between the final start position and the final end position are taken as the effective pavement material sound data $e_{ij}$ of the knock;
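By way of non-limiting illustration, substeps (b) and (c) may be sketched in Python as follows. The sketch assumes that formula (1) takes the common first-order pre-emphasis form, in which each sample has a scaled copy of the previous sample subtracted from it. The function names, the coefficient value 0.97, the 25 ms frame length and the 1/2 frame shift are illustrative assumptions chosen within the stated ranges and are not values given in this description.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[t] = x[t] - alpha * x[t-1].

    alpha = 0.97 is an assumed, commonly used value.
    """
    signal = np.asarray(signal, dtype=float)
    # The first sample has no predecessor, so it is passed through unchanged.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_and_window(signal, sample_rate, frame_ms=25.0, shift_ratio=0.5):
    """Split a signal into overlapping frames and apply a Hamming window.

    frame_ms lies in the 10-30 ms range and shift_ratio in the 1/4-1/2
    range; the specific defaults are assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    frame_shift = max(1, int(round(frame_len * shift_ratio)))
    if len(signal) < frame_len:
        # Zero-pad short signals so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    num_frames = 1 + (len(signal) - frame_len) // frame_shift
    window = np.hamming(frame_len)
    # Each row is one windowed frame c = f * w.
    return np.stack([
        signal[n * frame_shift: n * frame_shift + frame_len] * window
        for n in range(num_frames)
    ])

# Usage: one second of audio at an assumed 16 kHz sampling rate.
emphasized = pre_emphasis(np.ones(16000))
frames = frame_and_window(emphasized, sample_rate=16000)
print(frames.shape)
```

With these assumed settings each frame holds 400 samples and consecutive frames start 200 samples apart, so adjacent frames share half of their samples.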
Step 3, Mel cepstrum feature extraction of the pavement material sound: Mel cepstrum features are among the most commonly used features for sound data; they are robust to interference and reflect the differences between different sound data, so that the sound data of the pavement materials can be effectively distinguished; the step specifically comprises the following substeps:
(a) A one-dimensional Fourier transform is applied to the pavement material sound data $e_{ij}$ obtained by endpoint detection to obtain the corresponding frequency-domain signal
$$p_{ij}(k) = \sum_{n=0}^{N-1} e_{ij}(n)\, e^{-2\pi \sqrt{-1}\, n k / N}, \quad 0 \le k \le N-1$$
where $N$ represents the number of sampling points in each frame of pavement material sound data, $k$ is the index of the sampling point, and the power spectrum is $g_{ij}(k) = |p_{ij}(k)|^2$;
(b) The power spectrum $g_{ij}$ of each frame of pavement material sound data is passed through a bank of Mel triangular filters, and the logarithmic energy $q_{ij}$ is calculated, described by the following formula,
$$q_{ij}(m) = \ln\left( \sum_{k=0}^{N-1} g_{ij}(k)\, h_{ij}(m, k) \right), \quad 1 \le m \le M$$
where $h_{ij}$ represents the frequency response of the Mel filters and $M$ is the number of Mel filters;
(c) A discrete cosine transform is applied to $q_{ij}$ to obtain the final Mel cepstrum characteristic coefficients, described by the following formula,
$$\mathit{mel}_{ij}(v) = \sum_{m=1}^{M} q_{ij}(m) \cos\left( \frac{\pi v\, (m - 0.5)}{M} \right)$$
where $\mathit{mel}_{ij}$ represents the Mel cepstrum characteristic coefficients, $v$ represents the order of the Mel cepstrum coefficient, and $N$, as above, represents the number of sampling points in each frame of pavement material sound data;
(d) The obtained Mel cepstrum characteristic coefficients $\mathit{mel}_{ij}$ are manually labeled and a training set is constructed;
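By way of non-limiting illustration, substeps (a) to (c) of step 3 may be sketched in Python as follows. The sketch follows the textbook Mel cepstrum pipeline (power spectrum, triangular Mel filter bank, logarithm, unnormalized discrete cosine transform) and is an assumption about the form of the formulas rather than a reproduction of them. The function names, the Hz-to-Mel conversion constants, the 26 filters, the 13 coefficients and the small floor applied before the logarithm are all illustrative choices not given in this description.

```python
import numpy as np

def hz_to_mel(hz):
    # Common Hz-to-Mel conversion (assumed; not specified in the text).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular Mel filters with shape (num_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             num_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((num_filters, n_fft // 2 + 1))
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            bank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            bank[m - 1, k] = (right - k) / (right - center)
    return bank

def mfcc(frames, sample_rate, num_filters=26, num_coeffs=13):
    """Mel cepstrum coefficients for windowed frames, shape (frames, num_coeffs)."""
    frames = np.asarray(frames, dtype=float)
    n_fft = frames.shape[1]
    spectrum = np.fft.rfft(frames, axis=1)        # frequency-domain signal p(k)
    power = np.abs(spectrum) ** 2                 # power spectrum g(k) = |p(k)|^2
    bank = mel_filterbank(num_filters, n_fft, sample_rate)
    energies = power @ bank.T                     # filter-bank energies
    # Floor the energies so the logarithm stays finite (assumed safeguard).
    log_energy = np.log(np.maximum(energies, 1e-10))   # log energy q(m)
    # Unnormalized DCT over the filter index m = 1..M for orders v = 1..num_coeffs.
    m = np.arange(1, num_filters + 1)
    v = np.arange(1, num_coeffs + 1)[:, None]
    basis = np.cos(np.pi * v * (m - 0.5) / num_filters)
    return log_energy @ basis.T

# Usage: five synthetic 400-sample frames at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
coeffs = mfcc(rng.standard_normal((5, 400)), sample_rate=16000)
print(coeffs.shape)
```

Only the non-negative half of the spectrum is used here, which is sufficient for real-valued sound data because the other half is its mirror image.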
Step 4, constructing a deep convolutional neural network for training: a deep convolutional neural network is constructed and classifies the pavement materials based on the Mel cepstrum characteristic coefficients $\mathit{mel}_{ij}$ of their sound data, thereby realizing pavement material identification, mainly comprising the following substeps:
(a) Feature enhancement: auxiliary features of the audio are added, such as chroma frequency features, Mel spectrum features, spectral contrast features and tonal centroid features; feature concatenation and fusion yield a 193-dimensional feature vector, which zero padding and alignment extend to a 196-dimensional feature vector; this vector is converted into a two-dimensional matrix (14 × 14), and its dimension is expanded at the input end of the network to obtain a 3D tensor (14 × 14 × 1);
(b) The encoding end comprises one convolutional layer Conv1 with a convolution kernel size of 1 × 1, a stride of 1, same padding and 32 output channels, followed by batch normalization (BN) and Leaky_ReLU activation, and three lightweight depthwise separable convolutional layers DConv2, DConv3 and DConv4 connected in series; DConv2 has a convolution kernel size of 3 × 3, a stride of 1, same padding and 32 output channels, followed by batch normalization and Leaky_ReLU activation; the pooling layer Maxpool1 uses 2 × 2 max pooling with a stride of 2 and no padding; DConv3 has a convolution kernel size of 3 × 3, a stride of 1, same padding and 64 output channels, followed by batch normalization and Leaky_ReLU activation; DConv4 has a convolution kernel size of 3 × 3, a stride of 1, same padding and 64 output channels, followed by batch normalization and Leaky_ReLU activation; the pooling layer Maxpool2 uses 3 × 3 max pooling with a stride of 2 and a padding of 1, and its output is then flattened into a 1024-dimensional feature vector;
(c) The decoding end first performs a Dropout operation with a rate of 0.5 to prevent overfitting, and then comprises three fully connected layers Dense1, Dense2 and Dense3; Dense1 has 512 output neurons and Leaky_ReLU activation; Dense2 has 128 output neurons and Leaky_ReLU activation; and Dense3 has 10 output neurons;
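As a consistency check on the layer parameters of step 4, the following Python sketch traces the spatial size of the feature map through the encoding end and confirms that the flattened vector has 1024 elements. It assumes that the layers are applied in the order in which they are listed (Conv1, DConv2, Maxpool1, DConv3, DConv4, Maxpool2) and that same padding leaves the spatial size unchanged at a stride of 1; the function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def same_padding(kernel):
    """Padding that keeps the size unchanged for stride 1 and an odd kernel."""
    return (kernel - 1) // 2

def trace_encoder(size=14):
    """Return (layer name, height and width, channels) after each layer."""
    # (name, kernel, stride, padding, output channels); order is assumed.
    layers = [
        ("Conv1", 1, 1, same_padding(1), 32),
        ("DConv2", 3, 1, same_padding(3), 32),
        ("Maxpool1", 2, 2, 0, 32),
        ("DConv3", 3, 1, same_padding(3), 64),
        ("DConv4", 3, 1, same_padding(3), 64),
        ("Maxpool2", 3, 2, 1, 64),
    ]
    trace = []
    for name, kernel, stride, padding, channels in layers:
        size = conv_out(size, kernel, stride, padding)
        trace.append((name, size, channels))
    return trace

def dense_params(inputs, outputs):
    """Weights plus biases of one fully connected layer."""
    return inputs * outputs + outputs

trace = trace_encoder()
for name, size, channels in trace:
    print(f"{name}: {size} x {size} x {channels}")

_, final_size, final_channels = trace[-1]
flattened = final_size * final_size * final_channels
print("flattened:", flattened)
print("Dense1, Dense2, Dense3 parameters:",
      dense_params(flattened, 512), dense_params(512, 128), dense_params(128, 10))
```

The trace gives 14 × 14 after Conv1 and DConv2, 7 × 7 after Maxpool1, DConv3 and DConv4, and 4 × 4 × 64 after Maxpool2, which flattens to the 1024 dimensions stated in substep (b).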
Step 5, identifying the pavement material based on the trained neural network model.
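By way of non-limiting illustration, step 5 may be sketched in Python as follows. The sketch assumes that the 10 outputs of Dense3 are converted to probabilities with a softmax function and that the output index follows the category order of step 1; neither assumption is stated in this description, and the function names are hypothetical.

```python
import math

# Category order follows step 1; the mapping from output index to category
# is an assumption.
CLASS_NAMES = [
    "grassland", "ceramic tile", "asphalt", "natural stone", "cement",
    "rubber", "floor tile", "wood board", "cobblestone", "gravel",
]

def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def identify_material(logits, class_names=CLASS_NAMES):
    """Return the most probable pavement category and its probability."""
    if len(logits) != len(class_names):
        raise ValueError("expected one output per pavement category")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Usage with made-up network outputs in which the third output dominates.
example_logits = [0.0] * 10
example_logits[2] = 5.0
print(identify_material(example_logits))
```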
The invention has the following beneficial effects: the mobile robot pavement material identification method based on sound characteristics comprises the following steps: (1) collecting and establishing a pavement material sound data set, (2) preprocessing the pavement material sound data, (3) extracting the Mel cepstrum features of the pavement material sound, (4) constructing a deep convolutional neural network for training, and (5) identifying the pavement material based on the trained neural network model. Compared with the prior art, the invention has the following advantages: first, the method classifies and identifies the pavement material from the perspective of hearing by using sound characteristics, and can effectively compensate for the shortcomings of the visual approach; second, the method effectively extracts the pavement material sound characteristics based on Mel cepstrum features and builds a deep convolutional neural network model, and can thus effectively identify multiple pavement materials based on their sound characteristics.
Drawings
FIG. 1 is a flow chart of the method steps of the present invention.
FIG. 2 shows pavement material images and the corresponding pavement material sound data.
In the figure: (a) is an image of the grassland pavement, (b) is the sound data map of the grassland pavement, (c) is an image of the ceramic tile pavement, (d) is the sound data map of the ceramic tile pavement, (e) is an image of the asphalt pavement, (f) is the sound data map of the asphalt pavement, (g) is an image of the natural stone pavement, (h) is the sound data map of the natural stone pavement, (i) is an image of the cement pavement, (j) is the sound data map of the cement pavement, (k) is an image of the rubber pavement, (l) is the sound data map of the rubber pavement, (m) is an image of the floor tile pavement, (n) is the sound data map of the floor tile pavement, (o) is an image of the wood board pavement, (p) is the sound data map of the wood board pavement, (q) is an image of the cobblestone pavement, (r) is the sound data map of the cobblestone pavement, (s) is an image of the gravel pavement, and (t) is the sound data map of the gravel pavement.
FIG. 3 is a diagram of road surface material classification training accuracy based on a neural network.
Detailed Description
The invention will be further explained with reference to the drawings.
As shown in fig. 1, a method for identifying a road surface material of a mobile robot based on sound characteristics includes the following steps:
Step 1, collecting and establishing a pavement material sound data set: the pavement materials are divided into 10 categories, namely grassland pavement, ceramic tile pavement, asphalt pavement, natural stone pavement, cement pavement, rubber pavement, floor tile pavement, wood board pavement, cobblestone pavement and gravel pavement; pavement material sound data are collected with a sound collector according to the rule that every 50 knocks on each category of pavement form a group, and a pavement material sound data set $S = \{S_i \mid 1 \le i \le 10\}$ is established, where $i$ is the pavement material category and $S_i$ is the sound data of the $i$-th category of pavement material; the pavement material images and the corresponding sound data are shown in FIG. 2.
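By way of non-limiting illustration, the indexing of the data set of step 1 may be sketched in Python as follows. The sketch assumes a single group of 50 knocks per category, since the number of groups is not stated in this description; the names used are hypothetical.

```python
# The ten pavement categories, in the order listed in step 1.
CATEGORIES = [
    "grassland", "ceramic tile", "asphalt", "natural stone", "cement",
    "rubber", "floor tile", "wood board", "cobblestone", "gravel",
]
KNOCKS_PER_GROUP = 50  # every 50 knocks on one category form a group

def build_index(categories=CATEGORIES, knocks=KNOCKS_PER_GROUP):
    """Return one (i, j, category) entry per knock sound s_ij.

    i is the 1-based category index and j the 1-based knock number,
    assuming a single group of knocks per category.
    """
    return [
        (i, j, name)
        for i, name in enumerate(categories, start=1)
        for j in range(1, knocks + 1)
    ]

index = build_index()
print(len(index), index[0], index[-1])
```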
Step 2, preprocessing the pavement material sound data: segmentation, filtering and noise reduction, framing and windowing, and endpoint detection are applied to improve the quality of the pavement material sound data and remove the influence of noise, specifically comprising the following substeps:
(a) Segmentation: because the pavement material sound is recorded as 50 consecutive knocks, the original data must be further divided so that the sound of each knock forms a single piece of pavement material sound data; the sound data of the $i$-th category is $S_i = \{s_{ij} \mid 1 \le j \le 50\}$, where $j$ is the knock number corresponding to the pavement material sound data $s_{ij}$;
(b) Filtering and noise reduction: the collected pavement material sound data are filtered through a high-pass filter, described by formula (1),
$$\tilde{s}_{ij}(t) = s_{ij}(t) - \alpha\, s_{ij}(t-1) \qquad (1)$$
where $\alpha$ is the filter coefficient, $s_{ij}(t)$ is the original data at the current time $t$, and $\tilde{s}_{ij}(t)$ is the pavement material sound data at time $t$ after filtering and noise reduction; the filter balances the frequency spectrum, improves the signal-to-noise ratio (SNR), suppresses the proportion of low-frequency information and relatively increases the proportion of useful high-frequency components;
(c) Framing and windowing: taken as a whole, the sound of knocking on the pavement is time-varying and non-stationary, but the recorded sound data can be regarded as stationary over a short time; to satisfy the conditions of the Fourier transform, the data are therefore divided into short-time stationary frames, each 10 ms to 30 ms long; to ensure that the characteristic parameters change continuously and smoothly, adjacent frames partially overlap, and the offset between the starts of two adjacent frames is called the frame shift, which takes a value of 1/4 to 1/2 of the frame length; the framed pavement material sound data $f_{ij}$ are then windowed to reduce the influence of framing, giving the windowed pavement material sound data $c_{ij} = f_{ij} \times w$, where the window function $w$ is a Hamming window;
(d) Endpoint detection: the start and end positions of the effective pavement material sound data are determined by setting different volume thresholds; first, the maximum volume $\mathit{vmax}_{ij}$ and the minimum volume $\mathit{vmin}_{ij}$ of the windowed pavement material sound data $c_{ij}$ are calculated and subtracted to obtain the variation range $\mathit{vdif}_{ij} = \mathit{vmax}_{ij} - \mathit{vmin}_{ij}$; taking the minimum volume as the baseline, three different thresholds $\mathit{vmin}_{ij} + \mathit{vdif}_{ij} \times 0.1$, $\mathit{vmin}_{ij} + \mathit{vdif}_{ij} \times 0.01$ and $\mathit{vmin}_{ij} + \mathit{vdif}_{ij} \times 0.05$ are used to judge the endpoint positions of the pavement material sound data; the start point is recorded when the volume exceeds the threshold, and each subsequent moment is checked against the threshold until the volume falls below it, at which point the end point is recorded; the data between the final start position and the final end position are taken as the effective pavement material sound data $e_{ij}$ of the knock;
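By way of non-limiting illustration, the threshold test of substep (d) may be sketched in Python as follows. The sketch assumes that the volume of a frame is the sum of the absolute amplitudes of its samples, and it evaluates one threshold ratio at a time, because the way the three thresholds are combined into the final start and end positions is not detailed here. The function names are hypothetical.

```python
import numpy as np

def frame_volume(frames):
    """Volume of each frame, taken here as the sum of absolute amplitudes."""
    return np.abs(np.asarray(frames, dtype=float)).sum(axis=1)

def detect_endpoints(volume, ratio):
    """Return (start, end) frame indices for one threshold ratio, or None.

    The threshold is vmin + vdif * ratio, with vdif = vmax - vmin.
    """
    volume = np.asarray(volume, dtype=float)
    vmin, vmax = volume.min(), volume.max()
    threshold = vmin + (vmax - vmin) * ratio
    above = np.flatnonzero(volume > threshold)
    if above.size == 0:
        return None  # no frame rises above the threshold
    start = int(above[0])
    end = start
    # Keep extending the segment while the next frame stays above the threshold.
    while end + 1 < len(volume) and volume[end + 1] > threshold:
        end += 1
    return start, end

# Usage with a made-up volume curve: silence, one knock, silence.
volume = [0.0, 0.0, 5.0, 10.0, 4.0, 0.0, 0.0]
for ratio in (0.1, 0.01, 0.05):
    print(ratio, detect_endpoints(volume, ratio))
```

A lower ratio places the threshold closer to the minimum volume and therefore tends to keep more of the quiet onset and decay of the knock.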
Step 3, Mel cepstrum feature extraction of the pavement material sound: Mel cepstrum features are among the most commonly used features for sound data; they are robust to interference and reflect the differences between different sound data, so that the sound data of the pavement materials can be effectively distinguished; the step specifically comprises the following substeps:
(a) A one-dimensional Fourier transform is applied to the pavement material sound data $e_{ij}$ obtained by endpoint detection to obtain the corresponding frequency-domain signal
$$p_{ij}(k) = \sum_{n=0}^{N-1} e_{ij}(n)\, e^{-2\pi \sqrt{-1}\, n k / N}, \quad 0 \le k \le N-1$$
where $N$ represents the number of sampling points in each frame of pavement material sound data, $k$ is the index of the sampling point, and the power spectrum is $g_{ij}(k) = |p_{ij}(k)|^2$;
(b) The power spectrum $g_{ij}$ of each frame of pavement material sound data is passed through a bank of Mel triangular filters, and the logarithmic energy $q_{ij}$ is calculated, described by the following formula,
$$q_{ij}(m) = \ln\left( \sum_{k=0}^{N-1} g_{ij}(k)\, h_{ij}(m, k) \right), \quad 1 \le m \le M$$
where $h_{ij}$ represents the frequency response of the Mel filters and $M$ is the number of Mel filters;
(c) A discrete cosine transform is applied to $q_{ij}$ to obtain the final Mel cepstrum characteristic coefficients, described by the following formula,
$$\mathit{mel}_{ij}(v) = \sum_{m=1}^{M} q_{ij}(m) \cos\left( \frac{\pi v\, (m - 0.5)}{M} \right)$$
where $\mathit{mel}_{ij}$ represents the Mel cepstrum characteristic coefficients, $v$ represents the order of the Mel cepstrum coefficient, and $N$, as above, represents the number of sampling points in each frame of pavement material sound data;
(d) The obtained Mel cepstrum characteristic coefficients $\mathit{mel}_{ij}$ are manually labeled and a training set is constructed;
Step 4, constructing a deep convolutional neural network for training: a deep convolutional neural network is constructed and classifies the pavement materials based on the Mel cepstrum characteristic coefficients $\mathit{mel}_{ij}$ of their sound data, thereby realizing pavement material identification, mainly comprising the following substeps:
(a) Feature enhancement: auxiliary features of the audio are added, such as chroma frequency features, Mel spectrum features, spectral contrast features and tonal centroid features; feature concatenation and fusion yield a 193-dimensional feature vector, which zero padding and alignment extend to a 196-dimensional feature vector; this vector is converted into a two-dimensional matrix (14 × 14), and its dimension is expanded at the input end of the network to obtain a 3D tensor (14 × 14 × 1);
(b) The encoding end comprises one convolutional layer Conv1 with a convolution kernel size of 1 × 1, a stride of 1, same padding and 32 output channels, followed by batch normalization (BN) and Leaky_ReLU activation, and three lightweight depthwise separable convolutional layers DConv2, DConv3 and DConv4 connected in series; DConv2 has a convolution kernel size of 3 × 3, a stride of 1, same padding and 32 output channels, followed by batch normalization and Leaky_ReLU activation; the pooling layer Maxpool1 uses 2 × 2 max pooling with a stride of 2 and no padding; DConv3 has a convolution kernel size of 3 × 3, a stride of 1, same padding and 64 output channels, followed by batch normalization and Leaky_ReLU activation; DConv4 has a convolution kernel size of 3 × 3, a stride of 1, same padding and 64 output channels, followed by batch normalization and Leaky_ReLU activation; the pooling layer Maxpool2 uses 3 × 3 max pooling with a stride of 2 and a padding of 1, and its output is then flattened into a 1024-dimensional feature vector;
(c) The decoding end first performs a Dropout operation with a rate of 0.5 to prevent overfitting, and then comprises three fully connected layers Dense1, Dense2 and Dense3; Dense1 has 512 output neurons and Leaky_ReLU activation; Dense2 has 128 output neurons and Leaky_ReLU activation; and Dense3 has 10 output neurons;
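By way of non-limiting illustration, the feature fusion of substep (a) of step 4 may be sketched in Python as follows. The individual feature sizes (40, 12, 128, 7 and 6) are assumptions that merely sum to the stated 193 dimensions and are not given in this description; appending the three zeros at the end of the vector is likewise an assumption, and the names used are hypothetical.

```python
import numpy as np

# Assumed per-feature sizes; they sum to 193 but are not stated in the text.
FEATURE_SIZES = {
    "mel_cepstrum": 40,
    "chroma": 12,
    "mel_spectrum": 128,
    "spectral_contrast": 7,
    "tonal_centroid": 6,
}

def fuse_features(features):
    """Concatenate feature vectors, zero-pad to 196 and reshape to (14, 14, 1)."""
    fused = np.concatenate([
        np.asarray(features[name], dtype=float).ravel()
        for name in FEATURE_SIZES
    ])
    if fused.size != 193:
        raise ValueError(f"expected 193 fused features, got {fused.size}")
    # Append zeros so that the vector fills a 14 x 14 matrix (196 = 14 * 14).
    padded = np.pad(fused, (0, 196 - fused.size))
    # The trailing axis is the single input channel expected by the network.
    return padded.reshape(14, 14, 1)

# Usage with placeholder feature vectors of ones.
example = {name: np.ones(size) for name, size in FEATURE_SIZES.items()}
tensor = fuse_features(example)
print(tensor.shape)
```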
Step 5, identifying the pavement material based on the trained neural network model; the result is shown in FIG. 3.