Disclosure of Invention
In view of the above, it is desirable to provide an obstacle detection method, an obstacle detection apparatus, a computer device, and a storage medium capable of improving obstacle detection accuracy.
An obstacle detection method, the method comprising:
acquiring collected point cloud data;
segmenting the point cloud data with a preset number of segmentation scales of different sizes to obtain voxels at each segmentation scale;
extracting a feature vector from the voxels at each segmentation scale to obtain a shape feature for each scale;
adjusting the size of each shape feature to obtain shape features to be spliced of the same size;
splicing and combining the shape features to be spliced to obtain a point cloud feature with multi-scale information;
projecting the point cloud feature onto a horizontal plane to form a two-dimensional tensor; and
inputting the two-dimensional tensor into a convolutional neural network model to predict obstacles and determine a prediction result.
In one embodiment, the preset number of segmentation scales of different sizes is determined as follows:
determining the preset number of segmentation scales of different sizes according to the types of obstacles and the size ranges corresponding to those types.
In one embodiment, the step of extracting the feature vector of the voxel of each of the segmentation scales to obtain each shape feature includes:
and inputting the voxels of each segmentation scale into a corresponding feature extraction network for feature extraction to obtain each shape feature.
In one embodiment, the feature extraction network is a deep learning network structure.
In one embodiment, the step of adjusting the size of each shape feature to obtain the shape features to be spliced with the same size includes:
and adjusting the size of each shape feature by adopting a trilinear interpolation method to obtain each shape feature to be spliced with the same size.
In one embodiment, the step of projecting the point cloud features to a horizontal plane to form a two-dimensional tensor comprises:
and calling a reshape function to project the point cloud characteristics to a horizontal plane to form a two-dimensional tensor.
In one embodiment, the step of inputting the two-dimensional tensor into the convolutional neural network model to predict the obstacle and determining a prediction result includes:
inputting the two-dimensional tensor into a backbone network of a convolutional neural network model for calculation to obtain a calculation result;
inputting the calculation result into a head network of the convolutional neural network model to predict a bounding box and the obstacle type, and determining the center-point coordinates of the obstacle, its length, width and height, its orientation angle, and its type.
An obstacle detection apparatus, the apparatus comprising:
a data acquisition module for acquiring collected point cloud data;
the segmentation module is used for segmenting the point cloud data by adopting a preset number of segmentation scales with different sizes to obtain voxels of each segmentation scale;
the characteristic extraction module is used for extracting the characteristic vector of the voxel of each segmentation scale to obtain each shape characteristic;
the shape adjusting module is used for adjusting the size of each shape feature to obtain each shape feature to be spliced with the same size;
the splicing module is used for splicing and combining the shape features to be spliced to obtain point cloud features with multi-scale information;
the projection module is used for projecting the point cloud characteristics to a horizontal plane to form a two-dimensional tensor;
and the prediction module is used for inputting the two-dimensional tensor into the convolutional neural network model to predict the obstacle and determining a prediction result.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the method when executing the computer program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method.
According to the obstacle detection method, the obstacle detection apparatus, the computer device and the storage medium described above, collected point cloud data is acquired; the point cloud data is segmented with a preset number of segmentation scales of different sizes to obtain voxels at each segmentation scale, which better preserves point cloud shape information at different scales and yields better point cloud features; a feature vector is extracted from the voxels at each segmentation scale to obtain each shape feature; the size of each shape feature is adjusted to obtain shape features to be spliced of the same size; the shape features to be spliced are spliced and combined to obtain a point cloud feature with multi-scale information; the point cloud feature is projected onto a horizontal plane to form a two-dimensional tensor; and the two-dimensional tensor is input into a convolutional neural network model to predict obstacles and determine a prediction result, thereby improving obstacle detection accuracy.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, there is provided an obstacle detection method including the steps of:
step S220, acquiring the collected point cloud data.
Here, the point cloud data is a three-dimensional point cloud generated by a lidar. For example, a mounted 128-line lidar typically generates about 120,000 points (laser reflection points) within its detection range, distributed within roughly 50 meters in every direction, at heights generally from -5 to 3 meters.
And S240, segmenting the point cloud data by adopting a preset number of segmentation scales with different sizes to obtain voxels of each segmentation scale.
Here, a segmentation scale is a scale for dividing the point cloud data, and its size can be expressed as (x_i, y_i, z_i), where i is the index of the scale, x is the length, y is the width and z is the height. "Voxel" is short for volume pixel; by analogy with pixels, a voxel is a three-dimensional concept while a pixel is a two-dimensional one. The voxels at each segmentation scale are the voxels obtained by dividing the point cloud data at that scale: if the point cloud data has length l, width w and height h, the number of voxels along each axis at scale i is (l/x_i, w/y_i, h/z_i).
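As a minimal numpy sketch of the voxel bookkeeping above (the extents and the scale here are hypothetical, not taken from the embodiment), the grid size is (l/x_i, w/y_i, h/z_i) and each point maps to a voxel index by integer division:

```python
import numpy as np

# Hypothetical point-cloud extent: length l, width w, height h (meters).
l, w, h = 100.0, 100.0, 8.0

# One hypothetical segmentation scale (x_i, y_i, z_i).
x_i, y_i, z_i = 0.5, 0.5, 0.5

# Number of voxels along each axis at this scale: (l/x_i, w/y_i, h/z_i).
grid = (int(np.ceil(l / x_i)), int(np.ceil(w / y_i)), int(np.ceil(h / z_i)))
print(grid)  # (200, 200, 16)

# Assign each point (x, y, z) to its voxel index by flooring the ratio.
points = np.array([[0.3, 1.2, 0.7], [49.9, 0.1, 7.9]])
idx = np.floor(points / np.array([x_i, y_i, z_i])).astype(int)
```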
In one embodiment, the preset number of segmentation scales of different sizes is determined as follows: the scales are determined according to the types of obstacles and the size range corresponding to each type. For example, suppose the obstacles to be predicted include people, cars and buses. The length, width and height of a person are typically (0.73, 0.67, 1.77) meters, with an average of about 1 meter; those of a car are (4.63, 1.97, 1.74) meters, with an average of 2.78 meters; and those of a bus are (10.5, 2.94, 3.47) meters, with an average of 5.63 meters. The designer can decide, from experience, into how many parts each obstacle should be cut along the length, width and height directions. If the designer considers that cutting each of the three directions into 10 parts of the average size yields voxels fine enough to express an obstacle's shape, then segmentation scales of 0.1 meter, 0.278 meter and 0.563 meter can be chosen from the averages of the three obstacle types; these three scales are the selected segmentation scales. In this example the size span of the three obstacle types is relatively large, so 3 scales are used; if the sizes of the obstacles to be predicted do not differ as much, fewer scales can be used.
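The scale selection in this example can be reproduced in a few lines; the per-class sizes and the 10-part choice come from the example above, and the patent's stated values (0.1, 0.278, 0.563) are the rounded forms of these averages:

```python
# Hypothetical per-class (length, width, height) in meters, from the example.
sizes = {"person": (0.73, 0.67, 1.77),
         "car":    (4.63, 1.97, 1.74),
         "bus":    (10.5, 2.94, 3.47)}

parts = 10  # designer's choice: cut each direction into 10 parts
# Scale for each class = average dimension / number of parts.
scales = {k: sum(v) / len(v) / parts for k, v in sizes.items()}
print(scales)  # roughly {'person': 0.106, 'car': 0.278, 'bus': 0.564}
```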
Step S260 is to extract the feature vectors of the voxels of each division scale to obtain each shape feature.
In one embodiment, the step of extracting the feature vector of the voxel at each segmentation scale to obtain each shape feature includes: and inputting the voxels of each segmentation scale into a corresponding feature extraction network for feature extraction to obtain each shape feature.
A separate feature extraction network is trained for each segmentation scale to extract the feature vectors of the voxels at that scale. All points within a voxel at a given segmentation scale are fed into the corresponding feature extraction network, shape features are extracted, and different feature vectors are obtained, expressed as (C_i, l/x_i, w/y_i, h/z_i), where C_i is the dimension of the feature vector. The feature extraction network is a deep learning network structure: each point in a voxel passes through a fully connected network, and max pooling is then applied to obtain the shape feature of the whole voxel.
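A minimal sketch of this per-voxel extraction, assuming a PointNet-style shared fully connected layer with untrained random weights (the real network would be trained; C_i = 64 and the point count are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def voxel_feature(points, W, b):
    # Shared fully connected layer + ReLU applied to every point in the voxel,
    # then max pooling over the points gives one feature for the whole voxel.
    per_point = np.maximum(points @ W + b, 0.0)  # (n_points, C_i)
    return per_point.max(axis=0)                 # (C_i,)

C_i = 64
W = rng.normal(size=(3, C_i))   # untrained weights, shape illustration only
b = np.zeros(C_i)

pts = rng.normal(size=(17, 3))  # 17 points inside one voxel
feat = voxel_feature(pts, W, b)
print(feat.shape)  # (64,)
```

Because of the max pooling, the result does not depend on the order of the points, which is why this style of network suits unordered point clouds.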
And step S280, adjusting the size of each shape feature to obtain each shape feature to be spliced with the same size.
At least one of the length, width, height and channel count differs between the shape features, so they must be adjusted to the same length, width, height and channel count to allow later splicing. As shown in fig. 2, suppose the point cloud data is segmented with 3 segmentation scales of different sizes, where a cube-like object represents the whole point cloud. Segmentation scale 1 is the largest and divides the point cloud data into 2 parts along each of the length, width and height, as shown for segmentation scale 1 in fig. 2; segmentation scale 2 divides the length, width and height into 3 parts each, as shown for segmentation scale 2 in fig. 2; and segmentation scale 3 divides them into 4 parts each, as shown for segmentation scale 3 in fig. 2. If each voxel is extracted into a feature vector of 64 numbers, the feature shapes at the 3 segmentation scales are (64, 2, 2, 2), (64, 3, 3, 3) and (64, 4, 4, 4), respectively. The features at the first two scales are therefore adjusted to the size (64, 4, 4, 4) by trilinear interpolation, so that the features at all 3 scales become shape features to be spliced of the same size.
In one embodiment, the step of adjusting the size of each shape feature to obtain each shape feature to be spliced with the same size includes:
and adjusting the size of each shape feature by adopting a trilinear interpolation method to obtain each shape feature to be spliced with the same size. The shape feature to be spliced with the same size can be represented as (C)i,l0,w0,h0) Wherein l is0Is a length, w, of a shape characteristic of the to-be-spliced0Width of the shape feature to be spliced, h0Is the height of the shape feature to be spliced.
The trilinear interpolation method is a method of performing linear interpolation on a tensor product grid of three-dimensional discrete sampling data.
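One way to realize the trilinear resizing is separable linear interpolation along each spatial axis in turn; the align-corners sampling below is an assumption for illustration, not necessarily the patented implementation:

```python
import numpy as np

def trilinear_resize(vol, out_shape):
    """Resize a (C, L, W, H) feature tensor to (C, *out_shape) by linear
    interpolation along each spatial axis in turn (align-corners style).
    Applying 1-D linear interpolation per axis is equivalent to trilinear
    interpolation on the tensor-product grid."""
    out = vol
    for axis, n_out in enumerate(out_shape, start=1):
        n_in = out.shape[axis]
        coords = (np.linspace(0.0, n_in - 1.0, n_out)
                  if n_in > 1 else np.zeros(n_out))
        lo = np.floor(coords).astype(int)        # lower neighbor index
        hi = np.minimum(lo + 1, n_in - 1)        # upper neighbor index
        t = coords - lo                          # interpolation weight
        a = np.take(out, lo, axis=axis)
        b = np.take(out, hi, axis=axis)
        shape = [1] * out.ndim
        shape[axis] = n_out
        out = a + (b - a) * t.reshape(shape)
    return out

# A (C=2, 2, 2, 2) feature resized to (2, 4, 4, 4), as in the example above.
feat = np.arange(2 * 2 * 2 * 2, dtype=float).reshape(2, 2, 2, 2)
resized = trilinear_resize(feat, (4, 4, 4))
print(resized.shape)  # (2, 4, 4, 4)
```

With align-corners sampling the corner values of the original tensor are preserved exactly in the resized tensor.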
And step S300, splicing and combining the shape characteristics to be spliced to obtain point cloud characteristics with multi-scale information.
The shape features to be spliced are spliced and combined to obtain a point cloud feature f with multi-scale information, whose shape is (C_1 + C_2 + ... + C_n, l_0, w_0, h_0). Writing C = C_1 + C_2 + ... + C_n, the resulting point cloud feature f has shape (C, l_0, w_0, h_0).
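The splicing step amounts to concatenation along the channel axis; a sketch with hypothetical channel counts C_1 = C_2 = C_3 = 64 and (l_0, w_0, h_0) = (4, 4, 4):

```python
import numpy as np

# Three hypothetical resized shape features, distinguished by fill value.
f1 = np.full((64, 4, 4, 4), 1.0)
f2 = np.full((64, 4, 4, 4), 2.0)
f3 = np.full((64, 4, 4, 4), 3.0)

# Splice along the channel axis: C = C1 + C2 + C3 = 192.
f = np.concatenate([f1, f2, f3], axis=0)
print(f.shape)  # (192, 4, 4, 4)
```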
Step S320, projecting the point cloud features to a horizontal plane to form a two-dimensional tensor.
In one embodiment, the step of projecting the point cloud features onto a horizontal plane to form a two-dimensional tensor comprises: and calling a reshape function to project the point cloud characteristics to a horizontal plane to form a two-dimensional tensor.
Here, the point cloud feature f has shape (C, l_0, w_0, h_0) and needs to be projected onto a horizontal plane. The projection can be a reshape operation that folds the height axis into the channel axis, changing the shape to (C × h_0, l_0, w_0), i.e. the two-dimensional tensor. The reshape function transforms a given matrix into a matrix of specified dimensions.
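A sketch of the projection, assuming the height axis is the one folded into the channel axis (a transpose before the reshape keeps the horizontal length-by-width axes spatial; the text only names a reshape, so this ordering is an assumption):

```python
import numpy as np

C, l0, w0, h0 = 192, 4, 4, 4
f = np.arange(C * l0 * w0 * h0, dtype=float).reshape(C, l0, w0, h0)

# Fold the height axis into the channel axis so that only the horizontal
# (length x width) axes remain spatial: (C, l0, w0, h0) -> (C*h0, l0, w0).
bev = np.transpose(f, (0, 3, 1, 2)).reshape(C * h0, l0, w0)
print(bev.shape)  # (768, 4, 4)
```

After this, channel c*h0 + h of the two-dimensional tensor holds the slice of feature channel c at height h.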
And step S340, inputting the two-dimensional tensor into the convolutional neural network model to predict the obstacle, and determining a prediction result.
In one embodiment, the step of inputting the two-dimensional tensor into the convolutional neural network model to predict the obstacle and determining the prediction result comprises: inputting the two-dimensional tensor into a backbone network of a convolutional neural network model for calculation to obtain a calculation result; and inputting the calculation result into a head network of a convolutional neural network model to predict a bounding box and the types of obstacles, and determining the coordinates of the center point of the obstacle, the length, the width and the height of the obstacle, the orientation angle of the obstacle and the types of the obstacles.
The convolutional neural network model is divided into two parts: a backbone network and a head network. The backbone network has 16 convolutional layers in total, divided into 3 stages of 4, 6 and 6 layers respectively; the first convolutional layer of each stage performs downsampling, and the channel counts of the 3 stages are 64, 128 and 256. The output of the backbone network (i.e. the calculation result) is the input of the head network. The head network has two branches: the first predicts the bounding box, specifically the center-point coordinates of the obstacle, its length, width and height, and its orientation angle; the second predicts the obstacle type. The outputs of the two branches form the prediction result, which comprises the center-point coordinates of the obstacle, its length, width and height, its orientation angle and its type, where the obstacle types are, for example, pedestrian, bicycle, car, bus and so on.
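The backbone layout described above can be summarized as a layer specification; kernel sizes and other details are not given in the text, so only the layer counts, channel counts and downsampling strides are sketched:

```python
# Sketch of the backbone described above: 16 conv layers in 3 stages of
# 4, 6 and 6 layers, channel counts 64, 128 and 256, with the first layer
# of each stage downsampling by 2. Kernel sizes are left unspecified.
def backbone_spec(stages=((4, 64), (6, 128), (6, 256))):
    layers = []
    for n_layers, channels in stages:
        for i in range(n_layers):
            layers.append({"channels": channels,
                           "stride": 2 if i == 0 else 1})  # downsample first
    return layers

spec = backbone_spec()
print(len(spec))  # 16
```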
According to the obstacle detection method, collected point cloud data is acquired; the point cloud data is segmented with a preset number of segmentation scales of different sizes to obtain voxels at each segmentation scale, which better preserves point cloud shape information at different scales and yields better point cloud features; a feature vector is extracted from the voxels at each segmentation scale to obtain each shape feature; the size of each shape feature is adjusted to obtain shape features to be spliced of the same size; the shape features to be spliced are spliced and combined to obtain a point cloud feature with multi-scale information; the point cloud feature is projected onto a horizontal plane to form a two-dimensional tensor; and the two-dimensional tensor is input into a convolutional neural network model to predict obstacles and determine a prediction result, thereby improving obstacle detection accuracy.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided an obstacle detection device including: a data acquisition module 310, a segmentation module 320, a feature extraction module 330, a shape adjustment module 340, a splicing module 350, a projection module 360 and a prediction module 370.
The data acquisition module 310 is configured to acquire the collected point cloud data.
The segmentation module 320 is configured to segment the point cloud data with a preset number of segmentation scales of different sizes to obtain voxels at each segmentation scale.
The feature extraction module 330 is configured to extract feature vectors from the voxels at each segmentation scale to obtain each shape feature.
The shape adjustment module 340 is configured to adjust the size of each shape feature to obtain shape features to be spliced of the same size.
The splicing module 350 is configured to splice and combine the shape features to be spliced to obtain a point cloud feature with multi-scale information.
The projection module 360 is configured to project the point cloud feature onto a horizontal plane to form a two-dimensional tensor.
The prediction module 370 is configured to input the two-dimensional tensor into the convolutional neural network model to predict obstacles and determine a prediction result.
In one embodiment, the preset number of segmentation scales of different sizes is determined as follows:
determining the preset number of segmentation scales of different sizes according to the types of obstacles and the size ranges corresponding to those types.
In one embodiment, the feature extraction module 330 is further configured to: input the voxels of each segmentation scale into a corresponding feature extraction network for feature extraction to obtain each shape feature.
In one embodiment, the feature extraction network is a deep learning network structure.
In one embodiment, the shape adjustment module 340 is further configured to: adjust the size of each shape feature by trilinear interpolation to obtain shape features to be spliced of the same size.
In one embodiment, the projection module 360 is further configured to: call a reshape function to project the point cloud features onto a horizontal plane to form a two-dimensional tensor.
In one embodiment, the prediction module 370 is further configured to: input the two-dimensional tensor into the backbone network of the convolutional neural network model for calculation to obtain a calculation result; and input the calculation result into the head network of the convolutional neural network model to predict a bounding box and the obstacle type, determining the center-point coordinates of the obstacle, its length, width and height, its orientation angle and its type.
For specific limitations of the obstacle detection device, reference may be made to the above limitations of the obstacle detection method, which are not described herein again. The respective modules in the above obstacle detection apparatus may be wholly or partially implemented by software, hardware, and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In an embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of the above-mentioned obstacle detection method when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned obstacle detection method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by relevant hardware instructed by a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application; their description is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and improvements can be made without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.