Disclosure of Invention
The invention mainly aims to provide a target detection method, a target detection device and a readable storage medium, so as to solve the technical problems that existing obstacle target detection methods based on the fusion of millimeter-wave radar and vision impose high demands on platform resources and suffer from low detection accuracy, poor robustness and poor real-time performance.
In a first aspect, the present invention provides a target detection method, comprising the steps of:
acquiring a first target sequence of obstacle targets collected by a radar;
acquiring an image in front of a vehicle, and computing a second target sequence of corresponding obstacle targets in the image based on a convolutional neural network model;
after the space-time synchronization of the first target sequence and the second target sequence is completed, performing target matching on the two sequences;
and outputting corresponding obstacle prompt information according to the target matching result.
Optionally, the step of acquiring a first target sequence of obstacle targets collected by the radar includes:
acquiring real-time environment data in front of the vehicle collected by the radar;
calculating a target sequence of obstacle targets from the data;
if the radar cross section of an obstacle target in the target sequence is smaller than a first preset threshold and its signal-to-noise ratio is smaller than a second preset threshold, judging that the target signal is a null signal;
if the lateral distance between an obstacle target in the target sequence and the vehicle is larger than a third preset threshold, judging that the target signal is a non-dangerous signal;
if the accumulated number of detections of an obstacle target in the target sequence is smaller than a fourth preset threshold, judging that the target signal is an interference signal;
and filtering the null signals, the non-dangerous signals and the interference signals out of the target sequence to obtain the first target sequence of obstacle targets collected by the radar.
Optionally, the step of acquiring an image in front of the vehicle and computing a second target sequence of corresponding obstacle targets in the image based on the convolutional neural network model includes:
acquiring an image in front of a vehicle, and inputting the image into a trained convolutional neural network model;
performing feature extraction on the obstacle targets multiple times by means of MobileNet depthwise separable convolutions in a downsampling manner, to generate feature maps of different scales;
fusing, in an upsampling manner, the feature maps of different scales generated by the last three feature extraction stages, to generate three layers of first fused feature maps carrying different semantic and position information;
applying convolutions with different dilation rates to each layer of first fused feature map and performing adaptive feature fusion with the corresponding fusion weights, to generate three second fused feature maps with different receptive field sizes that serve as three target prediction layers of different scales;
and generating anchor boxes on the second fused feature map of each target prediction layer according to preset anchor parameters, and identifying the obstacle targets within the anchor boxes by feature matching and non-maximum suppression to obtain the position, category and confidence information of the obstacle targets, so as to generate the second target sequence of corresponding obstacle targets in the image.
Optionally, the step of outputting corresponding obstacle prompt information according to the result of the target matching includes:
if the first target sequence does not have a first obstacle target, but the second target sequence has a second obstacle target, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than a fifth preset threshold.
Optionally, the step of outputting corresponding obstacle prompt information according to the result of the target matching includes:
if the first target sequence has a first obstacle target, generating a region of interest according to a target point corresponding to the first obstacle target in the first target sequence;
and if the second target sequence does not have a second obstacle target, inputting the image of the region of interest into the convolutional neural network model, computing the category information of the first obstacle target corresponding to the region of interest, and outputting corresponding obstacle prompt information.
Optionally, after the step of generating the region of interest according to the target point corresponding to the first obstacle target in the first target sequence, the method further includes:
if the second target sequence has a second obstacle target, judging whether the second obstacle target coincides with the region of interest;
and if the second obstacle target coincides with the region of interest, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence.
Optionally, after the step of determining whether the second obstacle target coincides with the region of interest if the second obstacle target exists in the second target sequence, the method further includes:
if the first obstacle target and the second obstacle target do not coincide, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than the fifth preset threshold;
and inputting the image of the region of interest into the convolutional neural network model, computing the category information of the first obstacle target corresponding to the region of interest, and outputting corresponding obstacle prompt information.
In a second aspect, the present invention also provides an object detection apparatus, including:
the acquisition module is used for acquiring a first target sequence of obstacle targets collected by the radar;
the calculation module is used for acquiring an image in front of the vehicle, and computing a second target sequence of corresponding obstacle targets in the image based on a convolutional neural network model;
the matching module is used for performing target matching on the first target sequence and the second target sequence after the space-time synchronization of the two sequences is completed;
and the output module is used for outputting corresponding obstacle prompt information according to the target matching result.
Optionally, the acquisition module is configured to:
acquire real-time environment data in front of the vehicle collected by the radar;
calculate a target sequence of obstacle targets from the data;
if the radar cross section of an obstacle target in the target sequence is smaller than a first preset threshold and its signal-to-noise ratio is smaller than a second preset threshold, judge that the target signal is a null signal;
if the lateral distance between an obstacle target in the target sequence and the vehicle is larger than a third preset threshold, judge that the target signal is a non-dangerous signal;
if the accumulated number of detections of an obstacle target in the target sequence is smaller than a fourth preset threshold, judge that the target signal is an interference signal;
and filter the null signals, the non-dangerous signals and the interference signals out of the target sequence to obtain the first target sequence of obstacle targets collected by the radar.
Optionally, the calculation module is configured to:
acquiring an image in front of a vehicle, and inputting the image into a trained convolutional neural network model;
perform feature extraction on the obstacle targets multiple times by means of MobileNet depthwise separable convolutions in a downsampling manner, to generate feature maps of different scales;
fuse, in an upsampling manner, the feature maps of different scales generated by the last three feature extraction stages, to generate three layers of first fused feature maps carrying different semantic and position information;
apply convolutions with different dilation rates to each layer of first fused feature map and perform adaptive feature fusion with the corresponding fusion weights, to generate three second fused feature maps with different receptive field sizes that serve as three target prediction layers of different scales;
and generate anchor boxes on the second fused feature map of each target prediction layer according to preset anchor parameters, and identify the obstacle targets within the anchor boxes by feature matching and non-maximum suppression to obtain the position, category and confidence information of the obstacle targets, so as to generate the second target sequence of corresponding obstacle targets in the image.
Optionally, the output module is configured to:
if the first target sequence does not have a first obstacle target, but the second target sequence has a second obstacle target, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than a fifth preset threshold;
if the first target sequence has a first obstacle target, generating a region of interest according to a target point corresponding to the first obstacle target in the first target sequence;
and if the second target sequence does not have a second obstacle target, inputting the image of the region of interest into the convolutional neural network model, computing the category information of the first obstacle target corresponding to the region of interest, and outputting corresponding obstacle prompt information.
Optionally, the output module is further configured to:
if the second target sequence has a second obstacle target, judging whether the second obstacle target coincides with the region of interest;
if the second obstacle target coincides with the region of interest, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence;
if the first obstacle target and the second obstacle target do not coincide, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than the fifth preset threshold;
and inputting the image of the region of interest into the convolutional neural network model, computing the category information of the first obstacle target corresponding to the region of interest, and outputting corresponding obstacle prompt information.
In a third aspect, the present invention also provides an object detection apparatus, which includes a processor, a memory, and an object detection program stored on the memory and executable by the processor, wherein the object detection program, when executed by the processor, implements the steps of the object detection method as described above.
In a fourth aspect, the present invention further provides a readable storage medium, wherein the readable storage medium stores an object detection program, and when the object detection program is executed by a processor, the object detection program implements the steps of the object detection method as described above.
The method comprises the steps of: acquiring a first target sequence of obstacle targets collected by a radar; acquiring an image in front of a vehicle, and computing a second target sequence of corresponding obstacle targets in the image based on a convolutional neural network model; after the space-time synchronization of the first target sequence and the second target sequence is completed, performing target matching on the two sequences; and outputting corresponding obstacle prompt information according to the target matching result. In this manner, the detection accuracy of the vehicle perception system for targets of different scales, especially small-sized targets, can be improved with lower computing resources, accurate data are provided for subsequent decision-making and planning, collisions caused by missed detection of obstacle targets or by difficulty in making decisions about them are avoided, and the reliability and safety of intelligent driver assistance are improved.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In a first aspect, an embodiment of the present invention provides an object detection apparatus.
Referring to fig. 1, fig. 1 is a schematic diagram of a hardware structure of an object detection device according to an embodiment of the present invention. In this embodiment of the present invention, the target detection device may include a processor 1001 (e.g., a Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used for realizing connection communication among the components; the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface); the memory 1005 may be a random access memory (RAM) or a non-volatile memory such as a magnetic disk memory, and the memory 1005 may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware configuration depicted in fig. 1 is not intended to limit the present invention, and may include more or fewer components than those shown, a combination of some components, or a different arrangement of components.
With continued reference to fig. 1, the memory 1005 of fig. 1, which is one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and an object detection program. The processor 1001 may call the object detection program stored in the memory 1005 and execute the target detection method provided by the embodiment of the present invention.
In a second aspect, an embodiment of the present invention provides a target detection method.
Referring to fig. 2, fig. 2 is a schematic flow chart of an embodiment of the target detection method of the present invention.
In an embodiment of the object detection method of the present invention, the object detection method includes:
step S10, acquiring a first target sequence of obstacle targets collected by the radar;
the step S10 specifically includes:
acquiring real-time environment data in front of the vehicle collected by the radar;
calculating a target sequence of obstacle targets from the data;
if the radar cross section of an obstacle target in the target sequence is smaller than a first preset threshold and its signal-to-noise ratio is smaller than a second preset threshold, judging that the target signal is a null signal;
if the lateral distance between an obstacle target in the target sequence and the vehicle is larger than a third preset threshold, judging that the target signal is a non-dangerous signal;
if the accumulated number of detections of an obstacle target in the target sequence is smaller than a fourth preset threshold, judging that the target signal is an interference signal;
and filtering the null signals, the non-dangerous signals and the interference signals out of the target sequence to obtain the first target sequence of obstacle targets collected by the radar.
In this embodiment, the millimeter-wave radar collects real-time environment data in front of the vehicle, including obstacle targets such as vehicles and pedestrians ahead. The radar signal transmission format is parsed to obtain information such as the relative distance, relative speed, relative angle, radar cross section and signal-to-noise ratio of each obstacle target with respect to the radar position. Interference from false target signals is then eliminated according to these parsed data, yielding a screened sequence of valid obstacle targets detected by the radar.
Static targets in the driving environment are filtered by setting a first preset threshold on the radar cross section of a target signal and a second preset threshold on its signal-to-noise ratio: when the radar cross section is smaller than the first preset threshold and the signal-to-noise ratio is smaller than the second preset threshold, the target signal is a null signal. Non-dangerous targets outside the ego lane and the adjacent lanes are filtered by setting a third preset threshold on the lateral distance between the target signal and the vehicle, where the lateral distance is determined from the parsed relative distance of the target and the relative displacement between the radar's mounting position on the vehicle and the vehicle itself: when the lateral distance is greater than the third preset threshold, the target signal is a non-dangerous signal. Invalid noise interference is suppressed by setting a fourth preset threshold on the accumulated number of detections of a target signal: when the accumulated number of detections is smaller than the fourth preset threshold, the target is an invalid interference signal that appears only a few times within a short period.
When null signals, non-dangerous signals and interference signals are identified according to these thresholds, the interference of false target signals is screened out, and the first target sequence of obstacle targets collected by the radar is obtained. The first target sequence includes, for each valid obstacle target, information such as the relative distance, relative speed, relative angle, radar cross section and signal-to-noise ratio with respect to the radar position.
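For illustration only, the following Python sketch shows one possible realisation of the screening logic above; the field names and threshold values are assumptions for the example, not values fixed by this disclosure.

```python
from dataclasses import dataclass

@dataclass
class RadarTarget:
    rcs: float           # radar cross section
    snr: float           # signal-to-noise ratio
    lateral_dist: float  # lateral distance to the ego vehicle (m)
    hit_count: int       # accumulated number of detections

def filter_radar_targets(targets, rcs_th=0.5, snr_th=10.0,
                         lateral_th=5.0, hits_th=3):
    """Drop null, non-dangerous and interference signals; what remains
    is the first target sequence."""
    kept = []
    for t in targets:
        if t.rcs < rcs_th and t.snr < snr_th:
            continue  # null signal: weak echo with low SNR
        if abs(t.lateral_dist) > lateral_th:
            continue  # non-dangerous: outside ego and adjacent lanes
        if t.hit_count < hits_th:
            continue  # interference: not detected persistently enough
        kept.append(t)
    return kept
```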
Step S20, acquiring an image in front of the vehicle, and computing a second target sequence of corresponding obstacle targets in the image based on a convolutional neural network model;
the step S20 specifically includes:
acquiring an image in front of a vehicle, and inputting the image into a trained convolutional neural network model;
performing feature extraction on the obstacle targets multiple times by means of MobileNet depthwise separable convolutions in a downsampling manner, to generate feature maps of different scales;
fusing, in an upsampling manner, the feature maps of different scales generated by the last three feature extraction stages, to generate three layers of first fused feature maps carrying different semantic and position information;
applying convolutions with different dilation rates to each layer of first fused feature map and performing adaptive feature fusion with the corresponding fusion weights, to generate three second fused feature maps with different receptive field sizes that serve as three target prediction layers of different scales;
and generating anchor boxes on the second fused feature map of each target prediction layer according to preset anchor parameters, and identifying the obstacle targets within the anchor boxes by feature matching and non-maximum suppression to obtain the position, category and confidence information of the obstacle targets, so as to generate the second target sequence of corresponding obstacle targets in the image.
In this embodiment, the camera's intrinsic and extrinsic parameters are calibrated by Zhang Zhengyou's calibration method to generate the camera's intrinsic matrix, extrinsic matrix and distortion matrix. The intrinsic matrix includes parameters such as fx, fy, u0 and v0; the extrinsic matrix comprises a rotation matrix R and a translation matrix T; the distortion matrix describes the lens distortion with five distortion parameters, Q = (k1, k2, k3, p1, p2). Through the intrinsic and extrinsic matrices, targets at corresponding positions in the image acquired by the camera can be put into one-to-one correspondence with targets in the three-dimensional real scene, giving the relative distance and relative speed of each target with respect to the camera position; through the distortion matrix, a position-corrected image is obtained from the raw camera image.
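As a hedged illustration, the sketch below uses OpenCV's standard implementation of Zhang's method. Note that OpenCV returns the distortion coefficients in the order (k1, k2, p1, p2, k3), whereas the text above lists Q = (k1, k2, k3, p1, p2); the corner data passed in are assumed to have been collected from several checkerboard views beforehand.

```python
import cv2

def calibrate(object_points, image_points, image_size):
    """object_points/image_points: lists of checkerboard corner arrays
    gathered from several views; image_size: (width, height)."""
    ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        object_points, image_points, image_size, None, None)
    return K, dist  # K holds fx, fy, u0, v0; dist holds the distortion terms

def undistort(image, K, dist):
    # position-corrected image from the raw camera image
    return cv2.undistort(image, K, dist)
```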
Because the computing resources of a vehicle-mounted computing platform are limited, a lightweight deep-learning convolutional neural network model is designed to achieve high-precision real-time detection of the obstacle targets contained in the images acquired by the camera and to generate the visually detected obstacle target sequence. By optimizing the current YOLOv4 target detection algorithm, a convolutional neural network model is constructed that performs feature extraction with MobileNet depthwise separable convolutions and then adaptively fuses multi-scale feature maps with different receptive field sizes, achieving high-precision target detection results with lower computing resources.
Based on the obstacle detection requirements of the operating scenarios of a vehicle ADAS (advanced driver assistance system), data on obstacle targets in actual road environments are collected, the obstacle targets are classified and labeled, and a training database for the autonomous-driving perception model is constructed. The convolutional neural network model is trained on this database; during training the model is optimized with stochastic gradient descent and a stepwise-decaying learning rate, and the images in the dataset are augmented and expanded with various online data-augmentation methods. Finally, the trained convolutional neural network model is optimized with TensorRT and deployed on a vehicle-mounted computing platform with limited computing resources to detect obstacle targets in front of the vehicle in real time.
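A minimal PyTorch sketch of this training recipe (SGD plus a stepwise-decaying learning rate) follows; the model, dataset and hyperparameter values are stand-ins, not the disclosure's actual configuration.

```python
import torch
import torch.nn as nn

# stand-ins for the real detection network and labelled road-scene dataset
model = nn.Conv2d(3, 8, 3)
data = [(torch.randn(1, 3, 64, 64), torch.randn(1, 8, 62, 62))]
criterion = nn.MSELoss()

# stochastic gradient descent with a stepwise-decaying learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(3):
    for images, targets in data:  # a real loader would apply online augmentation
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```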
After the camera acquires an image in front of the vehicle, the image is input into the trained convolutional neural network model, and feature extraction is performed on the obstacle targets multiple times with MobileNet depthwise separable convolutions in a downsampling manner, generating feature maps of different scales. The feature maps of different scales produced by the last three feature extraction stages are fused in an upsampling manner, generating three layers of first fused feature maps carrying different semantic and position information. Convolutions with different dilation rates are applied to each layer of first fused feature map, and adaptive feature fusion is performed with the corresponding fusion weights, generating three second fused feature maps with different receptive field sizes that serve as three target prediction layers of different scales. Anchor boxes are generated on the second fused feature map of each target prediction layer according to preset anchor parameters, and the obstacle targets within the anchor boxes are identified by feature matching and non-maximum suppression to obtain their position, category and confidence information, generating the second target sequence of obstacle targets corresponding to the camera image. The second target sequence includes the position, category and confidence information of each obstacle target, as well as its relative distance and relative speed with respect to the camera position.
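To make the two building blocks concrete, here is a hedged PyTorch sketch of a MobileNet-style depthwise separable convolution and of an adaptive fusion of dilated convolutions; the channel counts, dilation rates and fusion scheme are illustrative assumptions, not the exact network of this disclosure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseSeparableConv(nn.Module):
    """MobileNet-style block: depthwise 3x3 conv followed by pointwise 1x1."""
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.dw = nn.Conv2d(c_in, c_in, 3, stride, 1, groups=c_in, bias=False)
        self.pw = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.bn1, self.bn2 = nn.BatchNorm2d(c_in), nn.BatchNorm2d(c_out)

    def forward(self, x):
        x = F.relu6(self.bn1(self.dw(x)))
        return F.relu6(self.bn2(self.pw(x)))

class AdaptiveDilatedFusion(nn.Module):
    """Parallel 3x3 convolutions with different dilation rates, combined
    with learned softmax-normalised fusion weights, so each prediction
    layer sees a different effective receptive field."""
    def __init__(self, c, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(c, c, 3, padding=r, dilation=r) for r in rates])
        self.weights = nn.Parameter(torch.ones(len(rates)))

    def forward(self, x):
        w = torch.softmax(self.weights, dim=0)
        return sum(wi * b(x) for wi, b in zip(w, self.branches))

# a downsampling stage followed by a shape-preserving fusion stage
feat = DepthwiseSeparableConv(32, 64, stride=2)(torch.randn(1, 32, 80, 80))
fused = AdaptiveDilatedFusion(64)(feat)   # -> (1, 64, 40, 40)
```

In a full detector, three such fused maps would feed prediction heads whose anchor-box outputs are pruned with non-maximum suppression (e.g. torchvision.ops.nms).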
Step S30, after the space-time synchronization of the first target sequence and the second target sequence is completed, performing target matching on the two sequences;
in this embodiment, step S10 obtains the first target sequence from the obstacle target information collected by the millimeter-wave radar sensor, and step S20 obtains the second target sequence from the obstacle target information collected by the camera sensor. Before the obstacle targets can be matched, the data of the two sensors must be synchronized in time and space, ensuring that the measurements of the corresponding target information from the different sensors are converted into the same reference coordinate system.
Take the case where the vehicle's millimeter-wave radar and camera sensor are installed on the central axis of the vehicle, with the relative mounting positions shown in fig. 3: Or-XrYrZr represents the millimeter-wave radar coordinate system, Ow-XwYwZw represents the vehicle coordinate system, and Oc-XcYcZc represents the camera coordinate system. Z0 represents the distance between the millimeter-wave radar coordinate system and the camera coordinate system in the Z-axis direction, while Z1 and H are the distances in the Y-axis direction between the millimeter-wave radar coordinate system and the camera coordinate system, and between the millimeter-wave radar coordinate system and the vehicle coordinate system, respectively. Since the sensors do not move once installed on the vehicle, the millimeter-wave radar and camera data can be synchronized in space through the vehicle coordinate system. For time synchronization, the data collected by the millimeter-wave radar, which has the lower sampling frequency, are taken as the reference, and a multi-threaded working mode is adopted to synchronize the millimeter-wave radar and camera data in time.
After the space-time synchronization of the first target sequence collected by the millimeter-wave radar and the second target sequence collected by the camera is completed in this manner, the two sequences can be matched against each other within the same space-time; a minimal sketch of the alignment follows.
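The sketch below illustrates the alignment under stated assumptions: the Z0, Z1 and H values and the axis conventions are hypothetical, and since both sensors sit on the central axis the spatial transform reduces to a translation.

```python
import numpy as np

# Illustrative mounting offsets in metres; in practice these come from the
# measured installation geometry of fig. 3 (values and axes assumed here).
Z0, Z1, H = 1.5, 0.8, 0.5

def radar_to_vehicle(p):
    """Translate a radar-frame point (x, y, z) into the vehicle frame."""
    x, y, z = p
    return np.array([x, y + H, z])

def camera_to_vehicle(p):
    x, y, z = p
    return np.array([x, y + H + Z1, z + Z0])

def nearest_camera_frame(radar_stamp, camera_frames):
    """Time synchronisation: the radar has the lower sampling frequency,
    so each radar measurement is paired with the closest camera frame."""
    return min(camera_frames, key=lambda f: abs(f["stamp"] - radar_stamp))
```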
And step S40, outputting corresponding obstacle prompt information according to the target matching result.
In this embodiment, after the first target sequence collected by the radar and the second target sequence collected by the camera are matched in the same space-time, there are four possible cases. First, a first obstacle target exists in the radar's first target sequence, a second obstacle target exists in the camera's second target sequence, and the two coincide at the same position in the same space-time, i.e. they are the same obstacle target. Second, both a first and a second obstacle target exist, but at different positions in the same space-time, i.e. some obstacle targets were collected only by the radar and others only by the camera. Third, a first obstacle target exists in the radar's first target sequence but no second obstacle target exists in the camera's second target sequence, i.e. only the radar detected an obstacle target in that space-time. Fourth, a second obstacle target exists in the camera's second target sequence but no first obstacle target exists in the radar's first target sequence, i.e. only the camera detected an obstacle target in that space-time. Corresponding obstacle prompt information is output according to these four different target matching cases.
Taking fig. 4 as an example: first, judge whether an obstacle target exists in sequence 1 collected by the radar. If sequence 1 contains an obstacle target, generate an ROI (region of interest) from its target point, then judge whether sequence 2 collected by the camera contains an obstacle target. If sequence 2 contains no obstacle target, input the ROI image into the convolutional neural network model to obtain the obstacle target's category, since sequence 1 carries no category information. If sequence 2 does contain an obstacle target, judge whether it coincides with the ROI. If they coincide, they represent the same obstacle target, and the category is taken directly from sequence 2, which carries category information. If they do not coincide, they represent different obstacle targets: the ROI image is input into the convolutional neural network model to obtain the category of the obstacle target in sequence 1, while the obstacle target in sequence 2, being observed only by the camera, must be judged for credibility using the confidence information contained in sequence 2; when the confidence is greater than X, the category of the obstacle target contained in sequence 2 is output directly.
If sequence 1 contains no obstacle target, judge whether sequence 2 collected by the camera contains one. If sequence 2 contains an obstacle target, it is observed only by the camera, so its credibility must be judged from the confidence information in sequence 2; when the confidence is greater than X, the category of the obstacle target contained in sequence 2 is output directly. If sequence 2 also contains no obstacle target, no obstacle target exists in the current space-time.
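The decision flow of fig. 4 can be summarised in a short sketch; the dictionary keys, the injected classify_roi and coincides callables, and the CONF_TH value standing in for the threshold X are all illustrative assumptions.

```python
CONF_TH = 0.5  # stands in for the threshold X of fig. 4 (hypothetical value)

def fuse_targets(radar_targets, camera_targets, classify_roi, coincides):
    """radar_targets: dicts with an 'roi' generated from the target point;
    camera_targets: dicts with 'conf' and 'category' from the CNN;
    classify_roi(roi): runs the CNN on the ROI crop (radar-only case);
    coincides(det, roi): spatial coincidence test."""
    alerts = []
    if not radar_targets:
        # case 4: camera-only detections, gated by the confidence threshold
        return [d for d in camera_targets if d["conf"] > CONF_TH]
    for r in radar_targets:
        matched = [d for d in camera_targets if coincides(d, r["roi"])]
        if matched:
            alerts.extend(matched)                 # case 1: same obstacle target
        else:
            alerts.append(classify_roi(r["roi"]))  # case 3: radar-only
    # case 2: camera detections coinciding with no radar ROI
    for d in camera_targets:
        if not any(coincides(d, r["roi"]) for r in radar_targets):
            if d["conf"] > CONF_TH:
                alerts.append(d)
    return alerts
```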
Further, in an embodiment, the step S40 includes:
if the first target sequence does not have a first obstacle target, but the second target sequence has a second obstacle target, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than a fifth preset threshold.
In this embodiment, if the first target sequence contains no first obstacle target but the second target sequence contains a second obstacle target, only the camera has detected the second obstacle target in the current space-time, and the target matching result must be obtained from the confidence of the second obstacle target in the second target sequence. When that confidence is greater than the fifth preset threshold, corresponding obstacle prompt information is output according to the category information of the second obstacle target in the second target sequence, including the category of the obstacle target and its distance and speed relative to the camera position.
Further, in an embodiment, the step S40 includes:
if the first target sequence has a first obstacle target, generating a region of interest according to a target point corresponding to the first obstacle target in the first target sequence;
and if the second target sequence does not have a second obstacle target, inputting the image of the region of interest into the convolutional neural network model, computing the category information of the first obstacle target corresponding to the region of interest, and outputting corresponding obstacle prompt information.
In this embodiment, if the first target sequence contains a first obstacle target while the second target sequence contains no second obstacle target, only the radar has detected the first obstacle target in the current space-time. The distance and speed of the first obstacle target relative to the radar position can be obtained from the first target sequence, but its category cannot be determined. A region of interest is therefore generated from the target point corresponding to the first obstacle target in the first target sequence, the image of the region of interest is input into the convolutional neural network model, and after the category information of the first obstacle target corresponding to the region of interest is computed, corresponding obstacle prompt information is output, including the category of the obstacle target and its distance and speed relative to the camera position.
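One way to generate the region of interest is to project the radar target point (already transformed into the camera frame) through the intrinsic matrix and open a window around the pixel; the window half-sizes below are assumed, not specified by the disclosure.

```python
def radar_point_to_roi(p_cam, K, half_w=60, half_h=60):
    """p_cam: radar target point already transformed into the camera frame;
    K: 3x3 intrinsic matrix; half_w/half_h: assumed ROI half-sizes (px)."""
    x, y, z = p_cam
    u = K[0][0] * x / z + K[0][2]   # u = fx * X / Z + u0
    v = K[1][1] * y / z + K[1][2]   # v = fy * Y / Z + v0
    return (int(u - half_w), int(v - half_h),
            int(u + half_w), int(v + half_h))
```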
Further, in an embodiment, after the step of generating the region of interest according to the target point corresponding to the first obstacle target in the first target sequence, the method further includes:
if the second target sequence has a second obstacle target, judging whether the second obstacle target coincides with the region of interest;
and if the second obstacle target coincides with the region of interest, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence.
In this embodiment, if the first target sequence contains a first obstacle target and the second target sequence contains a second obstacle target, the radar has detected the first obstacle target and the camera has detected the second obstacle target in the same current space-time. A region of interest is generated from the target point corresponding to the first obstacle target in the first target sequence, and it is judged whether the second obstacle target coincides with the region of interest.
If they coincide, i.e. the first and second obstacle targets are the same obstacle target in the same space-time, corresponding obstacle prompt information is output according to the category information of the second obstacle target in the second target sequence, including the category of the obstacle target and its distance and speed relative to the camera position.
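The coincidence test itself can be realised, for example, as an intersection-over-union check between the camera detection box and the radar ROI; the 0.3 threshold here is an assumption.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def coincides(det_box, roi_box, th=0.3):
    """Treat the detection and the radar ROI as the same obstacle target
    when their overlap exceeds the (assumed) threshold."""
    return iou(det_box, roi_box) > th
```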
Further, in an embodiment, after the step of determining whether the second obstacle target coincides with the region of interest if the second target sequence has the second obstacle target, the method further includes:
if the first obstacle target and the second obstacle target do not coincide, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than the fifth preset threshold;
and inputting the image of the region of interest into the convolutional neural network model, computing the category information of the first obstacle target corresponding to the region of interest, and outputting corresponding obstacle prompt information.
In this embodiment, if the first target sequence contains a first obstacle target and the second target sequence contains a second obstacle target, the radar has detected the first obstacle target and the camera has detected the second obstacle target in the same current space-time. A region of interest is generated from the target point corresponding to the first obstacle target, and it is judged whether the second obstacle target coincides with the region of interest. If the two obstacle targets do not coincide, they are at different positions in the same space-time: some were collected by the radar as first obstacle targets, others by the camera as second obstacle targets.
For the first obstacle target collected by the radar, the image of the region of interest is input into the convolutional neural network model, the category information of the first obstacle target corresponding to the region of interest is computed, and corresponding obstacle prompt information is output, including the category of the obstacle target and its distance and speed relative to the camera position.
For the second obstacle target collected by the camera, the target matching result is obtained from the confidence of the second obstacle target in the second target sequence. When that confidence is greater than the fifth preset threshold, corresponding obstacle prompt information is output according to the category information of the second obstacle target in the second target sequence, including the category of the obstacle target and its distance and speed relative to the camera position.
In this embodiment, a first target sequence of obstacle targets collected by a radar is acquired; an image in front of the vehicle is acquired, and a second target sequence of corresponding obstacle targets in the image is computed based on a convolutional neural network model; after the space-time synchronization of the first target sequence and the second target sequence is completed, the two sequences are matched; and corresponding obstacle prompt information is output according to the target matching result. In this manner, the detection accuracy of the vehicle perception system for targets of different scales, especially small-sized targets, can be improved with lower computing resources, accurate data are provided for subsequent decision-making and planning, collisions caused by missed detection of obstacle targets or by difficulty in making decisions about them are avoided, and the reliability and safety of intelligent driver assistance are improved.
In a third aspect, an embodiment of the present invention further provides an object detection apparatus.
Referring to fig. 5, a functional block diagram of an embodiment of an object detection apparatus is shown.
In this embodiment, the target detection apparatus includes:
the acquisition module 10 is configured to acquire a first target sequence of obstacle targets collected by a radar;
the calculation module 20 is configured to obtain an image in front of the vehicle, and compute a second target sequence of corresponding obstacle targets in the image based on a convolutional neural network model;
the matching module 30 is configured to perform target matching on the first target sequence and the second target sequence after the space-time synchronization of the two sequences is completed;
and the output module 40 is configured to output corresponding obstacle prompt information according to the target matching result.
Further, in an embodiment, the acquisition module 10 is configured to:
acquire real-time environment data in front of the vehicle collected by the radar;
calculate a target sequence of obstacle targets from the data;
if the radar cross section of an obstacle target in the target sequence is smaller than a first preset threshold and its signal-to-noise ratio is smaller than a second preset threshold, judge that the target signal is a null signal;
if the lateral distance between an obstacle target in the target sequence and the vehicle is larger than a third preset threshold, judge that the target signal is a non-dangerous signal;
if the accumulated number of detections of an obstacle target in the target sequence is smaller than a fourth preset threshold, judge that the target signal is an interference signal;
and filter the null signals, the non-dangerous signals and the interference signals out of the target sequence to obtain the first target sequence of obstacle targets collected by the radar.
Further, in an embodiment, the calculation module 20 is configured to:
acquiring an image in front of a vehicle, and inputting the image into a trained convolutional neural network model;
perform feature extraction on the obstacle targets multiple times by means of MobileNet depthwise separable convolutions in a downsampling manner, to generate feature maps of different scales;
fuse, in an upsampling manner, the feature maps of different scales generated by the last three feature extraction stages, to generate three layers of first fused feature maps carrying different semantic and position information;
apply convolutions with different dilation rates to each layer of first fused feature map and perform adaptive feature fusion with the corresponding fusion weights, to generate three second fused feature maps with different receptive field sizes that serve as three target prediction layers of different scales;
and generate anchor boxes on the second fused feature map of each target prediction layer according to preset anchor parameters, and identify the obstacle targets within the anchor boxes by feature matching and non-maximum suppression to obtain the position, category and confidence information of the obstacle targets, so as to generate the second target sequence of corresponding obstacle targets in the image.
Further, in an embodiment, the output module 40 is configured to:
if the first target sequence does not have a first obstacle target, but the second target sequence has a second obstacle target, outputting corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than a fifth preset threshold;
if the first target sequence has a first obstacle target, generate a region of interest according to a target point corresponding to the first obstacle target in the first target sequence;
and if the second target sequence does not have a second obstacle target, input the image of the region of interest into the convolutional neural network model, compute the category information of the first obstacle target corresponding to the region of interest, and output corresponding obstacle prompt information.
Further, in an embodiment, the output module 40 is further configured to:
if the second target sequence has a second obstacle target, judge whether the second obstacle target coincides with the region of interest;
if the second obstacle target coincides with the region of interest, output corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence;
if the first obstacle target and the second obstacle target do not coincide, output corresponding obstacle prompt information according to the category information of the second obstacle target in the second target sequence when the confidence of the second obstacle target in the second target sequence is greater than the fifth preset threshold;
and input the image of the region of interest into the convolutional neural network model, compute the category information of the first obstacle target corresponding to the region of interest, and output corresponding obstacle prompt information.
The function implementation of each module in the target detection apparatus corresponds to each step in the target detection method embodiment, and the function and implementation process thereof are not described in detail herein.
In a fourth aspect, the embodiment of the present invention further provides a readable storage medium.
The readable storage medium of the present invention stores an object detection program, wherein the object detection program, when executed by a processor, implements the steps of the object detection method as described above.
For the specific implementation of the target detection program when executed, reference may be made to the embodiments of the target detection method of the present invention, which are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system that comprises that element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for causing a terminal device to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.