Disclosure of Invention
The invention aims to provide a method and a system for fusing image information and radar information of a traffic scene that offer high accuracy, high processing speed, and strong adaptability.
In order to achieve the above object, one technical solution of the present invention provides a method for fusing image information and radar information of a traffic scene, comprising the following steps: preprocessing image information in front of the vehicle obtained by a camera; extracting feature information from the image information and comparing it against prestored traffic scene information; classifying the current traffic scene according to the comparison result; executing the fusion algorithm corresponding to the preset fusion method of image information and radar information matched to the current traffic scene category; and outputting the result of the fusion algorithm.
Further, before the step of preprocessing the image information in front of the vehicle obtained by the camera, the method further comprises: classifying traffic scenes by a deep learning method, and establishing a corresponding fusion method of image information and radar information for each classification.
Further, before the step of preprocessing the image information in front of the vehicle obtained by the camera, the method further comprises: installing the two sensors on the vehicle according to the installation criteria of the camera and the millimeter-wave radar, and calibrating the two sensors both individually and jointly to obtain the relevant parameters.
Furthermore, the camera is installed at a position 1-3 cm below the base of the rearview mirror inside the vehicle, and the millimeter wave radar is installed at the center of the license plate at the front end of the vehicle.
Further, the step of executing the fusion algorithm corresponding to the preset fusion method of image information and radar information matched to the current traffic scene specifically comprises: processing the acquired image information and radar information according to the scene classification result, including matrix conversion between coordinate systems, effective-target screening, target identification, monocular distance measurement, and the like, while executing the corresponding fusion algorithm.
Further, the fusion methods of image information and radar information comprise: a fusion method based mainly on radar information, a fusion method based mainly on image information, and a fusion method based on joint decision.
Specifically, the fusion method based mainly on radar information comprises: converting the position information of the effective targets obtained by the radar into the pixel coordinate system of the image through projection transformation, so as to form regions of interest in the image; performing target identification with a deep learning method; processing the effective target information with an information fusion algorithm; and outputting the position, speed, type, and other information of the fused targets.
Specifically, the fusion method based mainly on image information comprises: performing target identification on the image information with a deep learning algorithm; judging whether the image information of a target matches the radar information of that target; if they match, fusing the two and outputting the position, speed, type, and other information of the fused target; if they do not match, rejecting the radar information and outputting the position, speed, type, and other information of the target from the image information alone.
Specifically, the joint-decision fusion method comprises: completing a primary selection of radar targets with a target screening algorithm and outputting effective target information; completing target identification in the images returned by the camera with a deep learning algorithm; obtaining the lateral and longitudinal distance information of the targets with a monocular distance-measurement algorithm; completing observation matching between the radar information and the image information using the Mahalanobis distance; after matching, completing data fusion with a joint probability density algorithm; and outputting the position, speed, type, and other information of the targets.
In order to achieve the above object, another technical solution of the present invention provides a system for fusing image information and radar information of a traffic scene, comprising: a processor, a memory, and a communication circuit, the processor being coupled to the memory and the communication circuit; the memory stores communication data, image information, traffic scene classification information, and working program data of the processor; the communication circuit is used for information transmission; and the processor, when working, executes the program data so as to implement any one of the above methods for fusing image information and radar information of a traffic scene.
The invention has the following beneficial effects:
(1) The invention provides a method and a system for fusing image information and radar information of a traffic scene that judge the scene from the acquired image information and switch among different fusion algorithms, thereby using resources effectively and improving scene adaptability.
(2) The invention provides a method and a system for fusing image information and radar information of a traffic scene that make full use of the redundancy and complementarity among different sensor data, improving the robustness and reliability of the system.
(3) The invention provides a method and a system for fusing image information and radar information of a traffic scene that adopt a deep learning algorithm for image information processing, offering better real-time performance and more accurate target identification than traditional image processing algorithms.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic diagram of the principle framework of an embodiment of the method and system for fusing image information and radar information of a traffic scene according to the present invention. Firstly, typical traffic scenes extracted from everyday driving are classified by a deep learning method, for example: straight road on a sunny day, straight road on a rainy day, ramp on a sunny day, curve on a sunny night, curve on a rainy night, and so on; a corresponding fusion method of image information and radar information is then established for each class. The present embodiment mainly includes three fusion methods: a fusion method based mainly on radar information, a fusion method based mainly on image information, and a fusion method in which radar information and image information decide jointly. In the fusion method based mainly on radar information, whose algorithm flow chart is shown in fig. 2, a region of interest (ROI) is preliminarily determined from the detection target information of the millimeter-wave radar, projection transformation is performed, and an image processing algorithm is then applied to the ROI for target classification detection and feature extraction. In the fusion method based mainly on image information, whose algorithm flow chart is shown in fig. 3, a CNN-based target recognition algorithm is established, the relevant information of effective targets in the image is extracted, and the target information is supplemented with radar information. In the joint-decision fusion method, whose algorithm flow chart is shown in fig. 4, the camera and the radar make decisions separately; after the joint spatio-temporal calibration is completed, observation matching is completed using the Mahalanobis distance, the sensor weights are then determined with a joint probability density algorithm, and data fusion is completed, thereby determining the speed, type, position, and other information of forward dangerous targets. Selecting the most appropriate fusion method for environment detection in each corresponding scene improves the detection accuracy and reliability for forward objects and the adaptability to different scenes.
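As a minimal illustration of how the scene class drives the choice among the three methods, the following Python sketch dispatches each frame to one fusion routine. All names, scene labels, and the mapping itself are hypothetical assumptions; the three fuse_* functions stand in for the algorithms detailed under the scenes below.

```python
from enum import Enum

class FusionMode(Enum):
    RADAR_DOMINANT = 1   # severe weather / poor lighting: trust the radar
    IMAGE_DOMINANT = 2   # ramps and curves: trust the camera
    JOINT_DECISION = 3   # normal conditions: fuse both decisions

def fuse_radar_dominant(image, radar_targets): ...   # see Scene one
def fuse_image_dominant(image, radar_targets): ...   # see Scene two
def fuse_joint_decision(image, radar_targets): ...   # see Scene three

# Hypothetical mapping from scene classes to fusion modes; the real
# assignment would follow the classes learned during training.
SCENE_TO_MODE = {
    "rainy_straight": FusionMode.RADAR_DOMINANT,
    "sunny_ramp":     FusionMode.IMAGE_DOMINANT,
    "sunny_curve":    FusionMode.IMAGE_DOMINANT,
    "sunny_straight": FusionMode.JOINT_DECISION,
}

DISPATCH = {
    FusionMode.RADAR_DOMINANT: fuse_radar_dominant,
    FusionMode.IMAGE_DOMINANT: fuse_image_dominant,
    FusionMode.JOINT_DECISION: fuse_joint_decision,
}

def fuse_frame(scene_class, image, radar_targets):
    mode = SCENE_TO_MODE.get(scene_class, FusionMode.JOINT_DECISION)
    return DISPATCH[mode](image, radar_targets)
```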
In a more specific embodiment, the present invention comprises the steps of:
(1) Install the two sensors on the vehicle according to the installation criteria of the camera and the millimeter-wave radar, and calibrate them both individually and jointly to obtain the relevant parameters. The millimeter-wave radar is mounted at the center of the front end of the vehicle at a height of 35 cm to 65 cm above the ground; its mounting plane is kept as perpendicular as possible to the ground and to the longitudinal plane of the vehicle body, with pitch and yaw angles close to 0 degrees. The camera is mounted 1-3 cm below the base of the rearview mirror inside the vehicle, and its pitch angle is adjusted so that, on a straight road with the vehicle body parallel to the road, the lower 2/3 of the image is road. The internal parameters of the camera are calibrated with a checkerboard calibration method, and the two sensors are jointly calibrated by combining the respective positions of the camera and the radar with the angle information of the checkerboard calibration plate, yielding the required parameters.
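The intrinsic calibration can be sketched with OpenCV's checkerboard routines. This is a minimal sketch, assuming a 9x6 inner-corner board, a 25 mm square size, and illustrative file names; the patent does not fix these details.

```python
import cv2
import numpy as np

PATTERN = (9, 6)    # inner corners per row and column (assumed)
SQUARE = 0.025      # square edge length in metres (assumed)

# 3-D board coordinates of the corners, z = 0 on the board plane.
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
for fname in ["calib_01.png", "calib_02.png"]:   # calibration shots (illustrative)
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K is the intrinsic matrix (it contains f/dx, f/dy, u0, v0); rvecs and
# tvecs give the board poses usable for the joint camera-radar calibration.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```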
(2) Preprocess the image information in front of the vehicle obtained by the camera, extract the feature information in the image information, compare it against the prestored traffic scene information, and classify the current traffic scene according to the comparison result. Information is acquired with the sensor of lower sampling frame rate as the reference, so that the acquisition times of the two sensors are aligned. The image information collected by the camera is preprocessed by filtering, graying, normalization, and the like; after preprocessing, the images are input into an SENet for classification.
Compared with a general convolutional neural network, the SENet adopts a feature recalibration strategy: the importance of each feature channel is acquired automatically through learning, and the useful features are then promoted while features that are not useful for the current task are suppressed. The recalibration mainly comprises three steps. The first is the Squeeze operation: features are compressed along the spatial dimensions, turning each two-dimensional feature channel into a single real number, so that the output dimension matches the number of input feature channels. This number has, in some sense, a global receptive field; it characterizes the global distribution of responses over the feature channels, so that even shallow layers can obtain a global receptive field. The second is the Excitation operation, a mechanism similar to the gates in a recurrent neural network: a weight is generated for each feature channel by learned parameters that explicitly model the correlation between feature channels. The last is the Reweight operation: the output weights of the Excitation step are regarded as the importance of each feature channel after feature selection and are multiplied channel-by-channel onto the previous features, thereby recalibrating the original features in the channel dimension.
Before an input image can be tested, a large number of pictures must be used for training to obtain the corresponding network structure. Because the core of the SENet is the SE module, which can be embedded into almost all existing network structures, the SE module is embedded during training into the building-block units of the ResNet, BN-Inception, and Inception-ResNet-v2 structures; the model results are compared and the best model is retained. The parameters of the network are adjusted according to the training results until a satisfactory result is obtained, and the final model is output. After a picture is input into the trained model, the network automatically extracts its features and completes the scene classification.
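A minimal PyTorch sketch of the Squeeze-Excitation-Reweight steps described above follows; the reduction ratio r=16 is a common choice from the SENet literature, not a value prescribed here.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-Excitation-Reweight over a C x H x W feature map."""
    def __init__(self, channels, r=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)   # Squeeze: C x H x W -> C x 1 x 1
        self.excite = nn.Sequential(             # Excitation: gate-like channel weights
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.squeeze(x).view(b, c)        # one real number per channel
        w = self.excite(w).view(b, c, 1, 1)   # per-channel importance in (0, 1)
        return x * w                          # Reweight: recalibrate channel-wise
```

In training, such a block would be appended to the building-block units (for example after each residual branch), matching the embedding into ResNet, BN-Inception, and Inception-ResNet-v2 described above.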
(3) Combining the scene classification result, execute the fusion algorithm corresponding to the preset fusion method of image information and radar information matched to the current traffic scene category, and output the result of the fusion algorithm. More specifically, the acquired image information and radar information are processed, including matrix conversion between coordinate systems, effective-target screening, target identification, monocular distance measurement, and the like, while the corresponding fusion algorithm is executed; finally, the fusion algorithm result is output.
Scene one
In severe environments such as haze, rainstorm, or snowstorm, or in poorly lit environments such as at night, the performance of the camera is affected and its detection reliability is reduced; in this case the multi-sensor fusion method based mainly on radar is adopted. With reference to fig. 2, the fusion method based mainly on radar information comprises the following steps: the position information of the effective targets obtained by the radar is converted into the pixel coordinate system of the image through projection transformation, forming regions of interest in the image; target identification is performed with a deep learning method; the effective target information is processed with an information fusion algorithm; and the position, speed, type, and other information of the fused targets is output.
The effective-target screening proceeds as follows. First, the information output by the radar is judged against the detection range of the vehicle-mounted radar, together with technical parameters such as its measurement accuracy and resolution, and unreasonable target information is removed. Second, while the vehicle is running the number of nearby targets is relatively small, so many radar channels detect no effective obstacle target and return the radar's most raw signals; these signals are removed by conditions set according to the definition of each type of radar channel. Meanwhile, false signals produced when radar vibration makes the echo energy uneven are also removed.
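A sketch of this screening might look as follows; the field names and thresholds are illustrative assumptions, not values given in the patent.

```python
# Illustrative plausibility bounds; a real system would derive these from
# the radar's data sheet (detection range, accuracy, resolution).
MAX_RANGE_M = 150.0    # nominal detection range
MAX_SPEED_MPS = 70.0   # bound on plausible relative speed
RCS_FLOOR = -10.0      # dBsm; floor against vibration-induced weak echoes

def screen_radar_targets(raw_targets):
    """Keep only physically reasonable, non-empty, non-spurious returns."""
    effective = []
    for t in raw_targets:
        if t["range"] <= 0 or t["range"] > MAX_RANGE_M:
            continue                      # outside the detection range
        if abs(t["speed"]) > MAX_SPEED_MPS:
            continue                      # unreasonable relative speed
        if t["is_empty_channel"]:
            continue                      # channel returned a raw/default signal
        if t["rcs"] < RCS_FLOOR:
            continue                      # weak echo, likely a vibration artifact
        effective.append(t)
    return effective
```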
The projective transformation involved in the method is:

$$
z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} =
\begin{bmatrix} \frac{1}{dx} & 0 & u_0 \\ 0 & \frac{1}{dy} & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix}
\begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}
$$

where $(x_w, y_w, z_w)$ are the world coordinate system coordinates, $(u, v)$ are the image pixel coordinate system coordinates, $(x_c, y_c, z_c)$ are the camera coordinate system coordinates, $R$ denotes the rotation matrix, $t$ denotes the translation matrix, $f$ denotes the focal length, $dx$ and $dy$ denote the lengths occupied by one pixel in the x and y directions of the image physical coordinate system, and $u_0, v_0$ denote the numbers of horizontal and vertical pixels between the image center pixel coordinates ($O_1$) and the image origin pixel coordinates ($O_0$).
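Written out as code, the projection of a radar-reported world point to a pixel is a direct transcription of the equation above; the parameters are assumed to come from the calibration step.

```python
import numpy as np

def world_to_pixel(p_world, R, t, f, dx, dy, u0, v0):
    """Project a world point (3,) onto the image plane.
    R (3x3) and t (3,) come from the joint calibration; f, dx, dy, u0, v0
    are the intrinsics from the checkerboard calibration."""
    p_cam = R @ p_world + t            # world -> camera coordinates
    x_img = f * p_cam[0] / p_cam[2]    # camera -> image physical plane
    y_img = f * p_cam[1] / p_cam[2]
    u = x_img / dx + u0                # physical plane -> pixel coordinates
    v = y_img / dy + v0
    return np.array([u, v])
```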
The size of the region of interest involved in the method is not fixed: it is inversely proportional to the distance of the target vehicle from the millimeter-wave radar. Since the coordinates acquired by the radar are generally the vehicle centroid coordinates, the centroid is taken as the center of the region of interest, and the region is drawn with an adaptive threshold method.
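A minimal sketch of such a distance-adaptive ROI follows; the constants are illustrative stand-ins for the adaptive threshold, not values from the patent.

```python
def roi_around_target(u, v, range_m, k=4000.0, min_px=32, max_px=256):
    """ROI centred on the projected centroid (u, v); the side length shrinks
    as the target gets farther, per the inverse proportionality above."""
    side = int(max(min_px, min(max_px, k / max(range_m, 1.0))))
    half = side // 2
    return (int(u) - half, int(v) - half, side, side)  # x, y, w, h
```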
The deep learning algorithm involved in the method takes the characteristics of traffic scenes into account: target features are distinct, but targets may occlude one another. A Caffe-Net model is selected, and the model is fine-tuned according to the recognition results during training.
The information fusion algorithm involved in the method takes into account that the confidence of the radar information is high, so a relatively simple weighted-average information fusion algorithm can be adopted, giving a high weight to the radar information and a low weight to the image information.
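A minimal sketch of this radar-weighted average follows; the weight values and dict-based records are illustrative assumptions.

```python
def weighted_fuse(radar_state, image_state, w_radar=0.8, w_image=0.2):
    """Weighted-average fusion for the radar-dominant scene; the weights
    would be tuned in practice, with the radar weighted heavily."""
    assert abs(w_radar + w_image - 1.0) < 1e-9
    return {
        "position": w_radar * radar_state["position"] + w_image * image_state["position"],
        "speed":    w_radar * radar_state["speed"]    + w_image * image_state["speed"],
        "type":     image_state["type"],  # the class label comes from the image branch
    }
```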
Scene two
Because the radar detection plane is horizontal and the azimuth angle is small, the detection capability is limited to a certain extent in scenes such as uphill and downhill roads and curves; in this case the fusion method based mainly on image information is adopted, which mainly comprises the following steps: target identification is performed on the image information with a deep learning algorithm; the image information of a target is matched against the radar information of that target; if they match, the two are fused and the position, speed, type, and other information of the fused target is output; if they do not match, the radar information is rejected and the position, speed, type, and other information of the target is output from the image information alone.
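The match-or-reject logic of this scene can be sketched as follows; the dict-based records and the match_fn gate are hypothetical placeholders (match_fn could be, for example, the Mahalanobis gate shown under Scene three).

```python
def fuse_image_dominant(detections, radar_targets, match_fn):
    """Image-dominant fusion: keep the camera's detections and attach radar
    data only when a radar target passes the matching judgment."""
    results = []
    for det in detections:   # CNN detections from the image
        radar = next((r for r in radar_targets if match_fn(det, r)), None)
        if radar is not None:
            results.append({**det, "speed": radar["speed"],
                            "range": radar["range"], "source": "fused"})
        else:                # no agreement: reject the radar information
            results.append({**det, "source": "image_only"})
    return results
```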
Scene three
Under normal conditions, the performance of both the radar and the camera remains good, and the joint-decision fusion method of radar information and image information is adopted. The method specifically comprises: completing a primary selection of radar targets with a target screening algorithm and outputting effective target information; completing target identification in the images returned by the camera with a deep learning algorithm; and obtaining the lateral and longitudinal distance information of the targets with a monocular distance-measurement algorithm. Observation matching between the radar information and the image information is completed by applying the Mahalanobis distance; specifically, $V_k$ is defined as the most likely region for the observation of the current target:

$$
V_k = \left\{ z : (z - \hat z_{k|k-1})^T S_k^{-1} (z - \hat z_{k|k-1}) \le c^2 \right\}
$$

where $\hat z_{k|k-1}$ is the predicted observation and $S_k$ is the innovation covariance. According to the statistics, when $c = 3$ the probability that the observed value falls within the effective region is 99.8%. After matching is finished, data fusion is completed with a joint probability density algorithm, and the position, speed, type, and other information of the target is output.
The more detailed procedure is as follows:
A. Establish the system state equation and observation equation:

$$
x_{i,k} = F_k x_{i,k-1} + v_k, \quad i = 1, 2, 3, \ldots
$$
$$
z_{ij,k} = H_j x_{i,k} + w_{j,k}, \quad j = 1, 2
$$

where $x_{i,k}$ represents the state vector of the i-th target at time k; $v_k$ is Gaussian white noise with mean 0 and covariance matrix $E(v_k v_k^T) = Q_k$; $z_{ij,k}$ denotes the observation of the i-th target detected and output by the j-th sensor at time k; $H_j$ is the conversion matrix; and $w_{j,k}$ is Gaussian white noise, also with mean 0, whose covariance $E(w_{j,k} w_{j,k}^T) = R_{j,k}$ depends on the type of sensor.
B. Predict the state value and the observation value of the previous step using Kalman filtering:

$$
x'_{ij,k|k-1} = F_k x'_{ij,k-1}, \qquad z'_{k|k-1} = H_j x'_{ij,k|k-1}
$$

The state of this cycle (time k) is updated as:

$$
x'_{ij,k} = x'_{ij,k|k-1} + K_{ij} (z_{ij,k} - z'_{k|k-1})
$$

where $x'_{ij,k}$ represents the estimate of the i-th target updated according to the observation output by the j-th sensor, and $K_{ij}$ is the Kalman gain matrix of the system.

C. Update the covariance matrices of the predicted value and the observation value:

$$
P_{ij,k|k-1} = F_k P_{ij,k-1} F_k^T + Q_k, \qquad S_{ij,k} = H_j P_{ij,k|k-1} H_j^T + R_{j,k}
$$

D. Update the Kalman gain matrix:

$$
K_{ij} = P_{ij,k|k-1} H_j^T S_{ij,k}^{-1}
$$

E. Update the estimated value with a weighted-average method:

$$
x'_{i,k|k} = \sum_j \beta_{ij}\, x'_{ij,k}
$$

where $\beta_{ij}$ is the probability that the observation of the j-th sensor is generated by the i-th target. The state covariance is then updated as follows:

$$
P_{i,k|k} = \sum_j \beta_{ij} \left( P_{ij,k|k} + \eta_{ij} \right)
$$

where $\eta_{ij}$ is the hypothesis deviation, defined as:

$$
\eta_{ij} = (x'_{ij} - x'_{i,k|k})(x'_{ij} - x'_{i,k|k})^T
$$

F. Solve for $\beta_{ij}$ according to Poisson distribution theory, where $\gamma_{ij} = z_{ij,k} - z'_{k|k-1}$ is the residual vector between the observation and the prediction.
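Steps B through E can be sketched in NumPy under standard linear-Gaussian assumptions; the computation of $\beta_{ij}$ from Poisson clutter theory (step F) is omitted here as well, and the betas are taken as given.

```python
import numpy as np

def kalman_predict(x, P, F, Q):
    """Step B (and C, prediction part): predict state and its covariance."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

def kalman_update(x_pred, P_pred, z, H, R):
    """Steps B-D: innovation covariance, gain, and state/covariance update."""
    S = H @ P_pred @ H.T + R                # innovation covariance S_ij,k
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain K_ij
    gamma = z - H @ x_pred                  # residual gamma_ij
    x_upd = x_pred + K @ gamma
    P_upd = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_upd, P_upd

def weighted_combine(states, covs, betas):
    """Step E: combine per-sensor estimates with association probabilities
    beta_ij, adding the hypothesis-deviation term eta_ij."""
    x_fused = sum(b * x for b, x in zip(betas, states))
    P_fused = np.zeros_like(covs[0])
    for b, x, P in zip(betas, states, covs):
        eta = np.outer(x - x_fused, x - x_fused)  # eta_ij
        P_fused += b * (P + eta)
    return x_fused, P_fused
```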
The scene-based vision and millimeter-wave radar information fusion system and method of the invention mainly comprise three blocks: sensor installation and calibration, scene classification, and fusion algorithm selection with result output. The installation and calibration of the camera and the millimeter-wave radar are completed by combining the sensor characteristics with the checkerboard calibration method; scenes are classified using the best of the ResNet, BN-Inception, and Inception-ResNet-v2 models with the SE module embedded; and a suitable fusion algorithm is selected according to the scene classification result: a fusion algorithm based mainly on radar information, a fusion algorithm based mainly on image information, or a joint-decision fusion algorithm, after which the final fusion result is output.
The invention also provides a system for fusing the image information and the radar information of a traffic scene, comprising: a processor, a memory, and a communication circuit, the processor being coupled to the memory and the communication circuit; the memory stores communication data, image information, traffic scene classification information, and working program data of the processor; the communication circuit is used for information transmission; and the processor, when working, executes the program data so as to implement any one of the above methods for fusing image information and radar information of a traffic scene. For a detailed description of the related contents, please refer to the method section above, which is not repeated here.
The invention further provides a device with a storage function, on which program data are stored; when the program data are executed by a processor, the above method for fusing image information and radar information of a traffic scene is implemented.
The device with storage function may be at least one of a server, a floppy disk drive, a hard disk drive, a CD-ROM reader, a magneto-optical disk reader, and the like.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the invention; all equivalent structural or process modifications made using the contents of the present specification, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of the present invention.