CN110059558B


Info

Publication number: CN110059558B
Application number: CN201910198144.5A
Authority: CN
Inventors: 刘慧�; 张礼帅; 沈跃; 吴边; 张健
Original assignee: Jiangsu University
Current assignee: Jiangsu University
Priority date: 2019-03-15
Filing date: 2019-03-15
Publication date: 2023-08-25
Anticipated expiration: 2039-03-15
Also published as: CN110059558A

Abstract

The invention discloses a real-time detection method for orchard obstacles based on an improved SSD. An improved SSD deep learning object detection method is used to identify obstacles in an orchard environment. The lightweight network MobileNetV2 serves as the base network of the SSD model to reduce the time and computation spent on extracting image features. The auxiliary layers perform position prediction using an inverted residual structure combined with dilated convolution as their basic structure, so that multi-scale features can be combined while the information loss caused by downsampling is avoided. The SSD object detection model is trained on an image data set, and images acquired by a camera are then input to the trained model to detect target positions. This addresses the problems of traditional obstacle detection algorithms, which are easily affected by background interference, locate obstacles inaccurately, and have difficulty detecting multiple obstacle categories simultaneously.

Description

Translated from Chinese

A Real-Time Detection Method for Orchard Obstacles Based on an Improved SSD Network

Technical Field

The invention belongs to the fields of computer vision and deep learning, and specifically relates to an obstacle detection method for the intelligent operation of mobile robots in outdoor orchard environments.

Background

Accurate recognition of farmland obstacles is one of the key technologies for unmanned agricultural vehicles. With the introduction of precision agriculture theory and the development of intelligent robots, automatic navigation of intelligent agricultural vehicles has attracted growing attention in China and abroad. Autonomously navigating agricultural vehicles can replace manual labor, improve operating efficiency, and reduce agricultural production costs. To keep such vehicles safe when operating in the field without human intervention, real-time obstacle detection is required. Obstacle detection in field environments is challenging because of the complex natural environment, the variability of obstacle shapes, and wide variation in external conditions such as illumination. In the field, ultrasonic sensors locate obstacles with poor spatial accuracy and are susceptible to interference; lidar sensors can detect obstacles more directly, but lidar systems are expensive. Compared with these methods, computer vision is low in cost and can make effective use of the color and texture information in the environment. The present method combines computer vision with deep learning to detect pedestrian obstacles during the automatic operation of unmanned agricultural machinery.

In object detection, deep learning methods are far more accurate than traditional detection methods based on hand-crafted features such as HOG and SIFT. Deep learning object detection falls into two main classes. The first is convolutional network structures based on region proposals, represented by the R-CNN series (R-CNN, Fast R-CNN, Faster R-CNN). The second treats the detection of object positions as a regression problem: a CNN processes the whole image directly and predicts object categories and positions at the same time. Representative networks include YOLO and SSD (Single Shot MultiBox Detector), and they are generally faster than methods of the first class.

Because the SSD object detection model needs no time-consuming region proposal or feature resampling steps, it applies convolutions directly to the whole image and predicts the categories and coordinates of the objects it contains, which greatly increases detection speed. The use of small convolution kernels and multi-scale prediction also greatly improves detection accuracy. The SSD network structure has two parts, the base network and the auxiliary network. The base network is a typical network with high accuracy in image classification, with its classification layers removed. The auxiliary network is a convolutional structure added on top of the base network for object detection; its layers decrease progressively in size so that predictions can be made at multiple scales. Although SSD performs well in the overall trade-off between detection speed and accuracy, both still need further improvement, and its computational cost must be reduced to meet the requirements of deployment on mobile devices.

Summary of the Invention

To address the above problems, the invention uses the lightweight network MobileNetV2 as the base network of the SSD model to reduce the time and computation spent on extracting image features. The auxiliary layers perform position prediction using an inverted residual structure combined with dilated convolution as their basic structure, which combines multi-scale features while avoiding the information loss caused by downsampling. This enables real-time obstacle detection and keeps intelligent vehicles safe when operating in the field without human intervention. Reducing the parameter count and computation of the deep learning model lowers its hardware requirements and achieves real-time performance, which suits its use on outdoor mobile devices.

The technical solution of the invention is a real-time detection method for orchard obstacles based on an improved SSD network, comprising the following steps:

Step 1: Construct a data set of the orchard environment and divide it into a training set and a test set;

Step 2: On the basis of the TensorFlow deep learning framework, build the SSD object detection model, using MobileNetV2 as the feature extraction network and using an inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers;

Step 3: Initialize the parameters of the network model to obtain a pre-trained model;

Step 4: Using the training set and test set from step 1, train the pre-trained model with the batch gradient descent algorithm, applying a hard negative mining strategy during training to strengthen the model's ability to reject false positives;

Step 5: Deploy the SSD object detection model, capture images with a camera and feed them to the model, and remove redundant bounding boxes with the non-maximum suppression algorithm to obtain the detection result.

Further, the specific process of step 1 is:

1.1) Capture a large number of videos of different orchard scenes with a camera mounted on the orchard machinery, extract frames at 7.5 frames per second, and divide all images into a training set, a validation set, and a test set in a 2:1:1 ratio;

1.2) Manually annotate all of the above images. The annotated objects are the obstacles to be detected, and the annotation for each object consists of its category and the top-left and bottom-right coordinates of its bounding box;

1.3) Preprocess the training images: apply horizontal flipping and translation to increase the number of samples, transforming the annotations accordingly, and apply adaptive histogram equalization to improve image quality and reduce the effect of illumination changes.
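The data preparation in steps 1.1 to 1.3 can be sketched roughly as follows. This is a minimal illustration and not the inventors' code: the function names, the fixed random seed, and the 30 fps source video in the usage lines are assumptions, and adaptive histogram equalization is left out because it would require an image library.

```python
import random


def frame_indices(video_fps, n_frames, sample_fps=7.5):
    """Indices of the frames kept when a video is sampled at sample_fps."""
    step = video_fps / sample_fps
    indices = []
    i = 0
    while int(i * step) < n_frames:
        indices.append(int(i * step))
        i += 1
    return indices


def split_dataset(items, ratios=(2, 1, 1), seed=0):
    """Shuffle items and split them into train/validation/test by ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seed is an illustrative choice
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def flip_box_horizontally(box, image_width):
    """Mirror a (x1, y1, x2, y2) box so it still matches a flipped image."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


# Usage with hypothetical values: a 30 fps clip of 12 frames, 8 images.
kept = frame_indices(video_fps=30, n_frames=12)
train, val, test = split_dataset(range(8))
flipped = flip_box_horizontally((10, 20, 50, 60), image_width=100)
```

Flipping the annotation together with the image keeps the top-left and bottom-right corners in the order required by step 1.2.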

Further, in step 2, the specific method of using MobileNetV2 as the feature extraction network and using an inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers is:

2.1) Remove the convolution layers of MobileNetV2 that are used for classification and keep the feature extraction layers as the base network of SSD;

2.2) Combine the inverted residual structure with dilated convolution, and apply a hierarchical feature fusion strategy to resolve the discontinuous computation introduced by dilated convolution. The result serves as the basic structure of the auxiliary layers, which detect positions and categories from the features extracted by the base network.

Further, the specific method of step 3 is:

3.1) Train MobileNetV2 on the large-scale ImageNet classification data set until it reaches high classification accuracy;

3.2) Remove the classification convolution layers of MobileNetV2 and assign the parameters of its feature-extraction convolution layers to the corresponding feature extraction layers of SSD;

3.3) Randomly initialize the parameters of every SSD auxiliary layer from a Gaussian distribution with mean 0 and standard deviation 0.01.
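The initialization in steps 3.2 and 3.3 can be sketched as below, with parameters held in plain dictionaries of flat lists. The layer names, the `classifier/` prefix, and the dictionary layout are hypothetical and only stand in for a real TensorFlow checkpoint.

```python
import random


def initialize_parameters(pretrained, ssd_layer_sizes,
                          classifier_prefix="classifier/", seed=0):
    """Copy backbone weights and draw auxiliary weights from N(0, 0.01^2).

    pretrained: dict of layer name -> list of weights from the ImageNet model.
    ssd_layer_sizes: dict of layer name -> number of weights in the SSD model.
    classifier_prefix: hypothetical naming convention for classification layers.
    """
    rng = random.Random(seed)
    params = {}
    for name, size in ssd_layer_sizes.items():
        is_backbone = name in pretrained and not name.startswith(classifier_prefix)
        if is_backbone:
            # Step 3.2: reuse the pre-trained feature extraction weights.
            params[name] = list(pretrained[name])
        else:
            # Step 3.3: Gaussian initialization, mean 0 and std 0.01.
            params[name] = [rng.gauss(0.0, 0.01) for _ in range(size)]
    return params


pretrained = {"backbone/conv1": [0.5, -0.2, 0.1],
              "classifier/logits": [1.0, 2.0]}
ssd_layer_sizes = {"backbone/conv1": 3, "aux/block1": 1000}
params = initialize_parameters(pretrained, ssd_layer_sizes)
```

The classification weights are never copied because the SSD model has no layer with that name.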

Further, the specific method of step 4 is:

4.1) The objective function used when training with the batch gradient descent algorithm is:

where $N$ is the number of matched default bounding boxes (when $N$ is 0, $L$ is set directly to 0), $c$ is the labeled category, $l$ is the predicted bounding box, $g$ is the labeled bounding box, $L_{loc}$ is the smooth $L_1$ error of the corresponding position prediction, and $L_{conf}$ is the corresponding softmax multi-class error function:

where $Pos$ denotes the positive examples among the samples, $cx$ and $cy$ are the center coordinates of the prediction box, $w$ and $h$ are its width and height, $x_{ij}^{k}$ indicates whether the $i$-th prediction box matches the $j$-th ground-truth box for category $k$, $Neg$ denotes the negative examples, $l$ is the prediction box, and $g$ is the ground-truth box;

where $x_{ij}^{p}$ indicates whether prediction box $i$ matches ground-truth box $j$ for category $p$, $Neg$ denotes the negative examples, $\hat{c}_i^{0}$ is the predicted probability that the box contains no object, and $\hat{c}_i^{p}$ is computed as:

where $\hat{c}_i^{p}$ is the probability that the object in the $i$-th prediction box belongs to the $p$-th category;
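As a sketch, and assuming the objective in step 4.1 follows the original SSD formulation (which matches every symbol defined above), the loss terms can be written as below. The weight $\alpha$ and the encoded ground-truth offsets $\hat{g}_j^{m}$ come from that formulation and are assumptions here, since the text does not name them.

```latex
L(x, c, l, g) = \frac{1}{N}\left( L_{conf}(x, c) + \alpha\, L_{loc}(x, l, g) \right)

L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx,\, cy,\, w,\, h\}}
    x_{ij}^{k}\, \mathrm{smooth}_{L1}\!\left( l_i^{m} - \hat{g}_j^{m} \right)

L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\!\left( \hat{c}_i^{p} \right)
    - \sum_{i \in Neg} \log\!\left( \hat{c}_i^{0} \right),
\qquad
\hat{c}_i^{p} = \frac{\exp\!\left( c_i^{p} \right)}{\sum_{p} \exp\!\left( c_i^{p} \right)}
```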

4.2) During training, the detection model is first trained on the initial positive and negative samples. The trained model is then used to detect and classify the samples, and the incorrectly detected samples are added to the negative sample set for further training, which strengthens the model's ability to reject false positives.
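One round of the mining described in step 4.2 might look like the following sketch. It treats a detection as a false positive when it overlaps no labeled box by at least `match_iou`; that threshold, the box format, and the function names are assumptions for illustration and are not stated in the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mine_hard_negatives(detections, ground_truth, match_iou=0.5):
    """Return detections that match no labeled box, most confident first.

    detections: list of (box, score) produced by the current model.
    ground_truth: list of labeled boxes for the same image.
    match_iou: hypothetical matching threshold (not from the source).
    """
    hard = [(box, score) for box, score in detections
            if all(iou(box, gt) < match_iou for gt in ground_truth)]
    return sorted(hard, key=lambda d: d[1], reverse=True)


ground_truth = [(10, 10, 50, 50)]
detections = [((12, 11, 49, 52), 0.9),      # overlaps the labeled box
              ((100, 100, 140, 150), 0.8),  # confident false positive
              ((200, 30, 220, 60), 0.3)]    # weaker false positive
negatives = mine_hard_negatives(detections, ground_truth)
```

The returned boxes would be added to the negative sample set before the next training round.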

Further, the specific method of step 5 is:

5.1) Remove the operations used during training to prevent overfitting and fix the network parameters, so as to obtain the SSD object detection model for deployment;

5.2) Capture images with the camera and use them as the model input, obtaining the category confidences and bounding box coordinates of a number of objects;

5.3) Use the non-maximum suppression algorithm to remove redundant detection boxes and obtain a more accurate detection result.

Further, the non-maximum suppression algorithm is specifically: sort the detection results by confidence from high to low, compute the overlap rate between detection boxes, and set the overlap rate threshold to 0.5; among detections whose mutual overlap exceeds the threshold, the one with the highest confidence is adopted.
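The suppression procedure above can be sketched as a greedy loop. This is a generic, class-agnostic version written for illustration; the box format `(x1, y1, x2, y2)` and the function names are assumptions, while the 0.5 overlap threshold is the value given in the text.

```python
def iou(a, b):
    """Overlap rate (intersection over union) of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, overlap_threshold=0.5):
    """Keep the most confident box in each group of overlapping boxes.

    detections: list of (box, confidence) pairs.
    """
    # Sort by confidence from high to low.
    ordered = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in ordered:
        # Drop a box if it overlaps an already kept box too strongly.
        if all(iou(box, k[0]) <= overlap_threshold for k in kept):
            kept.append((box, score))
    return kept


detections = [((10, 10, 50, 50), 0.9),
              ((12, 12, 52, 52), 0.8),      # near-duplicate of the first box
              ((100, 100, 150, 150), 0.7)]  # separate object
result = non_max_suppression(detections)
```

In a multi-class detector the same loop would usually be run once per category.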

The advantages of this solution are:

1) Through transfer learning, the MobileNetV2 parameters that perform well on ImageNet classification are transferred to the feature extraction network of SSD, which simplifies the training of the object detection model and shortens training time.

2) The feature extraction network of the original SSD is replaced with the more lightweight MobileNetV2 model, and the auxiliary layers perform their convolutions with the improved inverted residual structure. This makes use of multi-scale feature information while reducing computation, improving both the accuracy and the speed of detection.

3) The improved SSD object detection model is used to detect pedestrian obstacles in field environments. The model is small and lightweight, which suits deployment on mobile devices, and it is robust enough to detect obstacles in orchard environments well, providing a basis for obstacle avoidance decisions.

Description of Drawings

Figure 1 is a flow chart of the steps of the invention.

Figure 2 is a diagram of the improved inverted residual structure.

Figure 3 is a diagram of the hierarchical feature fusion structure.

The structure of a dilated convolution layer is written as (input channels, receptive field, output channels), where the effective receptive field of the dilated convolution kernel is $n_k \times n_k$, with $n_k = (n-1) \cdot 2^{k-1} + 1$, $k = 1, \ldots, K$.

Figure 4 is a diagram of the improved SSD object detection model.
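As a small worked example of the effective receptive field formula given with Figure 3, the sketch below evaluates $n_k = (n-1) \cdot 2^{k-1} + 1$ for a kernel of size n. The choice of K = 4 branches in the usage line is an assumption made only for illustration.

```python
def effective_receptive_fields(n, num_branches):
    """Effective kernel sizes n_k = (n - 1) * 2**(k - 1) + 1 for k = 1..K."""
    return [(n - 1) * 2 ** (k - 1) + 1 for k in range(1, num_branches + 1)]


# A 3x3 kernel with dilation rates 1, 2, 4, 8 (K = 4 is a hypothetical choice).
sizes = effective_receptive_fields(n=3, num_branches=4)
```

For a 3x3 kernel this gives side lengths 3, 5, 9, and 17, so each additional branch roughly doubles the area of context covered without any downsampling.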

Detailed Description

The invention is described in further detail below with reference to the drawings and specific embodiments.

The invention provides a real-time detection method for orchard obstacles based on an improved SSD. The method mainly includes the following steps:

Step 1: Construct the data set and divide it into a training set and a test set. This step includes the following sub-steps:

1.2) Manually annotate all of the above images. The annotated objects are the obstacles to be detected, and the annotation for each object consists of its category and the top-left and bottom-right coordinates of its bounding box.

Step 2: On the basis of the TensorFlow deep learning framework, use MobileNetV2 as the feature extraction network, and use an inverted residual structure combined with dilated convolution as the basic convolution structure of the SSD auxiliary layers.

This mainly includes the following steps:

1. Build the improved SSD object detection algorithm in the TensorFlow deep learning framework. The final layers of the lightweight MobileNetV2 model that are used for classification (conv2d 1x1, avgpool 7x7, conv2d 1x1) are removed, and the remaining network serves as the base layers of SSD for feature extraction.

2. Improve the convolution structure of the auxiliary layers of the SSD object detection model: an inverted residual structure combined with a dilated convolution structure is used as the basic convolution unit of the auxiliary layers, as shown in Figure 2. Dilated convolution enlarges the receptive field of the convolution kernel without any downsampling operation, which reduces the information loss caused by nonlinear transformations during learning and gives the kernels multi-scale receptive fields.

3. Because dilated convolution makes the convolution kernel's computation discontinuous, a hierarchical feature fusion algorithm is further used to offset this negative effect. Specifically, the outputs of the convolution units of the dilated convolution layer are summed successively, and all of the summed results are concatenated to give the final output. See Figure 3.
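The hierarchical feature fusion in item 3 can be illustrated with a short sketch. Each branch output is represented here as a flat list of numbers standing in for a feature map, which is a simplification; the function name and the data are hypothetical, and the running sum follows the wording above rather than any particular reference implementation.

```python
def hierarchical_feature_fusion(branch_outputs):
    """Sum branch outputs successively, then concatenate every partial sum.

    branch_outputs: list of equally sized lists, ordered from the smallest
    dilation rate to the largest.
    """
    fused = []
    running = None
    for out in branch_outputs:
        if running is None:
            running = list(out)
        else:
            # Add this branch to the accumulated sum of the previous branches.
            running = [r + o for r, o in zip(running, out)]
        # Concatenate the current partial sum onto the final output.
        fused.extend(running)
    return fused


branches = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
fused = hierarchical_feature_fusion(branches)
```

Because each partial sum already contains the lower-dilation outputs, the gaps left by a high dilation rate are filled in by the denser branches before concatenation.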

The improved inverted residual structure uses ReLU6 as its activation function. Compared with ReLU, ReLU6 is more robust in low-precision computation. The convolution kernels keep the typical 3x3 size. The ReLU6 function is given by equation (1).

ReLU6 = min(max(features, 0), 6)    (1)
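Equation (1) translates directly into code. The sketch below applies it to scalar values in plain Python; the function name is chosen here for illustration.

```python
def relu6(features):
    """ReLU6 from equation (1): clip the input to the range [0, 6]."""
    return min(max(features, 0), 6)


outputs = [relu6(v) for v in (-1.5, 0.0, 3.2, 8.0)]
```

Capping activations at 6 bounds their range, which is what helps when values are stored in low-precision formats.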

The resulting improved SSD object detection model is shown in Figure 4.

Step 3: Initialize the parameters of the network model to obtain the pre-trained model. This mainly includes the following steps:

3.1) Train MobileNetV2 on the large-scale ImageNet classification data set until it reaches high classification accuracy;

3.2) Remove the classification convolution layers of MobileNetV2 and assign the parameters of its feature-extraction convolution layers to the corresponding feature extraction layers of SSD. That is, for the MobileNetV2 base network, a MobileNetV2 network already trained on the ImageNet classification data set is used, and the parameters of the corresponding layers are extracted as the initial values of the base network.

Step 4: Train the pre-trained model with the batch gradient descent algorithm, applying a hard negative mining strategy during training to strengthen the model's ability to reject false positives. The specific training process is:

In the objective function used for training, $Pos$ denotes the positive examples among the samples, $cx$ and $cy$ are the center coordinates of the prediction box, $w$ and $h$ are its width and height, $x_{ij}^{k}$ indicates whether the $i$-th prediction box matches the $j$-th ground-truth box for category $k$, $Neg$ denotes the negative examples, $l$ is the prediction box, and $g$ is the ground-truth box.

$\hat{c}_i^{p}$ is the probability that the object in the $i$-th prediction box belongs to the $p$-th category.

The batch gradient descent algorithm uses a batch size of 128, a momentum of 0.9, a weight decay coefficient of 2×10^-3, and a maximum of 100k iterations. The initial learning rate is 0.004 with a decay rate of 0.95, applied once every 10,000 iterations. The model is saved every 10,000 iterations, and the model with the highest accuracy is finally selected.
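The hyperparameters above can be collected into a configuration, with the learning rate schedule written out explicitly. The sketch assumes a staircase exponential decay (the rate is multiplied by 0.95 at each 10,000-iteration boundary), which is one reading of the text; the dictionary keys and function names are illustrative.

```python
TRAIN_CONFIG = {
    "batch_size": 128,
    "momentum": 0.9,
    "weight_decay": 2e-3,
    "max_iterations": 100_000,
    "initial_learning_rate": 0.004,
    "decay_rate": 0.95,
    "decay_steps": 10_000,
    "checkpoint_every": 10_000,
}


def learning_rate(step, cfg=TRAIN_CONFIG):
    """Staircase decay: multiply by decay_rate once per decay_steps iterations."""
    completed_decays = step // cfg["decay_steps"]
    return cfg["initial_learning_rate"] * cfg["decay_rate"] ** completed_decays


def checkpoint_steps(cfg=TRAIN_CONFIG):
    """Iterations at which the model is saved for later selection."""
    return list(range(cfg["checkpoint_every"],
                      cfg["max_iterations"] + 1,
                      cfg["checkpoint_every"]))


rates = [learning_rate(s) for s in (0, 9_999, 10_000, 20_000)]
saves = checkpoint_steps()
```

With these settings ten checkpoints are written, and the most accurate one on held-out data would be kept.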

A hard negative mining strategy is used during training. The detection model is first trained on the initial positive and negative samples; the trained model is then used to detect and classify the samples, and the incorrectly detected samples are added to the negative sample set for further training, which strengthens the model's ability to reject false positives.

Step 5: Deploy the SSD model, capture images with a camera and feed them to the SSD object detection model, and remove redundant bounding boxes with the non-maximum suppression algorithm to obtain the detection result. The specific implementation is:

5.1) Remove the operations used during training to prevent overfitting and fix the network parameters, so as to obtain the network model for deployment;

5.2) Capture images with the camera and use them as the model input, obtaining the category confidences and bounding box coordinates of a number of objects;

5.3) Use the non-maximum suppression algorithm to remove redundant detection boxes and obtain a more accurate detection result.

1. Fix the model parameters trained in step 4 and remove operations that prevent overfitting, such as dropout, to obtain the final network model.

2. Test and evaluate the network model. The evaluation metrics are precision (P), recall (R), and their harmonic mean F1, given by equations (2), (3), and (4) respectively.

$$P = \frac{TP}{TP + FP} \quad (2)$$

$$R = \frac{TP}{TP + FN} \quad (3)$$

$$F1 = \frac{2 \cdot P \cdot R}{P + R} \quad (4)$$

In equations (2) and (3), TP is the number of pedestrians correctly detected, FP is the number of non-pedestrian objects mistakenly detected as pedestrians, and FN is the number of pedestrians mistakenly detected as background. F1 is the harmonic mean of precision and recall; the closer it is to 1, the better the model performs.

3. Fix the parameters of the network model that meets expectations and deploy it on the corresponding mobile device. Images of the orchard environment are captured by the camera in real time and input to the model, and non-maximum suppression is used to remove redundant detection boxes, with the IOU threshold set to 0.4 and the confidence threshold set to 0.5.
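The evaluation metrics in item 2 above can be computed from the three counts as in this sketch. The counts in the usage line are made-up numbers for illustration, not results from the invention, and returning 0.0 for an empty denominator is a convention chosen here.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 80 correct detections, 20 false alarms, 10 misses.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
```

F1 rewards a model only when both precision and recall are high, which is why it is used alongside the two individual rates.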

In summary, the real-time detection method for orchard obstacles based on an improved SSD is mainly intended for the automatic navigation of unmanned agricultural machinery in orchard environments. Based on a study of the basic principles of deep learning object detection algorithms, an obstacle detection method built on an improved SSD object detection network is proposed. Starting from the SSD detection network, the network structure and training process are improved to reduce computation, speed up detection, raise detection accuracy, and meet real-time requirements, while lowering the hardware requirements of the deep learning model so that it can be deployed on mobile devices. The method first collects video data and extracts frames at a fixed rate for annotation, producing the data set used for training. The deep learning model is initialized through transfer learning and then trained with the batch gradient descent algorithm, and the trained model is finally used for the actual obstacle detection task. The invention can detect obstacles ahead quickly and accurately in orchard environments, and is an effective means of realizing intelligent agriculture and improving its reliability.