CN107958255A


Info

Publication number: CN107958255A
Application number: CN201711164646.3A
Authority: CN
Inventors: 娄玉强; 蒋华涛; 常琳; 李庆; 陈大鹏; 薛静
Original assignee: Institute of Microelectronics of CAS
Current assignee: Institute of Microelectronics of CAS
Priority date: 2017-11-21
Filing date: 2017-11-21
Publication date: 2018-04-24

Abstract


The invention provides an image-based target detection method and device. After an image to be detected is obtained, edge information of the target object in the image is used to determine a target candidate area within the image. Feature vectors are then extracted from the target candidate area by multi-granularity scanning and input to a pre-trained cascade forest, which determines whether a target object is present in the target candidate area. Because the target candidate area is determined before feature extraction and is smaller than the whole image to be detected, far fewer feature vectors are extracted from it than would be extracted from the whole image, which increases the detection speed. Moreover, compared with a neural network model, the cascade forest model has fewer parameters, requires less training data, and achieves high detection accuracy.

Description


An image-based target detection method and device

Technical field

本申请涉及图像处理技术领域，更具体的说是涉及一种基于图像的目标检测方法及装置。The present application relates to the technical field of image processing, and more specifically relates to an image-based target detection method and device.

Background

近年来，基于图像的车辆检测成为目标检测的一个重要热点。在智能视频监控、安全辅助驾驶、人工智能以及智能交通中都有重要的应用。In recent years, image-based vehicle detection has become an important hotspot in object detection. It has important applications in intelligent video surveillance, safety assisted driving, artificial intelligence and intelligent transportation.

基于机器学习的车辆检测方法是当前的主流方法。其中，基于卷积神经网络的深度学习模型在大规模图像分类任务上获得性能提高。而且，在车辆检测问题上也取得了较好的效果。但是，神经网络模型的参数很多，因此模型非常复杂，而且，训练模型时需要大量的训练数据，训练模型的过程复杂。导致整个检测过程复杂、耗时长等。The vehicle detection method based on machine learning is the current mainstream method. Among them, the deep learning model based on convolutional neural network has achieved performance improvement on large-scale image classification tasks. Moreover, better results have been achieved on the vehicle detection problem. However, the neural network model has many parameters, so the model is very complex, and a large amount of training data is required to train the model, and the process of training the model is complicated. The whole detection process is complicated and time-consuming.

Summary of the invention

有鉴于此，本申请提供一种基于图像的目标检测方法及装置，以解决现有的目标检测方法检测过程复杂、耗时长的技术问题。其技术方案如下：In view of this, the present application provides an image-based target detection method and device to solve the technical problems of complex and time-consuming detection processes in existing target detection methods. Its technical scheme is as follows:

第一方面，本申请提供了一种基于图像的目标检测方法，包括：In a first aspect, the present application provides an image-based target detection method, including:

根据待检测图像所包含的目标物体的轮廓信息确定目标候选区域；Determine the target candidate area according to the outline information of the target object contained in the image to be detected;

利用预先训练得到的第一随机森林和第一完全随机树森林，从所述目标候选区域中提取特征向量；Using the first random forest and the first completely random tree forest obtained in advance to extract feature vectors from the target candidate area;

The feature vectors are classified using a pre-trained cascade forest to identify the target object contained in the image to be detected, where each cascade layer of the cascade forest includes a first preset number of second random forests and a first preset number of second completely random tree forests.

可选地，所述根据待检测图像所包含的目标物体的轮廓信息确定目标候选区域，包括：Optionally, the determining the target candidate area according to the outline information of the target object contained in the image to be detected includes:

利用边缘检测算法得到所述待检测图像中的目标物体的边缘图像；Using an edge detection algorithm to obtain an edge image of the target object in the image to be detected;

将所述边缘图像中连通的第二预设数量个边缘像素点划分为一组，得到多个边缘分组；Dividing a second preset number of connected edge pixels in the edge image into a group to obtain a plurality of edge groups;

Calculating the similarity between every two edge groups;

根据所述相似度确定包含所述目标物体轮廓的目标候选区域。A target candidate area including the outline of the target object is determined according to the similarity.

可选地，所述利用所述第一随机森林和所述第一完全随机树森林，从所述目标候选区域中提取特征向量，包括：Optionally, the extracting feature vectors from the target candidate regions by using the first random forest and the first completely random tree forest includes:

利用预设大小的滑动窗口扫描所述目标候选区域得到扫描特征；Scanning the target candidate area using a sliding window of a preset size to obtain scanning features;

Inputting the scanning features into the first completely random tree forest corresponding to the sliding window of the preset size to obtain a first-type feature vector;

将所述扫描特征输入至所述预设大小的滑动窗口对应的第一随机森林，得到第二类特征向量；Inputting the scanning features into the first random forest corresponding to the sliding window of the preset size to obtain the second type of feature vector;

Concatenating the first-type feature vector and the second-type feature vector to obtain a third-type feature vector;

Concatenating the third-type feature vectors corresponding to sliding windows of different sizes to obtain the feature vector of the target candidate area.

可选地，所述方法还包括：Optionally, the method also includes:

利用预设大小的滑动窗口从训练样本中提取样例；Extract samples from training samples using a sliding window of preset size;

Training a first random forest using the extracted examples, the first random forest including a fourth preset number of decision trees;

Training a first completely random tree forest using the extracted examples, the first completely random tree forest including a fourth preset number of decision trees.

可选地，所述方法还包括：Optionally, the method also includes:

将训练样本划分为生长子集和评估子集，所述生长子集与所述评估子集所包含的训练样本的数量比例满足预设比例；Dividing the training samples into a growth subset and an evaluation subset, where the ratio of the number of training samples contained in the growth subset to the evaluation subset satisfies a preset ratio;

Training the cascade layers of the cascade forest level by level using the growth subset, where each cascade layer includes two completely random tree forests and two random forests, each completely random tree forest includes a fifth preset number of decision trees, and each random forest includes a fifth preset number of decision trees;

当所述级联森林的级联层数增长后，利用所述评估子集验证当前级联森林的准确率是否提升；After the number of cascading layers of the cascading forest increases, use the evaluation subset to verify whether the accuracy of the current cascading forest is improved;

如果所述准确率没有提升，则所述级联森林的级联层数停止增加，得到最终的级联森林模型。If the accuracy rate is not improved, the number of cascaded layers of the cascaded forest stops increasing to obtain the final cascaded forest model.

第二方面，本申请还提供了一种基于图像的目标检测装置，包括：In a second aspect, the present application also provides an image-based target detection device, including:

确定单元，用于根据待检测图像所包含的目标物体的轮廓信息确定目标候选区域；A determination unit, configured to determine a target candidate area according to outline information of a target object included in the image to be detected;

特征提取单元，用于利用预先训练得到的第一随机森林和第一完全随机树森林，从所述目标候选区域中提取特征向量；A feature extraction unit, configured to extract a feature vector from the target candidate region using the first random forest and the first completely random tree forest obtained through pre-training;

识别单元，用于利用预先训练得到的级联森林对所述特征向量进行分类，识别出所述待检测图像中包含的目标物体，所述级联森林的每一级联层包括第一预设数量的第二随机森林和第一预设数量的第二完全随机树森林。A recognition unit, configured to use the pre-trained cascade forest to classify the feature vectors to identify the target object contained in the image to be detected, and each cascade layer of the cascade forest includes a first preset number of second random forests and a first preset number of second completely random tree forests.

可选地，所述确定单元，包括：Optionally, the determining unit includes:

边缘提取子单元，用于利用边缘检测算法得到所述待检测图像中的目标物体的边缘图像；The edge extraction subunit is used to obtain the edge image of the target object in the image to be detected by using an edge detection algorithm;

分组子单元，用于将所述边缘图像中连通的第二预设数量个边缘像素点划分为一组，得到多个边缘分组；A grouping subunit, configured to divide a second preset number of connected edge pixels in the edge image into a group to obtain a plurality of edge groups;

A first calculation subunit, configured to calculate the similarity between every two edge groups;

第一确定子单元，用于根据所述相似度确定包含所述目标物体轮廓的目标候选区域。The first determination subunit is configured to determine a target candidate area including the outline of the target object according to the similarity.

可选地，所述特征提取单元，包括：Optionally, the feature extraction unit includes:

扫描子单元，用于利用预设大小的滑动窗口扫描所述目标候选区域得到扫描特征；A scanning subunit, configured to scan the target candidate area using a sliding window of a preset size to obtain scanning features;

第一提取子单元，用于将所述扫描特征输入至所述预设大小的滑动窗口对应的第一完全随机树森林，得到第一类特征向量；The first extraction subunit is configured to input the scanning features into the first completely random tree forest corresponding to the sliding window of the preset size to obtain the first type of feature vector;

第二提取子单元，用于将所述扫描特征输入至所述预设大小的滑动窗口对应的第一随机森林，得到第二类特征向量；The second extraction subunit is used to input the scanning features into the first random forest corresponding to the sliding window of the preset size to obtain the second type of feature vector;

第一级联子单元，用于将所述第一类特征向量和所述第二类特征向量进行级联，得到第三类特征向量；The first cascading subunit is configured to concatenate the feature vectors of the first type and the feature vectors of the second type to obtain feature vectors of the third type;

第二级联子单元，用于将不同大小的滑动窗口对应的所述第三类特征向量进行级联，得到所述目标候选区域的特征向量。The second cascading subunit is configured to concatenate the feature vectors of the third type corresponding to sliding windows of different sizes to obtain the feature vectors of the candidate target regions.

可选地，所述装置还包括：Optionally, the device also includes:

第一提取单元，用于利用预设大小的滑动窗口从训练样本中提取样例；The first extraction unit is used to extract samples from training samples using a sliding window of a preset size;

A first training unit, configured to train a first random forest using the extracted examples, the first random forest including a fourth preset number of decision trees;

A second training unit, configured to train a first completely random tree forest using the extracted examples, the first completely random tree forest including a fourth preset number of decision trees.

可选地，所述装置还包括：Optionally, the device also includes:

样本集划分单元，用于将训练样本划分为生长子集和评估子集，所述生长子集与所述评估子集所包含的训练样本的数量比例满足预设比例；A sample set division unit, configured to divide the training samples into a growth subset and an evaluation subset, where the ratio of the number of training samples contained in the growth subset to the evaluation subset satisfies a preset ratio;

A third training unit, configured to train the cascade layers of the cascade forest level by level using the growth subset, where each cascade layer includes two completely random tree forests and two random forests, each completely random tree forest includes a fifth preset number of decision trees, and each random forest includes a fifth preset number of decision trees;

验证单元，用于当所述级联森林的级联层数增长后，利用所述评估子集验证当前级联森林的准确率是否提升，如果所述准确率没有提升，则所述级联森林的级联层数停止增加，得到最终的级联森林模型。The verification unit is used to use the evaluation subset to verify whether the accuracy of the current cascade forest is improved after the number of cascade layers of the cascade forest increases, and if the accuracy rate does not increase, the cascade forest The number of cascade layers stops increasing, and the final cascade forest model is obtained.

In the image-based target detection method provided by the present application, after the image to be detected is obtained, edge information of the target object in the image is used to determine a target candidate area within the image. Feature vectors are then extracted from the target candidate area by multi-granularity scanning and input to a pre-trained cascade forest, which determines whether a target object is present in the target candidate area. First, because the target candidate area is determined before feature extraction and is smaller than the whole image to be detected, far fewer feature vectors are extracted from it than would be extracted from the whole image, which increases the detection speed. Second, multi-granularity scanning better captures the spatial relationships between image pixels, which improves the accuracy of the detection result. In addition, compared with a neural network model, the cascade forest model has fewer parameters, requires less training data, and achieves high detection accuracy.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

图1是本申请实施例一种基于图像的目标检测方法的流程图；Fig. 1 is a flow chart of an image-based target detection method according to an embodiment of the present application;

图2是本申请实施例一种确定目标候选区域过程的流程图；FIG. 2 is a flow chart of a process of determining a target candidate area according to an embodiment of the present application;

图3是本申请实施例一种提取特征向量的过程的流程图；Fig. 3 is a flow chart of a process of extracting feature vectors according to an embodiment of the present application;

图4是本申请实施例另一种基于图像的目标检测方法的流程图；FIG. 4 is a flow chart of another image-based target detection method according to an embodiment of the present application;

图5是本申请实施例一种基于图像的目标检测装置的框图；FIG. 5 is a block diagram of an image-based target detection device according to an embodiment of the present application;

图6是本申请实施例一种确定单元的框图；FIG. 6 is a block diagram of a determining unit according to an embodiment of the present application;

图7是本申请实施例一种特征提取单元的框图；FIG. 7 is a block diagram of a feature extraction unit according to an embodiment of the present application;

图8是本申请实施例另一种基于图像的目标检测装置的框图。FIG. 8 is a block diagram of another image-based object detection device according to an embodiment of the present application.

Detailed description

下面将结合本申请实施例中的附图，对本申请实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本申请一部分实施例，而不是全部的实施例。基于本申请中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本申请保护的范围。The following will clearly and completely describe the technical solutions in the embodiments of the application with reference to the drawings in the embodiments of the application. Apparently, the described embodiments are only some of the embodiments of the application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the scope of protection of this application.

本申请提供的基于图像的目标检测方法，利用图像中目标物体的边缘信息，从待检测图像中确定出目标候选区域。然后，利用多粒度扫描技术和级联森林，完成对待检测图像中的目标物体的识别。该方法的级联森林模型简单，不需要大量的训练数据；而且，准确率高、鲁棒性高。引入目标候选区域，极大地提升了检测速度。The image-based target detection method provided in this application uses the edge information of the target object in the image to determine the target candidate area from the image to be detected. Then, using multi-granularity scanning technology and cascaded forests, the recognition of the target object in the image to be detected is completed. The cascaded forest model of this method is simple and does not require a large amount of training data; moreover, it has high accuracy and high robustness. The introduction of target candidate regions greatly improves the detection speed.

请参见图1，示出了本申请实施例一种基于图像的目标检测方法的流程图，该方法应用于服务器或终端中，本实施例以检测车辆为例进行说明，在其它实施例中，该方法可以用于检测其它轮廓规则的其它物体。Please refer to FIG. 1 , which shows a flow chart of an image-based target detection method according to an embodiment of the present application. The method is applied to a server or a terminal. This embodiment takes vehicle detection as an example for illustration. In other embodiments, The method can be used to detect other objects with regular contours.

如图1所示，该方法可以包括以下步骤：As shown in Figure 1, the method may include the following steps:

S110，根据待检测图像所包含的目标物体的轮廓信息确定目标候选区域。S110. Determine a target candidate area according to the contour information of the target object contained in the image to be detected.

本实施例中，确定目标候选区域时，先利用检测出待检测图像中的目标物体的边缘信息，再利用边缘信息确定出包含目标物体轮廓的目标候选区域。In this embodiment, when determining the target candidate area, the edge information of the target object in the image to be detected is detected first, and then the edge information is used to determine the target candidate area including the outline of the target object.

如图2所示，确定目标候选区域的过程可以包括以下步骤：As shown in Figure 2, the process of determining the target candidate area may include the following steps:

S111，利用边缘检测算法得到所述待检测图像中的目标物体的边缘图像。S111. Obtain an edge image of the target object in the image to be detected by using an edge detection algorithm.

利用结构化边缘检测算法获得待检测图像包含的边缘图像，然后，采用非极大值抑制算法进一步处理所述边缘图像，得到一个相对稀疏的边缘图像。The edge image contained in the image to be detected is obtained by using a structured edge detection algorithm, and then the edge image is further processed by a non-maximum value suppression algorithm to obtain a relatively sparse edge image.

其中，结构化边缘算法的原理是对一幅图像微分得到梯度图像，边缘正好对应梯度图像中的山脊线，边缘检测就是寻找山脊线的过程。Among them, the principle of the structured edge algorithm is to differentiate an image to obtain a gradient image, and the edge corresponds to the ridge line in the gradient image. Edge detection is the process of finding the ridge line.

非极大值抑制算法的本质是搜索局部极大值，抑制非极大值元素。物体检测中应用该算法的主要目的是消除冗余的元素，找到最佳的边缘图像。The essence of the non-maximum suppression algorithm is to search for local maximum values and suppress non-maximum value elements. The main purpose of applying this algorithm in object detection is to eliminate redundant elements and find the best edge image.

S112，将边缘图像中连通的第二预设数量个边缘像素点划分为一组，得到多个边缘分组。S112. Divide a second preset number of connected edge pixel points in the edge image into a group to obtain a plurality of edge groups.

得到边缘图像后，搜索预设数量连通的边缘，其中，第二预设数量可以根据算法需求确定。After the edge image is obtained, a preset number of connected edges is searched, wherein the second preset number can be determined according to algorithm requirements.

For example, this embodiment may use a greedy algorithm to search for 8-connected edges. The search proceeds as follows. For an edge point (for example, point A), find the edge point within its 8-neighborhood whose orientation angle differs least from that of A (for example, point B), and record the minimum orientation angle difference between A and B. Then, in the same way, search the 8-neighborhood of B, excluding A, for the edge point with the smallest orientation angle difference (for example, point C), and record the minimum orientation angle difference between B and C. Add this difference to the difference between A and B. Continue in this way until the accumulated sum of the minimum orientation angle differences exceeds π/2, at which point the search stops and one edge group has been obtained.
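The greedy search described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names, the dictionary input format, and the choice to keep the pixel that pushes the accumulated sum past π/2 inside the current group are all assumptions.

```python
import math

# Offsets (row, col) of the 8-neighborhood of a pixel.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def angle_diff(a, b):
    """Smallest difference between two edge orientations, in [0, pi/2]."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def group_edges(orient, threshold=math.pi / 2):
    """Greedily group 8-connected edge pixels.

    orient maps (row, col) -> orientation angle in radians (hypothetical input format).
    Returns a list of groups, each a list of (row, col) pixels.
    """
    unvisited = set(orient)
    groups = []
    for start in sorted(orient):
        if start not in unvisited:
            continue
        unvisited.remove(start)
        group, current, total = [start], start, 0.0
        while True:
            candidates = [(current[0] + dr, current[1] + dc) for dr, dc in NEIGHBORS]
            candidates = [p for p in candidates if p in unvisited]
            if not candidates:
                break
            # Pick the neighbor whose orientation differs least from the current pixel.
            nxt = min(candidates, key=lambda p: angle_diff(orient[current], orient[p]))
            total += angle_diff(orient[current], orient[nxt])
            unvisited.remove(nxt)
            group.append(nxt)
            current = nxt
            # Assumption: the pixel that exceeds the threshold still closes this group.
            if total > threshold:
                break
        groups.append(group)
    return groups
```

A straight run of pixels with the same orientation ends up in a single group, while a run whose orientation keeps changing is cut once the accumulated difference exceeds the threshold.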

S113，计算每两个边缘分组之间的相似度。S113. Calculate the similarity between every two edge groups.

得到多个边缘分组后，计算每两个边缘分组之间的相似度，其中计算相似度的公式如下：After obtaining multiple edge groups, calculate the similarity between every two edge groups, where the formula for calculating the similarity is as follows:

a(s_i, s_j) = |cos(θ_i − θ_ij) cos(θ_j − θ_ij)|^γ (Formula 1)

In Formula 1, a(s_i, s_j) is the similarity between s_i and s_j, where s_i and s_j are two edge groups with mean positions x_i and x_j and mean orientation angles θ_i and θ_j, respectively; θ_ij is the angle between x_i and x_j; and γ is a parameter that adjusts the sensitivity of the similarity, usually set to 2.
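As a small numerical illustration of Formula 1, the sketch below computes the similarity of two edge groups from their mean positions and mean orientation angles. It assumes that γ is applied as an exponent to the product of cosines and that θ_ij is measured with `atan2` from x_i to x_j; the function name and the (x, y) tuple format are hypothetical.

```python
import math


def affinity(pos_i, theta_i, pos_j, theta_j, gamma=2.0):
    """Similarity of two edge groups: |cos(theta_i - theta_ij) * cos(theta_j - theta_ij)| ** gamma.

    pos_i and pos_j are mean positions given as (x, y); angles are in radians.
    """
    # Angle of the line from the mean position of group i to that of group j.
    theta_ij = math.atan2(pos_j[1] - pos_i[1], pos_j[0] - pos_i[0])
    return abs(math.cos(theta_i - theta_ij) * math.cos(theta_j - theta_ij)) ** gamma
```

Two groups that lie on the same straight line and point along it have similarity 1, and the similarity falls towards 0 as either group turns perpendicular to the line joining them.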

S114，根据相似度确定包含目标物体轮廓的目标候选区域。S114. Determine a target candidate area including the outline of the target object according to the similarity.

定义ω_b(s_i)∈[0,1]，当s_i完全在区域b内时，ω_b(s_i)＝1；当s_i不在区域b内时，ω_b(s_i)＝0；ω_b(s_i)计算公式如下：Define ω_b (s_i )∈[0,1], when s_i is completely in region b, ω_b (s_i )=1; when s_i is not in region b, ω_b (s_i )=0 ; The calculation formula of ω_b (s_i ) is as follows:

Here, T is an ordered path of edge groups whose starting point is t₁ ∈ S_b and whose end point is t_|T| ∈ s_i, and |T| is the length of the path; for example, a path containing one edge group has length 1, and a path containing two edge groups has length 2.

The score of each candidate area is calculated from ω_b according to Formula 3, which is as follows:

其中，公式3中，m_i为s_i中所有像素值大小之和，b_ω和b_h分别为候选区域的宽和高，κ为补偿系数。Among them, in Formula 3, m_i is the sum of all pixel values in s_i , b_ω and b_h are the width and height of the candidate area, respectively, and κ is the compensation coefficient.
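The images for Formulas 2 and 3 are not preserved in this text. The surrounding symbol definitions (the path T, the similarity a, the magnitudes m_i, the width b_ω, the height b_h, and κ) match the scoring used in the published Edge Boxes proposal method. Assuming the patent follows that method, which the source does not state, the two formulas would read:

```latex
\omega_b(s_i) = 1 - \max_{T} \prod_{j=1}^{|T|-1} a(t_j, t_{j+1}) \qquad \text{(Formula 2)}

h_b = \frac{\sum_i \omega_b(s_i)\, m_i}{2\,(b_\omega + b_h)^{\kappa}} \qquad \text{(Formula 3)}
```

Under this reading, an edge group that is strongly connected to the boundary of region b through a chain of similar groups receives a weight near 0, and the score is the weighted edge magnitude inside b normalized by the perimeter of b raised to the power κ.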

在实际应用中，为了提高计算效率，公式3所示的得分函数可以采用公式4所示的计算方式：In practical applications, in order to improve the calculation efficiency, the scoring function shown in formula 3 can adopt the calculation method shown in formula 4:

Formula 4 gives the score corresponding to the candidate area that is centered at b and lies within b. The principle is to remove from Formula 3 the terms (1-ω_b(s_i))m_i of those edge groups s_i for which ω_b(s_i) < 1 (that is, s_i is not completely within region b) and which satisfy a further condition whose expression is not preserved in this text. This further improves computational efficiency.

The candidate areas are sorted by score from high to low, and the top preset number of candidate areas are selected as target candidate areas.
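The selection step amounts to a sort followed by a cut. A minimal sketch, assuming each candidate is stored as a hypothetical (box, score) pair:

```python
def select_top_candidates(scored_areas, k):
    """Return the k highest-scoring (box, score) pairs, best first."""
    return sorted(scored_areas, key=lambda pair: pair[1], reverse=True)[:k]
```

For instance, with three scored areas and k = 2, only the two areas with the highest scores are passed on to feature extraction.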

S120，利用预先训练得到的第一随机森林和第一完全随机树森林，从目标候选区域中提取特征向量。S120, using the pre-trained first random forest and the first completely random tree forest to extract feature vectors from target candidate regions.

为了更好地得到图像像素点之间的空间信息，采用滑动窗口的方法对图像的目标候选区域进行多粒度扫描(即，采用不同大小的滑动窗口进行扫描)，从而提高了检测准确率。In order to better obtain the spatial information between image pixels, the sliding window method is used to scan the target candidate area of the image at multiple granularities (that is, using sliding windows of different sizes to scan), thereby improving the detection accuracy.

A sliding window of a preset size scans the target candidate area with a preset stride to obtain scanning features. These scanning features are input to the pre-trained first random forest and first completely random tree forest, which output candidate feature vectors. The candidate feature vectors output by the first random forest and by the first completely random tree forest are then concatenated, and the concatenated feature vectors for the different sliding windows are concatenated again to obtain the final feature vector that is input to the cascade forest.

A random forest, as the name suggests, is a forest built in a random manner and composed of many decision trees. No two decision trees in a random forest are correlated with each other.

The completely random tree forest and the random forest are two different types of forest. Each decision tree of the first type, which makes up the completely random tree forest, randomly selects 1 candidate feature as its training feature. Each decision tree of the second type, which makes up the random forest, randomly selects a subset of the input candidate features as training features, where d is the number of input candidate features; the expression for the subset size is not preserved in this text (√d is the usual choice for a random forest). Using these two types of forest improves the accuracy of feature vector extraction.

请参见图3，示出了本申请实施例一种提取特征向量的过程的流程图，该流程图可以包括：Referring to FIG. 3 , it shows a flow chart of a process for extracting feature vectors according to an embodiment of the present application. The flow chart may include:

S121，利用预设大小的滑动窗口扫描目标候选区域，得到扫描特征。S121. Scan the target candidate area with a sliding window of a preset size to obtain scanning features.

滑动窗口的大小需要根据待检测图像的大小确定，而且，滑动窗口的大小与待检测图像的大小正相关；即，待检测图像越大，滑动窗口也越大；待检测图像越小，则滑动窗口也越小。例如，根据待检测图像确定三个滑动窗口大小分别40×40，80×80和160×160。The size of the sliding window needs to be determined according to the size of the image to be detected, and the size of the sliding window is positively correlated with the size of the image to be detected; that is, the larger the image to be detected, the larger the sliding window; the smaller the image to be detected, the sliding The window is also smaller. For example, according to the image to be detected, the sizes of three sliding windows are determined to be 40×40, 80×80 and 160×160, respectively.
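The scanning step can be sketched as a generator that slides a square window over a region with a fixed stride. This is an illustration only: the function name, the nested-list image format, and the row-major flattening of each window are assumptions.

```python
def sliding_windows(region, size, stride):
    """Yield (row, col, patch) for every size x size window of a 2-D region.

    region is a list of equal-length rows; patch is the window flattened row by row.
    """
    rows, cols = len(region), len(region[0])
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patch = [region[r + i][c + j] for i in range(size) for j in range(size)]
            yield r, c, patch
```

On a 4×4 region, a 2×2 window with stride 2 produces 4 windows, and the same window with stride 1 produces 9.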

S122，将该扫描特征输入到该滑动窗口对应的第一完全随机树森林，得到第一类特征向量。S122. Input the scanning feature into the first completely random tree forest corresponding to the sliding window to obtain the first type of feature vector.

每种滑动窗口对应一种第一完全随机树森林和一种第一随机森林。例如，40×40的滑动窗口对应一个第一完全随机树森林和一个第一随机森林；80×80的滑动窗口对应一个第一完全随机树森林和一个第一随机森林。Each sliding window corresponds to a first completely random tree forest and a first random forest. For example, a 40×40 sliding window corresponds to a first completely random tree forest and a first random forest; an 80×80 sliding window corresponds to a first completely random tree forest and a first random forest.

扫描特征输入至第一完全随机树森林进行判断后，输出第一类特征向量。After the scanning features are input to the first completely random tree forest for judgment, the first type of feature vector is output.

S123，将该扫描特征输入到该滑动窗口对应的第一随机森林，得到第二类特征向量。S123. Input the scanning feature into the first random forest corresponding to the sliding window to obtain a feature vector of the second type.

将上述扫描特征输入到该滑动窗口对应的第一随机森林进行判断后，输出第二类特征向量。After inputting the above scanning features into the first random forest corresponding to the sliding window for judgment, the second type of feature vector is output.

S124，将第一类特征向量和第二类特征向量进行级联，得到第三类特征向量。S124. Concatenate the feature vectors of the first type and the feature vectors of the second type to obtain feature vectors of the third type.

将同一滑动窗口对应的第一类特征向量和第二类特征向量进行级联，得到第三类特征向量。The feature vectors of the first type and the feature vectors of the second type corresponding to the same sliding window are concatenated to obtain the feature vectors of the third type.

S125，将不同大小的滑动窗口对应的第三类特征向量进行级联，得到目标候选区域的特征向量。S125. Concatenate the feature vectors of the third type corresponding to the sliding windows of different sizes to obtain feature vectors of the target candidate area.

Finally, the third-type feature vectors corresponding to sliding windows of different sizes are concatenated to obtain the final feature vector.

For example, the first-type and second-type feature vectors corresponding to the 40×40 sliding window are concatenated to obtain the third-type feature vector for the 40×40 sliding window; the third-type feature vectors for the 80×80 and 160×160 sliding windows are obtained in the same way; finally, the third-type feature vectors for the three sliding windows are concatenated to obtain the final feature vector.
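The two levels of concatenation in steps S121 to S125 can be sketched as follows. The forests are represented by hypothetical stand-in callables that map one flattened window to a short vector; treating that vector as a per-class probability estimate is an assumption, since the source only says that each forest outputs a feature vector.

```python
def patches(region, size, stride):
    """Yield every size x size window of a 2-D region, flattened row by row."""
    rows, cols = len(region), len(region[0])
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            yield [region[r + i][c + j] for i in range(size) for j in range(size)]


def multi_grained_features(region, forests, stride):
    """Build the final feature vector of one target candidate area.

    forests maps a window size to a pair of callables
    (completely_random_tree_forest, random_forest), each taking a flattened
    window and returning a list of numbers.
    """
    final = []
    for size in sorted(forests):
        completely_random_forest, random_forest = forests[size]
        first_type, second_type = [], []
        for patch in patches(region, size, stride):
            first_type.extend(completely_random_forest(patch))
            second_type.extend(random_forest(patch))
        # Third-type vector: both forest outputs for this window size.
        third_type = first_type + second_type
        # Final vector: third-type vectors of all window sizes, in order of size.
        final.extend(third_type)
    return final
```

With trained classifiers, each stand-in would be replaced by a call that returns the forest's output for the window, and the resulting vector would be passed to the cascade forest.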

S130，利用预先训练得到的级联森林对所述特征向量进行分类，识别出所述待检测图像中包含的目标物体。S130, using the pre-trained cascade forest to classify the feature vector, and identify the target object contained in the image to be detected.

为了更好地挖掘出图像数据中的深层信息，采用一种基于决策树的多层级联结构，构成级联森林，特征向量通过级联森林的不断挖掘信息，得到最终的分类结果。In order to better excavate the deep information in the image data, a multi-layer cascade structure based on decision trees is adopted to form a cascade forest, and the feature vectors are continuously mined through the cascade forest to obtain the final classification result.

Each cascade layer of the cascade forest includes a first preset number of second random forests and a first preset number of second completely random tree forests. In theory, the larger the first preset number, the higher the accuracy; however, more forests also mean more computation. A value that balances accuracy against computation is therefore usually chosen, for example, 1200 trees.

第二完全随机树森林与上述的第一完全随机树森林的类型相同，但是所包含的决策树的数量不同；同理，第二随机森林与第一随机森林的决策树数量也不同。The type of the second completely random tree forest is the same as that of the above-mentioned first completely random tree forest, but the number of decision trees included is different; similarly, the number of decision trees in the second random forest is also different from that of the first random forest.

级联森林也需要根据训练样本训练得到，其中，级联森林的训练过程如下：The cascade forest also needs to be trained according to the training samples. The training process of the cascade forest is as follows:

将训练样本划分成生长子集和评估子集，利用生长子集训练得到一个级联层结构，在生长一个级联层之后，利用评估子集验证当前得到的级联森林的准确率；如果生长一个级联层之后，与生长该级联层之前相比，准确率没有明显提升，则停止生长，从而得到最终的级联森林模型。Divide the training samples into a growth subset and an evaluation subset, and use the growth subset to train to obtain a cascade layer structure. After growing a cascade layer, use the evaluation subset to verify the accuracy of the currently obtained cascade forest; if the growth After a cascade layer, compared with before growing the cascade layer, the accuracy rate is not significantly improved, then the growth is stopped, so as to obtain the final cascade forest model.

In the image-based target detection method provided by this embodiment, after the image to be detected is obtained, the edge information of the target object in the image is used to determine a target candidate region from the image. Feature vectors are then extracted from the target candidate region by multi-granularity scanning, and the extracted feature vectors are input to the pre-trained cascade forest for recognition, which determines whether a target object is present in the target candidate region. First, because the target candidate region is determined before feature extraction and is smaller than the entire image to be detected, far fewer feature vectors are extracted from the candidate region than would be extracted from the entire image, which increases detection speed. Second, multi-granularity scanning better captures the spatial relationships between image pixels, which improves detection accuracy. In addition, compared with a neural network model, the cascade forest model has fewer parameters, requires less training data, and achieves high detection accuracy.

Referring to FIG. 4, a flowchart of another image-based target detection method according to an embodiment of the present application is shown. This embodiment focuses on training the first completely random tree forest and the first random forest used for multi-granularity scanning, and on training the cascade forest. Training needs to be performed only once; after training is complete, the trained models can be used directly.

As shown in FIG. 4, in addition to the steps of the embodiment shown in FIG. 1, the method further includes:

S210: extract examples from the training samples using a sliding window of a preset size.

The preset size may be determined in advance according to the size of the sample images. For example, a 40×40 sliding window is used to extract examples from the training samples: examples extracted from positive samples are positive examples, and examples extracted from negative samples are negative examples. A positive sample is a sample image that contains the target object; a negative sample is a sample image that does not.
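Step S210 can be sketched as a plain sliding-window crop in which every patch inherits the label of the sample image it came from. The function name `extract_examples`, the `stride` parameter, and the tiny 6×6 toy image are assumptions for illustration; the specification states only the window size (for example 40×40) and does not specify a stride.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def extract_examples(
    image: Image, window: int, stride: int, label: int
) -> List[Tuple[Image, int]]:
    """Slide a window x window square over the image and collect labeled patches.

    label is 1 for a positive sample (contains the target object)
    and 0 for a negative sample.
    """
    rows, cols = len(image), len(image[0])
    examples: List[Tuple[Image, int]] = []
    for top in range(0, rows - window + 1, stride):
        for left in range(0, cols - window + 1, stride):
            patch = [row[left:left + window] for row in image[top:top + window]]
            examples.append((patch, label))
    return examples


# Toy 6x6 "positive sample"; a real call would use window=40 on a larger image.
positive_image = [[r * 6 + c for c in range(6)] for r in range(6)]
positive_examples = extract_examples(positive_image, window=4, stride=2, label=1)
```

A stride of 1 would give the densest scan; a larger stride trades coverage for fewer examples.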

S220: use the examples to train the first random forest and the first completely random tree forest, respectively.

The first completely random tree forest and the first random forest used for feature extraction perform only a coarse screening of features. Each level therefore contains relatively few forests (for example, one first completely random tree forest and one first random forest per level), and each forest contains relatively few decision trees (for example, only 40 decision trees per forest).

After the examples are extracted, they are used to train the first completely random tree forest and the first random forest, respectively. For example, the first completely random tree forest may include 40 first-type decision trees, and the first random forest may include 40 second-type decision trees.

Once trained, the first random forest and the first completely random tree forest can be used directly in S120 to extract feature vectors from the target candidate region; the feature vectors extracted by the two forests are then concatenated.

If first random forests and first completely random tree forests exist for sliding windows of different sizes, the feature vectors corresponding to each window size are concatenated to obtain the final feature vector.

S230: divide the training samples into a growth subset and an evaluation subset, where the ratio of the number of training samples in the growth subset to that in the evaluation subset satisfies a preset ratio.

To train the cascade forest, part of the training sample set is first taken as the growth subset and the remainder as the evaluation subset. For example, the ratio of the growth subset to the evaluation subset may be 3:1, that is, 75% of the samples form the growth subset and the remaining 25% form the evaluation subset. The growth subset is used to train the cascade forest, and the evaluation subset is used to check the accuracy of the cascade forest obtained so far.
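The 3:1 split in S230 might look like the sketch below. The function name `split_growth_eval`, the shuffling step, and the fixed seed are assumptions; the specification states only the ratio, not how samples are assigned to each subset.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_growth_eval(
    samples: Sequence[T], growth_parts: int = 3, eval_parts: int = 1, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the samples and split them growth_parts : eval_parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = len(shuffled) * growth_parts // (growth_parts + eval_parts)
    return shuffled[:cut], shuffled[cut:]


# 100 placeholder sample ids: 75 go to the growth subset, 25 to evaluation.
growth_subset, evaluation_subset = split_growth_eval(range(100))
```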

S240: train the cascade layers of the cascade forest one level at a time using the growth subset.

Each cascade layer includes a sixth preset number of completely random tree forests and the same sixth preset number of random forests, and each completely random tree forest and each random forest includes a fifth preset number of decision trees.

The cascade forest used for recognition and classification must judge and classify feature vectors accurately. The more forests each level of such a cascade forest contains, the higher the accuracy of the recognition result; however, more forests also mean more computation and slower processing. The number of forests is therefore chosen at the balance point between accuracy and speed, for example 2, that is, each cascade layer includes two second completely random tree forests and two second random forests. The number of decision trees in each forest is determined by the same trade-off.

For example, each cascade layer may include two second completely random tree forests and two second random forests, each of which includes 1200 decision trees.
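To make the size difference between the two stages concrete, the example figures given in the text can be tallied as below. The `ForestGroup` class is a hypothetical container introduced only for this arithmetic; the numbers themselves (1 + 1 forests of 40 trees for scanning, 2 + 2 forests of 1200 trees per cascade layer) are the examples stated in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForestGroup:
    """Hypothetical description of one group of forests."""

    completely_random_forests: int
    random_forests: int
    trees_per_forest: int

    @property
    def forest_count(self) -> int:
        return self.completely_random_forests + self.random_forests

    @property
    def tree_count(self) -> int:
        return self.forest_count * self.trees_per_forest


# Example figures from the text: scanning stage for one window size,
# and one cascade layer.
scan_stage = ForestGroup(completely_random_forests=1, random_forests=1, trees_per_forest=40)
cascade_layer = ForestGroup(completely_random_forests=2, random_forests=2, trees_per_forest=1200)
```

Under these example figures, one cascade layer holds 4800 trees against 80 for one scanning window, which reflects the coarse-screening versus accurate-classification roles described above.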

S250: after the number of cascade layers increases, use the evaluation subset to verify whether the accuracy of the current cascade forest has improved. If it has not, execute S260; if it has, return to S240.

During training, each time a cascade layer is added, the evaluation subset is used to verify whether the accuracy of the current cascade forest is significantly higher than that of the cascade forest before the layer was added.

The class of every sample in the evaluation subset is known: samples that include the target object are positive samples, and samples that do not are negative samples. The feature vector of each sample in the evaluation subset is input to the cascade forest obtained so far, and the classification results are read from its output; these results are then compared with the known labels to obtain the detection accuracy of the cascade forest.
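The accuracy check described above amounts to comparing predicted classes with known labels. A minimal sketch, assuming labels are encoded as 1 (positive) and 0 (negative) and that accuracy means the fraction of correctly classified samples; the specification does not define the metric more precisely, and the function name `detection_accuracy` is an assumption:

```python
from typing import Sequence


def detection_accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of evaluation samples whose predicted class matches the known label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("evaluation subset is empty")
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


# Toy evaluation subset: 3 of 4 predictions match the known labels.
acc = detection_accuracy(predicted=[1, 0, 1, 1], expected=[1, 0, 0, 1])
```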

Finally, the accuracy of the current cascade forest is compared with that of the cascade forest with one fewer cascade layer, to judge whether growing the layer has significantly improved accuracy.

S260: stop increasing the number of cascade layers, and obtain the final cascade forest model.

If the accuracy of the cascade forest does not improve significantly after a cascade layer is grown, growth stops, that is, no further cascade layers are added.
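The loop formed by S240 to S260 can be sketched as follows. Several details are assumptions: the specification does not quantify "significantly improved", so `min_gain` is a hypothetical threshold; `max_layers` is a hypothetical safety cap; and the sketch discards the layer that failed to improve accuracy, although keeping it would be an equally plausible reading of the text. The `train_layer` and `evaluate` callables stand in for training on the growth subset and scoring on the evaluation subset.

```python
from typing import Callable, List, Optional


def grow_cascade(
    train_layer: Callable[[int], object],
    evaluate: Callable[[List[object]], float],
    min_gain: float = 0.001,
    max_layers: int = 20,
) -> List[object]:
    """Add cascade layers one at a time until accuracy stops improving."""
    layers: List[object] = []
    best: Optional[float] = None
    while len(layers) < max_layers:
        candidate = layers + [train_layer(len(layers))]  # S240: grow one layer
        accuracy = evaluate(candidate)                    # S250: check on evaluation subset
        if best is not None and accuracy - best < min_gain:
            break                                         # S260: stop growing
        layers, best = candidate, accuracy
    return layers


# Stub accuracies per layer count, standing in for real training and evaluation.
stub_scores = [0.80, 0.90, 0.9005, 0.95]
cascade = grow_cascade(
    train_layer=lambda index: f"layer-{index}",
    evaluate=lambda layers: stub_scores[len(layers) - 1],
)
```

In this toy run the third layer raises accuracy by only 0.0005, below the assumed threshold, so growth stops with two layers.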

In the image-based target detection method provided by this embodiment, after the image to be detected is obtained, the edge information of the target object in the image is used to determine a target candidate region from the image. Feature vectors are then extracted from the target candidate region by multi-granularity scanning, and the extracted feature vectors are input to the pre-trained cascade forest for recognition, which determines whether a target object is present in the target candidate region. First, because the target candidate region is determined before feature extraction and is smaller than the entire image to be detected, far fewer feature vectors are extracted from the candidate region than would be extracted from the entire image, which increases detection speed. Second, multi-granularity scanning better captures the spatial relationships between image pixels, which improves detection accuracy. In addition, compared with a neural network model, the cascade forest model has fewer parameters, requires less training data, and achieves high detection accuracy.

For brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in a different order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.

Corresponding to the above embodiments of the image-based target detection method, the present application also provides embodiments of an image-based target detection device.

Referring to FIG. 5, a block diagram of an image-based target detection device according to an embodiment of the present application is shown. The device may be applied in a server or a terminal to identify a target object in an image. As shown in FIG. 5, the device includes a determining unit 110, a feature extraction unit 120, and a recognition unit 130.

The determining unit 110 is configured to determine a target candidate region according to the contour information of the target object contained in the image to be detected.

In this embodiment, when determining the target candidate region, the determining unit first detects the edge information of the target object in the image to be detected, and then uses the edge information to determine a target candidate region containing the contour of the target object.

As shown in FIG. 6, the determining unit may include an edge extraction subunit 111, a grouping subunit 112, a first calculation subunit 113, and a first determination subunit 114.

The edge extraction subunit 111 is configured to obtain an edge image of the target object in the image to be detected using an edge detection algorithm.

The grouping subunit 112 is configured to divide every second preset number of connected edge pixels in the edge image into one group, obtaining a plurality of edge groups.

The first calculation subunit 113 is configured to calculate the similarity between every two edge groups.

The similarity between two edge groups can be calculated using Formula 1, which is not repeated here.

The first determination subunit 114 is configured to determine, according to the similarity, a target candidate region containing the contour of the target object.

The score of each candidate region is calculated using Formulas 2 to 4 above; the higher the score, the more likely the region is to be selected as a target candidate region.

The feature extraction unit 120 is configured to extract feature vectors from the target candidate region using the pre-trained first random forest and first completely random tree forest.

As shown in FIG. 7, the feature extraction unit 120 may include a scanning subunit 121, a first extraction subunit 122, a second extraction subunit 123, a first concatenation subunit 124, and a second concatenation subunit 125.

The scanning subunit 121 is configured to scan the target candidate region with a sliding window of a preset size to obtain scanning features.

The size of the sliding window is positively correlated with the size of the image to be detected: the larger the image, the larger the sliding window, and the smaller the image, the smaller the sliding window.

The first extraction subunit 122 is configured to input the scanning features to the first completely random tree forest corresponding to the sliding window of the preset size, to obtain a first-type feature vector.

For example, if the sliding window is 80×80, the scanning features obtained with that window are input to the first completely random tree forest to obtain the first-type feature vector.

The second extraction subunit 123 is configured to input the scanning features to the first random forest corresponding to the sliding window of the preset size, to obtain a second-type feature vector.

For example, the scanning features obtained with the 80×80 sliding window are input to the first random forest to obtain the second-type feature vector.

A random forest is a forest built in a random manner and composed of many decision trees. There is no dependency between any two decision trees in a random forest.

The first concatenation subunit 124 is configured to concatenate the first-type feature vector and the second-type feature vector to obtain a third-type feature vector.

The first-type and second-type feature vectors corresponding to the 80×80 sliding window are concatenated to obtain the third-type feature vector for that window. The third-type feature vectors for sliding windows of other sizes are obtained in the same way.

The second concatenation subunit 125 is configured to concatenate the third-type feature vectors corresponding to sliding windows of different sizes, to obtain the feature vector of the target candidate region.

For example, the third-type feature vectors corresponding to the three sliding window sizes 40×40, 80×80, and 160×160 are concatenated to form the feature vector of the target candidate region.

The recognition unit 130 is configured to classify the feature vector using the pre-trained cascade forest, and identify the target object contained in the image to be detected.

The image-based target detection device provided by this embodiment, after obtaining the image to be detected, uses the edge information of the target object in the image to determine a target candidate region from the image. Feature vectors are then extracted from the target candidate region by multi-granularity scanning, and the extracted feature vectors are input to the pre-trained cascade forest for recognition, which determines whether a target object is present in the target candidate region. First, because the target candidate region is determined before feature extraction and is smaller than the entire image to be detected, far fewer feature vectors are extracted from the candidate region than would be extracted from the entire image, which increases detection speed. Second, multi-granularity scanning better captures the spatial relationships between image pixels, which improves detection accuracy. In addition, compared with a neural network model, the cascade forest model has fewer parameters, requires less training data, and achieves high detection accuracy.

Referring to FIG. 8, a block diagram of another image-based target detection device according to an embodiment of the present application is shown. This embodiment focuses on training the first completely random tree forest and the first random forest used for multi-granularity scanning, and on training the cascade forest.

As shown in FIG. 8, in addition to the units of the embodiment shown in FIG. 5, the device further includes:

The first extraction unit 210 is configured to extract examples from the training samples using a sliding window of a preset size.

The preset size may be determined in advance according to the size of the sample images. The sliding window is then used to extract examples from the training samples: examples extracted from positive samples are positive examples, and examples extracted from negative samples are negative examples. A positive sample is a sample image that contains the target object; a negative sample is a sample image that does not.

The first training unit 220 is configured to train the first random forest using the examples.

After the examples are extracted, they are used to train the first random forest. For example, the first random forest may include 40 second-type decision trees.

The second training unit 230 is configured to train the first completely random tree forest using the examples.

The first completely random tree forest includes a fourth preset number of decision trees.

After the examples are extracted, they are used to train the first completely random tree forest, which may include, for example, 40 first-type decision trees.

The two forests trained by the first training unit and the second training unit are used to extract feature vectors from the target candidate region.

The sample set dividing unit 240 is configured to divide the training samples into a growth subset and an evaluation subset.

The ratio of the number of training samples in the growth subset to that in the evaluation subset satisfies a preset ratio.

Part of the training sample set is taken as the growth subset and the remainder as the evaluation subset. For example, the ratio of the growth subset to the evaluation subset may be 3:1.

The third training unit 250 is configured to train the cascade layers of the cascade forest one level at a time using the growth subset.

Each cascade layer includes a sixth preset number of completely random tree forests and the same sixth preset number of random forests; each completely random tree forest includes a fifth preset number of decision trees, and each random forest includes a fifth preset number of decision trees.

The verification unit 260 is configured to use the evaluation subset, after the number of cascade layers of the cascade forest increases, to verify whether the accuracy of the current cascade forest has improved; if the accuracy has not improved, the number of cascade layers stops increasing and the final cascade forest model is obtained.

The image-based target detection device provided by this embodiment first uses the sample set to train the first completely random tree forest and the first random forest used for feature vector extraction, and then uses the sample set to train the cascade forest used to identify the target object. Training needs to be performed only once; after training is complete, the training results can be used directly. First, because the target candidate region is determined before feature extraction and is smaller than the entire image to be detected, far fewer feature vectors are extracted from the candidate region than would be extracted from the entire image, which increases detection speed. Second, multi-granularity scanning better captures the spatial relationships between image pixels, which improves detection accuracy. In addition, compared with a neural network model, the cascade forest model has fewer parameters, requires less training data, and achieves high detection accuracy.

The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, the device is described relatively briefly; for related details, refer to the description of the method.

Claims

1. An image-based target detection method, characterized by comprising:

2. The method according to claim 1, characterized in that determining the target candidate region according to the contour information of the target object contained in the image to be detected comprises:

3. The method according to claim 2, characterized in that extracting feature vectors from the target candidate region using the first random forest and the first completely random tree forest comprises:

4. The method according to claim 1, characterized in that the method further comprises:

5. The method according to claim 1, characterized in that the method further comprises:

6. An image-based target detection device, characterized by comprising:

7. The device according to claim 6, characterized in that the determining unit comprises:

8. The device according to claim 7, characterized in that the feature extraction unit comprises:

9. The device according to claim 6, characterized in that the device further comprises:

10. The device according to claim 6, characterized in that the device further comprises: