
An intelligent target detection and size measurement method based on human visual model

Info

Publication number
CN107392929B
Authority
CN
China
Prior art keywords: point, points, eye image, image, seed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710580068.5A
Other languages
Chinese (zh)
Other versions
CN107392929A (en)
Inventor
李庆武
周亚琴
马云鹏
邢俊
许金鑫
席淑雅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU
Priority to CN201710580068.5A
Publication of CN107392929A
Application granted
Publication of CN107392929B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses an intelligent target detection and size measurement method based on a human visual model. Human-eye saliency detection is combined with a binocular vision imaging model to detect the salient edges of the target and feature points constrained by spatial information, which serve as seed points; a multi-feature fusion growth mechanism then completes target detection, and finally the target's dimensions are measured with the binocular vision model. The proposed system can accurately detect and measure multiple types of targets, with a measurement error rate below 2%, meeting practical target detection and measurement requirements.

Description

An intelligent target detection and size measurement method based on a human visual model

Technical Field

The invention relates to an intelligent target detection and size measurement method based on a human visual model, belonging to the fields of digital image processing, target detection, and size measurement.

Background Art

Size is one of the important characteristics of an object, so size measurement has become an important technology in production and daily life, for example workpiece measurement, measurement of large cultural relics and buildings, and vehicle measurement. Existing dimensional measurement methods fall into the following categories: coordinate measuring machines (CMM), theodolite measurement systems, total station measurement systems, articulated coordinate measuring machines, indoor global positioning systems (GPS), laser tracking measurement systems, acoustic ranging systems, laser ranging systems, manual measurement, and machine vision measurement systems. Traditional target size measurement methods, however, are easily constrained by object shape, background, and other factors. CMMs, total stations, articulated coordinate machines, manual measurement, and laser tracking are all contact measurement methods, unsuitable for objects or materials that must not be touched, so their scope of use is limited; acoustic ranging and laser ranging are constrained by measurement range and are difficult to use at long distances. Machine vision offers significant advantages such as non-contact operation, fast measurement speed, high measurement accuracy, and strong real-time performance, and is one of the research hotspots in the field of dimensional measurement.

Machine vision measurement methods can be divided, according to the number and type of vision sensors, into monocular, binocular, multi-camera, infrared, ultraviolet, and hybrid vision. Monocular vision often serves various fields as the simplest and quickest image acquisition method, but the visual information it obtains is rather limited: it carries little information, cannot convey three-dimensional structure, has a restricted field of view, and is easily affected by the external environment. Binocular vision, by contrast, simulates the human binocular visual system and uses the disparity between the left eye and right eye images to obtain three-dimensional information about the environment and the target. Its advantages are twofold: 1) by simulating the human binocular system it can recover the target's depth information on top of its two-dimensional information, completing the target's spatial positioning and three-dimensional measurement; 2) the human binocular visual system, shaped by natural selection, is well suited to acquiring and processing multi-source images.

Existing binocular vision measurement methods achieve high accuracy, but they concentrate on improving measurement precision and neglect the target detection step. In a real measurement environment the target appears against varying backgrounds, and accurately detecting it is an important prerequisite for size measurement. Completing target detection before size measurement, that is, achieving intelligent target detection and size measurement, is therefore one of the problems that urgently needs to be solved at this stage.

Summary of the Invention

The technical problem to be solved by the present invention is that existing size measurement methods ignore the target detection step and therefore have a narrow range of application.

To solve the above technical problem, the present invention provides an intelligent target detection and size measurement method based on a human visual model, comprising the following steps:

1) Acquire the binocular images to be measured: with the target framed as the complete individual closest to the camera, use a calibrated binocular camera to capture a pair of left eye and right eye images as the images to be measured;

2) Extract effective salient points: use the difference-of-Gaussians algorithm to extract the salient point map of the left eye image, compute the mean of all salient points whose saliency value is greater than 0, take this mean as the screening threshold, and extract the pixels whose saliency exceeds the threshold as the image's salient points;

3) Extract salient edges: obtain the edge image with the Canny edge detection algorithm, count the salient points on each edge, sort the edge lines in descending order of salient-point count, and keep the top a% of edges as salient edges; a is taken as 10;

4) Register the left eye image and the right eye image: detect matching point pairs of the binocular images with the SURF algorithm, select matching pairs by minimizing the Euclidean distance between feature points, then screen the feature points by the slope between matched pairs, removing abnormal pairs and keeping the valid matching pairs;

5) Extract feature points constrained by spatial information: compute the disparity of all valid matching points, sort the matches in descending order of disparity, and extract the top c% as the feature points constrained by spatial information; c is taken as 10;

6) Target detection: rapidly grow the target region with a multi-feature fusion method combining saliency information, spatial information, and color information to complete target detection;

7) Measure size with the binocular vision model: obtain the two point pairs with the largest horizontal and vertical distances in the two-dimensional image of the detected target, compute the three-dimensional coordinates of the two pairs from the binocular disparity of the binocular vision model and the camera's intrinsic and extrinsic parameters, and take the Euclidean distances of the horizontal and vertical point pairs as the target's length and height, completing the size measurement of the target.

Compared with the prior art, the beneficial effect of the present invention is as follows. The invention first combines human visual saliency detection with the binocular three-dimensional perception model, extracting salient edges and feature points constrained by spatial information as seed points; it then completes target detection with a multi-feature fusion growth mechanism driven by saliency, spatial, and color information, and finally computes the target size from the binocular vision model. Automatic detection and size measurement of multiple types of targets is achieved, the range of application is greatly broadened, and the measurement error stays below 2%, meeting practical measurement needs.

Brief Description of the Drawings

Fig. 1 is a flow chart of the intelligent target detection and size measurement method;

Fig. 2 is a schematic diagram of salient edge extraction;

Fig. 3 is a schematic diagram of feature point extraction constrained by spatial information;

Fig. 4 is a schematic diagram of the multi-feature fusion growth mechanism;

Fig. 5 is a schematic diagram of the growth strategy;

Fig. 6 is a schematic diagram of the object size measurement method;

Fig. 7 is a schematic diagram of the target size measurement results.

Detailed Description

Fig. 1 is the flow chart of the intelligent target detection and size measurement method of the present invention. First the binocular camera is calibrated with Zhang's calibration method. Then, with the target framed as the complete individual closest to the camera, the calibrated binocular camera captures the left eye and right eye images of the target to be measured. The difference-of-Gaussians (DoG) algorithm is combined with the Canny algorithm to compute the salient edges, an improved SURF algorithm matches the left eye and right eye images, and the feature points constrained by spatial information are obtained from the disparity. With the salient edges and the spatially constrained feature points as seed points, color similarity is used to detect the target quickly, and the binocular vision model is used to compute the target size.

The specific steps of the present invention are as follows:

(1) Calibrate the binocular camera with Zhang's calibration method. Shoot several groups of binocular images of a 7×7 checkerboard model, feed them into the MATLAB calibration toolbox, complete the camera calibration, and obtain the intrinsic and extrinsic parameters of the binocular camera.
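This calibration step maps directly onto OpenCV as well. The sketch below is a minimal illustration, not the patent's implementation: it assumes the 7×7-square board gives a 6×6 inner-corner grid, assumes a 25 mm square size (not stated in the patent), and uses hypothetical file names left_*.png / right_*.png; the MATLAB toolbox mentioned above would produce equivalent intrinsic and extrinsic parameters.

```python
import glob
import cv2
import numpy as np

PATTERN = (6, 6)      # inner corners of a 7x7-square checkerboard (assumption)
SQUARE_MM = 25.0      # square size in mm (assumption, not stated in the patent)

# 3D corner coordinates of the board in its own plane (z = 0).
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_pts, left_pts, right_pts = [], [], []
for lf, rf in zip(sorted(glob.glob("left_*.png")), sorted(glob.glob("right_*.png"))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    okl, cl = cv2.findChessboardCorners(gl, PATTERN)
    okr, cr = cv2.findChessboardCorners(gr, PATTERN)
    if okl and okr:
        obj_pts.append(objp)
        left_pts.append(cl)
        right_pts.append(cr)

# Per-camera intrinsics first, then the stereo extrinsics R, T between the views.
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, gl.shape[::-1], None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, gr.shape[::-1], None, None)
_, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, d1, K2, d2, gl.shape[::-1],
    flags=cv2.CALIB_FIX_INTRINSIC)
print("baseline b (mm):", np.linalg.norm(T))
```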

(2) With the target framed as the complete individual closest to the camera, shoot the target with the calibrated binocular camera to obtain its left eye and right eye images, and transfer them over a USB interface to a PC for display and storage.

(3) As shown in Fig. 2(b), extract the difference-of-Gaussians salient point map of the left eye image. The DoG operator excites the local central region and suppresses the surrounding neighborhood, which matches the visual characteristics of the human eye and therefore reflects the saliency of the image to a certain extent. The initial DoG salient point map of the left eye image is calculated as follows:

$$\mathrm{DoG}(x,y)=\left[\frac{1}{2\pi\sigma_1^2}\,e^{-\frac{x^2+y^2}{2\sigma_1^2}}-\frac{1}{2\pi\sigma_2^2}\,e^{-\frac{x^2+y^2}{2\sigma_2^2}}\right]*I$$

where σ1 and σ2 denote the excitation bandwidth and the inhibition bandwidth respectively (σ1 = 0.6 and σ2 = 0.9 herein), I is the grayscale image, * denotes convolving the filter with the image, and DoG(x, y) is the resulting saliency measure at pixel (x, y);

Negative values of the saliency measure DoG(x, y) are set to 0, and the mean saliency is taken as the threshold T for extracting salient points, giving the salient point map D(x, y):

$$D(x,y)=\begin{cases}\mathrm{DoG}(x,y),&\mathrm{DoG}(x,y)>T\\ 0,&\text{otherwise}\end{cases}$$

$$T=\frac{\mathrm{sum}(\mathrm{DoG}>0)}{\mathrm{count}(\mathrm{DoG}>0)}$$

where count(DoG > 0) is the number of points whose DoG saliency exceeds 0 and sum(DoG > 0) is the sum of those saliency values.
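As a minimal sketch of this DoG saliency step (the convolution is realized with two Gaussian blurs; only the sigma values come from the text above):

```python
import cv2
import numpy as np

def dog_salient_points(gray, sigma1=0.6, sigma2=0.9):
    """Excitation-minus-inhibition DoG response, thresholded at the mean of the
    positive responses, as in the formulas above."""
    g = gray.astype(np.float32)
    # ksize (0, 0) lets OpenCV derive the kernel size from sigma.
    dog = cv2.GaussianBlur(g, (0, 0), sigma1) - cv2.GaussianBlur(g, (0, 0), sigma2)
    dog[dog < 0] = 0.0                            # negative saliency clipped to 0
    pos = dog > 0
    t = dog[pos].sum() / max(int(pos.sum()), 1)   # T = mean positive saliency
    return np.where(dog > t, dog, 0.0), t
```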

(4) As shown in Fig. 2(c), the Canny edge detection algorithm can detect all the continuous edges of the image but cannot highlight the salient target; background and shadow edges in particular interfere heavily with target detection. To ensure the accuracy of target detection, the invention constrains the Canny edges with the DoG saliency and extracts the salient edges of the target, as shown in Fig. 2(d). The salient edges are obtained as follows:

a. Screen effective edges: the Canny result contains many short edges caused by shadows, reflections, and similar factors, which hinder extraction of the target's edges. Keeping long boundaries as the effective edges helps remove falsely detected boundaries, distill the edge information, and reduce computation, so the invention sorts all edges by length and retains the longest 50% as long boundaries;

b. Compute edge saliency: the number of salient points on each effective boundary is counted as the measure of edge saliency, the number of salient points being proportional to the edge's saliency;

c. Extract salient edges: the edges are sorted in descending order of salient-point count, and the top a% of edge lines are kept as salient edges (see the sketch below).
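A sketch of the salient edge selection follows. It is illustrative only: connected components of the Canny map stand in for "edge lines", and the Canny thresholds (50, 150) are assumptions not taken from the patent.

```python
import cv2
import numpy as np

def salient_edges(gray, salient_map, keep_long=0.5, keep_salient=0.1):
    """Keep the longest 50% of edges, then the top a% (= 10%) by the number of
    DoG-salient points lying on each edge."""
    edges = cv2.Canny(gray, 50, 150)
    n, labels = cv2.connectedComponents(edges)
    comps = [np.where(labels == i) for i in range(1, n)]
    comps.sort(key=lambda idx: len(idx[0]), reverse=True)
    comps = comps[: max(1, int(len(comps) * keep_long))]       # long boundaries
    comps.sort(key=lambda idx: int((salient_map[idx] > 0).sum()), reverse=True)
    comps = comps[: max(1, int(len(comps) * keep_salient))]    # top a% salient
    out = np.zeros_like(edges)
    for idx in comps:
        out[idx] = 255
    return out
```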

(5) The invention registers the left eye image and the right eye image with an improved SURF algorithm, accurately matching random feature points between the two views. The SURF algorithm detects the image's feature points in scale space; when filtering the image it selects windows of different sizes and extracts feature points through the Hessian matrix, defined as:

$$H(x,\sigma)=\begin{bmatrix}L_{xx}(x,\sigma)&L_{xy}(x,\sigma)\\ L_{xy}(x,\sigma)&L_{yy}(x,\sigma)\end{bmatrix}$$

where σ is the scale of the Gaussian kernel function, $L_{xx}(x,\sigma)$ is the convolution of the Gaussian second-order derivative $\frac{\partial^2 g(\sigma)}{\partial x^2}$ with the image I(x, y) in the x direction, $L_{yy}(x,\sigma)$ is the convolution of $\frac{\partial^2 g(\sigma)}{\partial y^2}$ with I(x, y) in the y direction, and $L_{xy}(x,\sigma)$ is the convolution of $\frac{\partial^2 g(\sigma)}{\partial x\,\partial y}$ with I(x, y) in the xy direction; the Gaussian kernel function g(σ) is:

$$g(\sigma)=\frac{1}{2\pi\sigma^2}\,e^{-\frac{x^2+y^2}{2\sigma^2}}$$

a. Select the dominant orientation of each feature point: to guarantee rotation invariance, the SURF algorithm assigns each feature point a unique dominant orientation based on the surrounding pixel information. First, centered on the detected feature point, with step σ and a wavelet size of 4σ, the horizontal and vertical Haar wavelet responses are computed within a circular neighborhood of radius 6σ, where σ is the scale at which the point lies. The Haar responses around the current feature point are then weighted with a Gaussian of scale 2σ, the weight of a pixel decreasing with its distance from the feature point, yielding the new Haar wavelet response values. The SURF algorithm then constructs a 60° sector window, computes the Haar wavelet response inside it, and rotates the window around the whole circular region until the response inside the window is strongest; the direction the sector window points to at that moment is defined as the feature point's dominant orientation;

b. Centered on the current feature point, construct a block of size 20σ along its dominant orientation; divide the region into 16 sub-regions and compute the Haar wavelet responses within each 5σ×5σ sub-region. Each sub-region v is expressed as v = (∑dx, ∑|dx|, ∑dy, ∑|dy|), finally yielding a 64-dimensional descriptor for the point. The feature points of the left eye image and the right eye image can be defined as:

$$\mathrm{Pos1}=\{(x'_i,y'_i)\mid 1\le i\le m\},\qquad \mathrm{Pos2}=\{(x_j,y_j)\mid 1\le j\le n\}$$

where Pos1 holds the feature point parameters of the left eye image and Pos2 those of the right eye image; m and n are the numbers of feature points in the left eye and right eye images; i and j index the feature points of the two images; (x'_m, y'_m) denotes the coordinates of the m-th left eye feature point and (x_n, y_n) those of the n-th right eye feature point;

c. Compute the Euclidean distances between all points in the left eye image's feature point parameters Pos1 and the right eye image's feature point parameters Pos2, select the point with the smallest Euclidean distance as the rough matching pair, sort the rough matching pairs in ascending order of Euclidean distance, delete the abnormal points, and select the first K matching pairs, defined as:

Pos_K = {{(x'_1, y'_1), (x_1, y_1)}, {(x'_2, y'_2), (x_2, y_2)}, ..., {(x'_i, y'_i), (x_i, y_i)}, ..., {(x'_K, y'_K), (x_K, y_K)}}, 1 ≤ i ≤ K;

d. Screen the matching pairs by the slopes of the corresponding points in the K matching pairs Pos_K: compute the slope of every rough matching pair, keep all slope values to the order of 10^-2, compute the frequency of occurrence of every slope, take the most frequent slope as the dominant slope, delete the matching pairs corresponding to the other, abnormal slopes, and update to obtain the H groups of accurate matching pairs Pos_K_new:

$$k_i=\frac{y_{zi}-y_{yi}}{x_{zi}-x_{yi}},\qquad \mathrm{Pos\_K}_{new}=\left\{\{(x_{zi},y_{zi}),(x_{yi},y_{yi})\}\in\mathrm{Pos\_K}:\ k_i=k_{dom}\right\}$$

where (x_zi, y_zi) and (x_yi, y_yi) respectively represent the feature point coordinates of the left eye image and the right eye image in a matching pair, and k_dom is the dominant slope.
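A compact sketch of the matching and slope screening (SURF lives in opencv-contrib and may require a non-free build; the Hessian threshold 400 and the cap of 100 best matches are assumptions):

```python
import cv2
from collections import Counter

def match_and_filter(gray_l, gray_r, k_best=100):
    """SURF matching followed by the slope test: keep only the pairs whose
    left-right slope, rounded to 1e-2, equals the most frequent slope."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    kp_l, des_l = surf.detectAndCompute(gray_l, None)
    kp_r, des_r = surf.detectAndCompute(gray_r, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).match(des_l, des_r)
    matches = sorted(matches, key=lambda m: m.distance)[:k_best]
    pairs = [(kp_l[m.queryIdx].pt, kp_r[m.trainIdx].pt) for m in matches]
    slopes = [round((pl[1] - pr[1]) / (pl[0] - pr[0] + 1e-9), 2)
              for pl, pr in pairs]
    dominant, _ = Counter(slopes).most_common(1)[0]
    return [p for p, s in zip(pairs, slopes) if s == dominant]
```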

(6) Extract the feature points constrained by spatial information: from the SURF matches, directly compute the disparity of every matching pair, sort in descending order of disparity, and keep the top c% of matches with the largest disparity as the feature points constrained by spatial information. As shown in Fig. 3, the white car model is the complete target closest to the camera in the scene, and the extracted spatially constrained feature points are marked in the left eye and right eye images. All the extracted feature points lie on the target, which lowers the probability of mis-segmentation; c is generally taken as 10. A sketch follows below.
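Continuing the sketch above, selecting the top c% of matches by disparity is a one-liner over the filtered pairs (pairs is the output of match_and_filter):

```python
def spatial_feature_points(pairs, c=0.10):
    """For a rectified pair, disparity x_l - x_r grows as the point gets closer
    to the camera, so the top c% favors the nearest complete target."""
    pairs = sorted(pairs, key=lambda p: p[0][0] - p[1][0], reverse=True)
    return pairs[: max(1, int(len(pairs) * c))]
```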

(7) As shown in Fig. 4, the complete target cannot be obtained from the salient edges alone, but the salient edges together with the feature points constrained by spatial information represent the target's features accurately, which exactly meets the selection requirements for growth seed points. The invention therefore fuses salient edge information, spatial information, and color information and performs rapid growing to complete target detection. Fig. 5 simulates the seed point growing process. During growth there are repeatedly grown areas, i.e., two seed points may both treat the same area as a region to be grown and judge it, and the repeated work lowers growth efficiency. The invention therefore adds a "processed pixel" label: once one seed point has processed a pixel, the pixel is marked and the next seed point no longer re-judges that area.

During the growth of the seed set, the similarity judgment is the key criterion for growing or stopping. To quantify the similarity between the seed set and the region to be grown, a similarity function is used; if the similarity between the region to be grown and the seed set exceeds a set threshold, growing proceeds. Since the HSI color model matches the visual characteristics of the human eye, the invention computes the similarity in HSI color space to reduce the influence of brightness on target detection.

The salient edges and the feature points constrained by spatial information are taken as growth seed points, and all growth seed points as the initial seed regions. The left eye image is converted to the HSI color model, and the mean squared deviations of the hue (H), saturation (S), and intensity (I) mean components between the region to be grown and the seed region are computed and combined in a weighted sum; the weighted sum serves as the difference between the region to be grown and the seed region. If the difference is below the threshold, the point is grown and merged into the corresponding seed region, which is updated as the new seed region; the loop continues until no seed region has qualifying pixels or the image edge is reached, at which point growing stops.

The salient edges and the feature points constrained by spatial information are taken as the initial seed regions. Let the number of initial seed points be N, let R_i denote the i-th seed region, i ∈ [1, N], and let R_i^w denote the region to be grown corresponding to R_i. Each seed region grows according to its similarity with its region to be grown, continually merging newly grown pixels into the corresponding seed region, which becomes the new seed region for the next round, until no seed region has a qualifying region to be grown, at which point growth stops.

Let the similarity function ρ(R_i^w, R_i) between the region to be grown R_i^w and the seed region R_i be defined as:

$$\rho(R_i^{w},R_i)=\varepsilon_1\,(H_t-\bar{H}_i)^2+\varepsilon_2\,(S_t-\bar{S}_i)^2+\varepsilon_3\,(I_t-\bar{I}_i)^2$$

$$\bar{H}_i=\frac{1}{K}\sum_{j=1}^{K}H_j,\qquad \bar{S}_i=\frac{1}{K}\sum_{j=1}^{K}S_j,\qquad \bar{I}_i=\frac{1}{K}\sum_{j=1}^{K}I_j$$

where K is the number of pixels in the seed region, which keeps increasing as the region grows; j indexes the j-th pixel in the seed region; H_t, S_t, and I_t are the hue, saturation, and intensity values of the region to be grown; H̄_i, S̄_i, and Ī_i are the mean hue, saturation, and intensity within the seed region R_i; and ε1, ε2, and ε3 adjust the weights of the hue, saturation, and intensity components. Since the mean squared deviation reflects variability, and the greater the variability the stronger the component's influence on the image, the invention takes the mean squared deviations of the image's H, S, and I components as the values of ε1, ε2, and ε3, which improves the accuracy of target region detection.
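A sketch of the growth loop follows. OpenCV's HSV conversion stands in for HSI, the threshold and the equal component weights eps are assumptions, and a running mean of the seed region replaces the full mean-squared-deviation bookkeeping:

```python
import cv2
import numpy as np
from collections import deque

def grow_region(img_bgr, seeds, thresh=0.05, eps=(1.0, 1.0, 1.0)):
    """4-connected region growing from seed pixels (y, x), with the patent's
    'processed pixel' label so no area is judged twice."""
    hsi = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV).astype(np.float32) / 255.0
    h, w = hsi.shape[:2]
    processed = np.zeros((h, w), bool)
    mask = np.zeros((h, w), bool)
    total, k = np.zeros(3, np.float32), 0
    q = deque()
    for y, x in seeds:
        mask[y, x] = processed[y, x] = True
        total += hsi[y, x]; k += 1
        q.append((y, x))
    while q:
        y, x = q.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not processed[ny, nx]:
                processed[ny, nx] = True                 # label: judge only once
                diff = (hsi[ny, nx] - total / k) ** 2
                if float(np.dot(eps, diff)) < thresh:    # weighted HSI difference
                    mask[ny, nx] = True
                    total += hsi[ny, nx]; k += 1
                    q.append((ny, nx))
    return mask
```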

(8) Let the baseline distance between the binocular vision cameras be b and the camera focal length f, and denote the parallax by d. Assuming the left eye image and the right eye image are already registered, the parallax is the positional difference of the same point across the image pair, d = x_l − x_r, where x_l and x_r are the abscissas of the matched point in the left eye image and the right eye image. The spatial coordinates (x_c, y_c, z_c) of a point P in the left camera coordinate system are computed by the formula below, where (x_c, y_c, z_c) are the matched point's spatial coordinates in the world coordinate system and (x_l, y_l) its two-dimensional image coordinates in the left eye image.

$$x_c=\frac{b\,x_l}{d},\qquad y_c=\frac{b\,y_l}{d},\qquad z_c=\frac{b\,f}{d}$$

Compute the point pairs of the segmented target with the largest horizontal and vertical distances on the two-dimensional plane image, such as L1 and L2 in the horizontal direction and H1 and H2 in the vertical direction in Fig. 6. Obtain the spatial coordinates of the four points by the binocular vision principle, and take the largest horizontal and vertical Euclidean distances computed from the three-dimensional coordinates as the horizontal width and vertical height of the target.

Let the three-dimensional coordinates of the four points be L1(x_L1, y_L1, z_L1), L2(x_L2, y_L2, z_L2), H1(x_H1, y_H1, z_H1), and H2(x_H2, y_H2, z_H2); the maximum length and height of the object are then:

$$\mathrm{Length}=\sqrt{(x_{L1}-x_{L2})^2+(y_{L1}-y_{L2})^2+(z_{L1}-z_{L2})^2},\qquad \mathrm{Height}=\sqrt{(x_{H1}-x_{H2})^2+(y_{H1}-y_{H2})^2+(z_{H1}-z_{H2})^2}$$
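The triangulation and the two distances translate directly into code. A minimal sketch, assuming image coordinates measured relative to the principal point and each extreme point given as a ((x_l, y_l), (x_r, y_r)) match:

```python
import numpy as np

def triangulate(pt_l, pt_r, b, f):
    """x_c = b*x_l/d, y_c = b*y_l/d, z_c = b*f/d with disparity d = x_l - x_r."""
    d = pt_l[0] - pt_r[0]
    return np.array([b * pt_l[0] / d, b * pt_l[1] / d, b * f / d])

def target_size(L1, L2, H1, H2, b, f):
    """Length and height as Euclidean distances between the triangulated
    horizontal pair (L1, L2) and vertical pair (H1, H2)."""
    p = [triangulate(pl, pr, b, f) for pl, pr in (L1, L2, H1, H2)]
    return np.linalg.norm(p[0] - p[1]), np.linalg.norm(p[2] - p[3])
```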

As shown in Fig. 7, the invention completes the detection and size measurement of the target, computing the target's length and height to a precision of 0.1 mm and displaying them on the image.

The invention has been disclosed above through preferred embodiments, which are not intended to limit it; any technical solution obtained by equivalent substitution or equivalent transformation falls within the protection scope of the invention.

Claims (8)

1. An intelligent target detection and size measurement method based on a human eye vision model is characterized by comprising the following steps:
1) acquiring the binocular images to be measured: with the target framed as the complete individual closest to the camera, shooting a pair of left eye and right eye images with a calibrated binocular camera to serve as the images to be measured;
2) extracting effective salient points: extracting a salient point map of the left eye image by using a Gaussian difference algorithm, calculating the mean value of all salient points with saliency values larger than 0, taking the mean value as a screening threshold, and extracting the pixels whose saliency exceeds the threshold as the image's salient points;
3) extracting salient edges: acquiring the edge image with a Canny edge detection algorithm, counting the salient points on each edge, sorting the edge lines in descending order of salient-point count, and keeping the first a% of edges as salient edges;
4) registering the left eye image and the right eye image: detecting matching point pairs of the binocular images by using the SURF algorithm, selecting the matching point pairs by minimizing the Euclidean distance between the feature points, screening the feature points by the slopes between the matching point pairs, removing abnormal point pairs, and keeping the valid matching point pairs;
5) extracting feature points constrained by spatial information: calculating the parallax of all valid matching points, sorting the matching points in descending order of parallax, and extracting the top c% as the feature points constrained by spatial information;
6) target detection: rapidly growing the target region with a multi-feature fusion method combining saliency information, spatial information and color information to complete target detection;
7) size measurement using the binocular vision model: obtaining the two point pairs with the largest horizontal and vertical distances in the two-dimensional image of the detected target, calculating the three-dimensional coordinates of the two point pairs from the binocular parallax of the binocular vision model and the intrinsic and extrinsic camera parameters, and calculating the Euclidean distances of the two point pairs as the length and the height of the target, completing the size measurement of the target.
2. The intelligent target detection and size measurement method based on human eye vision model as claimed in claim 1, wherein in the step 2), an initial Gaussian difference saliency map of the left eye image is calculated as follows:

$$\mathrm{DoG}(x,y)=\left[\frac{1}{2\pi\sigma_1^2}\,e^{-\frac{x^2+y^2}{2\sigma_1^2}}-\frac{1}{2\pi\sigma_2^2}\,e^{-\frac{x^2+y^2}{2\sigma_2^2}}\right]*I$$

wherein σ1 and σ2 respectively represent the excitation bandwidth and the suppression bandwidth, I is the gray image, * represents convolving the filter with the image, and DoG(x, y) is the saliency metric value of pixel (x, y);
setting a negative value generated by the saliency metric DoG(x, y) as 0, and setting the saliency mean value as the threshold value T for extraction of salient points, to obtain the salient point map D(x, y):

$$D(x,y)=\begin{cases}\mathrm{DoG}(x,y),&\mathrm{DoG}(x,y)>T\\ 0,&\text{otherwise}\end{cases}\qquad T=\frac{\mathrm{sum}(\mathrm{DoG}>0)}{\mathrm{count}(\mathrm{DoG}>0)}$$

wherein count(DoG > 0) represents the number of salient points with saliency value greater than 0 in DoG, and sum(DoG > 0) represents the sum of the saliency values greater than 0.
3. The intelligent object detection and dimension measurement method based on human eye vision model as claimed in claim 1, wherein in the step 3), a = 10.
4. The intelligent target detection and dimension measurement method based on human eye vision model as claimed in claim 1, wherein the step 4) comprises the following steps:
a. obtaining the feature point coordinates of the left eye image and the right eye image and the 64-dimensional descriptors of all feature points by using the SURF algorithm, the feature points of the left eye image and the right eye image being defined as:

$$\mathrm{Pos1}=\{(x'_i,y'_i)\mid 1\le i\le m\},\qquad \mathrm{Pos2}=\{(x_j,y_j)\mid 1\le j\le n\}$$

wherein Pos1 represents the feature point parameters of the left eye image, Pos2 the feature point parameters of the right eye image, m and n are the numbers of feature points in the left eye image and the right eye image respectively, i and j represent the subscripts of the feature points of the two images, (x'_m, y'_m) denotes the m-th feature point coordinate of the left eye image, and (x_n, y_n) the n-th feature point coordinate of the right eye image;
b. calculating the Euclidean distances of all points in the feature point parameters Pos1 and Pos2, selecting the point with the minimum Euclidean distance as a rough matching point pair, sorting the rough matching point pairs in ascending order of Euclidean distance, deleting abnormal points, and selecting the first K matching point pairs, defined as:
Pos_K = {{(x'_1, y'_1), (x_1, y_1)}, {(x'_2, y'_2), (x_2, y_2)}, ..., {(x'_i, y'_i), (x_i, y_i)}, ..., {(x'_K, y'_K), (x_K, y_K)}}, 1 ≤ i ≤ K;
c. screening the matching point pairs according to the slopes of the corresponding points in the K matching point pairs Pos_K: calculating the slopes of all rough matching point pairs, keeping all slope values to the order of 10^-2, calculating the occurrence frequency of all slopes, selecting the slope with the maximum occurrence frequency as the dominant slope, deleting the matching point pairs corresponding to the other abnormal slopes, and updating to obtain the H groups of accurate matching point pairs Pos_K_new:

$$k_i=\frac{y_{zi}-y_{yi}}{x_{zi}-x_{yi}},\qquad \mathrm{Pos\_K}_{new}=\left\{\{(x_{zi},y_{zi}),(x_{yi},y_{yi})\}\in\mathrm{Pos\_K}:\ k_i=k_{dom}\right\}$$

wherein (x_zi, y_zi) and (x_yi, y_yi) respectively represent the feature point coordinates of the left eye image and the right eye image in a matching point pair, and k_dom is the dominant slope.
5. The intelligent object detection and dimension measurement method based on human eye vision model as claimed in claim 1, wherein in the step 5), c = 10.
6. The intelligent target detection and dimension measurement method based on human eye vision model as claimed in claim 1, wherein in the step 6), the salient edges and the feature points restricted by the spatial information are used as growth seed points, and all the growth seed points are used as initial seed areas; the left eye image is converted into an HSI color model; the mean square deviations of the hue, saturation and intensity mean components between the area to be grown and the seed area in the HSI color information are respectively calculated and combined in a weighted sum, the weighted sum serving as the difference value between the area to be grown and the seed area; if the difference value is lower than the threshold value, the point is grown and merged into the corresponding seed region, which is updated to obtain a new seed region; the cycle continues until all seed regions have no qualified pixels or the image edge is reached, at which point growing stops.
7. The intelligent target detection and size measurement method based on the human eye vision model as claimed in claim 6, wherein a repeatedly grown area can exist in the growth process, that is, two seed points may simultaneously judge the same area as an area to be grown, and the repeated operation reduces the growth efficiency; therefore, if a pixel point has already been processed by a previous seed point, the pixel point is marked with a "processed pixel" label, and the next seed point does not judge the area again.
8. The method as claimed in claim 1, wherein in the step 7), the baseline distance between the binocular vision cameras is b, the focal length of the cameras is f, and the parallax d is the difference between the positions of the same point in the image pair, d = x_l − x_r, wherein x_l and x_r are respectively the abscissas of the matched point in the left eye image and the right eye image; the spatial coordinates (x_c, y_c, z_c) of a point P in the left camera coordinate system are calculated according to the following formula, (x_c, y_c, z_c) being the spatial coordinates of the matching point in the world coordinate system and (x_l, y_l) the two-dimensional image coordinates of the matching point in the left eye image:

$$x_c=\frac{b\,x_l}{d},\qquad y_c=\frac{b\,y_l}{d},\qquad z_c=\frac{b\,f}{d}$$

calculating the point pairs of the segmented target with the maximum horizontal and vertical distances on the two-dimensional plane image, obtaining the spatial coordinates of the four points according to the binocular vision principle, and taking the maximum horizontal and vertical Euclidean distances computed from the three-dimensional coordinates as the horizontal width and the vertical height of the target;
supposing the four three-dimensional coordinates are L1(x_L1, y_L1, z_L1), L2(x_L2, y_L2, z_L2), H1(x_H1, y_H1, z_H1) and H2(x_H2, y_H2, z_H2), the maximum length and height of the object are calculated as follows:

$$\mathrm{Length}=\sqrt{(x_{L1}-x_{L2})^2+(y_{L1}-y_{L2})^2+(z_{L1}-z_{L2})^2},\qquad \mathrm{Height}=\sqrt{(x_{H1}-x_{H2})^2+(y_{H1}-y_{H2})^2+(z_{H1}-z_{H2})^2}$$

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201710580068.5A | 2017-07-17 | 2017-07-17 | An intelligent target detection and size measurement method based on human visual model


Publications (2)

Publication Number | Publication Date
CN107392929A | 2017-11-24
CN107392929B | 2020-07-10

Family

ID=60339282

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201710580068.5A (Active) | An intelligent target detection and size measurement method based on human visual model | 2017-07-17 | 2017-07-17

Country Status (1)

Country | Link
CN | CN107392929B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN107950402B* | 2017-11-29 | 2020-10-30 | 北京伟景智能科技有限公司 | Automatic control method of milking device based on binocular vision
CN108381549B* | 2018-01-26 | 2021-12-14 | 广东三三智能科技有限公司 | Binocular vision guide robot rapid grabbing method and device and storage medium
CN109211198B* | 2018-08-15 | 2021-01-01 | 河海大学常州校区 | An intelligent target detection and measurement system and method based on trinocular vision
CN109166153A | 2018-08-21 | 2019-01-08 | 江苏德丰建设集团有限公司 | Tower crane high altitude operation 3-D positioning method and positioning system based on binocular vision
CN109191533B* | 2018-08-21 | 2021-06-25 | 江苏德丰建设集团有限公司 | Tower crane high-altitude construction method based on fabricated building
CN109242776B* | 2018-09-11 | 2023-04-07 | 江苏君英天达人工智能研究院有限公司 | Double-lane line detection method based on visual system
CN111882621A | 2020-07-22 | 2020-11-03 | 武汉大学 | An automatic measurement method of rice thickness parameters based on binocular images
CN113989489B* | 2021-10-22 | 2024-09-24 | 珠海格力电器股份有限公司 | Distance detection method and system for non-smooth edge
CN114549449B* | 2022-02-17 | 2023-05-12 | 中国空气动力研究与发展中心超高速空气动力研究所 | Fine quantitative identification method for global defects of small-size curved surface component


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2003098922A1* | 2002-05-15 | 2003-11-27 | The Board of Governors for Higher Education, State of Rhode Island and Providence Plantations | An imaging system and method for tracking the motion of an object
CN104778721A* | 2015-05-08 | 2015-07-15 | 哈尔滨工业大学 | Distance measuring method of significant target in binocular image
CN105894574A* | 2016-03-30 | 2016-08-24 | 清华大学深圳研究生院 | Binocular three-dimensional reconstruction method
CN106709499A* | 2017-03-02 | 2017-05-24 | 西北工业大学 | SIFT image feature point extraction method based on Canny operator and Hilbert-Huang transform

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Borji, A. et al., "Salient object detection: a benchmark", IEEE Transactions on Image Processing, vol. 24, no. 12, 2015-10-07, pp. 5706-5722. *
Liu, Zhong et al., "Salient region detection based on binocular vision", 2012 7th IEEE Conference on Industrial Electronics and Applications, 2012-07-20, pp. 1862-1866. *
Ji, Ling et al., "An improved Canny edge detection algorithm" (一种改进的Canny边缘检测算法), Microprocessors (《微处理机》), no. 01, 2015-02-28, pp. 40-43. *
Wang, Daidong, "Research and application of distance measurement methods based on binocular vision" (基于双目视觉的测距方法研究与应用), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》), no. 05, 2017-05-15, pp. I138-771. *

Also Published As

Publication number | Publication date
CN107392929A | 2017-11-24

Similar Documents

Publication | Title
CN107392929B (en) | An intelligent target detection and size measurement method based on human visual model
CN109211198B (en) | An intelligent target detection and measurement system and method based on trinocular vision
CN107203973B (en) | Sub-pixel positioning method for center line laser of three-dimensional laser scanning system
CN106709950B (en) | Binocular vision-based inspection robot obstacle crossing wire positioning method
CN101839692B (en) | Method for measuring three-dimensional position and stance of object with single camera
CN105279372B (en) | A kind of method and apparatus of determining depth of building
CN104778721B (en) | The distance measurement method of conspicuousness target in a kind of binocular image
CN110363158A (en) | A neural-network-based collaborative target detection and recognition method of millimeter-wave radar and vision
CN109900711A (en) | Workpiece defect detection method based on machine vision
CN106969706A (en) | Workpiece sensing and three-dimension measuring system and detection method based on binocular stereo vision
CN109934230A (en) | A radar point cloud segmentation method based on visual aid
CN102974551A (en) | Machine vision-based method for detecting and sorting polycrystalline silicon solar energy
CN106886216A (en) | Robot automatic tracking method and system based on RGBD face detection
CN110189375A (en) | An image target recognition method based on monocular vision measurement
CN108564085A (en) | A kind of method of automatic reading pointer type instrument reading
CN109559324A (en) | A kind of objective contour detection method in linear array images
CN110276371B (en) | Container corner fitting identification method based on deep learning
CN113095324A (en) | Classification and distance measurement method and system for cone barrel
CN113252103A (en) | Method for calculating volume and mass of material pile based on MATLAB image recognition technology
CN115953550A (en) | Point cloud outlier removal system and method for line structured light scanning
CN109911481A (en) | Method and system for visual recognition and positioning of warehouse rack target for metallurgical robot plugging
CN116703895B (en) | Small sample 3D visual detection method and system based on generation countermeasure network
CN116579955B (en) | New energy battery cell weld reflection point denoising and point cloud complement method and system
CN119555556B (en) | Sediment particle size distribution measurement system and method
CN105957107A (en) | Pedestrian detecting and tracking method and device

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
