

Technical Field
The present invention relates to the technical field of image recognition, and in particular to a three-dimensional (3D) target recognition method based on a curvature-feature recurrent neural network.
Background Art
3D target recognition refers to the process of automatically detecting, locating, and recognizing a specified target pattern in an arbitrarily given two-dimensional (2D) image scene, and it is one of the key problems in computer vision research. With the continuous development of computer vision technology, 3D target recognition is increasingly widely applied in fields such as industrial inspection, augmented reality, and medical imaging. However, owing to factors such as illumination changes, image noise, and target occlusion, it is difficult to extract the common features of a 3D target and its 2D images under different viewing angles, and this has become an urgent problem for 3D target recognition.
The key to 3D target recognition is to find a 2D representation of the 3D target model and to extract the common features of the 3D target and the 2D image. Existing 3D target recognition methods mainly include methods based on manually marked points, methods based on geometric features, and methods based on deep learning. Methods based on manually marked points require manual initialization of the feature points in the 2D image; because of this human interaction, such methods are not repeatable. Methods based on geometric features achieve recognition by extracting information such as the medial-axis skeleton and contour shape of the target, but they perform poorly when the image is noisy. Methods based on deep learning use deep neural networks to fuse low-level image features into high-level features carrying semantic information and can therefore handle the image noise of the 2D images used in 3D target recognition well; however, the commonly used deep convolutional neural networks cannot express sequence properties and thus cannot effectively characterize a 3D target under different viewing angles. Therefore, there is an urgent need for an automated 3D target recognition method that is robust to image noise in images from different viewing angles.
Summary of the Invention
The purpose of the present invention is to characterize the features of a 3D target under different viewing angles more effectively, to reduce the sensitivity of the feature extraction process to image noise, and to improve the accuracy of 3D target recognition. To this end, the present invention proposes a 3D target recognition method based on a curvature-feature recurrent neural network.
The technical solution adopted by the present invention to achieve the above object is a 3D target recognition method based on a curvature-feature recurrent neural network, comprising the following steps:
Step 1: Compute the joint curvature of the target 3D model and extract the local maxima of the joint curvature to form the curvature sketch RSketch of the 3D model; then apply a perspective projection transformation to the curvature sketch RSketch to generate 360° two-dimensional images Pm, where m = 1, 2, ..., 360.
Step 2: Input the 360° two-dimensional images into a BRNN and use multi-angle feature learning to compute their sequence attributes under multiple viewing angles; at the softmax layer, use the softmax function to obtain the recognition category for which the correct-classification probability of the sequence attribute is largest. The BRNN is a bidirectional recurrent neural network.
Computing the joint curvature of the target 3D model comprises the following steps:
Let n be the normal vector at a given point (x, y, z) on the target 3D model R, and let p and q be the gradient components derived from n; px, py, qx, and qy then denote the partial derivatives of p and q with respect to x and y.
For each point of the 3D model R, compute the average Gaussian curvature and the average mean curvature over the 3×3 neighborhood around its normal vector.
Here the average curvature matrix and the averaged values of p, q, px, py, qx, and qy are computed over the 3×3 neighborhood, and trace(·) denotes the trace of a matrix.
The joint curvature of the target 3D model R is then defined from the average Gaussian curvature and the average mean curvature.
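The formulas for these curvature quantities appear only as drawings in the original publication and are not reproduced in the text above. For reference, the following LaTeX sketch gives one standard gradient-space formulation that is consistent with the surrounding definitions (p and q derived from the surface normal, px, py, qx, qy their partial derivatives, GK = |C|, and MK obtained via trace(·)); the assumption that a normal n = (n_x, n_y, n_z) yields p = -n_x/n_z and q = -n_y/n_z, the specific matrix form of C, and the factor 1/2 in MK are introduced here for illustration and may differ from the patent's exact drawings.

p = -\frac{n_x}{n_z}, \quad q = -\frac{n_y}{n_z}, \qquad
p_x = \frac{\partial p}{\partial x}, \quad p_y = \frac{\partial p}{\partial y}, \quad
q_x = \frac{\partial q}{\partial x}, \quad q_y = \frac{\partial q}{\partial y}

C = \frac{1}{(1+p^{2}+q^{2})^{3/2}}
\begin{pmatrix} 1+q^{2} & -pq \\ -pq & 1+p^{2} \end{pmatrix}
\begin{pmatrix} p_x & p_y \\ q_x & q_y \end{pmatrix},
\qquad GK = |C|, \qquad MK = \tfrac{1}{2}\,\operatorname{trace}(C)

\overline{GK} = |\bar{C}|, \qquad \overline{MK} = \tfrac{1}{2}\,\operatorname{trace}(\bar{C}),

where \bar{C} denotes C evaluated with the 3×3-neighborhood averages of p, q, px, py, qx, and qy.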
Inputting the 360° two-dimensional images into the BRNN and using multi-angle feature learning to compute their sequence attributes under multiple viewing angles comprises the following steps:
Obtain the one-dimensional feature sequence TS of the 360° two-dimensional images, s = 1, 2, ..., 360. The output of the feature sequence TS at the i-th BRNN layer is split into a forward output and a backward output: the forward output is related to the forward output of the previous sequence element in the same BRNN layer and to the forward and backward outputs of the previous BRNN layer, while the backward output is related to the backward output of the next sequence element in the same BRNN layer and to the forward and backward outputs of the previous BRNN layer. Each is obtained by applying the tanh neuron activation function to these inputs weighted by the corresponding weight matrices, plus a bias b.
The total output Os of the feature sequence TS from the BRNN, i.e., the input Ifc to the fully connected layer fc, is obtained by combining the forward output and the backward output weighted by their respective connection weights at the fully connected layer.
Therefore, the accumulated output of the feature sequence TS at the fully connected layer fc is the sequence attribute.
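The BRNN relations referred to above are likewise given only as drawings in the original. The following is a sketch of one standard bidirectional-RNN formulation that matches the verbal description (tanh activation, per-connection weight matrices, bias b, and a weighted fusion of forward and backward outputs feeding the fully connected layer); the symbol names h, W1..W6 and the exact arrangement of terms are assumptions made here for illustration.

\overrightarrow{h}^{\,i}_{s} = \tanh\!\left(W_{1}\,\overrightarrow{h}^{\,i}_{s-1} + W_{2}\,\overrightarrow{h}^{\,i-1}_{s} + W_{3}\,\overleftarrow{h}^{\,i-1}_{s} + b\right),
\qquad
\overleftarrow{h}^{\,i}_{s} = \tanh\!\left(W_{4}\,\overleftarrow{h}^{\,i}_{s+1} + W_{5}\,\overrightarrow{h}^{\,i-1}_{s} + W_{6}\,\overleftarrow{h}^{\,i-1}_{s} + b\right)

O_{s} = I_{fc} = W_{\rightarrow}\,\overrightarrow{h}^{\,i}_{s} + W_{\leftarrow}\,\overleftarrow{h}^{\,i}_{s},
\qquad
A = \sum_{s=1}^{360} O_{s}

where \overrightarrow{h}^{\,i}_{s} and \overleftarrow{h}^{\,i}_{s} are the forward and backward outputs of layer i at view s, and A is the accumulated output at the fully connected layer fc, i.e., the sequence attribute.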
Using the softmax function at the softmax layer to obtain the recognition category for which the correct-classification probability of the sequence attribute is largest comprises the following steps:
At the softmax layer, the softmax function is used to compute the probability p(Ck) that the recognition result belongs to the k-th class,
where C is the total number of recognition categories and Ak is the accumulated output, at the fully connected layer fc, of the sequence attribute of the k-th class of 3D target.
The maximum-likelihood estimation method is then used to find the recognition category k at which the loss function is minimal, i.e., at which the correct probability p(Ck) is largest; the loss function is written in terms of the Kronecker delta function δ(·), where r denotes the correct recognition category of the feature sequence TS.
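For concreteness, the softmax probability, the Kronecker-delta loss, and the selected category described above admit the following standard form; this is a reconstruction from the verbal description rather than a copy of the patent drawings.

p(C_k) = \frac{e^{A_k}}{\sum_{c=1}^{C} e^{A_c}},
\qquad
Loss = -\sum_{k=1}^{C} \delta(k, r)\,\log p(C_k),
\qquad
k^{*} = \arg\max_{k}\, p(C_k),

so that minimizing the loss under maximum-likelihood estimation corresponds to maximizing the probability assigned to the correct category r.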
The present invention has the following beneficial effects and advantages:
1. The joint-curvature sketch feature extraction method designed in the present invention can automatically extract the common features of a 3D model and a 2D image, and the local average Gaussian curvature and local average mean curvature used in the joint curvature can effectively handle image noise.
2. The multi-angle feature-learning bidirectional recurrent neural network designed in the present invention can consider the feature sequences of a 3D model under multiple viewing angles simultaneously and can accurately recognize the 3D target in a 2D image taken from an arbitrary angle.
Description of the Drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a framework diagram of the multi-angle feature-learning bidirectional recurrent neural network used in the method of the present invention.
Detailed Description of the Embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments.
The present invention consists mainly of two parts; Fig. 1 shows the flow chart of the method, and the specific implementation process is described below.
Step 1: Compute the joint curvature of the target 3D model, form the curvature sketch of the 3D model by extracting the local maxima of the joint curvature, and generate 360° two-dimensional images by perspective projection transformation as the input for training the recurrent neural network.
Step 1.1: Let n be the normal vector at a given point (x, y, z) on the 3D model, and let p and q be the gradient components derived from n, with px, py, qx, and qy denoting the partial derivatives of p and q with respect to x and y. The Gaussian curvature GK of the 3D model is then
GK = |C|,
where C is the curvature matrix, the mean curvature MK of the 3D model is derived from the trace of C, and trace(·) denotes the trace of a matrix. To eliminate the influence of noise, the present invention computes, for each point of the 3D model, the average Gaussian curvature and the average mean curvature over the 3×3 neighborhood around its normal vector,
where the average curvature matrix and the averaged values of p, q, px, py, qx, and qy are computed over the 3×3 neighborhood. From these averaged quantities, the joint curvature of the 3D model is defined.
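A minimal numerical sketch of Step 1.1 follows, assuming the model surface is locally represented as a depth map z(x, y) whose gradients p, q and their derivatives are estimated by finite differences; the function names, the use of scipy for the 3×3 averaging, and the final combination of the two averaged curvatures into a single joint-curvature map are illustrative assumptions rather than the patent's exact implementation.

import numpy as np
from scipy.ndimage import uniform_filter

def average_curvatures(z):
    """Average Gaussian and mean curvature over 3x3 neighborhoods of a depth map z."""
    p, q = np.gradient(z)                          # surface gradients (finite differences)
    p_x, p_y = np.gradient(p)
    q_x, q_y = np.gradient(q)
    avg = lambda a: uniform_filter(a, size=3)      # 3x3 neighborhood average
    p, q, p_x, p_y, q_x, q_y = map(avg, (p, q, p_x, p_y, q_x, q_y))
    w = 1.0 + p**2 + q**2
    gk = (p_x * q_y - p_y * q_x) / w**2            # average Gaussian curvature
    mk = ((1 + q**2) * p_x - p * q * (p_y + q_x)
          + (1 + p**2) * q_y) / (2 * w**1.5)       # average mean curvature
    return gk, mk

def joint_curvature(z):
    """Hypothetical combination of the averaged curvatures into one joint-curvature map."""
    gk, mk = average_curvatures(z)
    return np.sqrt(gk**2 + mk**2)                  # illustrative choice only

On a full 3D mesh rather than a depth map, the same quantities would be estimated per vertex from a local tangent-plane parameterization, but the 3×3 averaging step plays the same noise-suppressing role.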
Step 1.2: Extract the local maximum points of the joint curvature to form the curvature sketch RSketch of the 3D model R. Through perspective projection transformation, generate the 360° two-dimensional projection images Pm, m = 1, 2, ..., 360, of the 3D curvature sketch RSketch as the input to the BRNN.
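The sketch below illustrates Step 1.2 under the assumption that the curvature sketch RSketch is available as an N×3 array of 3D points that is rotated about the vertical axis in 1° increments and imaged with a simple pinhole (perspective) camera; the focal length, viewing distance, image size, and rotation axis are placeholder choices for illustration only.

import numpy as np

def project_views(rsketch_points, focal=500.0, distance=3.0, size=224):
    """Perspective-project the curvature-sketch points into 360 views, one per degree."""
    views = []
    for m in range(360):
        a = np.deg2rad(m)
        rot = np.array([[ np.cos(a), 0.0, np.sin(a)],   # rotation about the vertical (y) axis
                        [ 0.0,       1.0, 0.0      ],
                        [-np.sin(a), 0.0, np.cos(a)]])
        cam = rsketch_points @ rot.T
        cam[:, 2] += distance                            # place the model in front of the camera
        uv = focal * cam[:, :2] / cam[:, 2:3]            # pinhole perspective projection
        img = np.zeros((size, size), dtype=np.float32)
        px = np.clip((uv + size / 2).astype(int), 0, size - 1)
        img[px[:, 1], px[:, 0]] = 1.0                    # binary projection image P_m
        views.append(img)
    return views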
Step 2: The present invention adopts a deep recurrent neural network (DRNN) as the curvature-feature recognition method; the DRNN framework is shown in Fig. 2. The multi-angle feature-learning BRNN is used to characterize the sequence attributes of the 3D model under multiple viewing angles, and the softmax function is used at the softmax layer to obtain the recognition category with the largest correct-classification probability.
Step 2.1: To characterize the sequential nature of the features of the 3D model under different viewing angles, define the one-dimensional feature sequence of the 3D model under multiple viewing angles as TS, s = 1, 2, ..., 360. The output of the feature sequence TS at the i-th BRNN layer is split into a forward output and a backward output: the forward output is related to the forward output of the previous sequence element in the same BRNN layer and to the forward and backward outputs of the previous BRNN layer, while the backward output is related to the backward output of the next sequence element in the same BRNN layer and to the forward and backward outputs of the previous BRNN layer; each is obtained by applying the tanh neuron activation function to these inputs weighted by the corresponding weight matrices, plus a bias b. The total output Os of the feature sequence TS from the BRNN, i.e., the input Ifc to the fully connected layer fc, is then obtained by combining the forward output and the backward output weighted by their respective connection weights at the fully connected layer.
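Step 2.1 can be sketched in plain numpy as below so that the forward and backward recurrences over the 360 views and the weighted fusion at the fully connected layer are explicit; the weight shapes, random initialization, and single-layer setup are assumptions for illustration, and training of the weights is omitted.

import numpy as np

class BRNNLayer:
    """One bidirectional recurrent layer over the 360-view feature sequence."""
    def __init__(self, d_in, d_hid, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        self.Wf = scale * rng.standard_normal((d_hid, d_hid))   # forward recurrent weights
        self.Uf = scale * rng.standard_normal((d_hid, d_in))    # forward input weights
        self.Wb = scale * rng.standard_normal((d_hid, d_hid))   # backward recurrent weights
        self.Ub = scale * rng.standard_normal((d_hid, d_in))    # backward input weights
        self.b = np.zeros(d_hid)

    def forward(self, seq):                                     # seq: (360, d_in)
        n, d_hid = len(seq), self.b.size
        hf = np.zeros((n, d_hid))
        hb = np.zeros((n, d_hid))
        for s in range(n):                                      # forward pass, views 1..360
            prev = hf[s - 1] if s > 0 else np.zeros(d_hid)
            hf[s] = np.tanh(self.Wf @ prev + self.Uf @ seq[s] + self.b)
        for s in reversed(range(n)):                            # backward pass, views 360..1
            nxt = hb[s + 1] if s < n - 1 else np.zeros(d_hid)
            hb[s] = np.tanh(self.Wb @ nxt + self.Ub @ seq[s] + self.b)
        return hf, hb

def sequence_attribute(seq, layer, w_fwd, w_bwd):
    """Fuse forward/backward outputs into O_s (= I_fc) and accumulate over all views."""
    hf, hb = layer.forward(seq)
    o = hf @ w_fwd.T + hb @ w_bwd.T                             # total output O_s per view
    return o.sum(axis=0)                                        # accumulated output (sequence attribute)

A deep BRNN as in Fig. 2 would stack several such layers, feeding the per-view outputs of one layer as the input sequence of the next.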
Step 2.2: The accumulated output of the feature sequence TS at the fully connected layer fc is the sequence attribute. At the softmax layer, the softmax function is used to compute the probability p(Ck) that the recognition result belongs to the k-th class,
where C is the total number of recognition categories and Ak is the accumulated output, at the fully connected layer fc, of the sequence attribute of the k-th class of 3D target. The maximum-likelihood estimation method is then used to find the recognition category k at which the loss function is minimal, i.e., at which the correct probability p(Ck) is largest; the loss function is written in terms of the Kronecker delta function δ(·), where r denotes the correct recognition category of the feature sequence TS.
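Finally, Step 2.2 reduces to the familiar softmax and negative-log-likelihood computation sketched below, where A is assumed to be the vector of accumulated fully connected outputs (one Ak per class) and r the index of the ground-truth class; the Kronecker delta appears as a one-hot vector, and the snippet illustrates inference and loss evaluation only, not training.

import numpy as np

def softmax(a):
    e = np.exp(a - a.max())                 # subtract the maximum for numerical stability
    return e / e.sum()

def classify(a, r=None):
    """Return the predicted class k*; if the true class r is given, also return the loss."""
    p = softmax(a)                          # p[k] = p(C_k)
    k_star = int(np.argmax(p))              # category with the largest probability
    if r is None:
        return k_star, None
    delta = np.eye(len(a))[r]               # Kronecker delta d(k, r) as a one-hot vector
    loss = -np.sum(delta * np.log(p))       # negative log-likelihood (cross-entropy)
    return k_star, loss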