(1) Technical Field
The invention relates to the field of multispectral photometric stereo surface normal recovery, and in particular to a multispectral photometric stereo surface normal recovery method based on deep learning.
(2) Background Art
Object surface normal recovery is an important component of 3D reconstruction and a research direction with wide application value in computer vision. Multispectral photometric stereo predicts surface normals from a single image, so it can be applied to recover the surface normals of moving or non-rigid objects. Compared with a two-dimensional image, surface normals provide the surface shape and depth of the object and therefore characterize it more completely. The technique consequently has broad applications in fields such as autonomous driving, geographic surveying, human-computer interaction, and modern medicine.
However, existing multispectral photometric stereo surface normal recovery methods have serious limitations in practice: traditional methods require prior material calibration of the object to be measured, or require accurate depth information at some positions on the object surface in advance. These conditions are difficult to satisfy in practical applications. Moreover, because traditional multispectral photometric stereo methods rely on such prior information, deviations in that prior information often lead to errors in the predicted surface normals.
(3) Summary of the Invention
To remedy the deficiencies of the prior art, the present invention provides a deep-learning-based multispectral photometric stereo surface normal recovery method with precise steps and high accuracy.
The present invention is achieved through the following technical solution:
A multispectral photometric stereo surface normal recovery method based on deep learning, comprising the following steps:
(1) Using a multispectral photometric stereo system, capture a single photograph of the object to be recovered;
(2) Separate the RGB channels of the captured photograph into three single-channel grayscale images;
(3) Feed the three single-channel images into a three-input standard photometric stereo algorithm to obtain the initial surface normals;
(4) Input the photograph of the object together with the initial surface normals into a deep network model, and use a deep learning algorithm to output accurate surface normals.
The present invention first obtains the initial surface normals with a three-input standard photometric stereo algorithm. Although these initial normals are inaccurate, fusing them with the captured photograph through a deep learning algorithm yields accurately recovered surface normals. By incorporating the initial surface normals, the invention improves the deep learning algorithm and raises the accuracy of multispectral photometric stereo surface normal recovery.
The specific technical solution of the present invention is as follows:
In step (1), the object to be recovered is photographed under the illumination of red, green, and blue lights. A Cartesian coordinate system is established with the object at the origin, and the unit vectors of the red, green, and blue illumination directions are denoted l_R, l_G, and l_B, respectively.
In step (2), the single-channel images obtained by separating the RGB channels are stored as I_R, I_G, and I_B, respectively.
In step (3), the three-input standard photometric stereo model is C = ρLn, where ρ is a fixed scalar (the albedo) that can be solved for; (x, y) denotes the coordinates of a pixel in the image; C is the vector formed by the intensities of that pixel in the three single-channel images I_R, I_G, and I_B; L is the 3×3 matrix whose rows are the illumination direction unit vectors l_R, l_G, and l_B; and n is the initial surface normal of the pixel to be solved for. Because the three-input standard photometric stereo algorithm ignores both the crosstalk of the red, green, and blue lights across channels and the material properties of the object surface, the initial surface normals obtained in this way contain errors.
In step (4), the deep learning algorithm of the deep network model proceeds as follows: the photograph of the object to be recovered is first aligned with the initial surface normals and cropped into 40×40-pixel patches; each cropped patch is fused with the corresponding initial surface normals and fed into six convolutional layers. The kernel size of the first three convolutional layers is 5×5 pixels and that of the last three is 3×3 pixels; all convolutional layers use "SAME" padding and the "ReLU" activation function. After the six convolutional layers, the network outputs accurate surface normals for the corresponding 40×40-pixel patch.
The deep learning algorithm uses the mean squared error as its loss function, E = (1/N) Σ ||n − n̂||², where n denotes the true normal, n̂ denotes the normal predicted by the network, and N is the number of pixels; the loss is minimized with the Adam optimizer using its default settings.
The numbers of feature-map channels of the six convolutional layers are 32, 64, 128, 128, 64, and 32, respectively.
The steps of the present invention are simple. Compared with traditional multispectral photometric stereo algorithms, it requires neither prior calibration of the object nor additional equipment to obtain initial depths at selected positions; instead, it relies entirely on the captured image itself to obtain the initial surface normals and then uses a deep learning algorithm to recover accurate surface normals. This makes multispectral photometric stereo practical to apply and increases the accuracy of the recovered surface normals.
(4) Brief Description of the Drawings
The present invention is further described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the structure of the deep network model of the present invention;
Fig. 2 is a single photograph of the object to be recovered according to an embodiment of the present invention;
Fig. 3 shows the initial surface normals of the object to be recovered according to an embodiment of the present invention;
Fig. 4 shows the accurate surface normals of the object to be recovered according to an embodiment of the present invention.
(5) Detailed Description of the Embodiments
The present invention is further described below with reference to the accompanying drawings.
A multispectral photometric stereo surface normal recovery method based on deep learning specifically includes the following steps:
(1) Capture a single photograph of the object to be recovered with the multispectral photometric stereo system
A circular support is mounted directly above the object to be recovered (the experimental object), and a camera is placed at the center of the support. Red, green, and blue lights are evenly distributed along the circular track to provide different illumination directions, and the three lights share the same tilt angle of 30 degrees.
Using the multispectral photometric stereo system, a single photograph of the object to be recovered is captured, as shown in Fig. 2. The object is photographed under the illumination of the red, green, and blue lights; with the object taken as the origin of a Cartesian coordinate system, the unit vectors of the red, green, and blue illumination directions are l_R, l_G, and l_B, respectively.
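As a worked illustration of these unit vectors, the Python sketch below computes them under two assumptions that go beyond the text above: the three lights sit at evenly spaced azimuths of 0°, 120°, and 240° on the circular track, and the 30° tilt is measured from the vertical (camera) axis.

```python
import numpy as np

def light_direction(azimuth_deg, tilt_deg=30.0):
    """Unit vector from the object toward a light on the circular track.

    Assumes the z-axis points up toward the camera and the tilt is measured
    from that vertical axis; both conventions are illustrative assumptions.
    """
    az, tilt = np.deg2rad(azimuth_deg), np.deg2rad(tilt_deg)
    return np.array([np.sin(tilt) * np.cos(az),
                     np.sin(tilt) * np.sin(az),
                     np.cos(tilt)])

# Hypothetical evenly spaced azimuths for the red, green, and blue lights.
l_R, l_G, l_B = (light_direction(a) for a in (0.0, 120.0, 240.0))
L = np.stack([l_R, l_G, l_B])  # 3x3 light-direction matrix used in step (3)
```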
(2) Separate the RGB channels of the captured photograph into three single-channel images
The captured photograph is a color image with three RGB channels. The RGB channels are separated and stored as three single-channel grayscale images I_R, I_G, and I_B.
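A minimal sketch of this channel separation using OpenCV; the file names are placeholders.

```python
import cv2

# OpenCV loads color images in BGR order, so unpack accordingly.
img = cv2.imread("object_to_recover.png")   # hypothetical file name
I_B, I_G, I_R = cv2.split(img)              # three single-channel grayscale images
for name, channel in (("I_R.png", I_R), ("I_G.png", I_G), ("I_B.png", I_B)):
    cv2.imwrite(name, channel)
```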
(3) Feed the three single-channel images into the three-input standard photometric stereo algorithm to obtain the initial surface normals (as shown in Fig. 3)
The three-input standard photometric stereo model is C = ρLn, where ρ is a fixed scalar (the albedo) that can be solved for; (x, y) denotes the coordinates of a pixel in the image; C is the vector formed by the intensities of that pixel in the three single-channel images I_R, I_G, and I_B; L is the 3×3 matrix whose rows are the illumination direction unit vectors l_R, l_G, and l_B; and n is the initial surface normal of the pixel to be solved for. Because the three-input standard photometric stereo algorithm ignores both the crosstalk of the red, green, and blue lights across channels and the material properties of the object surface, the initial surface normals obtained in this way contain errors.
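A minimal per-pixel sketch of this three-input solve, assuming the three channel images and a light-direction matrix L such as the one built in the earlier sketch; it reads C = ρLn as ρn = L⁻¹C and normalizes, which is one common way to solve the system and not necessarily the exact solver used here.

```python
import numpy as np

def initial_normals(I_R, I_G, I_B, L):
    """Solve C = rho * L * n per pixel; returns unit normals of shape (H, W, 3)."""
    C = np.stack([I_R, I_G, I_B], axis=-1).astype(np.float64)  # per-pixel intensity vectors
    rho_n = C @ np.linalg.inv(L).T                             # rho * n for every pixel
    rho = np.linalg.norm(rho_n, axis=-1, keepdims=True)        # per-pixel albedo estimate
    return rho_n / np.maximum(rho, 1e-8)                       # unit-length initial normals
```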
(4) Use the deep learning algorithm to input the single photograph of the object to be recovered together with the initial surface normals, and output accurate surface normals
The initial surface normals and the photograph captured in the first step are input into the deep network model, and the deep learning algorithm outputs accurate surface normals. The specific structure of the deep network model is shown in Fig. 1.
First, the captured photograph and the initial normals are aligned and cropped into 40×40-pixel patches. Each cropped patch is fused with the corresponding initial normals and fed into six convolutional layers. The kernel size of the first three convolutional layers is 5×5 pixels and that of the last three is 3×3 pixels. All convolutional layers use "SAME" padding, so the spatial size remains 40×40 pixels throughout. All six convolutional layers use the "ReLU" activation function, and their feature-map channel counts are 32, 64, 128, 128, 64, and 32, respectively. After the six convolutional layers, the network outputs accurate surface normals for the corresponding 40×40-pixel patch, as shown in Fig. 4.
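A minimal Keras sketch of this six-layer network, assuming the fusion is a channel-wise concatenation of the 40×40 RGB patch with the 3-channel initial normals (6 input channels) and that a final 3-channel convolution maps the 32 feature maps to a normal map; both details are assumptions not stated above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_normal_refinement_net(patch_size=40):
    """Six SAME-padded ReLU conv layers: 5x5 kernels for the first three, 3x3 for the last three."""
    inputs = layers.Input(shape=(patch_size, patch_size, 6))  # RGB patch + initial normals (assumed)
    x = inputs
    for filters, kernel in zip([32, 64, 128, 128, 64, 32], [5, 5, 5, 3, 3, 3]):
        x = layers.Conv2D(filters, kernel, padding="same", activation="relu")(x)
    normals = layers.Conv2D(3, 1, padding="same")(x)  # assumed output head producing the normal map
    return models.Model(inputs, normals)
```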
The deep learning algorithm uses the mean squared error as its loss function, E = (1/N) Σ ||n − n̂||², where n denotes the true normal, n̂ denotes the normal predicted by the network, and N is the number of pixels; the loss is minimized with the Adam optimizer using its default settings.
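Continuing the Keras sketch above, the following lines show how the MSE loss and default-setting Adam optimizer might be wired up; `train_patches` and `train_normals` are hypothetical arrays of fused 40×40 training patches and their ground-truth normals, and the batch size and epoch count are illustrative.

```python
import tensorflow as tf

model = build_normal_refinement_net()                 # network from the previous sketch
model.compile(optimizer=tf.keras.optimizers.Adam(),   # Adam with default settings
              loss="mse")                             # mean squared error on the normals
model.fit(train_patches, train_normals, batch_size=32, epochs=50)  # illustrative hyperparameters
```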