

Technical Field
The present invention relates to the technical field of three-dimensional design, and in particular to a three-dimensional model generation method and system that generate dense point clouds through cascaded image-based generation.
Background Art
The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.
Image-based three-dimensional reconstruction refers to generating, from one or more images, a realistic three-dimensional model that matches them. Generating a high-resolution model from a given image is very important for practical applications such as robotics, computer vision, and autonomous driving; in the field of computer vision in particular, high-resolution three-dimensional models of target objects often need to be reconstructed.
The human visual system can process the retinal image of a target object to extract its underlying three-dimensional structure; its three-dimensional perception is not limited to reconstructing the overall shape but also captures local details of the object's surface.
Similar to the human visual system, machines can also learn to perceive the various target objects of the three-dimensional world; the point cloud is one representation through which machines learn and perceive such objects. Compared with three-dimensional models represented by geometric primitives or simple meshes, point clouds may be less efficient at representing the underlying geometric structure, but because they require no predefined primitives or connectivity they also offer many advantages: a simple, unified, easy-to-learn three-dimensional structure; easy geometric transformation and deformation; and full capture of a model's surface information.
Point clouds are widely used because they are vectorized and information-dense, but their disordered and discrete geometric features reduce the accuracy of point clouds reconstructed from images. The inventors found that reconstructing a three-dimensional object from a single image suffers from the following problems: early three-dimensional reconstruction methods were mainly based on the study of geometric features. One class reconstructs by searching for the best-fitting parameters between a model projection and the input image; another recovers geometry from three-dimensional cues in the image such as shading, texture, and silhouettes. These methods require substantial prior information and place strict demands on lighting and environment, so it is usually difficult for them to reconstruct high-quality object models.
Summary of the Invention
To solve the above problems, the present invention proposes a three-dimensional model generation method and system that generate dense point clouds through cascaded image-based generation. A high-resolution target point cloud is reconstructed through image pre-reconstruction and two stages of uniform point-cloud upsampling; staged training and network fine-tuning achieve joint optimization, so that the generated dense point cloud is uniformly distributed and the visual realism of the dense point cloud model is enhanced; and an image re-description mechanism ties the dense point cloud more closely to the original single image.
To achieve the above object, the present invention adopts the following technical solutions:
In a first aspect, the present invention provides a three-dimensional model generation method that generates dense point clouds through cascaded image-based generation, comprising:
applying pre-reconstruction to an acquired image to obtain the corresponding sparse point cloud model;
applying two successive stages of uniform upsampling to the sparse point cloud model to obtain a dense point cloud model, where each uniform upsampling stage comprises feature extraction, residual graph convolution, and upsampling; and
reconstructing the dense point cloud model and outputting a three-dimensional graph model associated with the image.
In a second aspect, the present invention provides a three-dimensional model generation system that generates dense point clouds through cascaded image-based generation, comprising:
a pre-reconstruction module configured to apply pre-reconstruction to an acquired image to obtain the corresponding sparse point cloud model;
a dense point cloud model generation module configured to apply two successive stages of uniform upsampling to the sparse point cloud model to obtain a dense point cloud model, where each uniform upsampling stage comprises feature extraction, residual graph convolution, and upsampling; and
a three-dimensional model module configured to reconstruct the dense point cloud model and output a three-dimensional graph model associated with the image.
In a third aspect, the present invention provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are executed by the processor, the method of the first aspect is carried out.
In a fourth aspect, the present invention provides a computer-readable storage medium for storing computer instructions; when the computer instructions are executed by a processor, the method of the first aspect is carried out.
Compared with the prior art, the beneficial effects of the present invention are:
The present invention reconstructs a high-resolution target point cloud through image pre-reconstruction and point-cloud upsampling: pre-reconstruction and uniform upsampling are first combined to generate a high-resolution point cloud, with joint optimization achieved through staged training; an image re-description mechanism then establishes a bidirectional association and strengthens the semantic consistency between the image and the point cloud.
Through the synergy of pre-reconstruction, uniform upsampling, and image re-description, the present invention improves the accuracy of the generated dense point cloud; the generated dense point cloud is uniformly distributed, the visual realism of the dense point cloud model is enhanced, and the generated dense point cloud is more closely associated with the original single image.
Brief Description of the Drawings
The accompanying drawings, which form a part of the present invention, are provided for further understanding of the invention; the exemplary embodiments of the invention and their descriptions explain the invention and do not unduly limit it.
Fig. 1 is a flowchart of the three-dimensional model generation method based on cascaded generation of dense point clouds from images provided in Embodiment 1 of the present invention;
Fig. 2 is a schematic structural diagram of the cascaded generation framework for dense point clouds provided in Embodiment 1 of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawings and embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those of ordinary skill in the art to which the invention belongs.
It should be noted that the terminology used herein is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the invention. As used herein, unless the context clearly indicates otherwise, singular forms are intended to include plural forms as well. It should further be understood that the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such process, method, product, or device.
The embodiments of the present invention and the features of the embodiments may be combined with one another without conflict.
Embodiment 1
As shown in Fig. 1, this embodiment provides a three-dimensional model generation method that generates dense point clouds through cascaded image-based generation, comprising:
S1: applying pre-reconstruction to an acquired image to obtain the corresponding sparse point cloud model;
S2: applying two successive stages of uniform upsampling to the sparse point cloud model to obtain a dense point cloud model, where each uniform upsampling stage comprises feature extraction, residual graph convolution, and upsampling;
S3: reconstructing the dense point cloud model and outputting a three-dimensional graph model associated with the image.
In the course of dense point-cloud generation, vision technology is fused with three-dimensional modeling by extracting information from a single image; pre-reconstruction, uniform upsampling, and image re-description of the three-dimensional model are carried out, ultimately producing a three-dimensional model that meets the user's design intent and realizing three-dimensional reconstruction from a single image.
In this embodiment, a chair is taken as the example three-dimensional design object; it will be understood that in other implementations the design object may be a car, an airplane, a table, or a house, as long as a visual image of it can be provided.
Fig. 2 shows the process of generating a dense point cloud model from a single image of the chair, which specifically includes the following.
In step S1, to generate a dense point cloud with visual realism and semantic consistency from a given single image, a pre-reconstruction process is proposed.
The pre-reconstruction includes:
S1-1: inputting a single RGB image into an encoder-decoder network composed of multiple convolutional layers and fully connected layers;
S1-2: extracting the features of the single RGB image through this network and outputting a sparse point cloud;
S1-3: to better train the network, the pre-reconstruction loss function is defined as

$$\mathcal{L}_{pre} = \sum_{i} d\big(P_i^{pre},\, P_i^{T}\big)$$

where $P_i^{pre}$ and $P_i^{T}$ denote the point cloud of the $i$-th reconstructed sample and of the $i$-th ground-truth sample, respectively, and $d(P_i^{pre}, P_i^{T})$ denotes the model distance between the two point clouds, which may be the Chamfer Distance (CD) or the Earth Mover's Distance (EMD).
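As a concrete reference for the model distance, the Chamfer Distance can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation; the common mean-normalized, squared-distance form is assumed:

```python
import numpy as np

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer Distance between clouds of shape (N, 3) and (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    diff = p[:, None, :] - q[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # For each point, squared distance to its nearest neighbor in the other cloud,
    # averaged in both directions.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())
```

Two identical clouds give a distance of zero, and the value grows with the mismatch between the clouds.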
In step S2, the cascaded generation of the dense point cloud applies two stages of uniform upsampling to the sparse point cloud produced by pre-reconstruction, so that the dense point cloud is generated uniformly with higher visual realism and semantic consistency; each uniform upsampling stage consists of a feature extraction network, residual graph convolution blocks, and upsampling blocks.
The feature extraction network serves to give the generated point cloud higher visual realism and semantic consistency by obtaining point-cloud features as auxiliary information to guide dense point-cloud generation. This embodiment designs a feature extraction network block that takes the point cloud as input and outputs its corresponding features f, specifically:
for each point p of shape 1×3, its k nearest neighbors N of shape k×3 are obtained and the offsets P = N − p are computed; by convolving point by point, P is converted into a feature $f_p$ of shape 1×c. In this embodiment, k is set to 6, c is set to 64, and the number of convolutional layers is set to 3.
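The neighbor-offset computation P = N − p above can be sketched as follows. This is a minimal NumPy illustration; the subsequent per-point convolution layers that map the offsets to the 1×c feature are omitted:

```python
import numpy as np

def knn_relative_positions(points: np.ndarray, k: int = 6) -> np.ndarray:
    """For every point p, stack the offsets N - p to its k nearest neighbors.

    points: (n, 3) cloud; returns (n, k, 3) relative positions, which the
    per-point convolution layers would then map to an (n, c) feature
    (c = 64 in the embodiment).
    """
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)              # exclude the point itself
    idx = np.argsort(d2, axis=1)[:, :k]       # indices of the k nearest neighbors
    return points[idx] - points[:, None, :]   # N - p, shape (n, k, 3)
```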
The residual graph convolution block takes the point cloud and its corresponding features as input and further extracts residual features.
The core of the residual graph convolution block is G-conv, which is defined on a graph G = (V, ε) and computed as

$$f_p^{(j+1)} = w_0\, f_p^{(j)} + \sum_{q \in V(p)} w_1\, f_q^{(j)}$$

where $f_p^{(j)}$ denotes the feature of vertex p at layer j; the $w_i$ are the layer weights; V(p) is the set of all vertices connected to p, defined by the adjacency matrix ε, and q is a point belonging to V(p). Because a predefined adjacency matrix ε is unavailable for a point cloud, V(p) is defined as the k nearest neighbors of p in Euclidean space, with coordinates given by the input point cloud $x_{in}$.
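One G-conv layer, in the neighbor-aggregation form described here, can be sketched as follows. The two weight matrices stand in for the layer's learnable parameters; this parameterization is an illustrative assumption, not the network's exact form:

```python
import numpy as np

def g_conv(feats, neighbor_idx, w_self, w_neigh):
    """One G-conv layer: each vertex mixes its own feature with a sum
    over its k nearest neighbors.

    feats:           (n, c_in) vertex features at layer j
    neighbor_idx:    (n, k) neighbor indices defining V(p)
    w_self, w_neigh: (c_in, c_out) weight matrices
    returns (n, c_out) features for layer j + 1
    """
    neigh_sum = feats[neighbor_idx].sum(axis=1)  # sum of f_q over q in V(p)
    return feats @ w_self + neigh_sum @ w_neigh
```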
To better train the generator G, a residual graph convolution loss function is proposed:

$$\mathcal{L}_{res} = \sum_{i} \Big[\, \mathcal{L}_{dist}\big(y_i,\, G(x_i)\big) + \lambda\, \mathcal{L}_{gen}\big(G(x_i)\big) \Big]$$

where λ is a hyperparameter controlling the balance between the distance loss term $\mathcal{L}_{dist}$ and the generation loss term $\mathcal{L}_{gen}$; $x_i$ is the input sparse point cloud; $y_i$ is the ground-truth dense point cloud corresponding to $x_i$; and $G(x_i)$ is the dense point cloud generated by the point-cloud upsampling network. The distance loss term and the generation loss term are explained as follows.
$\mathcal{L}_{dist}$ measures the distance between y and $\hat{y}$:

$$\mathcal{L}_{dist}(y, \hat{y}) = \sum_{p \in y} \min_{q \in \hat{y}} \|p - q\|_2^2 + \sum_{q \in \hat{y}} \min_{p \in y} \|q - p\|_2^2$$

where $\hat{y}$ denotes the generated dense point cloud. In addition, $\mathcal{L}_{gen}$ is defined as

$$\mathcal{L}_{gen}(\hat{y}) = \big(D(\hat{y}) - 1\big)^2$$

which drives the generator to produce point clouds that the discriminator D scores as real.
To make network training converge faster and achieve better results, in addition to G-conv this embodiment introduces a residual network; verification shows that the residual network helps discover the similarity between low-resolution point clouds and the corresponding high-resolution point clouds.
The upsampling block takes the point cloud $x_{in}$ and its corresponding features $f_{in}$ as input; rather than regressing $x_{out}$ directly, it predicts the residual between $x_{in}$ and $x_{out}$:
a G-conv layer converts the features $f_{in}$ into a tensor;
the tensor is reshaped and denoted $\delta_x$;
the upsampled point cloud $x_{out}$ is obtained by adding $x_{in}$ and $\delta_x$ point by point, so that each point of the input cloud is converted into two points after upsampling.
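The residual upsampling step can be sketched as follows, assuming the predicted offsets arrive as an (n, 2, 3) tensor that is reshaped to (2n, 3); this shape is a plausible reading consistent with each point becoming two points, not stated explicitly in the text:

```python
import numpy as np

def upsample_points(x_in: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """2x upsampling by residual offsets (sketch of the upsampling block).

    x_in:  (n, 3) input cloud
    delta: (n, 2, 3) per-point offsets, here taken as given (in the
           network they come from the G-conv layers).
    """
    n = x_in.shape[0]
    delta_x = delta.reshape(2 * n, 3)    # the reshaped tensor delta_x
    x_rep = np.repeat(x_in, 2, axis=0)   # duplicate each input point
    return x_rep + delta_x               # point-wise addition: x_out
```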
The output feature $f_{out}$ corresponding to each point p of the output point cloud is

$$f_{out}(p) = \max_{q \in V[x_{in}](p)} f_{in}(q)$$

where $V[x_{in}](p)$ denotes the k nearest neighbors of point p in the point cloud $x_{in}$.
In this embodiment, the sparse point cloud is first uniformly upsampled once to obtain a denser point cloud, which is compared by a discriminator against a real point cloud of the same scale to judge whether the generated denser point cloud is real or fake; the denser point cloud of the current stage is then fed into the next uniform upsampling network to generate the final dense point cloud.
In this embodiment, the cascaded generation network that reconstructs a dense point cloud from a single image is trained end to end, and the generated dense point cloud is more uniformly distributed.
In step S3, after the dense point cloud model has been generated from the single image, an image re-description mechanism is proposed to improve the visual realism of the point cloud, establish a bidirectional association, and strengthen the semantic consistency between the image and the point cloud.
The image re-description module maps the original input single image and the finally generated dense point cloud into a common semantic space through two networks, an image encoder and a model encoder; these two encoders make it possible to measure the similarity between the image and the point cloud and thereby compute the image regeneration loss used for target point-cloud reconstruction.
The image encoder is an Inception-v3 network pretrained on ImageNet that maps the input image into the image semantic space; the input image is first rescaled to 299×299 pixels and then fed into the image encoder, after which the image feature vector is extracted from the last pooling layer of Inception-v3.
The model encoder is a one-dimensional CNN that maps the generated point cloud into the model semantic space. This embodiment extracts the model feature vector using one-dimensional convolution and feature transformation, and adds a feature projection network with three fully connected layers to map the model features into the semantic space shared with the image features; the computation is

$$v_{model} = F_P\big(f_{model}\big)$$

where $F_P$ denotes the feature projection network, $f_{model}$ the extracted model feature vector, and $v_{model}$ the model feature vector in the image semantic space.
In this embodiment, generating a dense point cloud model from a single image is carried out by multiple modules working together; under their synergy an image regeneration loss is introduced by mapping the original single image and the finally generated dense point cloud model into a common space and computing the loss between the two. This loss reflects the visual realism of the generated dense point cloud and strengthens the association between the single image and the final dense point cloud model.
The image regeneration loss $\mathcal{L}_{reg}$ is used to strengthen the semantic correlation between the reconstructed point cloud and the given image; its loss function is

$$\mathcal{L}_{reg} = d_{Euc}\big(v_{img},\, v_{model}\big)$$

where $d_{Euc}(x, y)$ denotes the Euclidean distance between vectors x and y, and $v_{img}$ and $v_{model}$ are the image and model feature vectors in the common semantic space.
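A hypothetical sketch of this re-description loss follows: a small fully connected stack stands in for the three-layer feature projection network, followed by the Euclidean distance in the common space. The ReLU activation and the use of plain weight matrices are assumptions, not details given in the text:

```python
import numpy as np

def image_regeneration_loss(img_feat: np.ndarray, model_feat: np.ndarray,
                            proj_weights) -> float:
    """Project the model feature through a stack of FC layers (a stand-in
    for the three-layer projection network), then take the Euclidean
    distance to the image feature vector in the common space."""
    h = model_feat
    for w in proj_weights:           # three FC layers in the embodiment
        h = np.maximum(h @ w, 0.0)   # linear layer + ReLU (assumed activation)
    return float(np.linalg.norm(img_feat - h))
```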
基于单幅图像重建稠密点云的级联生成网络中生成器的最终生成损失函数为:The final generation loss function of the generator in the cascade generation network for reconstructing dense point clouds from a single image is:
其中,λ1,λ2是平衡各项损失的超参数,为重建稠密点云的级联生成网络中的预重构损失函数,为残差图卷积块中总的损失函数,为图像再描述模块的图像再生成损失函数。Among them, λ1 , λ2 are hyperparameters that balance the losses, The pre-reconstruction loss function in the cascade generative network for reconstructing dense point clouds, is the total loss function in the residual map convolution block, Generates a loss function for the image re-description module's image reconstruction.
In this embodiment, to generate dense point clouds with higher visual realism, a point-cloud discrimination process is proposed; the discriminator D consists of three parts: a feature extraction network, residual graph convolution blocks, and a pooling block.
Using the farthest point sampling (FPS) method, the input point cloud $x_{in}$ of shape 4n×3 is converted into an output point cloud $x_{out}$ of shape n×3, and the feature of each retained point p is obtained as

$$f_{out}(p) = \max_{q \in V[x_{in}](p)} f_{in}(q)$$

where max denotes the point-wise maximization operation.
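Farthest point sampling itself can be sketched as follows; this is a standard greedy implementation, and the choice of starting point is arbitrary:

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, m: int) -> np.ndarray:
    """Select m points from an (n, 3) cloud by iteratively taking the point
    farthest from the already-chosen set (the FPS step that shrinks a
    4n x 3 cloud to n x 3 in the discriminator)."""
    chosen = [0]                                  # start from an arbitrary point
    d2 = np.sum((points - points[0]) ** 2, axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(d2))                  # farthest from the current set
        chosen.append(nxt)
        d2 = np.minimum(d2, np.sum((points - points[nxt]) ** 2, axis=1))
    return points[np.array(chosen)]
```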
In this embodiment, a local discriminator of a graph generative adversarial network is constructed to replace the global discriminator used in previous methods; specifically, the local discriminator downsamples its input several times so that the output contains more than one point.
The loss function of the discriminator D aims to distinguish real dense point clouds from generated ones by minimizing

$$\mathcal{L}_D = \sum_{i} \Big[ D\big(G(x_i)\big)^2 + \big(D(y_i) - 1\big)^2 \Big]$$

so that real clouds are scored close to 1 and generated clouds close to 0.
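The discriminator objective can be sketched numerically as follows, treating the discriminator's outputs as given arrays; the least-squares (LSGAN-style) form used here is one plausible reading rather than the patent's exact formula:

```python
import numpy as np

def lsgan_d_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """Least-squares discriminator loss: push D(real) toward 1 and
    D(generated) toward 0, averaged over a batch of scores."""
    return float(np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2))
```

When the discriminator already scores real clouds as 1 and generated clouds as 0, the loss is zero; any misclassification increases it quadratically.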
This embodiment takes a single image of the visual aspect of the design object as input. The pre-reconstruction module first extracts features from the single image and pre-reconstructs a sparse point cloud; uniform upsampling then generates a dense point cloud on the basis of the sparse one. In the image re-description module, the image encoder and the model encoder extract features from the single image and the final dense point cloud, respectively; after mapping into the common space, the loss is computed to ensure that the final dense point cloud has visual realism and semantic consistency.
Embodiment 2
This embodiment provides a three-dimensional model generation system that generates dense point clouds through cascaded image-based generation, comprising:
a pre-reconstruction module configured to apply pre-reconstruction to an acquired image to obtain the corresponding sparse point cloud model;
a dense point cloud model generation module configured to apply two successive stages of uniform upsampling to the sparse point cloud model to obtain a dense point cloud model, where each uniform upsampling stage comprises feature extraction, residual graph convolution, and upsampling; and
a three-dimensional model module configured to reconstruct the dense point cloud model and output a three-dimensional graph model associated with the image.
It should be noted here that the above modules correspond to steps S1 to S3 of Embodiment 1; the examples and application scenarios realized by the modules are the same as those of the corresponding steps, but are not limited to what is disclosed in Embodiment 1. It should also be noted that, as part of the system, the above modules may be executed in a computer system such as a set of computer-executable instructions.
Further embodiments also provide:
an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are executed by the processor, the method described in Embodiment 1 is carried out. For brevity, details are not repeated here.
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may include read-only memory and random-access memory and provide instructions and data to the processor; part of the memory may also include non-volatile random-access memory. For example, the memory may also store information on the device type.
A computer-readable storage medium is provided for storing computer instructions; when the computer instructions are executed by a processor, the method described in Embodiment 1 is carried out.
The method of Embodiment 1 may be embodied directly as being carried out by a hardware processor, or by a combination of hardware and software modules within the processor. The software modules may reside in storage media mature in the art, such as random-access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware. To avoid repetition, no detailed description is given here.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with this embodiment can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of this application.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may undergo various modifications and changes. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall be included within its scope of protection.
Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the present invention; those skilled in the art should understand that, on the basis of the technical solutions of the present invention, various modifications or variations that can be made without creative effort still fall within the scope of protection of the present invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010986409.0A | 2020-09-18 | 2020-09-18 | Three-dimensional model generation method and system for generating dense point cloud based on image cascade |
| Publication Number | Publication Date |
|---|---|
| CN112258626A | 2021-01-22 |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010986409.0APendingCN112258626A (en) | 2020-09-18 | 2020-09-18 | Three-dimensional model generation method and system for generating dense point cloud based on image cascade |
| Country | Link |
|---|---|
| CN (1) | CN112258626A (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113205104A (en)* | 2021-04-23 | 2021-08-03 | 广西大学 | Point cloud completion method based on deep learning |
| CN113628338A (en)* | 2021-07-19 | 2021-11-09 | 香港中文大学(深圳) | Sampling reconstruction method and device, computer equipment and storage medium |
| CN114092653A (en)* | 2022-01-11 | 2022-02-25 | 深圳先进技术研究院 | Method, device, device and storage medium for reconstructing 3D image based on 2D image |
| CN114283258A (en)* | 2021-12-27 | 2022-04-05 | 上海电力大学 | A CNN-based method for generating 3D point cloud from a single image |
| CN114821251A (en)* | 2022-04-28 | 2022-07-29 | 北京大学深圳研究生院 | Method and device for determining point cloud up-sampling network |
| CN115578265A (en)* | 2022-12-06 | 2023-01-06 | 中汽智联技术有限公司 | Point cloud enhancement method, system and storage medium |
| CN116246131A (en)* | 2022-12-13 | 2023-06-09 | 中国电波传播研究所(中国电子科技集团公司第二十二研究所) | A method and device for fusion perception model training, dense 3D point cloud acquisition and modeling based on visual image and point cloud |
Citations (3)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109685842A (en)* | 2018-12-14 | 2019-04-26 | University of Electronic Science and Technology of China | Sparse depth densification method based on a multi-scale network |
| CN111192313A (en)* | 2019-12-31 | 2020-05-22 | 深圳优地科技有限公司 | Method for a robot to construct a map, robot, and storage medium |
| CN111563923A (en)* | 2020-07-15 | 2020-08-21 | Zhejiang Dahua Technology Co., Ltd. | Method for obtaining a dense depth map and related device |
Non-Patent Citations (2)

| Title |
|---|
| HUIKAI WU et al.: "Point Cloud Super Resolution with Adversarial Residual Graph Networks", arXiv:1908.02111v1 [cs.GR]* |
| TIANSHI WANG et al.: "HIGH-RESOLUTION POINT CLOUD RECONSTRUCTION FROM A SINGLE IMAGE BY REDESCRIPTION", IEEE* |
Similar Documents

| Publication | Title | Publication Date |
|---|---|---|
| CN112258626A (en) | Three-dimensional model generation method and system for generating dense point cloud based on image cascade | |
| CN111915487B (en) | Face super-resolution method and device based on hierarchical multi-scale residual fusion network | |
| CN110163801B (en) | Image super-resolution and colorization method, system and electronic device | |
| CN113870422B (en) | A point cloud reconstruction method, device, equipment and medium | |
| CN110570522B (en) | Multi-view three-dimensional reconstruction method | |
| CN114463511B (en) | A 3D human body model reconstruction method based on Transformer decoder | |
| CN108710906B (en) | Real-time point cloud model classification method based on lightweight network LightPointNet | |
| Lu et al. | Attention-based dense point cloud reconstruction from a single image | |
| KR102461111B1 (en) | Texture mesh reconstruction system based on single image and method thereof | |
| CN114494708B (en) | Point cloud data classification method and device based on multimodal feature fusion | |
| CN107103585B (en) | An image super-resolution system | |
| CN112258625A (en) | Reconstruction method and system from single image to 3D point cloud model based on attention mechanism | |
| Son et al. | SAUM: Symmetry-aware upsampling module for consistent point cloud completion | |
| CN115239861A (en) | Facial data enhancement method, device, computer equipment and storage medium | |
| CN117576312A (en) | Hand model construction method and device and computer equipment | |
| KR102541665B1 (en) | Apparatus and method for generating images using generative adversarial network | |
| Zeng et al. | Point cloud up-sampling network with multi-level spatial local feature aggregation | |
| CN116363304A (en) | A hand-painted 3D reconstruction method based on multi-feature fusion | |
| CN114140317A (en) | An Image Animation Method Based on Cascaded Generative Adversarial Networks | |
| CN111311732B (en) | 3D human body mesh acquisition method and device | |
| CN114048845B (en) | Point cloud repairing method and device, computer equipment and storage medium | |
| CN116309774A (en) | A Dense 3D Reconstruction Method Based on Event Camera | |
| CN112785498B (en) | Pathological image superscore modeling method based on deep learning | |
| CN113158970B (en) | Action identification method and system based on fast and slow dual-flow graph convolutional neural network | |
| CN119516115A (en) | A three-dimensional human body point cloud completion method, system, medium and device based on Transformer |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 2021-01-22 |