Technical Field

The present invention relates to a real-time virtual roaming system for real scenes based on three-view transformation, and belongs to the virtual reality field within computer vision.
Background Art

In the virtual reality branch of computer vision, image-based rendering is an effective technique: a small number of photographs are captured and fused together to synthesize images from new, intermediate viewpoints.

Existing image-based rendering methods fall into three categories: (1) rendering that requires no geometric information at all; (2) rendering that requires a small amount of geometric information; and (3) rendering that requires detailed geometric information. The first category dispenses with geometry, but at the cost of a very large number of input images. The detailed geometry required by the third category can only be acquired with specialized equipment, which limits its range of application. The present system belongs to the second category: it needs neither detailed geometry nor a large number of images, and it computes new-view images in real time as the position and orientation of the user's head change, thereby achieving a virtual roaming effect.
Summary of the Invention

The invention discloses a real-time virtual roaming system for real scenes based on three-view transformation. From three photographs of the same scene taken at different angles and positions, it computes an image from any intermediate viewpoint in real time. The technique bridges upstream data acquisition and downstream visual control in computer vision, promotes applications such as virtual reality and street-view services, and provides at least the following advantages.

Compared with previous systems, only an ordinary camera and a computer are required, so the system composition is simple.

Little manual intervention is needed, the degree of automation is high, and new views are reconstructed with high accuracy.

New view images are computed in real time according to the user's viewpoint.
To achieve the above objects, the disclosed real-time virtual roaming system based on three-view transformation is characterized by the following steps:

(1) Capture three pictures of the same scene with a camera, from different positions and different angles.

(2) Forward transformation: project the original pictures onto a common plane to obtain rectified pictures.

(3) Compute the mapping relationships between the rectified pictures, and interpolate at positions where a mapping changes abruptly.

(4) Linearly fuse the three rectified pictures to obtain the intermediate-view picture.

(5) Backward transformation: project the intermediate-view picture obtained in the previous step back to the viewpoint of the original pictures.
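Steps (2) and (5) both amount to warping an image through a 3×3 projective (homography) matrix. As an illustrative sketch, not part of the disclosed system itself, applying a homography to a single pixel coordinate can be written in plain Python:

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H (nested lists).

    Returns the warped coordinates after the perspective divide.
    """
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

# A pure translation is the simplest homography: shift by (5, 3).
H_shift = [[1, 0, 5],
           [0, 1, 3],
           [0, 0, 1]]
```

Warping a whole image applies this mapping per pixel (in practice via a library routine such as OpenCV's warpPerspective); the backward transformation of step (5) uses the inverse matrix.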
Brief Description of the Drawings

The drawings provide a further understanding of the technical solution of the invention and form a part of the description; together with the embodiments, they serve to explain the technical solution without limiting it. The drawings are as follows:

Fig. 1 is a flow chart of the complete process of the system; Fig. 2 shows the hardware composition of the system.
Detailed Description

Embodiments of the invention are described in detail below with reference to the drawings, so that the process by which technical means are applied to solve the problem and achieve the technical effect can be fully understood and put into practice.
Module 1: data acquisition. Three photographs of the same scene are taken with camera 21 from different angles and different positions.

Module 2: data processing. This module comprises the following steps: 13, forward transformation; 14, mapping interpolation; 15, linear fusion; 16, backward transformation. Its computations run on an ordinary computer 22.
13. Forward transformation: divide the three original pictures into two groups of two pictures each. Two sets of projection matrices are computed from the matching relationships of SIFT feature points, and each group is projected onto a common plane. Then, using the mappings between the pictures, the three pictures are rectified simultaneously so that all of them lie on the same plane.
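In practice, SIFT keypoints and the rectifying homographies are obtained with a library (for example, OpenCV can estimate rectifying transforms from the fundamental matrix via stereoRectifyUncalibrated). A standard first step, filtering putative SIFT matches with Lowe's ratio test, can be sketched in plain Python; the tuple representation of matches below is a simplifying assumption for illustration:

```python
def ratio_test(knn_matches, ratio=0.75):
    """Keep only distinctive feature matches (Lowe's ratio test).

    knn_matches: for each query keypoint i, a tuple
    (train_index, best_distance, second_best_distance).
    Returns the surviving (query_index, train_index) pairs.
    """
    good = []
    for i, (j, d1, d2) in enumerate(knn_matches):
        if d1 < ratio * d2:  # best match clearly better than runner-up
            good.append((i, j))
    return good
```

Only the matches that survive this filter would be passed on to fundamental-matrix and rectification estimation.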
Let the three original pictures be I1, I2, I3, grouped as (I1, I2) and (I1, I3); rectification yields the parallel picture groups (I′1, I′2) and (I″1, I″3). The mapping relationships of the two groups are obtained with TSGO, currently among the most accurate stereo matching methods. Let E be the intermediate image of I′1 and I′2; rectifying E and I3 once more yields the rectification matrices H6 and H5 between them. Combined with the two previous sets of rectification matrices, this gives projection matrices that rectify all three pictures simultaneously.
14. Mapping interpolation: the mapping relationships of the two picture groups were obtained in the previous step. Combining them with the transformations given by the projection matrices yields the mapping relationships between the three parallel pictures.

Abrupt jumps in these mappings would cause large holes when the pictures are fused in the next step. Filling such holes afterwards costs extra interpolation time, and the quality of the fill degrades as the holes grow. This step therefore interpolates the mappings in advance, which both shortens the online stage and improves the quality of the generated pictures.
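A minimal one-dimensional sketch of this pre-interpolation follows; the function name and the integer target-column representation are assumptions for illustration. Here m[i] is the target column of source column i in a rectified row: a jump larger than one pixel would leave a hole of m[i+1] - m[i] - 1 columns, which is filled by linearly interpolating the fractional source position:

```python
def densify_mapping(m):
    """Fill jumps in a 1-D forward mapping m (source column -> target column).

    Returns (source_position, target_column) pairs covering every
    target column between consecutive entries, so forward warping
    leaves no holes.
    """
    out = []
    for i in range(len(m) - 1):
        out.append((float(i), m[i]))
        gap = m[i + 1] - m[i]
        if gap > 1:  # jump: would leave a hole of gap - 1 pixels
            for k in range(1, gap):
                # linearly interpolate the fractional source position
                out.append((i + k / gap, m[i] + k))
    out.append((float(len(m) - 1), m[-1]))
    return out
```

For example, the mapping [0, 1, 5, 6] has a jump of 4 between columns 1 and 2; the densified mapping adds three interpolated samples so that targets 2, 3, and 4 are covered.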
15. Linear fusion: according to established theory, linear fusion of parallel views is consistent with perspective geometry. Based on the barycentric position λ = (λ1, λ2, λ3) of the intermediate view, a mapping function is defined that fuses the three parallel pictures into the intermediate-view picture.
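Per pixel, such a fusion is a convex combination of the three rectified pictures weighted by λ. A sketch under illustrative assumptions (RGB tuples, weights already summing to one):

```python
def fuse_pixels(p1, p2, p3, lam):
    """Blend three corresponding RGB pixels with barycentric weights lam."""
    l1, l2, l3 = lam
    assert abs(l1 + l2 + l3 - 1.0) < 1e-9, "barycentric weights must sum to 1"
    return tuple(l1 * a + l2 * b + l3 * c for a, b, c in zip(p1, p2, p3))
```

With λ = (1/3, 1/3, 1/3) the result is the exact midpoint of the three views; as λ moves toward a vertex, the fused picture approaches the corresponding input view.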
16. Backward transformation: the preceding steps produce an arbitrary intermediate view of the parallel pictures. It must finally be projected back to the viewpoint of the original pictures, yielding an intermediate picture Is whose pixel positions and colors interpolate those of the originals. The existing backward transformation algorithm for two pictures is extended here, and a new backward transformation algorithm for three pictures under large-parallax conditions is proposed.
Extending the existing two-picture backward transformation algorithm to three pictures gives the projection matrix Ht = H6 (H6^-1 Hs)^t, where Hs = H5 H1 [(H5 H1)^-1 (H5 H2)]^s. It can be shown that Ht does not converge when the viewpoints of the original pictures differ greatly, so this method cannot produce valid results in that case.
For original pictures whose viewpoints differ greatly, a concise and effective backward transformation is therefore proposed: Ht = (1 - t) H6 + t Hs, where Hs = (1 - s) H5 H1 + s H5 H2. This method linearly combines the projection matrices and achieves good results.
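The proposed large-parallax backward transformation is a plain linear blend of 3×3 matrices. A sketch with nested-list matrices (the helper names are illustrative, not from the disclosure):

```python
def mat_mul(A, B):
    """3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_lerp(A, B, t):
    """Entrywise linear interpolation (1 - t) * A + t * B."""
    return [[(1 - t) * A[i][j] + t * B[i][j] for j in range(3)]
            for i in range(3)]

def backward_homography(H1, H2, H5, H6, s, t):
    """H_t = (1 - t) H6 + t H_s, with H_s = (1 - s) H5 H1 + s H5 H2."""
    Hs = mat_lerp(mat_mul(H5, H1), mat_mul(H5, H2), s)
    return mat_lerp(H6, Hs, t)
```

At t = 0 the blend reduces to H6, and at t = 1 to Hs, so the warp varies continuously with the intermediate-view parameters.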
Applying Ht^-1 to the intermediate parallel-view picture yields the normal-view picture Is. Since the first three steps can be completed offline and only this step must be computed online, GPU parallel computation achieves 0.05 s per frame, i.e. 20 frames per second, meeting the real-time requirement.
Module 3: virtual roaming. The user wears a virtual reality headset 23 (such as an Oculus Rift). The headset captures the position and orientation of the user's head; the data processing module converts them into the barycentric coordinates λ = (λ1, λ2, λ3) described above and computes the intermediate-view picture in real time. The user can thus experience a virtual tour of a real-world scene.
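One simple way to turn a tracked head position into λ is to treat the three camera centers as a triangle and take the barycentric coordinates of the head position projected into its plane; this 2D reduction is an illustrative assumption, not the disclosed implementation:

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of 2D point p w.r.t. triangle (a, b, c).

    Returns (l1, l2, l3) with p = l1*a + l2*b + l3*c and l1 + l2 + l3 = 1.
    """
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    l1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return (l1, l2, 1.0 - l1 - l2)
```

When the head sits at a camera center, λ collapses onto that view; at the centroid of the three cameras all three views contribute equally.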
Those skilled in the art will appreciate that the system structure and steps described above can be implemented with a general-purpose camera and computing devices. They may be concentrated on a single computing device or distributed over a network of computing devices; optionally, they may be implemented as program code executable by a computing device, stored in a storage device and executed by the computing device, or fabricated as individual integrated-circuit modules, or with several of the modules or steps fabricated as a single integrated-circuit module. The invention is therefore not limited to any particular combination of hardware and software.

Although embodiments of the invention have been shown and described above, they are presented only to facilitate understanding of the invention and are not intended to limit it. Any person skilled in the art may make modifications and changes in form and detail without departing from the spirit and scope disclosed herein, but the scope of patent protection of the invention remains defined by the appended claims.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201611271591.1A | 2016-12-30 | 2016-12-30 | Real scene real-time virtual wandering system based on three-perspective transformation |
| Publication Number | Publication Date |
|---|---|
| CN106648109A (en) | 2017-05-10 |
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2017-05-10 |