CN115088016B


Info

Publication number: CN115088016B
Application number: CN202180012967.XA
Authority: CN
Inventors: 邓凡
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2020-02-05
Filing date: 2021-02-04
Publication date: 2024-08-23
Anticipated expiration: 2041-02-04
Also published as: CN115088016A; WO2021155828A1

Abstract

A method implemented by a computer system includes initializing a visual simultaneous localization and mapping (vSLAM) unit in communication with the computer system using a first image and a first calibration data set. The first image has a first pixel resolution. The method further includes determining an initialization quality value and determining that the initialization quality value is outside a predetermined initialization threshold. The method further includes generating a second image at a second pixel resolution that is higher than the first pixel resolution, generating a second calibration data set based at least in part on the second pixel resolution associated with the second image, and reinitializing the vSLAM unit using the second image and the second calibration data set.

Description

Translated from Chinese

Method and system for implementing dynamic input resolution for a vSLAM system

Background Art

Augmented reality (AR) overlays virtual content on a user's view of the real world. With the development of AR software development kits (SDKs), the mobile industry has brought smartphone AR into the mainstream. AR SDKs typically provide six degrees-of-freedom (6DoF) tracking capabilities. A user can scan the environment using a camera included in an electronic device (such as a smartphone or an AR system), and the device performs visual simultaneous localization and mapping (vSLAM) in real time. vSLAM can be implemented in a mobile device using a vSLAM unit that detects features of real-world objects and tracks these features as the mobile device moves through its three-dimensional environment.

Despite the progress made in the field of AR, there is a need in the art for improved methods and systems related to AR.

Summary of the Invention

The present invention relates generally to methods and systems for augmented reality applications. More specifically, embodiments of the present invention provide methods and systems for dynamic scaling of image input resolution. The present invention is applicable to a variety of applications involving vSLAM operations, including but not limited to computer-vision-based online 3D modeling, AR visualization, facial recognition, robotics, and self-driving cars.

As described herein, embodiments of the present invention respond to computing resource demands by adjusting image resolution during the vSLAM process. Image resolution can be adjusted (e.g., reduced) to lower computational demands and improve system performance. For example, vSLAM initialization can use high-resolution images, while pose generation can use lower-resolution images that have been reduced by a scaler.

A system of one or more computers can be configured to perform particular operations or actions by having software, firmware, hardware, or a combination thereof installed on the system that, in operation, causes the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by including instructions that, when executed by a data processing apparatus, cause the apparatus to perform the actions. One general aspect includes a method for dynamic visual simultaneous localization and mapping (vSLAM) processing. In this method, a computer system initializes a vSLAM unit in communication with the computer system using a first image and a first calibration data set, where the first image has a first pixel resolution. The method also includes determining an initialization quality value and determining that the initialization quality value is outside a predetermined initialization threshold. The method also includes generating a second image at a second pixel resolution higher than the first pixel resolution, and generating a second calibration data set based at least in part on the second pixel resolution associated with the second image. The method also includes reinitializing the vSLAM unit using the second image and the second calibration data set.

Implementations of the above method may include one or more of the following features. Optionally, the method includes receiving an original image at an initial pixel resolution and receiving an initial calibration data set. The method may also include generating the first image from the original image and generating the first calibration data set from the initial calibration data set based at least in part on the first pixel resolution. Optionally, the computer system receives the original image from an optical sensor in communication with the computer system. Optionally, generating the first image includes receiving, at a scaler unit in communication with a performance monitor and the optical sensor, a reduction factor from the performance monitor, and reducing the original image from the initial pixel resolution to the first pixel resolution based at least in part on the reduction factor, where the first pixel resolution is lower than the initial pixel resolution. Optionally, initializing the vSLAM unit includes generating an initialization result that includes at least one of an initial output pose, a coordinate system, or an initial object map. Optionally, the computer system determines the initialization quality value at least in part by measuring initialization accuracy using a performance monitor in communication with the computer system, and at least in part by measuring an error between the initialization result and motion data generated by an inertial measurement unit in communication with the vSLAM unit. Optionally, the second image is generated from the first image by scaling the first image up from the first pixel resolution to the second pixel resolution. Optionally, the first calibration data set is received by the vSLAM unit from a data scaling processor in communication with the performance monitor. Optionally, the data scaling processor generates the second calibration data set based at least in part on one or more instructions from the performance monitor. Optionally, the first calibration data set is generated based at least in part on a hardware calibration data set associated with an optical sensor in communication with the computer system.
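The initialization method above can be sketched as a loop that retries at progressively higher resolutions. This is a minimal illustration, not the patented implementation: the names `initialization_error`, `InitOutcome`, and `initialize_with_escalation` are hypothetical, the quality value is assumed to be a Euclidean distance between the vSLAM and IMU translation estimates, and "outside the threshold" is assumed to mean that this error exceeds `max_error`.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels
Vector3 = Tuple[float, float, float]


def initialization_error(vslam_translation: Vector3, imu_translation: Vector3) -> float:
    """Hypothetical quality value: distance between vSLAM and IMU motion estimates."""
    return math.dist(vslam_translation, imu_translation)


@dataclass(frozen=True)
class InitOutcome:
    resolution: Resolution
    error: float


def initialize_with_escalation(
    resolutions: Sequence[Resolution],
    run_initializer: Callable[[Resolution], float],
    max_error: float,
) -> Optional[InitOutcome]:
    """Try resolutions from lowest to highest pixel count until the error is acceptable.

    run_initializer stands in for the real pipeline: scale the image, scale the
    calibration data to match, (re)initialize the vSLAM unit, and return the error.
    """
    for resolution in sorted(resolutions, key=lambda r: r[0] * r[1]):
        error = run_initializer(resolution)
        if error <= max_error:  # quality value is within the threshold
            return InitOutcome(resolution, error)
    return None  # no candidate resolution produced an acceptable initialization
```

Starting from the lowest resolution keeps the common case cheap; the higher resolutions are only paid for when the quality check fails.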

Another general aspect includes a method for performing dynamic feature tracking. In this method, a computer system performs feature tracking using a vSLAM unit in communication with the computer system, at least in part by tracking one or more features in a first tracking image having a first tracking pixel resolution using a first tracking calibration data set. The method also includes determining a tracking performance criterion and determining that the tracking performance criterion is outside a predetermined tracking threshold. The computer system also generates a second tracking image at a second tracking pixel resolution lower than the first tracking pixel resolution, and generates a second tracking calibration data set based at least in part on the second tracking pixel resolution of the second tracking image. The method also includes performing feature tracking using the vSLAM unit, the second tracking image, and the second tracking calibration data set.

Implementations of the above method may include one or more of the following features. Optionally, the first tracking pixel resolution is the second pixel resolution, as determined by an initializer in communication with the vSLAM unit. Optionally, the first tracking pixel resolution is lower than the second pixel resolution, and the first tracking image is generated from the second image by scaling the second image down from the second pixel resolution to the first tracking pixel resolution. Optionally, the second tracking calibration data set is generated from the first tracking calibration data set based at least in part on the second pixel resolution. Optionally, the performance monitor determines the tracking performance criterion at least in part by measuring one or more of a feature detection speed, a CPU utilization value, or a power consumption value.
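As a rough sketch of the tracking-side decision, the check below lowers the scale factor whenever any measured criterion exceeds its budget. The class names, field names, units, and the `step` and `min_scale` defaults are assumptions made for illustration; the source only names the three kinds of measurement (feature detection speed, CPU utilization, power consumption).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingMetrics:
    detection_ms: float      # time spent detecting features in one frame
    cpu_utilization: float   # fraction of CPU in use, 0.0 to 1.0
    power_mw: float          # measured power draw in milliwatts


@dataclass(frozen=True)
class TrackingBudget:
    max_detection_ms: float
    max_cpu_utilization: float
    max_power_mw: float


def outside_budget(metrics: TrackingMetrics, budget: TrackingBudget) -> bool:
    """True if any single criterion exceeds its predetermined threshold."""
    return (
        metrics.detection_ms > budget.max_detection_ms
        or metrics.cpu_utilization > budget.max_cpu_utilization
        or metrics.power_mw > budget.max_power_mw
    )


def next_tracking_scale(
    current_scale: float,
    metrics: TrackingMetrics,
    budget: TrackingBudget,
    step: float = 0.75,      # assumed multiplicative step per adjustment
    min_scale: float = 0.25, # assumed floor so tracking keeps enough detail
) -> float:
    """Reduce the image scale when tracking is over budget; otherwise keep it."""
    if outside_budget(metrics, budget):
        return max(min_scale, current_scale * step)
    return current_scale
```

Whenever the scale changes, the tracking calibration data set would need to be regenerated for the new resolution before the next frame is processed.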

Another general aspect includes a system for implementing dynamic visual simultaneous localization and mapping (vSLAM) in a mobile device. In this system, a memory is configured to store computer-executable instructions. An optical sensor is configured to generate images at an initial pixel resolution. A motion sensor is configured to generate motion data. The system includes a scaler in communication with the optical sensor and a data scaling processor in communication with the memory. The system includes a performance monitor in communication with the scaler and the data scaling processor. The system also includes a vSLAM unit in communication with the performance monitor, the data scaling processor, the scaler, the optical sensor, and the motion sensor. The system also includes one or more processors in communication with the memory, the one or more processors being configured to execute the computer-executable instructions to implement one or more of the above methods. For example, the system can implement a method that performs dynamic initialization of the vSLAM unit when the vSLAM unit is not initialized and, once the vSLAM unit is initialized, performs dynamic feature tracking with the vSLAM unit on one or more features in the images generated by the optical sensor.

Another general aspect includes a method for dynamic initialization and feature tracking using a vSLAM unit. In this method, a computer system receives an original image at an initial pixel resolution and an initial calibration data set. The computer system reduces the original image to provide a first reduced image at a first reduced pixel resolution lower than the initial pixel resolution, and generates a first calibration data set from the initial calibration data set based at least in part on the first reduced pixel resolution. The computer system also initializes a visual simultaneous localization and mapping (vSLAM) system using the first reduced image and the first calibration data set. The computer system also generates a first initialization quality value and may determine that the first initialization quality value is outside a predetermined initialization threshold. If the first initialization quality value is outside the predetermined initialization threshold, the computer system generates a second reduced image at a second pixel resolution that is higher than the first reduced pixel resolution and lower than the initial pixel resolution, and generates a second calibration data set. The computer system then initializes the vSLAM system using the second reduced image and the second calibration data set. As with the previous initialization, the computer system determines a second initialization quality value and may determine that the second initialization quality value is within the predetermined initialization threshold. If the second initialization quality value is within the predetermined initialization threshold, the computer system receives a third image and tracks one or more features in the third image. The computer system determines a tracking performance criterion and may determine that the tracking performance criterion is outside a predetermined tracking threshold. If the tracking performance criterion is outside the predetermined tracking threshold, the computer system reduces the third image to provide a third reduced image at a third pixel resolution lower than the second pixel resolution, and tracks one or more features in the third reduced image.

Implementations of the above method may include one or more of the following features. Optionally, the third image is received at the second pixel resolution.

Numerous benefits are achieved by the present invention over conventional techniques. For example, embodiments of the present invention provide methods and systems that improve the speed, computational performance, and power consumption characteristics of vSLAM initialization routines and feature detection routines. These and other embodiments of the present invention, along with many of their advantages and features, are described in more detail in conjunction with the text below and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example of a computer system including an inertial measurement unit (IMU) and an RGB optical sensor for feature detection and tracking applications according to an embodiment of the present invention.

FIG. 2 is a simplified schematic diagram illustrating a system for initializing a vSLAM unit according to an embodiment of the present invention.

FIG. 3 is a simplified schematic diagram illustrating a system for performing feature tracking according to an embodiment of the present invention.

FIG. 4 is a simplified schematic diagram illustrating a system for performing dynamic initialization and feature tracking using a vSLAM unit according to an embodiment of the present invention.

FIG. 5 is a simplified flowchart illustrating a method of initializing a vSLAM unit according to an embodiment of the present invention.

FIG. 6 is a simplified flowchart illustrating a method of performing feature tracking according to an embodiment of the present invention.

FIG. 7 is a simplified flowchart illustrating a method of performing initialization and feature tracking according to an embodiment of the present invention.

FIG. 8 illustrates an example computer system according to an embodiment of the present invention.

DETAILED DESCRIPTION

In the following description, various embodiments of the present invention are described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding of the embodiments. However, it will be apparent to those skilled in the art that the embodiments may be practiced without the specific details. In addition, well-known features may be omitted or simplified in order not to obscure the described embodiments.

Embodiments of the present disclosure are directed to, among other things, techniques for initializing and operating a visual simultaneous localization and mapping (vSLAM) unit in a mobile device. The mobile device may include a scaler and a data scaling processor, which scale the images and calibration data received by the feature tracking unit and the initializer based at least in part on the operating conditions of the vSLAM unit.

In some embodiments, the vSLAM unit communicates with a scaler, a data scaling processor, and a performance monitor. Images of real-world objects captured by an RGB optical sensor (e.g., a camera) can be reduced by the scaler based at least in part on one or more instructions from the performance monitor. In addition, the data scaling processor can receive separate instructions from the performance monitor to update the calibration data associated with the RGB optical sensor for use by the vSLAM unit. In this way, the vSLAM unit can run its initialization routine and/or its feature detection and tracking routine on adjusted input data to optimize vSLAM performance.

In an illustrative example, a smartphone application (also referred to as a smartphone app) may include AR functionality for overlaying animated elements on objects in the real world. The animated elements may be, for example, logos, floral patterns, cartoon animals, and the like. The smartphone application may detect and track a specific object so that a specific animated element appears on the phone screen only when that object is in the camera's field of view. The smartphone application may rely on information about the surfaces of objects in the phone's surroundings to place the animated elements in the display at the appropriate size, perspective, and position so that they appear to interact with real-world objects. In some cases, this information includes images captured by the camera in the phone and information about the movement of the phone in the environment. In some cases, these two types of information are processed together in a vSLAM unit that is part of the phone, allowing the smartphone application to identify the boundaries and surfaces of objects in the world around the phone.

The vSLAM unit can complete an initialization routine before detecting and tracking object features. Initialization is important because it defines the two-dimensional coordinate system that the smartphone application uses to place animations and generates an initial map of the real objects in the phone's field of view. The initialization result is used by the feature detection and tracking routine, while the vSLAM unit continuously updates the map of real objects and updates the placement and rendering of virtual objects. In the example above, the smartphone application can initialize the vSLAM unit and continuously detect and track real objects in the phone's surroundings so that it can map the two-dimensional animated decorations onto the objects in real time.

In this example, the vSLAM unit may be accompanied by additional units that optimize the initialization, detection, and tracking operations. For example, a scaler can modify the resolution of the images that the vSLAM unit uses for initialization and/or for detection and tracking. The scaler can reduce the resolution of the images produced by the camera so that the vSLAM unit requires fewer system resources. The scaler can also reduce the resolution of the camera images used for the detection and tracking routine, which can work with reduced-resolution images. In another example, the scaler can be controlled by a performance monitor that detects whether the vSLAM unit has been initialized. If it has, the performance monitor can instruct the scaler to switch from the initialization scaling to a relatively lower detection-and-tracking scaling for the images received by the vSLAM unit. To optimize the smartphone's performance while the application is in use, the scaling used for detection and tracking can change as the phone moves or as the phone's surroundings change. In this example, the performance monitor can repeatedly check the detection and tracking quality in the vSLAM unit and adjust the scaling accordingly. The performance monitor can also adjust the data that the vSLAM unit uses to process the images produced by the camera. This data, which may be referred to as calibration data, describes the hardware that the smartphone uses to collect image and motion data. For example, the vSLAM unit can use the calibration data to account for the different positions of the smartphone components that generate the different types of data used by the vSLAM unit.
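The phase switch described above, from initialization scaling to a lower detection-and-tracking scaling, might look like the following sketch. `Phase` and `select_scale` are hypothetical names, and clamping the tracking scale so that it never exceeds the initialization scale is an assumption drawn from the phrase "relatively lower".

```python
from enum import Enum
from typing import Tuple


class Phase(Enum):
    INITIALIZING = "initializing"
    TRACKING = "tracking"


def select_scale(
    initialized: bool, init_scale: float, tracking_scale: float
) -> Tuple[Phase, float]:
    """Pick the scaler setting a performance monitor would request for the next frame."""
    if not initialized:
        # Initialization uses its own (typically larger) scale factor.
        return Phase.INITIALIZING, init_scale
    # Assumption: tracking never runs at a higher scale than initialization did.
    return Phase.TRACKING, min(tracking_scale, init_scale)
```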

In general, vSLAM allows AR systems, and other types of systems that use computer vision (CV) to detect features and objects in the real world, to detect and track objects as the system moves relative to them. Because computing resources on mobile devices are typically limited, the vSLAM process can be optimized for system constraints such as power consumption and CPU usage. In addition, when computing resources are sufficient, optimized initialization quality yields better motion tracking accuracy, while the system optimizes the performance of the vSLAM unit during detection and tracking. Embodiments of the present invention reduce the latency of defining surfaces during the vSLAM process, thereby improving system performance.

FIG. 1 illustrates an example of a computer system 110 including an inertial measurement unit (IMU) 112 and an RGB optical sensor 114 for feature detection and tracking applications according to an embodiment of the present invention. Feature detection and tracking may be implemented by a vSLAM unit 116 of the computer system 110. Typically, the RGB optical sensor 114 generates RGB images of a real-world environment that includes, for example, a real-world object 130. In some embodiments, the IMU 112 generates motion data related to the motion of the computer system 110 in a three-dimensional environment, where the data includes, for example, rotation and translation of the IMU 112 with respect to six degrees of freedom (e.g., translation along and rotation about three Cartesian axes). After initialization of an AR session (where the initialization may include calibration and tracking), the vSLAM unit 116 renders an optimized output pose 120 of the real-world environment in the AR session, where the optimized output pose 120 describes the pose of the RGB optical sensor 114 at least partially relative to a feature map 124 detected in the real-world object 130. The optimized output pose 120 describes the coordinate system and map used to place a two-dimensional AR object on a real-world object representation 122 of the real-world object 130.

In an example, the computer system 110 represents a suitable user device that includes, in addition to the IMU 112 and the RGB optical sensor 114, one or more graphics processing units (GPUs), one or more general purpose processors (GPPs), and one or more memories storing computer-readable instructions that are executable by at least one of the processors to perform the various functions of the embodiments of the present invention. For example, the computer system 110 can be any of a smartphone, a tablet computer, an AR headset, a wearable AR device, or the like.

The IMU 112 may have a known sampling rate (e.g., the temporal frequency at which data points are generated), and this value may be stored locally and/or be accessible to the vSLAM unit 116. The RGB optical sensor 114 may be a color camera. The RGB optical sensor 114 and the IMU 112 may have different sampling rates. Typically, the sampling rate of the RGB optical sensor 114 is lower than that of the IMU 112. For example, the sampling rate of the RGB optical sensor 114 may be 30 Hz, while the sampling rate of the IMU 112 may be 100 Hz.
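To make the rate mismatch concrete, the sketch below counts how many IMU samples fall within each camera frame interval. It assumes idealized clocks in which both streams start at t = 0 and tick perfectly regularly; the function name is hypothetical, and integer arithmetic is used to avoid floating-point boundary errors.

```python
from typing import List


def imu_samples_per_frame(imu_hz: int, camera_hz: int, num_frames: int) -> List[int]:
    """Count IMU samples in each interval [i / camera_hz, (i + 1) / camera_hz)."""

    def ceil_div(a: int, b: int) -> int:
        return -(-a // b)

    counts = []
    for i in range(num_frames):
        # Index of the first IMU sample at or after the start of frame i.
        first = ceil_div(i * imu_hz, camera_hz)
        # Index of the first IMU sample at or after the start of frame i + 1.
        next_first = ceil_div((i + 1) * imu_hz, camera_hz)
        counts.append(next_first - first)
    return counts
```

With the example rates from the text, `imu_samples_per_frame(100, 30, 3)` returns `[4, 3, 3]`: ten IMU samples are spread unevenly over every three camera frames, which is one reason a vSLAM unit needs to know both sampling rates.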

In addition, the IMU 112 and the RGB optical sensor 114, as installed in the computer system 110, may be separated by a transformation (e.g., a distance offset, a difference in field-of-view angle, etc.). The transformation may be known, and its values may be stored locally and/or be accessible to the vSLAM unit 116. During movement of the computer system 110, the RGB optical sensor 114 and the IMU 112 may experience different motion relative to the centroid, center of mass, or another point of rotation of the computer system 110. In some cases, the transformation may cause errors or mismatches in the vSLAM optimized output pose. To address this, the computer system may include calibration data. In some cases, the calibration data may be set based solely on the transformation. As described with reference to FIGS. 2 to 4, the calibration data may include data associated at least in part with the resolution of the RGB optical sensor 114, so that changes in the scaling of the images generated by the RGB optical sensor 114 can be compensated for by corresponding adjustments to the calibration data.
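One plausible reading of "corresponding adjustments to the calibration data" is rescaling camera intrinsics when the image is resized. The sketch below assumes a pinhole camera model with focal lengths and principal point expressed in pixels, and a convention in which pixel coordinates scale linearly with image size; the source does not specify the contents of the calibration data, so the `Calibration` fields and `scale_calibration` are illustrative only.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Calibration:
    width: int    # image width in pixels
    height: int   # image height in pixels
    fx: float     # focal length along x, in pixels
    fy: float     # focal length along y, in pixels
    cx: float     # principal point x, in pixels
    cy: float     # principal point y, in pixels
    # Physical IMU-to-camera offset in meters; independent of image resolution.
    imu_to_camera_offset_m: Tuple[float, float, float]


def scale_calibration(cal: Calibration, new_width: int, new_height: int) -> Calibration:
    """Return calibration data matching an image resized to new_width x new_height."""
    sx = new_width / cal.width
    sy = new_height / cal.height
    return replace(
        cal,
        width=new_width,
        height=new_height,
        fx=cal.fx * sx,
        fy=cal.fy * sy,
        cx=cal.cx * sx,
        cy=cal.cy * sy,
    )
```

Only the pixel-based quantities change. The physical offset between the IMU and the camera is a property of the hardware, so it is carried over unchanged.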

The vSLAM unit 116 may be implemented as dedicated hardware and/or as a combination of hardware and software (e.g., a general purpose processor and computer-readable instructions stored in memory and executable by the general purpose processor). As described with reference to FIGS. 2 to 4, in addition to initializing an AR session and performing feature detection and tracking as part of the vSLAM process, the computer system 110 may dynamically manage the computational demands and power consumption of the vSLAM unit 116 by implementing a performance monitor in communication with a scaler and a data scaling processor.

In the illustrative example of FIG. 1, a smartphone is used to display an AR session of a real-world environment. In particular, the AR session includes rendering an AR scene that includes a representation of a real-world table on which a vase 132 (or some other real-world object) is placed. A virtual object 126 is to be displayed in the AR scene, specifically on the table. As part of detecting how the smartphone is oriented relative to the table and the vase in the real-world environment, the smartphone can initialize the vSLAM unit using images from the RGB optical sensor 114 or another camera. The vSLAM unit defines a reference coordinate system relative to which it detects features in the table and the vase. After initialization, the vSLAM unit detects and tracks features as part of the overall AR system. While detecting and tracking features, the smartphone can monitor the power consumption, computational demands, and performance of the vSLAM unit, and can adjust the pixel resolution of the images and the calibration data used by the vSLAM unit to actively manage power consumption and computational demands.

FIG. 2 is a simplified schematic diagram showing a system 200 for initializing a vSLAM unit according to an embodiment of the present invention. As described with reference to FIG. 1, the vSLAM unit 116 may be implemented as a part of a computer system (e.g., the computer system 110 of FIG. 1) and communicates with a scaler 220, a performance monitor 270, and a data scaling processor 240. The vSLAM unit 116 may receive an image set 210 and motion data 280. For example, the vSLAM unit 116 may receive the motion data 280 from an IMU (e.g., the IMU 112 of FIG. 1) at a sampling rate associated with the IMU. The sampling rate may be any frequency greater than 0 Hz, including but not limited to 50 Hz, 60 Hz, 70 Hz, 80 Hz, 90 Hz, 100 Hz, etc. The image set 210 may be generated at an initial pixel resolution by an optical sensor (e.g., the RGB optical sensor 114), which includes but is not limited to a camera. The initial pixel resolution may be a characteristic of the optical sensor and may range from thousands to millions of pixels or more, including but not limited to 1 MP, 2 MP, 10 MP, 20 MP, etc. The optical sensor may generate the image set 210 at a refresh rate greater than 0 Hz (including but not limited to 10 Hz, 24 Hz, 30 Hz, 48 Hz, 60 Hz, etc.).

The vSLAM unit 116 may include an initializer unit 254 that performs operations consistent with initializing the vSLAM unit 116, including but not limited to generating a coordinate map and an initial output pose of the optical sensor, as described with reference to FIG. 1. The scaler 220 may downscale the images in the image set 210 from the initial pixel resolution to a first pixel resolution lower than the initial pixel resolution to facilitate the initialization of the vSLAM unit 116. The first pixel resolution may be a static value associated at least in part with the hardware configuration of the vSLAM unit 116, or may vary based at least in part on one or more characteristics of the motion data 280 and the image set 210. For example, an image may contain relatively few features and may allow initialization at a relatively low first pixel resolution. Although the scaler 220 is discussed in some embodiments with respect to downscaling images, this is not required by the present invention, and the scaler may pass images to the vSLAM unit 116 at their original resolution, without downscaling or upscaling.
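As a rough illustration of the downscaling step, the sketch below computes output dimensions for a scaler given a target pixel count. The function name `downscaled_size` and the idea of specifying the first pixel resolution as a pixel count are assumptions made for this example; the source does not say how the scaler 220 is parameterized.

```python
import math


def downscaled_size(width, height, target_pixels):
    """Return (new_width, new_height, scale) for a target pixel count.

    Hypothetical helper: the aspect ratio is preserved and the scale is
    capped at 1.0, so the image is either downscaled or passed through
    at its original resolution.
    """
    source_pixels = width * height
    scale = min(1.0, math.sqrt(target_pixels / source_pixels))
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height, scale


# Example: a 4000 x 3000 (12 MP) frame reduced to roughly 0.3 MP.
print(downscaled_size(4000, 3000, 300_000))
```

Because the scale applies to each axis, the pixel count changes with the square of the scale, which is why the square root appears in the calculation.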

A performance monitor 270 in communication with the vSLAM unit 116 may be used to monitor the performance of the vSLAM unit 116 and provide data to one or more system elements described herein. The performance monitor may generate a set of performance metrics, including initialization speed, feature tracking speed, CPU resource usage, initialization accuracy, etc. In some embodiments, the initialization accuracy is expressed as a measurement cost based at least in part on an error value calculated between the visual information generated during the initialization process and the IMU information, so that the initialization accuracy at least partially describes how well the visual data matches the motion data 280 received by the vSLAM unit 116. In response to these performance metrics, the image resolution may be increased or decreased as described herein. In addition, because a change in resolution calls for new calibration data associated with the rescaled image, the performance metrics may be used to issue a request for calibration data.

The performance monitor 270 may also determine an initialization quality value or metric to describe the initialization accuracy. The initialization quality value may be any indicator, score, value, or grade. For example, the initialization quality value may be expressed as a score out of 100, a value between 0 and 1, a value between 0 and 2, etc. Other examples are possible. In one example, the initialization quality value is expressed as a number centered on 1, where values that deviate from 1 by more than a given margin, in either direction, equally indicate reduced initialization quality. In this example, the threshold is defined relative to 1, for example as an initialization quality value outside the range of 0.95 to 1.05 or 0.9 to 1.1. Those of ordinary skill in the art will recognize many variations, modifications, and alternatives. As at least part of determining the initialization quality, the performance monitor 270 may compare the initialization quality value with a threshold. The threshold may be a static margin (e.g., 95% of the maximum initialization quality value), or may be determined dynamically based on one or more performance characteristics of the vSLAM unit 116.
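The two-sided threshold in the "centered on 1" example can be sketched as a simple range check. The function name and the default bounds are illustrative only; the bounds are taken from the 0.95 to 1.05 example range in the text, not from any prescribed implementation.

```python
def quality_outside_threshold(quality, low=0.95, high=1.05):
    """Return True when the quality value falls outside [low, high].

    Values above `high` and values below `low` are treated alike:
    both indicate reduced initialization quality.
    """
    return not (low <= quality <= high)


for value in (1.02, 0.93, 1.08):
    print(value, quality_outside_threshold(value))
```

A one-sided metric (for example a score out of 100) would need only a single bound, so the check would reduce to one comparison.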

In some vSLAM systems, the images provided to the vSLAM unit 116 are downscaled to a resolution lower than their original resolution. In some embodiments, images are provided to the vSLAM unit 116 at the original resolution, downscaled before being sent to the vSLAM unit, or upscaled before being sent to the vSLAM unit. In one example, the performance monitor 270 may determine that the initialization quality is low. In response, the scaler 220 may receive an instruction from the performance monitor to begin upscaling the image set 210 to a second pixel resolution higher than the first pixel resolution. In some cases, the first pixel resolution may be too low for the initializer 254 to successfully initialize the vSLAM unit 116, whereas the second pixel resolution may be sufficient. The performance monitor may continue to determine additional initialization quality values and iteratively adjust the scaler 220 until the initialization quality value satisfies the threshold. The performance monitor may also determine a target pixel resolution based at least in part on information produced by the vSLAM unit 116, including but not limited to motion data, detection quality, etc. As described with reference to FIG. 3, the system 200 may switch from the initialization process performed by the initializer 254 to feature detection and tracking in response to the initialization quality value satisfying the threshold.

The system 200 may include a data scaling processor 240 that, during initialization, receives input from the performance monitor 270 based at least in part on the initialization quality value. As described with reference to FIG. 1, the vSLAM unit 116 may be located at a different position or orientation relative to the centroid of the computer system than the optical sensor. A set of calibration data 230 may therefore be needed to allow the vSLAM unit 116 to compensate for the difference. In other embodiments, the calibration data includes intrinsic and extrinsic calibration data of the optical system. The data scaling processor may adjust this calibration data based on the image resolution currently used by the vSLAM unit 116 (i.e., the current scaling of the image data applied by the scaler 220), thereby providing updated calibration data that can be loaded into the vSLAM unit 116. In some cases, the calibration data 230 may be predetermined based on the hardware configuration of the computer system. The performance monitor 270 may coordinate the operation of the scaler 220 with that of the data scaling processor 240 so that the calibration data 230 is adjusted based at least in part on the upscaling or downscaling performed by the scaler 220. As an example, if the performance monitor detects that an initialization quality metric is outside a predetermined threshold, the data scaling processor and the scaler may be used to increase the resolution of the images provided to the vSLAM unit to improve the initialization process, for example without modifying the resolution used by other parts of the system (including feature detection, feature tracking, bundle adjustment, etc.).
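The source does not specify how calibration data is rescaled. One common approach, shown here as a minimal sketch, assumes a pinhole camera model with a uniform scale factor: the focal lengths and principal point (in pixels) scale with the image, while extrinsic calibration (for example the camera-to-IMU transform) is unaffected by resizing. The `Intrinsics` class and `scale_intrinsics` function are hypothetical names, and the sketch assumes pixel coordinates with the origin at the image corner, ignoring any half-pixel offset convention.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics in pixel units (hypothetical container)."""
    fx: float
    fy: float
    cx: float
    cy: float
    width: int
    height: int


def scale_intrinsics(k, scale):
    """Return intrinsics matching an image resized by `scale`.

    Assumption: uniform scaling of both axes. Extrinsic calibration
    is not touched because resizing does not move the camera.
    """
    return replace(
        k,
        fx=k.fx * scale,
        fy=k.fy * scale,
        cx=k.cx * scale,
        cy=k.cy * scale,
        width=round(k.width * scale),
        height=round(k.height * scale),
    )


native = Intrinsics(fx=3000.0, fy=3000.0, cx=2000.0, cy=1500.0,
                    width=4000, height=3000)
print(scale_intrinsics(native, 0.5))
```

Deriving each scaled set from the native calibration, rather than from a previously scaled set, avoids accumulating rounding error across repeated resolution changes.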

FIG. 3 is a simplified schematic diagram showing a system 300 for performing feature tracking according to an embodiment of the present invention. As described with reference to FIGS. 1 and 2, the system 300 may include the vSLAM unit 116, the scaler 220, the data scaling processor 240, and the performance monitor 270. To perform feature detection and tracking, the vSLAM unit 116 may receive the image set 210 after it has been downscaled by the scaler 220 to a pixel resolution lower than the initial or original pixel resolution produced by the optical sensor (e.g., the RGB optical sensor 114 of FIG. 1). The downscaled images received by the vSLAM unit 116 may be used by a feature tracking unit 356 for feature detection and tracking. The pixel resolution of the downscaled images may be lower than the pixel resolution used during the initialization process (e.g., by the initializer 254 of FIG. 2). As described with reference to FIG. 1, the feature tracking unit 356 may detect features in the images with reference to the coordinate system and map generated during initialization of the vSLAM unit 116.

The performance monitor 270 may determine a performance criterion associated at least in part with one or more parameters of the operation of the feature tracking unit 356. For example, the performance monitor 270 may determine a processing speed of the feature tracking unit 356, a processor (e.g., CPU) utilization associated with the vSLAM unit 116, a power (e.g., battery) consumption rate of the vSLAM unit, or an initialization accuracy.

The processing rate (also referred to as speed) of the feature tracking unit 356 may be expressed as the time taken per feature tracking cycle or, alternatively, as the number of cycles per unit time. The processing rate may be compared with a threshold and, in response to the processing rate not satisfying the threshold, the performance monitor 270 may adjust the operation of the scaler unit 220 to downscale the image set 210 and reduce the pixel resolution of the images received by the vSLAM unit 116. As discussed with respect to FIG. 2, in coordination with adjusting the scaler unit 220, the performance monitor 270 may adjust the operation of the data scaling processor 240 to make a corresponding adjustment to the calibration data 230. The performance monitor 270 may adjust the scaler unit 220 and the data scaling processor 240 in response to any parameter failing to satisfy its threshold. The performance monitor 270 may also determine a single performance criterion value and make adjustments based on a comparison of that value with a threshold. In some cases, such as after an initialization failure and before the vSLAM unit 116 completes reinitialization, the performance monitor 270 may adjust the scaler unit 220 and the data scaling processor 240 to reduce the amount of downscaling applied to the images in the image set 210 as a way of improving feature detection.
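A minimal sketch of the tracking-mode adjustment follows, assuming the processing rate is measured as seconds per tracking cycle and that the scaler is driven by a single scale factor. The function name, the multiplicative `step`, and the `min_scale` floor are all assumptions for illustration; the source does not state how much the resolution is reduced per adjustment.

```python
def next_scale(current_scale, cycle_time_s, max_cycle_time_s,
               step=0.8, min_scale=0.1):
    """Return the scale factor to use for the next tracking cycle.

    If the last cycle took longer than the allowed time, the scale is
    reduced by the factor `step`, but never below `min_scale`.
    Otherwise the current scale is kept.
    """
    if cycle_time_s > max_cycle_time_s:
        return max(min_scale, current_scale * step)
    return current_scale


# Example: a 50 ms cycle against a 33 ms budget triggers a reduction.
print(next_scale(0.5, cycle_time_s=0.050, max_cycle_time_s=0.033))
```

The same check works with cycles per second by comparing `1 / cycle_time_s` against a minimum rate. Whatever scale is returned would also be passed to the calibration scaling step so that images and calibration data stay consistent.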

During operation, if the performance monitor 270 determines that the feature tracking performance is outside the desired performance region, the images provided to the vSLAM unit may be downscaled to improve the feature tracking performance, as described herein. In some embodiments, once the performance is within the desired performance region, the images provided to the vSLAM unit may be upscaled to a resolution closer to or equal to the native resolution of the camera system. Those of ordinary skill in the art will recognize many variations, modifications, and alternatives.

The vSLAM unit 116 may include a bundle adjustment (BA) unit 358. The BA unit 358 may receive the motion data 280 from an IMU (e.g., the IMU 112 of FIG. 1) and the coordinates of the features detected and tracked by the feature tracking unit 356. The BA unit 358 may optimize the output pose 360 generated by the vSLAM unit 116 by minimizing a cost function that quantifies the error of fitting a model to parameters including, but not limited to, the camera pose and the coordinates, in the coordinate map, of features (e.g., the features 124 of FIG. 1) detected in the three-dimensional environment and provided to the BA unit 358 by the feature tracking unit 356.
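The source does not give the form of the cost function. As an illustration only, a typical visual-inertial bundle adjustment objective combines reprojection error with an inertial residual. All symbols here are assumptions introduced for the example: $T_i$ is the camera pose at frame $i$, $X_j$ the 3D position of feature $j$, $x_{ij}$ its observed image location, $\pi$ the projection function using intrinsics $K$, $\mathcal{V}(i)$ the set of features visible in frame $i$, and $r_{\mathrm{IMU}}$ a residual between consecutive poses and the integrated IMU motion.

```latex
E\bigl(\{T_i\},\{X_j\}\bigr)
  = \sum_{i} \sum_{j \in \mathcal{V}(i)}
      \bigl\| x_{ij} - \pi(K, T_i, X_j) \bigr\|^{2}
  + \sum_{i} \bigl\| r_{\mathrm{IMU}}(T_i, T_{i+1}) \bigr\|^{2}
```

Under this assumed form, $K$ must correspond to the resolution at which the observations $x_{ij}$ were measured, which is one reason calibration data and image scale need to be kept consistent.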

In the system 300, the bundle adjustment operations in the BA unit 358 may be unaffected by the operation of the scaler 220. For example, adjustments made to the pixel resolution of the images in the image set 210 received by the vSLAM unit 116 after downscaling by the scaler 220 may be applied differently to the operation of the BA unit 358. In this way, the performance of the vSLAM unit 116 with respect to the feature tracking unit 356 may be managed by the performance monitor 270 without affecting the output pose 360 generated by the vSLAM unit 116.

FIG. 4 is a simplified schematic diagram showing a system 400 for performing dynamic initialization and feature tracking using a vSLAM unit according to an embodiment of the present invention. The vSLAM unit 116 may operate continuously over a period during which the computer system containing it (e.g., the computer system 110 of FIG. 1) moves from one environment to another environment with different conditions. For example, a mobile phone running an AR application may be carried from an indoor environment to an outdoor environment while the application is running. In an embodiment of the system 400, the vSLAM unit 116 may include logic, implemented as software or hardware, for determining an initialization state 452. The initialization state 452 may fail when the initial output pose, the initially detected features, or the coordinate system no longer describe the environment of the computer system with sufficient accuracy, or when an error function describing that accuracy does not pass a threshold test. The vSLAM unit 116 may then switch from a feature tracking mode 474 to an initialization mode 472, whereby the performance monitor 270 adjusts the scaler 220 to scale the images in the image set 210 received by the vSLAM unit 116 to the pixel resolution used by the initializer unit 254 rather than the pixel resolution used by the feature tracking unit 356. Similarly, in the initialization mode 472 the performance monitor 270 adjusts the operation of the data scaling processor 240 to provide updated and/or adjusted calibration data 230 to the vSLAM unit 116 for use by the initializer 254. The initialization mode 472 may include the dynamic adjustments described with reference to FIG. 2.

After the vSLAM unit 116 is initialized or reinitialized, and when the initialization state 452 indicates that the vSLAM system is initialized, the system 400 may switch from the initialization mode 472 to the feature tracking mode 474, applying the updated coordinate map and the initial output pose generated by the initializer 254. Operation of the system 400 in the feature tracking mode 474 may proceed as described with reference to FIG. 3, with the operation of the scaler 220 and the data scaling processor 240 adjusted dynamically, either continuously or periodically, to optimize the performance of the vSLAM unit 116. As described with reference to FIG. 3, the operation of the system 400 may be confined to a single mode at a time. For example, when the system 400 is in the initialization mode 472, the performance monitor may adjust the scaler 220 and the data scaling processor only for images in the image set 210 that are used for initialization, not for feature tracking. In this way, when the system 400 switches from the initialization mode 472 to the feature tracking mode 474, the scaler 220 and the data scaling processor may apply the scaling adjustments intended for feature tracking, rather than starting feature tracking with the initialization parameters. The system 400 may store the parameters of the scaler 220 and the data scaling processor 240 and/or may apply default parameters each time the system 400 switches modes. The system 400 may also apply other methods to optimize mode switching.
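The mode isolation described above can be sketched as a small controller that keeps a separate scale factor per mode and either restores the stored value or falls back to a default when the mode changes. The class and method names are hypothetical, and representing the scaler parameters as a single scale factor is a simplification made for this example.

```python
from enum import Enum


class Mode(Enum):
    INITIALIZATION = "initialization"
    FEATURE_TRACKING = "feature_tracking"


class ModeController:
    """Keeps one scaler setting per mode (hypothetical sketch)."""

    def __init__(self, default_scales):
        self._defaults = dict(default_scales)
        self._stored = dict(default_scales)
        self.mode = Mode.INITIALIZATION

    @property
    def scale(self):
        """Scale factor for the mode that is currently active."""
        return self._stored[self.mode]

    def adjust_scale(self, new_scale):
        """Adjust the scale for the active mode only."""
        self._stored[self.mode] = new_scale

    def update(self, initialized, restore_stored=True):
        """Switch mode based on the initialization state.

        On a mode change, either keep the value stored for the new
        mode or reset it to that mode's default.
        """
        target = Mode.FEATURE_TRACKING if initialized else Mode.INITIALIZATION
        if target is not self.mode:
            self.mode = target
            if not restore_stored:
                self._stored[target] = self._defaults[target]
        return self.mode


controller = ModeController({Mode.INITIALIZATION: 0.5,
                             Mode.FEATURE_TRACKING: 0.25})
controller.adjust_scale(0.75)            # affects initialization only
controller.update(initialized=True)      # switch to feature tracking
print(controller.mode, controller.scale)
```

Keeping the two settings separate means an upscaled initialization attempt does not force feature tracking to start at the same, more expensive resolution.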

FIGS. 5 to 7 are simplified flowcharts showing methods of performing dynamic operation of a vSLAM unit according to at least one aspect of the present disclosure. The flows are described in connection with a computer system that is an example of the computer systems described above. Some or all operations of the flows may be implemented via specific hardware on the computer system and/or as computer-readable instructions stored on a non-transitory computer-readable medium of the computer system. As stored, the computer-readable instructions represent programmable modules that include code executable by a processor of the computer system. Execution of such instructions configures the computer system to perform the corresponding operations. Each programmable module, in combination with the processor, represents a means for performing the corresponding operation. Although the operations are illustrated in a particular order, it should be understood that no particular order is required and that one or more operations may be omitted, skipped, and/or reordered. Methods described in separate flows may be combined into a single method.

FIG. 5 is a simplified flowchart showing a method of initializing a vSLAM unit according to an embodiment of the present invention. The method includes initializing a visual simultaneous localization and mapping (vSLAM) unit in communication with a computer system using a first image and a first calibration data set, wherein the first image has a first pixel resolution (502). As described with reference to FIG. 2, the vSLAM unit may be initialized before feature detection and tracking to determine parameters for mapping features in a three-dimensional environment to a two-dimensional coordinate system. Optionally, the method may include receiving an original image at an initial pixel resolution, receiving an initial calibration data set, generating the first image from the original image, and generating the first calibration data set from the initial calibration data set based at least in part on the first pixel resolution. As described with reference to FIG. 1, the computer system may include an RGB optical sensor, which includes but is not limited to a camera and which may generate the original image at the initial or original pixel resolution. When the system is operated as described with reference to FIGS. 2 to 4, images may be adjusted dynamically during initialization and/or feature detection and tracking operations to optimize the performance of the vSLAM unit (e.g., the vSLAM unit 116 of FIGS. 1 to 4). Optionally, generating the first image may include receiving, at a scaler unit in communication with a performance monitor and the camera, a downscaling factor from the performance monitor, and downscaling the original image from the initial pixel resolution to the first pixel resolution based at least in part on the downscaling factor, wherein the first pixel resolution is lower than the initial pixel resolution.

Optionally, the computer system may receive the original image from a camera in communication with the computer system. As described with reference to FIG. 2, the vSLAM unit may use a reduced pixel resolution for initialization and/or feature tracking operations in order to save system resources. For example, the computer system may be a mobile device, including but not limited to a smartphone or a tablet computer. A smartphone may, for example, capture images at an original resolution of up to 20 MP, and a pixel resolution of 20 MP may cause the vSLAM unit to operate outside its expected performance targets. As described with reference to FIGS. 2 to 4, the computer system may implement a scaler to generate downscaled or upscaled images so that the vSLAM unit can operate efficiently. Optionally, the vSLAM unit may receive the first calibration data set from a data scaling processor in communication with the performance monitor. Optionally, the first calibration data set may be generated based at least in part on a hardware calibration data set associated with the camera in communication with the computer system.

The method also includes determining an initialization quality value (504). In some cases, the computer system may determine the initialization quality value as a way of improving initialization accuracy and vSLAM operation. Optionally, the computer system may determine the initialization quality value at least in part by measuring the initialization accuracy using a performance monitor in communication with the computer system. As described in more detail with reference to FIGS. 2 to 4, the initialization quality value may be determined from a cost function based at least in part on a measurement of the error between the motion data and the image data.

The method also includes determining that the initialization quality value is outside a predetermined threshold (506). As described with reference to FIG. 2, the computer system may compare the initialization quality value with the threshold. For example, an initialization quality value outside the threshold may indicate that the initialization quality is unsatisfactory. The system may repeat the initialization, adjusting the image resolution upward, until the accuracy is within the predetermined threshold.

The method also includes generating a second image at a second pixel resolution higher than the first pixel resolution (508). Optionally, the second image may be generated from the first image. As described in more detail with reference to FIGS. 2 to 4, the vSLAM unit may receive an image from the image set for use in initialization. In some cases, the image may be downscaled for a first initialization process that produces unsatisfactory results, and then scaled to the higher second pixel resolution for a second initialization process.

The method also includes generating a second calibration data set based at least in part on the second pixel resolution associated with the second image (510). Optionally, the data scaling processor may generate the second calibration data set based at least in part on one or more instructions from the performance monitor. As described in more detail with reference to FIGS. 2 to 4, the performance monitor may adjust the calibration data based at least in part on the pixel resolution of the image generated by the scaler.

The method also includes reinitializing the vSLAM unit using the second image and the second calibration data set (512). In the initialization mode, as described with reference to FIG. 4, the vSLAM unit of the computer system may operate continuously, determining whether the system is initialized and continuing the initialization operation until the initialization quality meets a minimum accuracy level. For example, if an initialization operation using one image is unsatisfactory, the vSLAM system may repeat the operation with a new image.
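The flow of steps 502 to 512 can be summarized as a retry loop over increasing resolutions. In this sketch, `try_initialize` is a hypothetical callable standing in for the whole attempt at a given scale (generating the image, generating the matching calibration data, running the initializer, and returning a quality value); the candidate scales and the 0.95 to 1.05 bounds are illustrative assumptions.

```python
def initialize_with_dynamic_resolution(try_initialize, scales,
                                       low=0.95, high=1.05):
    """Try each scale in ascending order until quality is acceptable.

    Returns (scale, quality) for the first attempt whose quality lies
    within [low, high], or None if every candidate scale fails.
    """
    for scale in sorted(scales):
        # Steps 502/512 (initialize or reinitialize) and 504 (quality).
        quality = try_initialize(scale)
        # Step 506: compare the quality value against the threshold.
        if low <= quality <= high:
            return scale, quality
        # Otherwise the next iteration uses a higher resolution, which
        # corresponds to steps 508 (image) and 510 (calibration data).
    return None


# Stand-in for a real initializer: quality improves with resolution.
fake_quality = {0.25: 1.30, 0.5: 1.10, 1.0: 1.01}
result = initialize_with_dynamic_resolution(fake_quality.get,
                                            [1.0, 0.25, 0.5])
print(result)
```

Starting from the lowest candidate scale keeps the cost of initialization low in the common case and only spends more computation when the cheaper attempts fail.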

It should be understood that the specific steps shown in FIG. 5 provide a particular method of initializing a vSLAM unit according to an embodiment of the present invention. Other sequences of steps may also be performed according to alternative embodiments. For example, alternative embodiments of the present invention may perform the steps outlined above in a different order. Moreover, the individual steps shown in FIG. 5 may include multiple sub-steps that may be performed in various sequences as appropriate to the individual step. Furthermore, additional steps may be added or removed depending on the particular application. Those of ordinary skill in the art will recognize many variations, modifications, and alternatives.

FIG. 6 is a simplified flowchart showing a method of performing feature tracking according to an embodiment of the present invention. The method includes performing feature tracking using a visual simultaneous localization and mapping (vSLAM) unit in communication with a computer system, at least in part by detecting one or more features in a first image based at least in part on a first calibration data set, wherein the first image has a first pixel resolution (602). As described with reference to FIG. 3, the computer system may detect features and track them over time and motion, and may use the tracked features, at least in part, to optimize the output pose of the RGB optical sensor, enabling the computer system to serve as an AR system. Optionally, as described in more detail with reference to FIG. 5, the first pixel resolution may be determined by an initializer in communication with the vSLAM unit. Optionally, the method includes receiving an original image at an initial pixel resolution, receiving an initial calibration data set, generating the first image from the original image, and generating the first calibration data set from the initial calibration data set based at least in part on the first pixel resolution. Optionally, as described in more detail with reference to FIG. 2, a scaler unit may receive the original image at the initial pixel resolution from a camera in communication with the computer system. Optionally, the first calibration data set may be generated by a data scaling processor in communication with the computer system, wherein the first calibration data set is associated with one or more characteristics of the camera.

The method also includes determining a tracking performance criterion (604). As described with reference to FIG. 3, the computer system can operate continuously, repeatedly performing iterations of feature tracking while dynamically modifying the pixel resolution and calibration data used by the vSLAM unit to optimize the performance determined by the performance monitor. To this end, the computer system can determine a performance characteristic, as described with reference to FIG. 3, in order to determine how to dynamically optimize the performance of the vSLAM unit (e.g., the vSLAM unit 116 of FIGS. 1 to 4). Optionally, the performance monitor determines the performance criterion at least in part by measuring one or more of a feature detection speed, a CPU utilization value, or a power consumption value.

The method also includes determining that the tracking performance criterion is outside a predetermined threshold (606). As described in more detail with reference to FIGS. 3 and 4, the tracking performance criterion can be monitored for each feature tracking process (602) to measure whether the vSLAM unit is operating within predetermined parameters. For example, the tracking performance criterion can indicate that feature tracking is too slow relative to a predetermined threshold.
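As an illustration of steps 604 and 606, the following minimal Python sketch compares one sample of the quantities named above against a set of limits. The class names, field names, units, and numeric limits are all hypothetical; the description does not specify how the criterion is encoded or which threshold values are used, and feature detection speed is modeled here, by assumption, as time per frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingMetrics:
    """One sample of the quantities a performance monitor might measure."""
    feature_detection_ms: float  # time spent detecting features in one frame
    cpu_utilization: float       # fraction of CPU in use, 0.0 to 1.0
    power_mw: float              # estimated power draw in milliwatts


@dataclass(frozen=True)
class TrackingLimits:
    """Hypothetical upper bounds; the source gives no numeric thresholds."""
    max_feature_detection_ms: float = 33.0
    max_cpu_utilization: float = 0.80
    max_power_mw: float = 1500.0


def outside_threshold(metrics: TrackingMetrics, limits: TrackingLimits) -> bool:
    """Return True when any measured value exceeds its limit (step 606)."""
    return (
        metrics.feature_detection_ms > limits.max_feature_detection_ms
        or metrics.cpu_utilization > limits.max_cpu_utilization
        or metrics.power_mw > limits.max_power_mw
    )
```

A result of True would correspond to the case in which the method proceeds to generate a lower-resolution image (608).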

The method also includes generating a second image at a second pixel resolution lower than the first pixel resolution (608). As described with reference to FIG. 3, in response to the performance criterion failing to meet the threshold, the second image can be downscaled to a different degree than the first image. Optionally, the second image is generated from the first image. In this way, the second image can be provided by downscaling the first image from the first pixel resolution to the second pixel resolution, as opposed to downscaling the corresponding raw image from the initial pixel resolution to the second pixel resolution.
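The two routes to the second image (from the first image, or directly from the raw image) can be sketched as follows. This is a minimal pure-Python example that assumes a grayscale image stored as rows of floats and an integer box-averaging filter; the description does not state which resampling filter the scaler uses, and the function name is hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def downscale(image: Image, factor: int) -> Image:
    """Downscale by an integer factor, averaging each factor-by-factor block."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    out: Image = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [
                image[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Raw 8x8 image -> first image (4x4) -> second image (2x2).
raw = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
first = downscale(raw, 2)
second_from_first = downscale(first, 2)
second_from_raw = downscale(raw, 4)
```

With box averaging and these integer factors, both routes give the same 2x2 image, but the route through the first image touches a quarter as many pixels. Other filters need not agree exactly between the two routes.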

The method also includes generating a second calibration data set based at least in part on the second pixel resolution of the second image (610). Optionally, the data scaling processor generates the second calibration data set based at least in part on one or more instructions from the performance monitor. As described in more detail with reference to FIGS. 3 to 5, the vSLAM unit can receive calibration data from a calibration data processor, which in turn can receive instructions from the performance monitor. As part of the dynamic operation of the vSLAM unit, the calibration data can be adjusted based at least in part on the pixel resolution used by the vSLAM unit for feature tracking. Optionally, the second calibration data set is generated from the first calibration data set.
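One plausible reading of step 610 is sketched below in Python, under the assumption that the calibration data set holds pinhole camera intrinsics (focal lengths and principal point, in pixels). The description does not enumerate the contents of the calibration data set, so the `Calibration` type and `scale_calibration` function are hypothetical, and lens distortion and sub-pixel center conventions are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    """Assumed calibration contents: image size and pinhole intrinsics in pixels."""
    width: int
    height: int
    fx: float  # focal length along x
    fy: float  # focal length along y
    cx: float  # principal point x
    cy: float  # principal point y


def scale_calibration(calib: Calibration, new_width: int, new_height: int) -> Calibration:
    """Rescale pixel-unit intrinsics to match a resized image."""
    sx = new_width / calib.width
    sy = new_height / calib.height
    return Calibration(
        width=new_width,
        height=new_height,
        fx=calib.fx * sx,
        fy=calib.fy * sy,
        cx=calib.cx * sx,
        cy=calib.cy * sy,
    )


# Initial -> first -> second, mirroring the optional "second from first" path.
initial = Calibration(1920, 1080, 1400.0, 1400.0, 960.0, 540.0)
first = scale_calibration(initial, 960, 540)
second = scale_calibration(first, 480, 270)
```

Under this model, deriving the second calibration data set from the first gives the same values as deriving it from the initial calibration data set, because the scale factors compose multiplicatively.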

The method also includes performing feature tracking using the vSLAM unit, the second image, and the second calibration data set (612). As described in more detail with reference to FIGS. 3 and 4, the vSLAM unit can continuously perform feature tracking on a set of images received from a sensor (e.g., a camera or other sensor) while the pixel resolution used by the vSLAM system is dynamically adjusted.
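Steps 602 to 612 taken together can be viewed as a loop that steps down a list of candidate resolutions whenever tracking exceeds its budget. The Python sketch below is illustrative only: the resolution list, the `detect_ms` callable standing in for the vSLAM unit and performance monitor, and the time budget are assumptions rather than anything specified in the description.

```python
from typing import Callable, List, Sequence, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def track_with_dynamic_resolution(
    ladder: Sequence[Resolution],
    frame_count: int,
    detect_ms: Callable[[int, Resolution], float],
    budget_ms: float,
) -> List[Resolution]:
    """Track frames, stepping to a lower resolution when tracking is too slow.

    `ladder` is ordered from highest to lowest resolution. `detect_ms` is a
    hypothetical stand-in that tracks one frame and reports the elapsed time.
    Returns the resolution used for each frame.
    """
    index = 0
    used: List[Resolution] = []
    for frame in range(frame_count):
        resolution = ladder[index]
        used.append(resolution)
        elapsed = detect_ms(frame, resolution)  # track (602/612), then measure (604)
        if elapsed > budget_ms and index + 1 < len(ladder):  # criterion outside threshold (606)
            index += 1  # later frames use a lower resolution (608, 610)
    return used


# Toy cost model: tracking time proportional to pixel count.
ladder = [(1280, 720), (960, 540), (640, 360)]
used = track_with_dynamic_resolution(
    ladder, 3, lambda frame, res: res[0] * res[1] / 20000.0, budget_ms=30.0
)
```

In this toy run the first frame exceeds the budget at 1280x720, so the remaining frames are tracked at 960x540, which fits the budget.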

It should be understood that the specific steps shown in FIG. 6 provide a particular method of tracking one or more features in an image according to an embodiment of the present invention. Other sequences of steps may also be performed according to alternative embodiments. For example, alternative embodiments of the present invention may perform the steps outlined above in a different order. Moreover, the individual steps shown in FIG. 6 may include multiple sub-steps that may be performed in various sequences as appropriate to the individual step. Furthermore, additional steps may be added or removed depending on the particular application. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.

FIG. 7 is a simplified flowchart illustrating a method of performing initialization and feature tracking according to an embodiment of the present invention. The method includes receiving a raw image at an initial pixel resolution (702).

The method also includes receiving an initial calibration data set (704).

The method also includes downscaling the raw image to provide a first downscaled image at a first downscaled pixel resolution that is lower than the initial pixel resolution (706).

The method also includes generating a first calibration data set from the initial calibration data set based at least in part on the first downscaled pixel resolution associated with the first downscaled image (708).

The method also includes initializing the vSLAM system using the first downscaled pixel resolution and the first calibration data set (710).

The method also includes generating a first initialization quality value (712).

The method also includes determining that the first initialization quality value is outside a predetermined threshold (714).

The method also includes generating a second image at a second pixel resolution that is higher than the first downscaled pixel resolution and lower than the initial pixel resolution (716).

The method also includes generating a second calibration data set based at least in part on the second pixel resolution of the second image (718).

The method also includes reinitializing the vSLAM unit using the second downscaled image and the second calibration data set (720).

The method also includes determining a second initialization quality value (722).

The method also includes determining that the second initialization quality value is within the predetermined threshold (724).

The method also includes receiving a third image (726). Optionally, the third image is received at the second pixel resolution. As described in more detail with reference to FIG. 6, the pixel resolution used for initialization can serve as the first pixel resolution for feature tracking operations, and a performance monitor in communication with the scaler and the data scaling processor can, based on that pixel resolution, dynamically adjust the pixel resolution of the images in the image set to optimize the performance of the vSLAM unit.

The method also includes tracking one or more features in the third image (728).

The method also includes determining a tracking performance criterion (732).

The method also includes determining that the tracking performance criterion is outside a predetermined threshold (734).

The method also includes downscaling the third image to provide a third downscaled image at a third pixel resolution that is lower than the second pixel resolution (736).

The method also includes tracking one or more features in the third downscaled image (738).
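The initialization portion of this flow (steps 706 to 724) amounts to trying progressively higher resolutions until the initialization quality value is acceptable. A minimal Python sketch follows. It assumes that "within the predetermined threshold" means the quality value is at or above a lower bound, and it uses a hypothetical `init_quality` callable in place of the vSLAM unit; neither detail is specified in the description.

```python
from typing import Callable, Optional, Sequence, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def initialize_with_escalation(
    ladder: Sequence[Resolution],
    init_quality: Callable[[Resolution], float],
    min_quality: float,
) -> Optional[Tuple[Resolution, float]]:
    """Try resolutions from lowest to highest until initialization is good enough.

    `ladder` is ordered from lowest to highest resolution, all below the raw
    image's initial resolution. `init_quality` is a hypothetical stand-in that
    (re)initializes at the given resolution and returns a quality value.
    Returns the accepted (resolution, quality), or None if none is acceptable.
    """
    for resolution in ladder:
        quality = init_quality(resolution)  # initialize (710/720), score (712/722)
        if quality >= min_quality:          # within threshold (724)
            return resolution, quality
        # Otherwise outside threshold (714): move to the next higher resolution (716, 718).
    return None


# Toy quality model: quality grows with image width.
ladder = [(320, 180), (640, 360), (1280, 720)]
result = initialize_with_escalation(ladder, lambda res: res[0] / 1280.0, min_quality=0.4)
```

In this toy run, initialization at 320x180 is rejected and initialization at 640x360 is accepted; that accepted resolution would then be the starting resolution for the tracking steps (726 to 738).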

It should be understood that the specific steps shown in FIG. 7 provide a particular method of initializing a vSLAM unit and tracking one or more features in an image according to an embodiment of the present invention. Other sequences of steps may also be performed according to alternative embodiments. For example, alternative embodiments of the present invention may perform the steps outlined above in a different order. Moreover, the individual steps shown in FIG. 7 may include multiple sub-steps that may be performed in various sequences as appropriate to the individual step. Furthermore, additional steps may be added or removed depending on the particular application. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.

FIG. 8 illustrates an example computer system according to an embodiment of the present invention. Computer system 800 is an example of the computer system described above. Although its components are shown as belonging to the same computer system 800, computer system 800 may also be distributed.

Computer system 800 includes at least a processor 802, a memory 804, a storage device 806, input/output (I/O) peripherals 808, communication peripherals 810, and an interface bus 812. The interface bus 812 can be used to communicate, send, and transfer data, controls, and commands among the various components of computer system 800. The memory 804 and the storage device 806 include computer-readable storage media, such as RAM, ROM, electrically erasable programmable read-only memory (EEPROM), hard drives, CD-ROMs, optical storage devices, magnetic storage devices, electronic non-volatile computer storage (for example, memory), and other tangible storage media. Any such computer-readable storage medium can be used to store instructions or program code that implement aspects of the present disclosure. The memory 804 and the storage device 806 also include computer-readable signal media. A computer-readable signal medium includes a propagated data signal with computer-readable program code embodied therein. Such a propagated signal can take any of a variety of forms, including but not limited to electromagnetic, optical, or any combination thereof. A computer-readable signal medium includes any computer-readable medium that is not a computer-readable storage medium and that can communicate, propagate, or transport a program for use in connection with computer system 800.

In addition, the memory 804 can include an operating system, programs, and applications. The processor 802 is configured to execute the stored instructions and includes, for example, a logic processing unit, a microprocessor, a digital signal processor, and other processors. The memory 804 and/or the processor 802 can be virtualized and can be hosted within another computer system of, for example, a cloud network or a data center. The I/O peripherals 808 include user interfaces, such as a keyboard, a screen (e.g., a touch screen), a microphone, a speaker, and other input/output devices, and computing components, such as graphics processing units, serial ports, parallel ports, universal serial buses, and other input/output peripherals. The I/O peripherals 808 are connected to the processor 802 through any of the ports coupled to the interface bus 812. The communication peripherals 810 are configured to facilitate communication between computer system 800 and other computing devices over a communication network, and include, for example, a network interface controller, a modem, wireless and wired interface cards, an antenna, and other communication peripherals.

Although the present subject matter has been described in detail with respect to specific embodiments thereof, it should be understood that those skilled in the art, upon attaining an understanding of the foregoing, may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, it should be understood that the present disclosure is presented for purposes of example rather than limitation, and does not preclude the inclusion of such modifications, variations, and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art. Indeed, the methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions, and changes in the form of the methods and systems described herein may be made without departing from the spirit of the present disclosure. The appended claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the present disclosure.

Unless specifically stated otherwise, it should be appreciated that discussions throughout this specification that use terms such as "processing," "computing," "determining," and "identifying" refer to actions or processes of a computing device (e.g., one or more computers or similar electronic computing devices) that manipulates or transforms data represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the computing platform.

The system or systems discussed herein are not limited to any particular hardware architecture or configuration. A computing device can include any suitable arrangement of components that provides a result conditioned on one or more inputs. Suitable computing devices include multipurpose microprocessor-based computer systems that access stored software that programs or configures the computer system from a general-purpose computing apparatus into a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language, or combination of languages, may be used to implement the teachings contained herein in software used to program or configure a computing device.

Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied; for example, blocks can be reordered, combined, and/or broken into sub-blocks. Certain blocks or processes can be performed in parallel.

Conditional language used herein, such as "can," "may," "for example," and the like, unless specifically stated otherwise or otherwise understood within the context as used, is generally intended to convey that certain examples include, while other examples do not include, certain features, elements, and/or steps. Thus, such conditional language is not generally intended to imply that features, elements, and/or steps are in any way required for one or more examples, or that one or more examples necessarily include logic for deciding, with or without author input or prompting, whether these features, elements, and/or steps are included or are to be performed in any particular example.

The terms "comprising," "including," "having," and the like are synonymous and are used inclusively, in an open-ended fashion, and do not exclude additional elements, features, acts, operations, and so forth. Also, the term "or" is used in its inclusive sense (and not in its exclusive sense), so that when used, for example, to connect a list of elements, the term "or" means one, some, or all of the elements in the list. The use of "adapted to" or "configured to" herein is meant as open and inclusive language that does not foreclose devices adapted to or configured to perform additional tasks or steps. Additionally, the use of "based on" is meant to be open and inclusive, in that a process, step, calculation, or other action "based on" one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Similarly, the use of "based at least in part on" is meant to be open and inclusive, in that a process, step, calculation, or other action "based at least in part on" one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.

The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and sub-combinations are intended to fall within the scope of the present disclosure. In addition, certain method or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The example blocks or states may be performed in serial, in parallel, or in some other manner. Blocks or states may be added to or removed from the disclosed examples. Similarly, the example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed examples.