Technical Field
The present application relates to the field of image processing, and in particular to a technique for presenting AR tag information.
Background Art
In the prior art, a large-space positioning system is responsible for generating sparse point cloud maps and texture maps. A sparse point cloud map describes the point features or line features in a scene; the point and line features stored in it can be used for positioning and target recognition. A texture map describes the dense texture structure of the scene and is a digital three-dimensional model of the scene; based on the texture map, target editing and AR tag editing of the scene can be carried out.
Summary of the Invention
One object of the present application is to provide a method and a device for presenting AR tag information.
According to one aspect of the present application, a method for presenting AR tag information is provided, wherein the method includes:
establishing or updating a spatial positioning database, wherein the spatial positioning database includes scene records of one or more scenes, each scene record includes a sparse point cloud map and a texture map of the corresponding scene as well as one or more pieces of AR tag information of that scene, and the AR tag information includes tag content information;
acquiring a scene positioning image of the current scene, and determining camera pose information corresponding to the scene positioning image;
acquiring, based on the scene positioning image, tag position information of one or more pieces of scene AR tag information, wherein the one or more pieces of scene AR tag information are contained in the AR tag information of the spatial positioning database;
superimposing, according to the camera pose information and the tag position information of the one or more pieces of scene AR tag information, the tag content information of the one or more pieces of scene AR tag information on the scene positioning image.
According to another aspect of the present application, a device for presenting AR tag information is provided, wherein the device includes:
a first module, configured to establish or update a spatial positioning database, wherein the spatial positioning database includes scene records of one or more scenes, each scene record includes a sparse point cloud map and a texture map of the corresponding scene as well as one or more pieces of AR tag information of that scene, and the AR tag information includes tag content information;
a second module, configured to acquire a scene positioning image of the current scene and determine camera pose information corresponding to the scene positioning image;
a third module, configured to acquire, based on the scene positioning image, tag position information of one or more pieces of scene AR tag information, wherein the one or more pieces of scene AR tag information are contained in the AR tag information of the spatial positioning database;
a fourth module, configured to superimpose, according to the camera pose information and the tag position information of the one or more pieces of scene AR tag information, the tag content information of the one or more pieces of scene AR tag information on the scene positioning image.
According to one aspect of the present application, a computer device is provided, wherein the device includes:
a processor; and
a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the steps of any of the methods described above.
According to one aspect of the present application, a computer-readable storage medium is provided, on which a computer program/instructions are stored, characterized in that the computer program/instructions, when executed, cause a system to perform the steps of any of the methods described above.
According to one aspect of the present application, a computer program product is provided, including a computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of any of the methods described above.
Compared with the prior art, the present application uses a spatial positioning database to associate the map data used in large-space positioning with the data of the target detection module. On the one hand, this enables unified management of AR tag information in augmented reality scenes; on the other hand, it enables the target detection module to exploit the three-dimensional maps of the large space, so that target training samples can be labelled efficiently and accurately, and the corresponding AR tag information can be superimposed quickly and precisely on the current scene.
Brief Description of the Drawings
Other features, objects and advantages of the present application will become more apparent upon reading the following detailed description of non-limiting embodiments made with reference to the accompanying drawings:
FIG. 1 shows a flow chart of a method for presenting AR tag information according to an embodiment of the present application;
FIG. 2 shows a structural diagram of a computer device according to another embodiment of the present application;
FIG. 3 shows an exemplary system that may be used to implement the various embodiments described herein.
The same or similar reference numerals in the drawings denote the same or similar components.
Detailed Description
The present application is described in further detail below with reference to the accompanying drawings.
In a typical configuration of the present application, the terminal, the devices of the service network and the trusted party each include one or more processors (e.g., central processing units (CPUs)), input/output interfaces, network interfaces and memory.
The memory may include non-permanent storage in a computer-readable medium, random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory. Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PCM), programmable random access memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The devices referred to in the present application include, but are not limited to, user equipment, network devices, or devices formed by integrating user equipment and network devices through a network. The user equipment includes, but is not limited to, any mobile electronic product capable of human-machine interaction with a user (for example, via a touchpad), such as a smartphone, a tablet computer or smart glasses, and the mobile electronic product may run any operating system, such as Android or iOS. The network device includes an electronic device capable of automatically performing numerical computation and information processing according to preset or stored instructions, whose hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices and the like. The network device includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers; here, the cloud is composed of a large number of computers or network servers based on cloud computing, where cloud computing is a form of distributed computing, a virtual supercomputer composed of a group of loosely coupled computers. The network includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks, VPN networks, wireless ad hoc networks and the like. Preferably, the device may also be a program running on the user equipment, the network device, or a device formed by integrating the user equipment with the network device, or the network device with a touch terminal, through a network.
Of course, those skilled in the art should understand that the above devices are merely examples; other existing or future devices, if applicable to the present application, should also fall within the scope of protection of the present application and are incorporated herein by reference.
In the description of the present application, "a plurality of" means two or more, unless otherwise clearly and specifically defined.
FIG. 1 shows a method for presenting AR tag information according to one aspect of the present application, wherein the method is applied to a computer device and includes step S101, step S102, step S103 and step S104. In step S101, a spatial positioning database is established or updated, wherein the spatial positioning database includes scene records of one or more scenes, each scene record includes a sparse point cloud map and a texture map of the corresponding scene as well as one or more pieces of AR tag information of that scene, and the AR tag information includes tag content information. In step S102, a scene positioning image of the current scene is acquired, and camera pose information corresponding to the scene positioning image is determined. In step S103, tag position information of one or more pieces of scene AR tag information is acquired based on the scene positioning image, wherein the one or more pieces of scene AR tag information are contained in the AR tag information of the spatial positioning database. In step S104, according to the camera pose information and the tag position information of the one or more pieces of scene AR tag information, the tag content information of the one or more pieces of scene AR tag information is superimposed on the scene positioning image. Here, the computer device includes, but is not limited to, user equipment, a network device, or a combination of user equipment and a network device; the user equipment includes, but is not limited to, any electronic product capable of human-machine interaction with a user, such as a smartphone, a tablet computer, smart glasses, a drone or a surveillance camera; the network device includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers, for example, a ground control center server or a service platform; the computer device may use different devices in different steps, which is not limited here. The following embodiments are described by taking smart glasses as an example, and those skilled in the art should understand that these embodiments are equally applicable to other computer devices.

Usually, large-space positioning and target detection are executed independently. The main defect of such independent implementations is that there is no association between the data of the large-space positioning system and that of the target detection system. As a result, the large-space positioning system cannot manage dynamic targets in the scene, while the target detection system can neither make effective use of the texture map from large-space positioning nor exploit the sparse map from large-space positioning for more efficient target detection. The present solution integrates the map data of large-space positioning with the target detection module so that the two can share data. On the one hand, this achieves unified management of static and dynamic AR (augmented reality) tags in augmented reality scenes; on the other hand, it allows the target detection module to exploit the three-dimensional maps of the large space: for example, the target detection module can use the three-dimensional texture map to label target training samples efficiently, and use the three-dimensional sparse map to detect and recognize targets quickly.
In step S101, a spatial positioning database is established or updated, wherein the spatial positioning database includes scene records of one or more scenes, each scene record includes a sparse point cloud map and a texture map of the corresponding scene as well as one or more pieces of AR tag information of that scene, and the AR tag information includes tag content information. For example, the computer device achieves data interconnection between the map data of large-space positioning and the target detection module by establishing or updating the corresponding spatial positioning database. Here, the spatial positioning database may be established on the computer device itself, or established on another device and accessed by the computer device, and may be updated based on map data uploaded by the computer device. The spatial positioning database is used to store and update scene records corresponding to one or more mapped scenes, each scene record containing the sparse point cloud map and the texture map of the mapped scene. The sparse point cloud map describes the point features or line features in the scene; the point and line features stored in it can be used for positioning and target detection. The texture map describes the dense texture structure of the scene and is a digital three-dimensional model of the corresponding scene; it can be used for target editing and AR tag editing of the scene. In some cases, the scene record of each mapped scene further includes one or more pieces of AR tag information contained in the mapped scene. The AR tag information includes annotation information indicating a certain point, line, surface, object or region in the corresponding scene, and each piece of AR tag information includes the tag content information of the tag, for example, annotation content added to a point, line, surface, object or region, such as pictures, videos, 3D models, PDF files, office documents, form information, audio, hyperlinks, application invocation information (instructions for operating an application, such as opening the application or invoking a specific function of the application, e.g., making a call), real-time sensing information (for connecting to a sensor and acquiring sensing data of a target object), or doodles. The spatial positioning database may be created by the computer device by importing the map data of an initial mapped scene, or it may be updated with the relevant data of other scenes as mapped-scene data after those scenes are subsequently processed.
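By way of illustration only, a scene record as described above might be organized as in the following minimal Python sketch; this is not the application's actual schema, and all field and type names are assumptions.

```python
# Hedged sketch of a scene record in the spatial positioning database.
from dataclasses import dataclass, field
from typing import Optional
import numpy as np

@dataclass
class ARTag:
    content: dict                            # tag content: text, image URI, 3D model, hyperlink, etc.
    position: Optional[np.ndarray] = None    # static tags: position in map coordinates
    target_region_id: Optional[str] = None   # dynamic tags: ID of the annotated target region

@dataclass
class SceneRecord:
    scene_id: str
    sparse_point_cloud: np.ndarray           # N x 3 point/line features for positioning and detection
    texture_map_path: str                    # dense textured 3D model, open to user editing
    ar_tags: list = field(default_factory=list)

# The database itself can be as simple as a keyed collection of scene records.
spatial_db: dict = {}
```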
In some embodiments, the method further includes step S105 (not shown). In step S105, a scene image of the corresponding mapped scene is acquired, and the sparse point cloud map and the texture map of the mapped scene corresponding to the scene image are obtained; a scene record of the mapped scene is established based on the sparse point cloud map and the texture map of the mapped scene. For example, the scene image of the mapped scene includes multiple images of the mapped scene, such as an image sequence or a video of the mapped scene; as another example, the scene image may further include corresponding depth information in addition to the images themselves (e.g., images and depth information acquired by a depth camera). The scene image of the mapped scene may be captured by a corresponding camera device (for example, a local or external camera), or may be an image sequence/video of the mapped scene obtained from another device over a communication connection. The computer device may process the scene image, for example by three-dimensional reconstruction, to obtain the corresponding sparse point cloud map and texture map: the sparse map is constructed first, and the texture map afterwards. The sparse map may be constructed with existing incremental Structure from Motion (SfM) techniques, and the texture map with conventional multi-view stereo (MVS) techniques; the construction of the sparse map and the texture map described here is only an example and is not limiting. After the computer device obtains the sparse point cloud map and the texture map of the mapped scene, it may, on the one hand, establish a scene record of the mapped scene based on them, so that the spatial positioning database can subsequently be established or updated based on the scene record; on the other hand, it may manage (e.g., add, modify or delete) the scene AR tag information of the mapped scene based on the sparse point cloud map and/or the texture map, and then establish the scene record based on the sparse point cloud map, the texture map and the scene AR tag information, for example, generating the corresponding mapping AR tag information based on the current user's editing operations on the sparse point cloud map and/or the texture map.

In some embodiments, the method further includes step S106 (not shown). In step S106, based on the sparse point cloud map and/or the texture map of the mapped scene, mapping AR tag information in the mapped scene is obtained, wherein the mapping AR tag information includes tag content information. In step S105, establishing the scene record of the mapped scene based on the sparse point cloud map and the texture map then includes: establishing the scene record based on the sparse point cloud map, the texture map and the mapping AR tag information of the mapped scene. In step S101, the spatial positioning database is established or updated based on the scene record of the mapped scene, wherein the spatial positioning database includes scene records of one or more scenes, each scene record includes a sparse point cloud map and a texture map of the corresponding scene as well as one or more pieces of AR tag information of that scene, the AR tag information includes tag content information, and the mapped scene is contained in the one or more scenes. In some embodiments, after the sparse point cloud map and the texture map of the mapped scene are obtained, a user (for example, the user associated with the mapped scene) may perform editing operations on them to generate mapping AR tag information about the mapped scene, so that a scene record of the mapped scene is established based on the sparse point cloud map, the texture map and the mapping AR tag information, and further, the spatial positioning database is established or updated based on that scene record.

In other embodiments, the method further includes step S107 (not shown). In step S107, based on the sparse point cloud map and/or the texture map of the mapped scene, mapping AR tag information in the mapped scene is obtained, wherein the mapping AR tag information includes tag content information; the scene record corresponding to the mapped scene in the spatial positioning database is updated according to the mapping AR tag information. In some cases, the computer device may first upload the sparse point cloud map and the texture map of the mapped scene to the spatial positioning database, and generate mapping AR tag information about the mapped scene based on subsequent editing operations on them by users (for example, the user associated with the mapped scene or other users accessing the spatial positioning database). In other words, the mapping AR tag information of the mapped scene may include information obtained from editing operations performed by the corresponding user on the sparse point cloud map and/or the texture map before they are uploaded to the spatial positioning database, and may also include information obtained after uploading, when a user retrieves the sparse point cloud map and/or the texture map and performs editing operations. The computer device may establish the scene record of the mapped scene based on the sparse point cloud map and the texture map of the mapped scene; in some cases, the establishment of the scene record may also incorporate the mapping AR tag information obtained before uploading to the spatial positioning database, and in some cases, after the scene record has been uploaded, it may be updated with mapping AR tag information obtained from subsequent user retrievals and editing operations on the scene data. Accordingly, the spatial positioning database may be established or updated upon the creation of the scene record of the mapped scene, and updated as scene records are subsequently updated. Here, the texture map may be opened to users for visual editing, on which users can edit AR tags and target regions; the sparse point cloud map can be used for positioning and target detection.
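The passage above names incremental SfM for the sparse map and MVS for the texture map. The following is a minimal two-view sketch of the core geometry inside an incremental SfM pipeline (feature matching, relative pose, triangulation), using OpenCV; a production mapping system involves far more machinery (loop closure, bundle adjustment, MVS densification), so this only illustrates the principle. The intrinsic matrix `K` is assumed known.

```python
# Minimal two-view sparse reconstruction sketch (not a full SfM pipeline).
import cv2
import numpy as np

def two_view_reconstruct(img1, img2, K):
    # Detect and match ORB features between the two views.
    orb = cv2.ORB_create(4000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Essential matrix and relative camera pose (up to scale).
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, inliers = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

    # Triangulate inlier matches into sparse 3D points.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    good = inliers.ravel() > 0
    pts4d = cv2.triangulatePoints(P1, P2, pts1[good].T, pts2[good].T)
    return (pts4d[:3] / pts4d[3]).T   # N x 3 sparse point cloud
```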
In some embodiments, the mapping AR tag information includes, but is not limited to: static AR tag information, wherein the static AR tag information further includes corresponding tag position information; and dynamic AR tag information, wherein the dynamic AR tag information further includes target region identification information of the corresponding dynamic target. For example, when AR tag information is edited on the sparse point cloud map and/or the texture map, multiple tag types may be provided according to the type of the edited object. The tag type may be determined by the user's selection of a tag type, or by target detection determining the type of the edited object and classifying the AR tag information accordingly. For example, if target detection determines that the object being annotated is static, i.e., its spatial position and physical structure never change (e.g., a wall, the ground, or a street lamp on a road), the corresponding AR tag information is determined to be static AR tag information. As another example, if target detection determines that the object being annotated is a dynamic rigid object whose spatial position may change (e.g., a chair, a pedestrian on the road, or a vehicle), the corresponding AR tag information is determined to be dynamic AR tag information. For static AR tag information, the tag position information is typically stored or updated along with the AR tag information itself; the tag position information indicates the position of the point, line, surface, region or object being annotated, such as world coordinates for an absolute position, image coordinates for a relative position, or an annotated position on the sparse point cloud map and/or the texture map. In some cases, the static AR tag information stores a text description, an AR model, and the 6-DoF pose plus 1-D scale of the AR model in the map. For dynamic AR tag information, the target region identification information of the annotated dynamic target is typically stored or updated along with the AR tag information; the target region identification information is a unique identifier of the dynamic target annotated by the dynamic AR tag information, such as a name, a serial number or a recognition feature. In some cases, the dynamic AR tag information stores a target ID, a text description and an AR model.
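As a hedged illustration of the "6-DoF pose plus 1-D scale" stored for a static tag, the sketch below composes the two into a 4×4 model-to-map transform for placing the AR model; representing the 6-DoF pose as a rotation vector plus translation is an assumption, not something the application specifies.

```python
# Sketch: 6-DoF pose (rotation vector + translation) + 1-D scale -> 4x4 transform.
import cv2
import numpy as np

def tag_model_matrix(rvec, tvec, scale):
    R, _ = cv2.Rodrigues(np.asarray(rvec, dtype=np.float64))  # rotation vector -> matrix
    T = np.eye(4)
    T[:3, :3] = R * scale       # uniform scale folded into the rotation block
    T[:3, 3] = np.ravel(tvec)   # translation in map coordinates
    return T
```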
In some embodiments, the method further includes step S108 (not shown). In step S108, mask optimization is performed on the sparse point cloud map of the mapped scene to obtain an optimized sparse point cloud map; and in step S105, the scene record of the mapped scene is established based on the optimized sparse point cloud map and the texture map of the mapped scene. For example, to save computing resources and improve computational efficiency, the computer device may apply mask optimization to the sparse point cloud map. Masking is a technique for selecting, filtering or hiding specific regions of an image; the optimized sparse point cloud map after mask optimization is a matrix with the same dimensions as the original sparse point cloud map and contains optimized regions, where the image pixels of the corresponding regions are modified or occluded so that they are skipped in subsequent tag placement or feature computation. After the mask processing, the scene record of the mapped scene is established based on the optimized sparse point cloud map and the texture map.
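A minimal sketch of the masking idea, assuming the mask is a binary matrix of the same dimensions as the map/image whose nonzero entries mark regions to be skipped; names are illustrative.

```python
# Sketch: drop point features that fall inside the masked (excluded) region.
import numpy as np

def apply_region_mask(features_xy, mask):
    """features_xy: N x 2 integer pixel coordinates of point features.
    mask:        H x W array, nonzero where computation should be skipped."""
    xs, ys = features_xy[:, 0], features_xy[:, 1]
    keep = mask[ys, xs] == 0          # keep only features outside the mask
    return features_xy[keep]
```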
In step S102, a scene positioning image of the current scene is acquired, and camera pose information corresponding to the scene positioning image is determined. For example, the computer device may capture the scene positioning image of the current scene with a camera device, or receive it from another device over a communication connection. Here, the scene positioning image may be one or more images of the current scene; the following embodiments are described with a single image as an example, and if the scene positioning image comprises multiple images, each image is processed similarly, so these embodiments apply equally to the multi-image case. The computer device may locate the current scene based on the scene positioning image to determine the corresponding camera pose information, for example by extracting and matching features of the scene positioning image; or by obtaining a scene feature map of the scene positioning image and computing the camera pose information from it via feature point matching and PnP (Perspective-n-Point) computation. The camera pose information includes the camera position information and camera orientation information of the camera device at the time the scene positioning image was captured.
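A minimal sketch of the feature-matching-plus-PnP localization just described, using OpenCV's RANSAC PnP solver; the 2D-3D correspondences are assumed to come from matching image features against the sparse point cloud map, and the reprojection threshold is illustrative.

```python
# Sketch: recover the camera pose from matched 2D-3D correspondences.
import cv2
import numpy as np

def localize(points_3d, points_2d, K, dist_coeffs=None):
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, dtype=np.float64),   # matched map points (N x 3)
        np.asarray(points_2d, dtype=np.float64),   # corresponding image points (N x 2)
        K, dist_coeffs,
        reprojectionError=3.0)
    if not ok:
        raise RuntimeError("relocalization failed")
    return rvec, tvec   # camera pose as rotation vector + translation
```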
In step S103, tag position information of one or more pieces of scene AR tag information is acquired based on the scene positioning image, wherein the one or more pieces of scene AR tag information are contained in the AR tag information of the spatial positioning database. For example, after the computer device obtains the camera pose information of the scene positioning image, it may superimpose some or all of the AR tag information in the spatial positioning database on the scene positioning image based on that camera pose information. For example, the computer device may directly retrieve all static AR tag information in the spatial positioning database; since static AR tag information contains tag position information, all of it can be superimposed based on its tag position information and the camera pose information. Alternatively, the computer device may first perform scene matching to determine the scene AR tag information stored in the spatial positioning database that is contained in the current scene positioning image, for example by determining the static AR tag information within the current scene, and the target region identification information and target region positions in the scene positioning image recognized by target detection; the corresponding dynamic AR tag information is obtained from the target region identification information, the position of the target region is taken as the tag position information of the dynamic AR tag information, and the scene AR tag information is then superimposed based on the camera pose information and the tag position information. Here, the scene AR tag information determined in the scene positioning image is contained in the AR tag information stored in the spatial positioning database; in other words, the one or more pieces of scene AR tag information comprise some or all of the AR tag information in the spatial positioning database.
In step S104, according to the camera pose information and the tag position information of the one or more pieces of scene AR tag information, the tag content information of the one or more pieces of scene AR tag information is superimposed on the scene positioning image. For example, after the computer device obtains the camera pose information of the scene positioning image of the current scene and the one or more pieces of scene AR tag information, it can superimpose the scene AR tag information. For static AR tag information, the computer device computes the image position of the static AR tag information in the scene positioning image based on the camera pose information and the tag position information, and superimposes the corresponding AR tag information at that image position. For dynamic AR tag information, the target region identification information and the target region position in the scene positioning image are recognized by target detection; the corresponding dynamic AR tag information is obtained from the target region identification information, the position of the target region is taken as the tag position information of the dynamic AR tag information, and its image position is computed based on the camera pose information and the tag position information, thereby achieving the superimposed presentation of the AR tag information.
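The projection-and-overlay step for static tags can be sketched as follows: each tag's map-space position is projected into the scene positioning image with the recovered camera pose, and the tag content is drawn at the resulting pixel. A minimal sketch; rendering text with `cv2.putText` stands in for whatever tag content (model, image, video) is actually presented.

```python
# Sketch: project tag positions into the image and draw their labels.
import cv2
import numpy as np

def overlay_tags(image, tags, rvec, tvec, K, dist_coeffs=None):
    h, w = image.shape[:2]
    for label, pos_3d in tags:   # tags: (label_text, 3D map position) pairs, illustrative
        pts, _ = cv2.projectPoints(np.float32([pos_3d]), rvec, tvec, K, dist_coeffs)
        u, v = pts.ravel().astype(int)
        if 0 <= u < w and 0 <= v < h:   # only draw tags that fall inside the view
            cv2.circle(image, (u, v), 6, (0, 255, 0), -1)
            cv2.putText(image, label, (u + 8, v), cv2.FONT_HERSHEY_SIMPLEX,
                        0.6, (0, 255, 0), 2)
    return image
```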
In some embodiments, the method further includes step S109 (not shown). In step S109, a user's editing operation in the texture map of the scene is acquired, and one or more pieces of AR tag information of the scene are determined based on the editing operation, the AR tag information including tag content information. For example, after obtaining the texture map of the mapped scene, the computer device may perform AR tag editing operations based on the texture map; the editing operation may take place before the texture map is stored in the spatial positioning database, or after it is stored, when a user retrieves it and performs the corresponding AR tag editing. The editing operation includes, but is not limited to, selecting a point, line, surface, object or three-dimensional region in the texture map and adding, modifying or deleting marks/annotations; the computer device may generate the AR tag information corresponding to the editing operation based on the selected point, line, surface, object or three-dimensional region and the added, modified or deleted annotation content.
In some embodiments, the AR tag information includes static AR tag information, and the method further includes step S110 (not shown). In step S110, the tag position information of the static AR tag information is determined and stored according to the editing position information of the editing operation in the texture map of the scene; and in step S103, according to the scene positioning image, one or more corresponding pieces of scene AR tag information are determined among the one or more pieces of static AR tag information in the spatial positioning database, and the tag position information of the one or more pieces of scene AR tag information is queried. For example, after the computer device acquires the user's editing operation in the three-dimensional texture map, if the tag information corresponding to the editing operation is static AR tag information, the computer device may determine the tag position information of the static AR tag information directly from the position, in the three-dimensional texture map, of the point, line, surface or three-dimensional region targeted by the editing operation; for instance, the position in the three-dimensional texture map is taken directly as the position information of the static AR tag, or the corresponding spatial position is further computed from the position in the three-dimensional texture map and taken as the position information of the static AR tag. The editing position information indicates the position, in the three-dimensional texture map, of the point, line, surface or three-dimensional region targeted by the editing operation; it may consist of the positions of all points of that point, line, surface or three-dimensional region, or of some of its points (for example, endpoints and/or the center point). After the computer device acquires the scene positioning image, it may directly retrieve one or more pieces of static AR tag information from the spatial positioning database, for example, determining all static AR tag information in the spatial positioning database as scene AR tag information, or determining, through position matching, the static AR tag information contained in the current scene positioning image as the corresponding scene AR tag information.
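The position-matching variant mentioned above can be sketched as a visibility test: project every static tag with the recovered camera pose and keep only those that land in front of the camera and inside the image bounds. A hedged sketch with illustrative names; the tag objects are assumed to carry a `.position` attribute in map coordinates.

```python
# Sketch: select the static tags visible in the current view.
import cv2
import numpy as np

def visible_static_tags(tags, rvec, tvec, K, image_size):
    w, h = image_size
    R, _ = cv2.Rodrigues(rvec)
    selected = []
    for tag in tags:
        p_cam = R @ tag.position + np.ravel(tvec)   # map -> camera coordinates
        if p_cam[2] <= 0:                           # behind the camera
            continue
        uvw = K @ p_cam
        u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
        if 0 <= u < w and 0 <= v < h:
            selected.append((tag, (u, v)))
    return selected
```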
In some embodiments, the AR tag information includes dynamic AR tag information, and the method further includes step S111 (not shown). In step S111, according to the region edited by the editing operation in the texture map of the scene, the target region identification information of the target region is determined. Step S103 includes sub-step S1031 (not shown) and sub-step S1032 (not shown). In step S1031, target detection is performed on the scene positioning image to determine at least one piece of target region identification information in the scene positioning image, as well as the position information of the at least one piece of target region identification information. In step S1032, at least one corresponding piece of dynamic AR tag information is determined by querying with the at least one piece of target region identification information; the at least one piece of dynamic AR tag information is determined as the corresponding scene AR tag information, and the position information of the at least one piece of target region identification information is determined as the tag position information of the at least one piece of scene AR tag information, wherein the at least one piece of scene AR tag information is contained in the AR tag information of the spatial positioning database. For example, the computer device presents a visual texture map of the scene; the user may select the region of a specific dynamic object in the texture map and annotate the region with region identification information (thus determining the target region and its target region identification information), and may further edit AR tag information for the target region to generate the corresponding dynamic AR tag information, where the dynamic AR tag information contains the target region identification information indicating the target region of the dynamic object, for example by taking the name of the dynamic object directly as the target region identification information, or by assigning a serial number as the target region identification information of the target region. After determining the target region identification information of the target region, the computer device stores it together with the corresponding dynamic AR tag information in the spatial positioning database, for subsequent matching of dynamic AR tags once target detection on a scene positioning image has determined target region identification information. After the computer device acquires the scene positioning image, it may perform target detection on it to recognize at least one piece of target region identification information contained in the scene positioning image, and at the same time detect the position information of the recognized target region identification information, such as image position information/spatial position information in the scene positioning image. The computer device may query the spatial positioning database with the at least one piece of target region identification information to determine the matching dynamic AR tag information, determine the at least one piece of dynamic AR tag information as the corresponding scene AR tag information, and determine the position information of the at least one piece of target region identification information as the tag position information of the at least one piece of scene AR tag information. Specifically, each piece of dynamic AR tag information and its corresponding target region identification information are stored in the spatial positioning database; if the target region identification information determined by target detection matches target region identification information stored in the spatial positioning database, the dynamic AR tag information corresponding to the stored target region identification information is determined as the matching dynamic AR tag information and taken as the scene AR tag information of the scene positioning image.
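The query-and-match step of sub-step S1032 reduces to an ID lookup: detected target-region IDs are resolved against the stored dynamic tags, and the detected region position becomes the tag position. A minimal sketch; all names are illustrative.

```python
# Sketch: match detected target-region IDs to stored dynamic AR tags.
def match_dynamic_tags(detections, dynamic_tag_index):
    """detections: list of (region_id, position) pairs from target detection.
    dynamic_tag_index: dict mapping region_id -> dynamic AR tag record."""
    scene_tags = []
    for region_id, position in detections:
        tag = dynamic_tag_index.get(region_id)
        if tag is not None:
            scene_tags.append((tag, position))   # detected position doubles as tag position
    return scene_tags
```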
In some embodiments, the method further includes step S112 (not shown). In step S112, the target-region sparse point cloud feature information of the editing operation in the sparse point cloud map of the scene is determined according to the target region, and a corresponding target detection database is established or updated according to the target-region sparse point cloud feature information; in step S1031, the scene sparse point cloud feature information corresponding to the scene positioning image is extracted and matched against the target-region sparse point cloud feature information in the target detection database, to determine at least one piece of target region identification information contained in the scene positioning image and its position information. For example, to speed up detection, target detection of the target region may rely on a dedicated target detection database; the target detection database may be a separate database independent of the spatial positioning database, or a part/subset of the spatial positioning database. For example, a scene has a corresponding sparse point cloud map and texture map; when the user edits a target region in the texture map, the correspondence between the sparse point cloud map and the texture map makes it possible to determine, from the target region, the corresponding target sparse region in the sparse point cloud map, and the point features, line features and the like contained in that target sparse region are determined as the target-region sparse point cloud feature information. That is, after completing the annotation of the target region in the scene (such as determining the target region and its target region identification information) and the editing of the AR tag information on the texture map, the computer device automatically obtains the target-region sparse point cloud feature information of that target region on the sparse point cloud map. The computer device may store the target-region sparse point cloud feature information of multiple editing operations in the target detection database for feature matching of subsequent scene positioning images, wherein each piece of target-region sparse point cloud feature information has a mapping relationship with the corresponding target region identification information. For example, after the computer device acquires the scene positioning image, it extracts the scene sparse point cloud feature information of the scene positioning image and performs similarity matching between that information and the target-region sparse point cloud feature information stored in the target detection database, for example via nearest-neighbor or deep-learning methods, so as to determine at least one piece of target region identification information contained in the scene positioning image, determine it as the target region identification information contained in the scene positioning image, and determine the recognized position of that target region identification information in the scene positioning image as the position information of the corresponding target region identification information.
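A hedged sketch of the feature-matching path: descriptors extracted from the scene positioning image are matched against the per-region descriptors stored in the target detection database, with a ratio test and a minimum-match threshold deciding whether a region is present. ORB descriptors and all thresholds are illustrative assumptions; the application does not name a specific feature type.

```python
# Sketch: detect annotated target regions by descriptor matching.
import cv2
import numpy as np

def detect_by_features(image, target_db, min_matches=25, ratio=0.75):
    orb = cv2.ORB_create(2000)
    kp, des = orb.detectAndCompute(image, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    hits = []
    for region_id, region_des in target_db.items():   # stored per-region descriptors
        pairs = matcher.knnMatch(des, region_des, k=2)
        good = [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        if len(good) >= min_matches:
            pts = np.float32([kp[m.queryIdx].pt for m in good])
            hits.append((region_id, pts.mean(axis=0)))  # region ID + rough image position
    return hits
```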
In some embodiments, the method further includes step S113 (not shown). In step S113, the region sample information of the editing operation in the texture map of the scene is determined according to the target area, and a corresponding target detection network model is trained according to the region sample information; in step S1031, the scene positioning image is input into the target detection network model to determine at least one target area identification information contained in the scene positioning image, together with the position information of the at least one target area identification information. For example, target detection by feature matching is fast, but its robustness is poor, and because the posture or background environment of a dynamic target changes easily, the corresponding recognition rate is low. Here, the computer device can instead obtain the region sample information of the target area and train a corresponding target detection network model, thereby achieving accurate target detection. For example, after the target area is annotated in the three-dimensional texture map, the three-dimensional target area can be mapped onto images quickly and in batches through an image projection process (such as perspective projection), generating an image set containing the target area that serves as training samples for target detection. In some embodiments, the image set containing the target area can likewise be stored in the target detection database for training the target detection network model. The target detection network model may be included in the target detection database or may be independent of it. The target detection network model includes a deep learning model that takes an image as input and outputs object categories and position coordinates, such as YOLO (You Only Look Once), Faster R-CNN, or SSD (Single Shot MultiBox Detector). Compared with the traditional approach of photographing the target area (such as the area where the target object is located) with a camera to obtain an image set and then marking the target area in every image by hand, the training sample generation scheme of the present application is fast and efficient. After the computer device obtains the scene positioning image, the scene positioning image is input into the target detection network model, the target area identification information contained in the model output is determined as the at least one target area identification information contained in the scene positioning image, and the recognition position information contained in the output is determined as the position information corresponding to the target area identification information.
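As a sketch of how annotated 3D target regions can be mapped onto images in batches, assume a pinhole camera with known intrinsics and a known pose for each rendered image; the helper below (the name `project_region` and the box format are illustrative, not from the patent) projects a region's 3D corners and returns a 2D bounding box usable as a detection label:

```python
import cv2
import numpy as np

def project_region(corners_3d, rvec, tvec, K, dist=None):
    """Project the 3D corners of an annotated target region into one
    image and return an axis-aligned 2D bounding box (x0, y0, x1, y1).

    corners_3d : (N, 3) array, region corners in world coordinates
    rvec, tvec : camera pose for this image (world -> camera)
    K          : 3x3 camera intrinsic matrix
    """
    pts, _ = cv2.projectPoints(
        np.asarray(corners_3d, dtype=np.float64),
        rvec, tvec, K,
        np.zeros(5) if dist is None else dist)
    pts = pts.reshape(-1, 2)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return float(x0), float(y0), float(x1), float(y1)

# Repeating this over every image rendered from the texture map yields a
# labeled detection training set without manual per-image annotation.
```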
In some embodiments, the method further includes step S114 (not shown). In step S114, the target area sparse point cloud feature information of the editing operation in the sparse point cloud map of the scene is determined according to the target area, and a corresponding target detection database is established or updated according to the target area sparse point cloud feature information; the region sample information of the editing operation in the texture map of the scene is also determined according to the target area, and a corresponding target detection network model is trained according to the region sample information. In step S1031, the scene sparse point cloud feature information corresponding to the scene positioning image is extracted and matched against the target area sparse point cloud feature information in the target detection database; if the match succeeds, at least one target area identification information contained in the scene positioning image and its recognition position information are determined; if the match fails, the scene positioning image is input into the target detection network model to determine at least one target area identification information contained in the scene positioning image and its recognition position information. For example, the advantage of a target detection method based on feature point clouds is that it can be created and used immediately and recognizes quickly; its disadvantage is poor robustness: when the posture of the target or the background environment changes, its recognition rate drops. Because of this, the target detection module uses feature point clouds for recognition on the one hand and a deep learning model on the other. The computer device can therefore first perform target detection on the scene positioning image through feature point matching; when the target position and the scene have not changed, the sparse point cloud features complete the detection task quickly. If the position or scene changes significantly, feature matching fails, and detection falls back to the deep learning model, which is more robust; the target area identification information contained in the model output is then determined as the at least one target area identification information contained in the scene positioning image, and the recognition position information contained in the output is determined as the position information corresponding to the target area identification information.
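The fast-path/fallback behavior described above can be organized as a simple cascade; in this sketch, `match_by_features` and `detect_by_network` are stand-ins for the two recognizers (both names are assumptions), each returning a possibly empty list of (target identification, box) detections:

```python
from typing import Callable, List, Tuple

Detection = Tuple[str, Tuple[float, float, float, float]]  # (target id, box)

def detect_targets(
    image,
    match_by_features: Callable[[object], List[Detection]],
    detect_by_network: Callable[[object], List[Detection]],
) -> List[Detection]:
    """Fast path first: sparse point cloud feature matching.
    Fallback: the trained detection network, slower but robust to
    pose and background changes."""
    detections = match_by_features(image)
    if detections:                      # match succeeded
        return detections
    return detect_by_network(image)     # match failed -> robust fallback
```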
In some embodiments, in step S102, a scene positioning image of the current scene is obtained, and the camera pose information corresponding to the scene positioning image is determined by feature point matching and a PnP algorithm, based on the scene positioning image and the scene sparse point cloud maps of one or more scene records in the spatial positioning database. Here, feature point matching is an image matching method that takes points with certain locally distinctive properties extracted from the image (called feature points) as conjugate entities, takes the attribute parameters of the feature points (their feature descriptors) as matching entities, and registers the conjugate entities by computing a similarity measure. The PnP (Perspective-n-Point) algorithm solves the correspondence between 3D space and 2D points: given the coordinates of 3D points, the coordinates of the corresponding 2D points, and the intrinsic matrix, it solves for the pose of the camera. The computer device first extracts features from the scene positioning image, then matches them against the scene sparse point cloud map in the spatial positioning database, and then obtains the camera pose information through a PnP computation.
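Assuming the feature matching step has already produced 2D-3D correspondences between image feature points and map points from the sparse point cloud map, the pose can be recovered with a RANSAC-wrapped PnP solver; a minimal OpenCV sketch:

```python
import cv2
import numpy as np

def estimate_camera_pose(pts_3d, pts_2d, K):
    """Solve PnP with RANSAC to reject mismatched correspondences.

    pts_3d : (N, 3) map points from the sparse point cloud map
    pts_2d : (N, 2) matched feature points in the scene positioning image
    K      : 3x3 camera intrinsic matrix
    Returns (R, t): rotation matrix and translation, world -> camera.
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts_3d, dtype=np.float64),
        np.asarray(pts_2d, dtype=np.float64),
        K, distCoeffs=None,
        flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("PnP failed: too few consistent matches")
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R, tvec
```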
In some embodiments, in step S104, the tag image position information of the one or more scene AR tag information is determined according to the camera pose information and the tag position information of the one or more scene AR tag information, and the tag content information of the one or more scene AR tag information is then superimposed and presented in the scene positioning image at that tag image position information. For example, after the computer device obtains the tag position information of the scene AR tag information, it can perform a coordinate transformation based on the camera pose information and the tag position information to determine the tag image position information of the AR tag information in the image coordinate system of the scene positioning image, and then superimpose the tag content at the corresponding position in the scene positioning image.
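A sketch of that coordinate transformation and overlay, assuming the standard pinhole projection with the world-to-camera pose (R, t) from the PnP step above (the drawing style is purely illustrative):

```python
import cv2
import numpy as np

def overlay_tag(image, tag_pos_world, tag_text, R, t, K):
    """Transform a 3D tag anchor into pixel coordinates and draw it."""
    p_cam = R @ np.asarray(tag_pos_world, dtype=np.float64) + t.reshape(3)
    if p_cam[2] <= 0:
        return image                    # tag is behind the camera
    uvw = K @ p_cam                     # pinhole projection
    u, v = int(uvw[0] / uvw[2]), int(uvw[1] / uvw[2])
    cv2.circle(image, (u, v), 4, (0, 255, 0), -1)
    cv2.putText(image, tag_text, (u + 6, v - 6),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return image
```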
The above mainly described the embodiments of the method for presenting AR tag information according to one aspect of the present application. In addition, the present application also provides a specific device capable of implementing the above embodiments, which is introduced below in conjunction with Figure 2.
Figure 2 shows a device for presenting AR tag information according to one aspect of the present application, wherein the device includes a one-one module 101, a one-two module 102, a one-three module 103, and a one-four module 104. The one-one module 101 is used to establish or update a spatial positioning database, wherein the spatial positioning database includes scene records of one or more scenes, each scene record includes the sparse point cloud map and texture map of the corresponding scene and one or more AR tag information of that scene, and the AR tag information includes tag content information. The one-two module 102 is used to obtain a scene positioning image of the current scene and determine the camera pose information corresponding to the scene positioning image. The one-three module 103 is used to obtain tag position information of one or more scene AR tag information based on the scene positioning image, wherein the one or more scene AR tag information is included in the AR tag information of the spatial positioning database. The one-four module 104 is used to superimpose and present the tag content information of the one or more scene AR tag information in the scene positioning image according to the camera pose information and the tag position information of the one or more scene AR tag information. The computer device includes but is not limited to user equipment, network equipment, or a combination of user equipment and network equipment. The user equipment includes but is not limited to any electronic product capable of human-computer interaction, such as smartphones, tablet computers, smart glasses, drones, and surveillance cameras; the network equipment includes but is not limited to computers, network hosts, a single network server, a set of multiple network servers, or a cloud composed of multiple servers, such as ground control center servers and business platforms. Here, smart glasses are taken as an example to illustrate the following embodiments; those skilled in the art should understand that these embodiments are equally applicable to other computer devices.
Here, the specific implementations corresponding to the one-one module 101, one-two module 102, one-three module 103, and one-four module 104 shown in Figure 2 are the same as or similar to the embodiments of step S101, step S102, step S103, and step S104 shown in the aforementioned Figure 1, and are therefore not repeated here; they are incorporated by reference.
In some embodiments, the device further includes a one-five module (not shown), used to obtain a scene image corresponding to a mapping scene, to obtain the sparse point cloud map and texture map of the mapping scene corresponding to the scene image, and to establish a scene record of the mapping scene based on the sparse point cloud map and texture map of the mapping scene. In some embodiments, the device further includes a one-six module (not shown), used to obtain mapping AR tag information in the mapping scene based on the sparse point cloud map and/or the texture map of the mapping scene, wherein the mapping AR tag information includes tag content information; here, establishing the scene record of the mapping scene based on the sparse point cloud map and texture map of the mapping scene includes: establishing the scene record of the mapping scene based on the sparse point cloud map, the texture map, and the mapping AR tag information of the mapping scene; and the one-one module 101 is used to establish or update the spatial positioning database based on the scene record of the mapping scene, wherein the spatial positioning database includes scene records of one or more scenes, each scene record includes the sparse point cloud map and texture map of the corresponding scene and one or more AR tag information of that scene, the AR tag information includes tag content information, and the mapping scene is included in the one or more scenes. In some embodiments, the device further includes a one-seven module (not shown), used to obtain mapping AR tag information in the mapping scene based on the sparse point cloud map and/or the texture map of the mapping scene, wherein the mapping AR tag information includes tag content information, and to update the scene record corresponding to the mapping scene in the spatial positioning database according to the mapping AR tag information.
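As an illustration only (the patent does not prescribe a storage schema), a scene record and the spatial positioning database could be organized along these lines:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class ARTag:
    content: str                                            # tag content information
    position: Optional[Tuple[float, float, float]] = None   # static tags: 3D anchor
    target_id: Optional[str] = None                         # dynamic tags: target id

@dataclass
class SceneRecord:
    sparse_point_cloud_map: object   # point/line features, used for positioning
    texture_map: object              # dense 3D model, used for editing
    ar_tags: List[ARTag] = field(default_factory=list)

# The spatial positioning database maps a scene identifier to its record.
spatial_db: Dict[str, SceneRecord] = {}
```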
In some embodiments, the mapping AR tag information includes but is not limited to: static AR tag information, wherein the static AR tag information further includes corresponding tag position information; and dynamic AR tag information, wherein the dynamic AR tag information further includes the target identification information of the corresponding dynamic target.
In some embodiments, the device further includes a one-eight module (not shown), used to perform mask optimization on the sparse point cloud map of the mapping scene to obtain an optimized sparse point cloud map; here, establishing the scene record of the mapping scene based on the sparse point cloud map and texture map of the mapping scene includes: establishing the scene record of the mapping scene based on the optimized sparse point cloud map and the texture map of the mapping scene.
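The patent does not spell out the mask optimization itself; one plausible reading, sketched under that assumption, is discarding map points that were observed inside masked image regions (for example, regions covering dynamic objects):

```python
import numpy as np

def mask_optimize(points_3d, pixels_2d, mask):
    """Keep only map points whose source pixels lie outside the mask.

    points_3d : (N, 3) sparse map points
    pixels_2d : (N, 2) integer pixel coordinates each point was observed at
    mask      : (H, W) boolean array, True where points should be removed
    """
    cols, rows = pixels_2d[:, 0], pixels_2d[:, 1]
    keep = ~mask[rows, cols]          # boolean filter over all N points
    return points_3d[keep]
```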
In some embodiments, the device further includes a one-nine module (not shown), used to obtain a user's editing operation in the texture map of a scene and to determine one or more AR tag information of that scene based on the editing operation, wherein the AR tag information includes tag content information.
In some embodiments, the AR tag information includes static AR tag information; the device further includes a one-ten module (not shown), used to determine and store the tag position information of the static AR tag information according to the editing position information of the editing operation in the texture map of the scene; and the one-three module 103 is used to determine, according to the scene positioning image, one or more corresponding scene AR tag information among the one or more static AR tag information in the spatial positioning database, and to query the tag position information of the one or more scene AR tag information.
In some embodiments, the AR tag information includes dynamic AR tag information; the device further includes a one-eleven module (not shown), used to determine a target area according to the editing area of the editing operation in the texture map of the scene and to determine the target area identification information of that target area. The one-three module 103 includes a one-three-one unit (not shown) and a one-three-two unit (not shown). The one-three-one unit is used to perform target detection according to the scene positioning image and determine at least one target area identification information in the scene positioning image, together with the position information of the at least one target area identification information. The one-three-two unit is used to query and determine at least one corresponding dynamic AR tag information according to the at least one target area identification information, determine the at least one dynamic AR tag information as the corresponding scene AR tag information, and determine the position information of the at least one target area identification information as the tag position information of the at least one scene AR tag information, wherein the at least one scene AR tag information is included in the AR tag information of the spatial positioning database.
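A minimal sketch of the lookup performed by the one-three-two unit, assuming dynamic AR tags are indexed by target identification information as in the `ARTag` sketch above (all names remain illustrative):

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]

def bind_dynamic_tags(
    detections: List[Tuple[str, Box]],    # (target id, detected box)
    dynamic_tags: Dict[str, "ARTag"],     # target id -> dynamic AR tag
) -> List[Tuple["ARTag", Box]]:
    """For each detected target area, look up its dynamic AR tag and
    bind the detection position as the tag position for rendering."""
    bound = []
    for target_id, box in detections:
        tag = dynamic_tags.get(target_id)
        if tag is not None:
            bound.append((tag, box))
    return bound
```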
In some embodiments, the device further includes a one-twelve module (not shown), used to determine, in the sparse point cloud map of the scene, the target area sparse point cloud feature information corresponding to the target area of the editing operation, and to establish or update a corresponding target detection database according to the target area sparse point cloud feature information; here, the one-three-one unit is used to extract the scene sparse point cloud feature information corresponding to the scene positioning image and match it against the target area sparse point cloud feature information in the target detection database, so as to determine at least one target area identification information contained in the scene positioning image and the position information of the at least one target area identification information.
In some embodiments, the device further includes a one-thirteen module (not shown), used to determine, in the texture map of the scene, the region sample information of the target area of the editing operation, and to train a corresponding target detection network model according to the region sample information; here, the one-three-one unit is used to input the scene positioning image into the target detection network model and determine at least one target area identification information contained in the scene positioning image, together with the position information of the at least one target area identification information.
In some embodiments, the device further includes a one-fourteen module (not shown), used to determine the target area sparse point cloud feature information of the editing operation in the sparse point cloud map of the scene according to the target area and to establish or update a corresponding target detection database according to the target area sparse point cloud feature information, and to determine the region sample information of the editing operation in the texture map of the scene according to the target area and to train a corresponding target detection network model according to the region sample information. Here, the one-three-one unit is used to extract the scene sparse point cloud feature information corresponding to the scene positioning image and match it against the target area sparse point cloud feature information in the target detection database; if the match succeeds, at least one target area identification information contained in the scene positioning image and its recognition position information are determined; if the match fails, the scene positioning image is input into the target detection network model to determine at least one target area identification information contained in the scene positioning image and its recognition position information.
In some embodiments, the one-two module 102 is used to obtain a scene positioning image of the current scene and to determine the camera pose information corresponding to the scene positioning image by feature point matching and a PnP algorithm, based on the scene positioning image and the scene sparse point cloud maps of one or more scene records in the spatial positioning database.
In some embodiments, the one-four module 104 is used to determine the tag image position information of the one or more scene AR tag information according to the camera pose information and the tag position information of the one or more scene AR tag information, and to superimpose and present the tag content information of the one or more scene AR tag information in the scene positioning image at that tag image position information.
Here, the specific implementations corresponding to the one-five module through the one-fourteen module are the same as or similar to the embodiments of the aforementioned steps S105 through S114, and are therefore not repeated here; they are incorporated by reference.
In addition to the methods and devices described in the above embodiments, the present application also provides a computer-readable storage medium storing computer code; when the computer code is executed, the method described in any one of the preceding items is performed.
The present application also provides a computer program product; when the computer program product is executed by a computer device, the method described in any one of the preceding items is performed.
The present application also provides a computer device, the computer device comprising:
one or more processors;
a memory for storing one or more computer programs;
when the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the method described in any one of the preceding items.
Figure 3 shows an exemplary system that can be used to implement the various embodiments described in this application.
As shown in Figure 3, in some embodiments the system 300 can serve as any of the above-mentioned devices in the various embodiments. In some embodiments, the system 300 may include one or more computer-readable media (for example, system memory or non-volatile memory (NVM)/storage device 320) having instructions, and one or more processors (for example, processor(s) 305) coupled to the one or more computer-readable media and configured to execute the instructions to implement modules that perform the actions described in this application.
For one embodiment, the system control module 310 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 305 and/or to any suitable device or component in communication with the system control module 310.
The system control module 310 may include a memory controller module 330 to provide an interface to the system memory 315. The memory controller module 330 may be a hardware module, a software module, and/or a firmware module.
The system memory 315 may be used, for example, to load and store data and/or instructions for the system 300. For one embodiment, the system memory 315 may include any suitable volatile memory, for example, suitable DRAM. In some embodiments, the system memory 315 may include double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, the system control module 310 may include one or more input/output (I/O) controllers to provide interfaces to the NVM/storage device 320 and the communication interface(s) 325.
For example, the NVM/storage device 320 may be used to store data and/or instructions. The NVM/storage device 320 may include any suitable non-volatile memory (for example, flash memory) and/or any suitable non-volatile storage device(s) (for example, one or more hard disk drives (HDD), one or more compact disc (CD) drives, and/or one or more digital versatile disc (DVD) drives).
The NVM/storage device 320 may include storage resources that are physically part of the device on which the system 300 is installed, or it may be accessible to the device without necessarily being part of it. For example, the NVM/storage device 320 may be accessed over a network via the communication interface(s) 325.
The communication interface(s) 325 may provide an interface for the system 300 to communicate over one or more networks and/or with any other suitable devices. The system 300 may communicate wirelessly with one or more components of a wireless network in accordance with any of one or more wireless network standards and/or protocols.
For one embodiment, at least one of the processor(s) 305 may be packaged together with the logic of one or more controllers of the system control module 310 (for example, the memory controller module 330). For one embodiment, at least one of the processor(s) 305 may be packaged together with the logic of one or more controllers of the system control module 310 to form a System in a Package (SiP). For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with the logic of one or more controllers of the system control module 310. For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with the logic of one or more controllers of the system control module 310 to form a System on Chip (SoC).
In various embodiments, the system 300 may be, but is not limited to: a server, a workstation, a desktop computing device, or a mobile computing device (for example, a laptop computing device, a handheld computing device, a tablet computer, a netbook, etc.). In various embodiments, the system 300 may have more or fewer components and/or a different architecture. For example, in some embodiments, the system 300 includes one or more cameras, a keyboard, a liquid crystal display (LCD) screen (including a touch-screen display), a non-volatile memory port, multiple antennas, a graphics chip, an application-specific integrated circuit (ASIC), and a speaker.
It should be noted that the present application can be implemented in software and/or a combination of software and hardware, for example, using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, a software program of the present application can be executed by a processor to implement the steps or functions described above. Likewise, software programs of the present application (including related data structures) can be stored in a computer-readable recording medium, for example, RAM, a magnetic or optical drive, a floppy disk, or similar devices. In addition, some steps or functions of the present application can be implemented in hardware, for example, as a circuit that cooperates with a processor to perform the individual steps or functions.
In addition, part of the present application may be applied as a computer program product, for example, computer program instructions; when these are executed by a computer, the methods and/or technical solutions according to the present application can be invoked or provided through the operation of that computer. Those skilled in the art should understand that computer program instructions exist in computer-readable media in forms including but not limited to source files, executable files, and installation package files; accordingly, the ways in which computer program instructions are executed by a computer include but are not limited to: the computer executing the instructions directly; the computer compiling the instructions and then executing the corresponding compiled program; the computer reading and executing the instructions; or the computer reading and installing the instructions and then executing the corresponding installed program. Here, the computer-readable medium may be any available computer-readable storage medium or communication medium accessible to a computer.
Communication media include media by which communication signals containing, for example, computer-readable instructions, data structures, program modules, or other data are transmitted from one system to another. Communication media may include guided transmission media, such as cables and wires (for example, optical fiber and coaxial cable), and wireless (unguided transmission) media capable of propagating energy waves, such as acoustic, electromagnetic, radio frequency (RF), microwave, and infrared media. Computer-readable instructions, data structures, program modules, or other data may be embodied, for example, as a modulated data signal in a wireless medium such as a carrier wave or a similar mechanism embodied as part of spread spectrum technology. The term "modulated data signal" refers to a signal whose one or more characteristics are altered or set in such a manner as to encode information in the signal. The modulation may be analog, digital, or a hybrid modulation technique.
By way of example and not limitation, computer-readable storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. For example, computer-readable storage media include, but are not limited to: volatile memory, such as random access memory (RAM, DRAM, SRAM); non-volatile memory, such as flash memory, various read-only memories (ROM, PROM, EPROM, EEPROM), and magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM); magnetic and optical storage devices (hard disks, magnetic tape, CDs, DVDs); and other media now known or later developed that can store computer-readable information/data for use by a computer system.
Here, an embodiment according to the present application includes an apparatus that includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the apparatus is triggered to run the methods and/or technical solutions based on the aforementioned multiple embodiments of the present application.
It is obvious to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments and that the present application can be implemented in other specific forms without departing from its spirit or basic features. The embodiments should therefore be regarded, in every respect, as exemplary and non-restrictive, and the scope of the present application is defined by the appended claims rather than by the above description; all changes falling within the meaning and scope of equivalents of the claims are therefore intended to be included in the present application. No reference sign in the claims should be construed as limiting the claim concerned. Moreover, it is obvious that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices stated in a device claim may also be implemented by one unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.