TECHNICAL FIELD

This application relates to the field of augmented reality (AR) technology, and in particular to a method for acquiring image depth information, an electronic device, and a storage medium.
BACKGROUND

Augmented reality (AR) is a technology that supplements the real world with generated virtual information, so that the virtual information and the real world appear to coexist in the same space. At present, most AR applications running on electronic devices simply superimpose a virtual scene in front of the real scene. This simple superimposition fails to handle the occlusion relationship between the virtual scene and the real scene correctly and gives the user an incorrect spatial perception.

Therefore, to strengthen the realism of virtual objects in real scenes and give users a correct spatial perception in AR applications, it is important to correctly handle the virtual-real occlusion relationship between the virtual objects and the real scene. Virtual-real occlusion processing in AR generally relies on means such as depth extraction: the occlusion contour of a foreground object in the real-scene image is obtained, the depth information of the foreground object is restored, and a virtual-real fused image with the correct occlusion relationship is generated.

Restoring the depth information of scene objects generally refers to obtaining an absolute depth map of the scene from a monocular image. That is, the monocular image is processed with a relative depth model, such as a convolutional neural network (CNN) model, to obtain a relative depth map, where the monocular image is a single image of the scene. Then, combined with feature points from a simultaneous localization and mapping (SLAM) system, the absolute depth map of the scene is obtained through translation, rotation, and scaling transformations. However, the SLAM feature points may be distributed unevenly, for example with local vacancies, so the obtained absolute depth map deviates considerably from the true values. This produces an incorrect virtual-real occlusion relationship, which in turn reduces the realism of the virtual scene that the AR application adds to the real scene and may even impair the user's spatial perception.
SUMMARY

To solve the above problems, this application provides a method for acquiring image depth information, an electronic device, and a storage medium, which are used to acquire correct image depth information and thereby a correct virtual-real occlusion relationship, improving the realism of the virtual scene that an AR application adds to a real scene and giving the user a correct spatial perception.

In a first aspect, this application provides a method for acquiring image depth information, the method including:

acquiring a plurality of first weight coefficients from a plurality of feature points in an image to a target point, where one first weight coefficient represents the degree of influence of one of the feature points on the transformation coefficients of the target point, and the transformation coefficients represent a translation amount and a scaling amount;

determining a bias coefficient corresponding to the target point according to the dispersion of the plurality of first weight coefficients, where the dispersion represents the degree of scatter of the first weight coefficients and the bias coefficient is negatively correlated with the dispersion, and adjusting the plurality of first weight coefficients with the bias coefficient; and

acquiring the transformation coefficients of the target point according to the adjusted first weight coefficients, and acquiring the image depth information of the target point according to the transformation coefficients of the target point.
In the solution provided by this application, when the target point lies in a region without valid feature points, such as a locally vacant region, each of the feature points has little influence on the transformation coefficients of the target point, and the acquired first weight coefficients are almost all 0. In this case the dispersion of the first weight coefficients is low, the acquired bias coefficient is large, and the first weight coefficients are adjusted with this large bias coefficient. When the target point lies in other regions, each of the feature points has a strong influence on the transformation coefficients of the target point, the acquired first weight coefficients are highly dispersed, the bias coefficient is small, and the first weight coefficients are adjusted with this small bias coefficient. In this way, the influence of the feature points on the target point is averaged out, which mitigates the loss of accuracy in the transformation coefficients caused by local vacancies and improves the accuracy of the acquired image depth information.
In a possible implementation, the dispersion is a standard deviation, and the bias coefficient corresponding to the target point is determined from the standard deviation of the plurality of first weight coefficients based on a preset mapping between bias coefficient and standard deviation, where the bias coefficient is a number in the range of 0 to 1.
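As a minimal sketch of one such mapping (the exponential form and the scale parameter `tau` are illustrative assumptions; the application only requires a bias coefficient in [0, 1] that decreases as the standard deviation grows):

```python
import numpy as np

def bias_from_std(weights: np.ndarray, tau: float = 0.1) -> float:
    """Map the standard deviation of the first weight coefficients to a
    bias coefficient in [0, 1]; lower dispersion yields a larger bias.
    The exponential form and tau are illustrative assumptions."""
    std = float(np.std(weights))
    return float(np.exp(-std / tau))  # std = 0 -> bias 1; large std -> bias near 0
```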
In a possible implementation, the distance from each of the plurality of feature points to the target point is acquired, and the first weight coefficient is acquired according to that distance, where the first weight coefficient is negatively correlated with the distance and is a number in the range of 0 to 1.

In a possible implementation, the first weight coefficient is acquired according to the distance from each feature point to the target point and the confidence of each feature point. The first weight coefficient is positively correlated with the confidence, that is, the higher the confidence, the larger the first weight coefficient, where the confidence represents the probability that the true value of the feature point falls near the measured result. The larger this probability, the higher the confidence and the larger the first weight coefficient; the smaller this probability, the lower the confidence and the smaller the first weight coefficient.
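A minimal sketch of one plausible kernel satisfying these constraints (the Gaussian form and `sigma` are assumptions; the application only fixes the monotonicity and the [0, 1] range):

```python
import numpy as np

def first_weights(feature_xy: np.ndarray, target_xy: np.ndarray,
                  confidence: np.ndarray, sigma: float = 50.0) -> np.ndarray:
    """First weight coefficients for one target point.

    A Gaussian kernel gives a weight in (0, 1] that decreases with the
    pixel distance from each of the N feature points to the target point,
    and the per-feature confidence (assumed in [0, 1]) scales it.
    feature_xy: N x 2 positions; target_xy: length-2 position.
    """
    d = np.linalg.norm(feature_xy - np.asarray(target_xy), axis=1)  # N distances
    return np.exp(-(d ** 2) / (2.0 * sigma ** 2)) * confidence
```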
In a possible implementation, a diagonal matrix with the adjusted first weight coefficients as its main diagonal elements, that is, a weight matrix, is constructed, and the transformation coefficients of the target point are calculated using the weight matrix and a preset curve fitting method.

In a possible implementation, the preset curve fitting method is the least squares method.
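Under this formulation, with the adjusted first weight coefficients placed on the main diagonal of the weight matrix $W_p$, the weighted least squares solution for the transformation coefficients $\beta(p) = (\mu, s)^{T}$ of a target point $p$ would take the standard closed form (a sketch assuming $X^{T}W_{p}X$ is invertible; the notation follows the formulas in the detailed description below):

$$\hat{\beta}(p) = \left(X^{T} W_{p} X\right)^{-1} X^{T} W_{p}\, y, \qquad W_{p} = \mathrm{diag}\left(w_{1}, \dots, w_{N}\right)$$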
In a possible implementation, the method further includes:

acquiring the transformation coefficients of other target points from the transformation coefficients of the target point using an interpolation algorithm, and determining the image depth information of the other target points according to their transformation coefficients.

In a possible implementation, the interpolation algorithm is a bilinear interpolation algorithm.

In this way, the transformation coefficients only need to be computed for some of the target points, and the transformation coefficients of the remaining target points can be obtained with the interpolation algorithm. Compared with adjusting the weight coefficients with the bias coefficient and then solving for the transformation coefficients of every target point, the interpolation algorithm requires far less computation, which improves the efficiency of acquiring image depth information.
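A minimal sketch of bilinear interpolation over a sparse lattice of already-solved transformation coefficients (the H x W x 2 grid layout is an illustrative assumption):

```python
import numpy as np

def interpolate_coeffs(grid: np.ndarray, x: float, y: float) -> np.ndarray:
    """Bilinearly interpolate transformation coefficients at (x, y).

    `grid` holds coefficients (mu, s) solved at target points arranged on
    an H x W lattice (shape H x W x 2); (x, y) is a fractional position on
    that lattice.  The lattice layout is an illustrative assumption.
    """
    h, w = grid.shape[:2]
    x0 = int(np.clip(np.floor(x), 0, w - 1))
    y0 = int(np.clip(np.floor(y), 0, h - 1))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * grid[y0, x0] + fx * grid[y0, x1]        # blend along x, upper row
    bottom = (1 - fx) * grid[y1, x0] + fx * grid[y1, x1]     # blend along x, lower row
    return (1 - fy) * top + fy * bottom                      # blend along y
```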
In a possible implementation, the target point is a pixel.

In a possible implementation, the target point is a local center point, which is acquired as follows:

acquiring a second weight coefficient from each of the plurality of feature points to each pixel, where one second weight coefficient represents the degree of influence of one feature point on the transformation coefficients of one pixel; for example, N feature points and M pixels yield N*M second weight coefficients; and

acquiring a feature point distribution density map of the image according to the plurality of second weight coefficients, segmenting the density map into multiple blocks based on a preset rule, and determining the local center point of each block, each local center point being one target point. When acquiring image depth information, only the image depth information of each local center point needs to be computed rather than that of every pixel, which reduces the computation workload and improves the efficiency of acquiring image depth information.
In a possible implementation, the multiple second weight coefficients corresponding to each pixel are accumulated to obtain an accumulated weight value, and the feature point distribution density map is obtained from the accumulated weight values.

In a possible implementation, the preset rule is that pixels whose accumulated weight values differ within a preset difference range and whose distances lie within a preset distance range are assigned to the same block.

In a possible implementation, the feature point distribution density map is segmented with a superpixel segmentation algorithm.
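A minimal sketch of this density-map construction and segmentation, using scikit-image's SLIC as one concrete superpixel algorithm and the block centroid as the local center point (the N x M weight layout, the choice of SLIC and its parameters, and the centroid rule are all illustrative assumptions):

```python
import numpy as np
from skimage.segmentation import slic  # one concrete superpixel algorithm

def local_centers(second_weights: np.ndarray, image_shape: tuple, n_segments: int = 64):
    """second_weights: N x M array, one row per feature point, one column per pixel."""
    h, w = image_shape
    density = second_weights.sum(axis=0).reshape(h, w)      # accumulated weight per pixel
    density = density / (density.max() + 1e-12)             # normalize for segmentation
    labels = slic(density, n_segments=n_segments, compactness=0.1, channel_axis=None)
    centers = []
    for lab in np.unique(labels):
        ys, xs = np.nonzero(labels == lab)
        centers.append((int(ys.mean()), int(xs.mean())))    # centroid of each block
    return density, centers
```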
In a second aspect, this application further provides an electronic device that includes a memory and a processor coupled to the memory. The memory stores program instructions that, when executed by the processor, cause the electronic device to perform the first aspect above or any of its implementations.

In a third aspect, this application provides a computer-readable medium including computer-readable instructions that, when run on a computing device, cause the computing device to perform the first aspect above or any of its implementations.

It should be understood that the descriptions of technical features, technical solutions, beneficial effects, or similar language in this application do not imply that all of the features and advantages can be realized in any single embodiment. Rather, a description of a feature or beneficial effect means that at least one embodiment includes that specific technical feature, technical solution, or beneficial effect. Therefore, descriptions of technical features, technical solutions, or beneficial effects in this specification do not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions, and beneficial effects described in the embodiments may be combined in any suitable manner. Those skilled in the art will understand that an embodiment can be implemented without one or more of the specific technical features, technical solutions, or beneficial effects of a particular embodiment. In other embodiments, additional technical features and beneficial effects may be identified in particular embodiments that do not embody all of the embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of this application;
FIG. 2 is a schematic diagram of the software structure of an electronic device according to an embodiment of this application;
FIG. 3A is a schematic diagram of an RGB color image of a real scene according to an embodiment of this application;
FIG. 3B is a schematic diagram of a relative depth information map according to an embodiment of this application;
FIG. 3C is a schematic diagram of feature points according to an embodiment of this application;
FIG. 3D is a schematic diagram of an absolute depth information map according to an embodiment of this application;
FIG. 4 is a schematic diagram of a mobile phone interface according to an embodiment of this application;
FIG. 5A is a schematic diagram of a shooting interface according to an embodiment of this application;
FIG. 5B is a schematic diagram of another shooting interface according to an embodiment of this application;
FIG. 5C is a schematic diagram of another shooting interface containing an option menu according to an embodiment of this application;
FIG. 5D is a schematic diagram of a shooting interface with the control placed in the photo function according to an embodiment of this application;
FIG. 5E is a schematic diagram of a shooting interface with the control placed in the video function according to an embodiment of this application;
FIG. 6 is a flowchart of a method for enabling a first function for "AR image" preview according to an embodiment of this application;
FIG. 7A is a flowchart of a method for acquiring image depth information according to an embodiment of this application;
FIG. 7B is a schematic diagram of SLAM feature points according to an embodiment of this application;
FIG. 7C is a schematic diagram of an image depth information map according to an embodiment of this application;
FIG. 8A is a flowchart of another method for acquiring image depth information according to an embodiment of this application;
FIG. 8B is a schematic diagram of SLAM feature points according to an embodiment of this application;
FIG. 8C is a feature point distribution density map according to an embodiment of this application;
FIG. 8D is a block segmentation diagram of a feature point distribution density map according to an embodiment of this application;
FIG. 8E is a schematic diagram of another image depth information map acquired in an embodiment of this application;
FIG. 9 is a schematic structural diagram of an apparatus for acquiring image depth information according to an embodiment of this application.
DETAILED DESCRIPTION OF EMBODIMENTS

The terms "first", "second", and "third" in the specification, claims, and drawing descriptions of this application are used to distinguish different objects rather than to define a specific order.

In the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, an illustration, or an explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application shall not be construed as more preferred or advantageous than other embodiments or designs. Rather, the use of such words is intended to present related concepts in a concrete manner.

To enable those skilled in the art to understand the technical solution of this application more clearly, the application scenarios of the technical solution are described first.

The image depth information acquisition method provided by this application is applied to an electronic device. The electronic device may be a mobile phone, a laptop computer, a wearable electronic device (for example, a smart watch), a tablet computer, an AR device, an in-vehicle device, or the like. The electronic device may include one or more cameras, and its application programs may provide an "AR special effects" function and/or a blurring function.
The "AR special effects" function can be used to fuse the captured real scene with a virtual scene during photographing, video recording, or a video call, so as to enrich the content of the real scene. For example, during a video call, a virtual scene such as a hat can be superimposed on the caller's face in real time to enrich the scene content and make the video conversation more entertaining. The fusion of the real and virtual scenes involved in the "AR special effects" function may include: calculating the depth information of the real scene image; obtaining the spatial position relationship between the virtual scene and the real scene, that is, the correct virtual-real occlusion relationship, according to the user's viewpoint position, the superimposition position of the virtual scene, and the depth information; and finally fusing the real scene and the virtual scene using the occlusion relationship, so that the fused image gives the user a correct spatial perception.

The blurring function can be used to highlight the subject of an image and blur its background during photographing or video recording, making the image more vivid. For example, when shooting a landscape, blurring the other captured objects highlights the scenery and makes the image more vivid. The background blurring involved in the blurring function may include: calculating the image depth information, filtering the depth, identifying a region of interest based on the depth information, and blurring pixels at different depths to different degrees, thereby blurring the background and highlighting the subject.

In the embodiments of this application, the "AR special effects" function or the blurring function may be integrated into the "Camera" application of an electronic device such as a mobile phone, specifically into the "Photo" and "Video" functions of the "Camera" application. When the electronic device performs the "Photo" or "Video" function, "AR special effects" or blurring can be selected through a configuration file for photographing and recording. The "AR special effects" function or the blurring function may also be integrated into the "video call" function of a communication application on an electronic device such as a mobile phone, and the "AR special effects" function may also be provided as a standalone application.
To make the technical solution of this application clearer and easier to understand, the electronic device and its image processing system architecture are introduced first.

An embodiment of this application provides an electronic device 100, as shown in FIG. 1. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a mobile communication module 150, a sensor module 180, buttons 190, a camera 193, a display screen 194, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a fingerprint sensor 180H, a touch sensor 180K, and the like.

It can be understood that the structure illustrated in this embodiment of the present invention does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.

A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. The memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can call them directly from the memory, which avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.

The electronic device 100 implements the display function through the GPU, the display screen 194, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display screen 194 and the application processor. The GPU performs the mathematical and geometric calculations for graphics rendering, such as image blurring and AR special effects. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.

The display screen 194 is used to display images, videos, and the like. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light-emitting diodes (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.

The electronic device 100 can implement the shooting function through the image signal processor (ISP), the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.

The ISP is used to process the data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light is transmitted through the lens to the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP, which converts it into an image visible to the naked eye. The ISP can also algorithmically optimize the noise, brightness, and skin tone of the image, optimize parameters such as the exposure and color temperature of the shooting scene, and apply blurring or AR special-effect processing to the image. In some embodiments, the ISP may be provided in the camera 193.

In some embodiments, the camera 193 may be composed of a color camera module and a 3D sensing module.

In some embodiments, the photosensitive element of the camera in the color camera module may be a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and passes it to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing, and the DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV.
In some embodiments, the 3D sensing module may be a time-of-flight (TOF) 3D sensing module or a structured light 3D sensing module. Structured light 3D sensing is an active depth sensing technology, and the basic components of a structured light 3D sensing module may include an infrared (IR) emitter, an IR camera module, and the like. The structured light 3D sensing module works by first projecting a light spot pattern onto the photographed object, then receiving the light coding of the spot pattern on the object's surface, comparing it with the originally projected pattern, and calculating the three-dimensional coordinates of the object using the triangulation principle. These three-dimensional coordinates include the distance from the electronic device 100 to the photographed object. TOF 3D sensing is also an active depth sensing technology, and the basic components of a TOF 3D sensing module may include an infrared emitter, an IR camera module, and the like. The TOF 3D sensing module calculates the distance (that is, the depth) between the module and the photographed object from the round-trip time of the infrared light, so as to obtain a 3D depth map.
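The distance calculation behind TOF sensing is the standard round-trip relation, with $c$ the speed of light and $\Delta t$ the measured time of flight:

$$d = \frac{c \cdot \Delta t}{2}$$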
The structured light 3D sensing module can also be applied to fields such as face recognition, somatosensory game consoles, and industrial machine vision inspection. The TOF 3D sensing module can also be applied to game consoles, augmented reality (AR)/virtual reality (VR), and other fields.

In other embodiments, the camera 193 may also be composed of two or more cameras. The two or more cameras may include a color camera, which can be used to collect color image data of the photographed object, and they may use stereo vision technology to collect depth data of the photographed object. Stereo vision is based on the principle of parallax of the human eyes: under natural light, two or more cameras capture images of the same object from different angles, and calculations such as triangulation then yield the distance between the electronic device 100 and the photographed object, that is, the depth information.
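For a rectified stereo pair, this triangulation reduces to the standard relation, with $f$ the focal length, $B$ the baseline between the two cameras, and $\delta$ the disparity of the same point in the two images:

$$Z = \frac{f \cdot B}{\delta}$$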
In some embodiments, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1. Specifically, the electronic device 100 may include one front-facing camera 193 and one rear-facing camera 193. The front-facing camera 193 can generally be used to collect the color image data and depth data of the photographer facing the display screen 194, and the rear 3D camera module 193 can be used to collect the color image data and depth data of the objects the photographer faces (such as people and landscapes).

In some embodiments, the CPU, GPU, or NPU in the processor 110 may process the color image data and depth data collected by the 3D camera module 193. In some embodiments, the NPU may identify the color image data collected by the 3D camera module 193 (specifically, the color camera module) using a neural network algorithm on which feature point recognition is based, such as the simultaneous localization and mapping (SLAM) algorithm, to determine the feature points of the captured image. The CPU or GPU may also run the neural network algorithm to determine the feature points of the captured image from the color image data. In some embodiments, the CPU, GPU, or NPU may further be used to acquire the absolute depth information of the image according to the depth data collected by the 3D camera module 193 (specifically, the 3D sensing module) and the determined feature points, and to blur the image or apply "AR special effects" according to the image depth information. How "AR special effects" and blurring are applied to the captured image based on the color image data and depth data collected by the 3D camera module 193 is described in detail in the following embodiments and is not repeated here.

The digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform or the like on the frequency point energy.

The video codec is used to compress or decompress digital video. The electronic device 100 may support one or more video codecs, so that the electronic device 100 can play or record videos in multiple encoding formats, such as moving picture experts group (MPEG) 1, MPEG2, MPEG3, and MPEG4.

The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons in the human brain, it processes input information quickly and can also learn continuously by itself. Applications such as intelligent cognition of the electronic device 100, for example image recognition, face recognition, speech recognition, and text understanding, can be implemented through the NPU.

The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example saving files such as music and videos in the external memory card.

The internal memory 121 may be used to store computer-executable program code, which includes instructions. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required by at least one function (for example, a sound playing function or an image playing function). The data storage area may store data created during the use of the electronic device 100 (for example, audio data and a phone book). In addition, the internal memory 121 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS). The processor 110 executes the various functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121 and/or the instructions stored in the memory provided in the processor.
The hardware system of the electronic device 100 has been introduced in detail above; the software system of the electronic device 100 is introduced below. The software system may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservices architecture, or a cloud architecture. The embodiments of the present invention take the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.

As shown in FIG. 2, the layered Android system is divided into several layers, each with a clear role and division of labor, and the layers communicate through software interfaces. In some embodiments, the Android system is divided into four layers: from top to bottom, the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer.

The application layer may include a series of application packages, such as Camera, Gallery, Calendar, Phone, Maps, Navigation, WLAN, Bluetooth, Music, Video, and Messages.

The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer, and includes some predefined functions.

The application framework layer may include a window manager, content providers, a view system, a resource manager, and the like.

The window manager is used to manage window programs. It can obtain the display screen size, determine whether there is a status bar, lock the screen, capture the screen, and so on.

Content providers are used to store and retrieve data and make it accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and so on.

The view system includes visual controls, such as controls for displaying text and controls for displaying pictures. The view system can be used to build applications. A display interface may consist of one or more views; for example, a display interface including an SMS notification icon may include a view for displaying text and a view for displaying pictures.

The resource manager provides various resources for applications, such as localized strings, icons, pictures, layout files, and video files.

The Android runtime includes core libraries and a virtual machine, and is responsible for the scheduling and management of the Android system.

The core libraries consist of two parts: the functions that the Java language needs to call, and the Android core libraries.

The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files, and performs functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.

The system libraries may include multiple functional modules, for example a surface manager, media libraries, a 3D graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL).

The surface manager manages the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.

The media libraries support playback and recording of many common audio and video formats, as well as still image files. The media libraries can support multiple audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.

The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, layer processing, and the like.

The 2D graphics engine is a drawing engine for 2D drawing.

The kernel layer is the layer between hardware and software. It includes at least a display driver, a camera driver, an audio driver, and a sensor driver.

The workflow of the software and hardware of the electronic device 100 is illustrated below with a photo-capture scenario. When the touch sensor 180K receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the touch operation into a raw input event (including the touch coordinates, the timestamp of the touch operation, and other information); raw input events are stored at the kernel layer. The application framework layer obtains the raw input event from the kernel layer and identifies the control corresponding to the event. Taking the touch operation being a tap and the corresponding control being the camera application icon as an example: the camera application calls the interface of the application framework layer to start the camera application, which in turn starts the camera driver by calling the kernel layer, and the camera 193 captures a still image or a video.
This application takes, as an example, the electronic device 100 being a mobile phone, with the "AR special effects" function or the "image blurring" function implemented in the phone's "Camera" program. The technical solution provided by this application is described in detail below with reference to the accompanying drawings.

First, the terms involved in this application are explained.

Monocular image: image information obtained from a single camera sensor.

SLAM: simultaneous localization and mapping, mainly used to solve the problems of localization and map construction in an unknown environment.

SLAM feature points: discrete points in an image, obtained by the SLAM algorithm, that have distinctive characteristics and can effectively reflect the essential features of the image.

AR special effects: a technology that skillfully fuses virtual scenes with real scenes. It uses multimedia, 3D modeling, real-time tracking and registration, intelligent interaction, sensing, and other technical means to simulate computer-generated virtual scenes such as text, images, 3D models, and music, and applies them to the real scene to enrich its content.
Virtual-real occlusion relationship: the correct occlusion relationship that a virtual scene and a real scene should have; for example, while the virtual scene occludes the background, it should itself be occluded by foreground objects. An incorrect occlusion relationship gives the user an incorrect spatial perception. Depth-computation-based techniques are generally adopted: the depth information of the real scene image is calculated, and the spatial position relationship between the virtual scene and the real scene is obtained according to the user's viewpoint position, the superimposition position of the virtual scene, and the image depth information, so as to generate an image with the correct occlusion relationship.
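A minimal sketch of such a depth-test composition once both depth maps are available (the per-pixel comparison is the standard approach; the array conventions are assumptions):

```python
import numpy as np

def compose_ar(real_rgb: np.ndarray, real_depth: np.ndarray,
               virt_rgb: np.ndarray, virt_depth: np.ndarray) -> np.ndarray:
    """Per-pixel depth test for virtual-real occlusion.

    The virtual scene is drawn only where it is closer to the camera than
    the real scene, so foreground objects correctly occlude it.  Inputs
    are H x W x 3 color images and H x W absolute depth maps in meters;
    pixels not covered by the virtual scene can carry +inf virtual depth.
    These conventions are assumptions.
    """
    virt_in_front = (virt_depth < real_depth)[..., None]  # H x W x 1 mask
    return np.where(virt_in_front, virt_rgb, real_rgb)
```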
Relative depth information: represents the relative near-far relationship between pixels rather than true depth values. The depth information output by existing depth generation models is relative depth information.

Absolute depth information: the value of a pixel represents its true distance from the camera, usually measured in meters. Existing techniques often obtain relative depth information first and then obtain absolute depth information through transformations such as offset and scaling. As an example, the image shown in FIG. 3A (generally a color image in practice) is input into an existing relative depth generation model, which outputs the relative depth information shown in FIG. 3B. Combined with the feature points in FIG. 3C, the relative depth information is converted into the absolute depth information shown in FIG. 3D through transformations such as offset and scaling.
Because the relative depth information contains estimation errors and the feature points also contain errors, the accuracy of the solved transformation coefficients directly affects the accuracy of the absolute depth information obtained after the transformation. The transformation coefficients represent the magnitude of the transformation, such as offset and scaling, by which the relative depth information is converted into absolute depth information, and include an offset coefficient $s$ and a scaling coefficient $\mu$. To obtain more accurate transformation coefficients, in a possible design, multiple feature points are fitted with a curve fitting method, such as the global least squares method. The conversion formula is:

$$d_{abs} = \mu \cdot d_{rel} + s$$

where $d_{abs}$ is the absolute depth information and $d_{rel}$ is the relative depth information.

Specifically, the transformation coefficients $\beta = (\mu, s)^{T}$ are obtained by the least squares method:

$$\hat{\beta} = \arg\min_{\beta} \left\| X\beta - y \right\|^{2}$$

where the $i$-th row of $X$ is $\left(d_{rel,i},\ 1\right)$, composed of the relative depths at the feature points. From this, the solution is:

$$\hat{\beta} = \left(X^{T}X\right)^{-1} X^{T} y$$

where $X^{T}$ is the transpose of the matrix $X$ composed of the relative depth points, and $y$ is the vector of absolute depth information corresponding to those pixels.
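A minimal numpy sketch of this global fit (the sample values are illustrative):

```python
import numpy as np

# relative depths and SLAM absolute depths at N feature points (illustrative values)
d_rel = np.array([0.2, 0.5, 0.8, 0.9])
d_abs = np.array([1.1, 2.4, 3.8, 4.2])

X = np.stack([d_rel, np.ones_like(d_rel)], axis=1)  # rows (d_rel_i, 1)
beta, *_ = np.linalg.lstsq(X, d_abs, rcond=None)    # beta = (mu, s)
mu, s = beta
depth = mu * 0.6 + s  # absolute depth predicted for a pixel with relative depth 0.6
```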
However, with this global fit, the errors of the relative depth points cause a large error in the obtained absolute depth information. Therefore, in a possible implementation, a locally weighted least squares method is adopted: a weight coefficient is added to each feature point during fitting, so that transformation coefficients are obtained for every point.

However, this locally weighted least squares method does not account for local characteristics when fitting, such as local vacancies of feature points. For example, there is no valid feature point at the sofa in the lower right corner of FIG. 3C, so the obtained transformation coefficients have low accuracy, and the recovered absolute depth information map deviates considerably from the true depth information map.

Based on this, this application provides a way of acquiring image depth information in which, when the transformation coefficients are obtained by curve fitting with the feature points, the weight coefficients are adjusted with a bias coefficient. Specifically, for a target point at a locally vacant position, the weight coefficients have a low degree of dispersion and are almost 0, and a larger bias coefficient is used to adjust them; for target points at other positions, the weight coefficients are highly dispersed, and a smaller bias coefficient is used. In this way, the influence of each feature point on each target point of the image is averaged out, which mitigates the low accuracy of the fitted transformation coefficients caused by local vacancies and improves the accuracy of the obtained absolute depth information map.
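Putting these pieces together for one target point, a minimal sketch of the bias-adjusted weighted fit (the additive adjustment w + b is one plausible reading of "adjusting the weight coefficients with the bias coefficient", and is an assumption):

```python
import numpy as np

def fit_target(d_rel_feat: np.ndarray, d_abs_feat: np.ndarray,
               weights: np.ndarray, bias: float) -> np.ndarray:
    """Bias-adjusted locally weighted least squares for one target point.

    d_rel_feat / d_abs_feat: relative and absolute depths at the N feature
    points; weights: first weight coefficients of this target point; bias:
    bias coefficient derived from their dispersion.  The additive
    adjustment below is an assumed form of the adjustment rule.
    """
    w = weights + bias                      # low-dispersion weights get evened out
    X = np.stack([d_rel_feat, np.ones_like(d_rel_feat)], axis=1)
    W = np.diag(w)                          # adjusted weights on the main diagonal
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ d_abs_feat)
    return beta                             # (mu, s) for this target point
```

The target point's absolute depth then follows as $\mu \cdot d_{rel}(p) + s$, consistent with the conversion formula above.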
To better illustrate the technical solution of this application, the specific implementation of the following application scenario is described in detail below.

Exemplarily, on the mobile phone shown in FIG. 4, the user taps the "Camera" application. The camera application may be the phone's native camera application or a multi-function camera application developed by a third party.

In response to receiving the user's operation of opening the camera application, the mobile phone may display the shooting interface 500 shown in FIG. 5A. The camera application may enable the "Photo" function by default. The shooting interface 500 includes a viewfinder frame 501 and function controls such as "Portrait", "Photo", "Video", "Pro", and "More". In some examples, as shown in FIG. 5A, the shooting interface 500 further includes a "first function" control 502, and the user can enable the phone's first function by operating this control. The first function includes the "AR special effects" function and/or the image blurring function.

The "AR special effects" function fuses the real scene captured by the user with a virtual scene image based on a virtual scene selected by the user (for example, a virtual scene selected from a cloud server), so as to enrich the content of the real scene image. The image blurring function lets the user adjust the depth of field, thereby controlling the sharpness of the subject and the background in the image.

In the embodiments of this application, the "first function" is integrated into a "first function" shooting function, which specifically includes a "first function" photo function and a "first function" video function. After enabling the "first function" photo or video function, the user obtains a first-function image. For example, the "AR special effects" function is integrated as an "AR special effects" shooting function, specifically including an "AR special effects" photo function and an "AR special effects" video function. After the "AR special effects" photo or video function is enabled, the phone can, based on the selected virtual scene, fuse the captured real scene with the virtual scene to obtain a fused image. Alternatively, based on the selected virtual scene, the phone guides the user to adjust the shooting position or the phone's pose, so that the images in the recorded video can be fused with the virtual scene to obtain fused images.

In other examples, as shown in FIG. 5B, in response to the user operating the "More" control 504 in the shooting interface 507, the phone opens the option menu 508 of additional function controls shown in FIG. 5C, in which the "first function" control 502 is provided. Alternatively, the user can operate the "Settings" control 505 in the shooting interface 507 to open the settings interface of the camera application, in which the "first function" control 502 is provided.

In still other examples, as shown in FIG. 5D, the "first function" control 502 may also be provided in the shooting interface 506 of the "Photo" function, or, as shown in FIG. 5E, in the shooting interface 509 of the "Video" function. The user can quickly enable the first function through the "first function" control 502 while using the "Photo" or "Video" function. For example, when the user enables the "AR special effects" function through the control 502 while using the "Photo" function, the phone enables the "AR special effects" photo function by default. Likewise, when the user enables the "AR special effects" function through the control 502 while using the "Video" function, the phone enables the "AR special effects" video function by default.

It should be noted that the embodiments of this application do not limit the position of the "first function" control 502 or the manner in which the user operates the "AR special effects" control 502. Of course, the user may also enable the first function in other ways, for example by performing a predefined air gesture, entering a voice command, pressing a physical control, or drawing a predefined pattern on the phone's touchscreen.

For ease of understanding, the following first describes, with reference to the drawings, a method of enabling the first function for image preview by operating the first function control 502 in the option menu 508 shown in FIG. 5C, reached from the shooting interface 507, where the first function is the "AR special effects" function. FIG. 6 is a flowchart of a method for enabling the first function for "AR image" preview according to an embodiment of this application. The method includes:
S601:第一功能控件接收云端服务器传输的虚拟场景信息。S601: The first functional control receives the virtual scene information transmitted by the cloud server.
在本申请实施例中,云端服务器存储有虚拟场景信息库。在一种可能的实现方式中,第一功能控件获取预设需求,并根据预设需求,从云端服务器的虚拟场景信息库获取虚拟场景信息。比如,第一功能控件内置与虚拟场景信息库中与虚拟场景对应的图标,通过该图标实现第一功能控件与云端服务器通信。用户从第一功能控件中根据预设需求点击相应的图标,从云端服务器获取与图标对应的虚拟场景信息。In the embodiment of the present application, the cloud server stores a virtual scene information library. In a possible implementation manner, the first functional control obtains a preset requirement, and obtains virtual scene information from a virtual scene information database of a cloud server according to the preset requirement. For example, the first functional control has a built-in icon corresponding to the virtual scene in the virtual scene information database, and the communication between the first functional control and the cloud server is realized through the icon. The user clicks the corresponding icon from the first functional control according to the preset requirement, and obtains the virtual scene information corresponding to the icon from the cloud server.
S602:启动第一功能控件,并向手机设备相机发送打开指令。S602: Start the first function control, and send an opening instruction to the camera of the mobile phone device.
在本申请实施例中,用户点击手机“相机”应用程序中的第一功能控件,也即用户输入第一功能开启指令。第一功能控件的具体位置如图5A-图5E所示。第一功能控件将产生的开启指令发送给手机设备相机。其中,设备相机可以为前置摄像头,也可以为后置摄像头。In the embodiment of the present application, the user clicks on the first function control in the "camera" application program of the mobile phone, that is, the user inputs the instruction to enable the first function. The specific position of the first functional control is shown in FIG. 5A-FIG. 5E. The first function control sends the generated opening instruction to the camera of the mobile phone device. Wherein, the device camera may be a front camera or a rear camera.
在一种可能的实现方式中,第一功能控件与设备相机通过MIPI接口连接。MIPI接口包括摄像头串行接口(camera serial interface,CSI)等。第一功能控件通过CSI接口与设备相机通信,实现开启指令和图像信息的传递。In a possible implementation manner, the first functional control is connected to the device camera through a MIPI interface. MIPI interface includes camera serial interface (camera serial interface, CSI) and so on. The first functional control communicates with the device camera through the CSI interface to realize the transmission of the opening instruction and image information.
S603:摄像头基于打开指令采集真实场景的单目图像,并将单目图像信息发送给第一功能控件。S603: The camera collects a monocular image of a real scene based on the opening instruction, and sends the monocular image information to the first functional control.
摄像头基于打开指令,获取真实场景的单目图像。在本申请实施例中,单目图像为RGB彩色图像。示例性说明,摄像头基于打开指令采集的真实场景的单目图像如图4所示。摄像头将采集的RGB彩色图像发送给第一功能控件,执行第一功能操作。比如,摄像头通过CSI接口将采集的RGB彩色图像发送给第一功能控件。Based on the opening instruction, the camera obtains the monocular image of the real scene. In the embodiment of the present application, the monocular image is an RGB color image. As an example, the monocular image of the real scene collected by the camera based on the opening instruction is shown in FIG. 4 . The camera sends the collected RGB color image to the first functional control to execute the first functional operation. For example, the camera sends the collected RGB color image to the first functional control through the CSI interface.
S604:第一功能控件将接收的云端服务器传输的虚拟场景信息和单目图像信息融合,获取第一功能图像。S604: The first functional control unit fuses the received virtual scene information transmitted by the cloud server with the monocular image information to acquire the first functional image.
第一功能控件接收摄像头采集的单目图像,且可以接收云端服务器传输的虚拟场景信息。通过对单目图像处理,获取相对深度信息图和特征点图。由于室外真实场景的真实深度往往很难获取,但人可以根据经验、遮挡关系、光线和阴影等知识获取相对深度,即物体的前后关系。对于获取的相对深度信息图,通过坐标点的偏移和放缩等变换,获取图像深度信息。其中图像深度信息为图像的绝对深度信息(真实深度信息)。具体处理方式,参见下文图7A和/或图8A所示的方法流程图。基于图像深度信息,将单目图像与虚拟场景图像进行融合,获取第一功能图像。The first function control receives the monocular image collected by the camera, and can receive the virtual scene information transmitted by the cloud server. By processing the monocular image, a relative depth information map and a feature point map are obtained. The true depth of a real outdoor scene is often difficult to obtain, but a person can obtain the relative depth, that is, the front-back relationship of objects, based on knowledge such as experience, occlusion relationships, light and shadow. For the obtained relative depth information map, the image depth information is obtained through transformations such as offset and scaling of coordinate points. The image depth information is the absolute depth information (true depth information) of the image. For a specific processing manner, refer to the method flowcharts shown in FIG. 7A and/or FIG. 8A below. Based on the image depth information, the monocular image is fused with the virtual scene image to obtain the first function image.
下面,针对具体获取图像深度信息的方式进行介绍。In the following, a specific manner of acquiring image depth information will be introduced.
参见图7A,为本申请实施例提供的一种获取图像深度信息的方法流程图。Referring to FIG. 7A , it is a flowchart of a method for acquiring image depth information provided by an embodiment of the present application.
该方法包括:The method includes:
S701:获取图像的多个特征点。S701: Acquire multiple feature points of an image.
特征点为相机拍摄的场景图中具有深度信息,且具有鲜明特性并能够有效反应图像本质的点。在一种实施例中,特征点还可以包括该点的置信度,用于表示该点的真实值落在测量结果周围的概率。概率越大,置信度越高。Feature points are points in the scene image captured by the camera that have depth information, have distinct characteristics, and can effectively reflect the essence of the image. In an embodiment, the feature point may also include a confidence level of the point, which is used to represent the probability that the true value of the point falls around the measurement result. The greater the probability, the higher the confidence.
示例性说明:已知第i个特征点表示为(ui,vi,yi,ki)。其中,(ui,vi)表示为第i个特征点的二维坐标,yi表示为第i个特征点的绝对深度信息,ki为第i个特征点的置信度,表示特征点真实值落在该坐标值周围的概率为ki。Exemplarily, the i-th feature point is expressed as (ui,vi,yi,ki), where (ui,vi) are the two-dimensional coordinates of the i-th feature point, yi is the absolute depth information of the i-th feature point, and ki is the confidence of the i-th feature point, indicating that the probability that the true value of the feature point falls around the measured coordinates is ki.
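为便于理解上述特征点表示,下面给出一个示意性的Python片段(仅为便于理解的示意,非本申请限定的实现,字段名为示例性假设)。To aid understanding of the feature point representation above, the following is an illustrative Python sketch (for illustration only, not a limiting implementation; the field names are hypothetical assumptions).

```python
from typing import NamedTuple

class FeaturePoint(NamedTuple):
    u: float  # image column coordinate u_i
    v: float  # image row coordinate v_i
    y: float  # absolute depth y_i (e.g. from SLAM)
    k: float  # confidence k_i in (0, 1]

# a feature point at pixel (120, 45) with depth 2.3 and confidence 0.9
p = FeaturePoint(u=120.0, v=45.0, y=2.3, k=0.9)
```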
在获取特征点坐标时同时获取特征点的置信度,可以将不同精度的特征点区别对待,降低置信度低的特征点的影响力度,如此能够提高拟合精度。The confidence of a feature point is obtained together with its coordinates, so that feature points of different accuracy can be treated differently; reducing the influence of feature points with low confidence improves the fitting accuracy.
进一步的,拍照应用程序内置特征点提取控件,对于拍摄中的图像,直接获取图像的特征点。在一种实施例中,“手机”的“相机”程序的“AR特效”功能控件中内置特征点提取控件,对于摄像头拍摄的图像,上传至“AR特效”功能控件提取特征点。Further, the camera application has a built-in feature point extraction control, and for the image being captured, the feature points of the image are directly obtained. In one embodiment, the feature point extraction control is built into the "AR special effect" function control of the "camera" program of the "mobile phone", and the image captured by the camera is uploaded to the "AR special effect" function control to extract the feature points.
在一种实施例中,特征点提取控件可以为同时定位与地图构建(Simultaneous Localization and Mapping,SLAM)算法,获取的特征点为SLAM特征点。In one embodiment, the feature point extraction control may be a Simultaneous Localization and Mapping (SLAM) algorithm, and the acquired feature points are SLAM feature points.
SLAM算法具体实现提取特征点的方法为:读取摄像头拍摄到的图像,检测角点位置,根据角点位置计算描述子,对图像中的描述子进行匹配,获取SLAM特征点。The specific implementation of the SLAM algorithm to extract feature points is: read the image captured by the camera, detect the corner position, calculate the descriptor according to the corner position, match the descriptor in the image, and obtain the SLAM feature point.
其中,角点是图像中属性强度最大或最小的孤立点,比如灰度的梯度局部最大所对应的像素点,图像中梯度值和梯度方向变化速率最高的点。描述子为角点周围的向量。在一种实施例中,可以在角点周围随机取128或256对点对p和q,若p大于q,记作第一值,否则,记作第二值,描述子为第一值和第二值组成的一组128位或256位的序列。在一种实施例中,第一值为0,第二值为1。在又一种实施例中,第一值为1,第二值为0。Among them, a corner point is an isolated point in the image with the largest or smallest attribute intensity, for example, the pixel corresponding to a local maximum of the grayscale gradient, or the point in the image with the highest rate of change of gradient value and gradient direction. The descriptor is a vector around the corner point. In one embodiment, 128 or 256 point pairs p and q may be randomly selected around the corner point; if p is greater than q, a first value is recorded, otherwise a second value is recorded, and the descriptor is a 128-bit or 256-bit sequence composed of the first and second values. In one embodiment, the first value is 0 and the second value is 1. In yet another embodiment, the first value is 1 and the second value is 0.
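下面给出一个角点描述子计算的示意性Python片段(仅为示意,假设采用上文所述的随机点对比较方式;函数名与参数为示例性假设,且未处理图像边界)。The following is an illustrative Python sketch of computing a corner descriptor (a sketch only, assuming the random point-pair comparison described above; the function name and parameters are hypothetical assumptions, and image boundaries are not handled).

```python
import numpy as np

def binary_descriptor(img, corner, n_pairs=128, patch=15, seed=0):
    """Compare n_pairs random point pairs (p, q) around a corner; emit 1 if p > q else 0."""
    rng = np.random.default_rng(seed)
    r, c = corner
    # random offsets of the point pairs inside the patch around the corner
    off = rng.integers(-(patch // 2), patch // 2 + 1, size=(n_pairs, 4))
    p = img[r + off[:, 0], c + off[:, 1]].astype(np.int32)
    q = img[r + off[:, 2], c + off[:, 3]].astype(np.int32)
    return (p > q).astype(np.uint8)  # 128-bit (or 256-bit) binary descriptor
```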
示例性的,如图7B所示,为基于SLAM算法处理图4所示的图像,获取的SLAM特征点示意图。Exemplarily, as shown in FIG. 7B , it is a schematic diagram of SLAM feature points obtained by processing the image shown in FIG. 4 based on the SLAM algorithm.
图7B中,SLAM特征点用“+”标注,SLAM特征点分布不均匀,比如右侧的沙发及其上的区域没有有效的SLAM特征点,中间台灯及其左侧的白色柜子处特征点集中分布。In FIG. 7B, the SLAM feature points are marked with "+", and the distribution of the SLAM feature points is uneven. For example, there is no valid SLAM feature point on the sofa on the right and the area above it, while the feature points at the table lamp in the middle and the white cabinet on its left are densely distributed.
S702:获取图像上各特征点到目标点的权重系数。S702: Obtain weight coefficients from each feature point on the image to a target point.
权重系数用于表示图像上各特征点对目标点的影响力度。在本申请实施例中,目标点为图像的一个像素点。The weight coefficient is used to represent the influence of each feature point on the image on the target point. In the embodiment of the present application, the target point is a pixel point of the image.
在一种实施例中,权重系数为基于距离-权重映射函数获取的(0~1)范围内的数。其中,距离-权重映射函数,自变量为距离,因变量为权重系数。In one embodiment, the weight coefficient is a number within the range (0~1) obtained based on the distance-weight mapping function. Among them, the distance-weight mapping function, the independent variable is the distance, and the dependent variable is the weight coefficient.
距离越大,权重系数越小;距离越小,权重系数越大。即,距离-权重映射函数为值域为(0,1),在x轴(代表距离)正半轴上单调递减的函数。在一种实施例中,距离-权重映射函数可以为高斯函数,具体表达式为:The larger the distance, the smaller the weight coefficient; the smaller the distance, the larger the weight coefficient. That is, the distance-weight mapping function is a function whose range is (0,1) and which decreases monotonically on the positive half of the x-axis (representing the distance). In one embodiment, the distance-weight mapping function may be a Gaussian function, expressed as:

$$f(d)=a\exp\left(-\frac{(d-b)^{2}}{2c^{2}}\right)\qquad(3)$$

式(3)中,d为特征点到目标点的距离,a为高斯系数,根据需要可以自行调整。b为各特征点到目标点的距离平均值,c为各特征点到目标点距离的标准差,用于表示距离的波动情况。In formula (3), d is the distance from a feature point to the target point, and a is the Gaussian coefficient, which can be adjusted as needed. b is the average distance from each feature point to the target point, and c is the standard deviation of the distances from each feature point to the target point, which represents the fluctuation of the distances.
在一种实施例中,距离可以为欧式距离。示例性的,步骤S701中所示第i(i=1,2,……,n)个特征点表示为(ui,vi,yi,ki),则第i个特征点到目标点(uj,vj)(j为正整数)的距离为:In one embodiment, the distance may be a Euclidean distance. Exemplarily, the i-th (i=1,2,...,n) feature point shown in step S701 is expressed as (ui,vi,yi,ki); then the distance from the i-th feature point to the target point (uj,vj) (j is a positive integer) is:

$$d(i,u_{j},v_{j})=\sqrt{(u_{i}-u_{j})^{2}+(v_{i}-v_{j})^{2}}$$

则第i个特征点到目标点(uj,vj)的权重系数为:Then the weight coefficient from the i-th feature point to the target point (uj,vj) is:

$$w(i,u_{j},v_{j})=a\exp\left(-\frac{\left(d(i,u_{j},v_{j})-b\right)^{2}}{2c^{2}}\right)$$

其中,n为特征点的个数。Among them, n is the number of feature points.
在一种实施例中,距离-权重映射函数还可以与置信度有关。示例性的,步骤S701中第i个特征点表示为(ui,vi,yi,ki),则第i个特征点到目标点(uj,vj)的权重系数为:In one embodiment, the distance-weight mapping function may also be related to the confidence. Exemplarily, the i-th feature point in step S701 is expressed as (ui,vi,yi,ki); then the weight coefficient from the i-th feature point to the target point (uj,vj) is:

$$w(i,u_{j},v_{j})=k_{i}\,a\exp\left(-\frac{\left(d(i,u_{j},v_{j})-b\right)^{2}}{2c^{2}}\right)$$
在计算权重系数时,考虑特征点的置信度,将不同特征点区别对待,通过将置信度低的特征点降低其影响力度,提高拟合精度。When calculating the weight coefficient, the confidence of the feature points is considered, and different feature points are treated differently. By reducing the influence of feature points with low confidence, the fitting accuracy is improved.
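结合上述定义,下面给出一个计算各特征点到目标点权重系数的示意性Python片段(仅为示意,采用式(3)的高斯形式,并可选地乘以置信度;函数名为示例性假设)。Combining the above definitions, the following is an illustrative Python sketch of computing the weight coefficients from the feature points to a target point (a sketch only, using the Gaussian form of formula (3), optionally multiplied by the confidence; the function name is a hypothetical assumption).

```python
import numpy as np

def weights_to_target(feats_uv, conf, target, a=1.0, use_conf=True):
    """feats_uv: (n, 2) feature coordinates; conf: (n,) confidences; target: (u_j, v_j)."""
    d = np.linalg.norm(feats_uv - np.asarray(target, dtype=float), axis=1)  # Euclidean distances
    b, c = d.mean(), d.std() + 1e-8                 # mean and std of the n distances
    w = a * np.exp(-((d - b) ** 2) / (2 * c ** 2))  # Gaussian distance-weight mapping
    return conf * w if use_conf else w              # optionally weight by confidence k_i
```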
S703:计算目标点处的偏置系数,利用偏置系数调整权重系数。S703: Calculate the bias coefficient at the target point, and use the bias coefficient to adjust the weight coefficient.
对于目标点(uj,vj),通过距离-权重计算函数获取的权重系数为w(i,uj,vj)。For the target point (uj ,vj ), the weight coefficient obtained through the distance-weight calculation function isw (i ,uj ,vj ).
在本申请实施例中,通过偏置系数调整权重系数,以避免特征点在图像上分布不均匀,比如局部空缺,导致出现权重系数几乎为0的问题。偏置系数为取值范围在(0,1)内的数,用于调整权重系数:当权重系数较小,比如几乎为0时,利用较大的偏置系数调整;否则,利用较小的偏置系数调整。In the embodiment of the present application, the weight coefficients are adjusted through a bias coefficient, so as to avoid the problem that the weight coefficients are almost 0 due to the uneven distribution of feature points on the image, such as local vacancies. The bias coefficient is a number in the range (0,1) used to adjust the weight coefficients: when the weight coefficients are small, for example almost 0, a larger bias coefficient is used for adjustment; otherwise, a smaller bias coefficient is used.
在一种可能的实现方式中,考虑到特征点在图像上分布不均匀,图像上有的部分特征点分布比较多,而有的部分局部空缺。对于特征点分布比较多的地方,此时各特征点到目标点的权重系数分布不均匀,而对于局部空缺,此时由于不存在有效的特征点,即到目标点的权重系数均匀分布,几乎为0。In a possible implementation, considering that the feature points are not evenly distributed on the image, some feature points on the image are more distributed, while some are locally vacant. For places where there are many feature points, the distribution of weight coefficients from each feature point to the target point is uneven at this time, and for local vacancies, since there are no effective feature points at this time, that is, the weight coefficients to the target point are uniformly distributed, almost is 0.
在本申请实施例中,偏置系数与各特征点到目标点的权重系数的离散值呈负相关关系。权重系数的离散值用于表示权重系数的离散程度。如此,对于局部空缺,到目标点的权重系数相同几乎为0,即差异值较小,此时偏置系数较大,如此利用较大的偏置系数调整权重系数,可以避免权重系数近乎为0的问题。In the embodiment of the present application, the bias coefficient is negatively correlated with the discrete value of the weight coefficient from each feature point to the target point. The discrete value of the weight coefficient is used to represent the discrete degree of the weight coefficient. In this way, for local vacancies, the weight coefficient to the target point is the same as almost 0, that is, the difference value is small, and the bias coefficient is large at this time. Using a larger bias coefficient to adjust the weight coefficient can prevent the weight coefficient from being close to 0. The problem.
在一种可能的实现方式中,偏置系数与各特征点到目标点处的权重系数标准差呈负相关。其中,标准差用于表示权重系数的离散程度。示例性说明,偏置系数为:In a possible implementation, the bias coefficient is negatively correlated with the standard deviation of the weight coefficients from the feature points to the target point, where the standard deviation represents the degree of dispersion of the weight coefficients. Exemplarily, the bias coefficient is:

$$b(u_{j},v_{j})=\exp\bigl(-a\cdot\mathrm{std}\bigl(w(i,u_{j},v_{j})\bigr)\bigr)$$

其中,std()为标准差函数,a为调整标准差影响大小的参数,a>0,比如a=5。Among them, std() is the standard deviation function, and a is a parameter for adjusting the influence of the standard deviation, a>0, for example a=5.
进一步,利用上述方式获取的偏置系数,调整权重系数。在一种可能的实现方式中,将偏置系数与权重系数相加,将相加结果形成一对角阵,其对角元素为偏置系数与权重系数的和值,这一对角阵即为权重矩阵。上述示例性说明中,权重矩阵的具体表示形式为:Further, the weight coefficients are adjusted by using the bias coefficient obtained in the above manner. In a possible implementation, the bias coefficient is added to the weight coefficients, and the addition results form a diagonal matrix whose diagonal elements are the sums of the bias coefficient and the weight coefficients; this diagonal matrix is the weight matrix. In the above exemplary description, the weight matrix is expressed as:

$$W_{j}=\mathrm{diag}\bigl(w(1,u_{j},v_{j})+b(u_{j},v_{j}),\ \ldots,\ w(n,u_{j},v_{j})+b(u_{j},v_{j})\bigr)$$
本申请通过偏置系数调整各特征点到目标点的权重系数。其中,权重系数离散值较大时,偏置系数较小;离散值较小时,比如局部空缺,偏置系数较大。利用偏置系数调整权重系数,可以使权重系数在预设范围内波动。示例性的,预设范围为0.95~1.05。比如,调整后获取的权重系数为0.99,在这一范围内波动,满足调整要求。通过偏置系数调整权重系数,可以实现平均化各特征点差异,使结果趋向于全局拟合。In this application, the weight coefficients from the feature points to the target point are adjusted through the bias coefficient. The bias coefficient is small when the discrete value of the weight coefficients is large, and is large when the discrete value is small, for example at a local vacancy. Adjusting the weight coefficients with the bias coefficient can make the weight coefficients fluctuate within a preset range, for example, within 0.95~1.05. For instance, a weight coefficient of 0.99 obtained after adjustment fluctuates within this range and meets the adjustment requirement. Adjusting the weight coefficients through the bias coefficient averages the differences among the feature points, so that the result tends toward a global fit.
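下面给出一个计算偏置系数并构造权重矩阵的示意性Python片段(仅为示意,其中exp(−a·std(w))仅为满足上述负相关关系的一种假设形式)。The following is an illustrative Python sketch of computing the bias coefficient and constructing the weight matrix (a sketch only; exp(−a·std(w)) is merely an assumed form satisfying the negative correlation described above).

```python
import numpy as np

def weight_matrix(w, alpha=5.0):
    """w: (n,) weights of the feature points at one target point."""
    bias = np.exp(-alpha * np.std(w))  # close to 1 when weights are uniform (local vacancy), small otherwise
    return np.diag(w + bias)           # diagonal weight matrix W_j with bias-adjusted weights
```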
S704:计算目标点处的变换系数。S704: Calculate the transformation coefficient at the target point.
正如前文所述,目前计算变换系数,即计算偏移系数和放缩系数往往通过多个特征点进行曲线拟合的方式获取。示例性,本申请以最小二乘法的曲线拟合方式为例,结合步骤S703获取权重矩阵,详细介绍如何获取变换系数。As mentioned above, the current calculation of transformation coefficients, that is, the calculation of offset coefficients and scaling coefficients, is often obtained by performing curve fitting on multiple feature points. Exemplarily, this application takes the least squares method of curve fitting as an example, and introduces in detail how to obtain the transformation coefficients in combination with step S703 to obtain the weight matrix.
假设目标点(uj,vj)的变换系数为βj=(sj,μj),其中,放缩系数为sj,偏移系数为μj。最小二乘法为:Assume that the transformation coefficient of the target point (uj,vj) is βj=(sj,μj), where the scaling coefficient is sj and the offset coefficient is μj. The least squares method is:

$$\hat{\beta}_{j}=\arg\min_{\beta_{j}}\ \left(y-X\beta_{j}\right)^{\top}W_{j}\left(y-X\beta_{j}\right)+\lambda s_{j}^{2}\qquad(13)$$

其中,di为第i个特征点的相对深度(构成矩阵X),ki为第i个特征点的置信度(考虑置信度时已包含在权重矩阵Wj中),λ为关于放缩系数sj的正则项系数,目的在于保证放缩系数sj有意义,本领域技术人员根据需要可以自行设定。Among them, di is the relative depth of the i-th feature point (composing the matrix X), ki is the confidence of the i-th feature point (incorporated into the weight matrix Wj when the confidence is considered), and λ is the regularization coefficient on the scaling coefficient sj, which ensures that the scaling coefficient sj is meaningful and can be set by those skilled in the art as needed.
对公式(13)推导计算,确定:Deriving from formula (13), it is determined that:

$$\hat{\beta}_{j}=\left(X^{\top}W_{j}X+\lambda E\right)^{-1}X^{\top}W_{j}\,y\qquad(15)$$

其中,$\hat{\beta}_{j}$为βj的估计值,Wj为权重矩阵,E=diag(1,0)表示正则项仅作用于放缩系数sj。X为相对深度xi=[di,1]组成的矩阵,X⊤为矩阵X的转置矩阵,y为绝对深度矩阵。Among them, $\hat{\beta}_{j}$ is the estimated value of βj, Wj is the weight matrix, and E=diag(1,0) indicates that the regularization acts only on the scaling coefficient sj. X is the matrix composed of the relative depths xi=[di,1], X⊤ is the transpose of the matrix X, and y is the absolute depth matrix.
如此,利用上述步骤获取的权重矩阵、各特征点处的相对深度值组成的相对深度矩阵,以及各特征点处的绝对深度值组成的绝对深度矩阵,即可计算变换系数的估计值。In this way, the estimated value of the transformation coefficient can be calculated by using the weight matrix obtained in the above steps, the relative depth matrix composed of the relative depth values at the feature points, and the absolute depth matrix composed of the absolute depth values obtained at the feature points.
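下面给出一个按式(15)求解变换系数的示意性Python片段(仅为示意,假设正则项仅作用于放缩系数sj)。The following is an illustrative Python sketch of solving the transformation coefficients according to formula (15) (a sketch only, assuming the regularization acts only on the scaling coefficient sj).

```python
import numpy as np

def fit_scale_offset(d_rel, y_abs, W, lam=1e-3):
    """d_rel: (n,) relative depths; y_abs: (n,) absolute depths; W: (n, n) weight matrix."""
    X = np.stack([d_rel, np.ones_like(d_rel)], axis=1)  # rows x_i = [d_i, 1]
    E = np.diag([1.0, 0.0])                             # regularize s_j only (assumption)
    beta = np.linalg.solve(X.T @ W @ X + lam * E, X.T @ W @ y_abs)
    s_j, mu_j = beta
    return s_j, mu_j
```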
S705:获取图像上其他目标点的变换系数。S705: Obtain transformation coefficients of other target points on the image.
为了更好获取图像上的深度信息,即绝对深度信息,需要计算图像上其他目标点的变换系数,如此通过相对深度信息,结合变换系数进行放缩和偏移,获取绝对深度信息。In order to better obtain the depth information on the image, that is, the absolute depth information, the transformation coefficients of other target points on the image need to be calculated, so that the absolute depth information can be obtained from the relative depth information through scaling and offsetting with the transformation coefficients.
在一种可能的实现方式中,可以通过插值法获取图像上其他目标点的变换系数。可以通过单线性插值法,也可以通过双线性插值法获取图像其他目标点的变换系数。本申请实施例不对图像具体插值法进行限定。In a possible implementation manner, the transformation coefficients of other target points on the image may be obtained through interpolation. The transformation coefficients of other target points on the image may be obtained through linear interpolation or bilinear interpolation. The embodiment of the present application does not limit the specific interpolation method.
示例性说明:以双线性插值为例,获取其他目标点的变换系数。Example description: Take bilinear interpolation as an example to obtain the transformation coefficients of other target points.
在该方式中,可以根据获取的四个相邻目标点的坐标及变换系数,计算其他目标点的变换系数。假设这四个目标点的坐标值分别为(u1,v1),(u1,v1+1),(u1+1,v1)和(u1+1,v1+1),则坐标为(u,v)的目标点对应的变换系数为:In this manner, the transformation coefficients of other target points can be calculated according to the obtained coordinates and transformation coefficients of four adjacent target points. Assuming that the coordinates of these four target points are (u1,v1), (u1,v1+1), (u1+1,v1) and (u1+1,v1+1), the transformation coefficient of the target point at (u,v) is:

$$\beta(u,v)=(1-\alpha)(1-\gamma)\,\beta(u_{1},v_{1})+(1-\alpha)\gamma\,\beta(u_{1},v_{1}+1)+\alpha(1-\gamma)\,\beta(u_{1}+1,v_{1})+\alpha\gamma\,\beta(u_{1}+1,v_{1}+1)$$

其中,α=u−u1,γ=v−v1。Among them, α=u−u1 and γ=v−v1.
为了使本领域技术人员更好的理解本申请所述的双线性插值获取每个坐标点的放缩系数和偏移系数,下面通过具体数据表示。In order to enable those skilled in the art to better understand the bilinear interpolation described in the present application for obtaining the scaling coefficient and offset coefficient of each coordinate point, specific data are shown below.
假设目标点为(104.615,202.941),四个相邻目标点的坐标值分别为(104,202),(104,203),(105,202),(105,203)。Assume that the target point is (104.615, 202.941), and the coordinates of the four adjacent target points are (104,202), (104,203), (105,202) and (105,203).
则:Then:

$$\alpha=104.615-104=0.615,\qquad\gamma=202.941-202=0.941\qquad(16)$$

相应的插值权重为:(1−0.615)(1−0.941)≈0.023,(1−0.615)×0.941≈0.362,0.615×(1−0.941)≈0.036,0.615×0.941≈0.579。The corresponding interpolation weights are (1−0.615)(1−0.941)≈0.023, (1−0.615)×0.941≈0.362, 0.615×(1−0.941)≈0.036, and 0.615×0.941≈0.579.
将公式(16)的值代入即可求得其他目标点对应的放缩系数和偏移系数。Substituting the value of formula (16) into it can obtain the scaling coefficient and offset coefficient corresponding to other target points.
在本申请实施例中,通过插值法代替对每个坐标点都计算一组放缩变换参数,可以减少计算工作量,提高计算效率。In the embodiment of the present application, instead of calculating a set of scaling transformation parameters for each coordinate point, the interpolation method can reduce the calculation workload and improve the calculation efficiency.
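下面给出一个通过双线性插值获取任意目标点变换系数的示意性Python片段(仅为示意,假设相邻目标点按整数网格分布)。The following is an illustrative Python sketch of obtaining the transformation coefficients of an arbitrary target point through bilinear interpolation (a sketch only, assuming the adjacent target points lie on an integer grid).

```python
import numpy as np

def bilinear_coeff(grid, u, v):
    """grid: dict mapping integer (u, v) -> (s, mu); returns interpolated (s, mu) at (u, v)."""
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    a, g = u - u0, v - v0  # fractional offsets alpha, gamma
    c00, c01 = np.asarray(grid[(u0, v0)]), np.asarray(grid[(u0, v0 + 1)])
    c10, c11 = np.asarray(grid[(u0 + 1, v0)]), np.asarray(grid[(u0 + 1, v0 + 1)])
    s, mu = (1 - a) * (1 - g) * c00 + (1 - a) * g * c01 + a * (1 - g) * c10 + a * g * c11
    return s, mu

# e.g. the target point (104.615, 202.941) interpolated from its four integer neighbours
```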
S706:获取每个目标点的图像深度信息。S706: Obtain image depth information of each target point.
将获取的相对深度值,通过变换系数中的放缩系数和偏移系数进行放缩和偏移变换,可以获取图像的深度信息。具体图像深度信息获取公式如下:The acquired relative depth values are transformed through the scaling coefficient and offset coefficient in the transformation coefficients to obtain the depth information of the image. The specific formula for obtaining the image depth information is as follows:

$$d_{abs}(u_{j},v_{j})=s_{j}\cdot d_{rel}(u_{j},v_{j})+\mu_{j}\qquad(22)$$
其中,dabs(uj,vj)为第j个目标点的绝对深度值。drel(uj,vj)为第j个目标点的相对深度值。sj,μj分别为第j个目标点的变换系数对应的放缩系数和偏移系数。Among them,dabs (uj ,vj ) is the absolute depth value of thejth target point.drel (uj ,vj ) is the relative depth value of thejth target point.sj ,μj are the scaling coefficient and offset coefficient corresponding to the transformation coefficient of thejth target point respectively.
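下面给出一个按式(22)由相对深度图恢复绝对深度图的示意性Python片段(仅为示意,假设已获得逐像素的放缩系数图和偏移系数图)。The following is an illustrative Python sketch of recovering the absolute depth map from the relative depth map according to formula (22) (a sketch only, assuming per-pixel scale and offset maps have been obtained).

```python
import numpy as np

def absolute_depth(d_rel, s_map, mu_map):
    """Per-pixel affine transform: d_abs = s * d_rel + mu."""
    return s_map * d_rel + mu_map  # all arrays share the image shape (H, W)
```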
示例性的,对于图3A,通过计算各特征点到目标点的权重系数,且在权重系数中考虑特征点的置信度;然后利用权重系数的标准差计算偏置系数并调整权重系数,获取权重矩阵;利用公式(15)计算目标点处的变换系数;接着利用双线性插值法获取图像上其他目标点的变换系数;最后利用获得的各目标点的变换系数,基于公式(22)获取图像的深度信息图,参见图7C所示。Exemplarily, for FIG. 3A, the weight coefficients from the feature points to the target point are calculated, with the confidence of the feature points considered in the weight coefficients; the bias coefficient is then computed from the standard deviation of the weight coefficients and used to adjust them, obtaining the weight matrix; the transformation coefficient at the target point is calculated with formula (15); the transformation coefficients of other target points on the image are then obtained through bilinear interpolation; finally, the depth information map of the image is obtained based on formula (22) with the obtained transformation coefficients, as shown in FIG. 7C.
在本申请实施例中,利用特征点进行曲线拟合获取变换系数时,考虑基于置信度的权重系数的影响,并利用偏置系数对权重系数进行调整。In the embodiment of the present application, when the transformation coefficients are obtained through curve fitting using the feature points, the influence of the confidence-based weight coefficients is considered, and the weight coefficients are adjusted by using the bias coefficient.
具体的,对于局部空缺位置处的特征点,权重系数离散程度小,利用较大的偏置系数调整权重系数。对于其他位置处的特征点,权重系数离散程度大,利用较小的偏置系数调整权重系数。如此,实现平均化图像上各特征点对图像上目标点的影响力度,从而消除局部空缺导致深度图像信息局部空白的问题,提高了获取图像深度信息的准确性。Specifically, for the feature points at local vacant positions, the weight coefficients have a small degree of dispersion, and a larger bias coefficient is used to adjust the weight coefficients. For the feature points at other positions, the weight coefficients have a large degree of dispersion, and a smaller bias coefficient is used to adjust the weight coefficients. In this way, the influence of each feature point on the image on the target point on the image is averaged, thereby eliminating the problem of partial blanks in the depth image information caused by local vacancies, and improving the accuracy of obtaining image depth information.
此外,为进一步提升获取图像深度信息的时效性,参见图8A,为本申请实施例提供的另一种获取图像深度信息的方法流程图。该方法包括:In addition, in order to further improve the timeliness of acquiring image depth information, refer to FIG. 8A , which is a flowchart of another method for acquiring image depth information provided by an embodiment of the present application. The method includes:
S801:获取图像中的多个特征点。S801: Acquire multiple feature points in an image.
特征点为相机拍摄的场景图中具有深度信息,且具有鲜明特性并能够有效反应图像本质的点。在一种实施例中,特征点还可以包括该点的置信度,用于表示该点的真实值落在测量结果周围的概率。概率越大,置信度越高。Feature points are points in the scene image captured by the camera that have depth information, have distinct characteristics, and can effectively reflect the essence of the image. In an embodiment, the feature point may also include a confidence level of the point, which is used to represent the probability that the true value of the point falls around the measurement result. The greater the probability, the higher the confidence.
示例性说明:已知第i个特征点表示为(ui,vi,yi,ki),其中,(ui,vi)表示为第i个特征点的二维坐标,yi表示为第i个特征点的绝对深度信息,ki为第i个特征点的置信度,表示特征点真实值落在该坐标值周围的概率为ki。Exemplarily, the i-th feature point is expressed as (ui,vi,yi,ki), where (ui,vi) are the two-dimensional coordinates of the i-th feature point, yi is the absolute depth information of the i-th feature point, and ki is the confidence of the i-th feature point, indicating that the probability that the true value of the feature point falls around the measured coordinates is ki.
在获取特征点坐标时同时获取特征点的置信度,可以将不同精度的特征点区别对待,降低置信度低的特征点的影响力度,如此能够提高拟合精度。The confidence of a feature point is obtained together with its coordinates, so that feature points of different accuracy can be treated differently; reducing the influence of feature points with low confidence improves the fitting accuracy.
进一步的,拍照应用程序内置特征点提取控件,对于拍摄中的图像,直接获取图像的特征点。在一种实施例中,“手机”的“相机”程序的“AR特效”功能控件中内置特征点提取控件,对于摄像头拍摄的图像,上传至“AR特效”功能控件提取特征点。Further, the camera application has a built-in feature point extraction control, and for the image being captured, the feature points of the image are directly obtained. In one embodiment, the feature point extraction control is built into the "AR special effect" function control of the "camera" program of the "mobile phone", and the image captured by the camera is uploaded to the "AR special effect" function control to extract the feature points.
在一种实施例中,特征点提取控件可以为同时定位与地图构建(Simultaneous Localization and Mapping,SLAM)算法,获取的特征点为SLAM特征点。In one embodiment, the feature point extraction control may be a Simultaneous Localization and Mapping (SLAM) algorithm, and the acquired feature points are SLAM feature points.
SLAM算法具体实现提取特征点的方法为:读取摄像头拍摄到的图像,检测角点位置,根据角点位置计算描述子,对图像中的描述子进行匹配,获取SLAM特征点。The specific implementation of the SLAM algorithm to extract feature points is: read the image captured by the camera, detect the corner position, calculate the descriptor according to the corner position, match the descriptor in the image, and obtain the SLAM feature point.
其中,角点是图像中属性强度最大或最小的孤立点,比如灰度的梯度局部最大所对应的像素点,图像中梯度值和梯度方向变化速率最高的点。描述子为角点周围的向量。在一种实施例中,可以在角点周围随机取128或256对点对p和q,若p大于q,记作第一值,否则,记作第二值,描述子为第一值和第二值组成的一组128位或256位的序列。在一种实施例中,第一值为0,第二值为1。在又一种实施例中,第一值为1,第二值为0。Among them, a corner point is an isolated point in the image with the largest or smallest attribute intensity, for example, the pixel corresponding to a local maximum of the grayscale gradient, or the point in the image with the highest rate of change of gradient value and gradient direction. The descriptor is a vector around the corner point. In one embodiment, 128 or 256 point pairs p and q may be randomly selected around the corner point; if p is greater than q, a first value is recorded, otherwise a second value is recorded, and the descriptor is a 128-bit or 256-bit sequence composed of the first and second values. In one embodiment, the first value is 0 and the second value is 1. In yet another embodiment, the first value is 1 and the second value is 0.
示例性的,如图8B所示,为基于SLAM算法处理图4所示的单目图像,获取的SLAM特征点示意图。图8B中,SLAM特征点用“+”标注,SLAM特征点分布不均匀,比如右侧的沙发及其上的区域没有有效的SLAM特征点,中间台灯及其左侧的白色柜子处特征点集中分布。Exemplarily, FIG. 8B is a schematic diagram of the SLAM feature points obtained by processing the monocular image shown in FIG. 4 based on the SLAM algorithm. In FIG. 8B, the SLAM feature points are marked with "+", and the distribution of the SLAM feature points is uneven. For example, there is no valid SLAM feature point on the sofa on the right and the area above it, while the feature points at the table lamp in the middle and the white cabinet on its left are densely distributed.
S802:获取图像上各特征点到图像上每个像素点的权重系数。S802: Acquire weight coefficients from each feature point on the image to each pixel on the image.
权重系数用于表示图像上各特征点对目标点的影响力度。在一种实施例中,权重系数为基于距离-权重映射函数获取的(0~1)范围内的数。The weight coefficients represent the influence of each feature point on the image on a target point. In one embodiment, the weight coefficient is a number within the range (0~1) obtained based on the distance-weight mapping function.
其中,距离-权重映射函数,自变量为距离,因变量为权重系数。距离越大,权重系数越小;相应的,距离越小,权重系数越大。即,距离-权重映射函数为值域为(0,1),在x轴(代表距离)正半轴上单调递减的函数。在一种实施例中,距离-权重映射函数可以为高斯函数,具体表达式为:In the distance-weight mapping function, the independent variable is the distance and the dependent variable is the weight coefficient. The larger the distance, the smaller the weight coefficient; correspondingly, the smaller the distance, the larger the weight coefficient. That is, the distance-weight mapping function is a function whose range is (0,1) and which decreases monotonically on the positive half of the x-axis (representing the distance). In one embodiment, the distance-weight mapping function may be a Gaussian function, expressed as:

$$f(d)=a\exp\left(-\frac{(d-b)^{2}}{2c^{2}}\right)$$

其中,a为高斯系数,根据需要可以自行调整。b为各特征点到像素点的距离平均值,c为各特征点到像素点距离的标准差,用于表示距离的波动情况。Among them, a is the Gaussian coefficient, which can be adjusted as needed. b is the average distance from each feature point to the pixel, and c is the standard deviation of the distances from each feature point to the pixel, which represents the fluctuation of the distances.
在一种实施例中,距离可以为欧式距离。示例性的,步骤S801中所示第i(i=1,2,……,n)个特征点表示为(ui,vi,yi,ki),则第i个特征点到像素点(u,v)的距离为:In one embodiment, the distance may be a Euclidean distance. Exemplarily, the i-th (i=1,2,...,n) feature point shown in step S801 is expressed as (ui,vi,yi,ki); then the distance from the i-th feature point to the pixel (u,v) is:

$$d(i,u,v)=\sqrt{(u_{i}-u)^{2}+(v_{i}-v)^{2}}$$

则第i个特征点到像素点(u,v)的权重系数为:Then the weight coefficient from the i-th feature point to the pixel (u,v) is:

$$w(i,u,v)=a\exp\left(-\frac{\left(d(i,u,v)-b\right)^{2}}{2c^{2}}\right)$$
其中,n为特征点的个数。Among them,n is the number of feature points.
在一种实施例中,距离-权重映射函数还可以与置信度有关。示例性的,步骤S801中第i个特征点表示为(ui,vi,yi,ki),则第i个特征点到像素点(u,v)的权重系数为:In one embodiment, the distance-weight mapping function may also be related to the confidence. Exemplarily, the i-th feature point in step S801 is expressed as (ui,vi,yi,ki); then the weight coefficient from the i-th feature point to the pixel (u,v) is:

$$w(i,u,v)=k_{i}\,a\exp\left(-\frac{\left(d(i,u,v)-b\right)^{2}}{2c^{2}}\right)$$
在计算权重系数时,考虑特征点的置信度,将不同特征点区别对待,通过将置信度低的特征点降低其影响力度,提高拟合精度。When calculating the weight coefficient, the confidence of the feature points is considered, and different feature points are treated differently. By reducing the influence of feature points with low confidence, the fitting accuracy is improved.
S803:获取特征点分布密度图。S803: Obtain a feature point distribution density map.
特征点分布密度图是将每个像素点的多个权重系数在特征点维度进行累加获取的分布示意图,用于表示图像上特征点的稠密情况。具体的,将权重系数在特征点维度进行累加获取权重累加值:权重累加值越大,表示图像上该像素点处的特征点越稠密;反之,表示该处的特征点越稀疏。The feature point distribution density map is a distribution map obtained by accumulating, for each pixel, its multiple weight coefficients over the feature point dimension, and it represents the density of feature points on the image. Specifically, the weight coefficients are accumulated over the feature point dimension to obtain a weight accumulation value: the larger the weight accumulation value, the denser the feature points at that pixel on the image; conversely, the smaller it is, the sparser the feature points at that position.
示例性的,如图8B所示,将权重系数在特征点维度累加,即:Exemplarily, as shown in FIG. 8B, the weight coefficients are accumulated over the feature point dimension, that is:

$$D(u,v)=\sum_{i=1}^{n}w(i,u,v)$$
其中,权重系数为考虑置信度的权重系数,获取的特征点分布密度图如图8C所示。图8C中,A部分表示图像上特征点稠密,即每个像素点的权重累加值大;B部分表示图像上特征点稀疏,即每个像素点的权重累加值较小;C部分表示图像上无特征点。Among them, the weight coefficients are the confidence-aware weight coefficients, and the obtained feature point distribution density map is shown in FIG. 8C. In FIG. 8C, part A indicates that the feature points on the image are dense, that is, the weight accumulation value of each pixel is large; part B indicates that the feature points on the image are sparse, that is, the weight accumulation value of each pixel is small; and part C indicates that there is no feature point on the image.
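下面给出一个计算特征点分布密度图的示意性Python片段(仅为示意,按上式在特征点维度累加考虑置信度的权重系数;H×W×n的内存开销仅作说明用途)。The following is an illustrative Python sketch of computing the feature point distribution density map (a sketch only, accumulating the confidence-aware weights over the feature point dimension as in the formula above; the H×W×n memory cost is for illustration only).

```python
import numpy as np

def density_map(h, w_img, feats_uv, conf, a=1.0):
    vv, uu = np.meshgrid(np.arange(h), np.arange(w_img), indexing="ij")
    # distances from every pixel to every feature point: shape (h, w_img, n)
    d = np.sqrt((uu[..., None] - feats_uv[:, 0]) ** 2 + (vv[..., None] - feats_uv[:, 1]) ** 2)
    b = d.mean(axis=-1, keepdims=True)        # per-pixel mean distance
    c = d.std(axis=-1, keepdims=True) + 1e-8  # per-pixel distance std
    w = conf * a * np.exp(-((d - b) ** 2) / (2 * c ** 2))
    return w.sum(axis=-1)                     # accumulate over the feature dimension
```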
S804:基于特征点分布密度图进行图像分块,获得各区块的局部中心点。S804: Divide the image into blocks based on the feature point distribution density map, and obtain a local center point of each block.
图像分块是指将图像依照预设规则划分为多个区块,可以用分块标记L(u,v)∈{1,…,K}表示,K表示划分的总块数。即,若L(u,v)=1,则点(u,v)属于第一区块;相应的,若L(u,v)=K,则点(u,v)属于第K块。Image blocking refers to dividing an image into multiple blocks according to a preset rule, which can be represented by the block label L(u,v)∈{1,…,K}, where K is the total number of blocks. That is, if L(u,v)=1, the point (u,v) belongs to the first block; correspondingly, if L(u,v)=K, the point (u,v) belongs to the K-th block.
在一种可能的实现中,预设规则可以为将权重系数的权重累加值的差异值在预设范围的相邻像素点划分为同一区块。在一种实施例中,可以通过超像素分割算法实现预设规则,将特征点分布密度图进行图像分块。In a possible implementation, the preset rule may be to divide adjacent pixel points within a preset range of difference values of the accumulated weight values of the weight coefficients into the same block. In one embodiment, a superpixel segmentation algorithm may be used to implement preset rules, and the feature point distribution density map may be divided into image blocks.
超像素分割算法是利用像素之间特征的相似性将像素分组,用少量超像素代替大量像素来表达图片特征,用于降低图像后处理的复杂度。常见的超像素分割算法包括简单线性迭代聚类(simple linear iterative clustering,SLIC)、归一化割(Normalized Cut,NCut)等。下面以SLIC超像素分割算法为例,针对图8C的特征点分布密度图进行分块。The superpixel segmentation algorithm groups pixels by the similarity of their features, and replaces a large number of pixels with a small number of superpixels to express image features, which reduces the complexity of image post-processing. Common superpixel segmentation algorithms include simple linear iterative clustering (SLIC), Normalized Cut (NCut), and so on. Taking the SLIC superpixel segmentation algorithm as an example, the feature point distribution density map of FIG. 8C is divided into blocks as follows.
S804.1:初始化聚类中心。S804.1: Initialize the clustering center.
依照设定的超像素个数,在图8B图像内均匀地分配聚类中心。假定图像有N个特征点,预分割为K个相同尺寸的超像素,则每个超像素大小为N/K,相邻聚类中心的距离(又称步长)为:According to the set number of superpixels, cluster centers are evenly distributed in the image of FIG. 8B. Assuming that the image has N feature points and is pre-segmented into K superpixels of the same size, the size of each superpixel is N/K, and the distance between adjacent cluster centers (also called the step) is:

$$S=\sqrt{N/K}$$
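下面给出一个按步长S初始化聚类中心的示意性Python片段(仅为示意)。The following is an illustrative Python sketch of initializing the cluster centers with step S (a sketch only).

```python
import numpy as np

def init_centers(h, w_img, K):
    S = int(np.sqrt(h * w_img / K))  # step between adjacent cluster centers
    centers = [(r, c) for r in range(S // 2, h, S)
                      for c in range(S // 2, w_img, S)]
    return centers, S
```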
S804.2:在聚类中心的邻域重新选择聚类中心。S804.2: Reselect the cluster center in the neighborhood of the cluster center.
本实施例选择的聚类中心的邻域大小为3*3。首先计算该邻域内所有特征点的梯度值,将聚类中心移动到该邻域内梯度最小处。The neighborhood size of the cluster centers selected in this embodiment is 3*3. First, the gradient values of all feature points in the neighborhood are calculated, and the cluster center is moved to the position with the smallest gradient in the neighborhood.
S804.3:在聚类中心邻域内为每个特征点分配标签。S804.3: Assign labels to each feature point within the neighborhood of the cluster center.
本实施例中分配标签用于表示该特征点属于哪一聚类中心。为了加速收敛,将搜索范围扩大为邻域范围的2倍,即搜索范围为6*6。In this embodiment, the assigned label indicates which cluster center the feature point belongs to. To speed up convergence, the search range is expanded to twice the neighborhood range, that is, the search range is 6*6.
S804.4:衡量相似度。S804.4: Measure the similarity.
对于每个搜索到的特征点,分别计算该特征点与其附近聚类中心的相似程度,将最相似的聚类中心的标签赋值给该特征点,不断迭代该过程直至收敛。具体相似度衡量方式如下:For each searched feature point, the similarity between the feature point and its nearby cluster centers is calculated, the label of the most similar cluster center is assigned to the feature point, and this process is iterated until convergence. The similarity is measured as follows:

$$d_{c}=\sqrt{(l_{j}-l_{i})^{2}+(a_{j}-a_{i})^{2}+(b_{j}-b_{i})^{2}},\qquad d_{s}=\sqrt{(x_{j}-x_{i})^{2}+(y_{j}-y_{i})^{2}}$$

$$D'=\sqrt{\left(\frac{d_{c}}{m}\right)^{2}+\left(\frac{d_{s}}{S}\right)^{2}}$$

其中,dc为特征点间的颜色差异,ds为特征点间的空间距离,D'为特征点间的相似度量,m为平衡参数,用来衡量颜色值和空间信息在相似度衡量中的比重。D'越小,表示两特征点间越相似。第j个特征点的颜色空间信息为(lj,aj,bj),二维坐标信息为(xj,yj);第i个特征点的颜色信息为(li,ai,bi),二维坐标信息为(xi,yi)。Among them, dc is the color difference between feature points, ds is the spatial distance between feature points, D' is the similarity measure between feature points, and m is a balance parameter that weighs the proportion of color values and spatial information in the similarity measurement. The smaller D' is, the more similar the two feature points are. The color space information of the j-th feature point is (lj,aj,bj) and its two-dimensional coordinates are (xj,yj); the color information of the i-th feature point is (li,ai,bi) and its two-dimensional coordinates are (xi,yi).
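下面给出一个按上述定义计算dc、ds与相似度量D'的示意性Python片段(仅为示意;D'越小表示越相似)。The following is an illustrative Python sketch of computing dc, ds and the similarity measure D' as defined above (a sketch only; a smaller D' means more similar).

```python
import numpy as np

def slic_distance(lab_i, xy_i, lab_j, xy_j, S, m=10.0):
    dc = np.linalg.norm(np.asarray(lab_j) - np.asarray(lab_i))  # color difference in Lab space
    ds = np.linalg.norm(np.asarray(xy_j) - np.asarray(xy_i))    # spatial distance
    return np.sqrt((dc / m) ** 2 + (ds / S) ** 2)               # smaller => more similar
```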
参见图8D所示,为图8C结合上述实现方式,基于K为24*32获得的图像分块结果示意图。Referring to FIG. 8D , it is a schematic diagram of an image segmentation result obtained based on K being 24*32 in combination with the above implementation in FIG. 8C .
进一步,对于图像中每一区块,可以计算该区块的局部中心点。以图8D为例,第j块的局部中心点(uj,vj)可以通过如下方式获得:Further, for each block in the image, the local center point of the block can be calculated. Taking FIG. 8D as an example, the local center point (uj,vj) of the j-th block can be obtained as follows:

$$(u_{j},v_{j})=\frac{1}{\left|B_{j}\right|}\sum_{(u_{j^{*}},v_{j^{*}})\in B_{j}}\left(u_{j^{*}},v_{j^{*}}\right)$$

其中,(uj*,vj*)为第j块内特征点的坐标,Bj为第j块内特征点的集合,j=1,2,……,K。Among them, (uj*,vj*) are the coordinates of the feature points in the j-th block, Bj is the set of feature points in the j-th block, and j=1,2,…,K.
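下面给出一个由分块标记计算各区块局部中心点的示意性Python片段(仅为示意,以各块内点坐标的均值作为局部中心点)。The following is an illustrative Python sketch of computing the local center point of each block from the block labels (a sketch only, taking the mean coordinate of the points in each block as the local center).

```python
import numpy as np

def block_centers(labels, K):
    """labels: (h, w) array of block indices in 0..K-1."""
    centers = []
    for j in range(K):
        rr, cc = np.nonzero(labels == j)
        centers.append((rr.mean(), cc.mean()))  # local center (u_j, v_j) of block j
    return centers
```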
S805:计算目标点的偏置系数,利用偏置系数调整权重系数。S805: Calculate the bias coefficient of the target point, and use the bias coefficient to adjust the weight coefficient.
将步骤S804获取的局部中心点(uj,vj)作为目标点,通过步骤S802所述的距离-权重计算公式,获取权重系数w(i,uj,vj)。Taking the local center point (uj,vj) obtained in step S804 as the target point, the weight coefficients w(i,uj,vj) are obtained through the distance-weight calculation formula described in step S802.
偏置系数为取值范围在(0,1)内的数,用于调整权重系数:当权重系数较小,比如几乎为0时,利用较大的偏置系数调整;否则,利用较小的偏置系数调整。在一种可能的实现方式中,考虑到特征点在图像上分布不均匀,图像上有的部分特征点分布比较多,而有的部分局部空缺。对于特征点分布比较多的地方,此时局部中心点处的各特征点的权重系数离散值较大;而对于局部空缺,此时局部中心点处的权重系数离散值较小,且权重系数几乎为0。为了避免图像局部空缺导致权重系数几乎为0的问题,偏置系数与局部中心点处各特征点的权重系数离散值呈负相关映射关系。在一种可能的实现方式中,偏置系数与局部中心点处的各特征点权重系数的标准差呈负相关。示例性说明,偏置系数为:The bias coefficient is a number in the range (0,1) used to adjust the weight coefficients: when the weight coefficients are small, for example almost 0, a larger bias coefficient is used for adjustment; otherwise, a smaller bias coefficient is used. In a possible implementation, considering that the feature points are unevenly distributed on the image, some parts of the image have many feature points while other parts have local vacancies. Where many feature points are distributed, the discrete value of the weight coefficients of the feature points at the local center point is large; at a local vacancy, the discrete value of the weight coefficients at the local center point is small, and the weight coefficients are almost 0. To avoid the problem that the weight coefficients are almost 0 due to local vacancies in the image, the bias coefficient has a negatively correlated mapping relationship with the discrete value of the weight coefficients of the feature points at the local center point. In a possible implementation, the bias coefficient is negatively correlated with the standard deviation of the weight coefficients of the feature points at the local center point. Exemplarily, the bias coefficient is:

$$b(u_{j},v_{j})=\exp\bigl(-a\cdot\mathrm{std}\bigl(w(i,u_{j},v_{j})\bigr)\bigr)$$
其中,std()为标准差函数。a为调整标准差影响大小的参数,a>0,比如a =5。Among them, std() is the standard deviation function.a is a parameter to adjust the influence of the standard deviation,a>0 , such asa =5.
进一步,利用上述方式获取的偏置系数,调整权重系数。在一种可能的实现方式中,偏置系数与权重系数相加,将相加结果形成一对角阵,其对角元素为偏置系数与权重系数之和,这一对角阵为权重矩阵。上述示例性说明中,若令:Further, the weight coefficients are adjusted by using the bias coefficient obtained in the above manner. In a possible implementation, the bias coefficient is added to the weight coefficients, and the addition results form a diagonal matrix whose diagonal elements are the sums of the bias coefficient and the weight coefficients; this diagonal matrix is the weight matrix. In the above exemplary description, if we let:

$$\tilde{w}(i,u_{j},v_{j})=w(i,u_{j},v_{j})+b(u_{j},v_{j})$$

则权重矩阵具体表示形式为:then the weight matrix is expressed as:

$$W_{j}=\mathrm{diag}\bigl(\tilde{w}(1,u_{j},v_{j}),\ \ldots,\ \tilde{w}(n,u_{j},v_{j})\bigr)$$
本申请通过偏置系数调整各特征点到局部中心点的权重系数。其中,权重系数离散值较大时,偏置系数较小;离散值较小时,比如局部空缺,偏置系数较大。利用偏置系数调整权重系数,可以使权重系数在预设范围内波动。示例性的,预设范围为0.95~1.05。比如,调整后获取的权重系数为0.99,在这一范围内波动,满足调整要求。通过偏置系数调整权重系数,可以实现平均化各特征点差异,使结果趋向于全局拟合。In this application, the weight coefficients from the feature points to the local center point are adjusted through the bias coefficient. The bias coefficient is small when the discrete value of the weight coefficients is large, and is large when the discrete value is small, for example at a local vacancy. Adjusting the weight coefficients with the bias coefficient can make the weight coefficients fluctuate within a preset range, for example, within 0.95~1.05. For instance, a weight coefficient of 0.99 obtained after adjustment fluctuates within this range and meets the adjustment requirement. Adjusting the weight coefficients through the bias coefficient averages the differences among the feature points, so that the result tends toward a global fit.
S806:计算目标点处的变换系数。S806: Calculate the transformation coefficient at the target point.
正如前文所述,目前计算变换系数,即偏移系数和放缩系数往往通过多个特征点进行曲线拟合的方式获取。示例性,本申请以最小二乘法的曲线拟合方式为例,结合步骤S805获取的局部中心点和步骤S802获取的权重系数,详细介绍如何获取变换系数。As mentioned above, the current calculation of transformation coefficients, that is, offset coefficients and scaling coefficients, is often obtained by performing curve fitting on multiple feature points. Exemplarily, this application takes the least squares method of curve fitting as an example, combining the local center point obtained in step S805 and the weight coefficient obtained in step S802, to introduce in detail how to obtain the transformation coefficient.
假设第j块的局部中心点(uj,vj)的变换系数为βj=(sj,μj),其中,放缩系数为sj,偏移系数为μj。最小二乘法为:Assume that the transformation coefficient of the local center point (uj,vj) of the j-th block is βj=(sj,μj), where the scaling coefficient is sj and the offset coefficient is μj. The least squares method is:

$$\hat{\beta}_{j}=\arg\min_{\beta_{j}}\ \left(y-X\beta_{j}\right)^{\top}W_{j}\left(y-X\beta_{j}\right)+\lambda s_{j}^{2}$$

其中,di为第i个特征点的相对深度(构成矩阵X),ki为第i个特征点的置信度(考虑置信度时已包含在权重矩阵Wj中),λ为关于放缩系数sj的正则项系数,目的在于保证放缩系数sj有意义,本领域技术人员根据需要可以自行设定。Among them, di is the relative depth of the i-th feature point (composing the matrix X), ki is the confidence of the i-th feature point (incorporated into the weight matrix Wj when the confidence is considered), and λ is the regularization coefficient on the scaling coefficient sj, which ensures that the scaling coefficient sj is meaningful and can be set by those skilled in the art as needed.
对上式推导计算,确定:Deriving from the above formula, it is determined that:

$$\hat{\beta}_{j}=\left(X^{\top}W_{j}X+\lambda E\right)^{-1}X^{\top}W_{j}\,y$$

其中,$\hat{\beta}_{j}$为βj的估计值,Wj为权重矩阵,E=diag(1,0)表示正则项仅作用于放缩系数sj。X为相对深度xi=[di,1]组成的矩阵,X⊤为矩阵X的转置矩阵,y为绝对深度矩阵。Among them, $\hat{\beta}_{j}$ is the estimated value of βj, Wj is the weight matrix, and E=diag(1,0) indicates that the regularization acts only on the scaling coefficient sj. X is the matrix composed of the relative depths xi=[di,1], X⊤ is the transpose of the matrix X, and y is the absolute depth matrix.
如此,利用上述获取的权重矩阵、各特征点处的相对深度值组成的相对深度矩阵,以及各特征点处的绝对深度值组成的绝对深度矩阵,即可计算特征点分布密度图中各块局部中心点处变换系数的估计值。In this way, the estimated values of the transformation coefficients at the local center point of each block in the feature point distribution density map can be calculated by using the weight matrix obtained above, the relative depth matrix composed of the relative depth values at the feature points, and the absolute depth matrix composed of the absolute depth values at the feature points.
S807:获取图像上其他目标点的变换系数。S807: Obtain transformation coefficients of other target points on the image.
为了更好获取图像上的深度信息,即绝对深度信息,需要对图像上每个目标点进行变换系数计算,如此通过相对深度信息,结合放缩系数和偏移系数进行放缩和偏移,获取图像绝对深度信息。In order to better obtain the depth information on the image, that is, the absolute depth information, it is necessary to calculate the transformation coefficient for each target point on the image. In this way, the relative depth information is combined with the scaling coefficient and offset coefficient to perform scaling and offsetting to obtain Image absolute depth information.
在一种可能的实现方式中,可以通过插值法获取图像上每个目标点的变换系数。可以通过单线性插值法,也可以通过双线性插值法获取图像上每个目标点的变换系数。本申请实施例不对图像具体插值法进行限定。In a possible implementation manner, the transformation coefficient of each target point on the image may be obtained through interpolation. The transformation coefficient of each target point on the image may be obtained through linear interpolation or bilinear interpolation. The embodiment of the present application does not limit the specific interpolation method.
示例性说明:以双线性插值为例,获取每个坐标点的放缩系数和偏移系数。Exemplary description: Take bilinear interpolation as an example to obtain the scaling factor and offset factor of each coordinate point.
比如上述实施例中,为获取第j块的局部中心点(uj,vj)的变换系数βj=(sj,μj),其中放缩系数为sj,偏移系数为μj,可以先获取与第j块的局部中心点相邻的四个局部中心点j1, j2, j3, j4对应的变换系数βj1,βj2,βj3,βj4。其中,j1, j2为第j块局部中心点上方从左至右两个相邻的中心点,j3, j4为第j块局部中心点下方从左至右相邻的两个中心点。利用双线性插值函数获取第j块的局部中心点(uj,vj)的变换系数,具体公式如下:For example, in the above embodiment, to obtain the transformation coefficient βj=(sj,μj) of the local center point (uj,vj) of the j-th block, where the scaling coefficient is sj and the offset coefficient is μj, the transformation coefficients βj1, βj2, βj3, βj4 corresponding to the four local center points j1, j2, j3, j4 adjacent to the local center point of the j-th block can first be obtained. Among them, j1 and j2 are the two adjacent center points from left to right above the local center point of the j-th block, and j3 and j4 are the two adjacent center points from left to right below it. The transformation coefficient of the local center point (uj,vj) of the j-th block is obtained through the bilinear interpolation function, and the specific formula is as follows:

$$\beta_{j}=(1-\alpha)(1-\gamma)\,\beta_{j_{1}}+(1-\alpha)\gamma\,\beta_{j_{2}}+\alpha(1-\gamma)\,\beta_{j_{3}}+\alpha\gamma\,\beta_{j_{4}}$$

其中,α=uj−u1,γ=vj−v1,(u1,v1)和(u1+1,v1+1)分别为j1和j4对应的局部中心点。Among them, α=uj−u1 and γ=vj−v1, and (u1,v1) and (u1+1,v1+1) are the local center points corresponding to j1 and j4, respectively.
为了更好的说明双线性插值获取第j块局部中心点的变换系数,下面通过具体数值进行表示。In order to better illustrate the bilinear interpolation for obtaining the transformation coefficient of the local center point of the j-th block, it is represented by specific numerical values below.
j1, j2, j3, j4对应的局部中心点为(104,202),(104,203),(105,202),(105,203)。第j块局部中心点为(104.615,202.941)。The local center points corresponding toj1, j2, j3, and j4 are (104,202), (104,203), (105,202), (105,203). The local center point of the jth block is (104.615, 202.941).
则:Then:

$$\alpha=104.615-104=0.615,\qquad\gamma=202.941-202=0.941$$

即:That is:

$$\beta_{j}=0.385\times0.059\,\beta_{j_{1}}+0.385\times0.941\,\beta_{j_{2}}+0.615\times0.059\,\beta_{j_{3}}+0.615\times0.941\,\beta_{j_{4}}\qquad(40)$$

将公式(40)的值代入即可求得第j块局部中心点对应的变换系数。Substituting the values of formula (40), the transformation coefficient corresponding to the local center point of the j-th block can be obtained.
如此,在本申请实施例中,用插值法代替对每个坐标点都计算一组放缩变换参数,可以减少计算工作量,提高运行效率。In this way, in the embodiment of the present application, interpolation is used instead of calculating a set of scaling transformation parameters for each coordinate point, which can reduce calculation workload and improve operating efficiency.
S808:获取图像上每个目标点的深度信息。S808: Obtain depth information of each target point on the image.
将获取的相对深度值,通过放缩系数和偏移系数进行放缩和偏移变换,可以获取图像的深度信息。具体图像深度信息获取公式如下:The acquired relative depth values are transformed through the scaling coefficient and offset coefficient to obtain the depth information of the image. The specific formula for obtaining the image depth information is as follows:

$$d_{abs}(u_{j},v_{j})=s_{j}\cdot d_{rel}(u_{j},v_{j})+\mu_{j}$$
其中,dabs(uj,vj)为第j个目标点的绝对深度值。drel(uj,vj)为第j个目标点的相对深度值。sj,μj分别为第j个目标点的变换系数对应的放缩系数和偏移系数。Among them,dabs (uj ,vj ) is the absolute depth value of thejth target point.drel (uj ,vj ) is the relative depth value of thejth target point.sj ,μj are the scaling coefficient and offset coefficient corresponding to the transformation coefficient of thejth target point respectively.
具体的,针对图8B所示图像经超像素分割后的结果,先获取第j块的局部中心点(uj,vj);然后利用偏置系数调整权重系数,获取权重矩阵;再基于步骤S806中的示例性方式,计算每一块局部中心点的放缩系数和偏移系数;接着利用双线性插值法获取图像上每个坐标点的放缩系数和偏移系数;最后基于图像绝对深度公式获取图像绝对深度信息,如图8E所示。通过图8E可以发现,该方法获取的图像深度信息规避了因特征点分布不均匀,比如局部空缺,导致图像出现异常空白区域的问题。Specifically, for the superpixel segmentation result of the image shown in FIG. 8B, the local center point (uj,vj) of the j-th block is first obtained; the weight coefficients are then adjusted with the bias coefficient to obtain the weight matrix; the scaling coefficient and offset coefficient of the local center point of each block are calculated in the exemplary manner of step S806; the scaling coefficient and offset coefficient of every coordinate point on the image are then obtained through bilinear interpolation; finally, the absolute depth information of the image is obtained based on the absolute depth formula, as shown in FIG. 8E. It can be seen from FIG. 8E that the image depth information obtained by this method avoids abnormal blank areas in the image caused by the uneven distribution of feature points, such as local vacancies.
本申请实施例提供的图像深度信息获取的方法,通过引入特征点分布密度图,对图像进行分块并获取各区块的局部中心点,以局部中心点作为目标点,根据各特征点到目标点的权重系数离散程度计算偏置系数,利用偏置系数调整各特征点到目标点的权重系数;再利用调整后的各特征点到目标点的权重系数优化曲线拟合,获取变换系数。如此,在保证获取绝对深度信息图的准确度的前提下,由于不需要计算每个像素点的变换系数,仅需要计算区块内局部中心点的变换系数,减少了计算工作量,提高了工作效率。The method for acquiring image depth information provided by the embodiments of the present application introduces the feature point distribution density map, divides the image into blocks and obtains the local center point of each block, takes the local center point as the target point, calculates the bias coefficient according to the degree of dispersion of the weight coefficients from the feature points to the target point, and adjusts the weight coefficients from the feature points to the target point with the bias coefficient. The adjusted weight coefficients are then used to optimize the curve fitting and obtain the transformation coefficients. In this way, on the premise of ensuring the accuracy of the obtained absolute depth information map, since only the transformation coefficients of the local center points in the blocks need to be calculated instead of those of every pixel, the calculation workload is reduced and the efficiency is improved.
上文结合图1至图8E对本申请实施例提供的图像深度信息获取的方法进行了详细介绍,下面结合附图对本申请实施例提供的图像深度信息获取的装置进行介绍。The method for obtaining image depth information provided by the embodiment of the present application is described in detail above with reference to FIG. 1 to FIG. 8E . The apparatus for obtaining image depth information provided by the embodiment of the present application will be introduced below in conjunction with the accompanying drawings.
参见图9,为本申请实施例提供的一种图像深度信息获取的装置结构示意图,该装置900包括:Referring to FIG. 9 , it is a schematic structural diagram of a device for acquiring image depth information provided by an embodiment of the present application. The device 900 includes:
第一获取单元901,用于获取图像的多个特征点。The first acquiring unit 901 is configured to acquire multiple feature points of an image.
第二获取单元902,用于获取图像上各特征点到目标点的权重系数。The second acquisition unit 902 is configured to acquire weight coefficients from each feature point to the target point on the image.
第一计算单元903,用于计算目标点处的偏置系数,利用偏置系数调整权重系数。The first calculation unit 903 is configured to calculate the bias coefficient at the target point, and use the bias coefficient to adjust the weight coefficient.
第二计算单元904,用于计算目标点处的变换系数。The second calculation unit 904 is configured to calculate the transformation coefficient at the target point.
第三获取单元905,用于获取图像上其他目标点的变换系数。The third acquiring unit 905 is configured to acquire transformation coefficients of other target points on the image.
第四获取单元906,用于获取每个目标点的图像深度信息。The fourth acquiring unit 906 is configured to acquire image depth information of each target point.
可选的,该装置900还可以包括:Optionally, the device 900 may also include:
第五获取单元,用于获取特征点分布密度图。The fifth acquisition unit is configured to acquire a feature point distribution density map.
分块单元,用于基于特征点分布密度图进行图像分块,获得各块的中心点。The block unit is used to block the image based on the feature point distribution density map, and obtain the center point of each block.
本申请提供的一种图像深度信息获取的装置,可以通过偏置系数调整权重系数,可以消除局部空缺导致深度图像信息局部空白的问题,提高了获取图像深度信息的准确性。The apparatus for acquiring image depth information provided by this application can adjust the weight coefficients through the bias coefficient, which eliminates the problem of partial blanks in the depth image information caused by local vacancies and improves the accuracy of acquiring image depth information.
根据本申请实施例的装置900的各个模块/单元的上述和其它操作和/或功能分别为了实现图像深度信息获得方法中所示实施例中的各个方法的相应流程,为了简洁,在此不再赘述。The above-mentioned and other operations and/or functions of the various modules/units of the device 900 according to the embodiment of the present application are respectively to realize the corresponding processes of the various methods in the embodiments shown in the method for obtaining image depth information, and for the sake of brevity, they are not repeated here. repeat.
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Through the description of the above embodiments, those skilled in the art can clearly understand that for the convenience and brevity of the description, only the division of the above-mentioned functional modules is used as an example for illustration. In practical applications, the above-mentioned functions can be allocated according to needs It is completed by different functional modules, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. For the specific working process of the above-described system, device, and unit, reference may be made to the corresponding process in the foregoing method embodiments, and details are not repeated here.
本申请实施例还提供了一种计算机可读存储介质。所述计算机可读存储介质可以是计算设备能够存储的任何可用介质或者是包含一个或多个可用介质的数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘)等。该计算机可读存储介质包括指令,所述指令指示计算设备执行上述获取图像深度信息的方法。The embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center that includes one or more available media. The available media may be magnetic media (for example, a floppy disk, a hard disk, or a magnetic tape), optical media (for example, a DVD), or semiconductor media (for example, a solid state drive). The computer-readable storage medium includes instructions, and the instructions instruct a computing device to execute the above method for acquiring image depth information.
本申请实施例还提供了一种计算机程序产品。所述计算机程序产品包括一个或多个计算机指令。在计算设备上加载和执行所述计算机指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算设备或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算设备或数据中心进行传输。所述计算机程序产品可以为一个软件安装包,在需要使用前述获取图像深度信息的方法的情况下,可以下载该计算机程序产品并在计算设备上执行该计算机程序产品。The embodiment of the present application further provides a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computing device, the processes or functions according to the embodiments of the present application are generated in whole or in part. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computing device, or data center to another website, computing device, or data center in a wired (for example, coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (for example, infrared, radio, microwave) manner. The computer program product may be a software installation package, and if the aforementioned method for acquiring image depth information needs to be used, the computer program product may be downloaded and executed on the computing device.
上述各个附图对应的流程或结构的描述各有侧重,某个流程或结构中没有详述的部分,可以参见其他流程或结构的相关描述。The description of the process or structure corresponding to each of the above drawings has its own emphasis. For the part that is not described in detail in a certain process or structure, you can refer to the relevant description of other processes or structures.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211290478.3ACN116091572B (en) | 2022-10-21 | 2022-10-21 | Method, electronic device and storage medium for obtaining image depth information |
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211290478.3ACN116091572B (en) | 2022-10-21 | 2022-10-21 | Method, electronic device and storage medium for obtaining image depth information |
| Publication Number | Publication Date |
|---|---|
| CN116091572Atrue CN116091572A (en) | 2023-05-09 |
| CN116091572B CN116091572B (en) | 2023-10-03 |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211290478.3AActiveCN116091572B (en) | 2022-10-21 | 2022-10-21 | Method, electronic device and storage medium for obtaining image depth information |
| Country | Link |
|---|---|
| CN (1) | CN116091572B (en) |
Patent citations

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2011046607A2 | 2009-10-14 | 2011-04-21 | Thomson Licensing | Filtering and edge encoding |
| CN106875436A | 2016-12-14 | 2017-06-20 | Beijing Information Science and Technology University | Method and apparatus for estimating depth from a focal stack based on feature point density |
| CN107967675A | 2017-12-25 | 2018-04-27 | Zhejiang University | Structured point cloud denoising method based on adaptive-projection moving least squares |
| CN111223059A | 2020-01-04 | 2020-06-02 | Xi'an Jiaotong University | Robust depth map structure reconstruction and denoising method based on guided filtering |
| CN113205549A | 2021-05-07 | 2021-08-03 | Shenzhen SenseTime Technology Co., Ltd. | Depth estimation method and device, electronic device and storage medium |
| CN115049717A | 2022-08-15 | 2022-09-13 | Honor Device Co., Ltd. | Depth estimation method and device |
Non-patent citations

| Title |
|---|
| JINSUN PARK ET AL.: "Non-Local Spatial Propagation Network for Depth Completion", arXiv:2007.10042v1 [cs.CV], pages 1-16 |
| XINCHEN YE ET AL.: "DRM-SLAM: Towards Dense Reconstruction of Monocular SLAM with Scene Depth Fusion", Neurocomputing, pages 1-20 |
| LI WEI: "Improved image inpainting model based on Gaussian feature equalization", Value Engineering, vol. 44, no. 18, pages 101-104 |
Similar documents

| Publication | Title |
|---|---|
| US11756223B2 | Depth-aware photo editing |
| KR20190128686A | Method and apparatus, equipment, and storage medium for determining the pose of an object in an image |
| CN113709355B | Sliding zoom shooting method and electronic device |
| CN113538227B | Image processing method based on semantic segmentation and related device |
| WO2023124948A1 | Three-dimensional map creation method and electronic device |
| WO2024021742A1 | Fixation point estimation method and related device |
| WO2023045724A1 | Image processing method, electronic device, storage medium, and program product |
| CN111800569B | Photographing processing method and device, storage medium and electronic device |
| CN117132515B | Image processing method and electronic device |
| CN114693538A | Image processing method and device |
| WO2023023162A1 | 3D semantic plane detection and reconstruction from multi-view stereo (MVS) images |
| CN114782296A | Image fusion method, device and storage medium |
| CN114881841A | Image generation method and device |
| CN115587938A | Video distortion correction method and related device |
| CN117115238B | Method for determining posture, electronic device and storage medium |
| CN116091572B | Method, electronic device and storage medium for obtaining image depth information |
| CN117132648A | Visual positioning method, electronic device and computer-readable storage medium |
| CN113890984B | Photographing method, image processing method and electronic device |
| Kim et al. | Vision-based all-in-one solution for augmented reality and its storytelling applications |
| CN115880350A | Image processing method, apparatus, system, and computer-readable storage medium |
| CN116051723B | Bundle adjustment method and electronic device |
| CN117729320B | Image display method, device and storage medium |
| CN117689545B | Image processing method, electronic device and computer-readable storage medium |
| CN116681818B | Novel-view reconstruction method, and training method and device for a novel-view reconstruction network |
| CN116847194A | Focusing method and electronic device |
Legal events

| Code | Title | Description |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CP03 | Change of name, title or address | Address after: Unit 3401, Unit A, Building 6, Shenye Zhongcheng, No. 8089 Hongli West Road, Donghai Community, Xiangmihu Street, Futian District, Shenzhen, Guangdong 518040, China. Patentee after: Honor Terminal Co., Ltd. Address before: 3401, Unit A, Building 6, Shenye Zhongcheng, No. 8089 Hongli West Road, Donghai Community, Xiangmihu Street, Futian District, Shenzhen, Guangdong, China. Patentee before: Honor Device Co., Ltd. |