CN118230072A

Info

Publication number: CN118230072A
Application number: CN202410635694.XA
Authority: CN
Inventors: 李梦柯; 黄惠; 黎达
Original assignee: Guangdong Provincial Laboratory Of Artificial Intelligence And Digital Economy Shenzhen; Shenzhen University
Current assignee: Guangdong Provincial Laboratory Of Artificial Intelligence And Digital Economy Shenzhen; Shenzhen University
Priority date: 2024-05-22
Filing date: 2024-05-22
Publication date: 2024-06-21
Anticipated expiration: 2044-05-22
Also published as: CN118230072B

Abstract

Translated from Chinese

本申请涉及一种基于二维分类模型的三维点云分类模型训练方法和介质。所述方法包括：获取二维图像对应的图像特征向量序列，基于图像特征向量序列对初始二维分类模型进行训练，得到目标二维分类模型；基于目标二维分类模型和初始校准单元，得到初始三维点云分类模型；获取三维点云对应的数据特征向量序列，基于数据特征向量序列对初始三维点云分类模型进行训练，以调整初始校准单元中的初始校准参数，得到目标三维点云分类模型；目标三维点云分类模型用于基于三维点云对应的数据特征向量序列确定三维点云对应的物体类别。采用本方法能够提高三维点云分类模型的准确性。

The present application relates to a method and medium for training a three-dimensional point cloud classification model based on a two-dimensional classification model. The method includes: obtaining an image feature vector sequence corresponding to a two-dimensional image, training an initial two-dimensional classification model based on the image feature vector sequence, and obtaining a target two-dimensional classification model; obtaining an initial three-dimensional point cloud classification model based on the target two-dimensional classification model and an initial calibration unit; obtaining a data feature vector sequence corresponding to a three-dimensional point cloud, training the initial three-dimensional point cloud classification model based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit, and obtaining a target three-dimensional point cloud classification model; the target three-dimensional point cloud classification model is used to determine the object category corresponding to the three-dimensional point cloud based on the data feature vector sequence corresponding to the three-dimensional point cloud. The use of this method can improve the accuracy of the three-dimensional point cloud classification model.

Description

基于二维分类模型的三维点云分类模型训练方法和介质Three-dimensional point cloud classification model training method and medium based on two-dimensional classification model

技术领域Technical Field

本申请涉及人工智能领域，特别是涉及一种基于二维分类模型的三维点云分类模型训练方法和介质。The present application relates to the field of artificial intelligence, and in particular to a three-dimensional point cloud classification model training method and medium based on a two-dimensional classification model.

背景技术 Background Art

随着人工智能技术的发展，出现了智能机器人和自动驾驶车辆等智能设备，智能设备在运行的过程中需要对采集到的三维点云进行分析，以确定三维点云对应的物体类别。With the development of artificial intelligence technology, intelligent devices such as intelligent robots and self-driving vehicles have emerged. During operation, intelligent devices need to analyze the collected three-dimensional point clouds to determine the object category corresponding to the three-dimensional point clouds.

传统技术中，首先对三维点云进行映射，得到三维点云对应的三维图像，然后将三维图像输入至训练完成的图像分类模型，以确定三维点云对应的物体类别，导致三维点云分类的准确性较低。In the traditional approach, the three-dimensional point cloud is first mapped to obtain a corresponding three-dimensional image, and that image is then input into a trained image classification model to determine the object category corresponding to the point cloud. This approach yields low three-dimensional point cloud classification accuracy.

发明内容Summary of the invention

基于此，有必要针对上述技术问题，提供一种能够提高分类准确性的基于二维分类模型的三维点云分类模型训练方法和介质。Based on this, it is necessary to provide a three-dimensional point cloud classification model training method and medium based on a two-dimensional classification model that can improve classification accuracy in response to the above technical problems.

第一方面，本申请提供了一种基于二维分类模型的三维点云分类模型训练方法。所述方法包括：In a first aspect, the present application provides a method for training a three-dimensional point cloud classification model based on a two-dimensional classification model. The method comprises:

获取二维图像对应的图像特征向量序列，基于所述图像特征向量序列对初始二维分类模型进行训练，得到目标二维分类模型；Acquire an image feature vector sequence corresponding to the two-dimensional image, and train an initial two-dimensional classification model based on the image feature vector sequence to obtain a target two-dimensional classification model;

基于所述目标二维分类模型和初始校准单元，得到初始三维点云分类模型；Based on the target two-dimensional classification model and the initial calibration unit, an initial three-dimensional point cloud classification model is obtained;

获取三维点云对应的数据特征向量序列，基于所述数据特征向量序列对所述初始三维点云分类模型进行训练，以调整所述初始校准单元中的初始校准参数，得到目标三维点云分类模型；所述目标三维点云分类模型用于基于三维点云对应的数据特征向量序列确定所述三维点云对应的物体类别。A data feature vector sequence corresponding to the three-dimensional point cloud is obtained, and the initial three-dimensional point cloud classification model is trained based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit to obtain a target three-dimensional point cloud classification model; the target three-dimensional point cloud classification model is used to determine the object category corresponding to the three-dimensional point cloud based on the data feature vector sequence corresponding to the three-dimensional point cloud.

在一个实施例中，所述获取三维点云对应的数据特征向量序列，包括：In one embodiment, obtaining a data feature vector sequence corresponding to the three-dimensional point cloud includes:

获取三维点云，基于所述三维点云中三维图像点之间的直线距离，确定所述三维点云对应的多个采样点和每个所述采样点对应的邻域集合；Acquire a three-dimensional point cloud, and determine a plurality of sampling points corresponding to the three-dimensional point cloud and a neighborhood set corresponding to each of the sampling points based on straight-line distances between three-dimensional image points in the three-dimensional point cloud;

针对每一个所述邻域集合，对所述邻域集合进行特征提取，得到所述邻域集合对应的数据特征向量；For each of the neighborhood sets, extract features of the neighborhood set to obtain a data feature vector corresponding to the neighborhood set;

基于多个所述邻域集合之间的相对位置关系，对多个所述邻域集合进行排序，得到各个所述邻域集合对应的排列顺序；Based on the relative position relationship between the plurality of neighborhood sets, the plurality of neighborhood sets are sorted to obtain an arrangement order corresponding to each of the neighborhood sets;

基于多个所述邻域集合对应的排列顺序，对多个所述邻域集合对应的数据特征向量进行排序，得到所述三维点云对应的数据特征向量序列。Based on the arrangement order corresponding to the multiple neighborhood sets, the data feature vectors corresponding to the multiple neighborhood sets are sorted to obtain a data feature vector sequence corresponding to the three-dimensional point cloud.
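The steps above (extract one feature vector per neighborhood set, order the neighborhood sets by position, then order the feature vectors the same way) can be sketched in Python. The source does not specify the feature extractor or the ordering rule, so both are assumptions made only for illustration: the feature is a placeholder (the per-axis maximum offset from the neighborhood centroid) and neighborhoods are ordered by centroid coordinates (x, then y, then z). The name `build_feature_sequence` is hypothetical.

```python
def build_feature_sequence(points, neighborhoods):
    """Return one feature vector per neighborhood, in positional order.

    points: list of (x, y, z) tuples.
    neighborhoods: list of lists of indices into `points`.
    """
    entries = []
    for members in neighborhoods:
        coords = [points[i] for i in members]
        # Centroid of the neighborhood, used here as its position.
        centroid = tuple(sum(axis) / len(coords) for axis in zip(*coords))
        # Placeholder feature (assumption): largest offset from the
        # centroid along each axis. A trained extractor would go here.
        feature = [max(abs(c[a] - centroid[a]) for c in coords)
                   for a in range(3)]
        entries.append((centroid, feature))
    # Assumed ordering rule: lexicographic by centroid (x, then y, then z).
    entries.sort(key=lambda entry: entry[0])
    return [feature for _, feature in entries]
```

Any deterministic ordering would serve the same purpose, which is to give the unordered point cloud a fixed sequence structure comparable to the patch order of a two-dimensional image.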

在一个实施例中，所述基于所述三维点云中三维图像点之间的直线距离，确定所述三维点云对应的多个采样点和每个所述采样点对应的邻域集合，包括：In one embodiment, determining a plurality of sampling points corresponding to the three-dimensional point cloud and a neighborhood set corresponding to each sampling point based on the straight-line distance between three-dimensional image points in the three-dimensional point cloud includes:

将所述三维点云中的一个三维图像点确定为参考点，将除去所述参考点的所述三维点云确定为候选点集合；Determine a 3D image point in the 3D point cloud as a reference point, and determine the 3D point cloud excluding the reference point as a candidate point set;

针对所述候选点集合中的每一个候选点，确定所述候选点与所述参考点之间的直线距离；For each candidate point in the candidate point set, determining a straight-line distance between the candidate point and the reference point;

对多个所述直线距离进行比较，得到最大的直线距离，将所述最大的直线距离所对应的候选点确定为采样点；Compare the plurality of straight-line distances to obtain a maximum straight-line distance, and determine the candidate point corresponding to the maximum straight-line distance as a sampling point;

将所述采样点的多个相邻像素点确定为所述采样点的邻域集合；Determine a plurality of adjacent pixel points of the sampling point as a neighborhood set of the sampling point;

将所述采样点确定为更新后的参考点，将除去所述采样点和所述邻域集合的所述候选点集合，确定为更新后的候选点集合，重复执行所述针对所述候选点集合中的每一个候选点，确定所述候选点与所述参考点之间的直线距离的步骤，得到多个采样点和每个所述采样点对应的邻域集合。The sampling point is determined as an updated reference point, the candidate point set excluding the sampling point and the neighborhood set is determined as an updated candidate point set, and the step of determining the straight-line distance between the candidate point and the reference point for each candidate point in the candidate point set is repeatedly performed to obtain multiple sampling points and a neighborhood set corresponding to each sampling point.
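The iterative procedure above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation. The source says only that "adjacent" points of a sampling point form its neighborhood set, so the sketch assumes the `k` nearest remaining candidates; the parameters `num_samples`, `k`, and `start_index`, and the function name itself, are hypothetical.

```python
import math


def sample_with_neighborhoods(points, num_samples, k, start_index=0):
    """Select sampling points and a neighborhood set for each.

    points: list of (x, y, z) tuples.
    Returns a list of (sample_index, [neighbor_indices]) pairs.
    """
    reference = start_index
    candidates = [i for i in range(len(points)) if i != reference]
    result = []
    while candidates and len(result) < num_samples:
        # The candidate with the largest straight-line distance to the
        # current reference point becomes the next sampling point.
        sample = max(candidates,
                     key=lambda i: math.dist(points[i], points[reference]))
        candidates.remove(sample)
        # Assumption: the neighborhood set is the k nearest remaining
        # candidates of the sampling point.
        neighbors = sorted(
            candidates,
            key=lambda i: math.dist(points[i], points[sample]))[:k]
        result.append((sample, neighbors))
        # The sampling point becomes the updated reference point; the
        # neighborhood set is removed from the candidate set.
        taken = set(neighbors)
        candidates = [i for i in candidates if i not in taken]
        reference = sample
    return result
```

Because each new sampling point is the one farthest from the previous reference, consecutive sampling points tend to land in different regions of the cloud, and removing each neighborhood from the candidate set keeps the neighborhoods disjoint.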

在一个实施例中，所述目标二维分类模型中包括多头自注意力机制单元和多层感知机单元；所述基于所述数据特征向量序列对所述初始三维点云分类模型进行训练，以调整所述初始校准单元中的初始校准参数，得到目标三维点云分类模型，包括：In one embodiment, the target two-dimensional classification model includes a multi-head self-attention mechanism unit and a multi-layer perceptron unit; the initial three-dimensional point cloud classification model is trained based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit to obtain the target three-dimensional point cloud classification model, including:

将所述数据特征向量序列输入至所述初始三维点云分类模型，针对所述数据特征向量序列中的每一个数据特征向量，通过所述多头自注意力机制单元，确定所述数据特征向量对应的依赖特征向量；Inputting the data feature vector sequence into the initial three-dimensional point cloud classification model, and determining, for each data feature vector in the data feature vector sequence, a dependent feature vector corresponding to the data feature vector through the multi-head self-attention mechanism unit;

通过所述初始校准单元，确定所述依赖特征向量对应的修正特征向量；Determining, by means of the initial calibration unit, a modified feature vector corresponding to the dependent feature vector;

基于所述依赖特征向量和所述修正特征向量，通过所述多层感知机单元，确定数据特征向量对应的分类特征向量；Based on the dependent feature vector and the modified feature vector, determining a classification feature vector corresponding to the data feature vector through the multi-layer perceptron unit;

基于多个所述分类特征向量，对所述初始校准单元中的初始校准参数进行调整，得到目标三维点云分类模型。Based on the multiple classification feature vectors, the initial calibration parameters in the initial calibration unit are adjusted to obtain a target three-dimensional point cloud classification model.

在一个实施例中，所述初始校准参数包括初始降维矩阵和初始升维矩阵；所述通过所述初始校准单元，确定所述依赖特征向量对应的修正特征向量，包括：In one embodiment, the initial calibration parameters include an initial dimension reduction matrix and an initial dimension increase matrix; and determining the modified feature vector corresponding to the dependent feature vector through the initial calibration unit includes:

对所述依赖特征向量进行归一化处理，得到归一化特征向量；Normalizing the dependent feature vector to obtain a normalized feature vector;

基于所述初始降维矩阵，对所述归一化特征向量进行降维处理，得到降维特征向量；Based on the initial dimension reduction matrix, the normalized feature vector is subjected to dimension reduction processing to obtain a dimension reduction feature vector;

基于所述降维特征向量、激活函数和所述初始升维矩阵，确定所述依赖特征向量对应的修正特征向量。Based on the dimension reduction feature vector, the activation function and the initial dimension increase matrix, a modified feature vector corresponding to the dependent feature vector is determined.
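The three steps of the calibration unit (normalize, reduce dimension, then activate and increase dimension) can be sketched in plain Python. Several details are assumptions, because this passage does not fix them: normalization is taken to be layer normalization without learnable scale or shift, the activation function is taken to be ReLU, no bias terms are used, and the helper names `layer_norm`, `matvec`, and `calibrate` are hypothetical.

```python
import math


def layer_norm(vec, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def matvec(matrix, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def calibrate(dependent, w_down, w_up):
    """Map a dependent feature vector to a modified feature vector.

    w_down: r x d dimension reduction matrix (r < d).
    w_up:   d x r dimension increase matrix.
    """
    normalized = layer_norm(dependent)
    reduced = matvec(w_down, normalized)          # d -> r
    activated = [max(0.0, r) for r in reduced]    # ReLU (assumed)
    return matvec(w_up, activated)                # r -> d
```

With a reduced dimension r much smaller than d, the two matrices together hold 2·r·d values, which is why adjusting only these calibration parameters is far cheaper than retraining the whole two-dimensional model.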

在一个实施例中，所述获取二维图像对应的图像特征向量序列，包括：In one embodiment, the step of obtaining a sequence of image feature vectors corresponding to the two-dimensional image includes:

获取二维图像，将所述二维图像块划分为多个图像块；Acquire a two-dimensional image, and divide the two-dimensional image into a plurality of image blocks;

对所述图像块进行特征提取，得到所述图像块对应的图像特征向量；Performing feature extraction on the image block to obtain an image feature vector corresponding to the image block;

基于所述图像块在所述二维图像中的位置，确定所述图像块对应的位置顺序；Based on the positions of the image blocks in the two-dimensional image, determining the position order corresponding to the image blocks;

基于多个所述图像块对应的位置顺序，对多个所述图像块对应的图像特征向量进行排序，得到所述二维图像对应的图像特征向量序列。Based on the position order corresponding to the plurality of image blocks, the image feature vectors corresponding to the plurality of image blocks are sorted to obtain a sequence of image feature vectors corresponding to the two-dimensional image.
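As a rough illustration of the steps above, the following Python sketch splits an image into square blocks and emits one vector per block in row-major position order. The feature extractor is not specified at this point in the source, so the sketch simply flattens each block's pixel values as a stand-in; a trained model would more likely apply a learned projection. The function name and the `patch_size` parameter are hypothetical.

```python
def image_to_patch_sequence(image, patch_size):
    """Split a 2D image (list of rows) into a sequence of patch vectors.

    Patches are visited left to right, top to bottom, so the position of
    each vector in the sequence matches the position of its patch.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    sequence = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Placeholder feature extraction (assumption): flatten the
            # patch's pixel values into a single vector.
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            sequence.append(patch)
    return sequence
```

For a 4x4 image and a patch size of 2, this produces four vectors of length four, ordered top-left, top-right, bottom-left, bottom-right.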

第二方面，本申请还提供了一种基于二维分类模型的三维点云分类模型训练装置。所述装置包括：In a second aspect, the present application also provides a three-dimensional point cloud classification model training device based on a two-dimensional classification model. The device comprises:

第一训练模块，用于获取二维图像对应的图像特征向量序列，基于所述图像特征向量序列对初始二维分类模型进行训练，得到目标二维分类模型；A first training module is used to obtain an image feature vector sequence corresponding to the two-dimensional image, and train an initial two-dimensional classification model based on the image feature vector sequence to obtain a target two-dimensional classification model;

组合模块，用于基于所述目标二维分类模型和初始校准单元，得到初始三维点云分类模型；A combination module, used to obtain an initial three-dimensional point cloud classification model based on the target two-dimensional classification model and the initial calibration unit;

第二训练模块，用于获取三维点云对应的数据特征向量序列，基于所述数据特征向量序列对所述初始三维点云分类模型进行训练，以调整所述初始校准单元中的初始校准参数，得到目标三维点云分类模型；所述目标三维点云分类模型用于基于三维点云对应的数据特征向量序列确定所述三维点云对应的物体类别。The second training module is used to obtain a data feature vector sequence corresponding to the three-dimensional point cloud, and train the initial three-dimensional point cloud classification model based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit to obtain a target three-dimensional point cloud classification model; the target three-dimensional point cloud classification model is used to determine the object category corresponding to the three-dimensional point cloud based on the data feature vector sequence corresponding to the three-dimensional point cloud.

第三方面，本申请还提供了一种计算机设备，包括存储器和处理器，所述存储器存储有计算机程序，所述处理器执行所述计算机程序时实现第一方面中任一项所述方法的步骤。In a third aspect, the present application further provides a computer device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of any one of the methods described in the first aspect when executing the computer program.

第四方面，本申请还提供了一种计算机可读存储介质，其上存储有计算机程序，所述计算机程序被处理器执行时实现第一方面中任一项所述方法的步骤。In a fourth aspect, the present application further provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of any one of the methods described in the first aspect.

第五方面，本申请还提供了一种计算机程序产品，包括计算机程序，该计算机程序被处理器执行时实现第一方面中任一项所述方法的步骤。In a fifth aspect, the present application further provides a computer program product, comprising a computer program, which, when executed by a processor, implements the steps of any one of the methods described in the first aspect.

上述基于二维分类模型的三维点云分类模型训练方法，获取二维图像对应的图像特征向量序列，基于图像特征向量序列对初始二维分类模型进行训练，得到目标二维分类模型；基于目标二维分类模型和初始校准单元，得到初始三维点云分类模型；获取三维点云对应的数据特征向量序列，基于数据特征向量序列对初始三维点云分类模型进行训练，以调整初始校准单元中的初始校准参数，得到目标三维点云分类模型；目标三维点云分类模型用于基于三维点云对应的数据特征向量序列确定三维点云对应的物体类别。通过二维图像对应的图像特征向量序列对初始二维分类模型进行训练，得到目标二维分类模型，提高了目标二维分类模型的准确性；若直接将三维点云对应的数据特征向量序列输入至目标二维分类模型，以此确定三维点云对应的物体类别，由于二维图像和三维点云的结构差异，直接使用目标二维分类模型确定三维点云对应的物体类别，会导致三维点云对应的物体类别的识别准确性较低，将目标二维分类模型和初始校准单元进行组合，初始校准单元生成的修正特征向量用于对二维分类模型中Transformer块的输出向量进行修正，从而提高了三维点云对应的物体类别的准确性；在此基础上，使用三维点云对应的数据特征向量序列对初始三维点云分类模型的初始校准单元中的初始校准参数进行调整，进一步提高了目标三维点云分类模型确定三维点云所对应类别的准确性。The above-mentioned three-dimensional point cloud classification model training method based on the two-dimensional classification model obtains the image feature vector sequence corresponding to the two-dimensional image, trains the initial two-dimensional classification model based on the image feature vector sequence, and obtains the target two-dimensional classification model; obtains the initial three-dimensional point cloud classification model based on the target two-dimensional classification model and the initial calibration unit; obtains the data feature vector sequence corresponding to the three-dimensional point cloud, and trains the initial three-dimensional point cloud classification model based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit to obtain the target three-dimensional point cloud classification model; the target three-dimensional point cloud classification model is used to determine the object category corresponding to the three-dimensional point cloud based on the data feature vector sequence corresponding to the three-dimensional point cloud. 
The initial two-dimensional classification model is trained by the image feature vector sequence corresponding to the two-dimensional image to obtain the target two-dimensional classification model, thereby improving the accuracy of the target two-dimensional classification model; if the data feature vector sequence corresponding to the three-dimensional point cloud is directly input into the target two-dimensional classification model to determine the object category corresponding to the three-dimensional point cloud, due to the structural difference between the two-dimensional image and the three-dimensional point cloud, directly using the target two-dimensional classification model to determine the object category corresponding to the three-dimensional point cloud will result in low recognition accuracy of the object category corresponding to the three-dimensional point cloud. The target two-dimensional classification model and the initial calibration unit are combined, and the corrected feature vector generated by the initial calibration unit is used to correct the output vector of the Transformer block in the two-dimensional classification model, thereby improving the accuracy of the object category corresponding to the three-dimensional point cloud; on this basis, the data feature vector sequence corresponding to the three-dimensional point cloud is used to adjust the initial calibration parameters in the initial calibration unit of the initial three-dimensional point cloud classification model, thereby further improving the accuracy of the target three-dimensional point cloud classification model in determining the category corresponding to the three-dimensional point cloud.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1为一个实施例中基于二维分类模型的三维点云分类模型训练方法的应用环境图；FIG. 1 is a diagram showing an application environment of a three-dimensional point cloud classification model training method based on a two-dimensional classification model in one embodiment;

图2为一个实施例中基于二维分类模型的三维点云分类模型训练方法的流程示意图；FIG. 2 is a schematic flow chart of a method for training a three-dimensional point cloud classification model based on a two-dimensional classification model in one embodiment;

图3为一个实施例中目标二维分类模型的示意图；FIG. 3 is a schematic diagram of a target two-dimensional classification model in one embodiment;

图4为一个实施例中初始三维点云分类模型的示意图；FIG. 4 is a schematic diagram of an initial three-dimensional point cloud classification model in one embodiment;

图5为一个实施例中数据特征向量序列确定步骤的流程示意图；FIG. 5 is a schematic flow chart of a step of determining a data feature vector sequence in one embodiment;

图6为一个实施例中采样点和邻域集合确定步骤的流程示意图；FIG. 6 is a schematic flow chart of a step of determining a sampling point and a neighborhood set in one embodiment;

图7为一个实施例中初始校准单元训练步骤的流程示意图；FIG. 7 is a schematic flow chart of an initial calibration unit training step in one embodiment;

图8为一个实施例中基于二维分类模型的三维点云分类模型训练装置的结构框图；FIG. 8 is a structural block diagram of a three-dimensional point cloud classification model training device based on a two-dimensional classification model in one embodiment;

图9为一个实施例中计算机设备的内部结构图。FIG. 9 is a diagram showing the internal structure of a computer device in one embodiment.

具体实施方式 Detailed Description

为了使本申请的目的、技术方案及优点更加清楚明白，以下结合附图及实施例，对本申请进行进一步详细说明。应当理解，此处描述的具体实施例仅仅用以解释本申请，并不用于限定本申请。In order to make the purpose, technical solution and advantages of the present application more clearly understood, the present application is further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application and are not used to limit the present application.

本申请实施例提供的基于二维分类模型的三维点云分类模型训练方法，可以应用于如图1所示的应用环境中。其中，终端102通过网络与服务器104进行通信。数据存储系统可以存储服务器104需要处理的数据。数据存储系统可以集成在服务器104上，也可以放在云上或其他网络服务器上。终端和服务器均可单独用于执行本申请实施例中提供的基于二维分类模型的三维点云分类模型训练方法。终端和服务器也可协同用于执行本申请实施例中提供的基于二维分类模型的三维点云分类模型训练方法。例如，终端获取二维图像对应的图像特征向量序列，基于图像特征向量序列对初始二维分类模型进行训练，得到目标二维分类模型；基于目标二维分类模型和初始校准单元，得到初始三维点云分类模型；获取三维点云对应的数据特征向量序列，基于数据特征向量序列对初始三维点云分类模型进行训练，以调整初始校准单元中的初始校准参数，得到目标三维点云分类模型；目标三维点云分类模型用于基于三维点云对应的数据特征向量序列确定三维点云对应的物体类别。其中，终端102可以但不限于是各种个人计算机、笔记本电脑、智能手机、平板电脑、物联网设备和便携式可穿戴设备，物联网设备可为智能音箱、智能电视、智能空调、智能车载设备等。便携式可穿戴设备可为智能手表、智能手环、头戴设备等。服务器104可以用独立的服务器或者是多个服务器组成的服务器集群来实现。The three-dimensional point cloud classification model training method based on the two-dimensional classification model provided in the embodiment of the present application can be applied in the application environment shown in FIG. 1. Among them, the terminal 102 communicates with the server 104 through the network. The data storage system can store the data that the server 104 needs to process. The data storage system can be integrated on the server 104, or it can be placed on the cloud or other network servers. Both the terminal and the server can be used alone to execute the three-dimensional point cloud classification model training method based on the two-dimensional classification model provided in the embodiment of the present application. The terminal and the server can also be used in collaboration to execute the three-dimensional point cloud classification model training method based on the two-dimensional classification model provided in the embodiment of the present application. 
For example, the terminal obtains the image feature vector sequence corresponding to the two-dimensional image, trains the initial two-dimensional classification model based on the image feature vector sequence, and obtains the target two-dimensional classification model; obtains the initial three-dimensional point cloud classification model based on the target two-dimensional classification model and the initial calibration unit; obtains the data feature vector sequence corresponding to the three-dimensional point cloud, and trains the initial three-dimensional point cloud classification model based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit to obtain the target three-dimensional point cloud classification model; the target three-dimensional point cloud classification model is used to determine the object category corresponding to the three-dimensional point cloud based on the data feature vector sequence corresponding to the three-dimensional point cloud. The terminal 102 may be, but is not limited to, various personal computers, laptops, smart phones, tablet computers, IoT devices, and portable wearable devices. The IoT devices may be smart speakers, smart TVs, smart air conditioners, smart car-mounted devices, etc. The portable wearable devices may be smart watches, smart bracelets, head-mounted devices, etc. The server 104 may be implemented as an independent server or a server cluster consisting of multiple servers.

在一个实施例中，如图2所示，提供了一种基于二维分类模型的三维点云分类模型训练方法，本实施例以该方法应用于计算机设备为例进行说明，计算机设备可以是终端或服务器，包括步骤202至步骤206。In one embodiment, as shown in FIG. 2, a three-dimensional point cloud classification model training method based on a two-dimensional classification model is provided. This embodiment is described by taking the method applied to a computer device as an example; the computer device may be a terminal or a server. The method includes steps 202 to 206.

步骤202，获取二维图像对应的图像特征向量序列，基于图像特征向量序列对初始二维分类模型进行训练，得到目标二维分类模型。Step 202: Obtain an image feature vector sequence corresponding to the two-dimensional image, and train an initial two-dimensional classification model based on the image feature vector sequence to obtain a target two-dimensional classification model.

其中，二维图像是指二维平面上的图像，二维图像由多个像素点构成，每个像素点具有对应的二维坐标和像素值。图像特征向量序列是指由多个图像特征向量按顺序排列组成的序列，图像特征向量序列用于表征二维图像的图像特征，图像特征向量序列中的图像特征向量的数量与组成二维图像的图像块的数量相等，即图像特征向量与图像块一一对应，图像特征向量表征图像块的局部图像特征，图像特征向量在图像特征向量序列中的位置顺序与所对应的图像块在二维图像中的位置顺序相同。初始二维分类模型是指未完成训练，用于对二维图像进行分类的神经网络模型，初始二维分类模型的输入为二维图像对应的图像特征向量序列，输出为二维图像的类别。目标二维分类模型是指完成训练，用于对二维图像进行分类的神经网络模型，目标二维分类模型可以为二维视觉模型（VisualTransformers，基于Transformer架构的视觉处理模型），由多个Transformer块组成。Among them, a two-dimensional image refers to an image on a two-dimensional plane. The two-dimensional image is composed of multiple pixels, and each pixel has a corresponding two-dimensional coordinate and pixel value. An image feature vector sequence refers to a sequence composed of multiple image feature vectors arranged in order. The image feature vector sequence is used to characterize the image features of the two-dimensional image. The number of image feature vectors in the image feature vector sequence is equal to the number of image blocks that constitute the two-dimensional image, that is, the image feature vectors correspond to the image blocks one by one. The image feature vectors characterize the local image features of the image blocks. The position order of the image feature vectors in the image feature vector sequence is the same as the position order of the corresponding image blocks in the two-dimensional image. The initial two-dimensional classification model refers to a neural network model that has not been trained and is used to classify two-dimensional images. The input of the initial two-dimensional classification model is the image feature vector sequence corresponding to the two-dimensional image, and the output is the category of the two-dimensional image. The target two-dimensional classification model refers to a neural network model that has been trained and is used to classify two-dimensional images. 
The target two-dimensional classification model can be a two-dimensional vision model (Visual Transformers, a vision processing model based on the Transformer architecture) composed of multiple Transformer blocks.

示例性地，计算机设备获取多个二维图像和每个二维图像对应的标注类别，分别确定每一个二维图像对应的图像特征向量序列；从多个二维图像中选取一个训练图像，将训练图像对应的图像特征向量序列和标注类别输入至初始二维分类模型，对初始二维分类模型进行训练，得到更新二维分类模型；将更新二维分类模型确定为初始二维分类模型，从剩余的二维图像中选取一个训练图像，重复执行上述将训练图像对应的图像特征向量序列和标注类别输入至初始二维分类模型，对初始二维分类模型进行训练，得到更新二维分类模型的步骤，直至达到第一训练停止条件，得到目标二维分类模型。第一训练停止条件可以为第一训练次数等于第一训练次数阈值，或者，第一误差损失小于第一误差损失阈值等，第一训练停止条件可以根据实际需求进行设置。Exemplarily, a computer device obtains multiple two-dimensional images and the annotation categories corresponding to each two-dimensional image, and determines the image feature vector sequence corresponding to each two-dimensional image; selects a training image from the multiple two-dimensional images, inputs the image feature vector sequence and the annotation category corresponding to the training image into the initial two-dimensional classification model, trains the initial two-dimensional classification model, and obtains an updated two-dimensional classification model; determines the updated two-dimensional classification model as the initial two-dimensional classification model, selects a training image from the remaining two-dimensional images, and repeatedly performs the steps of inputting the image feature vector sequence and the annotation category corresponding to the training image into the initial two-dimensional classification model, training the initial two-dimensional classification model, and obtaining an updated two-dimensional classification model, until the first training stop condition is reached, and the target two-dimensional classification model is obtained. The first training stop condition can be that the first training number is equal to the first training number threshold, or the first error loss is less than the first error loss threshold, etc. The first training stop condition can be set according to actual needs.
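The iterate-until-stopped pattern described above can be sketched as a small driver loop. This is only a schematic: `train_step` is a hypothetical callable that performs one update of the model on a single sample and returns the resulting error loss, and `max_steps` and `loss_threshold` stand in for the first training count threshold and the first error loss threshold.

```python
def train_until_stopped(samples, train_step, max_steps, loss_threshold):
    """Train on one sample at a time until a stop condition is met.

    samples: iterable of (feature_sequence, label) pairs.
    Returns (number_of_steps_performed, last_loss).
    """
    steps = 0
    loss = float("inf")
    for features, label in samples:
        # One update: the updated model becomes the model used for the
        # next training sample.
        loss = train_step(features, label)
        steps += 1
        # Stop when the step count reaches its threshold or the loss
        # falls below its threshold, whichever happens first.
        if steps == max_steps or loss < loss_threshold:
            break
    return steps, loss
```

The loop also ends when the samples run out, which corresponds to having no remaining training images to select.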

在一个实施例中，目标二维分类模型包括多个Transformer块和一个分类头，一个Transformer块中包括如图3所示的LN（Layer Normalization，层归一化）单元、MSA（Multi-Head Self-Attention，多头自注意力机制）单元和MLP（Multi-Layer Perceptron，多层感知机）单元。例如，二维图像对应的图像特征向量序列为 $\{x_1, x_2, \ldots, x_m\}$，针对图像特征向量序列中的每一个图像特征向量 $x_i$，经过第 $l$ 个Transformer块的计算过程如下所示：In one embodiment, the target two-dimensional classification model includes multiple Transformer blocks and a classification head. Each Transformer block includes an LN (Layer Normalization) unit, an MSA (Multi-Head Self-Attention) unit, and an MLP (Multi-Layer Perceptron) unit, as shown in FIG. 3. For example, let the image feature vector sequence corresponding to the two-dimensional image be $\{x_1, x_2, \ldots, x_m\}$. For each image feature vector $x_i$ in the sequence, the computation performed by the $l$-th Transformer block is as follows:

$$\hat{x}_i^{l} = \mathrm{MSA}\left(x_i^{l-1}\right) + x_i^{l-1}$$ 公式（1） Formula (1)

$$x_i^{l} = \mathrm{MLP}\left(\mathrm{LN}\left(\hat{x}_i^{l}\right)\right) + \hat{x}_i^{l}$$ 公式（2） Formula (2)

其中，$l$ 为经过的Transformer块的数量，可以理解为，经过的神经网络的层数；m为图像特征向量序列中图像特征向量的数量；$x_i^{l-1}$ 为第i个图像特征向量经过 $l-1$ 个Transformer块得到的分类特征向量；$\hat{x}_i^{l}$ 为 $x_i^{l-1}$ 经过第 $l$ 个Transformer块中的MSA单元得到的依赖特征向量；$x_i^{l}$ 为 $\hat{x}_i^{l}$ 经过第 $l$ 个Transformer块中的LN单元和MLP单元得到的分类特征向量。Here, $l$ is the number of Transformer blocks passed, which can be understood as the number of neural network layers passed; $m$ is the number of image feature vectors in the image feature vector sequence; $x_i^{l-1}$ is the classification feature vector obtained after the $i$-th image feature vector passes through $l-1$ Transformer blocks; $\hat{x}_i^{l}$ is the dependent feature vector obtained by passing $x_i^{l-1}$ through the MSA unit in the $l$-th Transformer block; and $x_i^{l}$ is the classification feature vector obtained by passing $\hat{x}_i^{l}$ through the LN unit and the MLP unit in the $l$-th Transformer block.

步骤204，基于目标二维分类模型和初始校准单元，得到初始三维点云分类模型。Step 204: Obtain an initial three-dimensional point cloud classification model based on the target two-dimensional classification model and the initial calibration unit.

其中，初始校准单元是指未完成训练，用于生成修正特征向量的单元。初始校准单元的输入为依赖特征向量，输出为修正特征向量。初始校准单元中可以包括Dec（Decoder，解码器）、ReLU（Rectified Linear Unit，修正线性单元）和Enc（Encoder，编码器），初始校准单元的Dec中包括初始升维矩阵，Enc中包括初始降维矩阵。The initial calibration unit refers to a unit that has not completed training and is used to generate a modified feature vector. The input of the initial calibration unit is the dependent feature vector, and the output is the modified feature vector. The initial calibration unit may include Dec (Decoder), ReLU (Rectified Linear Unit), and Enc (Encoder). The Dec of the initial calibration unit includes an initial dimension increase matrix, and Enc includes an initial dimension reduction matrix.

示例性地，计算机设备获取初始校准单元，在目标二维分类模型的每个Transformer块上添加一个初始校准单元，得到初始三维点云分类模型。其中，初始校准单元的输入为Transformer块中MSA单元的输入特征向量和输出特征向量之和，初始校准单元的输出用于对Transformer块输出的初始分类特征向量进行修正，以得到分类特征向量。Exemplarily, the computer device obtains an initial calibration unit, and adds an initial calibration unit to each Transformer block of the target two-dimensional classification model to obtain an initial three-dimensional point cloud classification model. The input of the initial calibration unit is the sum of the input feature vector and the output feature vector of the MSA unit in the Transformer block, and the output of the initial calibration unit is used to correct the initial classification feature vector output by the Transformer block to obtain a classification feature vector.

在一个实施例中，初始三维点云分类模型如图4所示，包括目标二维分类模型和初始校准单元，初始校准单元中包括Dec（Decoder，解码器）、ReLU（Rectified Linear Unit，修正线性单元）和Enc（Encoder，编码器），初始校准单元的Dec中包括初始升维矩阵，Enc中包括初始降维矩阵，初始校准单元的输入为Transformer块中MSA单元的输入特征向量和输出特征向量之和，初始校准单元的输出用于对Transformer块输出的初始分类特征向量进行修正，以得到分类特征向量。In one embodiment, the initial three-dimensional point cloud classification model is shown in Figure 4, including a target two-dimensional classification model and an initial calibration unit, the initial calibration unit includes Dec (Decoder), ReLU (Rectified Linear Unit) and Enc (Encoder), the Dec of the initial calibration unit includes an initial dimensionality increase matrix, Enc includes an initial dimensionality reduction matrix, the input of the initial calibration unit is the sum of the input feature vector and the output feature vector of the MSA unit in the Transformer block, and the output of the initial calibration unit is used to correct the initial classification feature vector output by the Transformer block to obtain a classification feature vector.
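How a calibration unit attaches to one Transformer block can be sketched as below. The source states that the calibration unit takes the sum of the MSA unit's input and output and that its output corrects the block's initial classification feature vector; the sketch assumes that this correction is an element-wise addition and that the MLP branch has a residual connection. The callables `msa`, `mlp_branch` (LN followed by MLP), and `calibration_unit` are hypothetical stand-ins for the trained components.

```python
def calibrated_block(sequence, msa, mlp_branch, calibration_unit):
    """Run one Transformer block with a calibration unit attached.

    sequence: list of feature vectors (lists of floats).
    msa: maps the whole sequence to a sequence of the same shape.
    mlp_branch, calibration_unit: map one vector to one vector.
    """
    attended = msa(sequence)
    outputs = []
    for x, a in zip(sequence, attended):
        # Dependent feature vector: MSA input plus MSA output.
        dependent = [xi + ai for xi, ai in zip(x, a)]
        # Initial classification feature vector of the frozen 2D block
        # (residual connection assumed).
        initial = [d + m for d, m in zip(dependent, mlp_branch(dependent))]
        # Modified feature vector from the calibration unit, applied as
        # an additive correction (assumption).
        correction = calibration_unit(dependent)
        outputs.append([i + c for i, c in zip(initial, correction)])
    return outputs
```

With this wiring, a calibration unit that outputs all zeros leaves the behavior of the two-dimensional model unchanged, so training only has to learn the difference needed for point cloud inputs.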

步骤206，获取三维点云对应的数据特征向量序列，基于数据特征向量序列对初始三维点云分类模型进行训练，以调整初始校准单元中的初始校准参数，得到目标三维点云分类模型；目标三维点云分类模型用于基于三维点云对应的数据特征向量序列确定三维点云对应的物体类别。Step 206: Obtain the data feature vector sequence corresponding to the three-dimensional point cloud, and train the initial three-dimensional point cloud classification model based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit to obtain the target three-dimensional point cloud classification model; the target three-dimensional point cloud classification model is used to determine the object category corresponding to the three-dimensional point cloud based on the data feature vector sequence corresponding to the three-dimensional point cloud.

其中，三维点云是指三维图像点组成的集合，三维点云中包括多个无序的三维图像点，每个三维图像点对应一个三维位置坐标和至少一个特征维度的特征值，特征维度可以为亮度，特征值可以为亮度值。三维点云可以由智能设备上的传感器采集得到，也可以由相机传感器采集得到。数据特征向量序列是指由多个数据特征向量按照顺序组成的序列，数据特征向量序列表征三维点云的图像特征，数据特征向量的数量与采样点的数量相等，即数据特征向量与采样点一一对应，数据特征向量表征采样点所对应的邻域集合中的多个相邻像素点的局部图像特征。初始校准参数是指初始校准单元中的未经过调整的参数，初始校准参数包括但不限于Enc中的初始降维矩阵和Dec中的初始升维矩阵。物体类别是指三维点云所对应三维物体的类别，物体类别可以为物体的种类，例如，物体类别为猫、狗等。Among them, a three-dimensional point cloud refers to a set of three-dimensional image points, which includes multiple unordered three-dimensional image points, each of which corresponds to a three-dimensional position coordinate and a feature value of at least one feature dimension, the feature dimension can be brightness, and the feature value can be a brightness value. The three-dimensional point cloud can be collected by a sensor on a smart device or by a camera sensor. A data feature vector sequence refers to a sequence composed of multiple data feature vectors in order, and the data feature vector sequence represents the image features of the three-dimensional point cloud. The number of data feature vectors is equal to the number of sampling points, that is, the data feature vectors correspond to the sampling points one by one, and the data feature vectors represent the local image features of multiple adjacent pixels in the neighborhood set corresponding to the sampling points. The initial calibration parameters refer to the unadjusted parameters in the initial calibration unit, and the initial calibration parameters include but are not limited to the initial dimension reduction matrix in Enc and the initial dimension increase matrix in Dec. The object category refers to the category of the three-dimensional object corresponding to the three-dimensional point cloud, and the object category can be the type of the object, for example, the object category is cat, dog, etc.

Exemplarily, the computer device obtains multiple three-dimensional point clouds and the category label corresponding to each of them, and determines the data feature vector sequence corresponding to each three-dimensional point cloud. It takes one training point cloud from the multiple three-dimensional point clouds, inputs the data feature vector sequence and category label of the training point cloud into the initial three-dimensional point cloud classification model, and trains the initial calibration unit in that model so as to adjust its initial calibration parameters, obtaining updated calibration parameters. The updated calibration parameters are then taken as the initial calibration parameters, another training point cloud is taken from the remaining three-dimensional point clouds, and the input, training and adjustment steps are repeated until a second training stop condition is reached, at which point the target three-dimensional point cloud classification model is obtained. The target three-dimensional point cloud classification model consists of the target two-dimensional classification model and the target calibration unit. The second training stop condition may be that the second number of training iterations equals a second iteration threshold, or that the second error loss is less than a second error loss threshold, and it can be set according to actual needs.
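As an illustration only, the iteration described above can be sketched in Python. The names `train_calibration`, `step`, `max_steps` and `loss_threshold` are hypothetical and do not appear in the original text; `step` stands in for whatever forward pass and parameter update is actually used, and it is assumed to touch only the calibration parameters while the two-dimensional backbone stays fixed.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

FeatureSequence = Sequence[Sequence[float]]
CalibrationParams = Dict[str, float]  # hypothetical container for the Enc/Dec parameters
# A step receives the current calibration parameters, one data feature vector
# sequence and its category label, and returns (updated parameters, error loss).
StepFn = Callable[[CalibrationParams, FeatureSequence, int],
                  Tuple[CalibrationParams, float]]


def train_calibration(
    samples: Iterable[Tuple[FeatureSequence, int]],
    params: CalibrationParams,
    step: StepFn,
    max_steps: int,
    loss_threshold: float,
) -> Tuple[CalibrationParams, int]:
    """Feed training point clouds one at a time and update only `params`.

    Training stops when the number of iterations reaches `max_steps` or the
    loss falls below `loss_threshold` (the two stop conditions named in the
    text), or when the training point clouds are exhausted.
    """
    steps_done = 0
    for features, label in samples:
        # The updated parameters become the "initial" parameters of the next round.
        params, loss = step(params, features, label)
        steps_done += 1
        if steps_done >= max_steps or loss < loss_threshold:
            break
    return params, steps_done
```

Passing the update rule in as a callable keeps the sketch independent of any particular optimizer or loss function, neither of which is specified in the text.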

In one embodiment, after the target three-dimensional point cloud classification model is obtained, the method further includes:

obtaining a target three-dimensional point cloud, determining the target data feature vector sequence corresponding to the target three-dimensional point cloud, and inputting the target data feature vector sequence into the target three-dimensional point cloud classification model to obtain the target classification category corresponding to the target three-dimensional point cloud.

In the above method for training a three-dimensional point cloud classification model based on a two-dimensional classification model, the initial two-dimensional classification model is trained with the image feature vector sequences corresponding to two-dimensional images to obtain the target two-dimensional classification model, which improves the accuracy of the target two-dimensional classification model. If the data feature vector sequence corresponding to a three-dimensional point cloud were input directly into the target two-dimensional classification model to determine the object category, the structural difference between two-dimensional images and three-dimensional point clouds would lead to low recognition accuracy. The target two-dimensional classification model is therefore combined with the initial calibration unit: the corrected feature vector generated by the initial calibration unit is used to correct the output vector of each Transformer block in the two-dimensional classification model, which improves the accuracy of the object category determined for the three-dimensional point cloud. On this basis, the data feature vector sequences corresponding to three-dimensional point clouds are used to adjust the initial calibration parameters in the initial calibration unit, which further improves the accuracy with which the target three-dimensional point cloud classification model determines the category of a three-dimensional point cloud.

In one embodiment, as shown in FIG. 5, obtaining the data feature vector sequence corresponding to a three-dimensional point cloud includes:

Step 502: obtain the three-dimensional point cloud, and determine multiple sampling points of the three-dimensional point cloud and the neighborhood set of each sampling point based on the straight-line distances between three-dimensional image points in the point cloud.

Here, the straight-line distance is the distance between two three-dimensional image points. The neighborhood set is the set formed by multiple three-dimensional image points adjacent to a sampling point. For example, let the three-dimensional point cloud be $P = \{p_1, \dots, p_N\}$, where $N$ is the number of three-dimensional image points in the point cloud and $p_N$ is the $N$-th three-dimensional image point. The set of sampling points is a subset of the three-dimensional point cloud, and the neighborhood set of each sampling point contains $k$ adjacent three-dimensional image points; the neighborhood sets of all sampling points together form the set of neighborhood sets.

Exemplarily, the computer device obtains the three-dimensional point cloud, determines one sampling point from it, computes the straight-line distance between that sampling point and every three-dimensional image point, and compares these distances to obtain the maximum straight-line distance. It then obtains a preset number of sampling points and, based on this preset number and the maximum straight-line distance, determines multiple target straight-line distances, the number of which equals the preset number of sampling points. For each target straight-line distance, the three-dimensional image point whose straight-line distance differs least from that target distance is determined as a sampling point, giving multiple sampling points. For each sampling point, the adjacent three-dimensional image points are determined based on the three-dimensional position coordinates of the sampling point, and a preset number of adjacent three-dimensional image points are determined as the neighborhood set of that sampling point.
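A minimal Python sketch of the distance-based selection just described is given here for illustration. The text does not say how the target straight-line distances are derived from the maximum distance, so this sketch assumes they are evenly spaced between zero and the maximum; that spacing, the function name `sample_by_target_distances` and the parameter `anchor_index` are assumptions, not part of the original.

```python
import math
from typing import List, Sequence


def sample_by_target_distances(
    points: Sequence[Sequence[float]],
    num_samples: int,
    anchor_index: int = 0,
) -> List[int]:
    """Return indices of points whose distance to an anchor best matches
    a set of target distances.

    Assumption: targets are evenly spaced in (0, d_max], which the source
    text does not specify.
    """
    anchor = points[anchor_index]
    # Straight-line (Euclidean) distance from the anchor to every point.
    dists = [math.dist(anchor, p) for p in points]
    d_max = max(dists)
    targets = [d_max * (j + 1) / num_samples for j in range(num_samples)]
    chosen = []
    for target in targets:
        # Pick the point whose distance differs least from this target.
        best = min(range(len(points)), key=lambda i: abs(dists[i] - target))
        chosen.append(best)
    return chosen
```

With a sparse or clustered point cloud, two targets may select the same point; a practical implementation would need to decide how to handle such duplicates.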

Step 504: for each neighborhood set, perform feature extraction on the neighborhood set to obtain the data feature vector corresponding to the neighborhood set.

Here, the data feature vector is a vector representing the local three-dimensional image features of the adjacent three-dimensional image points in the neighborhood set. Feature extraction on a neighborhood set may use a point cloud feature extraction model, which may be one of Point_Embed, PointNet, PointNeXt, PointMLP and the like. For example, Point_Embed is used to extract the data feature vector corresponding to the $N_S$-th neighborhood set.

Exemplarily, for each neighborhood set, the computer device inputs the neighborhood set into the point cloud feature extraction model, the model outputs a data feature vector, and the computer device thereby obtains the data feature vector corresponding to that neighborhood set.

Step 506: sort the multiple neighborhood sets to obtain the arrangement order corresponding to each neighborhood set.

Here, the arrangement order is the position of a neighborhood set among the multiple neighborhood sets, and it characterizes the position of the neighborhood set in the three-dimensional image corresponding to the three-dimensional point cloud.

Exemplarily, the computer device sorts the multiple neighborhood sets using the Morton sorting method to obtain the arrangement order corresponding to each neighborhood set. For example, the Morton sorting method is applied to the multiple neighborhood sets to obtain their sorted sequence, and the arrangement order of each neighborhood set is then read from the sorted sequence.
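Morton (Z-order) sorting is commonly implemented by quantizing each coordinate to an integer grid and interleaving the bits of the three integers. The sketch below illustrates that general technique in Python. It assumes each neighborhood set is represented by one 3D coordinate (for instance its sampling point) and that 10 bits per axis suffice; both choices, and the names `morton_code` and `morton_order`, are assumptions rather than details given in the text.

```python
from typing import List, Sequence, Tuple


def _spread_bits(value: int, bits: int) -> int:
    """Place bit b of `value` at position 3*b, leaving two zero bits between."""
    out = 0
    for b in range(bits):
        out |= ((value >> b) & 1) << (3 * b)
    return out


def morton_code(cell: Tuple[int, int, int], bits: int = 10) -> int:
    """Interleave the bits of integer grid coordinates (x, y, z)."""
    x, y, z = cell
    return (_spread_bits(x, bits)
            | (_spread_bits(y, bits) << 1)
            | (_spread_bits(z, bits) << 2))


def morton_order(centers: Sequence[Sequence[float]], bits: int = 10) -> List[int]:
    """Return the indices of `centers` sorted by ascending Morton code."""
    lo = [min(c[a] for c in centers) for a in range(3)]
    hi = [max(c[a] for c in centers) for a in range(3)]
    scale = (1 << bits) - 1

    def quantize(c: Sequence[float]) -> Tuple[int, int, int]:
        # Map each axis to an integer in [0, 2**bits - 1]; a flat axis maps to 0.
        return tuple(
            int((c[a] - lo[a]) / (hi[a] - lo[a]) * scale) if hi[a] > lo[a] else 0
            for a in range(3)
        )

    codes = [morton_code(quantize(c), bits) for c in centers]
    return sorted(range(len(centers)), key=lambda i: codes[i])
```

Points that are close on the resulting curve tend to be close in space, which is why this ordering can stand in for the raster order of image patches.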

Step 508: based on the arrangement orders corresponding to the multiple neighborhood sets, sort the data feature vectors corresponding to the multiple neighborhood sets to obtain the data feature vector sequence corresponding to the three-dimensional point cloud.

Exemplarily, based on the arrangement orders corresponding to the multiple neighborhood sets, the computer device sorts the corresponding data feature vectors in a preset order to obtain the data feature vector sequence corresponding to the three-dimensional point cloud. The preset order may be ascending or descending.
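The reordering in step 508 can be sketched as follows. The function name `order_features` and the representation of the arrangement order as one integer rank per neighborhood set are assumptions made for illustration.

```python
from typing import List, Sequence


def order_features(
    features: Sequence[Sequence[float]],
    ranks: Sequence[int],
    descending: bool = False,
) -> List[Sequence[float]]:
    """Arrange feature vectors by the rank of their neighborhood set.

    features[i] is the data feature vector of neighborhood set i and
    ranks[i] is that set's arrangement order.
    """
    indices = sorted(range(len(features)), key=lambda i: ranks[i], reverse=descending)
    return [features[i] for i in indices]
```

For example, `order_features([[1.0], [2.0], [3.0]], [2, 0, 1])` places the second vector first because its neighborhood set has the smallest rank.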

In this embodiment, the multiple sampling points of the three-dimensional point cloud and the neighborhood set of each sampling point are determined from the straight-line distances between three-dimensional image points, and the multiple neighborhood sets are used to represent the point cloud. Feature extraction on each neighborhood set yields a data feature vector representing the features of that set. The data feature vectors are then sorted according to the arrangement orders of the neighborhood sets to obtain the data feature vector sequence corresponding to the three-dimensional point cloud. The order of the data feature vectors in the sequence reflects the positions of the neighborhood sets in three-dimensional space, so the data feature vector sequence represents the object features of the three-dimensional object to which the three-dimensional image points belong.

In one embodiment, as shown in FIG. 6, determining the multiple sampling points of the three-dimensional point cloud and the neighborhood set of each sampling point based on the straight-line distances between three-dimensional image points includes:

Step 602: determine one three-dimensional image point in the three-dimensional point cloud as the reference point, and determine the three-dimensional point cloud with the reference point removed as the candidate point set.

Here, the candidate point set is the set of three-dimensional image points other than the reference point.

Exemplarily, the computer device randomly selects one three-dimensional image point from the three-dimensional point cloud as the reference point and takes the point cloud with the reference point removed as the candidate point set.

Step 604: for each candidate point in the candidate point set, determine the straight-line distance between the candidate point and the reference point.

Exemplarily, for each candidate point in the candidate point set, the computer device obtains the three-dimensional position coordinates of the candidate point and of the reference point and computes the straight-line distance between the two coordinates, which is the straight-line distance between the candidate point and the reference point.

Step 606: compare the multiple straight-line distances to obtain the maximum straight-line distance, and determine the candidate point corresponding to the maximum straight-line distance as a sampling point.

Exemplarily, the computer device compares the straight-line distances of the candidate points in the candidate point set, obtains the maximum, and determines the candidate point with the maximum straight-line distance as a sampling point.

Step 608: determine multiple three-dimensional image points adjacent to the sampling point as the neighborhood set of the sampling point.

Exemplarily, the computer device determines the adjacent three-dimensional image points of the sampling point based on its three-dimensional position coordinates, and determines a preset number of adjacent three-dimensional image points as the neighborhood set of the sampling point.

Step 610: determine the sampling point as the updated reference point, determine the candidate point set with the sampling point and its neighborhood set removed as the updated candidate point set, and repeat the step of determining, for each candidate point in the candidate point set, the straight-line distance between the candidate point and the reference point, so as to obtain multiple sampling points and the neighborhood set of each sampling point.

Exemplarily, the computer device determines the sampling point as the updated reference point, determines the candidate point set with the sampling point and its neighborhood set removed as the updated candidate point set, and repeats steps 604 to 608 until a preset loop stop condition is reached, obtaining multiple sampling points and the neighborhood set of each sampling point. The preset loop stop condition is set in advance and may be that the number of sampling points equals a preset number.
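Steps 602 to 610 can be sketched in Python as follows. This is an illustrative reading of the steps, not a definitive implementation: the text does not say whether neighbors are drawn from the whole point cloud or only from the remaining candidates, so the sketch assumes the whole cloud, and it also stops early if the candidate set runs out. The function name and the `seed` parameter are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def sample_with_neighborhoods(
    points: Sequence[Sequence[float]],
    num_samples: int,
    k: int,
    seed: int = 0,
) -> Tuple[List[int], List[List[int]]]:
    """Return (sampling point indices, neighborhood index lists)."""
    rng = random.Random(seed)
    # Step 602: random reference point; every other point is a candidate.
    reference = rng.randrange(len(points))
    candidates = set(range(len(points))) - {reference}
    samples: List[int] = []
    neighborhoods: List[List[int]] = []
    while len(samples) < num_samples and candidates:
        # Steps 604-606: the candidate farthest from the reference is sampled.
        sample = max(sorted(candidates),
                     key=lambda i: math.dist(points[i], points[reference]))
        # Step 608: k nearest points to the sample (assumed: from the whole cloud).
        others = [i for i in range(len(points)) if i != sample]
        neighbors = sorted(others,
                           key=lambda i: math.dist(points[i], points[sample]))[:k]
        samples.append(sample)
        neighborhoods.append(neighbors)
        # Step 610: the sample becomes the reference; drop it and its neighbors.
        reference = sample
        candidates -= {sample, *neighbors}
    return samples, neighborhoods
```

Unlike the common farthest point sampling variant that tracks the minimum distance to all previously chosen points, the steps as written measure distance to the most recent sampling point only; removing each neighborhood from the candidate set is what keeps successive samples apart.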

In this embodiment, the straight-line distances of the candidate points in the candidate point set are compared to obtain the maximum, and the candidate point with the maximum straight-line distance is determined as a sampling point. In other words, the three-dimensional image point farthest from the current sampling point is selected as the next sampling point. The sampling points obtained in this way cover the three-dimensional object corresponding to the three-dimensional point cloud and provide accurate base data for the subsequent determination of the data feature vector sequence.

In one embodiment, as shown in FIG. 7, the target two-dimensional classification model includes a multi-head self-attention unit and a multi-layer perceptron unit; training the initial three-dimensional point cloud classification model based on the data feature vector sequence to adjust the initial calibration parameters in the initial calibration unit includes:

Step 702: input the data feature vector sequence into the initial three-dimensional point cloud classification model and, for each data feature vector in the sequence, determine the dependent feature vector corresponding to the data feature vector through the multi-head self-attention unit.

Here, the multi-head self-attention unit is the MSA, an attention mechanism for processing sequence data that learns the dependencies between different positions in the sequence.

Exemplarily, the computer device inputs the data feature vector sequence into the initial three-dimensional point cloud classification model. For each data feature vector, the multi-head self-attention unit in the model outputs the corresponding self-attention feature vector, and the model adds the data feature vector to its self-attention feature vector to obtain the dependent feature vector corresponding to the data feature vector.

In one embodiment, the dependent feature vector is given by:

$$\hat{x}_i^{\,l} = x_i^{\,l-1} + \mathrm{MSA}\left(x_i^{\,l-1}\right) \tag{3}$$

where $l$ is the number of Transformer blocks passed through; $x_i^{\,l-1}$ is the classification feature vector obtained after the $i$-th data feature vector has passed through $l-1$ Transformer blocks; and $\hat{x}_i^{\,l}$ is the dependent feature vector output by the multi-head self-attention unit in the $l$-th Transformer block.

Step 704: determine the corrected feature vector corresponding to the dependent feature vector through the initial calibration unit.

Exemplarily, the computer device, through the initial three-dimensional point cloud classification model, inputs the dependent feature vector into the initial calibration unit, and the initial calibration unit outputs the corrected feature vector corresponding to the dependent feature vector.

Step 706: based on the dependent feature vector and the corrected feature vector, determine the classification feature vector corresponding to the data feature vector through the multi-layer perceptron unit.

Here, the multi-layer perceptron unit is the MLP.

Exemplarily, the computer device normalizes the dependent feature vector through the LN unit in the initial three-dimensional point cloud classification model to obtain the corresponding normalized feature vector, and the multi-layer perceptron unit outputs the perceptual feature vector corresponding to the normalized feature vector. The initial three-dimensional point cloud classification model then combines the dependent feature vector, the perceptual feature vector and the corrected feature vector to obtain the classification feature vector corresponding to the data feature vector.

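In one embodiment, the classification feature vector is given by:

$$x_i^{\,l} = \hat{x}_i^{\,l} + \mathrm{MLP}\left(\mathrm{LN}\left(\hat{x}_i^{\,l}\right)\right) + s \cdot \tilde{x}_i^{\,l} \tag{4}$$

where $x_i^{\,l}$ is the classification feature vector obtained after the $l$-th Transformer block; $\hat{x}_i^{\,l}$ is the dependent feature vector; $\tilde{x}_i^{\,l}$ is the corrected feature vector output by the initial calibration unit; and $s$ is a weighting parameter.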
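For illustration, one calibrated Transformer block as described in steps 702 to 706 can be sketched in Python. The attention, perceptron and calibration units are passed in as callables so that the sketch shows only how their outputs are combined. It assumes that "combining" means summation and that the weighting parameter `s` scales the corrected feature vector; the function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def layer_norm(v: Sequence[float], eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def calibrated_block(
    sequence: Sequence[Vector],
    msa: Callable[[Sequence[Vector]], Sequence[Vector]],
    mlp: Callable[[Vector], Vector],
    calibrate: Callable[[Vector], Vector],
    s: float,
) -> List[Vector]:
    """Apply one Transformer block with a calibration branch to a sequence."""
    attended = msa(sequence)  # self-attention feature vector for each position
    out = []
    for x_prev, attn in zip(sequence, attended):
        # Step 702: dependent feature vector = input + self-attention output.
        dep = [a + b for a, b in zip(x_prev, attn)]
        # Step 706: perceptual feature vector from the normalized dependent vector.
        perceived = mlp(layer_norm(dep))
        # Step 704: corrected feature vector from the calibration unit.
        corrected = calibrate(dep)
        # Assumed combination: sum, with the correction weighted by s.
        out.append([d + p + s * c for d, p, c in zip(dep, perceived, corrected)])
    return out
```

Because the calibration branch is added alongside the existing residual path, setting `s` to zero leaves the behavior of the pretrained two-dimensional block unchanged.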

Step 708: based on the multiple classification feature vectors, adjust the initial calibration parameters in the initial calibration unit to obtain the updated calibration parameters corresponding to the initial calibration parameters.

Exemplarily, the computer device, through the initial three-dimensional point cloud classification model, combines the multiple classification feature vectors into a classification feature vector sequence and inputs the sequence into the classification head, which outputs the predicted category corresponding to the data feature vector sequence. Based on the predicted category and the category label corresponding to the data feature vector sequence, the initial calibration parameters in the initial calibration unit are adjusted to obtain the updated calibration parameters.
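The text does not specify the form of the classification head or the loss that compares the predicted category with the label. As a hedged illustration, the sketch below assumes mean pooling over the sequence, a linear head, and a softmax cross-entropy loss; all three are common choices but are assumptions here, as are the function names.

```python
import math
from typing import List, Sequence


def classify(
    sequence: Sequence[Sequence[float]],
    head_weights: Sequence[Sequence[float]],
    head_bias: Sequence[float],
) -> List[float]:
    """Mean-pool the classification feature vectors and apply a linear head.

    head_weights has one row per category; returns one logit per category.
    """
    dim = len(sequence[0])
    pooled = [sum(v[d] for v in sequence) / len(sequence) for d in range(dim)]
    return [sum(w * p for w, p in zip(row, pooled)) + b
            for row, b in zip(head_weights, head_bias)]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy between logits and an integer category label."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def predict(logits: Sequence[float]) -> int:
    """Predicted category is the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])
```

In training, the gradient of such a loss would be propagated only to the calibration parameters, since the two-dimensional backbone has already been trained.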

In this embodiment, the data feature vector sequence corresponding to the three-dimensional point cloud is used to adjust the initial calibration parameters in the initial calibration unit of the initial three-dimensional point cloud classification model, giving the updated calibration parameters and improving the accuracy of the initial calibration unit.

In one embodiment, the initial calibration parameters include an initial dimension-reduction matrix and an initial dimension-increase matrix; determining the corrected feature vector corresponding to the dependent feature vector through the initial calibration unit includes:

normalizing the dependent feature vector to obtain a normalized feature vector; reducing the dimension of the normalized feature vector based on the initial dimension-reduction matrix to obtain a reduced-dimension feature vector; and determining the corrected feature vector corresponding to the dependent feature vector based on the reduced-dimension feature vector, the activation function and the initial dimension-increase matrix.

Here, the initial dimension-reduction matrix is the parameter of the Enc in the initial calibration unit, the initial dimension-increase matrix is the parameter of the Dec in the initial calibration unit, and the activation function is the function in the ReLU.

Exemplarily, the computer device normalizes the dependent feature vector through the LN unit to obtain the corresponding normalized feature vector, multiplies the normalized feature vector by the initial dimension-reduction matrix to obtain the reduced-dimension feature vector, inputs the reduced-dimension feature vector into the activation function to obtain the activated feature vector, and multiplies the activated feature vector by the initial dimension-increase matrix to obtain the corrected feature vector corresponding to the dependent feature vector.

In one embodiment, the corrected feature vector is given by:

$$\tilde{x}_i^{\,l} = \mathrm{ReLU}\left(\mathrm{LN}\left(\hat{x}_i^{\,l}\right) W_{\mathrm{down}}\right) W_{\mathrm{up}} \tag{5}$$

where $W_{\mathrm{down}}$ is the initial dimension-reduction matrix; $W_{\mathrm{up}}$ is the initial dimension-increase matrix; $\hat{x}_i^{\,l}$ is the dependent feature vector; and $\tilde{x}_i^{\,l}$ is the corrected feature vector output by the initial calibration unit in the $l$-th Transformer block.
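The calibration unit described above (normalize, reduce dimension, apply ReLU, increase dimension) can be sketched in plain Python. The function name `calibration_unit`, the row-vector convention for the matrix products, and the use of a layer normalization without learned scale or shift are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def calibration_unit(
    dependent: Sequence[float],
    w_down: Sequence[Sequence[float]],  # shape d x r, with r < d
    w_up: Sequence[Sequence[float]],    # shape r x d
    eps: float = 1e-5,
) -> List[float]:
    """Return the corrected feature vector for one dependent feature vector."""
    # LN: normalize to zero mean and unit variance.
    mean = sum(dependent) / len(dependent)
    var = sum((x - mean) ** 2 for x in dependent) / len(dependent)
    normed = [(x - mean) / math.sqrt(var + eps) for x in dependent]
    # Enc: multiply by the dimension-reduction matrix (d -> r).
    r = len(w_down[0])
    reduced = [sum(normed[i] * w_down[i][j] for i in range(len(normed)))
               for j in range(r)]
    # ReLU activation.
    activated = [max(0.0, v) for v in reduced]
    # Dec: multiply by the dimension-increase matrix (r -> d).
    d = len(w_up[0])
    return [sum(activated[j] * w_up[j][i] for j in range(r)) for i in range(d)]
```

Because the intermediate dimension `r` is smaller than `d`, the unit adds only `2 * d * r` trainable values per Transformer block, which is small compared with the block itself.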

In this embodiment, the corrected feature vector corresponding to the dependent feature vector is determined through the initial calibration unit and is used to correct the classification feature vector, which improves the accuracy of the classification feature vector.

In one embodiment, obtaining the image feature vector sequence corresponding to a two-dimensional image includes:

obtaining the two-dimensional image and dividing it into multiple image blocks; performing feature extraction on each image block to obtain the image feature vector corresponding to the image block; determining the position order of each image block based on its position in the two-dimensional image; and sorting the image feature vectors of the multiple image blocks based on their position orders to obtain the image feature vector sequence corresponding to the two-dimensional image.

Here, an image block is a local image within the two-dimensional image, and the position order is the arrangement order of the image blocks.

Exemplarily, the computer device obtains the two-dimensional image, divides it into multiple image blocks, performs feature extraction on each image block to obtain the corresponding image feature vector, determines the position order of each image block based on its position in the two-dimensional image, and sorts the image feature vectors according to these position orders to obtain the image feature vector sequence corresponding to the two-dimensional image. For example, the two-dimensional image is divided into a number of image blocks, feature extraction with a feature extraction parameter is applied to each image block to obtain its image feature vector, and the image feature vectors arranged in order form the image feature vector sequence.
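The patch-based procedure above can be sketched as follows. The sketch assumes a single-channel image stored as a list of rows, square non-overlapping blocks visited in row-major order, and a linear projection as the feature extraction parameter; these choices and the name `image_to_patch_sequence` are assumptions, since the text leaves them open.

```python
from typing import List, Sequence


def image_to_patch_sequence(
    image: Sequence[Sequence[float]],
    patch: int,
    projection: Sequence[Sequence[float]],  # one row of patch*patch weights per output dim
) -> List[List[float]]:
    """Split an image into patch x patch blocks and project each block.

    Blocks are emitted left to right, top to bottom, so the position of a
    vector in the returned list is the position order of its block.
    """
    height, width = len(image), len(image[0])
    sequence = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            # Flatten the block row by row.
            flat = [image[top + r][left + c]
                    for r in range(patch) for c in range(patch)]
            # Linear feature extraction: one dot product per output dimension.
            sequence.append([sum(f * w for f, w in zip(flat, row))
                             for row in projection])
    return sequence
```

Rows or columns that do not fill a complete block are ignored in this sketch; padding the image first would be an alternative.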

In this embodiment, the image feature vector sequence is a sequence of multiple image feature vectors arranged in order, and it represents the image features of the two-dimensional image.

在一个示例性地实施例中，基于二维分类模型的三维点云分类模型训练方法包含如下步骤：In an exemplary embodiment, a three-dimensional point cloud classification model training method based on a two-dimensional classification model comprises the following steps:

计算机设备获取多个二维图像和每个二维图像对应的标注类别，针对每一个二维图像，将二维图像划分为多个图像块，对图像块进行特征提取，得到图像块对应的图像特征向量，基于图像块在二维图像中的位置，确定图像块对应的位置顺序，基于多个图像块对应的位置顺序，对多个图像块对应的图像特征向量进行排序，得到二维图像对应的图像特征向量序列。The computer device obtains multiple two-dimensional images and the annotation category corresponding to each two-dimensional image, divides the two-dimensional image into multiple image blocks for each two-dimensional image, performs feature extraction on the image blocks, obtains image feature vectors corresponding to the image blocks, determines the position order corresponding to the image blocks based on the positions of the image blocks in the two-dimensional image, sorts the image feature vectors corresponding to the multiple image blocks based on the position order corresponding to the multiple image blocks, and obtains an image feature vector sequence corresponding to the two-dimensional image.

从多个二维图像中选取一个训练图像，将训练图像对应的图像特征向量序列和标注类别输入至初始二维分类模型，对初始二维分类模型进行训练，得到更新二维分类模型；将更新二维分类模型确定为初始二维分类模型，从剩余的二维图像中选取一个训练图像，重复执行上述将训练图像对应的图像特征向量序列和标注类别输入至初始二维分类模型，对初始二维分类模型进行训练，得到更新二维分类模型的步骤，直至达到第一训练停止条件，得到目标二维分类模型。A training image is selected from multiple two-dimensional images, and the image feature vector sequence and the label category corresponding to the training image are input into the initial two-dimensional classification model, and the initial two-dimensional classification model is trained to obtain an updated two-dimensional classification model; the updated two-dimensional classification model is determined as the initial two-dimensional classification model, and a training image is selected from the remaining two-dimensional images, and the above steps of inputting the image feature vector sequence and the label category corresponding to the training image into the initial two-dimensional classification model, training the initial two-dimensional classification model, and obtaining an updated two-dimensional classification model are repeated until the first training stop condition is reached to obtain a target two-dimensional classification model.

The computer device obtains initial calibration units and adds one initial calibration unit to each Transformer block of the target two-dimensional classification model to obtain the initial three-dimensional point cloud classification model.

The computer device obtains a plurality of three-dimensional point clouds and the category label corresponding to each three-dimensional point cloud. For each three-dimensional point cloud, it randomly selects one three-dimensional image point from the point cloud as the reference point and takes the point cloud excluding the reference point as the candidate point set. For each candidate point in the candidate point set, it obtains the three-dimensional position coordinates of the candidate point and of the reference point and calculates the straight-line distance between them. It compares the straight-line distances of the candidate points, finds the maximum straight-line distance, and determines the candidate point corresponding to that maximum as a sampling point. Based on the three-dimensional position coordinates of the sampling point, it determines the three-dimensional image points adjacent to the sampling point and takes a preset number of those adjacent points as the neighborhood set of the sampling point. It then takes the sampling point as the updated reference point and the candidate point set excluding the sampling point and its neighborhood set as the updated candidate point set, and repeats the above steps until a preset loop stop condition is reached, obtaining a plurality of sampling points and the neighborhood set corresponding to each sampling point.
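As an illustration and not limitation, the sampling loop described above may be sketched as follows. The function name `sample_with_neighborhoods`, the choice of drawing the neighborhood from the k nearest points of the whole point cloud, and the stop condition (a target number of sampling points, or an empty candidate set) are assumptions of this sketch. Note that the loop follows the text literally and measures distance to the current reference point only, which differs from the common farthest point sampling variant that tracks the distance to all previously selected points.

```python
import numpy as np

def sample_with_neighborhoods(points, num_samples, k, seed=0):
    """Return (sample_indices, neighborhood_indices) for an (N, 3) point cloud.
    Requires k < N so that every neighborhood has exactly k points."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    rng = np.random.default_rng(seed)
    reference = int(rng.integers(n))          # random initial reference point
    candidate = np.ones(n, dtype=bool)        # candidate point set as a mask
    candidate[reference] = False
    samples, neighborhoods = [], []
    while len(samples) < num_samples and candidate.any():
        idx = np.flatnonzero(candidate)
        # Straight-line (Euclidean) distance from each candidate to the reference
        dist = np.linalg.norm(points[idx] - points[reference], axis=1)
        sample = int(idx[np.argmax(dist)])    # farthest candidate becomes the sample
        # Assumed neighborhood rule: the k points nearest to the sampling point
        d_all = np.linalg.norm(points - points[sample], axis=1)
        d_all[sample] = np.inf                # exclude the sampling point itself
        neighbors = np.argsort(d_all)[:k]
        samples.append(sample)
        neighborhoods.append(neighbors)
        # Update the candidate set and the reference point
        candidate[sample] = False
        candidate[neighbors] = False
        reference = sample
    return np.array(samples), np.array(neighborhoods)
```

Representing the candidate point set as a boolean mask keeps the removal of a sampling point and its neighborhood a constant-cost indexing operation.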

For each neighborhood set, the computer device inputs the neighborhood set into a point cloud feature extraction model, which outputs the data feature vector corresponding to that neighborhood set. The computer device then sorts the plurality of neighborhood sets using the Morton ordering method to obtain the arrangement order of each neighborhood set and, based on that arrangement order, sorts the data feature vectors corresponding to the neighborhood sets in a preset order to obtain the data feature vector sequence corresponding to the three-dimensional point cloud.
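As an illustration and not limitation, a Morton (Z-order) ordering may be sketched as follows. Using the sampling point of each neighborhood set as its sort key, quantizing each axis independently to a 10-bit grid, and the names `morton_code` and `morton_order` are assumptions of this sketch.

```python
import numpy as np

def morton_code(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three non-negative integers
    (x in the lowest position, then y, then z) into one Morton code."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code

def morton_order(centers, bits=10):
    """Return the indices that sort (N, 3) center coordinates along a Z-order curve."""
    centers = np.asarray(centers, dtype=float)
    lo = centers.min(axis=0)
    span = centers.max(axis=0) - lo
    span[span == 0] = 1.0                      # avoid division by zero on flat axes
    # Quantize each axis to integers in [0, 2**bits - 1]
    grid = ((centers - lo) / span * (2**bits - 1)).astype(int)
    codes = [morton_code(int(x), int(y), int(z), bits) for x, y, z in grid]
    return np.argsort(codes, kind="stable")

# Usage: reorder per-neighborhood feature vectors by the Morton order of their centers
centers = np.array([[1.0, 1.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
features = np.arange(8, dtype=float).reshape(4, 2)
sequence = features[morton_order(centers)]
```

Because neighboring Morton codes tend to correspond to spatially close cells, the resulting sequence keeps nearby neighborhood sets close together, which is the usual motivation for this ordering.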

A training point cloud is obtained from the plurality of three-dimensional point clouds. The data feature vector sequence and the category label corresponding to the training point cloud are input into the initial three-dimensional point cloud classification model, and the initial calibration module in the initial three-dimensional point cloud classification model is trained so as to adjust the initial calibration parameters in the initial calibration units, obtaining updated calibration parameters. The updated calibration parameters are then taken as the initial calibration parameters, a training point cloud is obtained from the remaining three-dimensional point clouds, and the foregoing input and training steps are repeated until a second training stop condition is reached, yielding the target three-dimensional point cloud classification model. The target three-dimensional point cloud classification model is composed of the target two-dimensional classification model and the target calibration module.
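As an illustration and not limitation, training in which only the calibration parameters are adjusted may be sketched with a toy model as follows. The toy stands in for the real network: a single frozen matrix `W` plays the role of the target two-dimensional classification model, a frozen matrix `C` plays the role of its classification head, and `down` and `up` are the calibration parameters. The cross-entropy loss, the ReLU activation, full-batch gradient descent, and a fixed step count as the stop condition are all assumptions of this sketch.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_calibration_only(x, y, frozen, calib, lr=0.01, steps=50):
    """Gradient descent that updates calib['down'] and calib['up'] in place
    and never writes to frozen['W'] or frozen['C']. Returns the loss history."""
    n = len(x)
    onehot = np.eye(frozen["C"].shape[1])[y]
    losses = []
    for _ in range(steps):
        h = x @ frozen["W"]                      # frozen backbone features
        pre = h @ calib["down"]                  # dimension reduction
        z = np.maximum(pre, 0.0)                 # activation (ReLU assumed)
        f = h + z @ calib["up"]                  # dimension raising plus residual
        p = softmax(f @ frozen["C"])             # frozen classification head
        losses.append(float(-np.log(p[np.arange(n), y] + 1e-12).mean()))
        # Backpropagate the mean cross-entropy loss to the calibration parameters
        d_f = (p - onehot) @ frozen["C"].T / n
        g_up = z.T @ d_f
        g_down = h.T @ ((d_f @ calib["up"].T) * (pre > 0))
        calib["up"] -= lr * g_up
        calib["down"] -= lr * g_down
    return losses
```

In this sketch `up` would typically be initialized to zeros, so that the initial three-dimensional model reproduces the output of the frozen two-dimensional model before any calibration training takes place; that initialization is a common practice for such added units and is not stated in the application.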

A target three-dimensional point cloud is obtained, the target data feature vector sequence corresponding to the target three-dimensional point cloud is determined, and the target data feature vector sequence is input into the target three-dimensional point cloud classification model to obtain the target classification category corresponding to the target three-dimensional point cloud. For example, when the target three-dimensional point cloud classification model is used to determine the object categories of the three-dimensional point clouds in ModelNet40 (a commonly used dataset for three-dimensional object classification), the accuracy reaches 94.2%, a considerable improvement in accuracy.

It should be understood that, although the steps in the flowcharts of the above embodiments are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are executed, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times, and they are not necessarily executed one after another but may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiments of the present application also provide a three-dimensional point cloud classification model training apparatus based on a two-dimensional classification model, for implementing the three-dimensional point cloud classification model training method based on a two-dimensional classification model described above. The solution provided by the apparatus is similar to the implementation described for the method. Therefore, for the specific limitations in the one or more apparatus embodiments provided below, reference may be made to the limitations of the training method above, which are not repeated here.

In one embodiment, as shown in FIG. 8, a three-dimensional point cloud classification model training apparatus based on a two-dimensional classification model is provided, comprising a first training module 802, a combination module 804, and a second training module 806, wherein:

the first training module 802 is configured to obtain an image feature vector sequence corresponding to a two-dimensional image, and to train an initial two-dimensional classification model based on the image feature vector sequence to obtain a target two-dimensional classification model;

the combination module 804 is configured to obtain an initial three-dimensional point cloud classification model based on the target two-dimensional classification model and an initial calibration unit;

the second training module 806 is configured to obtain a data feature vector sequence corresponding to a three-dimensional point cloud, and to train the initial three-dimensional point cloud classification model based on the data feature vector sequence so as to adjust the initial calibration parameters in the initial calibration unit, obtaining a target three-dimensional point cloud classification model; the target three-dimensional point cloud classification model is used to determine the object category corresponding to a three-dimensional point cloud based on the data feature vector sequence corresponding to that three-dimensional point cloud.

In one embodiment, the second training module 806 is further configured to: obtain a three-dimensional point cloud and, based on the straight-line distances between three-dimensional image points in the point cloud, determine a plurality of sampling points of the point cloud and a neighborhood set corresponding to each sampling point; for each neighborhood set, perform feature extraction on the neighborhood set to obtain the data feature vector corresponding to the neighborhood set; sort the plurality of neighborhood sets to obtain the arrangement order of each neighborhood set; and, based on that arrangement order, sort the data feature vectors corresponding to the neighborhood sets to obtain the data feature vector sequence corresponding to the three-dimensional point cloud.

In one embodiment, the second training module 806 is further configured to: determine one three-dimensional image point in the three-dimensional point cloud as a reference point, and determine the point cloud excluding the reference point as a candidate point set; for each candidate point in the candidate point set, determine the straight-line distance between the candidate point and the reference point; compare the straight-line distances to find the maximum straight-line distance, and determine the candidate point corresponding to that maximum as a sampling point; determine a plurality of three-dimensional image points adjacent to the sampling point as the neighborhood set of the sampling point; and determine the sampling point as the updated reference point, determine the candidate point set excluding the sampling point and the neighborhood set as the updated candidate point set, and repeat the step of determining, for each candidate point in the candidate point set, the straight-line distance between the candidate point and the reference point, so as to obtain a plurality of sampling points and the neighborhood set corresponding to each sampling point.

In one embodiment, the second training module 806 is further configured to: input the data feature vector sequence into the initial three-dimensional point cloud classification model and, for each data feature vector in the sequence, determine the dependency feature vector corresponding to the data feature vector through a multi-head self-attention unit; determine the correction feature vector corresponding to the dependency feature vector through the initial calibration unit; determine the classification feature vector corresponding to the data feature vector through a multi-layer perceptron unit, based on the dependency feature vector and the correction feature vector; and, based on the plurality of classification feature vectors, adjust the initial calibration parameters in the initial calibration unit to obtain the updated calibration parameters corresponding to the initial calibration parameters.
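As an illustration and not limitation, the forward pass through one Transformer block with a calibration unit may be sketched as follows. The pre-normalization layout, the residual connections, the ReLU activations, and in particular the choice to combine the dependency feature vector and the correction feature vector by addition are assumptions of this sketch; the parameter names in `p` are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, heads):
    """Scaled dot-product self-attention over an (n, d) sequence; d % heads == 0."""
    n, d = x.shape
    dh = d // heads
    def split(t):                               # (n, d) -> (heads, n, dh)
        return t.reshape(n, heads, dh).transpose(1, 0, 2)
    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))
    merged = (attn @ v).transpose(1, 0, 2).reshape(n, d)
    return merged @ wo

def calibration_unit(h, down, up):
    """Normalize, reduce dimension, apply the activation, raise dimension."""
    return np.maximum(layer_norm(h) @ down, 0.0) @ up

def block_forward(x, p, heads=2):
    # Dependency feature vectors from multi-head self-attention (frozen weights)
    dep = x + multi_head_self_attention(
        layer_norm(x), p["wq"], p["wk"], p["wv"], p["wo"], heads)
    # Correction feature vectors from the calibration unit (trainable weights)
    corr = calibration_unit(dep, p["down"], p["up"])
    y = dep + corr                              # assumed combination: addition
    # Classification feature vectors from the multi-layer perceptron (frozen)
    hidden = np.maximum(layer_norm(y) @ p["w1"], 0.0)
    return y + hidden @ p["w2"]
```

Under these assumptions, setting `p["up"]` to zeros makes the correction vanish, so the block reproduces the behavior of the original two-dimensional Transformer block.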

In one embodiment, the second training module 806 is further configured to: normalize the dependency feature vector to obtain a normalized feature vector; perform dimension reduction on the normalized feature vector based on an initial dimension-reduction matrix to obtain a reduced-dimension feature vector; and determine the correction feature vector corresponding to the dependency feature vector based on the reduced-dimension feature vector, an activation function, and an initial dimension-raising matrix.
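As an illustration and not limitation, the three operations of the calibration unit may be written compactly as below. The symbols are assumptions introduced for this sketch: $h \in \mathbb{R}^{d}$ is the dependency feature vector, $W_{\mathrm{down}} \in \mathbb{R}^{d \times r}$ is the initial dimension-reduction matrix, $W_{\mathrm{up}} \in \mathbb{R}^{r \times d}$ is the initial dimension-raising matrix, $\sigma$ is the activation function, and $c$ is the correction feature vector, with $r < d$.

```latex
\hat{h} = \mathrm{Norm}(h), \qquad
z = \hat{h}\, W_{\mathrm{down}}, \qquad
c = \sigma(z)\, W_{\mathrm{up}}
```

Under this reading, and ignoring any bias terms, each calibration unit contributes $2dr$ trainable parameters. For hypothetical values $d = 768$ and $r = 16$, that is $2 \times 768 \times 16 = 24{,}576$ parameters per Transformer block, which is small compared with the frozen weights of the block itself.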

In one embodiment, the first training module 802 is further configured to: obtain a two-dimensional image and divide the two-dimensional image into a plurality of image blocks; perform feature extraction on each image block to obtain the image feature vector corresponding to the image block; determine the position order of each image block based on its position in the two-dimensional image; and, based on the position order of the image blocks, sort the image feature vectors corresponding to the image blocks to obtain the image feature vector sequence corresponding to the two-dimensional image.

Each module in the above three-dimensional point cloud classification model training apparatus based on a two-dimensional classification model may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.

In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 9. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory, and the input/output interface are connected via a system bus, and the communication interface, the display unit, and the input device are connected to the system bus via the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with external terminals; the wireless communication may be implemented through Wi-Fi, a mobile cellular network, NFC (near field communication), or other technologies. When executed by the processor, the computer program implements a three-dimensional point cloud classification model training method based on a two-dimensional classification model. The display unit of the computer device is used to form a visible image and may be a display screen, a projection device, or a virtual reality imaging device. The display screen may be a liquid crystal display or an electronic ink display. The input device of the computer device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.

Those skilled in the art will understand that the structure shown in FIG. 9 is merely a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, including a memory and a processor. A computer program is stored in the memory, and the processor implements the steps of the above method embodiments when executing the computer program.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the above method embodiments.

In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps of the above method embodiments.

It should be noted that the user information (including but not limited to user device information and user personal information) and the data (including but not limited to data used for analysis, stored data, and displayed data) involved in the present application are all information and data authorized by the user or fully authorized by all parties.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to a memory, database, or other medium used in the embodiments provided in the present application may include at least one of non-volatile memory and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases involved in the embodiments provided in the present application may include at least one of a relational database and a non-relational database. Non-relational databases may include, without limitation, blockchain-based distributed databases. The processors involved in the embodiments provided in the present application may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.

The technical features of the above embodiments may be combined arbitrarily. For conciseness, not all possible combinations of these technical features are described. However, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the patent scope of the present application. It should be noted that those of ordinary skill in the art may make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.