Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an object identification method based on width learning according to an embodiment of the present invention. As shown in fig. 1, the method includes:
S1, acquiring the three-dimensional point cloud data in the current area.
The execution subject of the embodiment of the present invention is an electronic device. The electronic device may be a vehicle-mounted terminal, and the vehicle-mounted terminal may comprise a LiDAR sensor for acquiring three-dimensional point cloud data in the current area, where an object to be identified may exist in the current area.
The object to be identified may be a tree or a building, etc.
In addition, the in-vehicle terminal may be disposed on top of an Unmanned Ground Vehicle (UGV).
S2, processing the three-dimensional point cloud data through a preset unified space encoder to obtain current feature nodes in a unified feature space.
S3, performing object recognition on the current feature nodes through a preset width learning neural network.
It is understood that the embodiment of the present invention may adopt a preset width learning (Broad Learning System, BLS) neural network to perform object identification on the object to be identified in the current area. For example, if there is an object to be identified and its object type is a tree, an object identification result may be obtained, such as "the object to be identified is a tree".
The preset width learning neural network is a neural network structure that does not rely on a deep structure.
Compared with a traditional deep learning neural network, the preset width learning neural network used in the method has a simple structure: it has fewer network layers, so its calculation efficiency is higher and it offers excellent real-time processing characteristics. Meanwhile, the number of parameters involved is much smaller than that of a traditional deep learning neural network, so the network structure is lightweight and can meet the real-time requirements that the unmanned driving field places on algorithms, thereby further improving the calculation efficiency.
After all, even if a trained deep learning neural network is used for object identification on three-dimensional point cloud data, its real-time performance remains poor because of the large number of parameters involved.
Moreover, on the basis of using the width learning neural network, the embodiment of the present invention additionally introduces the network structure of a preset unified space encoder, through which the three-dimensional point cloud data can be processed in advance, so that the original three-dimensional point cloud data is converted into a feature vector in a unified feature space, namely the current feature node. When the width learning neural network is actually used, its input thus changes from the original three-dimensional point cloud data to a feature vector in the unified feature space, a data type that is simpler to process, which greatly improves the calculation efficiency.
The object identification method based on width learning provided by the embodiment of the present invention first collects three-dimensional point cloud data in the current area; then processes the three-dimensional point cloud data through a preset unified space encoder to obtain current feature nodes in a unified feature space; and finally performs object recognition on the current feature nodes through a preset width learning neural network. When identifying objects, the preset width learning neural network adopted by the embodiment of the present invention differs from a traditional deep learning neural network: it has fewer network layers and fewer parameters in the neural network structure, so the overall calculation efficiency is higher. Meanwhile, the input of the preset width learning neural network changes from the original three-dimensional point cloud data to feature vectors in a unified feature space, a data type that is simpler to process, which further improves the calculation efficiency.
Fig. 2 is a flowchart of an object recognition method based on width learning according to another embodiment of the present invention; this embodiment is based on the embodiment shown in fig. 1.
In this embodiment, the S2 specifically includes:
S201, processing the three-dimensional point cloud data through a current coding matrix in the preset unified space encoder to obtain a current uncertain feature vector in an uncertain feature space.
For convenience of distinction, the embodiment of the present invention divides the whole process into a training stage and a using stage (in which object recognition is actually performed). The training stage can further be divided into two parts: a first training stage corresponding to the preset unified space encoder and a second training stage corresponding to the preset width learning neural network.
Specifically, in the using stage, the original three-dimensional point cloud data is mapped to an uncertain feature space (Uncertain-feature space) by using the predetermined current coding matrix, and the feature vector mapped to the uncertain feature space is recorded as the current uncertain feature vector.
For example, the three-dimensional coordinate information in the three-dimensional point cloud data may be denoted as (x, y, z), and the three-dimensional coordinate information is processed through the current encoding matrix in a preset uncertain space mapping formula.
The preset uncertain space mapping formula is as follows:

r_{i,j,k} = w_{k,1} x_{i,j} + w_{k,2} y_{i,j} + w_{k,3} z_{i,j} + b,

where r_{i,j,k} is the uncertain feature vector, w_{k,1} through w_{k,3} are elements of the current coding matrix W, i and j together denote the j-th point of the i-th object, k denotes the vector index of the uncertain feature vector, and b denotes the offset.
S202, performing pooling processing on the current uncertain feature vectors through an average pooling algorithm in the preset unified space encoder to obtain the current unified feature vector in the unified feature space.
Then, the feature vectors in the uncertain feature space can be pooled by using an average pooling algorithm so as to be encoded into feature vectors in a unified feature space (Unified-feature space), recorded as the unified feature vector V_i.
The average pooling algorithm is as follows:

v_{i,k} = (1/n_i) Σ_{j=1}^{n_i} r_{i,j,k},

where v_{i,k} is the average pooling result, r_{i,j,k} is the uncertain feature vector, n_i denotes the number of points contained in the three-dimensional point cloud sample, i denotes the i-th object, i and j together denote the j-th point of the i-th object, and k denotes the index of the uncertain feature vector.
The unified feature vector V_i consists of the average pooling results and can be expressed as

V_i = [v_{i,1}, v_{i,2}, …, v_{i,k}, …, v_{i,d}]^T.
It can be seen that the feature vectors in the uncertain feature space can be mapped to the unified feature space through the average pooling operation.
S203, transposing the current unified feature vector to obtain a current feature node.
Transposing the current unified feature vector yields a feature node (feature node), which can be recorded as the current feature node:

X_i = V_i^T,

where X_i denotes the current feature node of the i-th object and V_i denotes the unified feature vector of the i-th object.
It should be noted that the current feature node mentioned here has the same data type as the preset feature node appearing later; the two names are used only to distinguish them.
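To make the data flow of S201 to S203 concrete, below is a minimal sketch of the forward pass of the preset unified space encoder. It assumes that the current coding matrix W (one row of w_{k,1}..w_{k,3} per dimension of the uncertain feature space) and the offset b have already been determined in the training stage; the function name, array shapes and the NumPy-based implementation are illustrative rather than prescribed by the embodiment.

```python
import numpy as np

def uase_forward(points: np.ndarray, W: np.ndarray, b: float) -> np.ndarray:
    """Map one object's point cloud to its current feature node.

    points : (n_i, 3) array of the (x, y, z) coordinates of object i.
    W      : (d, 3) current coding matrix; row k holds w_{k,1}..w_{k,3}.
    b      : scalar offset.
    """
    # S201: map every point into the uncertain feature space,
    # r_{i,j,k} = w_{k,1} x_{i,j} + w_{k,2} y_{i,j} + w_{k,3} z_{i,j} + b
    r = points @ W.T + b          # shape (n_i, d)
    # S202: average pooling over all n_i points,
    # v_{i,k} = (1/n_i) * sum_j r_{i,j,k}
    v = r.mean(axis=0)            # unified feature vector V_i, shape (d,)
    # S203: transpose to obtain the current feature node X_i = V_i^T
    return v.reshape(1, -1)       # one row vector per object
```

In this way, each object's point cloud, whatever its number of points n_i, collapses into a single d-dimensional feature node, which is why the downstream width learning neural network receives a much simpler data type than raw point clouds.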
The object identification method based on width learning provided by this embodiment of the present invention applies the current coding matrix and specifies an implementation of the current feature node; with this implementation, the original three-dimensional point cloud data can be successfully converted into a data type that is easier to process and better suited to the preset width learning neural network.
Fig. 3 is a flowchart of an object recognition method based on width learning according to yet another embodiment of the present invention, which is based on the embodiment shown in fig. 1.
In this embodiment, the S3 specifically includes:
S301, performing object identification on the current feature node through a preset weight matrix in the preset width learning neural network.
In the using stage, the preset weight matrix can be recorded as W^*. The preset weight matrix in the preset width learning neural network is determined in the training stage, so that when the preset width learning neural network is applied to object recognition, it can be used directly in the using stage.
The object identification method based on width learning provided by the embodiment of the invention uses the preset weight matrix to carry out object identification operation.
On the basis of the foregoing embodiment, preferably, before S1, the method specifically includes:
and S11, acquiring preset characteristic nodes, preset enhanced nodes and a weight matrix to be updated in the width learning neural network to be trained.
And S12, performing output processing according to the preset feature node, the preset enhancement node and the weight matrix to be updated to obtain an output matrix.
Specifically, the second training stage corresponding to the preset width learning neural network is involved here; in order to determine the weight matrix actually used in the using stage, that is, the preset weight matrix, the second training stage trains and optimizes the weight matrix.
The untrained width learning neural network can be recorded as a to-be-trained width learning neural network, and the trained width learning neural network can be recorded as a preset width learning neural network.
For example, in the training optimization process, a default value of the weight matrix may be initialized first, and this default value is recorded as the weight matrix to be updated. The preset feature nodes, the preset enhancement nodes (enhancement nodes) and the weight matrix to be updated can then be processed by a preset output processing algorithm to obtain the output matrix.
The preset output processing algorithm is as follows:

γ = [Z_n | H_m] W^*,

where γ denotes the output matrix, Z_n denotes the preset feature nodes, H_m denotes the preset enhancement nodes, W^* denotes the weight matrix, and n and m both denote sequence numbers.
S13, updating the weight matrix according to the preset feature nodes, the preset enhancement nodes and the output matrix, so as to update the weight matrix to be updated into the preset weight matrix.
Then, the output matrix obtained based on the weight matrix to be updated can be used for updating the weight matrix.
For example, in one specific implementation, the preset feature node Z_n and the preset enhancement node H_m can first be used to determine a block matrix F, as follows:

F = [Z_n | H_m].
Then, a new weight matrix, i.e., the preset weight matrix, is determined according to the block matrix and the output matrix:

W^* = F^+ γ = (λI + F F^T)^{-1} F^T γ,

where W^* denotes the weight matrix, F^+ denotes the generalized inverse matrix of F, F denotes the block matrix, γ denotes the output matrix, the variable λ is a regularization coefficient, and I denotes the identity matrix.
In addition, F^+ has the general expression

F^+ = lim_{λ→0} (λI + F F^T)^{-1} F^T,

that is, it denotes the solution obtained for the matrix F^+ when the variable λ approaches 0.
In addition, the updating of the weight matrix according to the preset feature nodes, the preset enhancement nodes and the output matrix, so as to update the weight matrix to be updated into a preset weight matrix, specifically includes:
updating the weight matrix according to the preset feature nodes, the preset enhancement nodes and the output matrix to obtain a target weight matrix;
and if the object identification accuracy corresponding to the target weight matrix is not within the preset accuracy range, taking the target weight matrix as a new weight matrix to be updated and executing again the step of performing output processing according to the preset feature nodes, the preset enhancement nodes and the weight matrix to be updated to obtain an output matrix, until the object identification accuracy corresponding to the target weight matrix is within the preset accuracy range, at which point the target weight matrix is updated into the preset weight matrix.
For example, the weight matrix after the first update may be referred to as the target weight matrix, abbreviated here as matrix L1, and an automatic object identification test may be performed with matrix L1 to automatically generate an object identification accuracy; if this accuracy is not sufficiently high, the update continues. In the continued update operation, matrix L1 is used to regenerate the output matrix, and the weight matrix is updated through this output matrix to obtain a new target weight matrix, which can be denoted as matrix L2. If the object identification accuracy corresponding to matrix L2 is sufficiently high, that is, within the preset accuracy range, matrix L2 may be selected as the weight matrix used in subsequent object identification, that is, the preset weight matrix.
Of course, if the object recognition accuracy corresponding to the matrix L1 is within the preset accuracy range, the matrix L1 may be selected as the weight matrix used in the subsequent object recognition.
Therefore, a weight matrix with a high object identification accuracy can be obtained by updating the weight matrix in such a loop. Meanwhile, because the weight matrix in the neural network structure is updated, the identification accuracy achieved when the neural network structure is used for identifying objects can be improved.
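The accuracy-driven loop described above might be organized as follows. This sketch assumes an external evaluate_accuracy helper and an accuracy threshold, both illustrative; it reuses solve_weight_matrix from the previous sketch, and it follows the text in regenerating the output matrix from the current target weight matrix on each round (in the common broad-learning formulation, γ would simply remain the label matrix).

```python
import numpy as np

def train_weight_matrix(Z, H, labels, evaluate_accuracy,
                        acc_threshold=0.95, max_rounds=20):
    """Iterate S12 (output processing) and S13 (weight update) until the
    object identification accuracy lies within the preset accuracy range."""
    F = np.hstack([Z, H])                 # block matrix [Z_n | H_m]
    W = None
    for _ in range(max_rounds):
        # S12: output processing; the label matrix serves as the initial output matrix
        gamma = labels if W is None else F @ W
        # S13: update to obtain the target weight matrix (matrix L1, L2, ...)
        W = solve_weight_matrix(Z, H, gamma)
        if evaluate_accuracy(W) >= acc_threshold:
            break                         # within the preset accuracy range
    return W                              # taken as the preset weight matrix
```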
Fig. 4 is a flowchart of an object recognition method based on width learning according to another embodiment of the present invention; this embodiment is based on the embodiment shown in fig. 3.
In this embodiment, before the acquiring of the preset feature nodes, the preset enhancement nodes and the weight matrix to be updated in the width learning neural network to be trained, the object identification method based on width learning further includes:
acquiring a preset feature node;
and constructing an enhancement layer by using the preset feature node through a preset activation function so as to construct a preset enhancement node.
It is understood that the preset enhancement node is used in the width learning neural network, and what is described here is the manner of generating the preset enhancement node.
Specifically, a preset feature node is obtained first, and the preset feature node is input into a preset activation function to obtain a preset enhancement node. The preset activation function is as follows:

H_m = ζ(Z_n W'_m + β_m),

where H_m denotes the preset enhancement node of the enhancement layer, ζ denotes the preset activation function, Z_n denotes the preset feature node, W'_m denotes a predefined matrix, β_m denotes an offset vector, embodied as a set of randomly generated fixed values, and m denotes a sequence number.
In addition, see fig. 5, a schematic diagram of the architecture of the preset unified space encoder and the preset width learning neural network, where

Z_n = [Z_1 | Z_2 | … | Z_l | … | Z_u], Z_l = [z_{l,1}, z_{l,2}, …, z_{l,p}, …, z_{l,d}],
H_m = [h_{m,1}, …, h_{m,q}, …, h_{m,s}], β_m = [b_{m,1}, b_{m,2}, …, b_{m,q}, …, b_{m,s}],

and n, l, u, p, d, m, q and s each denote a sequence number. In fig. 5, UASE denotes the preset unified space encoder, average pooling denotes the average pooling operation, and transpose denotes the transposition operation.
The preset width learning neural network can be divided into three layers, namely an input layer, an enhancement layer and an output layer, so its number of layers is smaller than that of a traditional deep learning neural network. The preset feature nodes form the input layer of the preset width learning neural network, the preset enhancement nodes are located in the enhancement layer, and the object recognition result is obtained at the output layer. The comparison operation against the labels takes place at the output layer.
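A sketch of this three-layer forward pass is given below, assuming a sigmoid for the activation function ζ and a randomly generated, then fixed, pair (W'_m, β_m); the argmax-based label comparison at the output layer is an illustrative choice, not a detail fixed by the embodiment.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bls_predict(X, W_enh, beta, W_star):
    """Three-layer forward pass of the preset width learning neural network.

    X      : (N, n) feature nodes Z_n from the UASE (input layer).
    W_enh  : (n, m) randomly generated, then fixed, matrix W'_m.
    beta   : (m,)   randomly generated, then fixed, offset vector beta_m.
    W_star : (n+m, c) preset weight matrix W*.
    """
    H = sigmoid(X @ W_enh + beta)         # enhancement layer: H_m = zeta(Z_n W'_m + beta_m)
    gamma = np.hstack([X, H]) @ W_star    # output layer: gamma = [Z_n | H_m] W*
    return gamma.argmax(axis=1)           # label comparison: most likely object type
```

Since W'_m and β_m stay fixed after random generation, only W^* is learned, which is what keeps the training of the width learning neural network lightweight.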
Therefore, the preset enhanced node can be generated through the embodiment of the invention.
On the basis of the foregoing embodiment, preferably, before S11, the method for object recognition based on width learning further includes:
and S111, obtaining a three-dimensional point cloud sample.
It is understood that the preset feature nodes are used in the width learning neural network, and what is described here is the manner of generating the preset feature nodes.
Specifically, in the first training stage corresponding to the preset unified space encoder, three-dimensional point cloud data in an outdoor environment can be collected by any LiDAR sensor to serve as training samples. Of course, the training samples may be stored on a local hard disk.
Then, a large number of three-dimensional point cloud samples can be visualized; see fig. 6, which is a schematic view of three-dimensional point cloud data visualization in an outdoor environment. Corresponding labels can be marked on the different three-dimensional point cloud samples.
For example, referring to fig. 7, fig. 7 is a schematic diagram of the spatial distribution of the three-dimensional point cloud data corresponding to target objects; from left to right are shown the spatial distributions corresponding to a car, a pedestrian, a bush, a trunk, a tree and a building. That is, the target objects include six types: cars, pedestrians, bushes, trunks, trees and buildings.
Therefore, three-dimensional point cloud samples corresponding to different target objects can be stored in separate files, and a sample label representing the object type is added to each file, for example, the sample label can be an automobile.
When the method is actually used, a plurality of existing files can be directly obtained to obtain the three-dimensional point cloud sample and the sample label corresponding to the three-dimensional point cloud sample.
S112, selecting a current coding matrix from the preset coding matrices according to the coordinate information in the three-dimensional point cloud sample.
A plurality of coding matrices and decoding matrices may be generated first and recorded as preset coding matrices and preset decoding matrices.
Then, the coordinate information can be used as a reference to screen the preset coding matrices so as to select one coding matrix, which is recorded as the current coding matrix.
S113, processing the three-dimensional point cloud sample through the current coding matrix to obtain a target uncertain feature vector under an uncertain feature space.
Then, the original three-dimensional point cloud sample can be mapped to the uncertain feature space (Uncertain-feature space) by using the screened current coding matrix, and the feature vector mapped to the uncertain feature space is recorded as the target uncertain feature vector.
For example, the three-dimensional coordinate information in the three-dimensional point cloud sample may be denoted as (x, y, z), and the three-dimensional coordinate information is processed through the current coding matrix in a preset uncertain space mapping formula.
The preset uncertain space mapping formula is as follows:

r_{i,j,k} = w_{k,1} x_{i,j} + w_{k,2} y_{i,j} + w_{k,3} z_{i,j} + b,

where r_{i,j,k} is the uncertain feature vector, w_{k,1} through w_{k,3} are the elements in the k-th row of the coding matrix, i and j together denote the j-th point of the i-th object, k denotes the vector index of the uncertain feature vector, and b denotes the offset.
S114, performing pooling processing on the target uncertain feature vectors through an average pooling algorithm to obtain target unified feature vectors in the unified feature space.
Then, the feature vectors in the uncertain feature space can be pooled by using an average pooling algorithm so as to be encoded into feature vectors in a unified feature space (Unified-feature space), recorded as the unified feature vector V_i.
The average pooling algorithm is as follows:

v_{i,k} = (1/n_i) Σ_{j=1}^{n_i} r_{i,j,k},

where v_{i,k} is the average pooling result, r_{i,j,k} is the uncertain feature vector, n_i denotes the number of points contained in the three-dimensional point cloud sample, i denotes the i-th object, i and j together denote the j-th point of the i-th object, and k denotes the index of the uncertain feature vector.
The unified feature vector V_i consists of the average pooling results and can be expressed as

V_i = [v_{i,1}, v_{i,2}, …, v_{i,k}, …, v_{i,d}]^T.
It can be seen that the feature vectors in the uncertain feature space can be mapped to the unified feature space through the average pooling operation.
S115, transposing the target unified feature vector to obtain a preset feature node.
The target unified feature vector is transposed to obtain the feature node shown below, which can be recorded as the preset feature node:

X_i = V_i^T,

where X_i denotes the preset feature node of the i-th object and V_i denotes the unified feature vector of the i-th object.
It should be noted that names such as the target unified feature vector and the current unified feature vector are used in the embodiments of the present invention only to distinguish the data contents in different situations; the data types are still the same.
Therefore, the preset feature node can be generated through the embodiment of the invention.
On the basis of the foregoing embodiment, preferably, the selecting of a current coding matrix from preset coding matrices according to the coordinate information in the three-dimensional point cloud sample specifically includes:
training a coding matrix and a decoding matrix through a preset batch gradient descent algorithm to obtain a preset coding matrix and a preset decoding matrix;
mapping the coordinate information in the three-dimensional point cloud sample through the preset coding matrix to obtain hidden layer information;
mapping the hidden layer information through the preset decoding matrix to obtain output layer information;
and if the similarity between the coordinate information and the output layer information is within a preset similarity range, taking the preset coding matrix as the current coding matrix.
It can be understood that the embodiment of the present invention optimizes the training of the coding matrix, and the accuracy of object identification can be further improved by optimizing the coding accuracy of the coding matrix.
Specifically, the coding matrix and the decoding matrix are trained through a preset Batch Gradient Descent (BGD) algorithm; the trained coding matrix is recorded as the preset coding matrix, and the trained decoding matrix is recorded as the preset decoding matrix.
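For illustration, one batch-gradient-descent update of such an encoder-decoder pair could look as follows. The mean-squared reconstruction loss, the learning rate and the parameter shapes are assumptions consistent with the sigmoid f and g functions named below, not details fixed by the embodiment.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bgd_autoencoder_step(W, W_dec, b1, b2, batch, lr=0.01):
    """One batch-gradient-descent update of the coding/decoding matrices.

    W     : (d, 3) coding matrix; W_dec : (3, d) decoding matrix.
    b1, b2: bias terms of the hidden and output layers.
    batch : (n, 3) coordinate information; BGD uses the whole batch at once.
    """
    h = sigmoid(batch @ W.T + b1)            # hidden layer information, f(.)
    out = sigmoid(h @ W_dec.T + b2)          # output layer information, g(.)
    err = out - batch                        # reconstruction error
    # gradients of the mean-squared reconstruction loss
    d_out = err * out * (1.0 - out)          # sigmoid derivative at the output
    d_hid = (d_out @ W_dec) * h * (1.0 - h)  # back-propagated to the hidden layer
    n = len(batch)
    W_dec -= lr * d_out.T @ h / n
    W     -= lr * d_hid.T @ batch / n
    b2    -= lr * d_out.mean(axis=0)
    b1    -= lr * d_hid.mean(axis=0)
    return W, W_dec, b1, b2
```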
Then, the preset coding matrix and the preset decoding matrix may be tested. For example, the coordinate information in the three-dimensional point cloud sample, together with a bias node, may be mapped to the hidden layer through the preset coding matrix to obtain hidden layer information corresponding to the coordinate information.
The hidden layer information, together with another bias node, may then be mapped to the output layer through the preset decoding matrix to obtain output layer information.
The testing process using the preset coding matrix and the preset decoding matrix can be expressed by the following preset testing formula:

â_{i,j} = g(W̃ f(W a_{i,j} + b_1) + b_2),

where â_{i,j} is the output layer information, i.e., the coding information, W̃ is the preset decoding matrix, W is the preset coding matrix, a_{i,j} is the coordinate information of the input three-dimensional point cloud sample, b_1 and b_2 are the bias terms corresponding to the two bias nodes, i and j together denote the j-th point of the i-th object, and both the f function and the g function are Sigmoid functions.
Then, if the coordinate information before encoding and the output layer information obtained after encoding and decoding have a sufficiently high similarity, the preset coding matrix can be retained as the coding matrix to be used subsequently and recorded as the current coding matrix.
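Putting the screening test of S112 together, the following sketch runs candidate matrix pairs through the encode-decode round trip and keeps the first one whose reconstruction is similar enough to the input. The mean-squared-error similarity measure and its threshold are assumptions; the embodiment only requires the similarity to fall within a preset similarity range.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def select_current_coding_matrix(candidates, points, sim_threshold=1e-2):
    """Screen candidate (W, W_dec, b1, b2) tuples trained by BGD.

    candidates : list of (preset coding matrix, preset decoding matrix,
                 hidden-layer bias, output-layer bias) tuples.
    points     : (n, 3) coordinate information a_{i,j} of a point cloud sample.
    Returns the first coding matrix whose encode-decode round trip is
    similar enough to the input, i.e., the current coding matrix.
    """
    for W, W_dec, b1, b2 in candidates:
        hidden = sigmoid(points @ W.T + b1)       # hidden layer information, f(.)
        output = sigmoid(hidden @ W_dec.T + b2)   # output layer information, g(.)
        if np.mean((points - output) ** 2) < sim_threshold:
            return W                              # within the preset similarity range
    return None
```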
In addition, see also Table 1 below.

Table 1: test run table

In Table 1, Name denotes the name, UASE Training sample denotes the number of training samples of the preset unified space encoder, BLS Training sample denotes the number of training samples of the preset width learning neural network, Training Accuracy denotes the training accuracy, BLS Testing sample denotes the number of testing samples of the preset width learning neural network, and Testing Accuracy denotes the testing accuracy; Car denotes car, Pedestrian denotes pedestrian, Bush denotes shrub, Trunk denotes trunk, Tree denotes tree, Building denotes building, and Total/Aver denotes total/average.
In the test flow of Table 1, the preset width learning neural network uses 10 feature nodes as inputs, 12 preset unified space encoders (UASE) and 9000 enhancement nodes.
Therefore, the embodiment of the invention can optimize the coding matrix so as to find the coding matrix with better coding performance to be used in object identification.
Fig. 8 is a schematic structural diagram of an object recognition system based on width learning according to an embodiment of the present invention. As shown in fig. 8, the system includes: a data acquisition module 301, a spatial coding module 302 and an object identification module 303;
the data acquisition module 301 is used for acquiring three-dimensional point cloud data in a current area;
the spatial coding module 302 is configured to process the three-dimensional point cloud data through a preset unified space encoder to obtain a current feature node in a unified feature space;
and the object identification module 303 is configured to perform object identification on the current feature node through a preset width learning neural network.
The object identification system based on width learning provided by the embodiment of the present invention first collects three-dimensional point cloud data in the current area; then processes the three-dimensional point cloud data through the preset unified space encoder to obtain current feature nodes in the unified feature space; and finally performs object recognition on the current feature nodes through the preset width learning neural network. When identifying objects, the preset width learning neural network adopted by the embodiment of the present invention differs from a traditional deep learning neural network: it has fewer network layers and fewer parameters in the neural network structure, so the overall calculation efficiency is higher. Meanwhile, the input of the preset width learning neural network changes from the original three-dimensional point cloud data to feature vectors in the unified feature space, a data type that is simpler to process, which further improves the calculation efficiency.
The system embodiment provided in the embodiments of the present invention is for implementing the above method embodiments, and for details of the process and the details, reference is made to the above method embodiments, which are not described herein again.
Fig. 9 is a schematic entity structure diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 9, the electronic device may include: a processor (processor) 401, a communication interface (Communication Interface) 402, a memory (memory) 403 and a bus 404, wherein the processor 401, the communication interface 402 and the memory 403 communicate with each other through the bus 404. The communication interface 402 may be used for information transfer of the electronic device. The processor 401 may call logic instructions in the memory 403 to perform a method comprising:
collecting three-dimensional point cloud data in a current area;
processing the three-dimensional point cloud data through a preset unified space encoder to obtain current feature nodes in a unified feature space;
and carrying out object recognition on the current feature nodes through a preset width learning neural network.
In addition, the logic instructions in the memory 403 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the above-described method embodiments of the present invention. The aforementioned storage medium includes: a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other various media capable of storing program codes.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the method provided by the foregoing embodiments, for example, comprising:
collecting three-dimensional point cloud data in a current area;
processing the three-dimensional point cloud data through a preset unified space encoder to obtain current feature nodes in a unified feature space;
and carrying out object recognition on the current feature nodes through a preset width learning neural network.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.