CN111160198A - Object identification method and system based on width learning - Google Patents

Object identification method and system based on width learning

Info

Publication number
CN111160198A
Authority
CN
China
Prior art keywords
preset, feature, matrix, width learning, current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911340626.6A
Other languages
Chinese (zh)
Other versions
CN111160198B (en)
Inventor
宋伟
刘子澍
田逸非
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongqu Beijing Technology Co., Ltd.
Original Assignee
North China University of Technology
Application filed by North China University of Technology
Priority to CN201911340626.6A
Publication of CN111160198A
Application granted; publication of CN111160198B
Legal status: Active

Abstract

The embodiment of the invention discloses an object identification method and system based on width learning. The method first collects three-dimensional point cloud data in the current area; the three-dimensional point cloud data are then processed by a preset unified space encoder to obtain current feature nodes in a unified feature space; finally, object recognition is performed on the current feature nodes through a preset width learning neural network. When performing object recognition, the preset width learning neural network adopted by the embodiment of the invention differs from a traditional deep learning neural network and has higher overall computational efficiency; at the same time, the input of the preset width learning neural network changes from the original three-dimensional point cloud data to feature vectors in a unified feature space, a data type that is simpler to process, which further improves computational efficiency.

Description

Object identification method and system based on width learning
Technical Field
The invention relates to the technical field of object recognition, in particular to an object recognition method and system based on width learning.
Background
As the imaging and analysis technologies for binocular images and continuous video mature, three-dimensional light-field perception of a real environment has become achievable, with wide application to automatic driving of unmanned vehicles and object recognition in three-dimensional scenes.
However, such imaging suffers from low resolution in distant views, which lowers the accuracy of the acquired environmental data; moreover, its estimation accuracy is strongly affected by illumination and weather, so accurate three-dimensional data cannot easily be obtained.
In this regard, a LiDAR (Light Detection and Ranging) sensor can overcome these defects: owing to its high speed, high precision and long range, it can acquire three-dimensional point cloud data of an unknown environment.
In addition, three-dimensional point cloud data acquired based on laser ranging is not easily affected by illumination and weather, and distance information within 100 meters can be accurately measured.
However, the three-dimensional point cloud data has the characteristics of sparsity, unstructured property, uneven spatial distribution and the like, and the three-dimensional point cloud data of different objects contains different numbers of three-dimensional points, so that most deep learning neural network architectures cannot directly process the original three-dimensional point cloud data.
Even if the trained traditional deep learning neural network is used for identifying the three-dimensional point cloud data, the calculation efficiency is low.
Disclosure of Invention
In order to solve the above problem, embodiments of the present invention provide an object identification method and system based on width learning.
In a first aspect, an embodiment of the present invention provides an object identification method based on width learning, including:
collecting three-dimensional point cloud data in a current area;
processing the three-dimensional point cloud data through a preset uniform space encoder to obtain current feature nodes in a uniform feature space;
and carrying out object recognition on the current feature nodes through a preset width learning neural network.
Preferably, the processing the three-dimensional point cloud data by a preset uniform space encoder to obtain a current feature node in a uniform feature space specifically includes:
processing the three-dimensional point cloud data through a current coding matrix in a preset unified space coder to obtain a current uncertain feature vector under an uncertain feature space;
pooling the current uncertain feature vectors through an average pooling algorithm in the preset unified space encoder to obtain current unified feature vectors in a unified feature space;
and transposing the current unified feature vector to obtain a current feature node.
Preferably, the identifying the object by using the current feature node through a preset width learning neural network specifically includes:
and carrying out object identification on the current characteristic node through a preset weight matrix in a preset width learning neural network.
Preferably, before the acquiring the three-dimensional point cloud data in the current region, the object identification method based on width learning further includes:
acquiring a preset characteristic node, a preset enhancement node and a weight matrix to be updated in a to-be-trained width learning neural network;
performing output processing according to the preset characteristic node, the preset enhanced node and the weight matrix to be updated to obtain an output matrix;
and updating the weight matrix according to the preset characteristic node, the preset enhanced node and the output matrix so as to update the weight matrix to be updated into a preset weight matrix.
Preferably, before the obtaining of the preset feature node, the preset enhanced node and the weight matrix to be updated in the width learning neural network to be trained, the width learning-based object identification method further includes:
acquiring a preset characteristic node;
and constructing an enhancement layer by using the preset feature node through a preset activation function so as to construct a preset enhancement node.
Preferably, before the obtaining of the preset feature node, the preset enhanced node and the weight matrix to be updated in the width learning neural network to be trained, the width learning-based object identification method further includes:
acquiring a three-dimensional point cloud sample;
selecting a current coding matrix from preset coding matrices according to coordinate information in the three-dimensional point cloud sample;
processing the three-dimensional point cloud sample through the current coding matrix to obtain a target uncertain feature vector under an uncertain feature space;
pooling the target uncertain feature vectors through an average pooling algorithm to obtain target uniform feature vectors in a uniform feature space;
and transposing the target uniform feature vector to obtain a preset feature node.
Preferably, the selecting a current coding matrix from preset coding matrices according to the coordinate information in the three-dimensional point cloud sample specifically includes:
training an encoding matrix and a decoding matrix through a preset batch gradient descent algorithm to obtain a preset encoding matrix and a preset decoding matrix;
mapping coordinate information in the three-dimensional point cloud sample through the preset coding matrix to obtain hidden layer information;
mapping the hidden layer information through the preset decoding matrix to obtain output layer information;
and if the similarity between the coordinate information and the output layer information is within a preset similarity range, taking the preset coding matrix as the current coding matrix.
In a second aspect, an embodiment of the present invention provides an object recognition system based on width learning, including:
the data acquisition module is used for acquiring three-dimensional point cloud data in the current area;
the spatial coding module is used for processing the three-dimensional point cloud data through a preset unified spatial coder to obtain current characteristic nodes in a unified characteristic space;
and the object identification module is used for identifying the object of the current feature node through a preset width learning neural network.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the object identification method based on width learning provided in the first aspect of the present invention when executing the program.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the width learning-based object identification method provided in the first aspect of the present invention.
The object identification method and system based on width learning provided by the embodiment of the invention firstly collect three-dimensional point cloud data in a current area; processing the three-dimensional point cloud data through a preset uniform space encoder to obtain current feature nodes in a uniform feature space; and carrying out object recognition on the current feature nodes through a preset width learning neural network. When the object is identified, the preset width learning neural network adopted by the embodiment of the invention is different from the traditional deep learning neural network, the number of network layers is less, the number of parameters involved in the neural network structure is less, and the overall calculation efficiency is higher; meanwhile, the input quantity of the preset width learning neural network is changed from the original three-dimensional point cloud data to feature vectors in a unified feature space, the data type is simpler to process, and the calculation efficiency is further improved.
Drawings
Fig. 1 is a flowchart of an object identification method based on width learning according to an embodiment of the present invention;
fig. 2 is a flowchart of an object recognition method based on width learning according to another embodiment of the present invention;
FIG. 3 is a flowchart of an object recognition method based on width learning according to yet another embodiment of the present invention;
FIG. 4 is a flowchart of an object recognition method based on width learning according to another embodiment of the present invention;
fig. 5 is a schematic diagram of an architecture of a predetermined unified space coder and a predetermined width learning neural network according to another embodiment of the present invention;
FIG. 6 is a schematic diagram of a three-dimensional point cloud data visualization of an outdoor environment according to another embodiment of the present invention;
fig. 7 is a schematic diagram of spatial distribution of three-dimensional point cloud data corresponding to a target object according to another embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an object recognition system based on width learning according to an embodiment of the present invention;
fig. 9 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an object identification method based on width learning according to an embodiment of the present invention, as shown in fig. 1, the method includes:
and S1, acquiring the three-dimensional point cloud data in the current area.
The execution subject of the embodiment of the invention is electronic equipment, the electronic equipment can be a vehicle-mounted terminal, the vehicle-mounted terminal can comprise a LiDAR sensor, and the LiDAR sensor is used for acquiring three-dimensional point cloud data in the current area, wherein an object to be identified can exist in the current area.
The object to be identified may be a tree or a building, etc.
In addition, the in-vehicle terminal may be disposed on top of an Unmanned Ground Vehicle (UGV).
And S2, processing the three-dimensional point cloud data through a preset uniform space encoder to obtain current feature nodes in a uniform feature space.
And S3, performing object recognition on the current feature nodes through a preset width learning neural network.
It is understood that the embodiment of the present invention may adopt a preset width learning (Broad Learning System) neural network to perform object identification on the object to be identified in the current area. For example, if an object to be identified is present and its object type is a tree, an object identification result may be obtained, such as "the object to be identified is a tree".
The preset width learning neural network is a neural network structure independent of a depth structure.
Compared with a traditional deep learning neural network, the preset width learning neural network used here has a simple structure: it has fewer network layers, so its computation is more efficient and it offers excellent real-time processing performance. Meanwhile, it involves far fewer parameters than a traditional deep learning neural network, giving it a lightweight network structure that can meet the real-time requirements of the unmanned driving field, thereby further improving computational efficiency.
After all, even if a trained deep learning neural network is used for object identification on three-dimensional point cloud data, its real-time performance remains poor because of the large number of parameters involved.
Moreover, on the basis of using the width learning neural network, the embodiment of the invention additionally introduces a network structure of a preset unified space encoder, and the three-dimensional point cloud data can be processed in advance through the network structure, so that the original three-dimensional point cloud data is converted into a feature vector under a unified feature space, namely the current feature node. When the width learning neural network is used specifically, the input quantity is changed from original three-dimensional point cloud data into a feature vector under a uniform feature space, and the data type is simpler to process, so that the calculation efficiency is greatly improved.
The object identification method based on width learning provided by the embodiment of the invention comprises the steps of firstly collecting three-dimensional point cloud data in a current area; processing the three-dimensional point cloud data through a preset uniform space encoder to obtain current feature nodes in a uniform feature space; and carrying out object recognition on the current feature nodes through a preset width learning neural network. When the object is identified, the preset width learning neural network adopted by the embodiment of the invention is different from the traditional deep learning neural network, the number of network layers is less, the number of parameters involved in the neural network structure is less, and the overall calculation efficiency is higher; meanwhile, the input quantity of the preset width learning neural network is changed from the original three-dimensional point cloud data to feature vectors in a unified feature space, the data type is simpler to process, and the calculation efficiency is further improved.
Fig. 2 is a flowchart of an object recognition method based on width learning according to another embodiment of the present invention, where the another embodiment of the present invention is based on the embodiment shown in fig. 1.
In this embodiment, the S2 specifically includes:
s201, processing the three-dimensional point cloud data through a current coding matrix in a preset unified space coder to obtain a current uncertain feature vector under an uncertain feature space.
For the convenience of distinguishing, the embodiment of the invention divides the whole process into a training link and a using link when the object recognition is actually used, and the training link can be divided into two parts, namely a first training link corresponding to a preset uniform space encoder and a second training link corresponding to a preset width learning neural network.
Specifically, in the using link, the original three-dimensional point cloud data is mapped to an uncertain feature space (Uncertain-feature space) by using a predetermined current coding matrix, and the feature vector mapped to the uncertain feature space is recorded as the current uncertain feature vector.
For example, the three-dimensional coordinate information in the three-dimensional point cloud data may be denoted as (x, y, z), and the three-dimensional coordinate information is processed through the current encoding matrix in a preset uncertain space mapping formula.
As for the preset uncertain space mapping formula:

r_{i,j,k} = w_{k,1} x_{i,j} + w_{k,2} y_{i,j} + w_{k,3} z_{i,j} + b,

where r_{i,j,k} is an element of the uncertain feature vector, w_{k,1} to w_{k,3} are elements of the current coding matrix W, i and j together denote the j-th point in the i-th object, k denotes the vector index of the uncertain feature vector, and b denotes the offset.
And S202, performing pooling treatment on the current uncertain feature vectors through an average pooling algorithm in the preset unified space encoder to obtain the current unified feature vectors in the unified feature space.
Then, the feature vectors in the uncertain feature space can be pooled using an average pooling algorithm so as to encode them into feature vectors in a unified feature space (Unified-feature space), recorded as the unified feature vector V_i.
As for the average pooling algorithm:

v_{i,k} = (1/n_i) · Σ_{j=1}^{n_i} r_{i,j,k},

where v_{i,k} is the average pooling result, r_{i,j,k} is an element of the uncertain feature vector, n_i is the number of points contained in the three-dimensional point cloud sample, i denotes the i-th object, i and j together denote the j-th point in the i-th object, and k denotes the index of the uncertain feature vector.
The unified feature vector V_i consists of the average pooling results and can be expressed as

V_i = [v_{i,1}, v_{i,2}, …, v_{i,k}, …, v_{i,d}]^T.
It can be seen that the feature vectors in the uncertain feature space can be mapped to the uniform feature space through the average pooling operation.
S203, transposing the current unified feature vector to obtain a current feature node.
Transposing the current unified feature vector yields a feature node (feature node), denoted the current feature node:

X_i = V_i^T,

where X_i represents the current feature node of the i-th object and V_i represents the unified feature vector of the i-th object.
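Taken together, S201 to S203 amount to: encode each point with the coding matrix, average-pool over the object's points, and transpose. The NumPy sketch below is an illustrative reconstruction under stated assumptions; the coding matrix W, offset b, and feature dimension d are placeholder values, not values from the patent.

```python
import numpy as np

def unified_space_encode(points, W, b):
    """Map an (n_i, 3) point cloud of one object to a feature node X_i.

    points: three-dimensional point cloud data, shape (n_i, 3)
    W:      current coding matrix, shape (d, 3) -- row k holds w_{k,1..3}
    b:      offset (scalar)
    """
    # Uncertain feature space: r_{i,j,k} = w_{k,1} x + w_{k,2} y + w_{k,3} z + b
    r = points @ W.T + b              # shape (n_i, d)
    # Average pooling over the n_i points -> unified feature vector V_i
    V = r.mean(axis=0)                # shape (d,)
    # Transposing the unified feature vector yields the feature node X_i
    return V.reshape(1, -1)           # row vector, shape (1, d)

# Toy example with d = 4 features and a 5-point cloud
rng = np.random.default_rng(0)
X = unified_space_encode(rng.normal(size=(5, 3)), rng.normal(size=(4, 3)), 0.1)
print(X.shape)  # (1, 4)
```

Whatever the number of points n_i in the object, the result is a fixed-length row vector, which is what makes the output directly usable as a neural-network input.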
It should be noted that the current feature node mentioned here is the same data type as the preset feature node appearing later, and is only used for distinguishing.
The object identification method based on width learning provided by this embodiment of the invention applies the current coding matrix and gives a specific implementation of the current feature node; with this implementation, the original three-dimensional point cloud data can be successfully converted into a data type that is easier to process and better suited to the preset width learning neural network.
Fig. 3 is a flowchart of an object recognition method based on width learning according to yet another embodiment of the present invention, which is based on the embodiment shown in fig. 1.
In this embodiment, the S3 specifically includes:
s301, carrying out object identification on the current feature node through a preset weight matrix in a preset width learning neural network.
In the using link, the preset weight matrix can be recorded as W*. When the preset width learning neural network is applied to object recognition, the preset weight matrix is confirmed in the training link, so that the preset width learning neural network can be used directly in the using link.
The object identification method based on width learning provided by the embodiment of the invention uses the preset weight matrix to carry out object identification operation.
On the basis of the foregoing embodiment, preferably, before S1, the method specifically includes:
and S11, acquiring preset characteristic nodes, preset enhanced nodes and a weight matrix to be updated in the width learning neural network to be trained.
And S12, performing output processing according to the preset feature node, the preset enhancement node and the weight matrix to be updated to obtain an output matrix.
Specifically, the second training link corresponding to the preset width learning neural network is involved here, and in order to determine the weight matrix actually used in the using link, that is, the preset weight matrix, the second training link is to train and optimize the weight matrix.
The untrained width learning neural network can be recorded as a to-be-trained width learning neural network, and the trained width learning neural network can be recorded as a preset width learning neural network.
For example, for the training optimization process, a matrix default value of a weight matrix may be initialized first, and the matrix default value is recorded as the weight matrix to be updated. The preset feature node, the preset enhanced node (enhanced node) and the weight matrix to be updated can be output by a preset output processing algorithm.
As for the preset output processing algorithm:

γ = [Z_n | H_m] W*,

where γ represents the output matrix, Z_n represents the preset feature nodes, H_m represents the preset enhancement nodes, W* represents the weight matrix, and n and m both represent sequence numbers.
And S13, updating the weight matrix according to the preset feature node, the preset enhanced node and the output matrix so as to update the weight matrix to be updated into a preset weight matrix.
Then, the output matrix obtained based on the weight matrix to be updated can be used for updating the weight matrix.
For example, in one specific implementation, a block matrix F can first be determined from the preset feature node Z_n and the preset enhancement node H_m, as follows:

F = [Z_n | H_m].

Then a new weight matrix, i.e. the preset weight matrix, is determined from the block matrix and the output matrix:

W* = F^+ γ = (λI + F F^T)^{-1} F^T γ,

where W* represents the weight matrix, F^+ represents the generalized inverse (pseudoinverse) of F, F represents the block matrix, γ represents the output matrix, the variable λ is an eigenvalue, and I represents the identity matrix.
In addition,

F^+ = lim_{λ→0} (λI + F F^T)^{-1} F^T,

i.e., the expression above represents the solution for F^+ obtained when the variable λ approaches 0.
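The weight determination described above is, in effect, a ridge-regularized least-squares solve. The sketch below is an illustrative reconstruction, not the patent's implementation: it writes the solve with F^T F (the standard equivalent form in which the matrix dimensions chain), and λ, the node counts, and the labels are all placeholder values.

```python
import numpy as np

def solve_output_weights(Z, H, gamma, lam=1e-3):
    """Solve W* such that gamma ~= [Z | H] W*, with ridge regularization.

    Z:     preset feature nodes, shape (N, n)
    H:     preset enhancement nodes, shape (N, m)
    gamma: output matrix (e.g. one-hot sample labels), shape (N, c)
    lam:   the regularization eigenvalue lambda
    """
    F = np.hstack([Z, H])                     # block matrix F = [Z_n | H_m]
    # Ridge-regularized pseudoinverse solve; as lam -> 0 this tends to F^+ gamma
    A = lam * np.eye(F.shape[1]) + F.T @ F
    return np.linalg.solve(A, F.T @ gamma)    # shape (n + m, c)

# Toy example: 6 samples, 3 feature nodes, 2 enhancement nodes, 2 classes
rng = np.random.default_rng(1)
Z, H = rng.normal(size=(6, 3)), rng.normal(size=(6, 2))
gamma = np.eye(2)[rng.integers(0, 2, size=6)]   # one-hot labels
W_star = solve_output_weights(Z, H, gamma)
print(W_star.shape)  # (5, 2)
```

Because the solve is in closed form, no gradient-based iteration over the output weights is needed, which is the main source of the computational-efficiency claim made for width learning.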
In addition, the updating of the weight matrix according to the preset feature node, the preset enhanced node and the output matrix to update the weight matrix to be updated to a preset weight matrix specifically includes:
updating a weight matrix according to the preset characteristic node, the preset enhancement node and the output matrix to obtain a target weight matrix;
and if the object identification accuracy rate corresponding to the target weight matrix is not in the preset accuracy rate range, taking the target weight matrix as a new matrix to be updated, executing the step of performing output processing according to the preset characteristic node, the preset enhanced node and the weight matrix to be updated again to obtain an output matrix, and updating the target weight matrix into the preset weight matrix until the object identification accuracy rate corresponding to the target weight matrix is in the preset accuracy rate range.
For example, the weight matrix after the first update may be referred to as the target weight matrix, abbreviated as matrix L1, and an automatic object identification test may be performed with matrix L1 to automatically generate an object identification accuracy; if this accuracy is not within the preset accuracy range, the update continues. In the continued update operation, matrix L1 is used to regenerate the output matrix, and the weight matrix is updated through this output matrix to obtain a new target weight matrix, denoted matrix L2. If the object recognition accuracy corresponding to matrix L2 is high enough, i.e., within the preset accuracy range, matrix L2 may be selected as the weight matrix used in subsequent object recognition, i.e., the preset weight matrix.
Of course, if the object recognition accuracy corresponding to the matrix L1 is within the preset accuracy range, the matrix L1 may be selected as the weight matrix used in the subsequent object recognition.
Therefore, a weight matrix with high object identification accuracy can be obtained by continuously and circularly updating the weight matrix.
Meanwhile, the weight matrix in the neural network structure is updated, so that the identification accuracy when the neural network structure is used for identifying objects can be improved.
Fig. 4 is a flowchart of an object recognition method based on width learning according to another embodiment of the present invention, where another embodiment of the present invention is based on the embodiment shown in fig. 3.
In this embodiment, before the obtaining of the preset feature node, the preset enhanced node, and the weight matrix to be updated in the to-be-trained width learning neural network, the width learning-based object identification method further includes:
acquiring a preset characteristic node;
and constructing an enhancement layer by using the preset feature node through a preset activation function so as to construct a preset enhancement node.
It is understood that the preset enhanced node is used in the breadth learning neural network, and the generation manner of the preset enhanced node is referred to herein.
Specifically, a preset feature node is obtained first, and the preset feature node is input into a preset activation function to obtain a preset enhanced node. As for the preset activation function, as follows,
H_m = ζ(Z_n W'_m + β_m),

where H_m represents the preset enhancement nodes of the enhancement layer, ζ represents the preset activation function, Z_n represents the preset feature nodes, W'_m denotes a predefined matrix, β_m denotes the offset vector (embodied as a set of randomly generated fixed values), and m denotes a sequence number.
In addition, see fig. 5, a schematic diagram of the architecture of the preset unified space encoder and the preset width learning neural network: Z_n = [Z_1 | Z_2 | … | Z_l | … | Z_u], Z_l = [z_{l,1}, z_{l,2}, …, z_{l,p}, …, z_{l,d}], H_m = [h_{m,1}, …, h_{m,q}, …, h_{m,s}], β_m = [b_{m,1}, b_{m,2}, …, b_{m,q}, …, b_{m,s}], where n, l, u, p, d, m, q and s each represent an index. UASE denotes the preset unified space encoder, "average pooling" denotes the average pooling operation, and "transpose" denotes transposition.
The preset width learning neural network can be divided into 3 layers, namely an input layer, an enhancement layer and an output layer, and the number of layers of the preset width learning neural network is less than that of the traditional deep learning neural network. The preset characteristic node is an input layer of the preset width learning neural network, the preset enhancement node is located in the enhancement layer, and the output layer can obtain an object recognition result. The comparison operation of the labels is involved in the output layer.
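A minimal sketch of the three-layer structure described above, under stated assumptions: tanh stands in for the preset activation function ζ (the patent leaves it unspecified), W'_m and β_m are randomly generated and then fixed, and all layer sizes are illustrative.

```python
import numpy as np

def build_enhancement_nodes(Z, m, rng):
    """Construct the enhancement layer H_m = zeta(Z_n W'_m + beta_m).

    zeta is assumed to be tanh here; W'_m (predefined matrix) and beta_m
    (offset vector) are randomly generated fixed values.
    """
    n = Z.shape[1]
    W_prime = rng.normal(size=(n, m))   # predefined matrix W'_m
    beta = rng.normal(size=m)           # fixed random offset vector beta_m
    return np.tanh(Z @ W_prime + beta)

def bls_forward(Z, H, W_star):
    """Output layer: gamma = [Z_n | H_m] W*."""
    return np.hstack([Z, H]) @ W_star

rng = np.random.default_rng(2)
Z = rng.normal(size=(4, 3))             # 4 samples of feature nodes (input layer)
H = build_enhancement_nodes(Z, m=5, rng=rng)
gamma = bls_forward(Z, H, rng.normal(size=(3 + 5, 2)))
print(H.shape, gamma.shape)  # (4, 5) (4, 2)
```

The forward pass is a single broadening step plus one matrix product, which illustrates why this structure has far fewer layers than a deep network.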
Therefore, the preset enhanced node can be generated through the embodiment of the invention.
On the basis of the foregoing embodiment, preferably, before S11, the method for object recognition based on width learning further includes:
and S111, obtaining a three-dimensional point cloud sample.
It is understood that preset feature nodes are used in the width learning neural network, and the generation manner of the preset feature nodes is referred to herein.
Specifically, in a first training link corresponding to a preset unified space encoder, three-dimensional point cloud data in an outdoor environment can be collected by any LiDAR sensor to serve as a training sample. Of course, the training sample may be stored in a local hard disk.
Then, a large number of three-dimensional point cloud samples can be visualized, see fig. 6, where fig. 6 is a schematic view of three-dimensional point cloud data visualization in an outdoor environment, and corresponding labels can be marked on different three-dimensional point cloud samples.
For example, referring to fig. 7, fig. 7 is a schematic diagram of the spatial distribution of three-dimensional point cloud data corresponding to a target object, and the spatial distribution of three-dimensional point cloud data corresponding to a car, a pedestrian, a bush, a trunk, a tree, and a building respectively from left to right, that is, the target object includes six types of cars, pedestrians, bushes, trunks, trees, and buildings.
Therefore, three-dimensional point cloud samples corresponding to different target objects can be stored in separate files, and a sample label representing the object type is added to each file, for example, the sample label can be an automobile.
When the method is actually used, a plurality of existing files can be directly obtained to obtain the three-dimensional point cloud sample and the sample label corresponding to the three-dimensional point cloud sample.
And S112, selecting a current coding matrix from preset coding matrixes according to the coordinate information in the three-dimensional point cloud sample.
A plurality of encoding matrices and decoding matrices may be generated first and recorded as preset encoding matrices and preset decoding matrices.
Then, the coordinate information can be used as the basis for screening the preset encoding matrices, so as to select one encoding matrix and record it as the current encoding matrix.
S113, processing the three-dimensional point cloud sample through the current coding matrix to obtain a target uncertain feature vector under an uncertain feature space.
Then, the original three-dimensional point cloud sample can be mapped to an Uncertain feature space (Uncertain feature space) by using the screened current coding matrix, and the feature vector mapped to the Uncertain feature space is marked as a target Uncertain feature vector.
For example, the three-dimensional coordinate information in the three-dimensional point cloud sample may be denoted as (x, y, z), and the three-dimensional coordinate information is processed through the current coding matrix in a preset uncertain space mapping formula.
The preset uncertain space mapping formula is as follows:

r_{i,j,k} = w_{k,1}·x_{i,j} + w_{k,2}·y_{i,j} + w_{k,3}·z_{i,j} + b,

wherein r_{i,j,k} is the uncertain feature vector, w_{k,1} to w_{k,3} are the elements in the k-th encoding matrix, i and j together denote the j-th point in the i-th object, k denotes the vector serial number of the uncertain feature vector, and b denotes the offset.
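As a sketch, the per-point mapping formula above can be vectorized over all points of one object; the matrix shape (d, 3) and all numeric values below are illustrative assumptions.

```python
import numpy as np

def uncertain_features(points, W, b):
    """Map n points (x, y, z) of one object into the uncertain feature space:
    r[j, k] = W[k, 0]*x_j + W[k, 1]*y_j + W[k, 2]*z_j + b."""
    return points @ W.T + b   # shape (n_points, d)

points = np.array([[1.0, 2.0, 0.5],
                   [0.0, 1.0, 1.0]])   # two sample points (x, y, z)
W = np.ones((4, 3)) * 0.1              # 4 encoding vectors, i.e. d = 4
b = 0.05
r = uncertain_features(points, W, b)
print(r.shape)   # (2, 4)
```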
S114, performing pooling treatment on the target uncertain feature vectors through an average pooling algorithm to obtain target uniform feature vectors in a uniform feature space.
Then, the feature vectors in the uncertain feature space can be pooled by an average pooling algorithm so as to encode them into feature vectors in a Unified feature space (Unified-feature space), recorded as the uniform feature vector V_i.
The average pooling algorithm is as follows:

v_{i,k} = (1/n_i) · Σ_{j=1}^{n_i} r_{i,j,k},

wherein v_{i,k} is the average pooling result, r_{i,j,k} is the uncertain feature vector, n_i is the number of points contained in the three-dimensional point cloud sample of the i-th object, i and j together denote the j-th point in the i-th object, and k denotes the serial number of the uncertain feature vector.
The uniform feature vector V_i consists of the average pooling results, which can be expressed as

V_i = [v_{i,1}, v_{i,2}, …, v_{i,k}, …, v_{i,d}]^T.
It can be seen that the feature vectors in the uncertain feature space can be mapped to the uniform feature space through the average pooling operation.
And S115, transposing the target uniform characteristic vector to obtain a preset characteristic node.
The uniform feature vector is transposed to obtain the feature node shown below, which can be recorded as the preset feature node:

X_i = V_i^T,

wherein X_i denotes the preset feature node of the i-th object, and V_i denotes the uniform feature vector of the i-th object.
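The pooling and transposition steps above can be sketched together as follows; the sample values and the dimension d = 2 are illustrative assumptions.

```python
import numpy as np

def unified_feature_node(r):
    """Average-pool the uncertain features r of one object, shape (n_points, d):
    v[k] = (1/n) * sum_j r[j, k], then transpose the column vector V_i
    into the row feature node X_i = V_i^T."""
    V = r.mean(axis=0, keepdims=True).T   # V_i: column vector, shape (d, 1)
    X = V.T                               # X_i: row vector, shape (1, d)
    return X

r = np.array([[0.2, 0.4],
              [0.6, 0.0],
              [0.4, 0.8]])   # 3 points of one object, d = 2
X_i = unified_feature_node(r)
print(X_i)   # approximately [[0.4 0.4]]
```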
It should be noted that, in the embodiments of the present invention, prefixes such as "target" and "current" (for example, the target uniform feature vector and the current uniform feature vector) only distinguish the same kind of data in different situations; the data types themselves are identical.
Therefore, the preset feature node can be generated through the embodiment of the invention.
On the basis of the foregoing embodiment, preferably, the selecting a current encoding matrix from preset encoding matrices according to the coordinate information in the three-dimensional point cloud sample specifically includes:
training an encoding matrix and a decoding matrix through a preset batch gradient descent algorithm to obtain a preset encoding matrix and a preset decoding matrix;
mapping coordinate information in the three-dimensional point cloud sample through the preset coding matrix to obtain hidden layer information;
mapping the hidden layer information through the preset decoding matrix to obtain output layer information;
and if the similarity between the coordinate information and the output layer information is within a preset similarity range, taking the preset coding matrix as the current coding matrix.
It can be understood that, the embodiment of the present invention optimizes the training of the coding matrix, and the accuracy of object identification can be further improved by optimizing the coding accuracy of the coding matrix.
Specifically, the encoding matrix and the decoding matrix are trained through a preset Batch Gradient Descent (BGD) algorithm; the trained encoding matrix is recorded as the preset encoding matrix, and the trained decoding matrix is recorded as the preset decoding matrix.
Then, the preset encoding matrix and the preset decoding matrix may be used for testing, for example, the coordinate information and a bias node in the three-dimensional point cloud sample may be mapped to the hidden layer through the preset encoding matrix to obtain hidden layer information corresponding to the coordinate information.
The hidden layer information and another bias node may then be mapped to the output layer by a preset decoding matrix to obtain output layer information.
The testing process using the preset encoding matrix and the preset decoding matrix can be expressed by the following preset testing formula:

â_{i,j} = g( Ŵ · f( W · a_{i,j} + b_1 ) + b_2 ),

wherein â_{i,j} is the output layer information, namely the reconstructed coordinate information, Ŵ is the preset decoding matrix, W is the preset encoding matrix, a_{i,j} is the coordinate information of the input three-dimensional point cloud sample, i and j together denote the j-th point of the i-th object, b_1 and b_2 are the bias nodes of the hidden layer and the output layer, and the f function and the g function are both Sigmoid functions.
Then, if the coordinate information before encoding and the output layer information after decoding are sufficiently similar, the preset encoding matrix can be retained as the encoding matrix to be used subsequently, recorded as the current encoding matrix.
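The encode/decode test and the screening step can be sketched as follows. Cosine similarity is an assumed measure; the patent only requires the similarity to fall within a preset range, and all parameter values here are toy placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(a, W_enc, b1, W_dec, b2):
    """Map the coordinates a to the hidden layer with the encoding matrix,
    then to the output layer with the decoding matrix; both use Sigmoid."""
    hidden = sigmoid(a @ W_enc.T + b1)       # hidden layer information
    return sigmoid(hidden @ W_dec.T + b2)    # output layer information

def select_current_matrix(a, candidates, sim_threshold=0.99):
    """Keep the first encoding matrix whose reconstruction is similar enough
    to the input (cosine similarity is an assumption made for this sketch)."""
    for W_enc, b1, W_dec, b2 in candidates:
        out = reconstruct(a, W_enc, b1, W_dec, b2)
        sim = float(a.ravel() @ out.ravel()) / (
            np.linalg.norm(a) * np.linalg.norm(out))
        if sim >= sim_threshold:
            return W_enc
    return None

a = np.array([[0.1, 0.2, 0.3]])        # one point (x, y, z)
W_enc, b1 = np.zeros((4, 3)), 0.0      # toy parameters only
W_dec, b2 = np.zeros((3, 4)), 0.0
out = reconstruct(a, W_enc, b1, W_dec, b2)
print(out.shape)   # (1, 3)
```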
In addition, see also the following table 1,
TABLE 1 test run Table
(Table 1 is reproduced as an image in the original publication; its column headers and row labels are explained below.)
In Table 1, Name denotes the object class name; UASE Training sample denotes the number of training samples of the preset uniform spatial encoder; BLS Training sample denotes the number of training samples of the preset width learning neural network; Training Accuracy denotes the training accuracy; BLS Testing sample denotes the number of testing samples of the preset width learning neural network; and Testing Accuracy denotes the testing accuracy.
Car denotes car, Pedestrian denotes pedestrian, Bush denotes bush, Trunk denotes trunk, Tree denotes tree, Building denotes building, and Total/Aver denotes total/average.
As shown in the test flow of Table 1 above, 10 feature nodes produced by 12 preset uniform spatial encoders (UASE) are used as the input of the preset width learning neural network, together with 9000 enhancement nodes.
Therefore, the embodiment of the invention can optimize the coding matrix so as to find the coding matrix with better coding performance to be used in object identification.
Fig. 8 is a schematic structural diagram of an object recognition system based on width learning according to an embodiment of the present invention. As shown in fig. 8, the system includes: a data acquisition module 301, a spatial coding module 302 and an object identification module 303;
the data acquisition module 301 is used for acquiring three-dimensional point cloud data in a current area;
the spatial coding module 302 is configured to process the three-dimensional point cloud data through a preset uniform spatial coder to obtain a current feature node in a uniform feature space;
and the object identification module 303 is configured to perform object identification on the current feature node through a preset width learning neural network.
The object identification system based on width learning provided by the embodiment of the invention firstly collects three-dimensional point cloud data in a current area; processing the three-dimensional point cloud data through a preset uniform space encoder to obtain current feature nodes in a uniform feature space; and carrying out object recognition on the current feature nodes through a preset width learning neural network. When the object is identified, the preset width learning neural network adopted by the embodiment of the invention is different from the traditional deep learning neural network, the number of network layers is less, the number of parameters involved in the neural network structure is less, and the overall calculation efficiency is higher; meanwhile, the input quantity of the preset width learning neural network is changed from the original three-dimensional point cloud data to feature vectors in a unified feature space, the data type is simpler to process, and the calculation efficiency is further improved.
The system embodiment provided in the embodiments of the present invention is for implementing the above method embodiments, and for details of the process and the details, reference is made to the above method embodiments, which are not described herein again.
Fig. 9 is a schematic entity structure diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 9, the electronic device may include: a processor 401, a communication interface 402, a memory 403 and a bus 404, wherein the processor 401, the communication interface 402 and the memory 403 communicate with each other through the bus 404. The communication interface 402 may be used for information transfer of the electronic device. The processor 401 may call logic instructions in the memory 403 to perform a method comprising:
collecting three-dimensional point cloud data in a current area;
processing the three-dimensional point cloud data through a preset uniform space encoder to obtain current feature nodes in a uniform feature space;
and carrying out object recognition on the current feature nodes through a preset width learning neural network.
In addition, the logic instructions in the memory 403 may be implemented in the form of software functional units and stored in a computer readable storage medium when sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the above-described method embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program codes, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program is implemented by a processor to perform the method provided by the foregoing embodiments, for example, including:
collecting three-dimensional point cloud data in a current area;
processing the three-dimensional point cloud data through a preset uniform space encoder to obtain current feature nodes in a uniform feature space;
and carrying out object recognition on the current feature nodes through a preset width learning neural network.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An object recognition method based on width learning, characterized by comprising:
collecting three-dimensional point cloud data in a current area;
processing the three-dimensional point cloud data through a preset unified space encoder to obtain a current feature node in a unified feature space;
performing object recognition on the current feature node through a preset width learning neural network.
2. The object recognition method based on width learning according to claim 1, wherein the processing the three-dimensional point cloud data through a preset unified space encoder to obtain a current feature node in a unified feature space specifically comprises:
processing the three-dimensional point cloud data through a current encoding matrix in the preset unified space encoder to obtain a current uncertain feature vector in an uncertain feature space;
pooling the current uncertain feature vector through an average pooling algorithm in the preset unified space encoder to obtain a current unified feature vector in the unified feature space;
transposing the current unified feature vector to obtain the current feature node.
3. The object recognition method based on width learning according to claim 1 or 2, wherein the performing object recognition on the current feature node through a preset width learning neural network specifically comprises:
performing object recognition on the current feature node through a preset weight matrix in the preset width learning neural network.
4. The object recognition method based on width learning according to claim 3, characterized in that, before the collecting three-dimensional point cloud data in a current area, the method further comprises:
obtaining a preset feature node, a preset enhancement node and a to-be-updated weight matrix in a to-be-trained width learning neural network;
performing output processing according to the preset feature node, the preset enhancement node and the to-be-updated weight matrix to obtain an output matrix;
updating the weight matrix according to the preset feature node, the preset enhancement node and the output matrix, so as to update the to-be-updated weight matrix to the preset weight matrix.
5. The object recognition method based on width learning according to claim 4, characterized in that, before the obtaining a preset feature node, a preset enhancement node and a to-be-updated weight matrix in a to-be-trained width learning neural network, the method further comprises:
obtaining a preset feature node;
constructing an enhancement layer from the preset feature node through a preset activation function, so as to construct the preset enhancement node.
6. The object recognition method based on width learning according to claim 4, characterized in that, before the obtaining a preset feature node, a preset enhancement node and a to-be-updated weight matrix in a to-be-trained width learning neural network, the method further comprises:
obtaining a three-dimensional point cloud sample;
selecting a current encoding matrix from preset encoding matrices according to coordinate information in the three-dimensional point cloud sample;
processing the three-dimensional point cloud sample through the current encoding matrix to obtain a target uncertain feature vector in the uncertain feature space;
pooling the target uncertain feature vector through an average pooling algorithm to obtain a target unified feature vector in the unified feature space;
transposing the target unified feature vector to obtain the preset feature node.
7. The object recognition method based on width learning according to claim 6, wherein the selecting a current encoding matrix from preset encoding matrices according to coordinate information in the three-dimensional point cloud sample specifically comprises:
training an encoding matrix and a decoding matrix through a preset batch gradient descent algorithm to obtain a preset encoding matrix and a preset decoding matrix;
mapping the coordinate information in the three-dimensional point cloud sample through the preset encoding matrix to obtain hidden layer information;
mapping the hidden layer information through the preset decoding matrix to obtain output layer information;
if a similarity between the coordinate information and the output layer information is within a preset similarity range, taking the preset encoding matrix as the current encoding matrix.
8. An object recognition system based on width learning, characterized by comprising:
a data acquisition module, configured to collect three-dimensional point cloud data in a current area;
a spatial encoding module, configured to process the three-dimensional point cloud data through a preset unified space encoder to obtain a current feature node in a unified feature space;
an object recognition module, configured to perform object recognition on the current feature node through a preset width learning neural network.
9. An electronic device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the object recognition method based on width learning according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the object recognition method based on width learning according to any one of claims 1 to 7.
CN201911340626.6A2019-12-232019-12-23Object identification method and system based on width learningActiveCN111160198B (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN201911340626.6ACN111160198B (en)2019-12-232019-12-23Object identification method and system based on width learning


Publications (2)

Publication NumberPublication Date
CN111160198Atrue CN111160198A (en)2020-05-15
CN111160198B CN111160198B (en)2023-06-27

Family

ID=70558122

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN201911340626.6AActiveCN111160198B (en)2019-12-232019-12-23Object identification method and system based on width learning

Country Status (1)

CountryLink
CN (1)CN111160198B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN111832748A (en)*2020-08-242020-10-27西南大学 An electronic nose width learning method for regression prediction of mixed gas concentration
CN112257817A (en)*2020-12-182021-01-22之江实验室Geological geology online semantic recognition method and device and electronic equipment
WO2021253722A1 (en)*2020-06-192021-12-23中国科学院深圳先进技术研究院Medical image reconstruction technology method and apparatus, storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN109344804A (en)*2018-10-302019-02-15百度在线网络技术(北京)有限公司A kind of recognition methods of laser point cloud data, device, equipment and medium
CN109711410A (en)*2018-11-202019-05-03北方工业大学Three-dimensional object rapid segmentation and identification method, device and system
CN110197203A (en)*2019-05-082019-09-03湖北民族大学 Crack Classification and Recognition Method of Bridge Pavement Based on Width Learning Neural Network
CN110263652A (en)*2019-05-232019-09-20杭州飞步科技有限公司Laser point cloud data recognition methods and device



Also Published As

Publication numberPublication date
CN111160198B (en)2023-06-27

Similar Documents

PublicationPublication DateTitle
CN113095370B (en)Image recognition method, device, electronic equipment and storage medium
CN115690708B (en)Method and device for training three-dimensional target detection model based on cross-modal knowledge distillation
CN109711410A (en)Three-dimensional object rapid segmentation and identification method, device and system
CN115223117B (en) Three-dimensional object detection model training and use method, device, medium and equipment
CN111160198B (en)Object identification method and system based on width learning
CN114418030A (en)Image classification method, and training method and device of image classification model
CN110765882A (en)Video tag determination method, device, server and storage medium
CN111914809A (en)Target object positioning method, image processing method, device and computer equipment
CN111027610B (en)Image feature fusion method, apparatus, and medium
WO2023125628A1 (en)Neural network model optimization method and apparatus, and computing device
CN113553975A (en)Pedestrian re-identification method, system, equipment and medium based on sample pair relation distillation
AlshehriA content-based image retrieval method using neural network-based prediction technique
CN114565092A (en) A kind of neural network structure determination method and device
CN117037102A (en)Object following method, device, computer equipment and storage medium
CN117475253A (en)Model training method and device, electronic equipment and storage medium
CN116958603A (en)Visual positioning method, device, electronic equipment and readable storage medium
CN115222954B (en) Weakly perceived target detection method and related equipment
CN118658028B (en)Intrinsic characteristic self-adaptive visible light infrared fusion detection and identification method and system
CN117115366B (en) Environmental model reconstruction method, system and equipment based on three-dimensional perception of unmanned systems
CN117058498B (en)Training method of segmentation map evaluation model, and segmentation map evaluation method and device
CN117853596A (en)Unmanned aerial vehicle remote sensing mapping method and system
CN115130593B (en)Connection relation determining method, device, equipment and medium
CN117114083A (en)Method and device for constructing attitude estimation model and attitude estimation method
CN117218467A (en)Model training method and related device
CN120474825B (en) Node access control method, device, computer equipment and storage medium

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
GR01Patent grant
GR01Patent grant
TR01Transfer of patent right

Effective date of registration:20240125

Address after:1111, Building 2, Shangfengyuan, Baijiatuan, Haidian District, Beijing, 100029

Patentee after:ZHONGQU (BEIJING) TECHNOLOGY CO.,LTD.

Country or region after:China

Address before:100144 Beijing City, Shijingshan District Jin Yuan Zhuang Road No. 5

Patentee before:NORTH CHINA University OF TECHNOLOGY

Country or region before:China

TR01Transfer of patent right
