Disclosure of Invention
The invention provides a method and a device for labeling training set data, which are used for improving the efficiency of labeling the training set data. The specific technical scheme is as follows.
In a first aspect, an embodiment of the present invention discloses a method for labeling training set data, including:
acquiring marked first sample laser point cloud data in a training set and a corresponding standard marking frame of an object to be marked as model training data;
training a target network model according to the model training data to obtain an updated network model; wherein the updated network model is used to correlate the sample laser point cloud data with the corresponding standard labeling box;
judging whether the quantity of unmarked sample laser point cloud data in the data set to be marked is greater than a first quantity or not;
if the quantity is greater than the first quantity, determining a second number of sample laser point cloud data from the unmarked sample laser point cloud data in the data set to be marked as second sample laser point cloud data, and determining, by the updated network model, a reference marking frame of an object to be marked in the second sample laser point cloud data;
displaying the second sample laser point cloud data and a reference marking frame in a two-dimensional overlook interface according to a first mapping relation between a two-dimensional overlook coordinate system and a three-dimensional coordinate system; the three-dimensional coordinate system is a coordinate system where the second sample laser point cloud data is located, and the two-dimensional overlooking interface corresponds to the two-dimensional overlooking coordinate system;
acquiring first adjustment operation input by a marker aiming at second sample laser point cloud data and a reference marking frame displayed on the two-dimensional overlooking interface, and determining a standard marking frame of an object to be marked in the second sample laser point cloud data according to the first adjustment operation;
adding the second sample laser point cloud data and a corresponding standard marking frame into the training set; and taking the second sample laser point cloud data and the corresponding standard marking frame as model training data, taking the updated network model as a target network model, and returning to execute the step of training the target network model according to the model training data to obtain the updated network model.
Optionally, after determining a reference marking frame of an object to be marked in the second sample laser point cloud data, the method further includes:
displaying the second sample laser point cloud data in a three-dimensional interface; wherein the three-dimensional interface corresponds to the three-dimensional coordinate system;
the step of determining a standard marking frame of an object to be marked in the second sample laser point cloud data according to the first adjustment operation includes:
determining a first to-be-adjusted marking frame of an object to be marked in the second sample laser point cloud data according to the first adjusting operation;
displaying the first to-be-adjusted labeling frame in the three-dimensional interface;
and acquiring second adjustment operation input by a marker aiming at a first to-be-adjusted marking frame displayed in the three-dimensional interface, and adjusting the first to-be-adjusted marking frame according to the second adjustment operation to obtain a standard marking frame of the to-be-marked object in the second sample laser point cloud data.
Optionally, the step of determining a standard labeling frame of an object to be labeled in the second sample laser point cloud data according to the first adjustment operation includes:
determining a second to-be-adjusted marking frame of an object to be marked in the second sample laser point cloud data according to the first adjusting operation;
determining a second mapping relation between the other two-dimensional coordinate system where the other surface of the second to-be-adjusted labeling frame is located and the three-dimensional coordinate system; wherein the other faces comprise back and/or side faces, the other two-dimensional coordinate system comprising: a two-dimensional back-view coordinate system and/or a two-dimensional side-view coordinate system;
displaying the second sample laser point cloud data and the second to-be-adjusted marking frame in other two-dimensional interfaces corresponding to the other two-dimensional coordinate systems according to the second mapping relation;
acquiring a third adjustment operation input by an annotator for the second to-be-adjusted marking frame displayed on the other two-dimensional interface;
and adjusting the second marking frame to be adjusted according to the third adjustment operation to obtain a standard marking frame of the object to be marked in the second sample laser point cloud data.
Optionally, when the updated network model determines that the object to be marked does not exist in the second sample laser point cloud data, the method further includes:
adding the second sample laser point cloud data into a negative sample training set; wherein the sample laser point cloud data in the negative sample training set is not displayed to the annotator.
Optionally, the step of training the target network model according to the model training data includes:
inputting sample laser point cloud data in the model training data into a target network model; the target network model comprises a feature extraction layer and a regression layer;
determining a feature vector in the sample laser point cloud data through the first model parameter of the feature extraction layer; performing regression on the feature vector through a second model parameter of the regression layer to obtain an initial labeling frame;
determining the difference between the initial marking frame and a standard marking frame corresponding to sample laser point cloud data in the model training data;
when the difference is larger than a preset difference threshold value, modifying the first model parameter and the second model parameter according to the difference, and returning to execute the step of inputting the sample laser point cloud data into a target network model;
and when the difference is not larger than a preset difference threshold value, determining that the training of the target network model is finished.
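The threshold-controlled training steps above can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the "marking frame" is reduced to a single scalar (for example, a frame centre coordinate), each layer has one scalar parameter, the parameters are modified by gradient descent on the squared difference, and the names `train_until_threshold`, `lr` and `max_iters` are hypothetical. The source only states that the parameters are modified according to the difference until it no longer exceeds a preset threshold.

```python
from typing import List, Tuple


def train_until_threshold(
    samples: List[Tuple[List[float], float]],
    w1: float = 1.0,          # first model parameter (feature extraction layer)
    w2: float = 1.0,          # second model parameter (regression layer)
    lr: float = 0.02,         # assumed learning rate
    threshold: float = 1e-3,  # preset difference threshold
    max_iters: int = 10000,   # safety cap, not part of the source
) -> Tuple[float, float, float]:
    """Toy training loop: repeat until the mean difference <= threshold."""
    diff = float("inf")
    for _ in range(max_iters):
        g1 = g2 = 0.0
        diff = 0.0
        for points, target in samples:
            x = sum(points) / len(points)   # toy summary of the point cloud
            feature = w1 * x                # feature extraction layer
            pred = w2 * feature             # regression layer -> initial frame
            err = pred - target             # difference to the standard frame
            diff += abs(err)
            g1 += 2.0 * err * w2 * x        # gradient of err**2 w.r.t. w1
            g2 += 2.0 * err * feature       # gradient of err**2 w.r.t. w2
        diff /= len(samples)
        if diff <= threshold:
            break                           # training is finished
        w1 -= lr * g1 / len(samples)        # modify both parameters and
        w2 -= lr * g2 / len(samples)        # feed the samples in again
    return w1, w2, diff
```

A real implementation would use a deep network over the full point cloud and regress all cuboid parameters; only the control flow (forward pass, difference, threshold test, parameter update, repeat) is meant to mirror the steps described.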
Optionally, when the number of the sample laser point cloud data that are not marked in the data set to be marked is not greater than a first number, the method further includes:
taking the sample laser point cloud data which are not marked in the data set to be marked as third sample laser point cloud data, and directly displaying the third sample laser point cloud data;
acquiring a marking operation input by a marker for the third sample laser point cloud data;
determining a standard marking frame of an object to be marked aiming at the third sample laser point cloud data according to the marking operation;
and adding the third sample laser point cloud data and the corresponding standard marking frame into the training set.
In a second aspect, an embodiment of the present invention discloses a device for labeling training set data, including:
the data acquisition module is configured to acquire the marked first sample laser point cloud data in the training set and a corresponding standard marking frame of an object to be marked as model training data;
the model training module is configured to train a target network model according to the model training data to obtain an updated network model; wherein the updated network model is used to correlate the sample laser point cloud data with the corresponding standard labeling box;
the quantity judging module is configured to judge whether the quantity of the unmarked sample laser point cloud data in the data set to be marked is greater than a first quantity;
a reference frame determining module, configured to determine, when the number of sample laser point cloud data that are not marked in the data set to be marked is greater than a first number, a second number of sample laser point cloud data from the sample laser point cloud data that are not marked in the data set to be marked as second sample laser point cloud data, and determine, by the updated network model, a reference marking frame of an object to be marked in the second sample laser point cloud data;
a two-dimensional display module configured to display the second sample laser point cloud data and the reference marking frame in a two-dimensional overlook interface according to a first mapping relationship between a two-dimensional overlook coordinate system and a three-dimensional coordinate system; the three-dimensional coordinate system is a coordinate system where the second sample laser point cloud data is located, and the two-dimensional overlooking interface corresponds to the two-dimensional overlooking coordinate system;
the standard frame determining module is configured to acquire first adjusting operation input by a marker for the second sample laser point cloud data and the reference marking frame displayed on the two-dimensional overlooking interface, and determine a standard marking frame of an object to be marked in the second sample laser point cloud data according to the first adjusting operation;
a first adding module configured to add the second sample laser point cloud data and a corresponding standard labeling box to the training set;
and the data updating module is configured to take the second sample laser point cloud data and the corresponding standard marking frame as model training data, take the updated network model as a target network model, and return to execute the operation of training the target network model according to the model training data to obtain the updated network model.
Optionally, the apparatus further comprises:
a three-dimensional display module configured to display the second sample laser point cloud data in a three-dimensional interface after determining a reference marking frame of an object to be marked in the second sample laser point cloud data; wherein the three-dimensional interface corresponds to the three-dimensional coordinate system;
the standard frame determining module, when determining the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjusting operation, includes:
determining a first to-be-adjusted marking frame of an object to be marked in the second sample laser point cloud data according to the first adjusting operation;
displaying the first to-be-adjusted labeling frame in the three-dimensional interface;
and acquiring second adjustment operation input by a marker aiming at a first to-be-adjusted marking frame displayed in the three-dimensional interface, and adjusting the first to-be-adjusted marking frame according to the second adjustment operation to obtain a standard marking frame of the to-be-marked object in the second sample laser point cloud data.
Optionally, when the standard frame determining module determines the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjusting operation, the standard frame determining module includes:
determining a second to-be-adjusted marking frame of an object to be marked in the second sample laser point cloud data according to the first adjusting operation;
determining a second mapping relation between the other two-dimensional coordinate system where the other surface of the second to-be-adjusted labeling frame is located and the three-dimensional coordinate system; wherein the other faces include back and/or side faces, and the other two-dimensional coordinate system includes: a two-dimensional back-view coordinate system and/or a two-dimensional side-view coordinate system;
displaying the second sample laser point cloud data and the second to-be-adjusted marking frame in other two-dimensional interfaces corresponding to the other two-dimensional coordinate systems according to the second mapping relation;
acquiring a third adjusting operation input by an annotator for the second to-be-adjusted marking frame displayed on the other two-dimensional interface;
and adjusting the second marking frame to be adjusted according to the third adjustment operation to obtain a standard marking frame of the object to be marked in the second sample laser point cloud data.
Optionally, the apparatus further comprises:
the second adding module is configured to add the second sample laser point cloud data into a negative sample training set when the updated network model determines that the object to be marked does not exist in the second sample laser point cloud data; wherein the sample laser point cloud data in the negative sample training set is not displayed to the annotator.
Optionally, the model training module is specifically configured to:
inputting sample laser point cloud data in the model training data into a target network model; the target network model comprises a feature extraction layer and a regression layer;
determining a feature vector in the sample laser point cloud data through the first model parameter of the feature extraction layer; performing regression on the feature vector through a second model parameter of the regression layer to obtain an initial labeling frame;
determining the difference between the initial marking frame and a standard marking frame corresponding to sample laser point cloud data in the model training data;
when the difference is larger than a preset difference threshold value, modifying the first model parameter and the second model parameter according to the difference, and returning to execute the operation of inputting the sample laser point cloud data into a target network model;
and when the difference is not larger than a preset difference threshold value, determining that the training of the target network model is finished.
Optionally, the two-dimensional display module is further configured to, when the number of sample laser point cloud data that are not marked in the data set to be marked is not greater than a first number, take the sample laser point cloud data that are not marked in the data set to be marked as third sample laser point cloud data, and directly display the third sample laser point cloud data;
the standard box determining module is further configured to acquire a labeling operation input by a labeling operator for the third sample laser point cloud data; determining a standard marking frame of an object to be marked aiming at the third sample laser point cloud data according to the marking operation;
the first adding module is further configured to add the third sample laser point cloud data and the corresponding standard labeling box to the training set.
As can be seen from the above, the method and device for labeling training set data provided in the embodiments of the present invention can train a target network model with labeled sample laser point cloud data and the corresponding standard labeling frames, and use the trained network model to determine a reference labeling frame of an object to be labeled in unlabeled laser point cloud data. The reference labeling frame serves as a reference that the annotator adjusts to obtain the standard labeling frame, which reduces the labeling complexity for the annotator and improves the labeling efficiency of the training set data. In addition, because laser point cloud data is distributed in three-dimensional space, the laser point cloud data and the reference labeling frame are displayed in a two-dimensional top-view interface for the annotator to adjust. Adjusting the reference labeling frame in the two-dimensional top-view interface replaces the original approach of labeling a three-dimensional bounding frame directly in three-dimensional space, which further reduces the labeling difficulty for the annotator and improves the labeling efficiency of the training set data. Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
The innovation points of the embodiment of the invention comprise:
1. A network model is trained on the marked data, and the trained model is used to mark the unmarked data. The laser point cloud data and the marking frame determined by the model are displayed in a two-dimensional top-view interface and provided to the annotator for adjustment to obtain a standard marking frame. When there is a large amount of sample data and the data, such as laser point cloud sample data, is difficult to mark, this marking mode can reduce the marking difficulty and improve the marking efficiency.
2. When the standard marking frame is determined, the reference marking frame can be displayed in other two-dimensional interfaces for a marker to adjust other surfaces of the three-dimensional surrounding frame, so that the more accurate standard marking frame can be obtained, and compared with the method of directly adjusting other surfaces in the three-dimensional interface, the marking efficiency can be improved.
3. When the network model determines that no object to be marked exists in the sample laser point cloud data, the sample laser point cloud data is not displayed to the annotator for identification, but is directly added to a negative sample training set for training other models. When a large amount of sample data exists, this operation screens the sample data so that only laser point cloud data containing an object to be marked is provided to the annotator for marking, which can improve the marking efficiency.
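The screening described in point 3 can be sketched as follows. This is an illustrative assumption rather than the disclosed implementation: `screen_samples` and `predict` are hypothetical names, and `predict` stands in for the trained network model, returning a (possibly empty) list of reference marking frames for one sample.

```python
from typing import Any, Callable, List, Tuple


def screen_samples(
    batch: List[Any],
    predict: Callable[[Any], List[Any]],
) -> Tuple[List[Tuple[Any, List[Any]]], List[Any]]:
    """Split a batch into samples to show the annotator and negative samples."""
    to_display: List[Tuple[Any, List[Any]]] = []
    negatives: List[Any] = []
    for sample in batch:
        frames = predict(sample)
        if frames:
            # At least one object found: show sample and reference frames.
            to_display.append((sample, frames))
        else:
            # No object found: goes to the negative sample training set
            # and is never displayed to the annotator.
            negatives.append(sample)
    return to_display, negatives
```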
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiment of the invention discloses a method and a device for labeling training set data, which can improve the efficiency of labeling the training set data. The following provides a detailed description of embodiments of the present invention.
Fig. 1 is a schematic flow chart of a method for labeling training set data according to an embodiment of the present invention. The method is applied to an electronic device. The electronic device may be a general-purpose computer, a server, an intelligent mobile device, or the like. The method specifically includes the following steps.
S110: Acquiring the marked first sample laser point cloud data in the training set and the corresponding standard marking frame of the object to be marked as model training data.
The training set includes a plurality of sample laser point cloud data used for training the network model and the standard marking frames of the corresponding objects to be marked. Each sample laser point cloud data includes a plurality of laser data points. The sample laser point cloud data can be acquired by a laser radar. The laser radar can be installed in an intelligent device, such as an intelligent vehicle or a robot. The laser radar can collect laser point cloud data over a full circle of the surrounding environment centered on the laser radar.
When the laser radar collects data, a plurality of laser beams are emitted to the surrounding environment, and each laser beam meets an object and is reflected back to the laser radar. The laser radar can obtain laser point cloud data according to the emitted laser beams and the returned laser beams, and the laser point cloud data can represent the three-dimensional space position of a surrounding object taking the laser radar as the center.
The objects to be labeled may include vehicles, pedestrians, bicycles, tricycles, and the like. The standard marking frame can be understood as a three-dimensional surrounding frame which can surround an object to be marked, and the standard marking frame can be represented in a cuboid form. The number of the first sample laser point cloud data can be multiple, each first laser point cloud data can contain multiple objects to be marked, and each object to be marked corresponds to one standard marking frame.
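As an illustration only, a cuboid marking frame could be stored as a centre, three dimensions, and a heading angle about the Z axis. This parameterization and the names `Box3D` and `corners` are assumptions; the source states only that the frame is a three-dimensional surrounding frame that can be represented as a cuboid.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box3D:
    """Hypothetical cuboid marking frame for one object to be marked."""
    cx: float
    cy: float
    cz: float
    length: float          # extent along the frame's own X direction
    width: float           # extent along the frame's own Y direction
    height: float          # extent along Z
    yaw: float = 0.0       # rotation about the Z axis, in radians
    label: str = "vehicle"

    def corners(self) -> List[Tuple[float, float, float]]:
        """Return the 8 vertices: 4 bottom corners, then 4 top corners."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        pts = []
        for dz in (-0.5, 0.5):
            for dx, dy in ((0.5, 0.5), (0.5, -0.5), (-0.5, -0.5), (-0.5, 0.5)):
                lx, ly = dx * self.length, dy * self.width
                pts.append((
                    self.cx + c * lx - s * ly,
                    self.cy + s * lx + c * ly,
                    self.cz + dz * self.height,
                ))
        return pts
```

One sample of laser point cloud data would then carry a list of such frames, one per object to be marked.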
S120: Training the target network model according to the model training data to obtain an updated network model.
The updated network model is used to associate the sample laser point cloud data with the corresponding standard marking frame.
In this step, the target network model may be a deep learning network model. The trained network model can determine, according to its model parameters, a marking frame of the object to be marked in the input laser point cloud data, and this marking frame can be used as a reference mark. After the target network model is trained according to the model training data, the resulting network model has a certain accuracy and can associate the sample laser point cloud data with the corresponding standard marking frame to a certain extent.
S130: Judging whether the quantity of the unmarked sample laser point cloud data in the data set to be marked is greater than the first quantity, and if so, executing step S140.
The data set to be labeled includes a large amount of unlabeled sample laser point cloud data. When the number of unlabeled sample laser point cloud data in the data set to be labeled is not greater than the first number, there is no need to determine a reference marking frame with the network model, and the sample laser point cloud data is provided directly to the annotator for marking.
When the number of the sample laser point cloud data that are not labeled in the to-be-labeled data set is greater than the first number, step S140 may be performed to improve the labeling efficiency.
S140: Determining a second number of sample laser point cloud data from the unmarked sample laser point cloud data in the data set to be marked as second sample laser point cloud data, and determining, by the updated network model, a reference marking frame of the object to be marked in the second sample laser point cloud data.
The second number may be smaller than the first number, or may be larger than the first number. For example, the first number may be 2000 and the second number may be 1000. When the number of the sample laser point cloud data which are not marked in the data set to be marked is 5000, 1000 sample laser point cloud data can be selected from the 5000 sample laser point cloud data to serve as second sample laser point cloud data.
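The selection in the example above can be sketched as follows. The source does not say how the second number of samples is chosen, so the uniform random draw here is an assumption, as are the names `select_second_samples`, `first_number`, `second_number` and `seed`.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_second_samples(
    unlabeled: Sequence[T],
    first_number: int,
    second_number: int,
    seed: Optional[int] = None,
) -> List[T]:
    """Pick the second sample data when the unlabeled pool is large enough."""
    if len(unlabeled) <= first_number:
        # Not greater than the first number: no model-assisted batch.
        return []
    rng = random.Random(seed)
    # Never ask for more samples than the pool contains.
    k = min(second_number, len(unlabeled))
    return rng.sample(list(unlabeled), k)
```

With a first number of 2000 and a second number of 1000, a pool of 5000 unlabeled samples yields a batch of 1000, while a pool of 1500 yields no batch and falls through to direct manual marking.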
Each second sample laser point cloud data is input into the updated network model, and the updated network model determines a reference marking frame of the object to be marked in each second sample laser point cloud data according to the network parameters obtained through training. The reference marking frame is a three-dimensional surrounding frame and can be represented by a cuboid. The reference marking frame may not be accurate enough, so the annotator needs to adjust it to improve the accuracy of the marking frame.
S150: Displaying the second sample laser point cloud data and the reference marking frame in a two-dimensional top-view interface according to a first mapping relation between the two-dimensional top-view coordinate system and the three-dimensional coordinate system.
The three-dimensional coordinate system is the coordinate system in which the second sample laser point cloud data is located. The three-dimensional coordinate system may take a point in the laser radar as the coordinate origin, the advancing direction of the intelligent device as the X-axis direction, the vertically upward direction as the Z-axis direction, and the direction directly to the left of the intelligent device as the Y-axis direction.
The two-dimensional top view interface corresponds to a two-dimensional top view coordinate system. Since the direction of the Z axis in the three-dimensional coordinate system is a vertically upward direction, the two-dimensional plan view coordinate system may be a two-dimensional coordinate system including the X axis and the Y axis in the three-dimensional rectangular coordinate system. The X-axis and the Y-axis may be a first coordinate axis and a second coordinate axis, and the Z-axis is a third coordinate axis.
Referring to fig. 2, fig. 2 is a schematic diagram of a coordinate structure architecture according to an embodiment of the present invention. The interface shown is a two-dimensional top view interface with the X-axis pointing up and the Y-axis pointing left. The white dots in fig. 2 are data points in the laser point cloud data, and both the arc-shaped dotted lines and the straight dotted lines in the interface are auxiliary lines for defining the labeling range.
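One possible form of the first mapping relation, given a top-view interface with the X-axis pointing up and the Y-axis pointing left, is sketched below. The pixel convention (u grows to the right, v grows downward), the `scale` in pixels per metre, and the on-screen origin `origin_px` are assumptions for illustration, as are the function names.

```python
from typing import Tuple


def world_to_top_view(
    x: float,
    y: float,
    z: float,
    scale: float = 10.0,                           # assumed pixels per metre
    origin_px: Tuple[float, float] = (400.0, 400.0),  # assumed screen origin
) -> Tuple[float, float]:
    """Map a 3D point to top-view pixels; Z is dropped by the projection."""
    u0, v0 = origin_px
    u = u0 - y * scale   # +Y points left, so it decreases the column
    v = v0 - x * scale   # +X points up, so it decreases the row
    return (u, v)


def top_view_to_world_xy(
    u: float,
    v: float,
    scale: float = 10.0,
    origin_px: Tuple[float, float] = (400.0, 400.0),
) -> Tuple[float, float]:
    """Inverse mapping: recover (X, Y) from a top-view pixel position."""
    u0, v0 = origin_px
    return ((v0 - v) / scale, (u0 - u) / scale)
```

The inverse is what lets a drag in the two-dimensional interface be converted back into X and Y coordinates of the three-dimensional frame.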
The second sample laser point cloud data displayed in the two-dimensional top-view interface can be viewed by the annotator in order to mark the object to be marked. The annotator can be a human, an advanced intelligent robot, or the like, which is not specifically limited in this application.
Because the sample laser point cloud data is distributed in three-dimensional space, the marking frame that surrounds the object to be marked is also three-dimensional. Marking it directly would require the annotator to operate in three-dimensional space, which is difficult. To make it easier for the annotator to adjust the reference marking frame, the second sample laser point cloud data and the reference marking frame are displayed in a two-dimensional top-view interface.
S160: Acquiring a first adjustment operation input by the annotator for the second sample laser point cloud data and the reference marking frame displayed on the two-dimensional top-view interface, and determining the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjustment operation.
The first adjusting operation is used for adjusting a reference marking frame containing an object to be marked. The first adjustment operation may include at least one of a mouse click operation, a mouse drag operation, a keyboard drag operation, and the like.
According to the first adjustment operation aiming at the reference marking frame, an adjusted marking frame can be obtained, and the adjusted marking frame can be used as a standard marking frame.
Referring to fig. 3, the second sample laser point cloud data is displayed in the two-dimensional top view interface at the right-hand corner, and the displayed reference marking frame is a two-dimensional rectangular frame in the XY plane. The annotator performs the adjustment operation on this two-dimensional rectangular frame, which is simple and easy to carry out. The X and Y coordinates of the vertices of the reference marking frame may be modified according to the first adjustment operation of the annotator, while the Z coordinates of the vertices remain unchanged.
S170: Adding the second sample laser point cloud data and the corresponding standard marking frame into the training set, taking the second sample laser point cloud data and the corresponding standard marking frame as model training data, taking the updated network model as the target network model, and returning to execute step S120.
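A minimal sketch of how a first adjustment operation could be applied to the frame vertices is given below, assuming the operation has already been converted into an offset in the XY plane. The name `translate_xy` and the choice of a pure translation are illustrative assumptions; resizing or rotating in the XY plane would follow the same rule of leaving Z untouched.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def translate_xy(vertices: List[Vertex], dx: float, dy: float) -> List[Vertex]:
    """Shift every vertex in the XY plane; Z coordinates are left unchanged."""
    return [(x + dx, y + dy, z) for (x, y, z) in vertices]
```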
In this embodiment, when the second sample laser point cloud data is added to the training set, the second sample laser point cloud data in the data set to be marked may also be deleted.
Taking the second sample laser point cloud data and the corresponding standard marking frame as model training data, taking the updated network model as the target network model, and returning to step S120 to continue training the target network model can continuously improve the accuracy of the reference marking frame determined by the target network model. This in turn reduces the amount of adjustment the annotator must make to the reference marking frame and improves the efficiency of determining the standard marking frame.
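The loop formed by steps S110 to S170 can be sketched as below. This is a schematic under stated assumptions, not the disclosed implementation: `train`, `predict` and `adjust` are hypothetical callables standing in for model training, reference-frame prediction, and the annotator's adjustment in the interface, and the batch is simply taken from the front of the unlabeled list.

```python
from typing import Any, Callable, List, Tuple

Labeled = Tuple[Any, Any]  # (sample laser point cloud data, standard frame)


def iterative_labeling(
    training_set: List[Labeled],
    unlabeled: List[Any],
    model: Any,
    train: Callable[[Any, List[Labeled]], Any],
    predict: Callable[[Any, Any], Any],
    adjust: Callable[[Any, Any], Any],
    first_number: int,
    second_number: int,
) -> Tuple[Any, List[Any]]:
    """Run S110-S170 until the unlabeled pool is no larger than first_number."""
    if second_number <= 0:
        raise ValueError("second_number must be positive")
    training_data = list(training_set)               # S110
    while True:
        model = train(model, training_data)          # S120
        if len(unlabeled) <= first_number:           # S130
            return model, unlabeled                  # rest is marked manually
        batch = unlabeled[:second_number]            # S140: second sample data
        del unlabeled[:second_number]                # remove from the pool
        newly_labeled = [
            (s, adjust(s, predict(model, s)))        # S140-S160
            for s in batch
        ]
        training_set.extend(newly_labeled)           # S170
        training_data = newly_labeled                # next round's training data
```

The function returns the final model together with the samples that remain unlabeled, which would then be marked directly by the annotator.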
As can be seen from the above, in this embodiment, the marked sample laser point cloud data and the corresponding standard marking frame can be used to train the target network model, and the trained network model is used to determine the reference marking frame of the object to be marked in the unmarked laser point cloud data. The reference marking frame serves as a reference that the annotator adjusts to obtain the standard marking frame, which reduces the marking complexity for the annotator and improves the marking efficiency of the training set data. In addition, because laser point cloud data is distributed in three-dimensional space, the laser point cloud data and the reference marking frame are displayed in the two-dimensional top-view interface for the annotator to adjust. Adjusting the reference marking frame in the two-dimensional top-view interface replaces the original approach of marking a three-dimensional surrounding frame directly in three-dimensional space, which further reduces the marking difficulty for the annotator and improves the marking efficiency of the training set data.
In another embodiment of the present invention, based on the embodiment shown in fig. 1, after determining the reference marking frame of the object to be marked in the second sample laser point cloud data in step S140, the method may further display the second sample laser point cloud data in the three-dimensional interface. Wherein the three-dimensional interface corresponds to the three-dimensional coordinate system.
In step S160, the step of determining the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjustment operation specifically includes the following steps 1a to 3a.
Step 1a: Determining a first to-be-adjusted marking frame of the object to be marked in the second sample laser point cloud data according to the first adjusting operation.
In this step, according to the first adjustment operation for the reference labeling frame, an adjusted labeling frame may be obtained, and the adjusted labeling frame may be used as the first to-be-adjusted labeling frame.
Step 2a: Displaying the first to-be-adjusted marking frame in the three-dimensional interface.
Step 3a: Acquiring a second adjusting operation input by the annotator for the first to-be-adjusted marking frame displayed in the three-dimensional interface, and adjusting the first to-be-adjusted marking frame according to the second adjusting operation to obtain the standard marking frame of the object to be marked in the second sample laser point cloud data.
Because the first to-be-adjusted marking frame is used for adjusting the size and the position of the reference marking frame in the XY plane, after the first to-be-adjusted marking frame is displayed in the three-dimensional interface, the Z parameter of the first to-be-adjusted marking frame can be adjusted according to the second adjustment operation input by the marker, and the standard marking frame of the object to be marked in the second sample laser point cloud data is obtained.
The second adjusting operation is used for adjusting the size and the position of the first marking frame to be adjusted. The second adjustment operation may include at least one of a mouse click operation, a mouse drag operation, a keyboard drag operation, and the like.
In this embodiment, when the reference marking frame is adjusted in the two-dimensional top-view interface, the adjustment may be displayed in the three-dimensional interface in real time according to the first adjustment operation. Likewise, when the first to-be-adjusted marking frame is adjusted in the three-dimensional interface, the adjustment may be displayed in the two-dimensional top-view interface in real time according to the second adjustment operation.
In summary, the second sample laser point cloud data and the first to-be-adjusted marking frame can be displayed in the two-dimensional top-view interface and the three-dimensional interface simultaneously, and the labeling draws on the mapping between three dimensions and two dimensions, so the resulting data is more reliable. The annotator can also view the marking frame from more angles and correct it promptly.
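As a non-limiting illustration, the two-stage adjustment of steps 1a to 3a may be sketched in Python as follows. The `Box3D` class, the function names, and the representation of an adjustment operation as numeric offsets are assumptions made for this sketch and are not defined by the embodiment; the marking frame is also assumed to be axis-aligned.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box3D:
    """Hypothetical axis-aligned marking frame: center and size in the 3D coordinate system."""
    cx: float
    cy: float
    cz: float
    length: float  # extent along X
    width: float   # extent along Y
    height: float  # extent along Z


def apply_top_view_adjustment(box, dx=0.0, dy=0.0, dlength=0.0, dwidth=0.0):
    """First adjustment operation (2D top view): only XY position and XY size change."""
    return replace(box, cx=box.cx + dx, cy=box.cy + dy,
                   length=box.length + dlength, width=box.width + dwidth)


def apply_3d_adjustment(box, dz=0.0, dheight=0.0):
    """Second adjustment operation (3D interface): only the Z parameters change."""
    return replace(box, cz=box.cz + dz, height=box.height + dheight)


# Reference marking frame as it might be proposed by the updated network model.
reference = Box3D(cx=10.0, cy=2.0, cz=0.5, length=4.0, width=2.0, height=1.0)
# Step 1a: the top-view adjustment yields the first to-be-adjusted marking frame.
first_to_adjust = apply_top_view_adjustment(reference, dx=0.5, dwidth=0.25)
# Step 3a: the adjustment in the 3D interface yields the standard marking frame.
standard = apply_3d_adjustment(first_to_adjust, dz=0.25, dheight=0.5)
```

Keeping the two operations separate mirrors the division of labor described above: the top view fixes the footprint of the frame, and the three-dimensional interface is then needed only for the vertical extent.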
In another embodiment of the present invention, based on the embodiment shown in fig. 1, in order to further improve the accuracy of the standard marking frame, the step of determining the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjustment operation in step S160 may specifically include the following steps 1b to 5b.
Step 1b: determining a second to-be-adjusted marking frame of the object to be marked in the second sample laser point cloud data according to the first adjustment operation.
In this step, an adjusted marking frame is obtained by applying the first adjustment operation to the reference marking frame, and the adjusted marking frame is used as the second to-be-adjusted marking frame.
Step 2b: determining a second mapping relation between the three-dimensional coordinate system and the other two-dimensional coordinate system in which another face of the second to-be-adjusted marking frame lies.
The other faces include the back face and/or a side face, and the other two-dimensional coordinate systems include a two-dimensional back-view coordinate system and/or a two-dimensional side-view coordinate system. The other faces can be understood as the faces of the second to-be-adjusted marking frame that are not parallel to the face displayed in the two-dimensional top-view interface. The back face is the face of the second to-be-adjusted marking frame that is seen when facing the frame from the position of the laser radar, or a plane parallel to that face. The side face is a lateral face of the second to-be-adjusted marking frame as seen from the position of the laser radar, and may be the left side face or the right side face.
In this step, a second mapping relationship between the other two-dimensional coordinate system and the three-dimensional coordinate system can be determined according to the coordinate of the second to-be-adjusted labeling frame in the three-dimensional coordinate system. The second mapping relationship can be determined by coordinate transformation.
Fig. 4 is a schematic diagram of the three-dimensional coordinate system and the two-dimensional top-view, back-view, and side-view coordinate systems provided by the present invention. In the figure, the cuboid is the second to-be-adjusted marking frame of the object to be marked in the three-dimensional coordinate system, the X-axis direction is the advancing direction of the intelligent device, the Z axis points vertically upward, the point O is the position of the laser radar, and three edges of the cuboid are respectively parallel to the three coordinate axes of the three-dimensional coordinate system. The two-dimensional top-view coordinate system comprises an X axis and a Y axis, the two-dimensional back-view coordinate system comprises a Z axis and a Y axis, and the two-dimensional side-view coordinate system comprises a Z axis and an X axis. From the figure, the correspondence between each of the two-dimensional top-view, back-view, and side-view coordinate systems and the three-dimensional coordinate system can be determined. The two-dimensional top-view coordinate system corresponds to the two-dimensional top-view interface, the two-dimensional back-view coordinate system corresponds to the two-dimensional back-view interface, and the two-dimensional side-view coordinate system corresponds to the two-dimensional side-view interface.
When the sides of the cuboid are not parallel to the coordinate axes, projection can be performed according to the included angle between the sides and the coordinate axes, and a coordinate system where the back and the side of the cuboid are located is obtained.
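As a non-limiting illustration, the mapping relations between the three-dimensional coordinate system and the two-dimensional top-view, back-view, and side-view coordinate systems may be sketched in Python as follows. The function names, the `VIEWS` table, the ordering of the axes within each view, and the use of a single yaw angle about the Z axis to describe a cuboid whose edges are not parallel to the coordinate axes are all assumptions made for this sketch.

```python
import math

# Indices of the 3D axes kept by each 2D view (horizontal axis, vertical axis).
# Top view keeps X and Y, back view keeps Y and Z, side view keeps X and Z.
VIEWS = {"top": (0, 1), "back": (1, 2), "side": (0, 2)}


def box_corners(cx, cy, cz, length, width, height, yaw=0.0):
    """Return the 8 corners of a cuboid rotated by `yaw` radians about the Z axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx in (-0.5, 0.5):
        for sy in (-0.5, 0.5):
            for sz in (-0.5, 0.5):
                lx, ly = sx * length, sy * width  # offsets in the box's own frame
                corners.append((cx + c * lx - s * ly,
                                cy + s * lx + c * ly,
                                cz + sz * height))
    return corners


def project(points, view, yaw=0.0):
    """Map 3D points into a 2D view.

    Points are first rotated by -yaw about the Z axis so that the edges of a
    rotated cuboid become parallel to the view axes, then two axes are kept.
    """
    c, s = math.cos(-yaw), math.sin(-yaw)
    a, b = VIEWS[view]
    projected = []
    for x, y, z in points:
        rotated = (c * x - s * y, s * x + c * y, z)
        projected.append((rotated[a], rotated[b]))
    return projected
```

With `yaw=0.0` the function reduces to dropping one coordinate, which corresponds to the axis-parallel case of Fig. 4. The same function can be applied to the laser points themselves so that the points and the marking frame are drawn in the same two-dimensional interface.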
Step 3b: displaying the second sample laser point cloud data and the second to-be-adjusted marking frame in the other two-dimensional interfaces corresponding to the other two-dimensional coordinate systems according to the second mapping relation.
Referring to fig. 3, the two interfaces at the lower right corner are a two-dimensional back-view interface and a two-dimensional side-view interface, respectively, in which the second to-be-adjusted marking frame and the second sample laser point cloud data are displayed in two dimensions. Each displayed two-dimensional interface lets the annotator inspect the second to-be-adjusted marking frame clearly, so that its edges can be corrected more accurately.
The lower right text part of fig. 3 also shows a button for the annotator to select the type of the object to be annotated. The types of objects to be labeled corresponding to the button options comprise cars, trucks, buses, bicycles, pedestrians, tricycles and unknowns. According to the selection operation input by the annotator through the click button, the type of the object to be annotated can be determined.
Step 4b: acquiring a third adjustment operation input by the annotator for the second to-be-adjusted marking frame displayed in the other two-dimensional interfaces.
The third adjustment operation is used for adjusting the size and the position of the second to-be-adjusted marking frame. The third adjustment operation may include at least one of a mouse click operation, a mouse drag operation, a keyboard drag operation, and the like.
Step 5b: adjusting the second to-be-adjusted marking frame according to the third adjustment operation to obtain the standard marking frame of the object to be marked in the second sample laser point cloud data.
After the second to-be-adjusted labeling frame is adjusted, the adjusted standard labeling frame can be displayed in the two-dimensional overlooking interface and the three-dimensional interface in real time.
In summary, in this embodiment, once the second to-be-adjusted marking frame is obtained, it is displayed together with the second sample laser point cloud data in the two-dimensional back-view and/or side-view interface, so that the annotator can view and adjust the second to-be-adjusted marking frame from more viewpoints. Adjusting the frame in a two-dimensional interface is easier for the annotator and can improve the accuracy of the standard marking frame. Compared with adjusting the other faces directly in the three-dimensional interface, it can also improve the labeling efficiency.
In another embodiment of the present invention, based on the embodiment shown in fig. 1, when the updated network model determines that the object to be marked does not exist in the second sample laser point cloud data, the method may further add the second sample laser point cloud data into the negative sample training set.
The sample laser point cloud data in the negative sample training set is not displayed to the annotator. For example, when the object to be marked is a vehicle and the second sample laser point cloud data contains no laser data points reflected by a vehicle, the second sample laser point cloud data need not be displayed to the annotator, and the annotator does not need to identify whether the object to be marked exists in it. The sample laser point cloud data in the negative sample training set can be used as negative samples when other network models are trained.
When the object to be marked exists in the second sample laser point cloud data, the second sample laser point cloud data can be used as a positive sample and added into a corresponding training set.
In summary, in this embodiment, when the network model determines that the object to be marked does not exist in the second sample laser point cloud data, that data is not displayed to the annotator for identification but is added directly to the negative sample training set for training other models. When a large amount of sample data exists, this operation screens the sample data so that only sample laser point cloud data in which the object to be marked exists is provided to the annotator for labeling, which can improve the labeling efficiency.
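As a non-limiting illustration, the routing of samples between the annotator and the negative sample training set may be sketched in Python as follows. The function `route_samples` and the `detect` callable, which stands in for the updated network model and is assumed to return a list of reference marking frames that is empty when no object to be marked is found, are hypothetical names introduced for this sketch.

```python
def route_samples(samples, detect):
    """Split samples by whether the model finds an object to be marked.

    `detect(sample)` is a stand-in for the updated network model and is
    assumed to return a (possibly empty) list of reference marking frames.
    """
    to_annotate = []   # shown to the annotator together with reference frames
    negatives = []     # added to the negative sample training set, never shown
    for sample in samples:
        frames = detect(sample)
        if frames:
            to_annotate.append((sample, frames))
        else:
            negatives.append(sample)
    return to_annotate, negatives
```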
In another embodiment of the present invention, based on the embodiment shown in fig. 1, the step S120 of training the target network model according to the model training data may specifically include the following steps 1c to 4c.
Step 1c: inputting sample laser point cloud data in the model training data into the target network model.
The target network model comprises a feature extraction layer and a regression layer. The sample laser point cloud data in the model training data may be the first sample laser point cloud data or the second sample laser point cloud data.
Step 2c: determining a feature vector of the sample laser point cloud data through the first model parameter of the feature extraction layer, and regressing the feature vector through the second model parameter of the regression layer to obtain an initial marking frame.
The initial values of the first model parameter and the second model parameter may be set in advance based on experience, for example to small values. During training, the first model parameter and the second model parameter are corrected in each cycle so that they gradually approach their true values.
Step 3c: determining the difference between the initial marking frame and the standard marking frame corresponding to the sample laser point cloud data in the model training data. The difference may be obtained using a loss function.
Step 4c: when the difference is larger than a preset difference threshold, modifying the first model parameter and the second model parameter according to the difference, and returning to step 1c; when the difference is not larger than the preset difference threshold, determining that the training of the target network model is finished.
When returning to step 1c, the sample laser point cloud data input into the target network model differs from the sample laser point cloud data input in the previous cycle.
When the difference is larger than the preset difference threshold, the prediction of the target network model is considered to deviate substantially from the true value, and the network needs further training. When the first model parameter and the second model parameter are corrected according to the difference, they may be adjusted in the direction opposite to the difference, by an amount based on its specific value.
In summary, the present embodiment provides an implementation method for performing cyclic training on a target network model by using sample laser point cloud data and a standard labeling box.
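As a non-limiting illustration, the cyclic training of steps 1c to 4c may be sketched in Python as follows. The sketch deliberately replaces the target network model with a one-dimensional toy model: the feature extraction layer scales the mean X coordinate of the points by a single first model parameter, the regression layer adds a single second model parameter to predict the center of the marking frame, and the difference is a squared-error loss. These choices, together with the names `train`, `lr`, and `max_steps` and the plain gradient update, are assumptions made for this sketch and not the model of the embodiment.

```python
def train(samples, threshold=1e-4, lr=0.05, max_steps=10_000):
    """Cyclic training on (points_x, standard_center_x) pairs.

    Returns the two learned parameters and the last measured difference.
    """
    first_param, second_param = 0.01, 0.01  # small initial values
    diff = float("inf")
    for step in range(max_steps):
        # Step 1c: input a sample; a different sample is used in each cycle.
        points, standard_center = samples[step % len(samples)]
        mean_x = sum(points) / len(points)
        # Step 2c: feature extraction layer, then regression layer.
        feature = first_param * mean_x
        initial_center = feature + second_param
        # Step 3c: difference between initial and standard frame (squared error).
        error = initial_center - standard_center
        diff = error ** 2
        # Step 4c: stop when the difference is not larger than the threshold.
        if diff <= threshold:
            break
        # Otherwise move both parameters against the gradient of the difference.
        first_param -= lr * 2 * error * mean_x
        second_param -= lr * 2 * error
    return first_param, second_param, diff


# Toy data: the standard center lies 0.5 beyond the mean of the points.
SAMPLES = [([0.5, 1.5], 1.5), ([1.0, 3.0], 2.5), ([2.0, 4.0], 3.5)]
first_param, second_param, final_diff = train(SAMPLES)
```

Note that, as in step 4c, the stopping test is applied to the sample of the current cycle only. A practical implementation might instead evaluate the difference over a batch or a held-out set before declaring the training finished.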
In another embodiment of the present invention, based on the embodiment shown in fig. 1, when the number of the sample laser point cloud data that is not marked in the data set to be marked is not greater than the first number, that is, when the number of the sample laser point cloud data that is not marked in the data set to be marked is not too large, the sample laser point cloud data that is not marked may not be input into the updated network model, but the following steps 1d to 4d are performed.
Step 1d: taking the unmarked sample laser point cloud data in the data set to be marked as third sample laser point cloud data, and directly displaying the third sample laser point cloud data.
Specifically, the third sample laser point cloud data may be displayed in the two-dimensional top-view interface. The third sample laser point cloud data may also be displayed in the three-dimensional interface at the same time.
Step 2d: acquiring a marking operation input by the annotator for the third sample laser point cloud data.
The marking operation may include at least one of a mouse click operation, a mouse drag operation, a keyboard drag operation, and the like.
Step 3d: determining the standard marking frame of the object to be marked in the third sample laser point cloud data according to the marking operation.
Step 4d: adding the third sample laser point cloud data and the corresponding standard marking frame into the training set.
In summary, in this embodiment, when the number of unmarked sample laser point cloud data in the data set to be marked is not too large, there is no need to select second sample laser point cloud data from the unmarked data or to input the unmarked data into the updated network model. Instead, the unmarked sample laser point cloud data is displayed directly and the standard marking frame is determined from the annotator's marking operation.
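As a non-limiting illustration, the choice between the model-assisted path and the direct manual path may be sketched in Python as follows. The function name, the returned mode strings, and the use of seeded random sampling to pick the second number of samples are assumptions made for this sketch; the embodiment does not specify how the second number of samples is chosen.

```python
import random


def choose_labeling_path(unlabeled, first_number, second_number, seed=0):
    """Decide how the unmarked samples are handled in the current round."""
    if len(unlabeled) > first_number:
        # Many samples remain: pick a second number of them as second sample
        # data, to be passed through the updated network model.
        count = min(second_number, len(unlabeled))
        picked = random.Random(seed).sample(list(unlabeled), count)
        return "model_assisted", picked
    # Few samples remain: all of them become third sample data and are
    # displayed directly for manual marking (steps 1d to 4d).
    return "manual", list(unlabeled)
```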
The following describes embodiments of the present invention with reference to specific examples.
Referring to the execution flow diagram shown in fig. 5, in an initial stage, sample laser point cloud data is manually labeled and added into a training set, and a machine learning algorithm learns from the labeled sample laser point cloud data in the training set to obtain a basic neural network model. For subsequent unmarked sample laser point cloud data, before manual marking, the basic neural network model can be used to examine the unmarked data, screen out the data that meets the requirements (for example, data in which an object to be marked exists), and generate a reference marking frame for the screened data. The reference marking frame serves as auxiliary information for the subsequent manual marking, which improves the marking efficiency. The manually marked sample laser point cloud data then enters the process of training the neural network model, and the newly trained neural network model replaces the old one for the next round of screening. This cycle repeats until all the laser point cloud data has been marked.
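As a non-limiting illustration, the cycle of fig. 5 may be sketched in Python as follows. The callables `train`, `predict`, and `annotate` stand in for model training, reference marking frame generation, and the annotator's adjustment, respectively; they, the function name `labeling_loop`, and the choice of taking the first `second_number` samples in each round are assumptions made for this sketch.

```python
def labeling_loop(training_set, unlabeled, train, predict, annotate,
                  first_number, second_number):
    """Alternate between model-assisted labeling and retraining.

    train(model, pairs)    -> updated model (model is None on the first call)
    predict(model, sample) -> reference marking frame for the sample
    annotate(sample, ref)  -> standard marking frame (ref is None when manual)
    """
    if second_number < 1:
        raise ValueError("second_number must be at least 1")
    labeled = list(training_set)
    pending = list(unlabeled)
    model = train(None, labeled)  # basic model from the initial manual labels
    while len(pending) > first_number:
        batch, pending = pending[:second_number], pending[second_number:]
        new_pairs = [(s, annotate(s, predict(model, s))) for s in batch]
        labeled.extend(new_pairs)
        model = train(model, new_pairs)  # updated model replaces the old one
    # Few samples remain: they are marked manually without a reference frame.
    labeled.extend((s, annotate(s, None)) for s in pending)
    return model, labeled
```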
See fig. 6 for a schematic diagram of the screening logic for sample laser point cloud data. Based on deep learning, a neural network model is trained using the marked sample laser point cloud data. A screening processor in the electronic device loads the neural network model and analyzes each unmarked sample laser point cloud data to obtain a screening result indicating whether the data is retained; when the data is retained, a corresponding reference marking frame is also obtained. The electronic device may contain one or more screening processors. During screening, as each input sample laser point cloud data passes through the layers of the neural network model, each layer classifies the data according to the feature values of that layer. After the data has passed through the layers of the neural network model, a screening result is output for the object to be marked in the sample laser point cloud data. The screening result represents the probability that the object to be marked exists in the sample laser point cloud data and includes a bounding box that can enclose the object to be marked. By evaluating this probability, the data with a high probability is selected, and the corresponding bounding box is displayed for the annotator to adjust, which improves the efficiency of the subsequent manual marking.
Fig. 7 is a schematic structural diagram of a training set data labeling apparatus according to an embodiment of the present invention. The embodiment of the device is applied to electronic equipment. This embodiment of the device corresponds to the embodiment shown in fig. 1. The device includes:
a data obtaining module 710 configured to obtain the marked first sample laser point cloud data in the training set and the standard marking frame of the corresponding object to be marked as model training data;
a model training module 720 configured to train the target network model according to the model training data to obtain an updated network model; the updated network model is used for correlating the sample laser point cloud data with the corresponding standard marking frame;
a quantity judging module 730 configured to judge whether the quantity of unmarked sample laser point cloud data in the data set to be marked is greater than a first quantity;
a reference frame determining module 740 configured to, when the quantity of unmarked sample laser point cloud data in the data set to be marked is greater than the first quantity, determine a second quantity of sample laser point cloud data from the unmarked sample laser point cloud data in the data set to be marked as second sample laser point cloud data, and determine, by the updated network model, a reference marking frame of the object to be marked in the second sample laser point cloud data;
a two-dimensional display module 750 configured to display the second sample laser point cloud data and the reference marking frame in a two-dimensional top-view interface according to a first mapping relation between the two-dimensional top-view coordinate system and the three-dimensional coordinate system; the three-dimensional coordinate system is the coordinate system in which the second sample laser point cloud data is located, and the two-dimensional top-view interface corresponds to the two-dimensional top-view coordinate system;
a standard frame determining module 760 configured to acquire a first adjustment operation input by the annotator for the second sample laser point cloud data and the reference marking frame displayed in the two-dimensional top-view interface, and determine the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjustment operation;
a first adding module 770 configured to add the second sample laser point cloud data and the corresponding standard marking frame to the training set;
and a data updating module 780 configured to use the second sample laser point cloud data and the corresponding standard marking frame as model training data, use the updated network model as the target network model, and return to the model training module 720 to perform the operation of training the target network model according to the model training data to obtain the updated network model.
In another embodiment of the present invention, based on the embodiment shown in fig. 7, the apparatus further includes:
a three-dimensional display module (not shown in the figure) configured to display the second sample laser point cloud data in a three-dimensional interface after determining a reference marking frame of an object to be marked in the second sample laser point cloud data; wherein the three-dimensional interface corresponds to a three-dimensional coordinate system;
the standard frame determining module 760, when determining the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjustment operation, is configured for:
determining a first to-be-adjusted marking frame of an object to be marked in the second sample laser point cloud data according to the first adjusting operation;
displaying a first to-be-adjusted marking frame in the three-dimensional interface;
and acquiring second adjustment operation input by a marker aiming at the first to-be-adjusted marking frame displayed in the three-dimensional interface, and adjusting the first to-be-adjusted marking frame according to the second adjustment operation to obtain a standard marking frame of the to-be-marked object in the second sample laser point cloud data.
In another embodiment of the present invention, based on the embodiment shown in fig. 7, the standard frame determining module 760, when determining the standard marking frame of the object to be marked in the second sample laser point cloud data according to the first adjustment operation, is configured for:
determining a second to-be-adjusted marking frame of the to-be-marked object in the second sample laser point cloud data according to the first adjusting operation;
determining a second mapping relation between the other two-dimensional coordinate system and the three-dimensional coordinate system where the other surface of the second to-be-adjusted labeling frame is located; wherein, other faces include the back and/or side, and other two-dimensional coordinate systems include: a two-dimensional back-view coordinate system and/or a two-dimensional side-view coordinate system;
displaying the second sample laser point cloud data and the second to-be-adjusted marking frame in other two-dimensional interfaces corresponding to other two-dimensional coordinate systems according to the second mapping relation;
acquiring a third adjustment operation input by the annotator for the second to-be-adjusted marking frame displayed in the other two-dimensional interfaces;
and adjusting the second marking frame to be adjusted according to the third adjustment operation to obtain a standard marking frame of the object to be marked in the second sample laser point cloud data.
In another embodiment of the present invention, based on the embodiment shown in fig. 7, the apparatus further includes:
a second adding module (not shown in the figure) configured to add the second sample laser point cloud data into the negative sample training set when the updated network model determines that the object to be marked does not exist in the second sample laser point cloud data; and refusing to display the sample laser point cloud data in the negative sample training set for a marker.
In another embodiment of the present invention, based on the embodiment shown in fig. 7, the model training module 720 is specifically configured to:
inputting sample laser point cloud data in model training data into a target network model; the target network model comprises a feature extraction layer and a regression layer;
determining a characteristic vector in sample laser point cloud data through a first model parameter of a characteristic extraction layer; performing regression on the feature vector through a second model parameter of the regression layer to obtain an initial labeling frame;
determining the difference between the initial marking frame and a standard marking frame corresponding to sample laser point cloud data in model training data;
when the difference is larger than a preset difference threshold value, modifying the first model parameter and the second model parameter according to the difference, and returning to execute the operation of inputting the sample laser point cloud data into the target network model;
and when the difference is not larger than a preset difference threshold value, determining that the training of the target network model is finished.
In another embodiment of the present invention, based on the embodiment shown in fig. 7:
the two-dimensional display module 750 is further configured to, when the quantity of unmarked sample laser point cloud data in the data set to be marked is not greater than the first quantity, take the unmarked sample laser point cloud data in the data set to be marked as third sample laser point cloud data and directly display the third sample laser point cloud data;
the standard frame determining module 760 is further configured to acquire a marking operation input by the annotator for the third sample laser point cloud data, and determine the standard marking frame of the object to be marked in the third sample laser point cloud data according to the marking operation;
the first adding module 770 is further configured to add the third sample laser point cloud data and the corresponding standard marking frame to the training set.
The above device embodiment corresponds to the method embodiment and has the same technical effects. Because the device embodiment is derived from the method embodiment, reference may be made to the method embodiment section for the specific description, which is not repeated here.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
Those of ordinary skill in the art will understand that: modules in the devices in the embodiments may be distributed in the devices in the embodiments according to the description of the embodiments, or may be located in one or more devices different from the embodiments with corresponding changes. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.