CN118521772A - Three-dimensional detection frame processing method, system, equipment and medium for target detection

Three-dimensional detection frame processing method, system, equipment and medium for target detection

Info

Publication number
CN118521772A
Authority
CN
China
Prior art keywords
dimensional detection
detection frame
coordinate
point
intersection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410700911.9A
Other languages
Chinese (zh)
Inventor
杨继林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202410700911.9A
Publication of CN118521772A
Legal status: Pending (current)

Abstract

The application provides a three-dimensional detection frame processing method, system, device, and medium for target detection. The method comprises: obtaining a plurality of initial three-dimensional detection frames generated by a target detection model, together with the confidence corresponding to each initial three-dimensional detection frame; sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence; starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on a target area and from their heights; and suppressing every second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the sequence have been traversed, so as to obtain the target three-dimensional detection frame. By enabling fast calculation of the intersection ratio, the screening efficiency of the target three-dimensional detection frame is improved.

Description

Three-dimensional detection frame processing method, system, equipment and medium for target detection
Technical Field
The present application relates to the field of target detection, and in particular, to a method, a system, an apparatus, and a medium for processing a three-dimensional detection frame for target detection.
Background
Target detection technology has developed rapidly, and algorithm performance and accuracy have improved markedly under the impetus of deep learning. Target detection networks are continuously optimized and improved to identify and locate targets in complex scenes, and by adopting more efficient computing architectures and parallel processing techniques, the detection speed of target detection on large-scale data sets has been greatly increased, making practical deployment and application possible.
In the process of detecting a target, a plurality of detection frames are generated for the same target, and the redundant detection frames must be removed to obtain the target detection frame; meanwhile, with the development of the technology, detection frames have also evolved from two-dimensional to three-dimensional, which increases the difficulty of screening. A method for rapidly screening out repeated or redundant three-dimensional detection frames is therefore needed to solve the above problems.
Disclosure of Invention
Based on the foregoing, it is necessary to provide a method, a system, a device, and a medium for processing a three-dimensional detection frame for target detection, so as to quickly reject repeated or redundant detection frames among a plurality of initial three-dimensional detection frames.
In a first aspect, the present application provides a method for processing a three-dimensional detection frame for target detection, including:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and the confidence corresponding to each initial three-dimensional detection frame;
sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence;
starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on a target area and from their heights;
and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame.
In some embodiments, calculating the intersection ratio from the bottom surface areas and heights that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on the target area includes:
Determining a target area according to the bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames;
judging whether an overlapping area exists between the first three-dimensional detection frame and the second three-dimensional detection frame;
Determining, in response to an overlapping region existing between the first three-dimensional detection frame and the second three-dimensional detection frame, a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule, wherein the first bottom surface area and the second bottom surface area are each a set of line segments;
Determining an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame;
And calculating and determining the intersection ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame and the volume of the intersection area.
In some embodiments, the determining the target area according to the bottom vertex coordinates of the plurality of initial three-dimensional detection frames includes:
Constructing a rectangular coordinate system, and acquiring four bottom surface vertex coordinates corresponding to each three-dimensional detection frame;
determining the area of the target area from the straight lines located at the minimum X coordinate, the maximum X coordinate, the minimum Y coordinate, and the maximum Y coordinate among the acquired bottom surface vertex coordinates, which together frame the target area;
and dividing off an area matching that area, with the origin as the starting point, to serve as the target area.
In some embodiments, determining, according to a preset bottom surface area determination rule, the first bottom surface area occupied by the first three-dimensional detection frame in the target area and the second bottom surface area occupied by the second three-dimensional detection frame in the target area includes:
defining the height of the target area on a Y axis as H, wherein the value corresponding to H is the difference value between the maximum value of the Y coordinate and the minimum value of the Y coordinate;
calculating the intersection points of the first three-dimensional detection frame with the straight line Y = i and of the second three-dimensional detection frame with the straight line Y = i, and determining a first intersecting line segment of the first three-dimensional detection frame with the straight line Y = i and a second intersecting line segment of the second three-dimensional detection frame with the straight line Y = i, where 0 ≤ i ≤ H−1;
Determining a first bottom surface area occupied by the first three-dimensional detection frame on the target area according to a set of a plurality of first intersecting line segments;
and determining a second bottom surface area occupied by the second three-dimensional detection frame on the target area according to the set of the plurality of second intersecting line segments.
In some embodiments, the intersection points include a start intersection point, and calculating the intersection point of the first three-dimensional detection frame or the second three-dimensional detection frame with the straight line Y = i includes:
defining, according to the relative positions of the four bottom surface vertices corresponding to the three-dimensional detection frame, the four bottom surface vertices as a lowest point, a highest point, a leftmost point, and a rightmost point, respectively;
calculating a first slope corresponding to a first connecting line between the lowest point and the leftmost point, a second slope corresponding to a second connecting line between the lowest point and the rightmost point, a third slope corresponding to a third connecting line between the leftmost point and the highest point and a fourth slope corresponding to a fourth connecting line between the rightmost point and the highest point;
if i is less than or equal to the Y coordinate of the lowest point, determining that the start intersection point does not exist;
otherwise, if i is less than or equal to the Y coordinate of the leftmost point, determining that the start intersection point is on the first connecting line, where the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i − the Y coordinate of the lowest point);
otherwise, if i is less than or equal to the Y coordinate of the highest point, determining that the start intersection point is on the third connecting line, where the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i − the Y coordinate of the leftmost point);
otherwise, if i is less than or equal to the Y coordinate of the rightmost point, determining that the start intersection point is on the fourth connecting line, where the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i − the Y coordinate of the highest point);
and if i is greater than the Y coordinate of the highest point, determining that the start intersection point does not exist.
In some embodiments, the intersection points further include an ending intersection point, and calculating the intersection point of the first three-dimensional detection frame or the second three-dimensional detection frame with the straight line Y = i further includes:
If i is less than or equal to the Y coordinate of the lowest point, determining that the ending intersection point does not exist;
otherwise, if i is less than or equal to the Y coordinate of the rightmost point, determining that the ending intersection point is on the second connecting line, where the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i − the Y coordinate of the lowest point);
otherwise, if i is less than or equal to the Y coordinate of the highest point, determining that the ending intersection point is on the fourth connecting line, where the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i − the Y coordinate of the rightmost point);
otherwise, if i is less than or equal to the Y coordinate of the leftmost point, determining that the ending intersection point is on the third connecting line, where the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i − the Y coordinate of the highest point);
and if i is greater than the Y coordinate of the highest point, determining that the ending intersection point does not exist.
In some embodiments, the intersection region includes a first intersection region on the target area and a second intersection region on the Z axis, and determining the intersection region based on the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame, and the height of the second three-dimensional detection frame includes:
Determining a starting intersection point of a straight line y=i and an intersecting line segment forming the first intersecting region on a target region according to the maximum value of the X coordinate of the starting intersection point corresponding to the first bottom surface region and the X coordinate of the starting intersection point corresponding to the second bottom surface region;
determining an ending intersection point of a straight line y=i and an intersecting line segment forming the first intersecting region on the target region according to a minimum value of an X coordinate of the ending intersection point corresponding to the first bottom surface region and an X coordinate of the ending intersection point corresponding to the second bottom surface region;
And determining a start intersection point of the intersecting line segment constituting the second intersection region according to the larger of the minimum Z coordinate of the first three-dimensional detection frame and the minimum Z coordinate of the second three-dimensional detection frame, and determining an ending intersection point of the intersecting line segment constituting the second intersection region according to the smaller of the maximum Z coordinate of the first three-dimensional detection frame and the maximum Z coordinate of the second three-dimensional detection frame.
In a second aspect, the present application provides a three-dimensional inspection box processing system for object inspection, the system comprising:
the data preparation module is used for acquiring a plurality of initial three-dimensional detection frames generated by the target detection model and the confidence corresponding to each initial three-dimensional detection frame;
the sorting module is used for sorting the initial three-dimensional detection frames in a descending order according to the confidence level so as to generate a detection frame sequence;
The intersection ratio calculation module is used for traversing, starting from the first three-dimensional detection frame with the highest confidence, the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on the target area and from their heights;
The data processing module is used for suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames have been traversed, so as to obtain the target three-dimensional detection frame.
In a third aspect, the present application provides an electronic device, including:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the operations of:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and the confidence corresponding to each initial three-dimensional detection frame;
sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence;
starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on a target area and from their heights;
and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame.
In a fourth aspect, the present application also provides a computer-readable storage medium having stored thereon a computer program that causes a computer to perform the operations of:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and the confidence corresponding to each initial three-dimensional detection frame;
sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence;
starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on a target area and from their heights;
and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame.
The beneficial effects achieved by the application are as follows:
The application provides a three-dimensional detection frame processing method for target detection, which comprises: obtaining a plurality of initial three-dimensional detection frames generated by a target detection model, together with the confidence corresponding to each initial three-dimensional detection frame; sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence; starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on a target area and from their heights; and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame. The application thereby screens the initial three-dimensional detection frames generated by the target detection model, and by enabling fast calculation of the intersection ratio, greatly improves the efficiency of screening and sorting the target three-dimensional detection frames.
Drawings
For a clearer description of the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the description below are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art, wherein:
FIG. 1 is a flow chart of a three-dimensional detection frame processing method for target detection according to an embodiment of the present application;
FIG. 2 is a schematic view of a target area according to an embodiment of the present application;
FIG. 3 is a schematic diagram showing the relative positions of the vertices of a bottom surface according to an embodiment of the present application;
FIG. 4 is a schematic illustration of an intersection point provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a three-dimensional detection frame processing architecture for object detection according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an intersection ratio calculation module according to an embodiment of the present application;
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be understood that throughout this specification and the claims, unless the context clearly requires otherwise, the words "comprise", "comprising", and the like, are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, it is the meaning of "including but not limited to".
It should also be appreciated that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
It should be noted that labels such as "S1" and "S2" are used only to identify steps and are not to be construed as specifying a particular order or sequence, nor as limiting the present application; they merely facilitate the description of the method. In addition, the technical solutions of the embodiments may be combined with each other, but only on the basis that they can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, that combination should be considered not to exist and not within the scope of protection claimed in the present application.
The three-dimensional detection frame processing method for target detection disclosed by the application can be applied in many fields. In the intelligent traffic field, it can be used in autonomous vehicles and traffic monitoring systems to detect and track targets such as vehicles and pedestrians in real time, thereby improving traffic safety and efficiency. In the security field, target detection algorithms are used for face recognition, behavior analysis, and the like, enabling automatic recognition of and early warning about targets in surveillance video. In addition, target detection plays an important role in medical image analysis, intelligent retail, industrial quality inspection, and other fields.
At present, there are two common ways to compute the intersection ratio. One is to call a computing library such as OpenCV; this approach is full-featured but comparatively slow. The other is to write CUDA code directly; this approach generally computes parameters such as the intersection points of candidate frame edges, is relatively complex, and mostly does not support three-dimensionally rotated detection frames. In addition, ASIC chips, which are designed to optimize specific application scenarios in neural network inference, offer higher efficiency and lower power consumption than GPUs and are increasingly used for neural network inference tasks; however, their processing capability is weaker and they are less efficient at complex computation. The method provided by the application greatly accelerates the screening of redundant, repeated detection frames in the target detection process through fast and simple calculation of the intersection ratio, and greatly improves the efficiency and generality of target detection.
Example 1
The embodiment of the application provides a three-dimensional detection frame processing method for target detection, in particular a method for processing the three-dimensional detection frames generated in the target detection process to obtain the target detection frame, which comprises the following steps:
s100, acquiring a plurality of initial three-dimensional detection frames generated by the target detection model and confidence degrees corresponding to the initial three-dimensional detection frames.
In the embodiment of the application, the target detection model includes, but is not limited to, candidate-frame-based models such as the R-CNN series and end-to-end models such as the YOLO series; the method obtains a plurality of initial three-dimensional detection frames and their corresponding confidences from the target detection model. Generating the plurality of initial three-dimensional detection frames and the corresponding confidences with a target detection model is a conventional technique in the art and is not repeated here.
S200, sorting the initial three-dimensional detection frames in a descending order according to the confidence level so as to generate a detection frame sequence.
In target detection, the confidence (confidence score) is an important metric used to evaluate how certain the model is about a predicted detection frame. Specifically, the confidence generally represents the degree of certainty the model has in its predicted target class and detection frame position. The output of a target detection model usually contains a plurality of initial detection frames and corresponding class predictions. The initial three-dimensional detection frames are sorted by confidence: the initial three-dimensional detection frame with the highest confidence serves as the first three-dimensional detection frame, the remaining initial three-dimensional detection frames serve as second three-dimensional detection frames, the frames are arranged in descending order of confidence, and the arranged sequence is defined as the detection frame sequence.
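To make the data involved concrete, the following Python sketch models a candidate frame and the descending-confidence sort of S100-S200. The Box3D type and its field names are illustrative assumptions, not taken from the patent, which describes a frame only by its eight vertices.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Box3D:
    bottom: List[Tuple[float, float]]  # four (x, y) bottom-surface vertices
    z_min: float                       # lowest Z coordinate of the frame
    z_max: float                       # highest Z coordinate of the frame
    confidence: float                  # confidence score from the detector

def sort_by_confidence(boxes: List[Box3D]) -> List[Box3D]:
    # Descending order: element 0 becomes the first (highest-confidence) frame.
    return sorted(boxes, key=lambda b: b.confidence, reverse=True)
```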
S300, starting from the first three-dimensional detection frame with the highest confidence, traverse the remaining second three-dimensional detection frames in the detection frame sequence and calculate the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on the target area and from their heights.
In some implementations, a method of calculating an intersection ratio from a bottom surface region and a height of a first three-dimensional detection frame and a second three-dimensional detection frame, respectively, on a target region includes:
S310, determining a target area according to the bottom surface vertex coordinates of the initial three-dimensional detection frames.
Specifically, as shown in fig. 2, determining the target area according to the bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames includes: constructing a rectangular coordinate system, i.e., a three-dimensional coordinate system comprising an X axis, a Y axis, and a Z axis; acquiring the four bottom surface vertex coordinates corresponding to each three-dimensional detection frame; and determining the area of the target area from the straight lines located at the minimum X coordinate, maximum X coordinate, minimum Y coordinate, and maximum Y coordinate among the bottom surface vertex coordinates, which together frame the target area. Specifically, denote the eight vertices of an initial three-dimensional detection frame as (x_m, y_m, z_m), where 0 ≤ m ≤ 7, and the four bottom surface vertices as (x_n, y_n, z_n), where 0 ≤ n ≤ 3. Compute the maximum X coordinate (denoted all_max_x), the minimum X coordinate (all_min_x), the maximum Y coordinate (all_max_y), and the minimum Y coordinate (all_min_y) over the four bottom surface vertices of all initial three-dimensional detection frames. Assuming there are k initial three-dimensional detection frames, all_max_x = max(x(i, m)), all_min_x = min(x(i, m)), all_max_y = max(y(i, m)), and all_min_y = min(y(i, m)), where 0 ≤ i ≤ k−1 and 0 ≤ m ≤ 3. A target area that can accommodate all the initial three-dimensional detection frames is thus obtained, providing a common reference plane for each three-dimensional detection frame so that its bottom surface area within the target area can subsequently be calculated, which in turn improves the accuracy of the subsequent intersection ratio calculation.
According to the four extreme values determined above (the minimum and maximum X coordinates and the minimum and maximum Y coordinates), a target area is marked out on the X-Y plane whose area is S = (all_max_x − all_min_x) × (all_max_y − all_min_y); the target area starts at the origin (0, 0). It will be appreciated that the target area can accommodate the bottom surfaces of all the initial three-dimensional detection frames. To simplify processing and improve the accuracy of the subsequent intersection ratio calculation, the application translates the first and second three-dimensional detection frames whose intersection ratio is to be calculated on the X-Y plane; this translation does not affect the Z coordinates of the frames. Preferably, each vertex of each three-dimensional detection frame is translated along the X axis by the minimum X coordinate and along the Y axis by the minimum Y coordinate, so that a vertex (x_m, y_m, z_m) becomes (x_m − all_min_x, y_m − all_min_y, z_m) after translation.
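A minimal sketch of this target-area construction and translation, continuing with the illustrative Box3D type from the previous sketch; the helper name is an assumption:

```python
def normalize_to_target_area(boxes: List[Box3D]) -> float:
    # all_min/max over every bottom-surface vertex of every frame (S310).
    xs = [x for b in boxes for (x, _) in b.bottom]
    ys = [y for b in boxes for (_, y) in b.bottom]
    all_min_x, all_min_y = min(xs), min(ys)
    all_max_y = max(ys)
    # Translate every frame so the target area starts at the origin (0, 0);
    # Z coordinates are deliberately left untouched.
    for b in boxes:
        b.bottom = [(x - all_min_x, y - all_min_y) for (x, y) in b.bottom]
    # Height H of the target area on the Y axis.
    return all_max_y - all_min_y
```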
S320, judging whether an overlapping area exists between the first three-dimensional detection frame and the second three-dimensional detection frame.
The embodiment of the application provides a method for judging whether an overlapping region exists between the first three-dimensional detection frame and the second three-dimensional detection frame from the four bottom surface vertex coordinates of the two frames and their coordinates on the Z axis. If the maximum X coordinate among the four bottom surface vertices of the first three-dimensional detection frame is greater than or equal to the minimum X coordinate of the second three-dimensional detection frame, and the minimum X coordinate among the four bottom surface vertices of the first three-dimensional detection frame is less than or equal to the maximum X coordinate among the four bottom surface vertices of the second three-dimensional detection frame, it is judged that the two frames intersect on the X axis. If the corresponding condition holds for the Y coordinates of the bottom surface vertices, it is judged that the two frames intersect on the Y axis; and if it holds for the frames' Z coordinates, it is judged that the two frames intersect on the Z axis. If the first and second three-dimensional detection frames intersect on the X axis, the Y axis, and the Z axis, it is judged that an overlapping region exists between them; if they fail to intersect in any one of the X, Y, or Z directions, it is judged that no overlapping region exists between them.
In response to the absence of an overlapping region between the first and second three-dimensional detection frames, the second three-dimensional detection frame is retained in the detection frame sequence; in that case the two frames are independent of each other, and the second frame is neither redundant nor a duplicate. In response to the existence of an overlapping region, the intersection ratio between the two frames is calculated, and the degree of overlap is determined from it in order to decide whether to keep or reject the second three-dimensional detection frame. By directly comparing the coordinate extremes of the three-dimensional detection frames in each direction, it can be determined whether the frames intersect in each direction and hence whether an intersection region exists; if none exists, the subsequent intersection ratio calculation is skipped and the non-overlapping three-dimensional detection frame is retained directly. This preprocessing greatly improves the efficiency of determining the target detection frames.
For example, the region occupied by the first three-dimensional detection frame on the X axis is defined as [A_min_x, A_max_x], where A_min_x and A_max_x are respectively the minimum and maximum X coordinates among its four bottom surface vertices; the region occupied on the Y axis may be represented as [A_min_y, A_max_y], defined analogously for the Y coordinates; and the region occupied on the Z axis may be represented as [A_min_z, A_max_z], where A_min_z and A_max_z are the Z coordinates of its bottom surface vertices and top surface vertices, respectively. Likewise, the regions occupied by the second three-dimensional detection frame on the X, Y, and Z axes are denoted [B_min_x, B_max_x], [B_min_y, B_max_y], and [B_min_z, B_max_z]. For the first and second three-dimensional detection frames, the common region on the X axis is [min_x, max_x], where min_x = max(A_min_x, B_min_x) and max_x = min(A_max_x, B_max_x). Whether an overlapping region exists between the two frames on the X axis can be represented by x_overlap = (max_x >= min_x): x_overlap equal to 1 (max_x >= min_x holds) indicates that the two frames overlap on the X axis, and x_overlap equal to 0 (max_x < min_x) indicates that they do not. Likewise, y_overlap and z_overlap represent whether the two frames overlap on the Y axis and the Z axis, respectively. Only when the two three-dimensional detection frames have overlapping regions on all of the X, Y, and Z axes can it be judged that an overlapping region exists between them; if there is no overlap on any one of the X, Y, or Z axes, it is judged that no overlapping region exists.
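The axis-projection test above can be sketched as follows, reusing the illustrative Box3D; the helper name is an assumption:

```python
def boxes_may_overlap(a: Box3D, b: Box3D) -> bool:
    # Interval test on each axis: the [min, max] projections must intersect
    # on X, Y and Z simultaneously for an overlapping region to exist (S320).
    ax = [x for (x, _) in a.bottom]; ay = [y for (_, y) in a.bottom]
    bx = [x for (x, _) in b.bottom]; by = [y for (_, y) in b.bottom]
    x_overlap = max(min(ax), min(bx)) <= min(max(ax), max(bx))
    y_overlap = max(min(ay), min(by)) <= min(max(ay), max(by))
    z_overlap = max(a.z_min, b.z_min) <= min(a.z_max, b.z_max)
    return x_overlap and y_overlap and z_overlap
```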
S330, in response to the existence of an overlapping region between the first three-dimensional detection frame and the second three-dimensional detection frame, determine a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule, wherein the first bottom surface area and the second bottom surface area are each a set of line segments.
Determining, according to a preset bottom surface area determination rule, the first bottom surface area occupied by the first three-dimensional detection frame in the target area and the second bottom surface area occupied by the second three-dimensional detection frame in the target area includes: defining the height of the target area on the Y axis as H, where H equals the difference between the maximum and minimum Y coordinates; calculating the intersection points of the first three-dimensional detection frame with the straight line Y = i and of the second three-dimensional detection frame with the straight line Y = i, and determining a first intersecting line segment of the first frame with Y = i and a second intersecting line segment of the second frame with Y = i, where 0 ≤ i ≤ H−1; determining the first bottom surface area occupied by the first three-dimensional detection frame on the target area as the set of first intersecting line segments; and determining the second bottom surface area occupied by the second three-dimensional detection frame on the target area as the set of second intersecting line segments. Taking one three-dimensional detection frame as an example, the region it occupies at height i in the target area is represented as [x_start(i), x_end(i)], i.e., the intersecting line segment between the coordinate points (x_start(i), i) and (x_end(i), i), where x_start(i) is the X coordinate of the intersection of the leftmost side of the frame's bottom surface with the straight line Y = i and x_end(i) is the X coordinate of the intersection of the rightmost side with Y = i. The bottom surface area occupied by the three-dimensional detection frame on the target area may then be represented as [(x_start(0), x_end(0)), ..., (x_start(i), x_end(i)), ..., (x_start(H−1), x_end(H−1))], i.e., a set of intersecting line segments. This provides a fast method of calculating the intersection ratio that supports three-dimensional detection frames rotated about the Z axis; it avoids complex computations such as finding the intersection points of detection frame edges, its implementation details are clear, and it greatly broadens the applicable scenarios of target detection: even an ASIC chip with limited computing capability can rapidly screen the detection frames generated by target detection.
Specifically, the intersection points include a start intersection point and an ending intersection point. Calculating the intersection of the first or second three-dimensional detection frame with the straight line Y = i includes determining the X coordinate x_start(i) of the start intersection point:
As shown in fig. 3, according to the relative positions of the four bottom surface vertices corresponding to the three-dimensional detection frame, the four bottom surface vertices are respectively defined as the lowest point (bot), the highest point (top), the leftmost point (left), and the rightmost point (right). Specifically, first, the vertex with the smallest Y coordinate among the four bottom surface vertices is selected as the lowest point bot; if several vertices share the smallest Y coordinate, the one with the largest X coordinate is selected as bot, whose coordinates are (x(bot), y(bot)). Then the rotation angles of the other three vertices relative to bot are calculated: the vertex with the smallest rotation angle is selected as the rightmost point right, with coordinates (x(right), y(right)); the vertex with the largest rotation angle is selected as the leftmost point left, with coordinates (x(left), y(left)); and the remaining vertex is the highest point top, with coordinates (x(top), y(top)).
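A sketch of this vertex classification, using rotation angles computed with atan2 as one plausible reading of the description; it assumes the four bottom vertices are distinct, and ties beyond the stated Y-coordinate rule are not handled:

```python
import math

def classify_bottom_vertices(bottom):
    # bot: smallest Y, ties broken by largest X; right/left: smallest/largest
    # rotation angle about bot; top: the remaining vertex (cf. FIG. 3).
    bot = min(bottom, key=lambda p: (p[1], -p[0]))
    rest = [p for p in bottom if p != bot]
    rest.sort(key=lambda p: math.atan2(p[1] - bot[1], p[0] - bot[0]))
    right, top, left = rest  # ascending angle: right, then top, then left
    return bot, right, top, left
```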
And calculating a first slope corresponding to a first connecting line between the lowest point and the leftmost point, a second slope corresponding to a second connecting line between the lowest point and the rightmost point, a third slope corresponding to a third connecting line between the leftmost point and the highest point and a fourth slope corresponding to a fourth connecting line between the rightmost point and the highest point.
x_start(i) is calculated as shown in FIG. 4: if i is less than or equal to the Y coordinate of the lowest point, it is determined that the start intersection point does not exist; otherwise, if i is less than or equal to the Y coordinate of the leftmost point, the start intersection point is on the first connecting line, and the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i − the Y coordinate of the lowest point); otherwise, if i is less than or equal to the Y coordinate of the highest point, the start intersection point is on the third connecting line, and the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i − the Y coordinate of the leftmost point); otherwise, if i is less than or equal to the Y coordinate of the rightmost point, the start intersection point is on the fourth connecting line, and the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i − the Y coordinate of the highest point); and if i is greater than the Y coordinate of the highest point, it is determined that the start intersection point does not exist.
Calculating the intersection of the first or second three-dimensional detection frame with the straight line Y = i likewise includes determining the X coordinate x_end(i) of the ending intersection point, as shown in FIG. 4: if i is less than or equal to the Y coordinate of the lowest point, it is determined that the ending intersection point does not exist; otherwise, if i is less than or equal to the Y coordinate of the rightmost point, the ending intersection point is on the second connecting line, and the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i − the Y coordinate of the lowest point); otherwise, if i is less than or equal to the Y coordinate of the highest point, the ending intersection point is on the fourth connecting line, and the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i − the Y coordinate of the rightmost point); otherwise, if i is less than or equal to the Y coordinate of the leftmost point, the ending intersection point is on the third connecting line, and the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i − the Y coordinate of the highest point); and if i is greater than the Y coordinate of the highest point, it is determined that the ending intersection point does not exist. The application thus provides a method for determining the bottom surface area of a three-dimensional detection frame: whether the frame has a corresponding intersecting line segment at each height of the target area is determined, decomposing the calculation of a surface into line segments.
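The start/end intersection rules can be sketched as below. As an assumption of this sketch, only the branches reachable for an ordinary rotated rectangle (where the highest point has the largest Y coordinate) are implemented; the description's remaining fallback branches are subsumed by them under that assumption, and the "slope" is read as change in X per unit change in Y:

```python
def inv_slope(p, q):
    # "Slope" in the sense used here: change in X per unit change in Y
    # along the edge p -> q (the guards below keep the edge non-horizontal).
    return (q[0] - p[0]) / (q[1] - p[1])

def x_start(i, bot, right, top, left):
    # Leftmost intersection of the bottom surface with the line Y = i.
    if i <= bot[1] or i > top[1]:
        return None                                          # no start intersection
    if i <= left[1]:
        return bot[0] + inv_slope(bot, left) * (i - bot[1])  # first connecting line
    return left[0] + inv_slope(left, top) * (i - left[1])    # third connecting line

def x_end(i, bot, right, top, left):
    # Rightmost intersection of the bottom surface with the line Y = i.
    if i <= bot[1] or i > top[1]:
        return None                                           # no ending intersection
    if i <= right[1]:
        return bot[0] + inv_slope(bot, right) * (i - bot[1])  # second connecting line
    return right[0] + inv_slope(right, top) * (i - right[1])  # fourth connecting line
```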
S340, determining an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame.
The intersecting region includes a first intersecting region on the target region and a second intersecting region on the Z-axis, and determining the intersecting region based on the first bottom surface region, the second bottom surface region, and the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame includes:
Determining the start intersection point of the straight line Y = i with the intersecting line segment constituting the first intersection region on the target area according to the maximum of the X coordinate of the start intersection point corresponding to the first bottom surface area and the X coordinate of the start intersection point corresponding to the second bottom surface area; and determining the ending intersection point of the straight line Y = i with the intersecting line segment constituting the first intersection region according to the minimum of the X coordinates of the corresponding ending intersection points. For convenience of explanation, for Y = i (0 ≤ i ≤ H−1), suppose the first bottom surface area occupied by the first three-dimensional detection frame on the target area is denoted (a_x_start(i), a_x_end(i)) and the second bottom surface area occupied by the second three-dimensional detection frame is denoted (b_x_start(i), b_x_end(i)). Then the start intersection point of the intersection region between the two frames is overlap_start(i) = max(a_x_start(i), b_x_start(i)) and the ending intersection point is overlap_end(i) = min(a_x_end(i), b_x_end(i)), so that for Y = i (0 ≤ i ≤ H−1) the first intersection region occupied by the two frames on the target area may be expressed as (overlap_start(i), overlap_end(i)).
And the start intersection point of the intersecting line segment constituting the second intersection region is determined according to the larger of the minimum Z coordinates of the first and second three-dimensional detection frames, while the ending intersection point is determined according to the smaller of their maximum Z coordinates. For convenience of explanation, supposing the region occupied by the first three-dimensional detection frame on the Z axis is denoted [A_min_z, A_max_z] and that of the second frame [B_min_z, B_max_z], the start intersection point of the region occupied by the intersection region on the Z axis is min_z = max(A_min_z, B_min_z), the ending intersection point is max_z = min(A_max_z, B_max_z), and the second intersection region on the Z axis is denoted (min_z, max_z).
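Per scanline, the first intersection region thus reduces to a max/min over segment endpoints; a sketch continuing the helpers above (the function name is an assumption):

```python
def overlap_segment(i, a_vertices, b_vertices):
    # Length of the first-intersection-region segment on the line Y = i:
    # start = max of the two x_start values, end = min of the two x_end values.
    a_s, a_e = x_start(i, *a_vertices), x_end(i, *a_vertices)
    b_s, b_e = x_start(i, *b_vertices), x_end(i, *b_vertices)
    if None in (a_s, a_e, b_s, b_e):
        return 0.0
    return max(0.0, min(a_e, b_e) - max(a_s, b_s))
```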
S350, calculating and determining the intersection ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame and the volume of the intersection area.
The method for calculating the volumes of the first and second three-dimensional detection frames and the volume of the intersection region includes: calculating the area of the first bottom surface area occupied by the first three-dimensional detection frame on the target area, the area of the second bottom surface area occupied by the second three-dimensional detection frame, and the area of the first intersection region occupied by the intersection region on the target area. Specifically, for Y = i (0 ≤ i ≤ H−1), the region occupied by a frame's bottom surface on the target area may be represented by (x_start(i), x_end(i)), so the bottom surface area contribution at Y = i is bot(i) = x_end(i) − x_start(i), and the total bottom surface area is obtained by accumulation: bot = Σ bot(i) = Σ (x_end(i) − x_start(i)). In this way the area of the first bottom surface area corresponding to the first three-dimensional detection frame, the area of the second bottom surface area corresponding to the second three-dimensional detection frame, and the area of the first intersection region occupied by the intersection region on the target area can all be calculated.
The volume of the first three-dimensional detection frame is calculated from its height and the area of the first bottom surface area; the volume of the second three-dimensional detection frame from its height and the area of the second bottom surface area; and the volume of the intersection region from its height and the area of the first intersection region. For illustration, suppose the height of the first three-dimensional detection frame is denoted H(A), where H(A) = A_max_z − A_min_z; its volume is denoted V(A), where V(A) = bot_A × H(A). Similarly, the volume of the second three-dimensional detection frame is V(B) = bot_B × H(B), where H(B) = B_max_z − B_min_z. The height of the intersection region is H(overlap) = max_z − min_z, and its volume is V(overlap) = bot_overlap × H(overlap).
The intersection ratio between the first and second three-dimensional detection frames = the volume of the intersection region / (the volume of the first frame + the volume of the second frame − the volume of the intersection region). In the notation above, the intersection ratio (IoU, Intersection over Union) of the first and second three-dimensional detection frames may be expressed as IoU(A, B) = V(overlap) / (V(A) + V(B) − V(overlap)).
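Putting the pieces together, a sketch of the volume and IoU computation; as an assumption of this sketch, the unit-step scanline sum matches the description's accumulation when the translated coordinates lie on the integer grid of the target area:

```python
def iou_3d(a: Box3D, b: Box3D, H: float) -> float:
    a_v = classify_bottom_vertices(a.bottom)
    b_v = classify_bottom_vertices(b.bottom)

    def bottom_area(v):
        # bot = sum over i of (x_end(i) - x_start(i)), the scanline lengths.
        total = 0.0
        for i in range(int(H)):
            s, e = x_start(i, *v), x_end(i, *v)
            if s is not None and e is not None:
                total += e - s
        return total

    bot_a, bot_b = bottom_area(a_v), bottom_area(b_v)
    bot_overlap = sum(overlap_segment(i, a_v, b_v) for i in range(int(H)))
    # Volume = bottom area x height; the overlap height comes from the
    # common Z interval of the two frames.
    v_a = bot_a * (a.z_max - a.z_min)
    v_b = bot_b * (b.z_max - b.z_min)
    h_overlap = max(0.0, min(a.z_max, b.z_max) - max(a.z_min, b.z_min))
    v_overlap = bot_overlap * h_overlap
    union = v_a + v_b - v_overlap
    return v_overlap / union if union > 0 else 0.0
```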
S400, suppress each second three-dimensional detection frame whose intersection ratio is greater than or equal to the preset intersection ratio threshold until all second three-dimensional detection frames have been traversed, so as to obtain the target three-dimensional detection frames.
After the intersection ratio between the first and second three-dimensional detection frames is calculated, it is compared with the preset intersection ratio threshold. If the calculated intersection ratio is greater than or equal to the threshold, the second three-dimensional detection frame overlaps the first to a high degree, and the corresponding second three-dimensional detection frame must be suppressed rather than retained; if the calculated intersection ratio is smaller than the threshold, the degree of overlap is low, and the corresponding second three-dimensional detection frame may be retained to preserve the accuracy of subsequent target detection. The process continues until all second three-dimensional detection frames in the detection frame sequence have been traversed; the first three-dimensional detection frame and the finally remaining second three-dimensional detection frame(s) are taken as the target three-dimensional detection frames. The intersection ratio threshold is preferably set to 0.5, but may also be set by those skilled in the art according to the actual application scenario, which the application does not limit.
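Finally, a sketch of the full suppression loop of S200-S400, combining the helpers above; the default threshold of 0.5 follows the description, while nms_3d and the other names are illustrative:

```python
def nms_3d(boxes: List[Box3D], iou_threshold: float = 0.5) -> List[Box3D]:
    # Greedy suppression over the confidence-sorted detection frame sequence.
    H = normalize_to_target_area(boxes)
    remaining = sort_by_confidence(boxes)
    kept: List[Box3D] = []
    while remaining:
        first = remaining.pop(0)       # current highest-confidence frame
        kept.append(first)
        survivors = []
        for second in remaining:
            # Cheap projection test first; the full IoU is computed only
            # when an overlapping region actually exists.
            if boxes_may_overlap(first, second) and \
                    iou_3d(first, second, H) >= iou_threshold:
                continue               # suppress this second frame
            survivors.append(second)
        remaining = survivors
    return kept
```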
The method calculates the intersection ratio between the initial three-dimensional detection frames output by the target detection model, eliminates redundant or repeated initial three-dimensional detection frames by comparing the intersection ratio with the preset intersection ratio threshold, and retains the detection frames with high confidence, so as to avoid the influence of redundant and repeated detection frames on target detection and improve the accuracy of target detection; the fast calculation of the intersection ratio also greatly accelerates the screening of detection frames.
Example two
Corresponding to the first embodiment, the present application further provides a three-dimensional detection frame processing system for target detection, as shown in fig. 5, which specifically includes:
The data preparation module 510 is configured to obtain a plurality of initial three-dimensional detection frames generated by the target detection model and confidence degrees corresponding to the initial three-dimensional detection frames;
the sorting module 520 is configured to sort the initial three-dimensional detection frames in a descending order according to the confidence level to generate a detection frame sequence;
The cross-over ratio calculating module 530 is configured to traverse the remaining second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with the highest confidence, and calculate a cross-over ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, where the cross-over ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame that respectively correspond to the target areas;
The data processing module 540 is configured to suppress the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to the preset cross ratio threshold until all the second three-dimensional detection frames are traversed to obtain the target three-dimensional detection frame.
Preferably, as shown in fig. 6, the cross-over ratio calculating module 530 further includes a bottom surface processing unit 610, an overlap judging unit 620, a bottom surface area calculating unit 630, and a cross-over ratio calculating unit 640;
A bottom surface processing unit 610, configured to determine a target area according to bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames;
An overlap judging unit 620 for judging whether or not an overlap region exists between the first three-dimensional detection frame and the second three-dimensional detection frame;
A bottom surface area calculating unit 630, configured to determine, in response to the first three-dimensional detection frame and the second three-dimensional detection frame having an overlapping area, a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule, where the first bottom surface area and the second bottom surface area are each a set of a plurality of line segments;
The intersection ratio calculating unit 640 is configured to determine an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame, and the height of the second three-dimensional detection frame;
The cross-over ratio calculating unit 640 is further configured to calculate and determine a cross-over ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame, and the volume of the intersection region.
Preferably, in some implementation scenarios, the bottom surface processing unit 610 is further configured to construct a rectangular coordinate system and obtain the four bottom surface vertex coordinates corresponding to each three-dimensional detection frame; determine the area of the target area according to the rectangle enclosed by the straight lines on which the X coordinate minimum value, the X coordinate maximum value, the Y coordinate minimum value and the Y coordinate maximum value among the plurality of bottom surface vertex coordinates are located; and divide, with the origin as a starting point, a region matching that area as the target region.
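A minimal sketch of this target region construction, assuming the bottom surface vertex coordinates of all initial frames are available as (x, y) pairs (names are illustrative):

    # Illustrative sketch: size the target region from the global extremes of
    # all bottom surface vertex coordinates, then anchor it at the origin.
    def target_region(bottom_vertices):
        xs = [x for x, _ in bottom_vertices]
        ys = [y for _, y in bottom_vertices]
        width = max(xs) - min(xs)   # X coordinate maximum - X coordinate minimum
        height = max(ys) - min(ys)  # Y coordinate maximum - Y coordinate minimum
        return width, height        # region of matching size, starting at the origin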
Preferably, in some implementation scenarios, the bottom surface area calculating unit 630 is further configured to define the height of the target area on the Y axis as H, where the value of H is the difference between the Y coordinate maximum value and the Y coordinate minimum value; calculate the intersection points of the first three-dimensional detection frame with the straight line Y=i and of the second three-dimensional detection frame with the straight line Y=i, and determine a first intersecting line segment of the first three-dimensional detection frame with the straight line Y=i and a second intersecting line segment of the second three-dimensional detection frame with the straight line Y=i, where 0<=i<=H-1; determine the first bottom surface area occupied by the first three-dimensional detection frame on the target area according to the set of the plurality of first intersecting line segments; and determine the second bottom surface area occupied by the second three-dimensional detection frame on the target area according to the set of the plurality of second intersecting line segments.
Preferably, in some implementation scenarios, the bottom surface area calculating unit 630 is further configured to define the four bottom surface vertices as a lowest point, a highest point, a leftmost point and a rightmost point according to their relative positions; calculate a first slope corresponding to the first connecting line between the lowest point and the leftmost point, a second slope corresponding to the second connecting line between the lowest point and the rightmost point, a third slope corresponding to the third connecting line between the leftmost point and the highest point, and a fourth slope corresponding to the fourth connecting line between the rightmost point and the highest point; when i is less than or equal to the Y coordinate of the lowest point, determine that no start intersection point exists; when i is less than or equal to the Y coordinate of the leftmost point, determine that the start intersection point is on the first connecting line, where the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i - the Y coordinate of the lowest point); when i is less than or equal to the Y coordinate of the highest point, determine that the start intersection point is on the third connecting line, where the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i - the Y coordinate of the leftmost point); when i is less than or equal to the Y coordinate of the rightmost point, determine that the start intersection point is on the fourth connecting line, where the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i - the Y coordinate of the highest point); and when i is greater than the Y coordinate of the highest point, determine that no start intersection point exists.
Preferably, in some implementation scenarios, the bottom surface area calculating unit 630 is further configured to determine that no ending intersection point exists when i is less than or equal to the Y coordinate of the lowest point; when i is less than or equal to the Y coordinate of the rightmost point, determine that the ending intersection point is on the second connecting line, where the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i - the Y coordinate of the lowest point); when i is less than or equal to the Y coordinate of the highest point, determine that the ending intersection point is on the fourth connecting line, where the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i - the Y coordinate of the rightmost point); when i is less than or equal to the Y coordinate of the leftmost point, determine that the ending intersection point is on the third connecting line, where the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i - the Y coordinate of the highest point); and when i is greater than the Y coordinate of the highest point, determine that no ending intersection point exists.
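The geometric intent of these rules can be sketched as follows for one convex bottom face, with the start intersection walking the lowest-leftmost-highest boundary chain and the ending intersection walking the lowest-rightmost-highest chain; the dx/dy slope convention, the vertex labelling and the reduction to the two reachable branches per chain are assumptions of this sketch, not limitations of the application:

    # Illustrative sketch: start/ending intersections of scanline y = i with a
    # convex bottom face whose vertices are labelled by position. Slopes are
    # dx/dy, so a point on a line through p is x = p_x + slope * (i - p_y).
    def slope(p, q):
        return (q[0] - p[0]) / (q[1] - p[1])    # assumes q_y != p_y

    def scanline_span(i, lowest, leftmost, rightmost, highest):
        if i <= lowest[1] or i > highest[1]:
            return None                          # scanline misses the bottom face
        if i <= leftmost[1]:                     # start on the first connecting line
            x_start = lowest[0] + slope(lowest, leftmost) * (i - lowest[1])
        else:                                    # start on the third connecting line
            x_start = leftmost[0] + slope(leftmost, highest) * (i - leftmost[1])
        if i <= rightmost[1]:                    # end on the second connecting line
            x_end = lowest[0] + slope(lowest, rightmost) * (i - lowest[1])
        else:                                    # end on the fourth connecting line
            x_end = rightmost[0] + slope(rightmost, highest) * (i - rightmost[1])
        return x_start, x_end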
Preferably, in some implementation scenarios, the intersection ratio calculating unit 640 is further configured to determine the start intersection point of the straight line Y=i with the intersecting line segment forming the first intersection area on the target area according to the larger of the X coordinate of the start intersection point corresponding to the first bottom surface area and the X coordinate of the start intersection point corresponding to the second bottom surface area; determine the ending intersection point of the straight line Y=i with the intersecting line segment forming the first intersection area on the target area according to the smaller of the X coordinate of the ending intersection point corresponding to the first bottom surface area and the X coordinate of the ending intersection point corresponding to the second bottom surface area; and determine the start intersection point of the intersecting line segment forming the second intersection region according to the larger of the minimum Z coordinate of the first three-dimensional detection frame and the minimum Z coordinate of the second three-dimensional detection frame, and the ending intersection point of the intersecting line segment forming the second intersection region according to the smaller of the maximum Z coordinate of the first three-dimensional detection frame and the maximum Z coordinate of the second three-dimensional detection frame.
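A minimal sketch of how the per-frame spans combine into the intersection region on each scanline, together with the Z extent of the intersection (span_a and span_b are assumed to come from a per-frame scanline routine such as the sketch above; all names are illustrative):

    # Illustrative sketch: intersection span on scanline y = i and Z extent of
    # the intersection region. span_a / span_b are (x_start, x_end) or None.
    def intersect_span(span_a, span_b):
        if span_a is None or span_b is None:
            return None
        x_start = max(span_a[0], span_b[0])    # larger of the start intersections
        x_end = min(span_a[1], span_b[1])      # smaller of the ending intersections
        return (x_start, x_end) if x_start < x_end else None

    def z_overlap(a_min_z, a_max_z, b_min_z, b_max_z):
        lo = max(a_min_z, b_min_z)             # larger of the minimum Z coordinates
        hi = min(a_max_z, b_max_z)             # smaller of the maximum Z coordinates
        return (lo, hi) if lo < hi else None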
Example three
Corresponding to the first and second embodiments, the embodiments of the present application further provide a computer program product which, when executed by a processor, implements the steps of the following method:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and confidence degrees corresponding to the initial three-dimensional detection frames;
the initial three-dimensional detection frames are ordered in a descending order according to the confidence level so as to generate a detection frame sequence;
traversing the rest second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with highest confidence, and calculating the intersection ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame which correspond to the target areas respectively;
And suppressing the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to a preset cross ratio threshold value until all the second three-dimensional detection frames in the detection frame sequence are traversed to obtain a target three-dimensional detection frame.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
Determining a target area according to the bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames;
judging whether an overlapping area exists between the first three-dimensional detection frame and the second three-dimensional detection frame;
Determining a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule in response to the existence of an overlapping area of the first three-dimensional detection frame and the second three-dimensional detection frame, wherein the first bottom surface area and the second bottom surface area are each a set of a plurality of line segments;
Determining an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame;
And calculating and determining the intersection ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame and the volume of the intersection area.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
Constructing a rectangular coordinate system, and acquiring four bottom surface vertex coordinates corresponding to each three-dimensional detection frame;
determining the area of the target area according to the rectangle enclosed by the straight lines on which the X coordinate minimum value, the X coordinate maximum value, the Y coordinate minimum value and the Y coordinate maximum value among the plurality of bottom surface vertex coordinates are located;
And dividing an area matched with the target area by taking an origin as a starting point to serve as the target area.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
defining the height of the target area on a Y axis as H, wherein the value corresponding to H is the difference value between the maximum value of the Y coordinate and the minimum value of the Y coordinate;
calculating an intersection point of the first three-dimensional detection frame and a straight line Y=i and an intersection point of the second three-dimensional detection frame and the straight line Y=i, and determining a first intersection line segment of the first three-dimensional detection frame and the straight line Y=i and a second intersection line segment of the second three-dimensional detection frame and the straight line Y=i, wherein 0<=i<=H-1;
Determining a first bottom surface area occupied by the first three-dimensional detection frame on the target area according to a set of a plurality of first intersecting line segments;
and determining a second bottom surface area occupied by the second three-dimensional detection frame on the target area according to the set of the plurality of second intersecting line segments.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
the intersection point comprises a start intersection point, and the calculation of the intersection point of the first three-dimensional detection frame or the second three-dimensional detection frame and the straight line Y=i comprises the following steps:
according to the relative positions of the four bottom surface vertices corresponding to the three-dimensional detection frame, respectively defining the four bottom surface vertices as a lowest point, a highest point, a leftmost point and a rightmost point;
calculating a first slope corresponding to a first connecting line between the lowest point and the leftmost point, a second slope corresponding to a second connecting line between the lowest point and the rightmost point, a third slope corresponding to a third connecting line between the leftmost point and the highest point and a fourth slope corresponding to a fourth connecting line between the rightmost point and the highest point;
if i is less than or equal to the Y coordinate of the lowest point, determining that the starting intersection point does not exist;
If i is less than or equal to the Y coordinate of the leftmost point, determining that the start intersection point is on the first connecting line, where the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i - the Y coordinate of the lowest point);
If i is less than or equal to the Y coordinate of the highest point, determining that the start intersection point is on the third connecting line, where the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i - the Y coordinate of the leftmost point);
if i is less than or equal to the Y coordinate of the rightmost point, determining that the start intersection point is on the fourth connecting line, where the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i - the Y coordinate of the highest point);
If i is greater than the Y coordinate of the highest point, it is determined that the starting intersection point does not exist.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
If i is less than or equal to the Y coordinate of the lowest point, determining that the ending intersection point does not exist;
if i is less than or equal to the Y coordinate of the rightmost point, determining that the ending intersection point is on the second connecting line, where the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i - the Y coordinate of the lowest point);
If i is less than or equal to the Y coordinate of the highest point, determining that the ending intersection point is on the fourth connecting line, where the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i - the Y coordinate of the rightmost point);
If i is less than or equal to the Y coordinate of the leftmost point, determining that the ending intersection point is on the third connecting line, where the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i - the Y coordinate of the highest point);
If i is greater than the Y coordinate of the highest point, determining that the ending intersection point does not exist.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
Determining a starting intersection point of the straight line y=i and the intersecting line segment forming the first intersection region on the target region according to the larger of the X coordinate of the starting intersection point corresponding to the first bottom surface region and the X coordinate of the starting intersection point corresponding to the second bottom surface region;
determining an ending intersection point of the straight line y=i and the intersecting line segment forming the first intersection region on the target region according to the smaller of the X coordinate of the ending intersection point corresponding to the first bottom surface region and the X coordinate of the ending intersection point corresponding to the second bottom surface region;
And determining a starting intersection point of the intersecting line segment forming the second intersection region according to the larger of the minimum Z coordinate of the first three-dimensional detection frame and the minimum Z coordinate of the second three-dimensional detection frame, and an ending intersection point of the intersecting line segment forming the second intersection region according to the smaller of the maximum Z coordinate of the first three-dimensional detection frame and the maximum Z coordinate of the second three-dimensional detection frame.
Example four
Corresponding to all the embodiments described above, an embodiment of the present application provides an electronic device, including: one or more processors; and a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the operations of:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and confidence degrees corresponding to the initial three-dimensional detection frames;
the initial three-dimensional detection frames are ordered in a descending order according to the confidence level so as to generate a detection frame sequence;
traversing the rest second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with highest confidence, and calculating the intersection ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame which correspond to the target areas respectively;
And suppressing the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to a preset cross ratio threshold value until all the second three-dimensional detection frames in the detection frame sequence are traversed to obtain a target three-dimensional detection frame.
Fig. 7 illustrates an architecture of an electronic device, which may include a processor 710, a video display adapter 711, a disk drive 712, an input/output interface 713, a network interface 714, and a memory 720, among others. The processor 710, the video display adapter 711, the disk drive 712, the input/output interface 713, the network interface 714, and the memory 720 may be communicatively connected via a bus 730.
The processor 710 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, for executing relevant programs to implement the technical solution provided by the present application.
The memory 720 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage, dynamic storage, etc. The memory 720 may store an operating system 721 for controlling the running of the electronic device 700, and a Basic Input Output System (BIOS) 722 for controlling low-level operation of the electronic device 700. In addition, a web browser 723, a data storage management system 724, an icon font processing system 727, and the like may also be stored. The icon font processing system 727 may be an application program for implementing the operations of the foregoing steps in the embodiment of the present application. In general, when the technical solution provided by the present application is implemented by software or firmware, the relevant program codes are stored in the memory 720 and invoked by the processor 710 for execution.
The input/output interface 713 is used to connect with an input/output module to enable information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
The network interface 714 is used to connect communication modules (not shown) to enable communication interactions of the device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 730 includes a path to transfer information between various components of the device (e.g., processor 710, video display adapter 711, disk drive 712, input/output interface 713, network interface 714, and memory 720).
In addition, the electronic device 700 may also obtain information of specific acquisition conditions from the virtual resource object acquisition condition information database, for performing condition judgment, and so on.
It should be noted that although the above device illustrates only the processor 710, the video display adapter 711, the disk drive 712, the input/output interface 713, the network interface 714, the memory 720, and the bus 730, in a specific implementation the device may include other components necessary for normal operation. Furthermore, it will be appreciated by those skilled in the art that the device may include only the components necessary to implement the solution of the present application, and not all of the components shown in the figure.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a cloud server, or a network device, etc.) to execute the method of the embodiments or some parts of the embodiments of the present application.
Example five
Corresponding to all the above embodiments, the embodiments of the present application further provide a computer-readable storage medium, characterized in that it stores a computer program, the computer program causing a computer to perform the following operations:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and confidence degrees corresponding to the initial three-dimensional detection frames;
the initial three-dimensional detection frames are ordered in a descending order according to the confidence level so as to generate a detection frame sequence;
traversing the rest second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with highest confidence, and calculating the intersection ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame which correspond to the target areas respectively;
And suppressing the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to a preset cross ratio threshold value until all the second three-dimensional detection frames in the detection frame sequence are traversed to obtain a target three-dimensional detection frame.
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, for the system embodiments, since they are substantially similar to the method embodiments, the description is relatively brief, and reference may be made to the description of the method embodiments for the relevant parts. The systems and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without undue burden.
The foregoing is only illustrative of the present application and is not to be construed as limiting thereof, but rather as various modifications, equivalent arrangements, improvements, etc., within the spirit and principles of the present application.
