CN118521772A - Three-dimensional detection frame processing method, system, equipment and medium for target detection

Three-dimensional detection frame processing method, system, equipment and medium for target detection

Info

Publication number
CN118521772A
Authority
CN
China
Prior art keywords
dimensional detection
detection frame
coordinate
point
intersection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410700911.9A
Other languages
Chinese (zh)
Inventor
杨继林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202410700911.9A
Publication of CN118521772A
Legal status: Pending (current)

Abstract

The application provides a three-dimensional detection frame processing method, system, device, and medium for target detection. The method comprises: obtaining a plurality of initial three-dimensional detection frames generated by a target detection model, together with the confidence corresponding to each initial three-dimensional detection frame; sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence; starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on a target area and from their heights; and suppressing every second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the sequence have been traversed, so as to obtain the target three-dimensional detection frame. By enabling fast calculation of the intersection ratio, the screening efficiency of the target three-dimensional detection frame is improved.

Description

Three-dimensional detection frame processing method, system, equipment and medium for target detection
Technical Field
The present application relates to the field of target detection, and in particular, to a method, a system, an apparatus, and a medium for processing a three-dimensional detection frame for target detection.
Background
Target detection technology has developed rapidly, and algorithm performance and accuracy have improved markedly under the impetus of deep learning. Target detection networks are continuously optimized and improved to identify and locate targets in complex scenes, and by adopting more efficient computing architectures and parallel processing techniques, the detection speed of target detection on large-scale data sets has been greatly increased, making practical deployment and application possible.
In the process of detecting a target, a plurality of detection frames are generated for the same target, and the redundant detection frames must be removed to obtain the target detection frame; meanwhile, with the development of the technology, detection frames have also evolved from two-dimensional to three-dimensional, which increases the difficulty of screening. A method for rapidly screening out repeated or redundant three-dimensional detection frames is therefore needed to solve the above problems.
Disclosure of Invention
Based on the foregoing, it is necessary to provide a method, a system, a device, and a medium for processing a three-dimensional detection frame for target detection, so as to quickly reject repeated or redundant detection frames among a plurality of initial three-dimensional detection frames.
In a first aspect, the present application provides a method for processing a three-dimensional detection frame for target detection, including:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and the confidence corresponding to each initial three-dimensional detection frame;
sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence;
starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on a target area and from their heights;
and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame.
In some embodiments, calculating the intersection ratio from the bottom surface areas and heights that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on the target area includes:
Determining a target area according to the bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames;
judging whether an overlapping area exists between the first three-dimensional detection frame and the second three-dimensional detection frame;
Determining, in response to an overlapping region existing between the first three-dimensional detection frame and the second three-dimensional detection frame, a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule, wherein the first bottom surface area and the second bottom surface area are each a set of line segments;
Determining an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame;
And calculating and determining the intersection ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame and the volume of the intersection area.
In some embodiments, the determining the target area according to the bottom vertex coordinates of the plurality of initial three-dimensional detection frames includes:
Constructing a rectangular coordinate system, and acquiring four bottom surface vertex coordinates corresponding to each three-dimensional detection frame;
determining the area of the target area from the straight lines located at the minimum X coordinate, the maximum X coordinate, the minimum Y coordinate, and the maximum Y coordinate among the acquired bottom surface vertex coordinates, which together frame the target area;
and dividing off an area matching that area, with the origin as the starting point, to serve as the target area.
In some embodiments, determining, according to a preset bottom surface area determination rule, the first bottom surface area occupied by the first three-dimensional detection frame in the target area and the second bottom surface area occupied by the second three-dimensional detection frame in the target area includes:
defining the height of the target area on a Y axis as H, wherein the value corresponding to H is the difference value between the maximum value of the Y coordinate and the minimum value of the Y coordinate;
calculating the intersection points of the first three-dimensional detection frame with the straight line Y = i and of the second three-dimensional detection frame with the straight line Y = i, and determining a first intersecting line segment of the first three-dimensional detection frame with the straight line Y = i and a second intersecting line segment of the second three-dimensional detection frame with the straight line Y = i, where 0 ≤ i ≤ H−1;
Determining a first bottom surface area occupied by the first three-dimensional detection frame on the target area according to a set of a plurality of first intersecting line segments;
and determining a second bottom surface area occupied by the second three-dimensional detection frame on the target area according to the set of the plurality of second intersecting line segments.
In some embodiments, the intersection points include a start intersection point, and calculating the intersection point of the first three-dimensional detection frame or the second three-dimensional detection frame with the straight line Y = i includes:
defining, according to the relative positions of the four bottom surface vertices corresponding to the three-dimensional detection frame, the four bottom surface vertices as a lowest point, a highest point, a leftmost point, and a rightmost point, respectively;
calculating a first slope corresponding to a first connecting line between the lowest point and the leftmost point, a second slope corresponding to a second connecting line between the lowest point and the rightmost point, a third slope corresponding to a third connecting line between the leftmost point and the highest point and a fourth slope corresponding to a fourth connecting line between the rightmost point and the highest point;
if i is less than or equal to the Y coordinate of the lowest point, determining that the start intersection point does not exist;
otherwise, if i is less than or equal to the Y coordinate of the leftmost point, determining that the start intersection point is on the first connecting line, where the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i − the Y coordinate of the lowest point);
otherwise, if i is less than or equal to the Y coordinate of the highest point, determining that the start intersection point is on the third connecting line, where the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i − the Y coordinate of the leftmost point);
otherwise, if i is less than or equal to the Y coordinate of the rightmost point, determining that the start intersection point is on the fourth connecting line, where the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i − the Y coordinate of the highest point);
and if i is greater than the Y coordinate of the highest point, determining that the start intersection point does not exist.
In some embodiments, the intersection points further include an ending intersection point, and calculating the intersection point of the first three-dimensional detection frame or the second three-dimensional detection frame with the straight line Y = i further includes:
If i is less than or equal to the Y coordinate of the lowest point, determining that the ending intersection point does not exist;
otherwise, if i is less than or equal to the Y coordinate of the rightmost point, determining that the ending intersection point is on the second connecting line, where the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i − the Y coordinate of the lowest point);
otherwise, if i is less than or equal to the Y coordinate of the highest point, determining that the ending intersection point is on the fourth connecting line, where the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i − the Y coordinate of the rightmost point);
otherwise, if i is less than or equal to the Y coordinate of the leftmost point, determining that the ending intersection point is on the third connecting line, where the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i − the Y coordinate of the highest point);
and if i is greater than the Y coordinate of the highest point, determining that the ending intersection point does not exist.
In some embodiments, the intersection region includes a first intersection region on the target area and a second intersection region on the Z axis, and determining the intersection region based on the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame, and the height of the second three-dimensional detection frame includes:
Determining a starting intersection point of a straight line y=i and an intersecting line segment forming the first intersecting region on a target region according to the maximum value of the X coordinate of the starting intersection point corresponding to the first bottom surface region and the X coordinate of the starting intersection point corresponding to the second bottom surface region;
determining an ending intersection point of a straight line y=i and an intersecting line segment forming the first intersecting region on the target region according to a minimum value of an X coordinate of the ending intersection point corresponding to the first bottom surface region and an X coordinate of the ending intersection point corresponding to the second bottom surface region;
And determining a start intersection point of the intersecting line segment constituting the second intersection region according to the larger of the minimum Z coordinate of the first three-dimensional detection frame and the minimum Z coordinate of the second three-dimensional detection frame, and determining an ending intersection point of the intersecting line segment constituting the second intersection region according to the smaller of the maximum Z coordinate of the first three-dimensional detection frame and the maximum Z coordinate of the second three-dimensional detection frame.
In a second aspect, the present application provides a three-dimensional inspection box processing system for object inspection, the system comprising:
the data preparation module is used for acquiring a plurality of initial three-dimensional detection frames generated by the target detection model and the confidence corresponding to each initial three-dimensional detection frame;
the sorting module is used for sorting the initial three-dimensional detection frames in a descending order according to the confidence level so as to generate a detection frame sequence;
The intersection ratio calculation module is used for traversing, starting from the first three-dimensional detection frame with the highest confidence, the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on the target area and from their heights;
The data processing module is used for suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames have been traversed, so as to obtain the target three-dimensional detection frame.
In a third aspect, the present application provides an electronic device, including:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the operations of:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and the confidence corresponding to each initial three-dimensional detection frame;
sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence;
starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on a target area and from their heights;
and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame.
In a fourth aspect, the present application also provides a computer-readable storage medium having stored thereon a computer program that causes a computer to perform the operations of:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and the confidence corresponding to each initial three-dimensional detection frame;
sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence;
starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first three-dimensional detection frame and the second three-dimensional detection frame respectively occupy on a target area and from their heights;
and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame.
The beneficial effects achieved by the application are as follows:
The application provides a three-dimensional detection frame processing method for target detection, which comprises: obtaining a plurality of initial three-dimensional detection frames generated by a target detection model, together with the confidence corresponding to each initial three-dimensional detection frame; sorting the initial three-dimensional detection frames in descending order of confidence to generate a detection frame sequence; starting from the first three-dimensional detection frame with the highest confidence, traversing the remaining second three-dimensional detection frames in the detection frame sequence and calculating the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on a target area and from their heights; and suppressing each second three-dimensional detection frame whose intersection ratio is greater than or equal to a preset intersection ratio threshold until all second three-dimensional detection frames in the detection frame sequence have been traversed, so as to obtain a target three-dimensional detection frame. The application thereby screens the initial three-dimensional detection frames generated by the target detection model, and by enabling fast calculation of the intersection ratio, greatly improves the efficiency of screening and sorting the target three-dimensional detection frames.
Drawings
For a clearer description of the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the description below are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art, wherein:
FIG. 1 is a flow chart of a three-dimensional detection frame processing method for target detection according to an embodiment of the present application;
FIG. 2 is a schematic view of a target area according to an embodiment of the present application;
FIG. 3 is a schematic diagram showing the relative positions of the vertices of a bottom surface according to an embodiment of the present application;
FIG. 4 is a schematic illustration of an intersection point provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a three-dimensional detection frame processing architecture for object detection according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an intersection ratio calculation module according to an embodiment of the present application;
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be understood that throughout this specification and the claims, unless the context clearly requires otherwise, the words "comprise", "comprising", and the like, are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, it is the meaning of "including but not limited to".
It should also be appreciated that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
It should be noted that labels such as "S1" and "S2" are used only to identify steps and are not to be construed as specifying a particular order or sequence, nor as limiting the present application; they merely facilitate the description of the method. In addition, the technical solutions of the embodiments may be combined with each other, but only on the basis that they can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, that combination should be considered not to exist and not within the scope of protection claimed in the present application.
The three-dimensional detection frame processing method for target detection disclosed by the application can be applied in many fields. In the intelligent traffic field, it can be used in autonomous vehicles and traffic monitoring systems to detect and track targets such as vehicles and pedestrians in real time, thereby improving traffic safety and efficiency. In the security field, target detection algorithms are used for face recognition, behavior analysis, and the like, enabling automatic recognition of and early warning about targets in surveillance video. In addition, target detection plays an important role in medical image analysis, intelligent retail, industrial quality inspection, and other fields.
At present, there are two common ways to compute the intersection ratio. One is to call a computing library such as OpenCV; this approach is full-featured but comparatively slow. The other is to write CUDA code directly; this approach generally computes parameters such as the intersection points of candidate frame edges, is relatively complex, and mostly does not support three-dimensionally rotated detection frames. In addition, ASIC chips, which are designed to optimize specific application scenarios in neural network inference, offer higher efficiency and lower power consumption than GPUs and are increasingly used for neural network inference tasks; however, their processing capability is weaker and they are less efficient at complex computation. The method provided by the application greatly accelerates the screening of redundant, repeated detection frames in the target detection process through fast and simple calculation of the intersection ratio, and greatly improves the efficiency and generality of target detection.
Example 1
The embodiment of the application provides a three-dimensional detection frame processing method for target detection, in particular a method for processing the three-dimensional detection frames generated in the target detection process to obtain the target detection frame, which comprises the following steps:
s100, acquiring a plurality of initial three-dimensional detection frames generated by the target detection model and confidence degrees corresponding to the initial three-dimensional detection frames.
In the embodiment of the application, the target detection model includes, but is not limited to, candidate-frame-based models such as the R-CNN series and end-to-end models such as the YOLO series; the method obtains a plurality of initial three-dimensional detection frames and their corresponding confidences from the target detection model. Generating the plurality of initial three-dimensional detection frames and the corresponding confidences with a target detection model is a conventional technique in the art and is not repeated here.
S200, sorting the initial three-dimensional detection frames in a descending order according to the confidence level so as to generate a detection frame sequence.
In target detection, the confidence (confidence score) is an important metric used to evaluate how certain the model is about a predicted detection frame. Specifically, the confidence generally represents the degree of certainty the model has in its predicted target class and detection frame position. The output of a target detection model usually contains a plurality of initial detection frames and corresponding class predictions. The initial three-dimensional detection frames are sorted by confidence: the initial three-dimensional detection frame with the highest confidence serves as the first three-dimensional detection frame, the remaining initial three-dimensional detection frames serve as second three-dimensional detection frames, the frames are arranged in descending order of confidence, and the arranged sequence is defined as the detection frame sequence.
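To make the data involved concrete, the following Python sketch models a candidate frame and the descending-confidence sort of S100-S200. The Box3D type and its field names are illustrative assumptions, not taken from the patent, which describes a frame only by its eight vertices.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Box3D:
    bottom: List[Tuple[float, float]]  # four (x, y) bottom-surface vertices
    z_min: float                       # lowest Z coordinate of the frame
    z_max: float                       # highest Z coordinate of the frame
    confidence: float                  # confidence score from the detector

def sort_by_confidence(boxes: List[Box3D]) -> List[Box3D]:
    # Descending order: element 0 becomes the first (highest-confidence) frame.
    return sorted(boxes, key=lambda b: b.confidence, reverse=True)
```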
S300, starting from the first three-dimensional detection frame with the highest confidence, traverse the remaining second three-dimensional detection frames in the detection frame sequence and calculate the intersection ratio between each second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated from the bottom surface areas that the first and second three-dimensional detection frames respectively occupy on the target area and from their heights.
In some implementations, a method of calculating an intersection ratio from a bottom surface region and a height of a first three-dimensional detection frame and a second three-dimensional detection frame, respectively, on a target region includes:
S310, determining a target area according to the bottom surface vertex coordinates of the initial three-dimensional detection frames.
Specifically, as shown in fig. 2, determining the target area according to the bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames includes: constructing a rectangular coordinate system, i.e., a three-dimensional coordinate system comprising an X axis, a Y axis, and a Z axis; acquiring the four bottom surface vertex coordinates corresponding to each three-dimensional detection frame; and determining the area of the target area from the straight lines located at the minimum X coordinate, maximum X coordinate, minimum Y coordinate, and maximum Y coordinate among the bottom surface vertex coordinates, which together frame the target area. Specifically, denote the eight vertices of an initial three-dimensional detection frame as (x_m, y_m, z_m), where 0 ≤ m ≤ 7, and the four bottom surface vertices as (x_n, y_n, z_n), where 0 ≤ n ≤ 3. Compute the maximum X coordinate (denoted all_max_x), the minimum X coordinate (all_min_x), the maximum Y coordinate (all_max_y), and the minimum Y coordinate (all_min_y) over the four bottom surface vertices of all initial three-dimensional detection frames. Assuming there are k initial three-dimensional detection frames, all_max_x = max(x(i, m)), all_min_x = min(x(i, m)), all_max_y = max(y(i, m)), and all_min_y = min(y(i, m)), where 0 ≤ i ≤ k−1 and 0 ≤ m ≤ 3. A target area that can accommodate all the initial three-dimensional detection frames is thus obtained, providing a common reference plane for each three-dimensional detection frame so that its bottom surface area within the target area can subsequently be calculated, which in turn improves the accuracy of the subsequent intersection ratio calculation.
According to the four extreme values determined above (the minimum and maximum X coordinates and the minimum and maximum Y coordinates), a target area is marked out on the X-Y plane whose area is S = (all_max_x − all_min_x) × (all_max_y − all_min_y); the target area starts at the origin (0, 0). It will be appreciated that the target area can accommodate the bottom surfaces of all the initial three-dimensional detection frames. To simplify processing and improve the accuracy of the subsequent intersection ratio calculation, the application translates the first and second three-dimensional detection frames whose intersection ratio is to be calculated on the X-Y plane; this translation does not affect the Z coordinates of the frames. Preferably, each vertex of each three-dimensional detection frame is translated along the X axis by the minimum X coordinate and along the Y axis by the minimum Y coordinate, so that a vertex (x_m, y_m, z_m) becomes (x_m − all_min_x, y_m − all_min_y, z_m) after translation.
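A minimal sketch of this target-area construction and translation, continuing with the illustrative Box3D type from the previous sketch; the helper name is an assumption:

```python
def normalize_to_target_area(boxes: List[Box3D]) -> float:
    # all_min/max over every bottom-surface vertex of every frame (S310).
    xs = [x for b in boxes for (x, _) in b.bottom]
    ys = [y for b in boxes for (_, y) in b.bottom]
    all_min_x, all_min_y = min(xs), min(ys)
    all_max_y = max(ys)
    # Translate every frame so the target area starts at the origin (0, 0);
    # Z coordinates are deliberately left untouched.
    for b in boxes:
        b.bottom = [(x - all_min_x, y - all_min_y) for (x, y) in b.bottom]
    # Height H of the target area on the Y axis.
    return all_max_y - all_min_y
```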
S320, judging whether an overlapping area exists between the first three-dimensional detection frame and the second three-dimensional detection frame.
The embodiment of the application provides a method for judging whether an overlapping region exists between the first three-dimensional detection frame and the second three-dimensional detection frame from the four bottom surface vertex coordinates of the two frames and their coordinates on the Z axis. If the maximum X coordinate among the four bottom surface vertices of the first three-dimensional detection frame is greater than or equal to the minimum X coordinate of the second three-dimensional detection frame, and the minimum X coordinate among the four bottom surface vertices of the first three-dimensional detection frame is less than or equal to the maximum X coordinate among the four bottom surface vertices of the second three-dimensional detection frame, it is judged that the two frames intersect on the X axis. If the corresponding condition holds for the Y coordinates of the bottom surface vertices, it is judged that the two frames intersect on the Y axis; and if it holds for the frames' Z coordinates, it is judged that the two frames intersect on the Z axis. If the first and second three-dimensional detection frames intersect on the X axis, the Y axis, and the Z axis, it is judged that an overlapping region exists between them; if they fail to intersect in any one of the X, Y, or Z directions, it is judged that no overlapping region exists between them.
In response to the absence of an overlapping region between the first and second three-dimensional detection frames, the second three-dimensional detection frame is retained in the detection frame sequence; in that case the two frames are independent of each other, and the second frame is neither redundant nor a duplicate. In response to the existence of an overlapping region, the intersection ratio between the two frames is calculated, and the degree of overlap is determined from it in order to decide whether to keep or reject the second three-dimensional detection frame. By directly comparing the coordinate extremes of the three-dimensional detection frames in each direction, it can be determined whether the frames intersect in each direction and hence whether an intersection region exists; if none exists, the subsequent intersection ratio calculation is skipped and the non-overlapping three-dimensional detection frame is retained directly. This preprocessing greatly improves the efficiency of determining the target detection frames.
For example, the region occupied by the first three-dimensional detection frame on the X axis is defined as [A_min_x, A_max_x], where A_min_x and A_max_x are respectively the minimum and maximum X coordinates among its four bottom surface vertices; the region occupied on the Y axis may be represented as [A_min_y, A_max_y], defined analogously for the Y coordinates; and the region occupied on the Z axis may be represented as [A_min_z, A_max_z], where A_min_z and A_max_z are the Z coordinates of its bottom surface vertices and top surface vertices, respectively. Likewise, the regions occupied by the second three-dimensional detection frame on the X, Y, and Z axes are denoted [B_min_x, B_max_x], [B_min_y, B_max_y], and [B_min_z, B_max_z]. For the first and second three-dimensional detection frames, the common region on the X axis is [min_x, max_x], where min_x = max(A_min_x, B_min_x) and max_x = min(A_max_x, B_max_x). Whether an overlapping region exists between the two frames on the X axis can be represented by x_overlap = (max_x >= min_x): x_overlap equal to 1 (max_x >= min_x holds) indicates that the two frames overlap on the X axis, and x_overlap equal to 0 (max_x < min_x) indicates that they do not. Likewise, y_overlap and z_overlap represent whether the two frames overlap on the Y axis and the Z axis, respectively. Only when the two three-dimensional detection frames have overlapping regions on all of the X, Y, and Z axes can it be judged that an overlapping region exists between them; if there is no overlap on any one of the X, Y, or Z axes, it is judged that no overlapping region exists.
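The axis-projection test above can be sketched as follows, reusing the illustrative Box3D; the helper name is an assumption:

```python
def boxes_may_overlap(a: Box3D, b: Box3D) -> bool:
    # Interval test on each axis: the [min, max] projections must intersect
    # on X, Y and Z simultaneously for an overlapping region to exist (S320).
    ax = [x for (x, _) in a.bottom]; ay = [y for (_, y) in a.bottom]
    bx = [x for (x, _) in b.bottom]; by = [y for (_, y) in b.bottom]
    x_overlap = max(min(ax), min(bx)) <= min(max(ax), max(bx))
    y_overlap = max(min(ay), min(by)) <= min(max(ay), max(by))
    z_overlap = max(a.z_min, b.z_min) <= min(a.z_max, b.z_max)
    return x_overlap and y_overlap and z_overlap
```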
S330, in response to the existence of an overlapping region between the first three-dimensional detection frame and the second three-dimensional detection frame, determine a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule, wherein the first bottom surface area and the second bottom surface area are each a set of line segments.
Determining, according to a preset bottom surface area determination rule, the first bottom surface area occupied by the first three-dimensional detection frame in the target area and the second bottom surface area occupied by the second three-dimensional detection frame in the target area includes: defining the height of the target area on the Y axis as H, where H equals the difference between the maximum and minimum Y coordinates; calculating the intersection points of the first three-dimensional detection frame with the straight line Y = i and of the second three-dimensional detection frame with the straight line Y = i, and determining a first intersecting line segment of the first frame with Y = i and a second intersecting line segment of the second frame with Y = i, where 0 ≤ i ≤ H−1; determining the first bottom surface area occupied by the first three-dimensional detection frame on the target area as the set of first intersecting line segments; and determining the second bottom surface area occupied by the second three-dimensional detection frame on the target area as the set of second intersecting line segments. Taking one three-dimensional detection frame as an example, the region it occupies at height i in the target area is represented as [x_start(i), x_end(i)], i.e., the intersecting line segment between the coordinate points (x_start(i), i) and (x_end(i), i), where x_start(i) is the X coordinate of the intersection of the leftmost side of the frame's bottom surface with the straight line Y = i and x_end(i) is the X coordinate of the intersection of the rightmost side with Y = i. The bottom surface area occupied by the three-dimensional detection frame on the target area may then be represented as [(x_start(0), x_end(0)), ..., (x_start(i), x_end(i)), ..., (x_start(H−1), x_end(H−1))], i.e., a set of intersecting line segments. This provides a fast method of calculating the intersection ratio that supports three-dimensional detection frames rotated about the Z axis; it avoids complex computations such as finding the intersection points of detection frame edges, its implementation details are clear, and it greatly broadens the applicable scenarios of target detection: even an ASIC chip with limited computing capability can rapidly screen the detection frames generated by target detection.
Specifically, the intersection points include a start intersection point and an ending intersection point. Calculating the intersection of the first or second three-dimensional detection frame with the straight line Y = i includes determining the X coordinate x_start(i) of the start intersection point:
As shown in fig. 3, according to the relative positions of the four bottom surface vertices corresponding to the three-dimensional detection frame, the four bottom surface vertices are respectively defined as the lowest point (bot), the highest point (top), the leftmost point (left), and the rightmost point (right). Specifically, first, the vertex with the smallest Y coordinate among the four bottom surface vertices is selected as the lowest point bot; if several vertices share the smallest Y coordinate, the one with the largest X coordinate is selected as bot, whose coordinates are (x(bot), y(bot)). Then the rotation angles of the other three vertices relative to bot are calculated: the vertex with the smallest rotation angle is selected as the rightmost point right, with coordinates (x(right), y(right)); the vertex with the largest rotation angle is selected as the leftmost point left, with coordinates (x(left), y(left)); and the remaining vertex is the highest point top, with coordinates (x(top), y(top)).
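A sketch of this vertex classification, using rotation angles computed with atan2 as one plausible reading of the description; it assumes the four bottom vertices are distinct, and ties beyond the stated Y-coordinate rule are not handled:

```python
import math

def classify_bottom_vertices(bottom):
    # bot: smallest Y, ties broken by largest X; right/left: smallest/largest
    # rotation angle about bot; top: the remaining vertex (cf. FIG. 3).
    bot = min(bottom, key=lambda p: (p[1], -p[0]))
    rest = [p for p in bottom if p != bot]
    rest.sort(key=lambda p: math.atan2(p[1] - bot[1], p[0] - bot[0]))
    right, top, left = rest  # ascending angle: right, then top, then left
    return bot, right, top, left
```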
And calculating a first slope corresponding to a first connecting line between the lowest point and the leftmost point, a second slope corresponding to a second connecting line between the lowest point and the rightmost point, a third slope corresponding to a third connecting line between the leftmost point and the highest point and a fourth slope corresponding to a fourth connecting line between the rightmost point and the highest point.
x_start(i) is calculated as shown in FIG. 4: if i is less than or equal to the Y coordinate of the lowest point, it is determined that the start intersection point does not exist; otherwise, if i is less than or equal to the Y coordinate of the leftmost point, the start intersection point is on the first connecting line, and the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i − the Y coordinate of the lowest point); otherwise, if i is less than or equal to the Y coordinate of the highest point, the start intersection point is on the third connecting line, and the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i − the Y coordinate of the leftmost point); otherwise, if i is less than or equal to the Y coordinate of the rightmost point, the start intersection point is on the fourth connecting line, and the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i − the Y coordinate of the highest point); and if i is greater than the Y coordinate of the highest point, it is determined that the start intersection point does not exist.
Calculating the intersection of the first or second three-dimensional detection frame with the straight line Y = i likewise includes determining the X coordinate x_end(i) of the ending intersection point, as shown in FIG. 4: if i is less than or equal to the Y coordinate of the lowest point, it is determined that the ending intersection point does not exist; otherwise, if i is less than or equal to the Y coordinate of the rightmost point, the ending intersection point is on the second connecting line, and the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i − the Y coordinate of the lowest point); otherwise, if i is less than or equal to the Y coordinate of the highest point, the ending intersection point is on the fourth connecting line, and the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i − the Y coordinate of the rightmost point); otherwise, if i is less than or equal to the Y coordinate of the leftmost point, the ending intersection point is on the third connecting line, and the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i − the Y coordinate of the highest point); and if i is greater than the Y coordinate of the highest point, it is determined that the ending intersection point does not exist. The application thus provides a method for determining the bottom surface area of a three-dimensional detection frame: whether the frame has a corresponding intersecting line segment at each height of the target area is determined, decomposing the calculation of a surface into line segments.
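The start/end intersection rules can be sketched as below. As an assumption of this sketch, only the branches reachable for an ordinary rotated rectangle (where the highest point has the largest Y coordinate) are implemented; the description's remaining fallback branches are subsumed by them under that assumption, and the "slope" is read as change in X per unit change in Y:

```python
def inv_slope(p, q):
    # "Slope" in the sense used here: change in X per unit change in Y
    # along the edge p -> q (the guards below keep the edge non-horizontal).
    return (q[0] - p[0]) / (q[1] - p[1])

def x_start(i, bot, right, top, left):
    # Leftmost intersection of the bottom surface with the line Y = i.
    if i <= bot[1] or i > top[1]:
        return None                                          # no start intersection
    if i <= left[1]:
        return bot[0] + inv_slope(bot, left) * (i - bot[1])  # first connecting line
    return left[0] + inv_slope(left, top) * (i - left[1])    # third connecting line

def x_end(i, bot, right, top, left):
    # Rightmost intersection of the bottom surface with the line Y = i.
    if i <= bot[1] or i > top[1]:
        return None                                           # no ending intersection
    if i <= right[1]:
        return bot[0] + inv_slope(bot, right) * (i - bot[1])  # second connecting line
    return right[0] + inv_slope(right, top) * (i - right[1])  # fourth connecting line
```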
S340, determining an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame.
The intersecting region includes a first intersecting region on the target region and a second intersecting region on the Z-axis, and determining the intersecting region based on the first bottom surface region, the second bottom surface region, and the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame includes:
Determining the start intersection point of the straight line Y = i with the intersecting line segment constituting the first intersection region on the target area according to the maximum of the X coordinate of the start intersection point corresponding to the first bottom surface area and the X coordinate of the start intersection point corresponding to the second bottom surface area; and determining the ending intersection point of the straight line Y = i with the intersecting line segment constituting the first intersection region according to the minimum of the X coordinates of the corresponding ending intersection points. For convenience of explanation, for Y = i (0 ≤ i ≤ H−1), suppose the first bottom surface area occupied by the first three-dimensional detection frame on the target area is denoted (a_x_start(i), a_x_end(i)) and the second bottom surface area occupied by the second three-dimensional detection frame is denoted (b_x_start(i), b_x_end(i)). Then the start intersection point of the intersection region between the two frames is overlap_start(i) = max(a_x_start(i), b_x_start(i)) and the ending intersection point is overlap_end(i) = min(a_x_end(i), b_x_end(i)), so that for Y = i (0 ≤ i ≤ H−1) the first intersection region occupied by the two frames on the target area may be expressed as (overlap_start(i), overlap_end(i)).
And the start intersection point of the intersecting line segment constituting the second intersection region is determined according to the larger of the minimum Z coordinates of the first and second three-dimensional detection frames, while the ending intersection point is determined according to the smaller of their maximum Z coordinates. For convenience of explanation, supposing the region occupied by the first three-dimensional detection frame on the Z axis is denoted [A_min_z, A_max_z] and that of the second frame [B_min_z, B_max_z], the start intersection point of the region occupied by the intersection region on the Z axis is min_z = max(A_min_z, B_min_z), the ending intersection point is max_z = min(A_max_z, B_max_z), and the second intersection region on the Z axis is denoted (min_z, max_z).
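Per scanline, the first intersection region thus reduces to a max/min over segment endpoints; a sketch continuing the helpers above (the function name is an assumption):

```python
def overlap_segment(i, a_vertices, b_vertices):
    # Length of the first-intersection-region segment on the line Y = i:
    # start = max of the two x_start values, end = min of the two x_end values.
    a_s, a_e = x_start(i, *a_vertices), x_end(i, *a_vertices)
    b_s, b_e = x_start(i, *b_vertices), x_end(i, *b_vertices)
    if None in (a_s, a_e, b_s, b_e):
        return 0.0
    return max(0.0, min(a_e, b_e) - max(a_s, b_s))
```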
S350, calculating and determining the intersection ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame and the volume of the intersection area.
The method for calculating the volumes of the first and second three-dimensional detection frames and the volume of the intersection region includes: calculating the area of the first bottom surface area occupied by the first three-dimensional detection frame on the target area, the area of the second bottom surface area occupied by the second three-dimensional detection frame, and the area of the first intersection region occupied by the intersection region on the target area. Specifically, for Y = i (0 ≤ i ≤ H−1), the region occupied by a frame's bottom surface on the target area may be represented by (x_start(i), x_end(i)), so the bottom surface area contribution at Y = i is bot(i) = x_end(i) − x_start(i), and the total bottom surface area is obtained by accumulation: bot = Σ bot(i) = Σ (x_end(i) − x_start(i)). In this way the area of the first bottom surface area corresponding to the first three-dimensional detection frame, the area of the second bottom surface area corresponding to the second three-dimensional detection frame, and the area of the first intersection region occupied by the intersection region on the target area can all be calculated.
The volume of the first three-dimensional detection frame is calculated from its height and the area of the first bottom surface area; the volume of the second three-dimensional detection frame from its height and the area of the second bottom surface area; and the volume of the intersection region from its height and the area of the first intersection region. For illustration, suppose the height of the first three-dimensional detection frame is denoted H(A), where H(A) = A_max_z − A_min_z; its volume is denoted V(A), where V(A) = bot_A × H(A). Similarly, the volume of the second three-dimensional detection frame is V(B) = bot_B × H(B), where H(B) = B_max_z − B_min_z. The height of the intersection region is H(overlap) = max_z − min_z, and its volume is V(overlap) = bot_overlap × H(overlap).
The intersection ratio between the first and second three-dimensional detection frames = the volume of the intersection region / (the volume of the first frame + the volume of the second frame − the volume of the intersection region). In the notation above, the intersection ratio (IoU, Intersection over Union) of the first and second three-dimensional detection frames may be expressed as IoU(A, B) = V(overlap) / (V(A) + V(B) − V(overlap)).
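Putting the pieces together, a sketch of the volume and IoU computation; as an assumption of this sketch, the unit-step scanline sum matches the description's accumulation when the translated coordinates lie on the integer grid of the target area:

```python
def iou_3d(a: Box3D, b: Box3D, H: float) -> float:
    a_v = classify_bottom_vertices(a.bottom)
    b_v = classify_bottom_vertices(b.bottom)

    def bottom_area(v):
        # bot = sum over i of (x_end(i) - x_start(i)), the scanline lengths.
        total = 0.0
        for i in range(int(H)):
            s, e = x_start(i, *v), x_end(i, *v)
            if s is not None and e is not None:
                total += e - s
        return total

    bot_a, bot_b = bottom_area(a_v), bottom_area(b_v)
    bot_overlap = sum(overlap_segment(i, a_v, b_v) for i in range(int(H)))
    # Volume = bottom area x height; the overlap height comes from the
    # common Z interval of the two frames.
    v_a = bot_a * (a.z_max - a.z_min)
    v_b = bot_b * (b.z_max - b.z_min)
    h_overlap = max(0.0, min(a.z_max, b.z_max) - max(a.z_min, b.z_min))
    v_overlap = bot_overlap * h_overlap
    union = v_a + v_b - v_overlap
    return v_overlap / union if union > 0 else 0.0
```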
S400, suppress each second three-dimensional detection frame whose intersection ratio is greater than or equal to the preset intersection ratio threshold until all second three-dimensional detection frames have been traversed, so as to obtain the target three-dimensional detection frames.
After the intersection ratio between the first and second three-dimensional detection frames is calculated, it is compared with the preset intersection ratio threshold. If the calculated intersection ratio is greater than or equal to the threshold, the second three-dimensional detection frame overlaps the first to a high degree, and the corresponding second three-dimensional detection frame must be suppressed rather than retained; if the calculated intersection ratio is smaller than the threshold, the degree of overlap is low, and the corresponding second three-dimensional detection frame may be retained to preserve the accuracy of subsequent target detection. The process continues until all second three-dimensional detection frames in the detection frame sequence have been traversed; the first three-dimensional detection frame and the finally remaining second three-dimensional detection frame(s) are taken as the target three-dimensional detection frames. The intersection ratio threshold is preferably set to 0.5, but may also be set by those skilled in the art according to the actual application scenario, which the application does not limit.
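Finally, a sketch of the full suppression loop of S200-S400, combining the helpers above; the default threshold of 0.5 follows the description, while nms_3d and the other names are illustrative:

```python
def nms_3d(boxes: List[Box3D], iou_threshold: float = 0.5) -> List[Box3D]:
    # Greedy suppression over the confidence-sorted detection frame sequence.
    H = normalize_to_target_area(boxes)
    remaining = sort_by_confidence(boxes)
    kept: List[Box3D] = []
    while remaining:
        first = remaining.pop(0)       # current highest-confidence frame
        kept.append(first)
        survivors = []
        for second in remaining:
            # Cheap projection test first; the full IoU is computed only
            # when an overlapping region actually exists.
            if boxes_may_overlap(first, second) and \
                    iou_3d(first, second, H) >= iou_threshold:
                continue               # suppress this second frame
            survivors.append(second)
        remaining = survivors
    return kept
```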
The method calculates the intersection ratio between the initial three-dimensional detection frames output by the target detection model, eliminates redundant or repeated initial three-dimensional detection frames by comparing the intersection ratio with the preset intersection ratio threshold, and retains the detection frames with high confidence, so as to avoid the influence of redundant and repeated detection frames on target detection and improve the accuracy of target detection; the fast calculation of the intersection ratio also greatly accelerates the screening of detection frames.
Example two
Corresponding to the first embodiment, the present application further provides a three-dimensional detection frame processing system for target detection, as shown in fig. 5, which specifically includes:
The data preparation module 510 is configured to obtain a plurality of initial three-dimensional detection frames generated by the target detection model and confidence degrees corresponding to the initial three-dimensional detection frames;
the sorting module 520 is configured to sort the initial three-dimensional detection frames in a descending order according to the confidence level to generate a detection frame sequence;
The cross-over ratio calculating module 530 is configured to traverse the remaining second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with the highest confidence, and calculate a cross-over ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, where the cross-over ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame that respectively correspond to the target areas;
The data processing module 540 is configured to suppress the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to the preset cross ratio threshold until all the second three-dimensional detection frames are traversed to obtain the target three-dimensional detection frame.
Preferably, as shown in fig. 6, the cross-over ratio calculating module 530 further includes a bottom surface processing unit 610, an overlap judging unit 620, a bottom surface area calculating unit 630, and a cross-over ratio calculating unit 640;
A bottom surface processing unit 610, configured to determine a target area according to bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames;
An overlap judging unit 620 for judging whether or not an overlap region exists between the first three-dimensional detection frame and the second three-dimensional detection frame;
A bottom surface area calculating unit 630, configured to determine, in response to the first three-dimensional detection frame and the second three-dimensional detection frame having an overlapping area, a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule, where the first bottom surface area and the second bottom surface area are each a set of a plurality of line segments;
The intersection ratio calculating unit 640 is configured to determine an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame, and the height of the second three-dimensional detection frame;
The cross-over ratio calculating unit 640 is further configured to calculate and determine a cross-over ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame, and the volume of the intersection region.
Preferably, in some implementation scenarios, the bottom surface processing unit 610 is further configured to construct a rectangular coordinate system and obtain the four bottom surface vertex coordinates corresponding to each three-dimensional detection frame; determine the area of the target area according to the rectangle enclosed by the straight lines on which the X coordinate minimum value, the X coordinate maximum value, the Y coordinate minimum value and the Y coordinate maximum value among the plurality of bottom surface vertex coordinates are located; and divide, with the origin as a starting point, a region matching that area as the target region.
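A minimal sketch of this target region construction, assuming the bottom surface vertex coordinates of all initial frames are available as (x, y) pairs (names are illustrative):

    # Illustrative sketch: size the target region from the global extremes of
    # all bottom surface vertex coordinates, then anchor it at the origin.
    def target_region(bottom_vertices):
        xs = [x for x, _ in bottom_vertices]
        ys = [y for _, y in bottom_vertices]
        width = max(xs) - min(xs)   # X coordinate maximum - X coordinate minimum
        height = max(ys) - min(ys)  # Y coordinate maximum - Y coordinate minimum
        return width, height        # region of matching size, starting at the origin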
Preferably, in some implementation scenarios, the bottom surface area calculating unit 630 is further configured to define the height of the target area on the Y axis as H, where the value of H is the difference between the Y coordinate maximum value and the Y coordinate minimum value; calculate the intersection points of the first three-dimensional detection frame with the straight line Y=i and of the second three-dimensional detection frame with the straight line Y=i, and determine a first intersecting line segment of the first three-dimensional detection frame with the straight line Y=i and a second intersecting line segment of the second three-dimensional detection frame with the straight line Y=i, where 0<=i<=H-1; determine the first bottom surface area occupied by the first three-dimensional detection frame on the target area according to the set of the plurality of first intersecting line segments; and determine the second bottom surface area occupied by the second three-dimensional detection frame on the target area according to the set of the plurality of second intersecting line segments.
Preferably, in some implementation scenarios, the bottom surface area calculating unit 630 is further configured to define the four bottom surface vertices as a lowest point, a highest point, a leftmost point and a rightmost point according to their relative positions; calculate a first slope corresponding to the first connecting line between the lowest point and the leftmost point, a second slope corresponding to the second connecting line between the lowest point and the rightmost point, a third slope corresponding to the third connecting line between the leftmost point and the highest point, and a fourth slope corresponding to the fourth connecting line between the rightmost point and the highest point; when i is less than or equal to the Y coordinate of the lowest point, determine that no start intersection point exists; when i is less than or equal to the Y coordinate of the leftmost point, determine that the start intersection point is on the first connecting line, where the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i - the Y coordinate of the lowest point); when i is less than or equal to the Y coordinate of the highest point, determine that the start intersection point is on the third connecting line, where the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i - the Y coordinate of the leftmost point); when i is less than or equal to the Y coordinate of the rightmost point, determine that the start intersection point is on the fourth connecting line, where the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i - the Y coordinate of the highest point); and when i is greater than the Y coordinate of the highest point, determine that no start intersection point exists.
Preferably, in some implementation scenarios, the bottom surface area calculating unit 630 is further configured to determine that no ending intersection point exists when i is less than or equal to the Y coordinate of the lowest point; when i is less than or equal to the Y coordinate of the rightmost point, determine that the ending intersection point is on the second connecting line, where the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i - the Y coordinate of the lowest point); when i is less than or equal to the Y coordinate of the highest point, determine that the ending intersection point is on the fourth connecting line, where the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i - the Y coordinate of the rightmost point); when i is less than or equal to the Y coordinate of the leftmost point, determine that the ending intersection point is on the third connecting line, where the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i - the Y coordinate of the highest point); and when i is greater than the Y coordinate of the highest point, determine that no ending intersection point exists.
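The geometric intent of these rules can be sketched as follows for one convex bottom face, with the start intersection walking the lowest-leftmost-highest boundary chain and the ending intersection walking the lowest-rightmost-highest chain; the dx/dy slope convention, the vertex labelling and the reduction to the two reachable branches per chain are assumptions of this sketch, not limitations of the application:

    # Illustrative sketch: start/ending intersections of scanline y = i with a
    # convex bottom face whose vertices are labelled by position. Slopes are
    # dx/dy, so a point on a line through p is x = p_x + slope * (i - p_y).
    def slope(p, q):
        return (q[0] - p[0]) / (q[1] - p[1])    # assumes q_y != p_y

    def scanline_span(i, lowest, leftmost, rightmost, highest):
        if i <= lowest[1] or i > highest[1]:
            return None                          # scanline misses the bottom face
        if i <= leftmost[1]:                     # start on the first connecting line
            x_start = lowest[0] + slope(lowest, leftmost) * (i - lowest[1])
        else:                                    # start on the third connecting line
            x_start = leftmost[0] + slope(leftmost, highest) * (i - leftmost[1])
        if i <= rightmost[1]:                    # end on the second connecting line
            x_end = lowest[0] + slope(lowest, rightmost) * (i - lowest[1])
        else:                                    # end on the fourth connecting line
            x_end = rightmost[0] + slope(rightmost, highest) * (i - rightmost[1])
        return x_start, x_end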
Preferably, in some implementation scenarios, the intersection ratio calculating unit 640 is further configured to determine the start intersection point of the straight line Y=i with the intersecting line segment forming the first intersection area on the target area according to the larger of the X coordinate of the start intersection point corresponding to the first bottom surface area and the X coordinate of the start intersection point corresponding to the second bottom surface area; determine the ending intersection point of the straight line Y=i with the intersecting line segment forming the first intersection area on the target area according to the smaller of the X coordinate of the ending intersection point corresponding to the first bottom surface area and the X coordinate of the ending intersection point corresponding to the second bottom surface area; and determine the start intersection point of the intersecting line segment forming the second intersection region according to the larger of the minimum Z coordinate of the first three-dimensional detection frame and the minimum Z coordinate of the second three-dimensional detection frame, and the ending intersection point of the intersecting line segment forming the second intersection region according to the smaller of the maximum Z coordinate of the first three-dimensional detection frame and the maximum Z coordinate of the second three-dimensional detection frame.
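A minimal sketch of how the per-frame spans combine into the intersection region on each scanline, together with the Z extent of the intersection (span_a and span_b are assumed to come from a per-frame scanline routine such as the sketch above; all names are illustrative):

    # Illustrative sketch: intersection span on scanline y = i and Z extent of
    # the intersection region. span_a / span_b are (x_start, x_end) or None.
    def intersect_span(span_a, span_b):
        if span_a is None or span_b is None:
            return None
        x_start = max(span_a[0], span_b[0])    # larger of the start intersections
        x_end = min(span_a[1], span_b[1])      # smaller of the ending intersections
        return (x_start, x_end) if x_start < x_end else None

    def z_overlap(a_min_z, a_max_z, b_min_z, b_max_z):
        lo = max(a_min_z, b_min_z)             # larger of the minimum Z coordinates
        hi = min(a_max_z, b_max_z)             # smaller of the maximum Z coordinates
        return (lo, hi) if lo < hi else None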
Example three
Corresponding to the first and second embodiments, the embodiments of the present application further provide a computer program product which, when executed by a processor, implements the steps of the following method:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and confidence degrees corresponding to the initial three-dimensional detection frames;
the initial three-dimensional detection frames are ordered in a descending order according to the confidence level so as to generate a detection frame sequence;
traversing the rest second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with highest confidence, and calculating the intersection ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame which correspond to the target areas respectively;
And suppressing the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to a preset cross ratio threshold value until all the second three-dimensional detection frames in the detection frame sequence are traversed to obtain a target three-dimensional detection frame.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
Determining a target area according to the bottom surface vertex coordinates of the plurality of initial three-dimensional detection frames;
judging whether an overlapping area exists between the first three-dimensional detection frame and the second three-dimensional detection frame;
Determining a first bottom surface area occupied by the first three-dimensional detection frame in the target area and a second bottom surface area occupied by the second three-dimensional detection frame in the target area according to a preset bottom surface area determination rule in response to the existence of an overlapping area of the first three-dimensional detection frame and the second three-dimensional detection frame, wherein the first bottom surface area and the second bottom surface area are each a set of a plurality of line segments;
Determining an intersection area according to the first bottom surface area, the second bottom surface area, the height of the first three-dimensional detection frame and the height of the second three-dimensional detection frame;
And calculating and determining the intersection ratio of the first three-dimensional detection frame and the second three-dimensional detection frame according to the volume of the first three-dimensional detection frame, the volume of the second three-dimensional detection frame and the volume of the intersection area.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
Constructing a rectangular coordinate system, and acquiring four bottom surface vertex coordinates corresponding to each three-dimensional detection frame;
determining the area of the target area according to the rectangle enclosed by the straight lines on which the X coordinate minimum value, the X coordinate maximum value, the Y coordinate minimum value and the Y coordinate maximum value among the plurality of bottom surface vertex coordinates are located;
And dividing an area matched with the target area by taking an origin as a starting point to serve as the target area.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
defining the height of the target area on a Y axis as H, wherein the value corresponding to H is the difference value between the maximum value of the Y coordinate and the minimum value of the Y coordinate;
calculating an intersection point of the first three-dimensional detection frame and a straight line Y=i and an intersection point of the second three-dimensional detection frame and the straight line Y=i, and determining a first intersection line segment of the first three-dimensional detection frame and the straight line Y=i and a second intersection line segment of the second three-dimensional detection frame and the straight line Y=i, wherein 0<=i<=H-1;
Determining a first bottom surface area occupied by the first three-dimensional detection frame on the target area according to a set of a plurality of first intersecting line segments;
and determining a second bottom surface area occupied by the second three-dimensional detection frame on the target area according to the set of the plurality of second intersecting line segments.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
the intersection point comprises a start intersection point, and the calculation of the intersection point of the first three-dimensional detection frame or the second three-dimensional detection frame and the straight line Y=i comprises the following steps:
according to the relative positions of the four bottom surface vertices corresponding to the three-dimensional detection frame, respectively defining the four bottom surface vertices as a lowest point, a highest point, a leftmost point and a rightmost point;
calculating a first slope corresponding to a first connecting line between the lowest point and the leftmost point, a second slope corresponding to a second connecting line between the lowest point and the rightmost point, a third slope corresponding to a third connecting line between the leftmost point and the highest point and a fourth slope corresponding to a fourth connecting line between the rightmost point and the highest point;
if i is less than or equal to the Y coordinate of the lowest point, determining that the starting intersection point does not exist;
If i is less than or equal to the Y coordinate of the leftmost point, determining that the start intersection point is on the first connecting line, where the X coordinate of the start intersection point = the X coordinate of the lowest point + the first slope × (i - the Y coordinate of the lowest point);
If i is less than or equal to the Y coordinate of the highest point, determining that the start intersection point is on the third connecting line, where the X coordinate of the start intersection point = the X coordinate of the leftmost point + the third slope × (i - the Y coordinate of the leftmost point);
if i is less than or equal to the Y coordinate of the rightmost point, determining that the start intersection point is on the fourth connecting line, where the X coordinate of the start intersection point = the X coordinate of the highest point + the fourth slope × (i - the Y coordinate of the highest point);
If i is greater than the Y coordinate of the highest point, it is determined that the starting intersection point does not exist.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
If i is less than or equal to the Y coordinate of the lowest point, determining that the ending intersection point does not exist;
if i is less than or equal to the Y coordinate of the rightmost point, determining that the ending intersection point is on the second connecting line, where the X coordinate of the ending intersection point = the X coordinate of the lowest point + the second slope × (i - the Y coordinate of the lowest point);
If i is less than or equal to the Y coordinate of the highest point, determining that the ending intersection point is on the fourth connecting line, where the X coordinate of the ending intersection point = the X coordinate of the rightmost point + the fourth slope × (i - the Y coordinate of the rightmost point);
If i is less than or equal to the Y coordinate of the leftmost point, determining that the ending intersection point is on the third connecting line, where the X coordinate of the ending intersection point = the X coordinate of the highest point + the third slope × (i - the Y coordinate of the highest point);
If i is greater than the Y coordinate of the highest point, determining that the ending intersection point does not exist.
In some embodiments, the computer program when executed by a processor performs the steps of the method as follows:
Determining a starting intersection point of the straight line y=i and the intersecting line segment forming the first intersection region on the target region according to the larger of the X coordinate of the starting intersection point corresponding to the first bottom surface region and the X coordinate of the starting intersection point corresponding to the second bottom surface region;
determining an ending intersection point of the straight line y=i and the intersecting line segment forming the first intersection region on the target region according to the smaller of the X coordinate of the ending intersection point corresponding to the first bottom surface region and the X coordinate of the ending intersection point corresponding to the second bottom surface region;
And determining a starting intersection point of the intersecting line segment forming the second intersection region according to the larger of the minimum Z coordinate of the first three-dimensional detection frame and the minimum Z coordinate of the second three-dimensional detection frame, and an ending intersection point of the intersecting line segment forming the second intersection region according to the smaller of the maximum Z coordinate of the first three-dimensional detection frame and the maximum Z coordinate of the second three-dimensional detection frame.
Example four
Corresponding to all the embodiments described above, an embodiment of the present application provides an electronic device, including: one or more processors; and a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the operations of:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and confidence degrees corresponding to the initial three-dimensional detection frames;
the initial three-dimensional detection frames are ordered in a descending order according to the confidence level so as to generate a detection frame sequence;
traversing the rest second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with highest confidence, and calculating the intersection ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame which correspond to the target areas respectively;
And suppressing the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to a preset cross ratio threshold value until all the second three-dimensional detection frames in the detection frame sequence are traversed to obtain a target three-dimensional detection frame.
Fig. 7 illustrates an architecture of an electronic device, which may include a processor 710, a video display adapter 711, a disk drive 712, an input/output interface 713, a network interface 714, and a memory 720, among others. The processor 710, the video display adapter 711, the disk drive 712, the input/output interface 713, the network interface 714, and the memory 720 may be communicatively connected via a bus 730.
The processor 710 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, for executing relevant programs to implement the technical solution provided by the present application.
The memory 720 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage, dynamic storage, etc. The memory 720 may store an operating system 721 for controlling the running of the electronic device 700, and a Basic Input Output System (BIOS) 722 for controlling low-level operation of the electronic device 700. In addition, a web browser 723, a data storage management system 724, an icon font processing system 727, and the like may also be stored. The icon font processing system 727 may be an application program for implementing the operations of the foregoing steps in the embodiment of the present application. In general, when the technical solution provided by the present application is implemented by software or firmware, the relevant program codes are stored in the memory 720 and invoked by the processor 710 for execution.
The input/output interface 713 is used to connect with an input/output module to enable information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
The network interface 714 is used to connect communication modules (not shown) to enable communication interactions of the device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 730 includes a path to transfer information between various components of the device (e.g., processor 710, video display adapter 711, disk drive 712, input/output interface 713, network interface 714, and memory 720).
In addition, the electronic device 700 may also obtain information of specific acquisition conditions from the virtual resource object acquisition condition information database, for performing condition judgment, and so on.
It should be noted that although the above device illustrates only the processor 710, the video display adapter 711, the disk drive 712, the input/output interface 713, the network interface 714, the memory 720, and the bus 730, in a specific implementation the device may include other components necessary for normal operation. Furthermore, it will be appreciated by those skilled in the art that the device may include only the components necessary to implement the solution of the present application, and not all of the components shown in the figure.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a cloud server, or a network device, etc.) to execute the method of the embodiments or some parts of the embodiments of the present application.
Example five
Corresponding to all the above embodiments, the embodiments of the present application further provide a computer-readable storage medium, characterized in that it stores a computer program, the computer program causing a computer to perform the following operations:
Acquiring a plurality of initial three-dimensional detection frames generated by a target detection model, and confidence degrees corresponding to the initial three-dimensional detection frames;
the initial three-dimensional detection frames are ordered in a descending order according to the confidence level so as to generate a detection frame sequence;
traversing the rest second three-dimensional detection frames in the detection frame sequence from the first three-dimensional detection frame with highest confidence, and calculating the intersection ratio between the second three-dimensional detection frame and the first three-dimensional detection frame, wherein the intersection ratio is calculated according to the bottom surface areas and the heights of the first three-dimensional detection frame and the second three-dimensional detection frame which correspond to the target areas respectively;
And suppressing the second three-dimensional detection frames corresponding to the cross ratio greater than or equal to a preset cross ratio threshold value until all the second three-dimensional detection frames in the detection frame sequence are traversed to obtain a target three-dimensional detection frame.
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, for the system embodiments, since they are substantially similar to the method embodiments, the description is relatively brief, and reference may be made to the description of the method embodiments for the relevant parts. The systems and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without undue burden.
The foregoing is only illustrative of the present application and is not to be construed as limiting thereof, but rather as various modifications, equivalent arrangements, improvements, etc., within the spirit and principles of the present application.
