CN111104893B - Target detection method, target detection device, computer equipment and storage medium - Google Patents

Target detection method, target detection device, computer equipment and storage medium

Info

Publication number
CN111104893B
Authority
CN
China
Prior art keywords
target
target object
image
parallax
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911304904.2A
Other languages
Chinese (zh)
Other versions
CN111104893A (en)
Inventor
崔迪潇
江志浩
徐生良
陈安
龚伟林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhijia Usa
Suzhou Zhijia Technology Co Ltd
Original Assignee
Zhijia Usa
Suzhou Zhijia Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhijia Usa, Suzhou Zhijia Technology Co Ltd
Priority to CN201911304904.2A
Publication of CN111104893A
Application granted
Publication of CN111104893B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a target detection method, a target detection device, computer equipment and a storage medium, and belongs to the technical field of automatic driving. The embodiment of the invention determines the target area and the road area in a vehicle environment image and performs object segmentation on the target area, thereby obtaining target semantic information comprising the object type and the initial contour of the target object and describing the target object comprehensively from multiple angles. The contour of the target object is then accurately positioned according to the parallax image and the target semantic information, which greatly improves the accuracy of the contour. Finally, the spatial position of the target object is accurately determined according to the road area and the accurately positioned contour.

Description

Target detection method and device, computer equipment and storage medium
Technical Field
The present invention relates to the field of automatic driving technologies, and in particular, to a target detection method and apparatus, a computer device, and a storage medium.
Background
Automatic driving technology senses the vehicle's surroundings, makes driving decisions and plans, and performs driving operations automatically in place of a human driver. During automatic driving, obstacles and other objects in the surrounding environment must be detected in real time based on images, point cloud data, and other sensor data of the surroundings, so as to ensure the safe driving of the vehicle.
In the related art, the target detection process may include: the vehicle-mounted terminal obtains an image of the surrounding environment and marks the target objects in the image, such as vehicles, traffic signs and pedestrians, with 2D detection frames, such as rectangular frames; the position coordinates of each target object in the vehicle coordinate system are then determined by combining the three-dimensional shape of the target object in the laser point cloud data of the surrounding environment, so that the surrounding objects are projected one by one into the vehicle coordinate system.
In this target detection process, the 2D detection frame is used to locate the target object. However, a 2D detection frame can only roughly estimate the approximate position of the target object, so the final detection result is inaccurate and the accuracy of the target detection process is poor.
Disclosure of Invention
The embodiment of the invention provides a target detection method, a target detection device, computer equipment and a storage medium, which can solve the problem of poor accuracy of a target detection process in the related art. The technical scheme is as follows:
in one aspect, a target detection method is provided, and the method includes:
determining a target area and a road area in a vehicle environment image based on a parallax image of the vehicle environment image during the running of a vehicle, wherein the target area comprises a target object;
carrying out object segmentation on the target area to obtain target semantic information of a target object, wherein the target semantic information comprises an object type of the target object and an initial contour of the target object;
determining the outline of the target object according to the parallax image and the target semantic information of the target object;
determining a spatial position of the target object in a vehicle environment according to the road area and the contour of the target object.
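The four steps above can be sketched as a minimal pipeline. All function names and the trivial placeholder implementations below are illustrative assumptions for orientation only, not the patent's method:

```python
import numpy as np

def determine_regions(disparity):
    """Step 1: split the image into a target region and a road region
    (placeholder: top half = target rows, bottom half = road rows)."""
    h = disparity.shape[0]
    return {"target": (0, h // 2), "road": (h // 2, h)}

def segment_objects(disparity, target_rows):
    """Step 2: object segmentation -> semantic info (object type + initial
    contour). Placeholder: one object covering pixels with disparity > 0."""
    r0, r1 = target_rows
    return {"type": "vehicle", "mask": disparity[r0:r1] > 0}

def refine_contour(disparity, semantic):
    """Step 3: refine the initial contour using the disparity values
    (placeholder: returned unchanged)."""
    return semantic["mask"]

def locate(road_rows, contour_mask):
    """Step 4: spatial position from the road region and refined contour
    (placeholder: pixel centroid of the contour mask)."""
    ys, xs = np.nonzero(contour_mask)
    return (xs.mean(), ys.mean()) if len(xs) else None

disparity = np.zeros((8, 8)); disparity[1:3, 2:5] = 10.0
regions = determine_regions(disparity)
sem = segment_objects(disparity, regions["target"])
pos = locate(regions["road"], refine_contour(disparity, sem))
print(pos)  # (3.0, 1.5): centroid of the detected object's pixels
```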
In one possible implementation, the determining the contour of the target object according to the parallax image and the target semantic information of the target object includes:
adjusting the initial contour of the target object based on the boundary pixel points of the initial contour of the target object corresponding to the parallax image;
constructing a three-dimensional model of the target object based on the adjusted initial contour and calibration parameters of image acquisition equipment, wherein the image acquisition equipment is used for acquiring the vehicle environment image;
and adjusting the projection area of the three-dimensional model on the horizontal plane according to the depth of the three-dimensional model of the target object to obtain the contour of the target object on the horizontal plane.
In one possible implementation manner, the adjusting the initial contour of the target object based on the boundary pixel point corresponding to the initial contour of the target object in the parallax image includes:
determining a plurality of boundary pixel points corresponding to the initial contour of the target object in the parallax image;
and adjusting the plurality of boundary pixel points corresponding to the initial contour of the target object according to the change degree of the parallax values of the plurality of neighborhood pixel points of each boundary pixel point.
In a possible implementation manner, the adjusting, according to a change degree of parallax values of a plurality of neighborhood pixel points of each boundary pixel point, a plurality of boundary pixel points corresponding to an initial contour of the target object includes:
when the change degree of the parallax values of a plurality of neighborhood pixels of the boundary pixel meets a target mutation condition, the boundary pixel is reserved;
and when the change degree of the parallax values of the plurality of neighborhood pixels of the boundary pixel does not meet the target mutation condition, replacing the boundary pixel with a pixel meeting the target mutation condition in the plurality of neighborhood pixels.
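The retain-or-replace rule above can be sketched as follows. Modeling the "target mutation condition" as a horizontal disparity jump exceeding a threshold, and the small neighbor-search order, are our assumptions:

```python
import numpy as np

def refine_boundary(disparity, boundary_pts, jump=2.0):
    """Keep a boundary pixel if disparity changes sharply across it (our
    stand-in for the 'target mutation condition'); otherwise replace it with
    the nearest horizontal neighbor at which the change is sharp."""
    h, w = disparity.shape
    out = []
    for (v, u) in boundary_pts:
        def abrupt(uu):
            left = disparity[v, max(uu - 1, 0)]
            right = disparity[v, min(uu + 1, w - 1)]
            return abs(right - left) >= jump
        if abrupt(u):
            out.append((v, u))                 # condition met: retain
        else:
            for du in (1, -1, 2, -2):          # search nearby columns
                uu = min(max(u + du, 0), w - 1)
                if abrupt(uu):
                    out.append((v, uu))        # replace with abrupt neighbor
                    break
            else:
                out.append((v, u))             # no abrupt neighbor: keep
    return out

# Object spans columns 3..6 with disparity 10 against a 0 background:
# (1, 3) sits on the true edge; (1, 5) is interior and gets moved to (1, 6).
d = np.zeros((3, 10)); d[:, 3:7] = 10.0
print(refine_boundary(d, [(1, 3), (1, 5)]))  # [(1, 3), (1, 6)]
```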
In one possible implementation, the determining, based on the parallax image of the vehicle environment image, the target area and the road area in the vehicle environment image during the driving of the vehicle includes:
determining a parallax image of the vehicle environment image based on at least two frames of vehicle environment images during the running of the vehicle;
projecting the parallax image along the vertical coordinate of the image coordinate system of the parallax image, and determining a vertical parallax image of the vehicle environment image, wherein the gray value of pixel points in the vertical parallax image is used for indicating the parallax distribution of each row of pixel points in the parallax image;
and determining a target area and a road area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image.
In one possible implementation manner, the determining, according to the gray-scale value of each pixel in the longitudinal parallax image and the parallax value of each pixel in the parallax image, a target region and a road region in the vehicle environment image includes:
determining a road surface straight line in an image coordinate system of the longitudinal parallax image according to the gray value of each pixel point in the longitudinal parallax image;
determining a road area in the vehicle environment image based on the road surface straight line and the parallax value of the pixel point in the parallax image;
determining a parallax value of a target area in the vehicle environment image according to the gray value of each row of pixel points in the longitudinal parallax image;
determining a longitudinal coordinate range of the target area above the road area according to the parallax value of the target area in the longitudinal parallax image;
and determining the transverse coordinate range of the target area in the parallax image according to the longitudinal coordinate range and the parallax value of the target area.
In a possible implementation manner, the performing object segmentation on the target region to obtain target semantic information of the target object includes:
identifying an object region of at least one target object within the target region;
and determining the object type of each target object and the initial contour of each target object according to the pixel values of a plurality of pixel points in the object region.
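The segmentation step above yields, per object, a type label and an initial contour. As a hedged illustration, the initial contour of a binary segmentation mask can be taken as the mask pixels touching the outside; the 4-neighbor rule here is our assumption:

```python
import numpy as np

def mask_boundary(mask):
    """Initial contour of a segmented object: pixels of the mask with at
    least one 4-neighbor outside the mask (a NumPy-only erosion difference)."""
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[1:-1, 1:-1] & padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior

m = np.zeros((5, 5), dtype=bool); m[1:4, 1:4] = True   # 3x3 object
b = mask_boundary(m)
print(int(b.sum()))  # 8: every pixel of the 3x3 block except its center
```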
In one possible implementation, the determining a spatial position of the target object in a vehicle environment according to the road region and the contour of the target object comprises:
determining a minimum bounding rectangle of the target object in a horizontal plane based on the contour of the target object;
determining the plane size of the target object based on the position coordinates of the minimum bounding rectangle in a vehicle coordinate system;
determining a relative height of the target object on the road surface based on a road surface height of the road region.
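The minimum-bounding-rectangle step above can be sketched with a brute-force orientation sweep, a coarse stand-in for an exact rotating-calipers computation; the 0.5-degree angular resolution is an arbitrary choice:

```python
import numpy as np

def min_area_rect(points):
    """Approximate minimum-area bounding rectangle of 2-D points: sweep
    candidate orientations, bound the rotated points with an axis-aligned
    box, and keep the smallest box found."""
    pts = np.asarray(points, dtype=float)
    best_area, best_wh = np.inf, None
    for theta in np.linspace(0.0, np.pi / 2, 181):   # 0.5-degree steps
        c, s = np.cos(theta), np.sin(theta)
        rot = pts @ np.array([[c, -s], [s, c]])       # rotate into candidate frame
        w = np.ptp(rot[:, 0])                         # extent along rotated x
        h = np.ptp(rot[:, 1])                         # extent along rotated y
        if w * h < best_area:
            best_area, best_wh = w * h, (w, h)
    return best_area, best_wh

# A 4x2 axis-aligned rectangle is its own minimum bounding rectangle.
area, (w, h) = min_area_rect([(0, 0), (4, 0), (4, 2), (0, 2)])
print(round(area, 3))  # 8.0
```

The resulting width and height give the plane size of the target object once the contour coordinates are expressed in the vehicle coordinate system.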
In another aspect, an object detecting apparatus is provided, the apparatus including:
a determining module, configured to determine a target area and a road area in a vehicle environment image based on a parallax image of the vehicle environment image during the driving of a vehicle, wherein the target area comprises a target object;
the segmentation module is used for carrying out object segmentation on the target area to obtain target semantic information of a target object, wherein the target semantic information comprises an object type of the target object and an initial contour of the target object;
the determining module is further configured to determine a contour of the target object according to the parallax image and target semantic information of the target object;
the determining module is further configured to determine a spatial position of the target object in the vehicle environment according to the road area and the contour of the target object.
In a possible implementation manner, the determining module is further configured to adjust the initial contour of the target object based on a boundary pixel point of the initial contour of the target object corresponding to the parallax image; constructing a three-dimensional model of the target object based on the adjusted initial contour and calibration parameters of image acquisition equipment, wherein the image acquisition equipment is used for acquiring the vehicle environment image; and adjusting the projection area of the three-dimensional model on the horizontal plane according to the depth of the three-dimensional model of the target object to obtain the contour of the target object on the horizontal plane.
In a possible implementation manner, the determining module is further configured to determine a plurality of boundary pixel points corresponding to the initial contour of the target object in the parallax image; and adjusting the plurality of boundary pixel points corresponding to the initial contour of the target object according to the change degree of the parallax values of the plurality of neighborhood pixel points of each boundary pixel point.
In a possible implementation manner, the determining module is further configured to, when a disparity value variation degree of a plurality of neighborhood pixels of the boundary pixel meets a target mutation condition, retain the boundary pixel; and when the change degree of the parallax values of the plurality of neighborhood pixels of the boundary pixel does not meet the target mutation condition, replacing the boundary pixel with a pixel meeting the target mutation condition in the plurality of neighborhood pixels.
In a possible implementation manner, the determining module is further configured to determine a parallax image of the vehicle environment image based on at least two frames of vehicle environment images during the vehicle driving; projecting the parallax image along the vertical coordinate of the image coordinate system of the parallax image, and determining a vertical parallax image of the vehicle environment image, wherein the gray value of pixel points in the vertical parallax image is used for indicating the parallax distribution of each row of pixel points in the parallax image; and determining a target area and a road area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image.
In a possible implementation manner, the determining module is further configured to determine a road surface straight line in an image coordinate system of the longitudinal parallax image according to a gray value of each pixel point in the longitudinal parallax image; determining a road area in the vehicle environment image based on the road surface straight line and the parallax value of the pixel point in the parallax image; determining a parallax value of a target area in the vehicle environment image according to the gray value of each row of pixel points in the longitudinal parallax image; determining a longitudinal coordinate range of the target area above the road area according to the parallax value of the target area in the longitudinal parallax image; and determining the transverse coordinate range of the target area in the parallax image according to the longitudinal coordinate range and the parallax value of the target area.
In a possible implementation manner, the segmentation module is further configured to identify an object region of at least one target object within the target region; and determine the object type of each target object and the initial contour of each target object according to the pixel values of a plurality of pixel points in the object region.
In a possible implementation manner, the determining module is further configured to determine a minimum bounding rectangle of the target object in a horizontal plane based on the contour of the target object; determining the plane size of the target object based on the position coordinates of the minimum bounding rectangle in a vehicle coordinate system; determining a relative height of the target object on the road surface based on a road surface height of the road region.
In another aspect, a computer device is provided, which includes a processor and a memory, where at least one instruction is stored, the instruction being loaded and executed by the processor to implement the operations performed by the object detection method as described above.
In another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, and the instruction is loaded and executed by a processor to implement the operations performed by the object detection method as described above.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
the target area and the road area in the vehicle environment image are determined, and object segmentation is performed on the target area to obtain target semantic information comprising the object type and the initial contour of the target object, so that the target object is described comprehensively from multiple angles. The contour of the target object is then accurately positioned according to the parallax image and the target semantic information, which greatly improves the accuracy of the contour. Finally, the spatial position of the target object is accurately determined according to the road area and the accurately positioned contour.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of a target detection method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a target detection method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an object detection framework provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a target detection process according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an object detection apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Fig. 1 is a flowchart of a target detection method according to an embodiment of the present invention. The execution subject of the embodiment of the invention is a computer device, such as a server or a terminal. Referring to fig. 1, the method includes:
101. determining a target area and a road area in a vehicle environment image based on a parallax image of the vehicle environment image during the running of a vehicle, wherein the target area comprises a target object;
102. carrying out object segmentation on the target area to obtain target semantic information of a target object, wherein the target semantic information comprises an object type of the target object and an initial contour of the target object;
103. determining the outline of the target object according to the parallax image and the target semantic information of the target object;
104. the spatial position of the target object in the vehicle environment is determined from the road region and the contour of the target object.
In one possible implementation, the determining the contour of the target object according to the parallax image and the target semantic information of the target object includes:
adjusting the initial contour of the target object based on the boundary pixel points of the initial contour of the target object corresponding to the parallax image;
constructing a three-dimensional model of the target object based on the adjusted initial contour and calibration parameters of image acquisition equipment, wherein the image acquisition equipment is used for acquiring the vehicle environment image;
and adjusting the projection area of the three-dimensional model on the horizontal plane according to the depth of the three-dimensional model of the target object to obtain the outline of the target object on the horizontal plane.
In a possible implementation manner, the adjusting the initial contour of the target object based on the boundary pixel point corresponding to the initial contour of the target object in the parallax image includes:
determining a plurality of boundary pixel points corresponding to the initial contour of the target object in the parallax image;
and adjusting the plurality of boundary pixel points corresponding to the initial contour of the target object according to the change degree of the parallax values of the plurality of neighborhood pixel points of each boundary pixel point.
In a possible implementation manner, the adjusting, according to the change degree of the parallax values of the plurality of neighborhood pixel points of each boundary pixel point, the plurality of boundary pixel points corresponding to the initial contour of the target object includes:
when the change degree of the parallax values of a plurality of neighborhood pixels of the boundary pixel meets a target mutation condition, the boundary pixel is reserved;
and when the change degree of the parallax values of the plurality of neighborhood pixels of the boundary pixel does not meet the target mutation condition, replacing the boundary pixel with a pixel meeting the target mutation condition in the plurality of neighborhood pixels.
In one possible implementation, the determining the target area and the road area in the vehicle environment image based on the parallax image of the vehicle environment image during the driving of the vehicle includes:
determining a parallax image of the vehicle environment image based on at least two frames of vehicle environment images during the running of the vehicle;
projecting the parallax image along the vertical coordinate of the image coordinate system of the parallax image, and determining a vertical parallax image of the vehicle environment image, wherein the gray value of pixel points in the vertical parallax image is used for indicating the parallax distribution of each row of pixel points in the parallax image;
and determining a target area and a road area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image.
In one possible implementation manner, the determining the target area and the road area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image includes:
determining a road surface straight line in an image coordinate system of the longitudinal parallax image according to the gray value of each pixel point in the longitudinal parallax image;
determining a road area in the vehicle environment image based on the road surface straight line and the parallax value of the pixel point in the parallax image;
determining a parallax value of a target area in the vehicle environment image according to the gray value of each row of pixel points in the longitudinal parallax image;
in the longitudinal parallax image, according to the parallax value of the target area, determining the longitudinal coordinate range of the target area above the road area;
and determining the transverse coordinate range of the target area in the parallax image according to the longitudinal coordinate range and the parallax value of the target area.
In a possible implementation manner, the performing object segmentation on the target area to obtain target semantic information of the target object includes:
identifying an object area of at least one object in the target area;
and determining the object type of each target object and the initial contour of each target object according to the pixel values of a plurality of pixel points in the object region.
In one possible implementation, the determining the spatial position of the target object in the vehicle environment based on the road region and the contour of the target object comprises:
determining a minimum bounding rectangle of the target object in a horizontal plane based on the contour of the target object;
determining the plane size of the target object based on the position coordinates of the minimum circumscribed rectangle in a vehicle coordinate system;
based on the road surface height of the road region, a relative height of the target object on the road surface is determined.
In the embodiment of the invention, the target region and the road region in the vehicle environment image are determined, and object segmentation is performed on the target region to obtain target semantic information comprising the object type and the initial contour of the target object, so that the target object is described comprehensively from multiple angles. The contour of the target object is then accurately positioned according to the parallax image and the target semantic information, which greatly improves the accuracy of the contour. Finally, the spatial position of the target object is accurately determined according to the road region and the accurately positioned contour.
Fig. 2 is a flowchart of a target detection method according to an embodiment of the present invention. The execution subject of the embodiment of the present invention is a computer device, which is a server or a terminal; in the embodiment of the present invention, only a terminal is taken as an example for description. For example, the terminal may be a vehicle-mounted terminal or a personal computer. Referring to fig. 2, the method includes:
201. the terminal determines a target area and a road area in the vehicle environment image based on the parallax image of the vehicle environment image during the driving of the vehicle.
Wherein the target area comprises a target object; in the embodiment of the invention, during the driving process of the vehicle, the terminal can acquire the vehicle environment image of the surrounding environment of the vehicle in real time and detect the target object in the surrounding environment based on the vehicle environment image, wherein the target object can be any object with a certain three-dimensional shape, such as an adjacent vehicle of the vehicle, surrounding road signs, trees at two sides of a road, pedestrians crossing the road, and the like.
In this step, the terminal may obtain a parallax image of the vehicle environment image according to two frames of vehicle environment images, where the parallax image includes a parallax value of each pixel point in the vehicle environment image, and the terminal may segment the vehicle environment image according to the parallax value of the pixel point in the parallax image to obtain a target region and a road region. In one possible embodiment, the step may comprise: the terminal determines a parallax image of the vehicle environment image based on at least two frames of vehicle environment images in the driving process of the vehicle; the terminal projects the parallax image along the vertical coordinate of the image coordinate system of the parallax image, and determines a vertical parallax image of the vehicle environment image, wherein the gray value of the pixel points in the vertical parallax image is used for indicating the parallax distribution of each row of pixel points in the parallax image; and the terminal determines a target area and a road area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image.
The gray value of each pixel point in the parallax image may be the parallax value of that pixel point. The image coordinate system of the parallax image may be a UOV rectangular coordinate system, where U is the horizontal axis and V is the vertical axis; the coordinates (u, v) in the parallax image denote the pixel in the u-th column and the v-th row, and the gray value of that pixel may be its parallax value d. The terminal projects the gray values, that is, the parallax values, along the V-axis vertical coordinate to obtain a longitudinal parallax image, namely a V-parallax image. The image coordinate system of the V-parallax image may be a dOV rectangular coordinate system, where d is the horizontal axis and V is the vertical axis; d represents a parallax value, a pixel point (d1, v) in the longitudinal parallax image denotes the pixel in the d1-th column and the v-th row, and its gray value represents the number of pixels in the v-th row of the parallax image whose parallax value is d1. For example, if the gray value of the pixel (2, 3) is 4, then 4 pixels in the 3rd row of the parallax image have a parallax value of 2.
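The V-parallax construction described above, where entry (d, v) counts the pixels of row v whose parallax value is d, can be sketched as follows; the integer binning of disparities is our simplification:

```python
import numpy as np

def v_disparity(disp, d_max):
    """Longitudinal (V-)parallax image: vmap[v, d] = number of pixels in
    row v of the disparity map whose integer disparity equals d."""
    h = disp.shape[0]
    vmap = np.zeros((h, d_max + 1), dtype=np.int32)   # rows: v, cols: d
    for v in range(h):
        vals, counts = np.unique(disp[v].astype(int), return_counts=True)
        vmap[v, vals] = counts
    return vmap

# The worked example from the text: 4 pixels in row 3 have disparity 2,
# so the V-parallax entry at (d=2, v=3) has gray value 4.
disp = np.zeros((5, 6), dtype=int); disp[3, 1:5] = 2
vmap = v_disparity(disp, d_max=4)
print(vmap[3, 2])  # 4
```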
In a possible implementation manner, the terminal may determine the road region according to a variation of a gray value of a pixel point in the longitudinal parallax image, and the process may include: the terminal determines a road surface straight line in an image coordinate system of the longitudinal parallax image according to the gray value of each pixel point in the longitudinal parallax image; and the terminal determines a road area in the vehicle environment image based on the road surface straight line and the parallax value of the pixel point in the parallax image.
In one possible example, the terminal may binarize the longitudinal parallax image by retaining only the point with the maximum gray value in each row, and then perform a Hough transform on the binary image to determine the straight line representing the road surface in the rectangular coordinate system of the V-parallax image, given by the following formula one:
the formula I is as follows: (d) kd + b;
wherein d represents the abscissa in dOV rectangular coordinate system, and f (d) represents the ordinate in dOV rectangular coordinate system.
The terminal detects each pixel point in the parallax image according to the road surface straight line, and determines a road plane in the parallax image based on the following formula II:
The formula II is as follows: Δ(u, v) = d(u, v) - f(d);
a pixel point (u, v) is judged to be a road surface point when Δ(u, v) < ε, and a non-road point otherwise;
wherein d(u, v) represents the parallax value of the pixel point (u, v) in the parallax image, and ε is the decision threshold. Pixel points whose Δ(u, v) is smaller than the decision threshold belong to the road plane or to background such as the sky, while pixel points whose Δ(u, v) is not smaller than the decision threshold are non-road points, for example pixel points on obstacles such as vehicles and signs on the road surface. In this way, the terminal detects drivable areas such as roads in the vehicle environment image.
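The road-surface test built from formula I and formula II can be sketched as follows. Inverting the road line to get a per-row expected road disparity d_road(v) = (v - b)/k is our reading of the text, and all names and the synthetic scene are illustrative:

```python
import numpy as np

def road_mask(disp, k, b, eps=0.5):
    """Classify pixels using the road line f(d) = k*d + b fitted in the
    V-parallax plane: for row v the expected road disparity is
    d_road(v) = (v - b)/k, and a pixel is a road point when
    |d(u, v) - d_road(v)| < eps (the decision threshold)."""
    h = disp.shape[0]
    v = np.arange(h, dtype=float)[:, None]   # column vector of row indices
    d_road = (v - b) / k                     # expected road disparity per row
    return np.abs(disp - d_road) < eps

# Synthetic ground plane: disparity grows linearly with row (k=1, b=0),
# plus one obstacle pixel with a much larger disparity.
h, w = 6, 4
disp = np.tile(np.arange(h, dtype=float)[:, None], (1, w))
disp[2, 1] = 9.0                             # obstacle pixel
m = road_mask(disp, k=1.0, b=0.0)
print(int(m.sum()))  # 23: every pixel but the obstacle is classified as road
```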
In a possible implementation manner, the terminal may first determine, according to the gray value variation of pixel points in the longitudinal parallax image, the parallax value and the longitudinal coordinate range of the target region, and then determine, in the parallax image, the transverse coordinate range of the target region, so as to obtain the target region. The process may include: the terminal determines the parallax value of a target region in the vehicle environment image according to the gray value of each column of pixel points in the longitudinal parallax image; the terminal determines a longitudinal coordinate range of the target region above the road region according to the parallax value of the target region in the longitudinal parallax image; and the terminal determines the transverse coordinate range of the target region in the parallax image according to the longitudinal coordinate range and the parallax value of the target region. In one possible example, the terminal may sum the gray values of all pixel points contained in each column of the V parallax image to obtain a mapping relationship d-s(d) between each column and its summation result. The terminal may establish a rectangular coordinate system with d as the horizontal coordinate and s(d) as the vertical coordinate, and obtain, based on the mapping relationship, a plurality of maximum value points d_i. Because the surface of a significant obstacle on the road, such as a vehicle or a pedestrian, is nearly perpendicular to the ground, the straight line corresponding to the target object in the V disparity map is approximately perpendicular to the road surface line, so the terminal can use the maximum value points d_i to determine the corresponding abscissa of the target object in the V disparity map.
In the V parallax image, starting from the point (d_i, f(d_i)), the terminal searches, in the region above the road surface area, for a vertical line segment perpendicular to the road surface line. The pixel points on this vertical line segment are the pixel points corresponding to the target object in the V parallax image, and the segment can be recorded as a triple (v_ui, v_di, d_i), whose elements are respectively the upper and lower boundaries of the target object along the longitudinal coordinate axis and the average parallax of the target object. After the terminal determines the longitudinal coordinate range of the target object in the V parallax image, the terminal may search in the parallax image along the transverse direction within that longitudinal coordinate range, to determine, in the area whose longitudinal coordinate interval is (v_ui, v_di), the pixel points whose parallax values satisfy a target condition; the target condition may be that the difference between the parallax value and the average parallax value d_i of the maximum value point is smaller than a target threshold. In this way the terminal determines the pixel points whose parallax in the transverse direction is close to d_i, thereby determining the region position of the target object from the transverse direction and the longitudinal direction respectively.
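The column-sum search for maximum value points d_i and the vertical-segment search above the road line might be sketched as follows; the helper names `find_target_columns` and `vertical_extent` and the thresholds are assumptions, not the patent's exact procedure.

```python
import numpy as np

def find_target_columns(v_disparity, min_sum=30):
    """Locate candidate obstacle disparities d_i as local maxima of the
    per-column sum s(d) of the V-disparity image (the d -> s(d) map)."""
    s = v_disparity.sum(axis=0)
    peaks = []
    for d in range(1, len(s) - 1):
        if s[d] >= min_sum and s[d] >= s[d - 1] and s[d] >= s[d + 1]:
            peaks.append(d)
    return peaks

def vertical_extent(v_disparity, d_i, road_v, thresh=10):
    """Scan column d_i above the road row road_v = f(d_i) for the
    near-vertical obstacle segment; return (v_up, v_down), the target's
    longitudinal coordinate range, or None if nothing is found."""
    col = v_disparity[:, d_i]
    vs = [v for v in range(int(road_v)) if col[v] >= thresh]
    return (min(vs), max(vs)) if vs else None

# Synthetic V-disparity: a near-vertical obstacle segment at d_i = 15
vd = np.zeros((80, 40), dtype=np.int32)
vd[10:40, 15] = 20
peaks = find_target_columns(vd)
v_up, v_down = vertical_extent(vd, peaks[0], road_v=40)
```

The peak column recovers d_i = 15 and the scan recovers the segment's row extent, i.e. the object's longitudinal range above the road line.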
202. And the terminal performs object segmentation on the target area to obtain target semantic information of the target object.
Wherein the target semantic information includes an object class of the target object and an initial contour of the target object. In one possible implementation, the terminal may first acquire the approximate region positions of a plurality of target objects by using a target detection method, and then segment each target object from those approximate region positions by using semantic segmentation, instance segmentation, or the like. The process may include: the terminal identifies an object region of at least one target object, where the object region is located within the target region; and the terminal determines the object class of each target object and the initial contour of each target object according to the pixel values of a plurality of pixel points in the object region. In one possible example, the terminal may implement object segmentation using an instance segmentation algorithm. For example, the terminal segments the target region using the Mask R-CNN (Mask Region-based Convolutional Neural Network) algorithm.
In a possible implementation manner, the terminal may directly segment the target region in the vehicle environment image by using an instance segmentation algorithm, so as to obtain the target semantic information of the target object. Alternatively, the terminal may directly segment the whole vehicle environment image by using the instance segmentation algorithm, and adjust the segmentation result based on the target region obtained in step 201 to obtain the target semantic information of the target object. The embodiment of the present invention does not specifically limit this.
It should be noted that the Mask R-CNN algorithm is based on deep learning, and its backbone network uses the classical 50-layer deep residual network ResNet-50. The Mask R-CNN algorithm can effectively detect the target object in a single RGB image and perform instance segmentation, obtaining the class of the target object together with a pixel-by-pixel segmentation. Because it can accurately and efficiently segment the initial contour of the target object and identify the object class, the target object can be described more accurately at the semantic level, which in turn improves the accuracy of subsequently determining the contour of the target object.
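Running Mask R-CNN itself requires a trained network, but the step from one of its per-instance binary masks to the "initial contour" used in the following steps can be illustrated with a minimal boundary-tracing sketch: a pixel is on the contour if it is foreground and has a background (or out-of-image) 4-neighbour. The helper name is an assumption.

```python
import numpy as np

def mask_to_contour(mask):
    """Extract the initial contour of a target object from a binary
    instance-segmentation mask (e.g. one Mask R-CNN output mask)."""
    h, w = mask.shape
    contour = []
    for y in range(h):
        for x in range(w):
            if not mask[y, x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # boundary pixel: a 4-neighbour is background or off-image
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny, nx]:
                    contour.append((y, x))
                    break
    return contour

# A 5x5 square instance mask inside a 7x7 image
m = np.zeros((7, 7), dtype=bool)
m[1:6, 1:6] = True
c = mask_to_contour(m)
```

For the square mask the contour is exactly its 16 perimeter pixels; interior pixels are excluded.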
203. And the terminal determines the outline of the target object according to the parallax image and the target semantic information of the target object.
In the embodiment of the invention, the terminal can further optimize the contour of the target object into a more accurate one based on the projection of the three-dimensional solid model of the target object on the horizontal plane. This step may be accomplished by the following steps 2031-2033.
2031. And the terminal adjusts the initial contour of the target object based on the boundary pixel point of the initial contour of the target object corresponding to the parallax image.
In a possible implementation manner, because the parallax value changes sharply from just inside the contour edge of the target object to just outside it in the parallax image, the terminal can correct the initial contour according to the parallax change of the boundary pixel points corresponding to the initial contour in the parallax image. The process may include: the terminal determines a plurality of boundary pixel points corresponding to the initial contour of the target object in the parallax image; the terminal then adjusts these boundary pixel points according to the degree of change of the parallax values of the neighborhood pixel points of each boundary pixel point. When the degree of change of the parallax values of the neighborhood pixel points of a boundary pixel point satisfies a target mutation condition, the terminal retains the boundary pixel point; when it does not satisfy the target mutation condition, the terminal replaces the boundary pixel point with a pixel point, among the neighborhood pixel points, that satisfies the target mutation condition. The target mutation condition may be that the difference between the maximum and minimum parallax values of the neighborhood pixel points is greater than a target threshold.
It should be noted that the terminal can locate the initial contour in the parallax image according to the positions of the pixel points of the initial contour in the vehicle environment image, so that the contours of the different object classes obtained by object segmentation can be extracted independently for further optimization. Specifically, the terminal searches in the neighborhood of each segmented boundary pixel point based on the target mutation condition, and takes the pixel point in the neighborhood where the parallax value mutates as a boundary pixel point of the target object, thereby further correcting the contour of the target object and improving the accuracy and precision of the initial contour.
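The retain-or-replace rule of step 2031 can be sketched as below, assuming a square neighbourhood, "max − min > threshold" as the target mutation condition, and an assumed first-found tie-break when replacing a boundary pixel; all names are illustrative.

```python
import numpy as np

def patch_jump(disparity, v, u, r=1):
    """Max - min disparity spread in the (2r+1)x(2r+1) neighbourhood of (v, u)."""
    h, w = disparity.shape
    patch = disparity[max(0, v - r):min(h, v + r + 1),
                      max(0, u - r):min(w, u + r + 1)]
    return patch.max() - patch.min()

def adjust_boundary(disparity, boundary_pts, thresh=5.0, r=1):
    """Keep a boundary pixel whose neighbourhood disparity jump exceeds
    thresh (the assumed mutation condition); otherwise replace it with
    the first neighbour that satisfies the condition, or keep it as-is."""
    h, w = disparity.shape
    out = []
    for v, u in boundary_pts:
        if patch_jump(disparity, v, u, r) > thresh:
            out.append((v, u))
            continue
        for dv in (-1, 0, 1):
            for du in (-1, 0, 1):
                nv, nu = v + dv, u + du
                if (0 <= nv < h and 0 <= nu < w
                        and patch_jump(disparity, nv, nu, r) > thresh):
                    out.append((nv, nu))
                    break
            else:
                continue
            break
        else:
            out.append((v, u))  # no better neighbour found: keep original
    return out

# Disparity with a sharp object edge between columns 4 and 5
disp = np.full((10, 10), 20.0)
disp[:, 5:] = 5.0
refined = adjust_boundary(disp, [(5, 4), (5, 3)])
```

The pixel already sitting on the parallax jump is retained, while the mislocated one is snapped onto a neighbour that does straddle the jump.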
2032. And the terminal constructs a three-dimensional model of the target object based on the adjusted initial contour and the calibration parameters of the image acquisition equipment.
Wherein, the image acquisition equipment is used for acquiring the vehicle environment image.
The terminal can construct a corresponding point set of the target object in a three-dimensional vehicle coordinate system. The vehicle coordinate system may be an XYZ rectangular space coordinate system, wherein the positive X-axis direction may be a horizontal rightward direction perpendicular to the vehicle traveling direction, the positive Y-axis direction may be the vehicle traveling direction, and the positive Z-axis direction may be a vertical upward direction. The terminal obtains the area position of the target object in the vehicle environment image according to the adjusted initial contour, and solves the three-dimensional space coordinate of the target object in the vehicle coordinate system based on the parallax value and the camera calibration parameter corresponding to the pixel point in the area position, so that a space model for simulating the target object is constructed.
In one possible example, the terminal may also determine the position of the road plane in three-dimensional space based on the disparity of the road plane. For example, the terminal may determine a three-dimensional space coordinate of each pixel point in the road plane in the three-dimensional vehicle coordinate system according to the parallax value of each pixel point in the road plane and the camera calibration parameter, and further, the terminal may determine the road surface height of the road area based on the three-dimensional space coordinate of the road area corresponding to the three-dimensional vehicle coordinate system.
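The back-projection in step 2032 can be sketched with the standard pinhole stereo model, an assumed concrete form of the patent's "camera calibration parameters": depth Z = fB/d, then X and Y from the pixel offsets against the principal point. Names and values are illustrative.

```python
import numpy as np

def pixels_to_points(pixels, disparity, f, B, cx, cy):
    """Back-project (u, v) pixels into 3-D camera coordinates:
        Z = f * B / d,  X = (u - cx) * Z / f,  Y = (v - cy) * Z / f
    f: focal length in pixels, B: stereo baseline in metres,
    (cx, cy): principal point."""
    pts = []
    for u, v in pixels:
        d = disparity[v, u]
        if d <= 0:
            continue  # skip invalid disparity
        Z = f * B / d
        pts.append(((u - cx) * Z / f, (v - cy) * Z / f, Z))
    return np.array(pts)

# One pixel 70 px right of and 70 px below the principal point, d = 35
disp = np.full((480, 640), 35.0)
pts = pixels_to_points([(390, 310)], disp, f=700.0, B=0.5, cx=320.0, cy=240.0)
```

With f = 700 px and B = 0.5 m, disparity 35 gives a depth of 10 m and lateral/vertical offsets of 1 m each; a mapping from this camera frame to the patent's XYZ vehicle frame would then apply the extrinsic calibration.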
2033. And the terminal adjusts the projection area of the three-dimensional model on the horizontal plane according to the depth of the three-dimensional model of the target object to obtain the outline of the target object on the horizontal plane.
The projection region is the region obtained by projecting the three-dimensional stereo model onto the horizontal plane, that is, the XOY plane. The depth of the three-dimensional stereo model refers to its length in the vehicle traveling direction, that is, along the Y axis. In the parallax image, pixel values at the contour boundary of the target object may be confused with those of the background and the road surface; in this step, the terminal further optimizes the contour of the target object in the Y-axis direction.
In a possible implementation manner, the terminal may find the pixel points possibly confused between the outline of the target object and the surrounding environment based on the coordinates of the target object in the Y-axis direction. In one possible example, the terminal determines the average of the Y coordinates of the plurality of pixel points included in the target object, and retains, according to this average, the target pixel points whose Y coordinates belong to a target range; for example, the terminal may retain only the pixel points whose Y coordinates belong to the interval [m − 3σ, m + 3σ], where m represents the average of the Y coordinates of the plurality of pixel points and σ is the standard deviation of those Y coordinates. Then, the terminal converts the target pixel points and the region where they are located into a 0-1 matrix and performs a convolution operation on the matrix with a target convolution kernel; for example, the target convolution kernel may be an all-"1" kernel of a certain size, and the terminal may retain only the coordinates of the target pixel points that are "1" before the convolution and whose value after the convolution is larger than a target threshold. The terminal then converts the matrix back, turning the retained target pixel points into a point set on the projected XOY plane, thereby further correcting the projection area.
It should be noted that the terminal can correct the pixel points which may be confused between the contour of the target object and the surrounding environment by using the Y coordinate in the Y axis direction, and delete the abnormal points which are expressed as isolated outliers in the point cloud of the target object through the Y coordinate screening process and the convolution process in the target range, thereby further optimizing the contour of the target object in the Y axis direction and improving the accuracy of target detection.
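The two-stage clean-up of step 2033 — the [m − 3σ, m + 3σ] screen on Y coordinates followed by the all-ones convolution that drops isolated cells — might look like this; the grid resolution, kernel size, and thresholds are illustrative assumptions.

```python
import numpy as np

def prune_projection(points, kernel=3, min_hits=3):
    """Clean an object's XOY projection:
    1) keep only points whose Y lies in [m - 3s, m + 3s]
       (m, s: mean and std of Y over the object's points);
    2) rasterise the survivors into a 0-1 occupancy grid and drop
       isolated cells: a cell survives only if the all-ones
       kernel x kernel window around it sums to at least min_hits."""
    y = points[:, 1]
    m, s = y.mean(), y.std()
    kept = points[np.abs(y - m) <= 3 * s]
    xi = np.round(kept[:, 0]).astype(int)
    yi = np.round(kept[:, 1]).astype(int)
    x0, y0 = xi.min(), yi.min()
    grid = np.zeros((yi.max() - y0 + 1, xi.max() - x0 + 1), dtype=int)
    grid[yi - y0, xi - x0] = 1
    r = kernel // 2
    pad = np.pad(grid, r)  # zero-pad so edge windows stay in bounds
    out = []
    for p, gx, gy in zip(kept, xi - x0, yi - y0):
        window = pad[gy:gy + kernel, gx:gx + kernel]
        if window.sum() >= min_hits:
            out.append(p)
    return np.array(out)

# A dense 3x3 cluster plus one isolated outlier at x = 10
cluster = np.array([(x, y) for x in range(3) for y in range(3)], dtype=float)
pts_xy = np.vstack([cluster, [[10.0, 1.0]]])
clean = prune_projection(pts_xy)
```

The outlier survives the 3σ screen (its Y is typical) but is removed by the convolution stage because its grid cell has no occupied neighbours.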
204. The terminal determines the spatial position of the target object in the vehicle environment according to the road area and the contour of the target object.
In this step, the terminal may determine the position of the target object in the surrounding environment relative to the road surface based on the road surface height, for example, 3 meters above the road surface ahead, so as to give the effective spatial position of the target object more directly. The terminal can determine the minimum circumscribed rectangle of the target object in the horizontal plane based on the outline of the target object; the terminal determines the plane size of the target object based on the position coordinates of the minimum circumscribed rectangle in the vehicle coordinate system; and the terminal determines the relative height of the target object on the road surface based on the road surface height of the road region. When determining the minimum circumscribed rectangle of the target object, the terminal can include all pixel points of the target object in the rectangle, taking the smallest rectangle containing all the pixel points as the detection frame of the target object, thereby determining the two-dimensional plane position of the target object on the horizontal plane. After the terminal determines the minimum circumscribed rectangle, the size of the rectangle can be further adjusted based on the actual size of the target object in the real physical world. In addition, the terminal may determine the height of the target object according to its extent in the vertical direction and the road surface height; for example, the terminal may determine the height of the target object according to the maximum and minimum coordinate values of the target object in the Z-axis direction, and determine the relative height of the target object with respect to the road surface according to the height coordinate of the road surface in the Z-axis direction.
The terminal can also determine, according to the relative height and the minimum circumscribed rectangle, the accurate three-dimensional space coordinates of the target object in the vehicle coordinate system. Further, the terminal can project the target object onto a two-dimensional plane based on these three-dimensional space coordinates, for example onto the two-dimensional horizontal plane. Of course, the terminal can also display the accurate projection on this two-dimensional horizontal plane on the vehicle-mounted terminal screen for the user to browse, which, for example, facilitates verification by testers during automatic driving tests of the vehicle.
For example, taking as the target object a vehicle adjacent to the own vehicle, the terminal finds the pixel points corresponding to the adjacent vehicle in the parallax image, calculates the corresponding three-dimensional coordinate points of these pixel points in the vehicle coordinate system by combining the camera calibration parameters, and then projects the three-dimensional coordinate points representing the adjacent vehicle onto the XOY plane. Generally, the terminal can derive the location and orientation of the adjacent vehicle from the distribution of points in the projection area. The terminal then further optimizes the projection area of the target object in the Y-axis direction and, after the isolated points are removed, detects the minimum circumscribed rectangle of the projection area to obtain the position of the adjacent vehicle on the ground. The terminal determines the height of the adjacent vehicle relative to the road according to the maximum and minimum coordinate values of the adjacent vehicle in the Z-axis direction in the vehicle coordinate system and the road surface height.
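A minimal sketch of step 204's measurements: here an axis-aligned bounding rectangle of the XOY projection stands in for the patent's minimum circumscribed rectangle (which would need a rotated-rectangle computation such as cv2.minAreaRect), and the relative height comes from the Z extent and the road height. Names and data are illustrative.

```python
import numpy as np

def bounding_box_and_height(points, road_z=0.0):
    """From an object's cleaned point cloud in the vehicle frame
    (columns X, Y, Z), return an axis-aligned XOY bounding rectangle,
    its plane size, and the object's height relative to the road."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rect = (x.min(), y.min(), x.max(), y.max())          # plane position
    plane_size = (x.max() - x.min(), y.max() - y.min())  # length, width
    rel_height = z.max() - road_z                        # height above road
    return rect, plane_size, rel_height

pts3d = np.array([[1.0, 2.0, 0.5],
                  [3.0, 6.0, 2.5],
                  [2.0, 4.0, 1.0]])
rect, size, rel_h = bounding_box_and_height(pts3d, road_z=0.5)
```

The rectangle could then be grown or shrunk toward the object's known physical size, as the text above describes.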
In order to describe the flow of the embodiment of the present invention more clearly, the overall flow of the above steps 201-204 is described below with the frame diagram shown in fig. 3 and the target detection flow diagram shown in fig. 4. As shown in fig. 3, the terminal may be configured with a rough segmentation module for road regions and target regions, configured to segment the target region and the road region in the vehicle environment image according to the V parallax image, so as to implement rough segmentation of different functional regions; the terminal may also be configured with a three-dimensional target detection module fused with semantic information, which is used to perform semantic instance segmentation on the vehicle environment image and to further correct and optimize the initial contour obtained by segmentation. As shown in fig. 4, the terminal acquires a parallax image, performs object segmentation on the target region to obtain target semantic information, corrects the initial contour based on the parallax image and the target semantic information, constructs a three-dimensional model of the target object based on the camera's intrinsic and extrinsic parameters and the corrected initial contour, projects the three-dimensional model onto the horizontal plane, and again performs the outlier-removal optimization on the projection area based on the Y coordinate in the Y-axis direction, thereby obtaining a contour with higher accuracy.
The terminal determines the minimum circumscribed rectangle of the target object, and after the length and width of the rectangle are adjusted based on the actual size, the terminal determines the accurate three-dimensional space coordinates of the target object in the vehicle coordinate system based on the road height and the accurate projection on the horizontal plane; of course, the terminal can further perform two-dimensional projection of the target object.
According to the method provided by the embodiment of the invention, the target region and the road region in the vehicle environment image are determined, and the target region is subjected to object segmentation, so that the target semantic information comprising the object type and the initial contour of the target object is obtained, the target object is comprehensively described from multiple angles, then the contour of the target object is further accurately positioned according to the parallax image and the target semantic information, the accuracy of the contour is greatly improved, and finally the spatial position of the target object is accurately determined according to the road region and the accurately positioned contour.
Fig. 5 is a schematic structural diagram of an object detection apparatus according to an embodiment of the present invention. Referring to fig. 5, the apparatus includes:
a determiningmodule 501, configured to determine a target area and a road area in a vehicle environment image based on a parallax image of the vehicle environment image during vehicle driving, where the target area includes a target object;
asegmentation module 502, configured to perform object segmentation on the target region to obtain target semantic information of the target object, where the target semantic information includes an object class of the target object and an initial contour of the target object;
the determiningmodule 501 is further configured to determine a contour of the target object according to the parallax image and the target semantic information of the target object;
the determiningmodule 501 is further configured to determine a spatial position of the target object in the vehicle environment according to the road area and the contour of the target object.
In a possible implementation manner, the determiningmodule 501 is further configured to adjust the initial contour of the target object based on a boundary pixel point of the initial contour of the target object corresponding to the parallax image; constructing a three-dimensional model of the target object based on the adjusted initial contour and calibration parameters of image acquisition equipment, wherein the image acquisition equipment is used for acquiring the vehicle environment image; and adjusting the projection area of the three-dimensional model on the horizontal plane according to the depth of the three-dimensional model of the target object to obtain the outline of the target object on the horizontal plane.
In a possible implementation manner, the determiningmodule 501 is further configured to determine a plurality of boundary pixel points corresponding to the initial contour of the target object in the parallax image; and adjusting the plurality of boundary pixel points corresponding to the initial contour of the target object according to the change degree of the parallax values of the plurality of neighborhood pixel points of each boundary pixel point.
In a possible implementation manner, the determiningmodule 501 is further configured to, when the parallax variation degrees of the plurality of neighborhood pixels of the boundary pixel satisfy a target mutation condition, retain the boundary pixel; and when the change degree of the parallax values of the plurality of neighborhood pixels of the boundary pixel does not meet the target mutation condition, replacing the boundary pixel with a pixel meeting the target mutation condition in the plurality of neighborhood pixels.
In a possible implementation manner, the determiningmodule 501 is further configured to determine a parallax image of the vehicle environment image based on at least two frames of vehicle environment images during the vehicle driving; projecting the parallax image along the vertical coordinate of the image coordinate system of the parallax image, and determining a longitudinal parallax image of the vehicle environment image, wherein the gray value of pixel points in the longitudinal parallax image is used for indicating the parallax distribution of each row of pixel points in the parallax image; and determining a target area and a road area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image.
In a possible implementation manner, the determiningmodule 501 is further configured to determine a road surface straight line in an image coordinate system of the longitudinal parallax image according to a gray value of each pixel in the longitudinal parallax image; determining a road area in the vehicle environment image based on the road surface straight line and the parallax value of the pixel point in the parallax image; determining a parallax value of a target area in the vehicle environment image according to the gray value of each row of pixel points in the longitudinal parallax image; in the longitudinal parallax image, determining a longitudinal coordinate range of the target area above the road area according to the parallax value of the target area; and determining the transverse coordinate range of the target area in the parallax image according to the longitudinal coordinate range and the parallax value of the target area.
In a possible implementation manner, thesegmentation module 502 is further configured to identify an object region of at least one target object in the target region; and determining the object type of each target object and the initial contour of each target object according to the pixel values of a plurality of pixel points in the object region.
In a possible implementation manner, the determiningmodule 501 is further configured to determine a minimum bounding rectangle of the target object in a horizontal plane based on the contour of the target object; determining the plane size of the target object based on the position coordinates of the minimum circumscribed rectangle in a vehicle coordinate system; based on the road surface height of the road region, a relative height of the target object on the road surface is determined.
In the embodiment of the invention, the target region and the road region in the vehicle environment image are determined, and the target region is subjected to object segmentation, so that the target semantic information comprising the object type and the initial contour of the target object is obtained, the target object is comprehensively described from multiple angles, then the contour of the target object is further accurately positioned according to the parallax image and the target semantic information, the accuracy of the contour is greatly improved, and finally the spatial position of the target object is accurately determined according to the road region and the accurately positioned contour.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
It should be noted that: in the object detection apparatus provided in the foregoing embodiment, only the division of the functional modules is illustrated in the foregoing, and in practical applications, the functions may be distributed by different functional modules as needed, that is, the internal structure of the computer device is divided into different functional modules to complete all or part of the functions described above. In addition, the target detection apparatus and the target detection method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
Fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention. The terminal 600 may be: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, etc.
In general, the terminal 600 includes: aprocessor 601 and amemory 602.
Processor 601 may include one or more processing cores, such as 4-core processors, 8-core processors, and so forth. Theprocessor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). Theprocessor 601 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, theprocessor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments,processor 601 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 602 may include one or more computer-readable storage media, which may be non-transitory.Memory 602 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium inmemory 602 is used to store at least one instruction for execution byprocessor 601 to implement the target detection method provided by the method embodiments herein.
In some embodiments, the terminal 600 may further optionally include: aperipheral interface 603 and at least one peripheral. Theprocessor 601,memory 602, andperipheral interface 603 may be connected by buses or signal lines. Various peripheral devices may be connected to theperipheral interface 603 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of aradio frequency circuit 604, atouch screen display 605, acamera 606, anaudio circuit 607, apositioning component 608, and apower supply 609.
Theperipheral interface 603 may be used to connect at least one peripheral related to I/O (Input/Output) to theprocessor 601 and thememory 602. In some embodiments, theprocessor 601,memory 602, andperipheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of theprocessor 601, thememory 602, and theperipheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
TheRadio Frequency circuit 604 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. Theradio frequency circuitry 604 communicates with communication networks and other communication devices via electromagnetic signals. Therf circuit 604 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, theradio frequency circuit 604 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. Theradio frequency circuit 604 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, therf circuit 604 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
Thedisplay 605 is used to display a UI (user interface). The UI may include graphics, text, icons, video, and any combination thereof. When thedisplay screen 605 is a touch display screen, thedisplay screen 605 also has the ability to capture touch signals on or over the surface of thedisplay screen 605. The touch signal may be input to theprocessor 601 as a control signal for processing. At this point, thedisplay 605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, thedisplay 605 may be one, providing the front panel of the terminal 600; in other embodiments, thedisplay 605 may be at least two, respectively disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, thedisplay 605 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 600. Even more, thedisplay 605 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. TheDisplay 605 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
The camera assembly 606 is used to capture images or video. Optionally, the camera assembly 606 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, the number of rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, the camera assembly 606 may also include a flash lamp. The flash lamp may be a monochrome-temperature flash lamp or a dual-color-temperature flash lamp. A dual-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuit 607 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electrical signals, and inputting the electrical signals to the processor 601 for processing or to the radio frequency circuit 604 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 600. The microphone may also be an array microphone or an omnidirectional acquisition microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a traditional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert an electrical signal into sound waves audible to humans, but also convert an electrical signal into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 607 may also include a headphone jack.
The positioning component 608 is used for determining the current geographic location of the terminal 600 to implement navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 609 is used to supply power to the various components in the terminal 600. The power supply 609 may be an alternating current power supply, a direct current power supply, a disposable battery, or a rechargeable battery. When the power supply 609 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast-charging technology.
In some embodiments, the terminal 600 also includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: an acceleration sensor 611, a gyroscope sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.
The acceleration sensor 611 may detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 600. For example, the acceleration sensor 611 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 601 may control the touch display screen 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used for collecting motion data of a game or a user.
The gyroscope sensor 612 may detect the body direction and rotation angle of the terminal 600, and the gyroscope sensor 612 may cooperate with the acceleration sensor 611 to collect the 3D motion of the user on the terminal 600. According to the data collected by the gyroscope sensor 612, the processor 601 may implement the following functions: motion sensing (such as changing the UI according to a tilting operation of the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 613 may be disposed on a side frame of the terminal 600 and/or on a lower layer of the touch display screen 605. When the pressure sensor 613 is disposed on the side frame of the terminal 600, a holding signal of the user on the terminal 600 can be detected, and the processor 601 performs left-hand/right-hand recognition or a shortcut operation according to the holding signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed on the lower layer of the touch display screen 605, the processor 601 controls an operability control on the UI interface according to the pressure operation of the user on the touch display screen 605. The operability control comprises at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 614 is used for collecting a fingerprint of a user, and the processor 601 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the identity of the user according to the collected fingerprint. Upon identifying the user's identity as a trusted identity, the processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 614 may be disposed on the front, back, or side of the terminal 600. When a physical button or a vendor logo is provided on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or the vendor logo.
The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, the processor 601 may control the display brightness of the touch display screen 605 based on the ambient light intensity collected by the optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 605 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 605 is decreased. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
The proximity sensor 616, also known as a distance sensor, is typically provided on the front panel of the terminal 600. The proximity sensor 616 is used to collect the distance between the user and the front surface of the terminal 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually decreases, the processor 601 controls the touch display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually increases, the processor 601 controls the touch display screen 605 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in Fig. 6 does not constitute a limitation of the terminal 600, and the terminal may include more or fewer components than those shown, combine some components, or adopt a different arrangement of components.
Fig. 7 is a schematic structural diagram of a server 700 according to an embodiment of the present invention. The server 700 may vary greatly due to differences in configuration or performance, and may include one or more processors (CPUs) 701 and one or more memories 702, where the memory 702 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 701 to implement the target detection method provided by each of the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and the server may also include other components for implementing device functions, which are not described herein again.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory including instructions executable by a processor in a terminal or a server to perform the object detection method in the above embodiments. For example, the computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (16)

1. A method of object detection, the method comprising:
determining a parallax image of the vehicle environment image based on at least two frames of vehicle environment images during the running of the vehicle;
projecting the parallax image along a vertical coordinate of an image coordinate system of the parallax image, and determining a longitudinal parallax image of the vehicle environment image, wherein gray values of pixel points in the longitudinal parallax image are used for indicating the parallax distribution of each row of pixel points in the parallax image;
determining a target area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image, wherein the target area comprises a target object;
carrying out object segmentation on the target area to obtain target semantic information of a target object, wherein the target semantic information comprises an object type of the target object and an initial contour of the target object;
determining the outline of the target object according to the parallax image and the target semantic information of the target object;
determining a road surface straight line in an image coordinate system of the longitudinal parallax image according to the gray value of each pixel point in the longitudinal parallax image;
determining a road area in the vehicle environment image based on the road surface straight line and the parallax value of the pixel point in the parallax image;
determining a spatial position of the target object in a vehicle environment according to the road area and the contour of the target object.
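The projection step in claim 1 is essentially the classic V-disparity construction. The sketch below is purely illustrative and not the patented implementation; the function name `v_disparity` and the use of plain Python lists are assumptions made for this example.

```python
# Illustrative V-disparity ("longitudinal parallax image") construction:
# each row of the output is a histogram of the disparity values found in
# the corresponding row of the disparity image, so the road surface maps
# to an oblique straight line and obstacles map to near-vertical segments.

def v_disparity(disparity, max_disp):
    """disparity: H x W 2-D list of integer disparity values.
    Returns an H x (max_disp + 1) list whose entry [r][d] counts how
    many pixels in row r have disparity d."""
    hist = [[0] * (max_disp + 1) for _ in disparity]
    for r, row in enumerate(disparity):
        for d in row:
            if 0 <= d <= max_disp:
                hist[r][d] += 1
    return hist

# Toy 3 x 4 disparity map: the middle row is a constant-disparity object.
disp = [
    [1, 1, 2, 1],
    [5, 5, 5, 5],
    [2, 2, 3, 2],
]
vd = v_disparity(disp, 5)
print(vd[1])  # -> [0, 0, 0, 0, 0, 4]: all four pixels share disparity 5
```

The gray value of pixel (d, r) in the longitudinal parallax image then corresponds to `vd[r][d]`, which is what the claim uses to indicate the per-row disparity distribution.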
2. The method of claim 1, wherein determining the contour of the target object according to the disparity image and target semantic information of the target object comprises:
adjusting the initial contour of the target object based on the boundary pixel points of the initial contour of the target object corresponding to the parallax image;
constructing a three-dimensional model of the target object based on the adjusted initial contour and calibration parameters of image acquisition equipment, wherein the image acquisition equipment is used for acquiring the vehicle environment image;
and adjusting the projection area of the three-dimensional model on the horizontal plane according to the depth of the three-dimensional model of the target object to obtain the contour of the target object on the horizontal plane.
3. The method according to claim 2, wherein the adjusting the initial contour of the target object based on the boundary pixel point corresponding to the initial contour of the target object in the parallax image comprises:
determining a plurality of boundary pixel points corresponding to the initial contour of the target object in the parallax image;
and adjusting the plurality of boundary pixel points corresponding to the initial contour of the target object according to the change degree of the parallax values of the plurality of neighborhood pixel points of each boundary pixel point.
4. The method of claim 3, wherein the adjusting the plurality of boundary pixels corresponding to the initial contour of the target object according to the variation degree of the parallax values of the plurality of neighborhood pixels of each boundary pixel comprises:
when the change degree of the parallax values of a plurality of neighborhood pixels of the boundary pixel meets a target mutation condition, the boundary pixel is reserved;
and when the change degree of the parallax values of the plurality of neighborhood pixels of the boundary pixel does not meet the target mutation condition, replacing the boundary pixel with a pixel meeting the target mutation condition in the plurality of neighborhood pixels.
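Claims 3-4 keep a boundary pixel when the disparity in its neighborhood changes abruptly (a depth discontinuity) and otherwise snap it to a neighborhood pixel that does. A minimal one-dimensional sketch follows; the helper name `refine_boundary`, the neighborhood radius, and the jump threshold are illustrative assumptions, not values taken from the patent.

```python
# Illustrative boundary refinement using a disparity "mutation"
# (abrupt-change) condition: a boundary column is kept if the disparity
# jump at that column exceeds a threshold; otherwise the boundary moves
# to the neighborhood column with the sharpest jump.

def refine_boundary(disp_row, col, radius=2, jump=3):
    """disp_row: one row of the disparity image; col: boundary column.
    Returns the (possibly adjusted) boundary column index."""
    lo = max(col - radius, 0)
    hi = min(col + radius, len(disp_row) - 2)

    def change(c):
        # disparity jump between column c and its right neighbor
        return abs(disp_row[c + 1] - disp_row[c])

    if change(col) >= jump:   # mutation condition met: keep the pixel
        return col
    # otherwise replace it with the neighborhood pixel meeting the
    # condition best (largest disparity jump)
    return max(range(lo, hi + 1), key=change)

row = [10, 10, 10, 10, 2, 2, 2]  # depth discontinuity between cols 3 and 4
print(refine_boundary(row, 2))   # -> 3: the boundary snaps to column 3
```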
5. The method of claim 1, wherein the determining the target region in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image comprises:
determining a parallax value of a target area in the vehicle environment image according to the gray value of each row of pixel points in the longitudinal parallax image;
determining a longitudinal coordinate range of the target area above the road area according to the parallax value of the target area in the longitudinal parallax image;
and determining the transverse coordinate range of the target area in the parallax image according to the longitudinal coordinate range and the parallax value of the target area.
6. The method of claim 1, wherein the performing object segmentation on the target region to obtain target semantic information of a target object comprises:
identifying, in the target region, an object region of at least one target object;
and determining the object type of each target object and the initial contour of each target object according to the pixel values of a plurality of pixel points in the object region.
7. The method of claim 1, wherein determining the spatial location of the target object in the vehicle environment based on the road region and the contour of the target object comprises:
determining a minimum bounding rectangle of the target object in a horizontal plane based on the contour of the target object;
determining the plane size of the target object based on the position coordinates of the minimum bounding rectangle in a vehicle coordinate system;
determining a relative height of the target object on the road surface based on a road surface height of the road region.
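Claim 7 derives the plane size of the target object from its minimum bounding rectangle in the horizontal plane. The sketch below shows one common way to find that rectangle for a convex ground-plane contour by testing each edge direction (a rotating-calipers-style search); the function name and the convexity assumption are illustrative and not taken from the patent.

```python
import math

def min_area_rect_size(pts):
    """pts: list of (x, y) vertices of a convex contour on the ground
    plane. Returns (length, width) of the minimum-area enclosing
    rectangle, found by aligning the rectangle with each polygon edge
    in turn (sufficient for convex polygons)."""
    best = None
    n = len(pts)
    for i in range(n):
        (x0, y0), (x1, y1) = pts[i], pts[(i + 1) % n]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # project all vertices onto the edge-aligned axes
        us = [x * c + y * s for x, y in pts]
        vs = [-x * s + y * c for x, y in pts]
        w, h = max(us) - min(us), max(vs) - min(vs)
        if best is None or w * h < best[0]:
            best = (w * h, max(w, h), min(w, h))
    return best[1], best[2]

# An axis-aligned 4 x 2 rectangle: its minimum bounding rectangle is itself.
print(min_area_rect_size([(0, 0), (4, 0), (4, 2), (0, 2)]))  # -> (4.0, 2.0)
```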
8. An object detection apparatus, characterized in that the apparatus comprises:
a determining module, used for determining a parallax image of a vehicle environment image based on at least two frames of vehicle environment images in the driving process of a vehicle; projecting the parallax image along a vertical coordinate of an image coordinate system of the parallax image, and determining a longitudinal parallax image of the vehicle environment image, wherein gray values of pixel points in the longitudinal parallax image are used for indicating the parallax distribution of each row of pixel points in the parallax image; and determining a target area in the vehicle environment image according to the gray value of each pixel point in the longitudinal parallax image and the parallax value of each pixel point in the parallax image, wherein the target area comprises a target object;
the segmentation module is used for carrying out object segmentation on the target area to obtain target semantic information of a target object, wherein the target semantic information comprises an object type of the target object and an initial contour of the target object;
the determining module is further configured to determine a contour of the target object according to the parallax image and target semantic information of the target object;
the determining module is further configured to determine a road surface straight line in an image coordinate system of the longitudinal parallax image according to the gray value of each pixel point in the longitudinal parallax image; determining a road area in the vehicle environment image based on the road surface straight line and the parallax value of the pixel point in the parallax image;
the determining module is further configured to determine a spatial position of the target object in the vehicle environment according to the road area and the contour of the target object.
9. The apparatus of claim 8,
the determining module is further configured to adjust the initial contour of the target object based on a boundary pixel point of the initial contour of the target object corresponding to the parallax image; constructing a three-dimensional model of the target object based on the adjusted initial contour and calibration parameters of image acquisition equipment, wherein the image acquisition equipment is used for acquiring the vehicle environment image; and adjusting the projection area of the three-dimensional model on the horizontal plane according to the depth of the three-dimensional model of the target object to obtain the contour of the target object on the horizontal plane.
10. The apparatus of claim 9,
the determining module is further configured to determine a plurality of boundary pixel points corresponding to the initial contour of the target object in the parallax image; and adjusting the plurality of boundary pixel points corresponding to the initial contour of the target object according to the change degree of the parallax values of the plurality of neighborhood pixel points of each boundary pixel point.
11. The apparatus of claim 10,
the determining module is further configured to retain the boundary pixel points when the parallax value variation degrees of the plurality of neighborhood pixel points of the boundary pixel points satisfy a target mutation condition; and when the change degree of the parallax values of the neighborhood pixels of the boundary pixel does not meet the target mutation condition, replacing the boundary pixel with the pixel meeting the target mutation condition in the neighborhood pixels.
12. The apparatus of claim 8,
the determining module is further configured to determine a disparity value of a target area in the vehicle environment image according to a gray value of each column of pixel points in the longitudinal disparity image; determining a longitudinal coordinate range of the target area above the road area according to the parallax value of the target area in the longitudinal parallax image; and determining the transverse coordinate range of the target area in the parallax image according to the longitudinal coordinate range and the parallax value of the target area.
13. The apparatus of claim 8,
the segmentation module is further used for identifying, in the target region, an object region of at least one target object; and determining the object type of each target object and the initial contour of each target object according to the pixel values of a plurality of pixel points in the object region.
14. The apparatus of claim 8,
the determining module is further configured to determine a minimum bounding rectangle of the target object in a horizontal plane based on the contour of the target object; determining the plane size of the target object based on the position coordinates of the minimum bounding rectangle in a vehicle coordinate system; determining a relative height of the target object on the road surface based on a road surface height of the road region.
15. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction that is loaded and executed by the processor to perform operations performed by the object detection method of any one of claims 1 to 7.
16. A computer-readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to perform operations performed by the object detection method of any one of claims 1 to 7.
CN201911304904.2A | 2019-12-17 (priority) | 2019-12-17 (filed) | Target detection method, target detection device, computer equipment and storage medium | Active | CN111104893B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201911304904.2A (CN111104893B) | 2019-12-17 | 2019-12-17 | Target detection method, target detection device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201911304904.2A (CN111104893B) | 2019-12-17 | 2019-12-17 | Target detection method, target detection device, computer equipment and storage medium

Publications (2)

Publication Number | Publication Date
CN111104893A (en) | 2020-05-05
CN111104893B (en) | 2022-09-20

Family

ID=70422608

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201911304904.2A (CN111104893B, Active) | Target detection method, target detection device, computer equipment and storage medium | 2019-12-17 | 2019-12-17

Country Status (1)

Country | Link
CN (1) | CN111104893B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112233150B (en)* | 2020-09-09 | 2024-12-31 | 北京迈格威科技有限公司 | Image processing and blurring method, device, electronic device and storage medium
CN112200172B (en)* | 2020-12-07 | 2021-02-19 | 天津天瞳威势电子科技有限公司 | Driving region detection method and device
CN112950535B (en)* | 2021-01-22 | 2024-03-22 | 北京达佳互联信息技术有限公司 | Video processing method, device, electronic equipment and storage medium
CN114120265A (en)* | 2021-10-29 | 2022-03-01 | 际络科技(上海)有限公司 | Obstacle detection method, obstacle detection device, electronic device, and storage medium
CN114219992B (en)* | 2021-12-14 | 2022-06-03 | 杭州古伽船舶科技有限公司 | Unmanned ship obstacle avoidance system based on image recognition technology
CN114219791B (en)* | 2021-12-17 | 2025-01-03 | 盛视科技股份有限公司 | Vision-based road water detection method, electronic equipment and vehicle alarm system
CN117710734A (en)* | 2023-12-13 | 2024-03-15 | 北京百度网讯科技有限公司 | Methods, devices, electronic equipment, and media for obtaining semantic data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN108197590B (en)* | 2018-01-22 | 2020-11-03 | 海信集团有限公司 | Pavement detection method, device, terminal and storage medium
CN108520536B (en)* | 2018-03-27 | 2022-01-11 | 海信集团有限公司 | Disparity map generation method and device and terminal

Also Published As

Publication number | Publication date
CN111104893A (en) | 2020-05-05

Similar Documents

Publication | Title
CN111126182B (en) | Lane line detection method, lane line detection device, electronic device, and storage medium
CN111104893B (en) | Target detection method, target detection device, computer equipment and storage medium
US11205282B2 (en) | Relocalization method and apparatus in camera pose tracking process and storage medium
WO2021128777A1 (en) | Method, apparatus, device, and storage medium for detecting travelable region
US11978219B2 (en) | Method and device for determining motion information of image feature point, and task performing method and device
CN111126276B (en) | Lane line detection method, lane line detection device, computer equipment and storage medium
CN110059685A (en) | Word area detection method, apparatus and storage medium
CN110599593B (en) | Data synthesis method, device, equipment and storage medium
CN110490179B (en) | License plate recognition method and device and storage medium
CN110570460A (en) | Target tracking method and device, computer equipment and computer readable storage medium
CN110335224B (en) | Image processing method, image processing device, computer equipment and storage medium
CN111784841B (en) | Method, device, electronic equipment and medium for reconstructing three-dimensional image
CN110503159B (en) | Character recognition method, device, equipment and medium
CN111538009B (en) | Radar point marking method and device
CN112150560A (en) | Method, apparatus and computer storage medium for determining vanishing point
CN113205515A (en) | Target detection method, device and computer storage medium
CN113378705A (en) | Lane line detection method, device, equipment and storage medium
CN111444749B (en) | Method and device for identifying road surface guide mark and storage medium
CN113689484B (en) | Method and device for determining depth information, terminal and storage medium
CN111127541A (en) | Vehicle size determination method and device and storage medium
CN113920222A (en) | Method, apparatus, device and readable storage medium for obtaining map data
CN111563402B (en) | License plate recognition method, license plate recognition device, terminal and storage medium
CN115965936A (en) | Edge position marking method and equipment
CN111639639A (en) | Method, device, equipment and storage medium for detecting text area
CN112241662B (en) | Method and device for detecting drivable area

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
TA01 | Transfer of patent application right

Effective date of registration: 20200611

Address after: 215100 16/F, Lingyu Business Plaza, 66 Qinglonggang Road, High-Speed Rail New Town, Xiangcheng District, Suzhou City, Jiangsu Province

Applicant after: SUZHOU ZHIJIA TECHNOLOGY Co.,Ltd.

Applicant after: Zhijia (Cayman) Co.

Applicant after: Zhijia (USA)

Address before: 215100 16/F, Lingyu Business Plaza, 66 Qinglonggang Road, High-Speed Rail New Town, Xiangcheng District, Suzhou City, Jiangsu Province

Applicant before: SUZHOU ZHIJIA TECHNOLOGY Co.,Ltd.
TA01 | Transfer of patent application right

Effective date of registration: 20210310

Address after: 16/F, Lingyu Business Plaza, 66 Qinglonggang Road, High-Speed Rail New Town, Xiangcheng District, Suzhou City, Jiangsu Province

Applicant after: SUZHOU ZHIJIA TECHNOLOGY Co.,Ltd.

Applicant after: Zhijia (USA)

Address before: 215100 16/F, Lingyu Business Plaza, 66 Qinglonggang Road, High-Speed Rail New Town, Xiangcheng District, Suzhou City, Jiangsu Province

Applicant before: SUZHOU ZHIJIA TECHNOLOGY Co.,Ltd.

Applicant before: Zhijia (Cayman) Co.

Applicant before: Zhijia (USA)
GR01 | Patent grant
