Disclosure of Invention
The invention provides an object space parameter detection method and device, an electronic device, and a storage medium, to solve the problems in the prior art that, when target detection is performed on a two-dimensional picture, the parameters of the three-dimensional space are difficult to restore, the position information of the target object cannot be accurately located, and the accuracy of three-dimensional space detection of the object is therefore affected.
According to a first aspect of the embodiments of the present disclosure, the present disclosure provides an object space parameter detection method applied to an electronic device, where the method includes:
acquiring a two-dimensional image corresponding to a target object;
inputting the two-dimensional image into a recognition model trained to convergence to obtain category information, two-dimensional frame information and an observation angle of the target object, wherein the target object is framed by the two-dimensional frame;
determining the three-dimensional frame contour of the target object according to the category information and a preset mapping relation between the category information and the three-dimensional frame contour of the target object;
determining the three-dimensional frame coordinate of the target object according to the three-dimensional frame profile of the target object, the two-dimensional frame information and the observation angle;
and outputting the three-dimensional frame outline and the three-dimensional frame coordinates of the target object.
Optionally, the two-dimensional frame information includes: the two-dimensional frame contour and the two-dimensional frame coordinate; and the determining of the three-dimensional frame coordinate of the target object according to the three-dimensional frame profile of the target object, the two-dimensional frame information and the observation angle includes:
and converting the two-dimensional frame coordinate into the three-dimensional frame coordinate of the target object according to the projection relation between the three-dimensional frame contour and the two-dimensional frame contour under the observation angle.
Optionally, the converting the two-dimensional frame coordinate into the three-dimensional frame coordinate of the target object according to the projection relationship between the three-dimensional frame profile and the two-dimensional frame profile at the observation angle includes:
estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame contour according to the two-dimensional frame contour, the observation angle and the two-dimensional frame coordinate;
and converting the projection coordinate into the three-dimensional frame coordinate according to a preset image conversion rule.
Optionally, the estimating, according to the two-dimensional frame profile, the observation angle, and the two-dimensional frame coordinate, a projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame profile includes:
calculating a first projection parameter according to the two-dimensional frame outline;
calculating a second projection parameter according to the observation angle;
and estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame outline according to the first projection parameter, the second projection parameter and the two-dimensional frame coordinate.
Optionally, the converting the projection coordinate into the three-dimensional frame coordinate according to a preset image conversion rule includes:
restoring the projection coordinate information into a unit three-dimensional coordinate according to a preset image conversion rule;
calculating the unit three-dimensional frame outline according to the unit three-dimensional coordinates;
calculating an amplification factor according to the ratio of the three-dimensional frame outline to the unit three-dimensional frame outline;
and determining the product of the unit three-dimensional coordinate and the magnification coefficient as the three-dimensional frame coordinate.
Optionally, the electronic device includes an image capturing sensor, configured to capture the two-dimensional image, and accordingly, the restoring the projection coordinate to a unit three-dimensional coordinate according to a preset image conversion rule includes:
acquiring parameter information when the two-dimensional image is acquired by the image acquisition sensor, wherein the parameter information comprises a preset image conversion rule, and the preset image conversion rule is a conversion relation between a projection coordinate and a unit three-dimensional coordinate;
and restoring the projection coordinate into a unit three-dimensional coordinate according to the conversion relation between the projection coordinate and the unit three-dimensional coordinate.
Optionally, after determining the three-dimensional frame coordinate of the target object according to the three-dimensional frame profile of the target object, the two-dimensional frame information, and the observation angle, the method further includes:
and determining the yaw angle of the target object according to the observation angle and the three-dimensional frame coordinate.
Optionally, before the inputting the two-dimensional image into the recognition model trained to converge to obtain the category information, the two-dimensional frame information, and the observation angle of the target object, the method further includes:
acquiring a training sample of the recognition model, wherein the training sample is a marked two-dimensional image;
and inputting the marked two-dimensional image into the recognition model, and training the recognition model.
Optionally, after outputting the three-dimensional frame contour and the three-dimensional frame coordinates of the target object, the method further includes:
and displaying the corresponding three-dimensional frame according to the three-dimensional frame outline of the target object and the three-dimensional frame coordinate.
According to a second aspect of the embodiments of the present disclosure, the present disclosure provides an object space parameter detection apparatus, including:
the acquisition module is used for acquiring a two-dimensional image corresponding to a target object;
the recognition module is used for inputting the two-dimensional image into a recognition model trained to convergence so as to obtain the category information, the two-dimensional frame information and the observation angle of the target object, wherein the target object is framed by the two-dimensional frame;
the first determination module is used for determining the three-dimensional frame outline of the target object according to the category information and a preset mapping relation between the category information and the three-dimensional frame outline of the target object;
the second determination module is used for determining the three-dimensional frame coordinate of the target object according to the three-dimensional frame outline of the target object, the two-dimensional frame information and the observation angle;
and the output module is used for outputting the three-dimensional frame outline and the three-dimensional frame coordinate of the target object.
Optionally, the two-dimensional frame information includes: the two-dimensional frame contour and the two-dimensional frame coordinate; and the second determining module is specifically configured to:
and converting the two-dimensional frame coordinate into the three-dimensional frame coordinate of the target object according to the projection relation between the three-dimensional frame contour and the two-dimensional frame contour under the observation angle.
Further optionally, when the two-dimensional frame coordinate is converted into the three-dimensional frame coordinate of the target object according to the projection relationship between the three-dimensional frame contour and the two-dimensional frame contour under the observation angle, the second determining module is specifically configured to:
estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame contour according to the two-dimensional frame contour, the observation angle and the two-dimensional frame coordinate;
and converting the projection coordinate into the three-dimensional frame coordinate according to a preset image conversion rule.
Further optionally, when estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame contour according to the two-dimensional frame contour, the observation angle, and the two-dimensional frame coordinate, the second determining module is specifically configured to:
calculating a first projection parameter according to the two-dimensional frame outline;
calculating a second projection parameter according to the observation angle;
and estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame outline according to the first projection parameter, the second projection parameter and the two-dimensional frame coordinate.
Optionally, the second determining module, when the projection coordinate is converted into the three-dimensional frame coordinate according to a preset image conversion rule, is specifically configured to:
restoring the projection coordinate information into a unit three-dimensional coordinate according to a preset image conversion rule;
calculating the unit three-dimensional frame outline according to the unit three-dimensional coordinates;
calculating an amplification factor according to the ratio of the three-dimensional frame outline to the unit three-dimensional frame outline;
and determining the product of the unit three-dimensional coordinate and the magnification coefficient as the three-dimensional frame coordinate.
Optionally, the electronic device includes an image capturing sensor configured to capture the two-dimensional image; accordingly, when restoring the projection coordinate to a unit three-dimensional coordinate according to a preset image conversion rule, the second determining module is specifically configured to:
acquiring parameter information when the two-dimensional image is acquired by the image acquisition sensor, wherein the parameter information comprises a preset image conversion rule, and the preset image conversion rule is a conversion relation between a projection coordinate and a unit three-dimensional coordinate;
and restoring the projection coordinate into a unit three-dimensional coordinate according to the conversion relation between the projection coordinate and the unit three-dimensional coordinate.
Optionally, the object space parameter detecting apparatus further includes:
and the yaw angle determining module is used for determining the yaw angle of the target object according to the observation angle and the three-dimensional frame coordinate.
Optionally, the object space parameter detecting apparatus further includes:
the sample acquisition module is used for acquiring a training sample of the recognition model, wherein the training sample is a marked two-dimensional image;
and the training module is used for inputting the marked two-dimensional images into the recognition model and training the recognition model.
Optionally, the object space parameter detecting apparatus further includes:
and the display module is used for displaying the corresponding three-dimensional frame according to the three-dimensional frame outline of the target object and the three-dimensional frame coordinate.
According to a third aspect of the embodiments of the present disclosure, the present disclosure provides an electronic device, including:
a memory, a processor, and a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the object space parameter detection method according to any one of the first aspect of the embodiments of the present disclosure.
According to a fourth aspect of the embodiments of the present disclosure, the present disclosure provides a computer-readable storage medium in which computer-executable instructions are stored, where the computer-executable instructions, when executed by a processor, implement the object space parameter detection method according to any one of the first aspect of the embodiments of the present disclosure.
The invention provides an object space parameter detection method and device, an electronic device, and a storage medium. The method includes: acquiring a two-dimensional image corresponding to a target object; inputting the two-dimensional image into a recognition model trained to convergence to obtain category information, two-dimensional frame information and an observation angle of the target object, wherein the target object is framed by the two-dimensional frame; determining the three-dimensional frame contour of the target object according to the category information and a preset mapping relation between the category information and the three-dimensional frame contour of the target object; determining the three-dimensional frame coordinate of the target object according to the three-dimensional frame contour of the target object, the two-dimensional frame information and the observation angle; and outputting the three-dimensional frame contour and the three-dimensional frame coordinates of the target object. By inputting the two-dimensional image of the target object into the recognition model, the category information, the two-dimensional frame information and the observation angle of the target object are obtained, and the three-dimensional frame contour of the target object is obtained from its category information; the three-dimensional frame coordinate of the target object is then obtained from the three-dimensional frame contour, the two-dimensional frame information and the observation angle. The parameters of the three-dimensional space of the target object are thereby restored, the position information of the target object is accurately located, and the accuracy of three-dimensional space detection of the object is improved.
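For illustration only, the following Python sketch mirrors the flow summarized above. It is a minimal illustration under assumed interfaces: `recognition_model`, `contour_lookup` and `solve_3d_coordinates` are hypothetical callables and are not part of the disclosure.

```python
# Illustrative sketch of the summarized detection flow; all names are hypothetical.
def detect_object_space_parameters(image, recognition_model, contour_lookup, solve_3d_coordinates):
    """Return the 3D frame contour and 3D frame coordinates for one target object."""
    # Steps 1-2: run the converged recognition model on the two-dimensional image.
    category, box_2d, observation_angle = recognition_model(image)
    # Step 3: preset mapping from category information to a three-dimensional frame contour.
    contour_3d = contour_lookup(category)
    # Step 4: recover the three-dimensional frame coordinates from the contour,
    # the two-dimensional frame information and the observation angle.
    coords_3d = solve_3d_coordinates(contour_3d, box_2d, observation_angle)
    # Step 5: output both.
    return contour_3d, coords_3d
```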
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terms to which the present invention relates will be explained first:
Three-dimensional space target detection: a three-dimensional space (3D for short) is a space formed by three dimensions, namely length, width and height; a point in three-dimensional space can be uniquely determined by three coordinate axes. Therefore, for a static object in three-dimensional space, completely and accurately determining its state requires its object space parameters, i.e. its three-dimensional coordinates and three-dimensional contour. Three-dimensional space target detection is a technology that acquires information about a target object through an electronic device, calculates the object space parameters, and determines the state of the target object in three-dimensional space.
Two-dimensional frame: in the process of three-dimensional space target detection, a two-dimensional image corresponding to a target object may be acquired by an image acquisition device such as a camera or a video camera; for example, a picture of an automobile is taken, and the picture is the two-dimensional image corresponding to the automobile. In order to identify the position and size of the automobile in the picture, the automobile is framed with a rectangular frame, which determines its position and size in the picture. This rectangular frame is a two-dimensional frame, used to determine the position and size of the target object in the two-dimensional image.
Three-dimensional frame: in the process of detecting the target in the three-dimensional space, the three-dimensional frame has a similar function to the two-dimensional frame and can be used for determining the size and the position of the target object in the three-dimensional space.
The following explains an application scenario of the embodiment of the present invention:
fig. 1 is an application scene diagram of the object space parameter detection method according to the embodiment of the present invention. As shown in fig. 1, an object space parameter detection device 12 according to the embodiment of the present invention is installed on an automobile 11 with an automatic driving function, specifically, it may be installed on a vehicle-mounted terminal, and is connected to a driving computer of the automobile 11, and is configured to detect and display the situation of an obstacle 13 in the forward direction of the automobile 11, so as to provide data support for the driving strategy of the automatically driven automobile 11.
In a specific scene, after the automatic driving mode of the automobile 11 is started, the object space parameter detection device running the object space parameter detection method provided by the embodiment of the invention is operated, and two-dimensional images in the advancing direction of the automobile 11 are collected. After a two-dimensional image is obtained, it is input into the recognition model; if an obstacle 13 is present in the two-dimensional image, the category information, the two-dimensional frame information and the observation angle of the obstacle 13 are obtained through the recognition model. The corresponding three-dimensional frame contour is then queried according to the category information of the obstacle 13, and the three-dimensional frame coordinate of the obstacle 13 is determined according to the three-dimensional frame contour, the two-dimensional frame information and the observation angle. Finally, the three-dimensional frame contour and the three-dimensional frame coordinate of the obstacle 13 are output, in the form of a three-dimensional frame, to a display screen of the vehicle-mounted terminal in the cab of the automobile 11, prompting the driver with the spatial position and size of the obstacle 13 ahead. Meanwhile, the three-dimensional frame contour and the three-dimensional frame coordinate of the obstacle 13 are input into the driving computer of the automobile 11, which judges the distance and direction between the obstacle 13 and the automobile 11 and formulates a corresponding driving strategy according to the current driving speed and direction of the automobile 11 so as to avoid the obstacle 13. Since the obstacle 13 may be static, such as a garbage can on the roadside, or dynamic, such as an automobile or a pedestrian, the object space parameter detection device needs to run continuously at a preset time interval; for example, it acquires a two-dimensional image in the forward direction of the automobile 11 every 0.2 seconds and inputs it into the recognition model. If no obstacle 13 exists in the two-dimensional image, the recognition model cannot identify an obstacle 13 and the process ends; if an obstacle 13 is found, the subsequent steps of obtaining its three-dimensional frame contour and three-dimensional frame coordinate are run, and the result is output to the display screen and the driving computer.
In an application scenario of the object space parameter detection method provided by the embodiment of the invention, the object space parameter detection device provided by the embodiment of the invention is arranged on an automobile, and the two-dimensional image is used for restoring the three-dimensional space parameters of the target object, so that the position information of the target object is accurately positioned, and the accuracy of the object three-dimensional space detection is improved.
The following describes the technical solutions of the present invention and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
Fig. 2 is a flowchart of the object space parameter detection method of the present invention, and as shown in fig. 2, the object space parameter detection method provided in this embodiment is applied to an electronic device, and includes the following steps:
step S201, acquiring a two-dimensional image corresponding to a target object;
specifically, the target object is an object to be subjected to spatial parameter detection, and the two-dimensional image is image information displayed by the target object in a two-dimensional image plane, for example, a photograph including a car, where the car is the target object, and the photograph including the car is a two-dimensional image corresponding to the car. Specifically, the two-dimensional image is a two-dimensional matrix composed of pixel information.
There are various methods for acquiring the two-dimensional image corresponding to the target object, for example, taking a picture with a camera, or reading a picture in a storage device, or extracting a specific frame from video information, and the specific method for acquiring the two-dimensional image is not limited herein.
Optionally, the number of target objects may be one or more; for example, in one captured picture there are a plurality of cars, and each car may be taken as a target object. In particular, when a plurality of target objects appear simultaneously in a two-dimensional image, the two-dimensional image is regarded as the two-dimensional image corresponding to each target object, and the subsequent method steps may be performed for each target object in the two-dimensional image simultaneously.
Step S202, inputting the two-dimensional image into a recognition model trained to convergence to obtain the category information, the two-dimensional frame information and the observation angle of the target object, wherein the target object is framed by the two-dimensional frame;
optionally, the recognition model is a convolutional neural network model with observation angle prediction, improved on the basis of the Faster R-CNN convolutional neural network model, and is used for extracting features from the two-dimensional image and outputting the category information, the two-dimensional frame information and the observation angle of the target object. Specifically, the convolutional neural network model with observation angle prediction extracts features from the two-dimensional image, generates candidate two-dimensional frame information inside the model, performs region-of-interest pooling (ROI pooling for short) on the candidate two-dimensional frame information to further extract the features that may contain the target object, classifies the further extracted features, and outputs the category information, the two-dimensional frame information and the observation angle.
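The following PyTorch sketch illustrates, purely as an assumption, how such an output head could be attached after ROI pooling; the layer sizes, the number of classes and the angle parameterization are not specified by the disclosure and are chosen here only for illustration.

```python
import torch
import torch.nn as nn

class RoIHeadWithAngle(nn.Module):
    """Illustrative head over ROI-pooled features: category, 2D frame, observation angle."""
    def __init__(self, feat_dim: int = 1024, num_classes: int = 8):
        super().__init__()
        self.cls_branch = nn.Linear(feat_dim, num_classes)  # category information
        self.box_branch = nn.Linear(feat_dim, 4)            # 2D frame (x, y, h, w)
        self.angle_branch = nn.Linear(feat_dim, 1)          # observation angle

    def forward(self, roi_feats: torch.Tensor):
        return (self.cls_branch(roi_feats),
                self.box_branch(roi_feats),
                self.angle_branch(roi_feats))
```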
Further, the category information of the target object is a description of the category of the target object; optionally, the categories corresponding to the category information include: family cars, SUVs, sprinkler trucks, fire trucks, semi-trucks, people, garbage cans, trees, and the like. With the trained recognition model, the target object in the two-dimensional image can be recognized as the corresponding category information.
The two-dimensional frame information refers to the contour and position information of the target object realized in a two-dimensional frame manner, and specifically, the position and size of the target object in the two-dimensional image are marked in a two-dimensional frame manner, that is, the target object is framed by the two-dimensional frame, and the contour of the two-dimensional frame can approximately represent the contour of the target object in the two-dimensional image; the position of the two-dimensional frame can approximately represent the position of the target object on the two-dimensional image.
As shown in fig. 3, the observation angle 34 is the angle between a line 31 connecting the observation point (i.e., the electronic device) and the target object 30 and the extending direction 33 of the target object 30. Specifically, when the target object is not axially symmetric, the two-dimensional pictures obtained for the same target object 30 differ when the positions of the observation points differ; when the observation angle 34 is larger than 0 degrees and smaller than 90 degrees, this causes deviations in the spatial parameter detection process of the target object 30. The observation angle 34 is therefore introduced to correct the deviation caused by the different observation point positions, so that the detection precision is improved.
Step S203, determining the three-dimensional frame contour of the target object according to the category information and the preset mapping relation between the category information and the three-dimensional frame contour of the target object;
optionally, the three-dimensional frame profile is a cuboid structure defined by a set of length, width, and height parameters. The three-dimensional frame contour is close to the contour of the target object and can completely wrap the contour of the target object.
Specifically, the electronic device is preset with a mapping relationship between category information and the length, width and height parameters of the three-dimensional frame contour. For example, if the category information is a family car, the three-dimensional frame contour in one-to-one mapping relationship with that category information is a cuboid with a length of 3 meters, a width of 1.5 meters and a height of 1.2 meters.
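A minimal sketch of such a preset mapping is given below, using the family-car example above; the category keys, units and additional entries are hypothetical and only illustrate the lookup.

```python
# Hypothetical preset mapping from category information to a three-dimensional frame
# contour (length, width, height) in meters, following the family-car example above.
CONTOUR_BY_CATEGORY = {
    "family_car": (3.0, 1.5, 1.2),
    # further categories (SUV, fire truck, pedestrian, ...) would be added here
}

def contour_lookup(category: str):
    """Return the preset 3D frame contour for a recognized category."""
    return CONTOUR_BY_CATEGORY[category]
```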
Step S204, determining the three-dimensional frame coordinate of the target object according to the three-dimensional frame profile, the two-dimensional frame information and the observation angle of the target object;
the two-dimensional frame information of the target object is a mark of the target object in the two-dimensional image obtained from the two-dimensional image, and is used for representing the position and the size of the target object in the two-dimensional picture. And the two-dimensional image is the projection of a target object in a three-dimensional space in a two-dimensional space, and since the target object has a projection relationship between the three-dimensional space and the two-dimensional space and the observation angle is equivalent to the angle at which the projection occurs, the three-dimensional frame coordinates of the object can be obtained according to the three-dimensional frame profile, the two-dimensional frame information and the observation angle of the target object.
In step S205, the three-dimensional frame contour and the three-dimensional frame coordinates of the target object are output.
Specifically, after the three-dimensional frame contour and the three-dimensional frame coordinates are obtained, the position and size of the target object in three-dimensional space can be completely located. The three-dimensional frame contour and the three-dimensional frame coordinates of the target object are output so that other subsequent functional devices can judge the distance between the target object and the running electronic device, approach or avoid the target object, and realize specific functions. Of course, the three-dimensional frame contour and the three-dimensional frame coordinates of the target object can also be displayed in graphic or text form through a display device for the user. The specific mode of outputting the three-dimensional frame contour and the three-dimensional frame coordinates of the target object is not limited here.
In this embodiment, the category information, the two-dimensional frame information and the observation angle of the target object are obtained by inputting the two-dimensional image of the target object into the recognition model, and the three-dimensional frame contour of the target object is obtained using its category information; the three-dimensional frame coordinate of the target object is then obtained using the three-dimensional frame contour, the two-dimensional frame information and the observation angle. The parameters of the three-dimensional space of the target object are thereby restored, the position information of the target object is accurately located, and the accuracy of three-dimensional space detection of the object is improved.
Fig. 4 is a flowchart of another object space parameter detection method according to the present invention. As shown in fig. 4, the object space parameter detection method provided in this embodiment further refines step S204 of the embodiment shown in fig. 2, adds a step of training the recognition model before the two-dimensional image is acquired, and adds steps of determining the yaw angle and displaying the three-dimensional frame after the three-dimensional frame coordinates are determined. The object space parameter detection method provided by this embodiment includes the following steps:
step S401, a training sample of the recognition model is obtained, and the training sample is a marked two-dimensional image.
Before the recognition model can recognize an input two-dimensional image, it needs to be trained to convergence so that the recognition effect can be achieved. Specifically, the training process includes: collecting a two-dimensional image of a target object and recording the spatial parameter state of the target object when the two-dimensional image is generated; for example, photographing an automobile with a camera, recording the observation angle of the camera relative to the automobile at the moment of photographing and the category information of the automobile, and marking the two-dimensional frame of the automobile in the photograph.
Step S402, inputting the marked two-dimensional image into the recognition model, and training the recognition model.
The two-dimensional photographs marked with the category information, observation angle and two-dimensional frame corresponding to the automobile are input into the model for training until the model converges. The converged recognition model can then recognize a two-dimensional image and correctly output the category information, the two-dimensional frame information and the observation angle of the target object.
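The sketch below shows a generic supervised training loop of the kind this step describes; the disclosure only says the model is trained on marked two-dimensional images until convergence, so the optimizer, learning rate, fixed epoch count and the assumption that the model returns a combined loss are all illustrative.

```python
import torch

def train_recognition_model(model, dataloader, epochs=10, lr=1e-3):
    """Generic supervised loop; assumes the model returns a combined loss over the
    marked category, 2D frame and observation angle targets (an assumption)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, targets in dataloader:  # targets: marked category, 2D frame, angle
            loss = model(images, targets)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```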
In step S403, a two-dimensional image corresponding to the target object is acquired.
Step S404, inputting the two-dimensional image into the recognition model trained to convergence to obtain the category information, the two-dimensional frame information and the observation angle of the target object, where the target object is framed by the two-dimensional frame.
Step S405, determining the three-dimensional frame contour of the target object according to the category information and the preset mapping relation between the category information and the three-dimensional frame contour of the target object.
The implementation manners of step S403 to step S405 are similar to the implementation manners of step S201 to step S203 in the embodiment shown in fig. 2, and are not described in detail here.
In step S406, the two-dimensional frame information includes the two-dimensional frame contour and the two-dimensional frame coordinate, and the two-dimensional frame coordinate is converted into the three-dimensional frame coordinate of the target object according to the projection relation of the three-dimensional frame contour and the two-dimensional frame contour under the observation angle.
Specifically, since the two-dimensional image is a projection of a three-dimensional object in a two-dimensional space, the three-dimensional frame profile and the two-dimensional frame profile of the target object have a certain proportional relationship, and the proportional relationship can measure the distance between the target object and the electronic device for acquiring the two-dimensional image. For example, the target object is an automobile, the electronic device takes a picture of the automobile through the camera to obtain a two-dimensional image, the closer the electronic device is to the automobile, the larger the two-dimensional frame profile of the automobile in the obtained two-dimensional image is, and the smaller the two-dimensional frame profile of the automobile in the obtained two-dimensional image is otherwise. Therefore, the distance between the automobile and the electronic device, namely the spatial depth information lost by the three-dimensional object in the process of projecting to the two-dimensional space, can be judged through the proportional relation between the three-dimensional frame profile and the two-dimensional frame profile of the automobile. According to the spatial depth information and the observation angle information, the conversion from the two-dimensional frame coordinate to the three-dimensional frame coordinate of the target object can be realized.
Alternatively, since the projection distance does not affect the aspect ratio of the two-dimensional frame contour, the size of the two-dimensional frame contour may be expressed in terms of its width or its height.
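The proportional relation described above is, under a standard pinhole camera model, the usual size-ratio depth cue; the disclosure does not fix an explicit formula for it, so the following sketch is only one common way to express that relation, with all parameter names assumed.

```python
def depth_from_height_ratio(focal_px: float, h3d_m: float, h2d_px: float) -> float:
    """Approximate object distance (meters) from the real 3D height and projected 2D height."""
    return focal_px * h3d_m / h2d_px

# Example: a 1.2 m tall car spanning 60 px under a 600 px focal length is roughly 12 m away.
print(depth_from_height_ratio(600.0, 1.2, 60.0))  # 12.0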
Optionally, as shown in fig. 5, step S406 includes two specific implementation steps of step S4061 and step S4062:
step S4061, estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame contour according to the two-dimensional frame contour, the observation angle and the two-dimensional frame coordinate.
In a specific embodiment, the two-dimensional frame profile is a rectangular frame, and the two-dimensional frame coordinates comprise a midpoint coordinate of an upper frame and a midpoint coordinate of a lower frame of the two-dimensional frame profile; the three-dimensional frame profile is a cuboid three-dimensional frame body, the three-dimensional frame coordinates comprise an upper surface central point coordinate and a lower surface central point coordinate of the three-dimensional frame profile, and the projection coordinates of the three-dimensional frame coordinates on the two-dimensional frame profile comprise an upper projection coordinate and a lower projection coordinate. According to the projection relation between the three-dimensional frame contour and the two-dimensional frame contour, when the three-dimensional frame coordinate is projected into the two-dimensional frame contour, the projection of the upper surface central point coordinate on the two-dimensional contour is an upper projection coordinate, the upper projection coordinate is near the upper frame central point coordinate, the projection of the lower surface central point coordinate on the two-dimensional contour is a lower projection coordinate, and the lower projection coordinate is near the lower frame central point coordinate. The proximity of the upper projection coordinate to the midpoint coordinate of the upper frame, and the proximity of the lower projection coordinate to the midpoint coordinate of the lower frame are determined by the viewing angle and the two-dimensional frame profile. More specifically, the smaller the viewing angle, the smaller the two-dimensional profile, the closer the upper projection coordinate is to the upper frame midpoint coordinate, and the closer the lower projection coordinate is to the lower frame midpoint coordinate. Therefore, from the two-dimensional frame profile, the observation angle, and the two-dimensional frame coordinates, the projection coordinates of the three-dimensional frame coordinates on the two-dimensional frame profile can be estimated approximately.
Optionally, as shown in fig. 6, step S4061 includes the following 3 specific steps:
in step S4061a, a first projection parameter is calculated from the two-dimensional frame contour.
Optionally, the two-dimensional frame information is B2d = (x2d, y2d, h2d, w2d), where x2d and y2d are the coordinates of the pixel point of the two-dimensional frame in the two-dimensional image, h2d is the height of the two-dimensional frame, and w2d is the width of the two-dimensional frame.
The first projection parameter is related to the size of the outline of the two-dimensional frame and is used for evaluating the closeness degree of the upper projection coordinate and the midpoint coordinate of the upper frame and the lower projection coordinate and the midpoint coordinate of the lower frame.
Specifically, the first projection parameter is λ1 = h2d/h0, where h0 is a constant statistically estimated from the data set and is set according to the different category information.
In step S4061b, a second projection parameter is calculated based on the viewing angle.
The second projection parameter is related to the observation angle and is also used for evaluating the closeness degree of the upper projection coordinate and the midpoint coordinate of the upper frame and the lower projection coordinate and the midpoint coordinate of the lower frame.
Specifically, the second projection parameter is λ2 = α/α0, where α is the observation angle and α0 is a constant statistically estimated from the data set, set according to the different category information.
Step S4061c, estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame contour according to the first projection parameter, the second projection parameter and the two-dimensional frame coordinate.
Optionally, the projection correction parameter is determined according to the first projection parameter and the second projection parameter, and may be expressed as formula (1), that is:
λ=w1·λ1+w2·λ2 (1)
where w1 and w2 are the weight coefficients corresponding to the first projection parameter and the second projection parameter, used to adjust their respective influence weights; the weights are adjusted according to the specific category information and the business application scenario.
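For illustration, the sketch below computes λ1, λ2 and the projection correction parameter of formula (1), together with the midpoints of the upper and lower edges of the 2D frame near which the projection coordinates are said to lie. The assumption that (x2d, y2d) is the top-left corner of the frame, and the default weights, are not fixed by the disclosure, and the exact λ-dependent offset from the midpoints is not given in this excerpt.

```python
def estimate_projection_points(box_2d, alpha, h0, alpha0, w1=0.5, w2=0.5):
    """Compute lambda1, lambda2, the correction parameter of formula (1), and the upper/lower
    edge midpoints of the 2D frame. (x2d, y2d) is assumed to be the top-left corner."""
    x2d, y2d, h2d, w2d = box_2d
    lam1 = h2d / h0              # first projection parameter
    lam2 = alpha / alpha0        # second projection parameter
    lam = w1 * lam1 + w2 * lam2  # projection correction parameter, formula (1)
    top_mid = (x2d + w2d / 2.0, y2d)           # midpoint of the upper edge
    bottom_mid = (x2d + w2d / 2.0, y2d + h2d)  # midpoint of the lower edge
    # The disclosure states the upper/lower projection coordinates lie near these midpoints,
    # with lam governing how close they are; the exact offset formula is not given here.
    return lam, top_mid, bottom_mid
```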
Step S4062, converting the projection coordinate into a three-dimensional frame coordinate according to a preset image conversion rule.
Optionally, an image conversion rule is preset in the electronic device, and the image conversion rule is used for describing how to convert the projection coordinates into the three-dimensional frame coordinates. After the projection coordinates are obtained, the projection coordinates may be converted into three-dimensional frame coordinates according to the preset image conversion rule.
Optionally, as shown in fig. 7, step S4062 includes the following 4 specific implementation steps:
step S40621, restoring the projection coordinate information to a unit three-dimensional coordinate according to a preset image conversion rule.
Specifically, the unit three-dimensional coordinate is a three-dimensional coordinate of unit scale, i.e., a normalized three-dimensional coordinate whose values on all three coordinate axes lie within [0, 1]. The preset image conversion rule includes a method for restoring the projection coordinate information into a unit three-dimensional coordinate, and the acquired projection coordinate is restored into a unit three-dimensional coordinate according to the preset image conversion rule.
Optionally, the electronic device includes an image acquisition sensor configured to acquire the two-dimensional image; as shown in fig. 8, step S40621 includes the following 2 specific implementation steps:
step S40621a, acquiring parameter information when the image acquisition sensor acquires a two-dimensional image, where the parameter information includes a preset image conversion rule, and the preset image conversion rule is a conversion relation between a projection coordinate and a unit three-dimensional coordinate.
Specifically, the parameter information is the set of acquisition parameters used when the image acquisition sensor acquires the two-dimensional image, for example the focal length. When the focal length differs, the two-dimensional image acquired by the image acquisition sensor for the same target object differs; specifically, the two-dimensional frame information differs, i.e., the two-dimensional frame contour and the two-dimensional frame coordinate differ, which leads to a difference in the projection coordinates. The parameter information therefore determines the conversion relationship between the two-dimensional image and the target object; however, determining the distance between the image acquisition sensor and the target object additionally requires the three-dimensional frame contour of the target object, so the parameter information here only includes the conversion relationship between the projection coordinate and the unit three-dimensional coordinate, i.e., the image conversion rule.
Step S40621b, restoring the projection coordinates to the unit three-dimensional coordinates according to the conversion relationship between the projection coordinates and the unit three-dimensional coordinates.
According to the preset image conversion rule included in the parameter information, the conversion relationship between the projection coordinates and the unit three-dimensional coordinates can be determined, and therefore, the unit three-dimensional coordinates can be determined according to the obtained projection coordinates.
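As a sketch of what such a conversion rule could look like, the function below back-projects a pixel coordinate using a standard pinhole intrinsic model and places the point at unit depth. The disclosure only states that the sensor parameters (e.g., the focal length) define the conversion relation, so this particular parameterization (fx, fy, cx, cy, z = 1) is an assumption.

```python
def restore_unit_coordinate(u: float, v: float, fx: float, fy: float, cx: float, cy: float):
    """Restore a projection (pixel) coordinate to a unit three-dimensional coordinate,
    assuming a pinhole intrinsic model and unit depth z = 1 (an assumption)."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)
```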
Step S40622, calculating the unit three-dimensional frame outline according to the unit three-dimensional coordinates.
Step S40623, calculating an amplification factor according to the ratio of the three-dimensional frame outline to the unit three-dimensional frame outline.
Alternatively, since the unit three-dimensional coordinates are restored from the projection of the cuboid-shaped three-dimensional frame contour, the unit three-dimensional coordinates include a unit upper center point coordinate corresponding to the coordinate of the center point of the upper surface of the three-dimensional frame contour and a unit lower center point coordinate corresponding to the coordinate of the center point of the lower surface of the three-dimensional frame contour. By subtracting the unit lower center point coordinate from the unit upper center point coordinate, the height hunit of the unit three-dimensional frame contour can be obtained.
Further, the amplification factor is expressed by equation (2):
δ = h3d/hunit (2)
where δ is the amplification factor, h3d is the height of the three-dimensional frame contour, and hunit is the height of the unit three-dimensional frame contour.
In step S40624, the product of the unit three-dimensional coordinates and the amplification factor is determined as the three-dimensional frame coordinates.
The three-dimensional frame coordinates can be obtained by multiplying the unit three-dimensional coordinates obtained in step S40621 by the amplification factor δ. Specifically, the three-dimensional frame coordinates are the coordinates of the center point of the upper surface and the center point of the lower surface of the three-dimensional frame profile.
Optionally, the coordinates of the center point of the upper surface and the center point of the lower surface of the three-dimensional frame profile may be averaged to obtain the center coordinate of the three-dimensional frame profile.
Optionally, after obtaining the three-dimensional frame coordinates, the coordinates of each vertex of the three-dimensional frame outline are calculated according to the three-dimensional frame coordinates and the three-dimensional frame outline;
according to specific service requirements, the coordinates of the upper surface central point and the lower surface central point of the three-dimensional frame profile, or the central coordinates of the three-dimensional frame profile, or the coordinates of each vertex of the three-dimensional frame profile can be used as the three-dimensional frame coordinates, and subsequent method steps are carried out.
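The scaling described in steps S40622 to S40624 can be sketched as follows; taking the vertical (second) component difference of the two unit center points as the unit height is an assumption about the axis convention, and the input names are illustrative only.

```python
def scale_unit_coordinates(unit_top, unit_bottom, h3d):
    """Apply equation (2): magnify unit 3D coordinates into metric 3D frame coordinates.

    unit_top / unit_bottom are the unit coordinates restored from the upper and lower
    projection points; h3d is the preset 3D frame contour height for the category."""
    h_unit = abs(unit_top[1] - unit_bottom[1])  # height of the unit 3D frame contour
    delta = h3d / h_unit                        # amplification factor, equation (2)
    top_3d = tuple(c * delta for c in unit_top)
    bottom_3d = tuple(c * delta for c in unit_bottom)
    return top_3d, bottom_3d
```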
And step S407, determining the yaw angle of the target object according to the observation angle and the three-dimensional frame coordinate.
Specifically, based on the observation angle α and the three-dimensional frame coordinates (x, y, z), the yaw angle is calculated according to the following equation (3):
θ=α+arctan(x/z) (3)
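A small numeric illustration of equation (3) follows; the assumption that angles are expressed in radians, and the example values, are not specified by the disclosure.

```python
import math

def yaw_angle(alpha: float, x: float, z: float) -> float:
    """Equation (3): yaw = observation angle + arctan(x / z); angles assumed in radians."""
    return alpha + math.atan(x / z)

# Example: observation angle 0.3 rad, object at x = 2 m, z = 10 m gives a yaw of about 0.497 rad.
print(yaw_angle(0.3, 2.0, 10.0))
```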
and step S408, displaying the corresponding three-dimensional frame according to the three-dimensional frame outline and the three-dimensional frame coordinate of the target object.
Specifically, the obtained three-dimensional frame contour and the three-dimensional frame coordinate are output to a display unit, and the corresponding three-dimensional frame is displayed in a graphic or text form, so that an intuitive target object detection result is provided for a user.
Fig. 9 is a schematic diagram of an object space parameter detection apparatus according to an embodiment of the present invention. As shown in fig. 9, the object space parameter detection apparatus 9 provided in this embodiment includes:
the acquisition module 91 is configured to acquire a two-dimensional image corresponding to a target object;
the recognition module 92 is configured to input the two-dimensional image into the recognition model trained to convergence to obtain the category information, the two-dimensional frame information and the observation angle of the target object, where the target object is framed by the two-dimensional frame;
the first determining module 93 is configured to determine the three-dimensional frame contour of the target object according to the category information and a preset mapping relationship between category information and the three-dimensional frame contour of the target object;
the second determining module 94 is configured to determine the three-dimensional frame coordinates of the target object according to the three-dimensional frame contour of the target object, the two-dimensional frame information and the observation angle;
and the output module 95 is configured to output the three-dimensional frame contour and the three-dimensional frame coordinates of the target object.
The acquisition module 91, the recognition module 92, the first determining module 93, the second determining module 94 and the output module 95 are connected in sequence. The object space parameter detection apparatus 9 provided in this embodiment may implement the technical solution of the method embodiment shown in fig. 2; the implementation principle and technical effect are similar and are not described here again.
Fig. 10 is a schematic diagram of an object space parameter detection apparatus according to an embodiment of the present invention, and as shown in fig. 10, the object space parameter detection apparatus 10 according to this embodiment further includes, on the basis of the object space parameter detection apparatus shown in fig. 9:
a sample obtaining module 1001, configured to obtain a training sample of the recognition model, where the training sample is a marked two-dimensional image;
the training module 1002 is configured to input the marked two-dimensional image to the recognition model, and train the recognition model.
The two-dimensional frame information includes: the two-dimensional frame profile and the two-dimensional frame coordinates, and the second determining module 94 is specifically configured to:
and converting the two-dimensional frame coordinate into the three-dimensional frame coordinate of the target object according to the projection relation between the three-dimensional frame profile and the two-dimensional frame profile under the observation angle.
Further, optionally, the second determining module 94, when converting the two-dimensional frame coordinate into the three-dimensional frame coordinate of the target object according to the projection relationship between the three-dimensional frame contour and the two-dimensional frame contour under the observation angle, is specifically configured to:
estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame contour according to the two-dimensional frame contour, the observation angle and the two-dimensional frame coordinate;
and converting the projection coordinate into a three-dimensional frame coordinate according to a preset image conversion rule.
Further, optionally, the second determining module 94, when estimating the projection coordinates of the three-dimensional frame coordinates on the two-dimensional frame profile according to the two-dimensional frame profile, the observation angle and the two-dimensional frame coordinates, is specifically configured to:
calculating a first projection parameter according to the two-dimensional frame outline;
calculating a second projection parameter according to the observation angle;
and estimating the projection coordinate of the three-dimensional frame coordinate on the two-dimensional frame outline according to the first projection parameter, the second projection parameter and the two-dimensional frame coordinate.
Optionally, the second determining module 94, when converting the projection coordinates into the three-dimensional frame coordinates according to the preset image conversion rule, is specifically configured to:
restoring the projection coordinate information into a unit three-dimensional coordinate according to a preset image conversion rule;
calculating the unit three-dimensional frame outline according to the unit three-dimensional coordinates;
calculating an amplification factor according to the ratio of the three-dimensional frame outline to the unit three-dimensional frame outline;
and determining the product of the unit three-dimensional coordinate and the magnification coefficient as the three-dimensional frame coordinate.
Optionally, the electronic device includes an image capturing sensor configured to capture the two-dimensional image; accordingly, when restoring the projection coordinate to a unit three-dimensional coordinate according to the preset image conversion rule, the second determining module 94 is specifically configured to:
acquiring parameter information when an image acquisition sensor acquires a two-dimensional image, wherein the parameter information comprises a preset image conversion rule which is a conversion relation between a projection coordinate and a unit three-dimensional coordinate;
and restoring the projection coordinates into the unit three-dimensional coordinates according to the conversion relation between the projection coordinates and the unit three-dimensional coordinates.
Optionally, the object space parameter detecting apparatus further includes:
and a yaw angle determining module 1003 for determining a yaw angle of the target object according to the observation angle and the three-dimensional frame coordinates.
Optionally, the object space parameter detecting apparatus further includes:
and the display module 1004 is configured to display the corresponding three-dimensional frame according to the three-dimensional frame profile and the three-dimensional frame coordinates of the target object.
The object space parameter detection apparatus provided in this embodiment may execute the object space parameter detection method provided in any embodiment corresponding to fig. 2 or fig. 4; the implementation principle and technical effect are similar and are not described here again.
Fig. 11 is a schematic view of an electronic device according to an embodiment of the present invention, and as shown in fig. 11, the electronic device according to this embodiment includes: a memory 1101, a processor 1102 and a computer program.
The computer program is stored in the memory 1101 and configured to be executed by the processor 1102 to implement the object space parameter detection method provided by any one of the embodiments corresponding to fig. 2 or fig. 4 of the present invention.
The memory 1101 and the processor 1102 are connected by a bus 1103.
The related description may be understood by referring to the related description and effect corresponding to the step in fig. 2 or fig. 4, and redundant description is not repeated here.
An embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the object space parameter detection method provided in any embodiment of the present invention corresponding to fig. 2 or fig. 4.
The computer readable storage medium may be, among others, ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of modules is merely a division of logical functions, and an actual implementation may have another division, for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.