Method and device for detecting the longitude and latitude of a target and the three-dimensional space attitude of a shooting device

Technical Field
The invention belongs to the field of detection of three-dimensional space, and particularly relates to a method and a device for detecting the longitude and latitude of a target and the three-dimensional space attitude of a shooting device.
Background
The technology for remotely measuring the direction and attitude of a target has long been a focus of research at home and abroad, with important practical value in fields such as battlefield decision-making, autonomous navigation, and change monitoring. Traditional detection methods rely on expensive and cumbersome equipment such as TOF depth cameras, Kinect sensors, and laser scanners, which makes detection costly and inconvenient.
Disclosure of Invention
The invention aims to provide a method and a device for detecting the longitude and latitude of a target and the three-dimensional space attitude of a shooting device, a computer-readable storage medium, and electronic equipment, so as to solve the problems of high cost and inconvenient detection inherent in methods based on expensive and cumbersome equipment such as TOF depth cameras, Kinect sensors, and laser scanners.
In a first aspect, the present invention provides a method for detecting the three-dimensional space attitude of a shooting device, the method comprising:
S101, constructing a group of vectors q related to the target three-dimensional space attitude;
S102, receiving a target image I shot by the shooting device;
S103, using machine learning to optimize the neural network model parameters W, according to the neural network model equation, over the sample set formed by the N groups of sample data (I1, q1), ..., (IN, qN), to obtain optimized neural network model parameters W;
S104, substituting the optimized neural network model parameters W and a newly received target image I shot by the shooting device into the neural network model equation to obtain the vector q;
and S105, calculating the three-dimensional space attitude R of the shooting device relative to the target from the vector q.
In a second aspect, the present invention provides a device for detecting a three-dimensional spatial attitude of a photographing device, the device comprising:
a construction module, which is used for constructing a group of vectors q related to the target three-dimensional space attitude;
the receiving module is used for receiving a target image I shot by the shooting device;
an optimization module, used for optimizing the neural network model parameters W, according to the neural network model equation, over the sample set formed by the N groups of sample data (I1, q1), ..., (IN, qN), using machine learning, to obtain optimized neural network model parameters W;
the vector calculation module is used for substituting the optimized neural network model parameters W and a newly received target image I shot by the shooting device into a neural network model equation to obtain a vector q;
and the resolving module is used for resolving through the vector q to obtain the three-dimensional space attitude R of the shooting device relative to the target.
In a third aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for detecting the three-dimensional space attitude of a shooting device as described above.
In a fourth aspect, the present invention provides a method for detecting longitude and latitude of a target, wherein the method includes:
s201, obtaining a three-dimensional space attitude R of the shooting device relative to a target and a three-dimensional space coordinate T of the shooting device relative to the target according to the detection method of the three-dimensional space attitude of the shooting device;
s202, inversely calculating the three-dimensional space coordinate of the target relative to the shooting device according to the three-dimensional space attitude R of the shooting device relative to the target and the three-dimensional space coordinate T of the shooting device relative to the target;
S203, indirectly solving the coordinate To of the target relative to the Earth geocentric coordinate system from the three-dimensional space coordinates of the target relative to the shooting device and the coordinate Tx and attitude Rx of the shooting device relative to the Earth geocentric coordinate system, and directly obtaining the longitude and latitude of the target from To.
In a fifth aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for detecting the longitude and latitude of a target as described above.
In the invention, machine learning is used to optimize the neural network model parameters W, according to the neural network model equation, over the sample set formed by the N groups of sample data (I1, q1), ..., (IN, qN), obtaining optimized neural network model parameters W; the optimized parameters W and a newly received target image I shot by the shooting device are substituted into the neural network model equation to obtain the vector q; and the three-dimensional space attitude R of the shooting device relative to the target is calculated from the vector q. The data source of the invention can therefore be real-time video from an ordinary monocular shooting device, which is low-cost and convenient, unlike traditional target detection based on expensive and cumbersome equipment such as TOF depth cameras, Kinect sensors, and laser scanners. The method can also predict the moving direction of a target from a static picture, establish a mapping of the target from video to map, and provide support for related application extensions.
Drawings
Fig. 1 is a flowchart of a method for detecting a three-dimensional spatial pose of a camera according to an embodiment of the present invention.
Fig. 2 is a functional block diagram of a detection apparatus for detecting a three-dimensional spatial attitude of a camera according to a second embodiment of the present invention.
Fig. 3 is a flowchart of a method for detecting latitude and longitude of a target according to a fourth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more clearly apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
The first embodiment is as follows:
Referring to fig. 1, a method for detecting the three-dimensional space attitude of a shooting device according to the first embodiment of the present invention includes the following steps. It should be noted that the method is not limited to the flow sequence shown in fig. 1, provided the results are substantially the same.
And S101, constructing a group of vectors q related to the target three-dimensional space attitude.
In the first embodiment of the present invention, the vector q related to the target three-dimensional space attitude may be: a quaternion {q0, q1, q2, q3}, an attitude matrix, or three attitude angles {a, b, c}. The vector q degenerates to a two-component vector when the plane defined by two of the three spatial dimensions is perpendicular to the line-of-sight direction of the shooting device.
And S102, receiving a target image I shot by the shooting device.
S103, using machine learning to optimize the neural network model parameters W, according to the neural network model equation, over the sample set formed by the N groups of sample data (I1, q1), ..., (IN, qN), to obtain optimized neural network model parameters W.
In the first embodiment of the present invention, the neural network model equations are
f(W, I1) = q1
...
f(W, IN) = qN.
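The optimization of S103 can be sketched as follows. This is a minimal illustration only: a linear map W stands in for the neural network f, and the function name, learning rate, and epoch count are illustrative assumptions rather than details taken from the invention.

```python
import numpy as np

def f(W, I):
    """Linear stand-in for the neural network model f(W, I) = q."""
    return W @ I

def optimize_W(samples, dim_q, dim_i, lr=0.1, epochs=500):
    """Optimize W over the sample set (I1, q1), ..., (IN, qN) by
    gradient descent on the squared error of f(W, Ik) against qk."""
    W = np.zeros((dim_q, dim_i))
    for _ in range(epochs):
        for I_k, q_k in samples:
            err = f(W, I_k) - q_k           # residual for this sample
            W -= lr * np.outer(err, I_k)    # gradient of 0.5*||err||^2 w.r.t. W
    return W
```

Once optimized, W is reused in S104 exactly as the equations above describe: a new image I is substituted into f(W, I) to predict q.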
Machine learning (ML) is an interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other subjects.
When forward propagation of the neural network model is established, the output layer outputs 4 values representing the target three-dimensional space attitude; since the range of values output by the neural network is (−∞, +∞), while a quaternion representing the target three-dimensional space attitude is subject to the constraint that its sum of squares equals 1, i.e. q0² + q1² + q2² + q3² = 1, the output must be normalized. Therefore, when the vector q is a quaternion, the output of the neural network model is processed as follows:

The vector Q output by the last output layer of the neural network model passes through a unitization constraint layer, which outputs the quaternion vector q = {q0, q1, q2, q3}. The calculation process is:

Forward propagation formula: qi = Qi / √(Q0² + Q1² + Q2² + Q3²), where i = 0..3.

This ensures that {q0, q1, q2, q3} satisfies the unit-vector constraint q0² + q1² + q2² + q3² = 1.
Back propagation formula: ∂E/∂Qi = (∂E/∂qi − qi · Σj qj · ∂E/∂qj) / √(Q0² + Q1² + Q2² + Q3²),

where E is the error function E = ½ · Σi (qi − q̂i)², and q̂i is the expected value of the i-th component of the quaternion.
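The unitization constraint layer admits a direct implementation. The sketch below follows the forward formula qi = Qi/‖Q‖ and its back-propagation through the Jacobian dqj/dQi = (δij − qi·qj)/‖Q‖; the function names are illustrative.

```python
import numpy as np

def unit_forward(Q):
    """Unitization constraint layer, forward pass: q_i = Q_i / ||Q||,
    so the output satisfies sum(q_i^2) == 1."""
    return Q / np.linalg.norm(Q)

def unit_backward(Q, dE_dq):
    """Back-propagation through the unitization layer, using the
    Jacobian dq_j/dQ_i = (delta_ij - q_i*q_j) / ||Q||."""
    norm = np.linalg.norm(Q)
    q = Q / norm
    return (dE_dq - q * np.dot(q, dE_dq)) / norm
```

A finite-difference check against the squared-error function E = ½·Σ(qi − q̂i)² confirms the gradient.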
A quaternion predicts the three-dimensional space attitude; degenerated to a two-component vector, it predicts a direction in a two-dimensional plane, i.e. the attitude of a planar target within that plane. For example, this can be used in aerial photography to predict the heading of a ground target.
Therefore, when the vector q is a two-component vector, the output of the neural network model is processed as follows:
The vector Q output by the last output layer of the neural network model passes through a unitization constraint layer, which outputs the two-component vector q = {q0, q1}. The calculation process is:

Forward propagation formula: qi = Qi / √(Q0² + Q1²), where i = 0, 1.

This ensures that {q0, q1} satisfies the unit-vector constraint q0² + q1² = 1.
Back propagation formula: ∂E/∂Qi = (∂E/∂qi − qi · Σj qj · ∂E/∂qj) / √(Q0² + Q1²),

where E is the error function E = ½ · Σi (qi − q̂i)², and q̂ is the expected unit direction vector of the target on the plane.
S104, substituting the optimized neural network model parameters W and a newly received target image I shot by the shooting device into the neural network model equation to obtain the vector q.
In the first embodiment of the present invention, the neural network model equation is f(W, I) = q.
S105, calculating the three-dimensional space attitude R of the shooting device relative to the target from the vector q.
In the first embodiment of the present invention, the vector q may be a quaternion, the coordinates of n feature points on the image (with n ≥ 3), a rotation vector, a rotation matrix, or the like.
When the vector q is a quaternion, the three-dimensional space attitude R of the shooting device relative to the target can be calculated by the standard quaternion-to-rotation-matrix conversion:

R = [[1 − 2(q2² + q3²), 2(q1q2 − q0q3), 2(q1q3 + q0q2)],
     [2(q1q2 + q0q3), 1 − 2(q1² + q3²), 2(q2q3 − q0q1)],
     [2(q1q3 − q0q2), 2(q2q3 + q0q1), 1 − 2(q1² + q2²)]].
when the vector q is the coordinates P of n feature points on the image1,…,PNDuring the shooting process, the three-dimensional space posture R and the position T of the shooting device relative to the target can be solved through the corresponding relation of the computer vision object image, and the three-dimensional space posture R of the shooting device relative to the target and the three-dimensional space coordinate T of the shooting device relative to the target can be obtained through a cv:: solvePp function in an OpenCV library function.
When the vector q is a rotation vector, it can be converted into the three-dimensional space attitude R of the shooting device relative to the target by the cv::Rodrigues function in the OpenCV library.
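For reference, the conversion that cv::Rodrigues performs can be written out directly. The sketch below implements the standard Rodrigues rotation formula R = I + sin θ·K + (1 − cos θ)·K² in plain NumPy; the function name is illustrative.

```python
import numpy as np

def rodrigues(rvec):
    """Convert a rotation vector to a 3x3 rotation matrix R
    (the same conversion cv::Rodrigues performs). The vector's
    norm is the rotation angle, its direction the rotation axis."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)            # zero rotation
    kx, ky, kz = rvec / theta       # unit rotation axis
    K = np.array([[0.0, -kz,  ky],
                  [ kz, 0.0, -kx],
                  [-ky,  kx, 0.0]]) # cross-product (skew-symmetric) matrix
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)
```

For example, a rotation vector of magnitude π/2 about the z-axis maps the x-axis onto the y-axis.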
In the first embodiment of the present invention, after S105, the method may further include:
According to the pinhole imaging model, the three-dimensional space coordinates T of the shooting device relative to the target are approximated as

z ≈ fx·D/Δu ≈ fy·D/Δv,
T ≈ [(u − cx)·z/fx, (v − cy)·z/fy, z]ᵀ,

where cx, cy are the coordinates of the principal point of the shooting device, fx, fy are its pixel focal lengths, D is the diameter of the target, Δu and Δv are respectively the width and height in pixels of the target as identified on the image, (u, v) is the center point of the target on the image, and z is the length of the target's projection along the line-of-sight direction in the coordinate system of the shooting device.
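A minimal sketch of this approximation follows; taking the depth z as the average of the two similar-triangle estimates fx·D/Δu and fy·D/Δv is an assumption here, and the function name is illustrative.

```python
import numpy as np

def approx_target_position(u, v, du, dv, D, fx, fy, cx, cy):
    """Approximate the camera-frame position T of a target of known
    diameter D from its image box: center (u, v), pixel size (du, dv).
    Depth comes from similar triangles of the pinhole model; x and y
    come from back-projecting the box center at that depth."""
    z = 0.5 * (fx * D / du + fy * D / dv)  # depth from apparent size
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```

A target of 2 m diameter filling 100 px of a 1000 px focal length camera, centered on the principal point, lands at roughly [0, 0, 20] m.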
In the first embodiment of the present invention, after S105, the method may further include:
constructing a group of vectors Z related to the target three-dimensional space position and attitude. In the first embodiment of the present invention, the vector Z may be: a quaternion {q0, q1, q2, q3} together with the projection parameter z of the target onto the line-of-sight direction of the shooting device.
Receiving a target image I shot by a shooting device and a circumscribed rectangular frame coordinate r of a target in the image, wherein the circumscribed rectangular frame coordinate r of the target in the image can be obtained by the prior art;
using machine learning to optimize the neural network model parameters W over the N groups of sample data (I1, r1, z1), ..., (IN, rN, zN) according to the following neural network model equations, obtaining optimized neural network model parameters W:
f(W, I1, r1) = z1
...
f(W, IN, rN) = zN;
substituting the optimized neural network model parameters W, a newly received target image I shot by the shooting device, and its rectangular frame coordinates r into the neural network model equation f(W, I, r) = Z to obtain the vector Z;
and calculating to obtain the three-dimensional space coordinate T of the shooting device relative to the target through the vector Z and the three-dimensional space attitude R of the shooting device relative to the target.
In the first embodiment of the present invention, the vector Z may be any quantity related to the target position; in particular, Z may be the z-component λ of the vector K·R·T, with λ computed from K·R·T during machine learning and used as the training value of Z. At prediction time, R is predicted first, and then the λ predicted by the neural network is substituted into the formula

T = λ · (K·R)⁻¹ · [u, v, 1]ᵀ,

where K is the intrinsic matrix formed from the principal point coordinates cx, cy and pixel focal lengths fx, fy of the shooting device, R is the three-dimensional space attitude of the shooting device relative to the target, and (u, v) are the coordinates of the target origin on the image; u and v can be obtained from the image point of the target origin on the image, or approximated by the center point of the target's rectangular frame.
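Assuming the projection relation K·R·T = λ·[u, v, 1]ᵀ, recovering T from the predicted λ is a single linear solve; the function name below is illustrative.

```python
import numpy as np

def position_from_lambda(lam, u, v, K, R):
    """Recover T from the predicted scale lambda (the z-component of
    K @ R @ T), under the relation K @ R @ T = lam * [u, v, 1]^T.
    Solving the linear system avoids forming the explicit inverse."""
    return lam * np.linalg.solve(K @ R, np.array([u, v, 1.0]))
```

Round-tripping a known T through the projection and back verifies the relation.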
Example two:
referring to fig. 2, a device for detecting a three-dimensional attitude of a camera according to a second embodiment of the present invention includes:
a construction module 11, configured to construct a set of vectors q associated with the target three-dimensional spatial pose;
the receiving module 12, used for receiving a target image I shot by the shooting device;
an optimization module 13, used for optimizing the neural network model parameters W, according to the neural network model equation, over the sample set formed by the N groups of sample data (I1, q1), ..., (IN, qN), using machine learning, to obtain optimized neural network model parameters W;
the vector calculation module 14, configured to substitute the optimized neural network model parameters W and a newly received target image I captured by the shooting device into the neural network model equation to obtain the vector q;
and the calculating module 15, used for calculating the three-dimensional space attitude R of the shooting device relative to the target through the vector q.
The device for detecting the three-dimensional space attitude of a shooting device provided by the second embodiment of the invention and the method for detecting the three-dimensional space attitude of a shooting device provided by the first embodiment belong to the same concept; for the specific implementation process, refer to the full description above, which is not repeated here.
Example three:
a third embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for detecting a three-dimensional spatial attitude of a shooting device according to the first embodiment of the present invention is implemented.
Example four:
Referring to fig. 3, a method for detecting the longitude and latitude of a target according to the fourth embodiment of the present invention includes the following steps. It should be noted that the method is not limited to the flow sequence shown in fig. 3, provided the results are substantially the same.
S201, obtaining a three-dimensional space attitude R of the shooting device relative to the target and a three-dimensional space coordinate T of the shooting device relative to the target according to the detection method of the three-dimensional space attitude of the shooting device provided by the embodiment of the invention.
S202, inversely calculating the three-dimensional space coordinates of the target relative to the shooting device from the three-dimensional space attitude R of the shooting device relative to the target and the three-dimensional space coordinates T of the shooting device relative to the target.
S203, indirectly solving the coordinate To of the target relative to the Earth geocentric coordinate system from the three-dimensional space coordinates of the target relative to the shooting device and the coordinate Tx and attitude Rx of the shooting device relative to the Earth geocentric coordinate system, and directly obtaining the longitude and latitude of the target from To.
Tx can be obtained from the GPS of the shooting device, and Rx from a gyroscope, magnetometer, and accelerometer bound to the shooting device.
The three-dimensional space coordinates of the target relative to the shooting device are obtained by inversion as Tt = −Rᵀ·T.

Rx = Rg·Rv, where Rg is the attitude data, measured by the gyroscope of the shooting device, relative to the local east-north-up coordinate system, and Rv is the rotation from the east-north-up coordinate system to the Earth geocentric coordinate system:

Rv = [[−sin φ, −sin θ·cos φ, cos θ·cos φ],
      [ cos φ, −sin θ·sin φ, cos θ·sin φ],
      [ 0,      cos θ,        sin θ ]],

where φ is the longitude of the shooting device and θ is its latitude.
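A minimal sketch of S203 follows, under a spherical-Earth assumption: the target is placed in the Earth geocentric frame as To = Rx·Tt + Tx, and longitude/latitude are read off from To. Geodetic (WGS-84) latitude would need an additional ellipsoid correction not detailed here; the function name is illustrative.

```python
import math
import numpy as np

def target_latlon(T_target_in_camera, Rx, Tx):
    """S203 sketch: rotate the camera-frame target coordinates into the
    Earth geocentric frame, translate by the camera's geocentric
    position Tx, then convert To to longitude/latitude in degrees
    (spherical approximation)."""
    To = Rx @ T_target_in_camera + Tx
    x, y, z = To
    lon = math.degrees(math.atan2(y, x))
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    return lon, lat
```

A target sitting on the x-axis of the geocentric frame comes out at longitude 0°, latitude 0°; one on the z-axis at latitude 90°.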
In the fourth embodiment of the present invention, S202 may further include the following steps:
and inversely calculating the three-dimensional space posture of the target relative to the shooting device according to the three-dimensional space posture R of the shooting device relative to the target.
S203 may further include the steps of:
according to the three-dimensional space attitude RT of the target relative to the shooting device and the attitude Rx of the shooting device relative to the Earth geocentric coordinate system, indirectly calculating the attitude Ro of the target relative to the Earth geocentric coordinate system by the formula: Ro = RT·Rx.
Example five:
a fifth embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for detecting a target longitude and latitude provided in the fourth embodiment of the present invention is implemented.
In the invention, machine learning is used to optimize the neural network model parameters W, according to the neural network model equation, over the sample set formed by the N groups of sample data (I1, q1), ..., (IN, qN), obtaining optimized neural network model parameters W; the optimized parameters W and a newly received target image I shot by the shooting device are substituted into the neural network model equation to obtain the vector q; and the three-dimensional space attitude R of the shooting device relative to the target is calculated from the vector q. The data source of the invention can therefore be real-time video from an ordinary monocular shooting device, which is low-cost and convenient, unlike traditional target detection based on expensive and cumbersome equipment such as TOF depth cameras, Kinect sensors, and laser scanners. The method can also predict the moving direction of a target from a static picture, establish a mapping of the target from video to map, and provide support for related application extensions.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by hardware instructed by a program, which may be stored in a computer-readable storage medium; the storage medium may include: a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.