Three-dimensional reconstruction method based on structured light
Technical Field
The invention relates to the technical field of three-dimensional reconstruction, in particular to a three-dimensional reconstruction method based on structured light.
Background
Computer vision enables a computer to acquire descriptions and information about the objective world by processing images or image sequences, helping people better understand the content of those images. Three-dimensional reconstruction is a branch of computer vision and a research direction that combines computer vision with computer graphics and image processing. It is widely applied in industrial automation, reverse engineering, cultural relic protection, computer-assisted medicine, virtual reality, augmented reality, robotics and other scenarios.
Structured light three-dimensional reconstruction is one of the important techniques in computer vision. However, most existing methods require multiple projections of the designed pattern to obtain a closed-form solution, which makes them unable to measure dynamic objects. Most related systems are based on reconstructing three-dimensional color images, edge detection and feature-matching algorithms, in which the three color channels are processed independently as R, G and B; this artificially strips away the correlation among the color channels of the image and reduces the reliability of detection.
The binocular stereo vision technique captures a left image and a right image of an object with two cameras at two angles, finds the corresponding (same-name) points in the two images with a stereo matching algorithm, and computes the three-dimensional coordinates of the measured object by triangulation, using the internal and external parameters of the cameras. Binocular stereo vision needs no actively projected pattern and has a simple hardware structure, but for objects with little surface texture it suffers from low point-cloud accuracy, slow reconstruction and frequent matching errors. The structured light technique projects a specific coded pattern onto the object surface with a projector, captures the pattern modulated by the surface with a camera, and recovers the depth information of the object by decoding the pattern. Structured light reconstruction has high precision and speed and obtains a good reconstruction even for objects with little surface texture; however, traditional structured light reconstruction systems are mostly monocular, require calibrating the projector in order to compute depth information, and projector calibration is extremely tedious.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a three-dimensional reconstruction method combining binocular stereo vision with structured light: the structured light of active vision increases the texture features of the object surface, while the three-dimensional reconstruction itself is realized through passive vision, which helps improve the precision and efficiency of the reconstruction and saves cost.
A three-dimensional reconstruction method based on structured light comprises the following steps:
S1, building a three-dimensional reconstruction system, wherein the three-dimensional reconstruction system comprises two cameras, a projector and a computer;
S2, calibrating the cameras and obtaining their internal and external parameters;
S3, solving a distortion mapping matrix according to the parameters of the cameras;
S4, projecting the designed RGB point structured light pattern onto the object by using the projector;
S5, acquiring a left image and a right image of the reconstructed object by using the cameras of the binocular stereo vision system, and performing stereo correction on the images;
S6, based on the area similarity of the points, segmenting the red, green and blue points of the left image and the right image respectively by a three-channel combined RGB point segmentation method;
S7, clustering the points with the same color according to the Euclidean distance;
S8, matching the points in the left view with the points in the right view, and obtaining the parallax of each pair of matching points on the left and right images relative to the corresponding point P on the object;
S9, combining the internal and external parameters of the cameras, obtaining the three-dimensional space coordinates of each point on the object by using the parallax principle;
S10, generating a sparse point cloud of the object from the three-dimensional space coordinates of the points on the object, completing the three-dimensional reconstruction of the object.
Preferably, the step S2 includes the following sub-steps:
S21, calibrating the left camera to obtain its internal and external parameters, wherein the external parameters comprise a rotation matrix and a translation matrix;
S22, taking the internal parameters and rotation matrix of the right camera to be the same as those of the left camera, and obtaining the translation matrix of the right camera from the translation matrix of the left camera and the distance between the two cameras, thereby obtaining the internal and external parameters of the right camera.
Preferably, in step S4, the RGB structured-light dot pattern is a structured light dot matrix based on the RGB three primary colors.
Preferably, the RGB structured-light dot pattern is a structured light dot matrix in which every dot in a row has the same color (red, green or blue) and adjacent rows have different colors.
Preferably, in S4, the structured light is projected toward the object from the front of the object.
Preferably, the step S5 includes the following sub-steps:
S51, obtaining a left correction matrix and a right correction matrix based on the internal and external parameters and the distortion mapping matrix of the left camera and those of the right camera, respectively;
S52, performing stereo correction on the left image with the left correction matrix and on the right image with the right correction matrix; after correction, a point in the left image and its matching point in the right image lie on the same scan line, i.e., the two points share the same y-axis coordinate.
Preferably, the step S6 includes the following sub-steps:
S601, calculating an initial threshold value T for the point image of the R channel by using the threshold selection method based on slope difference distribution;
S602, segmenting the point image with the threshold T, and labeling all points in the segmented binary image I_1 as:

I_1(x, y) = k, if the pixel (x, y) belongs to the k-th point, and I_1(x, y) = 0 otherwise (1);

wherein (x, y) is the index of the binary image;
S603, with the image resolution denoted as N_X × N_Y, defining the set X = {1, 2, ..., N_X} and the set Y = {1, 2, ..., N_Y}, the index set of the k-th labeled point is calculated as:

(X_k, Y_k) = {(x, y) | I_1(x, y) = k} (2);

and the area A_k of the k-th labeled point is calculated as:

A_k = |X_k| = |Y_k| (3);

S604, sorting the area set {A_k} into the ordered set {A_{B,i}; i = 1, 2, ..., N_B}, where N_B is the number of segmented points, satisfying the following condition:

A_{B,1} ≤ A_{B,2} ≤ ... ≤ A_{B,N_B} (4);

Since the areas of the points are similar, the differences between the sorted areas should not be too large if all points are segmented accurately enough; therefore, the accuracy of the segmentation result can be judged from the calculated differences of the sorted areas;
S605, calculating the difference D_i of the sorted areas as:

D_i = A_{B,i+1} - A_{B,i} (5);

and the maximum difference as:

D_max = max D_i, i = 1, 2, ..., N_B - 1 (6);

If the maximum difference D_max is greater than a threshold (calculated as one tenth of the median of the area set), the selected global threshold T is smaller than the optimal threshold, so some adjacent points in the segmentation result are merged into one point; the optimal threshold is defined as the threshold that can separate all bright and dark spots from the background. On the other hand, a threshold smaller than the optimal threshold segments the bright spots more completely; therefore, some of the points segmented by the smaller threshold should be used, and an area threshold is used to select which segmented points to use;
S606, calculating the average value A_m of the area set {A_{B,i}}:

A_m = (1 / N_B) Σ_{i=1}^{N_B} A_{B,i} (7);

S607, the index set of the segmented points whose areas are smaller than A_m is calculated as:

(X_s, Y_s) = {(x, y) | I_1(x, y) = k and A_k < A_m} (8);

S608, updating the global threshold as:

T = T + ΔT (9);

where ΔT is the step size of the loop, an integer greater than or equal to 1; ΔT = 10 is selected to accelerate convergence;
S609, segmenting the image again with the updated threshold;
S610, repeating steps S601 to S609 until D_max is smaller than one tenth of the median of the area set;
S611, after the above steps have been repeated m times, the index set (X_m, Y_m) of all segmented points of the m-th segmentation result I_m is:

(X_m, Y_m) = {(x, y) | I_m(x, y) > 0} (10);

S612, the finally segmented image I_R of resolution N_X × N_Y is initialized as:

I_R(x, y) = 0 (11);

and then calculated as:

I_R(X_m, Y_m) = 1 (12);

S613, segmenting the images of the G channel and the B channel in the same way to obtain I_G and I_B, and adding the segmentation results to form the final segmentation result.
Preferably, the step S7 includes the following substeps:
S71, in each channel, dilating the segmented points five times with a structuring element B = {0, 0}, connecting adjacent points to form straight-line images;
S72, multiplying the clustered straight-line image in each channel by the corresponding segmented point image to generate a clustered point image; in each channel, points on different lines are assigned different identification numbers, including line identification numbers and column identification numbers.
Preferably, the step S8 includes the following substeps:
S81, in the two views of each channel, the points are first matched according to their line identification numbers, and the points with the same line identification number are then matched according to their column identification numbers, so that the clustered points in the two views are matched;
S82, obtaining the pixel coordinates of each pair of matched corner points of the left and right images, namely a corner point l(x_l, y_l) of the left image and the corresponding corner point r(x_r, y_r) of the right image;
S83, since the images have been stereo-corrected to achieve line alignment, the y-axis coordinates of point l and point r are the same, and the parallax of the pair of matching points on the left and right images relative to the point P on the object can be directly expressed as d = x_l - x_r.
Preferably, the step S9 includes the following substeps:
S91, according to the parallax of the corresponding matching points on the left and right images relative to the point P on the object and the optical centers of the left and right cameras, the triangle Plr formed by the point P and the matched image points l and r is similar to the triangle PO_lO_r, and the similar-triangle proportion formula is:

(T - (x_l - x_r)) / (Z - f) = T / Z (13);

where T is the distance between the optical centers of the left and right cameras, d = x_l - x_r is the parallax of the corresponding matching points on the left and right images relative to the point P on the object, f is the focal length of the left and right cameras, Z is the depth value of the point P, O_l is the optical center of the left camera, and O_r is the optical center of the right camera;
S92, solving formula (13) gives the depth Z = fT/d of the point P and, from it, the three-dimensional coordinates (X, Y, Z) of the point P;
finally, the three-dimensional coordinates (X, Y, Z) of all the points on the image are obtained.
The invention has the beneficial effects that: the invention combines binocular stereo vision with structured light, avoids calibration of the projector, and simplifies the steps of three-dimensional reconstruction; the designed dual-view reconstruction method needs only one projection, so it can measure dynamic objects; based on the designed RGB structured-light dot pattern and the area similarity of the dot pattern, an iterative point segmentation method is provided that can effectively segment the red, green and blue points of the three-channel combined RGB dot pattern respectively, which facilitates subsequent unsupervised point clustering and allows the points in the left and right views to be matched rapidly; and the three-dimensional reconstruction is realized through passive vision, which helps improve the precision and efficiency of the reconstruction and gives a good reconstruction effect even on targets with dull colors, sparse textures and little occlusion.
Drawings
FIG. 1 is a flow chart of the method for reconstructing the surface of an object based on a structured light point pattern;
FIG. 2 is a schematic diagram of the imaging system that is built;
FIG. 3 is the designed RGB point structured light pattern;
FIG. 4 is a photograph of a spherical object onto which the regular structured light of the present invention is projected;
FIG. 5 is the imaging model of the three-dimensional reconstruction;
FIG. 6 is a schematic view of the similar triangles of the parallax principle according to the present invention.
Detailed Description
The geometric model adopted by the invention is shown in FIG. 5, wherein O_l is the optical center of the left camera, O_r is the optical center of the right camera, and P is any point in space; the optical centers of the two cameras and the point P form a plane PO_lO_r. P_l and P_r are the image points of P in the left and right cameras, respectively, and are called a pair of homonymous points; the intersection lines L_Pl and L_Pr of the plane PO_lO_r with the left and right image planes are called a pair of epipolar lines.
The invention is further illustrated by the following figures and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings.
Example 1
Referring to FIG. 1 and FIG. 2, a three-dimensional reconstruction method based on structured light includes the following steps:
S1, building a three-dimensional reconstruction system, wherein the three-dimensional reconstruction system comprises two cameras, a projector and a computer;
S2, calibrating the cameras and obtaining their internal and external parameters;
The step S2 includes the following sub-steps:
S21, calibrating the left camera to obtain its internal and external parameters, wherein the external parameters comprise a rotation matrix and a translation matrix;
S22, taking the internal parameters and rotation matrix of the right camera to be the same as those of the left camera, and obtaining the translation matrix of the right camera from the translation matrix of the left camera and the distance between the two cameras, thereby obtaining the internal and external parameters of the right camera.
That is, the rotation matrices and translation matrices of the calibrated left and right cameras are R_1, t_1 and R_2, t_2 respectively, wherein R_1 = R_2, t_1 = (x, y, z)^T, t_2 = (x + d, y, z)^T, and d is the translation distance from the left camera to the right camera.
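For illustration, a minimal sketch of S21 and S22 is given below, assuming OpenCV and a checkerboard calibration target; OpenCV and the helper names are choices of this sketch, not requirements of the invention.

```python
import cv2
import numpy as np

def calibrate_left(obj_pts, img_pts, image_size):
    """S21 sketch: calibrate the left camera from checkerboard detections.
    obj_pts / img_pts are the 3-D corner positions and their detected pixel
    positions per calibration image (hypothetical inputs)."""
    _, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_pts, img_pts, image_size, None, None)
    R1, _ = cv2.Rodrigues(rvecs[0])   # rotation matrix of the left camera
    t1 = tvecs[0].reshape(3)          # translation vector of the left camera
    return K, dist, R1, t1

def right_extrinsics(R1, t1, d):
    """S22 sketch: with shared intrinsics and rotation, the right camera's
    translation follows from the baseline d: t2 = t1 + (d, 0, 0)^T."""
    return R1.copy(), t1 + np.array([d, 0.0, 0.0])
```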
S3, solving a distortion mapping matrix according to the parameters of the camera;
S4, referring to FIG. 2, FIG. 3 and FIG. 4, projecting the designed RGB point structured light pattern onto the object by using the projector;
In step S4, the RGB structured-light dot pattern is a structured light dot matrix based on the RGB three primary colors, and the structured light is projected onto the object from the front of the object.
In this embodiment, the RGB structured-light dot pattern is a dot matrix in which every dot in a row has the same color (red, green or blue) and adjacent rows have different colors. A structured light dot matrix of the RGB three primary colors is chosen because a colored dot pattern is more favorable for matching the feature points.
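As an illustration of such a pattern, the sketch below renders a dot matrix in which each row of dots has one color and the rows cycle through red, green and blue so adjacent rows differ; the resolution, dot pitch and dot radius are illustrative assumptions.

```python
import numpy as np
import cv2

def make_rgb_dot_pattern(width=1280, height=800, pitch=40, radius=4):
    """Rows of dots; every dot in a row shares one color, adjacent rows differ."""
    pattern = np.zeros((height, width, 3), dtype=np.uint8)
    colors = [(0, 0, 255), (0, 255, 0), (255, 0, 0)]  # BGR: red, green, blue
    for i, y in enumerate(range(pitch // 2, height, pitch)):
        color = colors[i % 3]          # cycle R, G, B so adjacent rows differ
        for x in range(pitch // 2, width, pitch):
            cv2.circle(pattern, (x, y), radius, color, -1)
    return pattern

# The pattern image would then be sent to the projector (S4).
cv2.imwrite("rgb_dot_pattern.png", make_rgb_dot_pattern())
```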
S5, acquiring a left image and a right image of the reconstructed object by using the cameras of the binocular stereo vision system, and performing stereo correction on the images;
the step S5 includes the following sub-steps:
S51, obtaining a left correction matrix and a right correction matrix based on the internal and external parameters and the distortion mapping matrix of the left camera and those of the right camera, respectively;
S52, performing stereo correction on the left image with the left correction matrix and on the right image with the right correction matrix; after correction, a point in the left image and its matching point in the right image lie on the same scan line, i.e., the two points share the same y-axis coordinate.
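A possible OpenCV rendering of S3 and S51-S52 is sketched below: stereoRectify supplies the correction matrices, initUndistortRectifyMap supplies the distortion mapping matrices, and remap produces the row-aligned images. Treating OpenCV as the toolchain is an assumption of this sketch.

```python
import cv2

def rectify_pair(img_l, img_r, K, dist, R, t, image_size):
    """Sketch of S3 and S51-S52; per the embodiment the two cameras share
    intrinsics K and distortion dist, and (R, t) relate the two cameras."""
    # S51: left and right correction (rectification) matrices
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K, dist, K, dist, image_size, R, t)
    # S3: distortion mapping matrices for each camera
    m1x, m1y = cv2.initUndistortRectifyMap(K, dist, R1, P1, image_size, cv2.CV_32FC1)
    m2x, m2y = cv2.initUndistortRectifyMap(K, dist, R2, P2, image_size, cv2.CV_32FC1)
    # S52: after remapping, matching points share the same scan line (y)
    left = cv2.remap(img_l, m1x, m1y, cv2.INTER_LINEAR)
    right = cv2.remap(img_r, m2x, m2y, cv2.INTER_LINEAR)
    return left, right
```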
S6, based on the area similarity of the points, segmenting the red, green and blue points of the left image and the right image respectively by a three-channel combined RGB point segmentation method;
the step S6 includes the following sub-steps:
S601, calculating an initial threshold value T for the point image of the R channel by using the threshold selection method based on slope difference distribution;
S602, segmenting the point image with the threshold T, and labeling all points in the segmented binary image I_1 as:

I_1(x, y) = k, if the pixel (x, y) belongs to the k-th point, and I_1(x, y) = 0 otherwise (1);

wherein (x, y) is the index of the binary image;
S603, with the image resolution denoted as N_X × N_Y, defining the set X = {1, 2, ..., N_X} and the set Y = {1, 2, ..., N_Y}, the index set of the k-th labeled point is calculated as:

(X_k, Y_k) = {(x, y) | I_1(x, y) = k} (2);

and the area A_k of the k-th labeled point is calculated as:

A_k = |X_k| = |Y_k| (3);

S604, sorting the area set {A_k} into the ordered set {A_{B,i}; i = 1, 2, ..., N_B}, where N_B is the number of segmented points, satisfying the following condition:

A_{B,1} ≤ A_{B,2} ≤ ... ≤ A_{B,N_B} (4);

Since the areas of the points are similar, the differences between the sorted areas should not be too large if all points are segmented accurately enough; therefore, the accuracy of the segmentation result can be judged from the calculated differences of the sorted areas;
S605, calculating the difference D_i of the sorted areas as:

D_i = A_{B,i+1} - A_{B,i} (5);

and the maximum difference as:

D_max = max D_i, i = 1, 2, ..., N_B - 1 (6);

If the maximum difference D_max is greater than a threshold (calculated as one tenth of the median of the area set), the selected global threshold T is smaller than the optimal threshold, so some adjacent points in the segmentation result are merged into one point; the optimal threshold is defined as the threshold that can separate all bright and dark spots from the background. On the other hand, a threshold smaller than the optimal threshold segments the bright spots more completely; therefore, some of the points segmented by the smaller threshold should be used, and an area threshold is used to select which segmented points to use;
S606, calculating the average value A_m of the area set {A_{B,i}}:

A_m = (1 / N_B) Σ_{i=1}^{N_B} A_{B,i} (7);

S607, the index set of the segmented points whose areas are smaller than A_m is calculated as:

(X_s, Y_s) = {(x, y) | I_1(x, y) = k and A_k < A_m} (8);

S608, updating the global threshold as:

T = T + ΔT (9);

where ΔT is the step size of the loop, an integer greater than or equal to 1; ΔT = 10 is selected to accelerate convergence;
S609, segmenting the image again with the updated threshold;
S610, repeating steps S601 to S609 until D_max is smaller than one tenth of the median of the area set;
S611, after the above steps have been repeated m times, the index set (X_m, Y_m) of all segmented points of the m-th segmentation result I_m is:

(X_m, Y_m) = {(x, y) | I_m(x, y) > 0} (10);

S612, the finally segmented image I_R of resolution N_X × N_Y is initialized as:

I_R(x, y) = 0 (11);

and then calculated as:

I_R(X_m, Y_m) = 1 (12);

S613, segmenting the images of the G channel and the B channel in the same way to obtain I_G and I_B, and adding the segmentation results to form the final segmentation result.
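The sketch below illustrates one way the iterative segmentation loop of S601-S613 could be coded. The slope-difference-distribution threshold of S601 is a separate published technique, so Otsu's method is substituted here as a stand-in for the initial threshold; the rest follows formulas (1)-(12) as reconstructed above, and is a sketch rather than a definitive implementation.

```python
import cv2
import numpy as np

def segment_channel(channel, dT=10):
    """Iterative point segmentation of one color channel (sketch of S601-S612)."""
    # S601: initial global threshold; Otsu's method is used here as a
    # stand-in for the slope-difference-distribution method.
    T, _ = cv2.threshold(channel, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    result = np.zeros(channel.shape, dtype=np.uint8)  # (11): I_R initialized to 0
    while True:
        # S602-S603: label the points and measure their areas
        binary = (channel > T).astype(np.uint8)
        n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
        if n < 3:                                     # fewer than two points left
            break
        areas = np.sort(stats[1:, cv2.CC_STAT_AREA])  # (4): sorted area set
        # S605: differences of the sorted areas, (5) and (6)
        D_max = np.diff(areas).max()
        # S610: stop once D_max is below one tenth of the median area
        if D_max < np.median(areas) / 10:
            result[labels > 0] = 1                    # (12): keep the final points
            break
        # S606-S607: keep the points whose area is below the mean A_m, (7)-(8)
        A_m = areas.mean()
        for k in range(1, n):
            if stats[k, cv2.CC_STAT_AREA] < A_m:
                result[labels == k] = 1
        T += dT                                       # S608: update threshold, (9)
    return result

def segment_rgb_points(img):
    """S613 sketch: segment the R, G and B channels and add the results."""
    b, g, r = cv2.split(img)                          # img: BGR photograph
    return segment_channel(r) + segment_channel(g) + segment_channel(b)
```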
S7, clustering the points with the same color according to the Euclidean distance;
the step S7 includes the following substeps:
S71, in each channel, dilating the segmented points five times with a structuring element B = {0, 0}, connecting adjacent points to form straight-line images;
S72, multiplying the clustered straight-line image in each channel by the corresponding segmented point image to generate a clustered point image; in each channel, points on different lines are assigned different identification numbers, including line identification numbers and column identification numbers.
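One plausible coding of S71-S72 is sketched below: the dots of one channel are dilated so that adjacent dots in a row merge into a line, the lines are labeled, and multiplying the labeled line image by the dot image transfers each line's identification number to its dots. The rectangular structuring element size is an illustrative assumption.

```python
import cv2
import numpy as np

def cluster_points(point_img):
    """Sketch of S71-S72: connect adjacent dots of one channel into lines,
    then label every dot with the identification number of its line."""
    # S71: dilate five times so horizontally adjacent dots merge into lines;
    # the 9x3 rectangular structuring element is an illustrative choice
    B = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 3))
    lines = cv2.dilate(point_img, B, iterations=5)
    # S72: label the straight-line image, then multiply it with the dot
    # image so each dot inherits its line identification number
    _, line_labels = cv2.connectedComponents(lines)
    return line_labels * (point_img > 0)  # pixel value = line id of the dot
```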
S8, matching points in the left view and points in the right view, and obtaining the parallax of corresponding matching points on the left image and the right image relative to a point P on the object;
the step S8 includes the following substeps:
S81, in the two views of each channel, the points are first matched according to their line identification numbers, and the points with the same line identification number are then matched according to their column identification numbers, so that the clustered points in the two views are matched;
S82, obtaining the pixel coordinates of each pair of matched corner points of the left and right images, namely a corner point l(x_l, y_l) of the left image and the corresponding corner point r(x_r, y_r) of the right image;
S83, since the images have been stereo-corrected to achieve line alignment, the y-axis coordinates of point l and point r are the same, and the parallax of the pair of matching points on the left and right images relative to the point P on the object can be directly expressed as d = x_l - x_r.
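A sketch of S81-S83 follows. It assumes the line identification numbers produced in S72 are numbered consistently in both rectified views (e.g., top to bottom) and that ordering the dots of a line from left to right yields the column identification numbers; the disparity of each matched pair is then d = x_l - x_r.

```python
import cv2
import numpy as np

def dot_centroids(clustered):
    """Group dot centroids by line identification number; within each line
    the dots are sorted by x, so list position acts as the column id."""
    dots = {}
    n, _, _, cents = cv2.connectedComponentsWithStats((clustered > 0).astype(np.uint8))
    for k in range(1, n):
        x, y = cents[k]
        line_id = int(clustered[int(round(y)), int(round(x))])  # label under centroid
        dots.setdefault(line_id, []).append((float(x), float(y)))
    return {lid: sorted(pts) for lid, pts in dots.items()}

def match_and_disparity(clustered_l, clustered_r):
    """Sketch of S81-S83; assumes line ids agree across the two views."""
    left, right = dot_centroids(clustered_l), dot_centroids(clustered_r)
    matches = []
    for lid in sorted(set(left) & set(right)):
        # zip pairs the dots in left-to-right (column) order; unequal counts truncate
        for (xl, yl), (xr, _) in zip(left[lid], right[lid]):
            matches.append((xl, yl, xl - xr))  # (x_l, y_l, disparity d)
    return matches
```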
S9, combining internal and external parameters of a camera, and obtaining three-dimensional space coordinates of each point on the object by using a parallax principle;
the step S9 includes the following substeps:
S91, according to the parallax of the corresponding matching points on the left and right images relative to the point P on the object and the optical centers of the left and right cameras, the triangle Plr formed by the point P and the matched image points l and r is similar to the triangle PO_lO_r, and the similar-triangle proportion formula is:

(T - (x_l - x_r)) / (Z - f) = T / Z (13);

where T is the distance between the optical centers of the left and right cameras, d = x_l - x_r is the parallax of the corresponding matching points on the left and right images relative to the point P on the object, f is the focal length of the left and right cameras, Z is the depth value of the point P, O_l is the optical center of the left camera, and O_r is the optical center of the right camera;
S92, solving formula (13) gives the depth Z = fT/d of the point P and, from it, the three-dimensional coordinates (X, Y, Z) of the point P;
finally, the three-dimensional coordinates (X, Y, Z) of all the points on the image are obtained.
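Finally, S91-S92 reduce to Z = fT/d after rearranging formula (13), with X and Y following from the pinhole model of the left camera; the sketch below assumes image coordinates are converted to offsets from the principal point (c_x, c_y), a convention the patent leaves implicit.

```python
import numpy as np

def triangulate(matches, f, T, cx, cy):
    """Sketch of S91-S92. f: focal length in pixels; T: optical-center
    distance (baseline); (cx, cy): principal point of the rectified left
    image -- subtracting it is an assumed coordinate convention."""
    cloud = []
    for xl, yl, d in matches:
        if d <= 0:
            continue                  # skip invalid disparities
        Z = f * T / d                 # from formula (13): (T - d)/(Z - f) = T/Z
        X = (xl - cx) * Z / f         # pinhole back-projection in the left camera
        Y = (yl - cy) * Z / f
        cloud.append((X, Y, Z))
    return np.array(cloud)            # S10: sparse point cloud of the object

# Worked check: f = 1000 px, T = 0.1 m, d = 50 px gives Z = 1000 * 0.1 / 50 = 2 m.
```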
S10, generating a sparse point cloud of the object from the three-dimensional space coordinates of the points on the object, completing the three-dimensional reconstruction of the object.