CN111784841B - Method, device, electronic equipment and medium for reconstructing three-dimensional image

Info

Publication number
CN111784841B
Authority
CN
China
Prior art keywords
image
frame
target
image block
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010507773.4A
Other languages
Chinese (zh)
Other versions
CN111784841A (en)
Inventor
唐荣富
邓宝松
龙知洲
商尔科
李靖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Defense Technology Innovation Institute PLA Academy of Military Science
Priority to CN202010507773.4A
Publication of CN111784841A
Application granted
Publication of CN111784841B
Active
Anticipated expiration


Abstract

The application discloses a method, an apparatus, an electronic device and a medium for reconstructing a three-dimensional image. In the method, a target common-view image is acquired, and a first frame sequence ordered by the weight sum of the corresponding connecting edges is determined based on it. The first image frame, i.e. the frame with the highest connecting-edge weight sum in the first frame sequence, can then be acquired, a first image block is generated based on the first image frame, a target number of second image blocks are obtained based on the first image block, and finally a reconstructed three-dimensional grid is generated based on the first image block and the target number of second image blocks. By applying the technical scheme of the application, a reconstruction order can be established by introducing the common view, and a new image-block generation method is provided according to visual geometry, so that the speed and precision problems of the matching process are better addressed.

Description

Method, device, electronic equipment and medium for reconstructing three-dimensional image
Technical Field
The present application relates to image processing technology, and in particular, to a method, an apparatus, an electronic device, and a medium for reconstructing a three-dimensional image.
Background
Three-dimensional reconstruction from images is an important research problem in computer vision and computer graphics. It refers to dense mesh reconstruction of a scene on the basis of a structure-from-motion result already computed from multiple images. Although three-dimensional reconstruction methods vary with the sensor used, the mainstream image-based approach is generally called multi-view stereo (MVS).
Further, owing to differences in scene representation (and data format), current MVS methods fall into three main categories: depth-map-based reconstruction, point-cloud-based reconstruction, and voxel-based reconstruction. In general, depth-map-based reconstruction is view-dependent and has unique advantages in large-scene applications, but it is usually computationally expensive, it is inconvenient for observation from different views, and the fusion of multiple depth maps into a unified coordinate system still needs further study. Point-cloud-based methods reconstruct a dense point cloud in a unified coordinate system and have very good properties for geometric editing, fusion, rendering and the like, but the point clouds they generate are often noisy or contain holes. Voxel-based methods represent three-dimensional space with voxels and then process the three-dimensional point cloud with Markov-random-field ideas, generally accelerated by an octree structure.
However, existing three-dimensional image reconstruction methods still suffer from considerable noise and a heavy computational load. How to devise a high-performance three-dimensional image reconstruction method has therefore become a problem to be solved by those skilled in the art.
Disclosure of Invention
The embodiments of the present application provide a method, an apparatus, an electronic device and a medium for reconstructing a three-dimensional image, which are used to solve the problems of noisy results and heavy computation in related three-dimensional image reconstruction techniques.
According to an aspect of an embodiment of the present application, there is provided a method for reconstructing a three-dimensional image, including:
Acquiring a target common view image, and determining a first frame sequence based on the target common view image, wherein each frame in the first frame sequence is sequentially ordered according to the weight sum of the corresponding connecting edges;
Acquiring a first image frame in the first frame sequence, and generating a first image block based on the first image frame, wherein the first image frame is an image frame with the highest sum value of connecting edge weights in the first frame sequence;
Obtaining a target number of second image blocks based on the first image blocks, wherein the second image blocks are adjacent to the first image blocks;
Generating a reconstructed three-dimensional grid based on the first image block and the target number of second image blocks.
Optionally, in another embodiment of the above method according to the present application, after the acquiring the first image frame in the first frame sequence, the method further includes:
A first set of feature points of the first image frame is acquired, the first set of feature points comprising at least corner detection features and Gaussian function features.
Optionally, in another embodiment of the above method according to the present application, the acquiring the first set of feature points of the first image frame includes:
acquiring a second image frame set adjacent to the first image frame;
obtaining target to-be-matched points corresponding to each image frame in the second image frame set by using an epipolar constraint algorithm;
calculating the similarity of the points to be matched of each target by using the cost function;
obtaining a plurality of candidate matching points based on the similarity of the target points to be matched;
And obtaining the first characteristic point set of the first image frame based on the candidate matching points.
Optionally, in another embodiment of the above method according to the present application, after the acquiring the first set of feature points of the first image frame, the method further includes:
Calculating three-dimensional space coordinates corresponding to each first feature point in the first feature point set by using a forward intersection algorithm and a robust function algorithm;
Calculating initial appearance corresponding to each first feature point in the first feature point set;
minimizing an optimization objective constrained by a homography matrix to acquire a first orientation of the first image block;
and generating the first image block based on the three-dimensional space coordinates, the initial appearance and the first orientation corresponding to each first feature point in the first feature point set.
Optionally, in another embodiment of the above method according to the present application, the calculating an initial appearance corresponding to each first feature point in the first feature point set includes:
Calculating a difference sum value of each first feature point and other feature points in the first feature point set;
And taking the characteristic point with the minimum difference sum value as a second characteristic point, and taking an M-by-M neighborhood of the second characteristic point as the initial appearance of the first image block.
Optionally, in another embodiment of the above method according to the present application, the obtaining, based on the first image block, a target number of second image blocks includes:
Gridding the first image frame to obtain a second image block;
Taking the three-dimensional coordinates, the orientation initial value and the reference frame of the first image block as the three-dimensional coordinates, the orientation initial value and the reference frame of the second image block;
performing back projection operation by using the initial value of the second image block, and obtaining an effective feature set according to the luminosity difference;
and repeating the above steps until the three-dimensional space coordinates, the initial appearance and the second orientation corresponding to all the image frames in the first frame sequence are calculated.
Optionally, in another embodiment of the above method according to the present application, the generating a reconstructed three-dimensional grid based on the first image block and the target number of second image blocks includes:
Filtering the first image blocks and the second image blocks with the target number, and removing outlier points of the second image blocks with the target number;
And generating the reconstructed three-dimensional grid according to the first image blocks subjected to filtering and the second image blocks with the target number.
According to another aspect of an embodiment of the present application, there is provided an apparatus for reconstructing a three-dimensional image, including:
The first acquisition module is arranged to acquire a target common view image, and determine a first frame sequence based on the target common view image, wherein each frame in the first frame sequence is sequentially ordered according to the weight sum of the corresponding connecting edges;
The second acquisition module is configured to acquire a first image frame in the first frame sequence, and generate a first image block based on the first image frame, wherein the first image frame is an image frame with the highest connecting edge weight sum value in the first frame sequence;
the first generation module is configured to obtain a target number of second image blocks based on the first image blocks, wherein the second image blocks are adjacent to the first image blocks;
a second generation module is arranged to generate a reconstructed three-dimensional mesh based on the first image block and the target number of second image blocks.
According to still another aspect of an embodiment of the present application, there is provided an electronic apparatus including:
a memory for storing executable instructions; and
and a display configured to execute, together with the memory, the executable instructions so as to perform the operations of any of the above methods of reconstructing a three-dimensional image.
According to still another aspect of an embodiment of the present application, there is provided a computer-readable storage medium storing computer-readable instructions that, when executed, perform the operations of any of the above-described methods of reconstructing a three-dimensional image.
In the method, a target common-view image is acquired, and a first frame sequence ordered by the weight sum of the corresponding connecting edges is determined based on it. The first image frame, i.e. the frame with the highest connecting-edge weight sum in the first frame sequence, can then be acquired, a first image block is generated based on the first image frame, a target number of second image blocks are obtained based on the first image block, and finally a reconstructed three-dimensional grid is generated based on the first image block and the target number of second image blocks. By applying the technical scheme of the application, a reconstruction order can be established by introducing the common view, and a new image-block generation method is provided according to visual geometry, so that the speed and precision problems of the matching process are better addressed.
The technical scheme of the application is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description, serve to explain the principles of the application.
The application may be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a method for reconstructing a three-dimensional image according to the present application;
FIG. 2 is a schematic flow chart of reconstructing a three-dimensional image according to the present application;
FIG. 3 is a schematic diagram of another method for reconstructing a three-dimensional image according to the present application;
FIG. 4 is a schematic structural view of an apparatus for reconstructing a three-dimensional image according to the present application;
Fig. 5 is a schematic diagram showing the structure of an electronic device according to the present application.
Detailed Description
Various exemplary embodiments of the present application will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the application, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
In addition, the technical solutions of the embodiments of the present application may be combined with each other, but it is necessary to be based on the fact that those skilled in the art can implement the technical solutions, and when the technical solutions are contradictory or cannot be implemented, the combination of the technical solutions should be considered as not existing, and not falling within the scope of protection claimed by the present application.
It should be noted that, in the embodiments of the present application, all directional indicators (such as up, down, left, right, front and rear) are merely used to explain the relative positional relationship, motion conditions and the like between the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indication changes correspondingly.
A method for reconstructing a three-dimensional image according to an exemplary embodiment of the present application is described below in connection with fig. 1-3. It should be noted that the following application scenarios are only shown for facilitating understanding of the spirit and principles of the present application, and embodiments of the present application are not limited in this respect. Rather, embodiments of the application may be applied to any scenario where applicable.
The application also provides a method, a device, a target terminal and a medium for reconstructing the three-dimensional image.
Fig. 1 schematically shows a flow diagram of a method of reconstructing a three-dimensional image according to an embodiment of the present application. As shown in fig. 1, the method includes:
s101, acquiring a target common view image, determining a first frame sequence based on the target common view image, and sequentially sequencing all frames in the first frame sequence according to the weight sum of the corresponding connecting edges.
First, the present application can calculate a common view from the result of the structure from motion (SfM) and determine a reconstructed frame sequence from the common-view relationship.
Further, structure from motion, i.e. determining the spatial and geometric relationship of the object through the movement of the camera, is a common approach to three-dimensional reconstruction. Its biggest difference from 3D cameras such as the Kinect is that it only requires an ordinary RGB camera, so the cost is lower, the environmental restrictions are fewer, and it can be used both indoors and outdoors. However, SfM requires complex theory and algorithms for support, and its accuracy and speed still need improvement, so mature commercial applications are not yet numerous.
According to the application, a common view can be constructed from the SfM result (comprising the camera poses and the existing matched feature points). Further, the common view in the present application may be an undirected graph, where a vertex V is a camera and an edge E is the number of co-visible feature points between two cameras. To ensure robustness, the maximum value of the edge E does not exceed β times the number of all feature points (e.g., β = 1/3), i.e., E < βSf, where Sf is the number of all feature points.
Still further, all frames are ordered from largest to smallest according to the weight sum of their connecting edges, which yields the first frame sequence. It should be noted that the subsequent matching process starts from the frame with the largest weight.
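As an illustration of this ordering step, the following sketch builds the common view and the first frame sequence; the input format (`matches`, a mapping from camera-index pairs to shared-feature counts) is an assumption for illustration rather than the patent's data structure:

```python
from collections import defaultdict

def build_covisibility_graph(matches, total_features, beta=1/3):
    """Undirected common view: vertices are cameras, edge weights are the
    number of co-visible feature points, capped at beta * Sf for robustness."""
    cap = beta * total_features
    graph = defaultdict(dict)
    for (i, j), n_shared in matches.items():
        w = min(n_shared, cap)
        graph[i][j] = w
        graph[j][i] = w
    return graph

def order_frames(graph):
    """First frame sequence: frames sorted by the sum of the weights of
    their connecting edges, from largest to smallest."""
    weight_sum = {v: sum(edges.values()) for v, edges in graph.items()}
    return sorted(weight_sum, key=weight_sum.get, reverse=True)

# Example: three cameras with pairwise shared-feature counts from SfM.
matches = {(0, 1): 120, (1, 2): 80, (0, 2): 30}
graph = build_covisibility_graph(matches, total_features=600)
print(order_frames(graph))  # [1, 0, 2]: frame 1 has the largest weight sum
```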
S102, acquiring a first image frame in a first frame sequence, and generating a first image block based on the first image frame, wherein the first image frame is the image frame with the highest sum value of the connecting edge weights in the first frame sequence.
Furthermore, after the first frame sequence is determined based on the target common-view image, the first image frame with the highest sum value of the connecting edge weights in the first frame sequence can be obtained, and the first image block is generated based on the first image frame.
Specifically, the application can extract the feature points in the image frame, match the feature points, and calculate the image block (Patch) corresponding to each feature point, where an image block includes a position, an orientation, an appearance, and the like.
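For concreteness, such an image block can be collected in a small record; the field names and types below are assumptions for illustration:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Patch:
    position: np.ndarray     # 3D point P(s), shape (3,)
    orientation: np.ndarray  # unit normal O(s), shape (3,)
    appearance: np.ndarray   # M x M intensity neighborhood A(s)
    ref_frame: int           # index of the reference image frame
```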
S103, obtaining a target number of second image blocks based on the first image blocks, wherein the second image blocks are adjacent to the first image blocks.
That is, the surrounding image blocks are calculated from the existing image blocks.
And S104, generating a reconstructed three-dimensional grid based on the first image blocks and the target number of second image blocks.
Furthermore, in the application, the dense point cloud can be gridded based on the first image block and the second image blocks with the target number, so that a final reconstructed three-dimensional grid is obtained, and the reconstructed three-dimensional image is reconstructed by using the reconstructed three-dimensional grid later.
In the method, a target common-view image is acquired, and a first frame sequence ordered by the weight sum of the corresponding connecting edges is determined based on it. The first image frame, i.e. the frame with the highest connecting-edge weight sum in the first frame sequence, can then be acquired, a first image block is generated based on the first image frame, a target number of second image blocks are obtained based on the first image block, and finally a reconstructed three-dimensional grid is generated based on the first image block and the target number of second image blocks. By applying the technical scheme of the application, a reconstruction order can be established by introducing the common view, and a new image-block generation method is provided according to visual geometry, so that the speed and precision problems of the matching process are better addressed.
Alternatively, in one possible embodiment of the present application, after S102 (acquiring the first image frame in the first frame sequence), the following steps may be implemented:
A first set of feature points of the first image frame is acquired, the first set of feature points comprising at least corner detection (Features from Accelerated Segment Test, FAST) features and Gaussian function (Difference of Gaussian, DoG) features.
Further alternatively, in the process of acquiring the first feature point set of the first image frame, the method may be obtained by:
acquiring a second image frame set adjacent to the first image frame;
obtaining target to-be-matched points corresponding to each image frame in a second image frame set by using an epipolar constraint algorithm;
calculating the similarity of the points to be matched of each target by using the cost function;
Obtaining a plurality of candidate matching points based on the similarity of the points to be matched of each target;
a first set of feature points for the first image frame is obtained based on the plurality of candidate matching points.
Further, the present application may further implement the following steps after acquiring the first feature point set of the first image frame:
Calculating three-dimensional space coordinates corresponding to each first feature point in the first feature point set by using a forward intersection algorithm and a robust function algorithm;
calculating initial appearance corresponding to each first feature point in the first feature point set;
Acquiring a first orientation of a first image block by using an optimization target of constraint minimization of a homography matrix (Homography matrix);
and generating a first image block based on the three-dimensional space coordinates, the initial appearance and the first orientation corresponding to each first feature point in the first feature point set.
Further, in the process of calculating the initial appearance corresponding to each first feature point in the first feature point set, the method can be obtained by the following steps:
Calculating a difference sum value of each first feature point and other feature points in the first feature point set;
And taking the feature point with the smallest difference sum value as a second feature point, and taking a neighborhood of a preset size around the second feature point as the initial appearance of the first image block, as sketched below.
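A minimal sketch of this appearance initialization follows, assuming grayscale images, a sum of absolute differences as the difference measure, and feature points lying far enough from the image border; `neighborhood` simply crops an M × M window around a point:

```python
import numpy as np

def neighborhood(image, pt, m):
    """Crop the m x m window centered on pt = (x, y); pt is assumed to lie
    at least m // 2 pixels away from the image border."""
    r = m // 2
    x, y = int(pt[0]), int(pt[1])
    return image[y - r:y + r + 1, x - r:x + r + 1].astype(np.float64)

def initial_appearance(images, points, m=7):
    """images[i] is the frame observing the i-th matched point points[i]."""
    patches = [neighborhood(img, pt, m) for img, pt in zip(images, points)]
    # Sum of absolute appearance differences to all other feature points.
    diff_sums = [sum(np.abs(p - q).sum() for q in patches) for p in patches]
    best = int(np.argmin(diff_sums))  # the "second feature point"
    return patches[best]              # initial appearance A(s)
```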
First, as shown in fig. 2, the present application may calculate a common view from the SfM results and determine a reconstructed frame sequence from the common-view relationship. Further, the present application constructs a common view G(V, E). The common view G(V, E) is an undirected graph, where a vertex V is a camera and an edge E is the number of co-visible feature points between two cameras. To ensure robustness, the maximum value of the edge E does not exceed β times the number of all feature points (e.g., β = 1/3), i.e., E < βSf, where Sf is the number of all feature points.
Further, the present application may first extract feature points in the process of obtaining the first image block. In order to extract as many feature points as possible, the application extracts FAST features and DoG features simultaneously. In addition, the first image frame in the application may be denoted image frame I. For a certain feature point s of image frame I, to complete the initial matching quickly, the set of other image frames adjacent to frame I is obtained first, and each frame L in this set is then traversed with the following procedure to obtain an initial set of candidate matching points.
First, the epipolar constraint can be used to obtain the points to be matched in frame L, and the AD-Census cost function (Absolute Difference and Census) can be used to calculate the similarity of each point to be matched. If the photometric similarity is greater than a preset threshold, the point is considered a valid candidate matching point; it should be noted that, in the present application, one image frame may have one or more valid candidate matching points.
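The candidate search just described can be sketched as follows, under simplifying assumptions: the fundamental matrix F between frames I and L is taken as known from the SfM poses, a plain absolute-difference cost stands in for the full AD-Census measure, and `neighborhood` is the helper from the earlier sketch:

```python
import numpy as np

def epipolar_candidates(s, F, keypoints_L, img_I, img_L,
                        line_tol=2.0, sim_thresh=0.8, m=7):
    """Return keypoints of frame L near the epipolar line of s that are
    photometrically similar enough to s."""
    x = np.array([s[0], s[1], 1.0])
    line = F @ x                           # epipolar line of s in frame L
    line /= np.hypot(line[0], line[1])     # normalize for point-line distance
    ref = neighborhood(img_I, s, m)        # helper from the earlier sketch
    candidates = []
    for q in keypoints_L:
        if abs(line @ np.array([q[0], q[1], 1.0])) > line_tol:
            continue                       # too far from the epipolar line
        cand = neighborhood(img_L, q, m)
        cost = np.abs(ref - cand).mean() / 255.0
        if 1.0 - cost > sim_thresh:        # crude photometric similarity
            candidates.append(q)           # one frame may yield several
    return candidates
```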
Furthermore, the application can also use forward intersection (triangulation) and a robust function algorithm to calculate the three-dimensional space coordinate P(s) of the feature point s, obtaining the feature point set {s} on the two-dimensional images while ensuring that each image frame has at most one matching point. The present application may then initialize the appearance of the image block; specifically, an M × M neighborhood of the feature point whose summed appearance difference to all other feature points in {s} is smallest may be taken as the initial appearance A(s) of the image block.
Still further, the present application may perform back projection according to the known camera positions and the three-dimensional coordinate P(s), use the appearance difference to find whether there are more candidate feature points, and update {s}; it may be understood that the above may be repeated to calculate the three-dimensional coordinates and the corresponding appearance of the first image block. Finally, when calculating the orientation of the first image block, the orientation O(s) of the image block may be obtained by minimizing a homography-constrained optimization objective. A first image block comprising the three-dimensional space coordinates, the initial appearance and the first orientation is thus obtained.
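The forward-intersection step can be illustrated with the standard linear (DLT) triangulation below; the robust-function reweighting mentioned above is omitted for brevity, so this is a sketch rather than the patent's exact procedure:

```python
import numpy as np

def triangulate(proj_mats, pts2d):
    """DLT triangulation: proj_mats is a list of 3x4 camera projection
    matrices, pts2d the matching pixel (u, v) in each frame."""
    rows = []
    for P, (u, v) in zip(proj_mats, pts2d):
        rows.append(u * P[2] - P[0])  # two linear constraints
        rows.append(v * P[2] - P[1])  # per observation
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]                        # null vector of the stacked system
    return X[:3] / X[3]               # inhomogeneous 3D point P(s)
```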
Further alternatively, in the process of obtaining the target number of second image blocks based on the first image blocks, the following may be obtained:
Gridding the first image frame to obtain a second image block;
taking the three-dimensional coordinates, the orientation initial value and the reference frame of the first image block as the three-dimensional coordinates, the orientation initial value and the reference frame of the second image block;
performing back projection operation by using the initial value of the second image block, and obtaining an effective feature set according to the luminosity difference;
and repeating the above steps until the three-dimensional space coordinates, the initial appearance and the second orientation corresponding to all the image frames in the first frame sequence are calculated.
Furthermore, after the first image block is generated, the method can continue to expand from the first image block to obtain a plurality of second image blocks around it, so that more image blocks are obtained and dense three-dimensional point cloud reconstruction is completed. It should be noted that the following process of generating the second image blocks needs to be repeated according to the frame sequence:
First, the present application needs to grid the first image frame (e.g., each cell has a size of 3×3) and expand it according to the existing image blocks. Further, the three-dimensional coordinates, the orientation initial value and the reference frame of a newly expanded second image block are initialized to those of the adjacent existing image block, and the appearance of the second image block is the appearance in its reference frame.
Furthermore, back projection can be performed using the initial values of the second image block, and an effective feature set is obtained according to the photometric difference; the above steps are then repeated to calculate the three-dimensional coordinates and appearance of the second image block, and the orientation of the second image block is obtained by minimizing the homography-constrained optimization objective.
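A minimal sketch of this expansion step follows, reusing the `Patch` record from the earlier sketch; the cell bookkeeping (a dictionary from cell coordinates to patches) is an assumption for illustration:

```python
def expand_patches(grid):
    """grid maps (cell_x, cell_y) -> Patch for occupied cells of the
    gridded reference frame (e.g. 3 x 3 pixel cells)."""
    new_patches = []
    for (cx, cy), patch in list(grid.items()):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cell = (cx + dx, cy + dy)
            if cell in grid:
                continue                  # cell already holds a patch
            seed = Patch(                 # inherit the neighbor's initial values
                position=patch.position.copy(),
                orientation=patch.orientation.copy(),
                appearance=patch.appearance.copy(),
                ref_frame=patch.ref_frame,
            )
            grid[cell] = seed             # refined later by back projection
            new_patches.append(seed)
    return new_patches
```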
Further optionally, in S104 (generating a reconstructed three-dimensional grid based on the first image block and the target number of second image blocks), the present application may further include a specific embodiment, as shown in fig. 3, including:
S201, acquiring a target common view image, determining a first frame sequence based on the target common view image, and sequentially sequencing all frames in the first frame sequence according to the weight sum of the corresponding connecting edges;
s202, acquiring a first image frame in a first frame sequence, and generating a first image block based on the first image frame, wherein the first image frame is an image frame with the highest total value of connecting edge weights in the first frame sequence;
s203, obtaining a target number of second image blocks based on the first image blocks, wherein the second image blocks are adjacent to the first image blocks;
S204, filtering the first image blocks and the second image blocks with the target number, and removing outlier points of the second image blocks with the target number;
s205, generating a reconstructed three-dimensional grid according to the first image blocks subjected to filtering and the second image blocks with the target number.
Furthermore, the application can filter the first image blocks and the target number of second image blocks, thereby eliminating the outlier points generated by image-block expansion. Specifically, image-block observations that do not satisfy the photometric consistency condition are removed according to the photometric difference function, image-block observations that do not satisfy the FOV constraint are removed, the dense point cloud is then gridded, and the final reconstructed three-dimensional grid is obtained with a standard Poisson surface reconstruction algorithm.
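The filtering and meshing stage can be sketched as follows. The `photometric_error` callable and the `viewing_direction` helper are hypothetical stand-ins for the patent's photometric difference function and camera geometry, the thresholds are assumptions, and Open3D's Poisson reconstruction is used as one possible implementation of the standard algorithm named above:

```python
import numpy as np
import open3d as o3d

def filter_patches(patches, photometric_error, viewing_direction,
                   max_err=0.25, max_angle=60.0):
    """Drop patches violating photometric consistency or the FOV constraint.
    photometric_error and viewing_direction are hypothetical helpers."""
    kept = []
    for p in patches:
        if photometric_error(p) > max_err:
            continue                      # fails photometric consistency
        cos_a = float(np.dot(p.orientation, -viewing_direction(p)))
        if cos_a < np.cos(np.radians(max_angle)):
            continue                      # fails the FOV constraint
        kept.append(p)
    return kept

def mesh_from_patches(patches):
    """Poisson surface reconstruction over the filtered dense points."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector([p.position for p in patches])
    pcd.normals = o3d.utility.Vector3dVector([p.orientation for p in patches])
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=8)
    return mesh
```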
In the method, a target common-view image is acquired, and a first frame sequence ordered by the weight sum of the corresponding connecting edges is determined based on it. The first image frame, i.e. the frame with the highest connecting-edge weight sum in the first frame sequence, can then be acquired, a first image block is generated based on the first image frame, a target number of second image blocks are obtained based on the first image block, and finally a reconstructed three-dimensional grid is generated based on the first image block and the target number of second image blocks. By applying the technical scheme of the application, a reconstruction order can be established by introducing the common view, and a new image-block generation method is provided according to visual geometry, so that the speed and precision problems of the matching process are better addressed.
In another embodiment of the present application, as shown in fig. 4, the present application also provides an apparatus for reconstructing a three-dimensional image. Wherein the device comprises a first acquisition module 301, a second acquisition module 302, a first generation module 303, a second generation module 304, wherein,
A first obtaining module 301, configured to obtain a target common view image, and determine a first frame sequence based on the target common view image, where each frame in the first frame sequence is ordered sequentially according to a weight sum of corresponding connection edges;
A second obtaining module 302, configured to obtain a first image frame in the first frame sequence, and generate a first image block based on the first image frame, where the first image frame is an image frame with a highest sum value of connecting edge weights in the first frame sequence;
A first generating module 303, configured to obtain a target number of second image blocks based on the first image blocks, where the second image blocks are adjacent to the first image blocks;
A second generation module 304 is arranged to generate a reconstructed three-dimensional mesh based on the first image block and the target number of second image blocks.
In the method, a target common-view image is acquired, and a first frame sequence ordered by the weight sum of the corresponding connecting edges is determined based on it. The first image frame, i.e. the frame with the highest connecting-edge weight sum in the first frame sequence, can then be acquired, a first image block is generated based on the first image frame, a target number of second image blocks are obtained based on the first image block, and finally a reconstructed three-dimensional grid is generated based on the first image block and the target number of second image blocks. By applying the technical scheme of the application, a reconstruction order can be established by introducing the common view, and a new image-block generation method is provided according to visual geometry, so that the speed and precision problems of the matching process are better addressed.
In another embodiment of the present application, the second obtaining module 302 further includes:
A second acquisition module 302 is configured to acquire a first set of feature points of the first image frame, the first set of feature points comprising at least corner detection features and gaussian function features.
In another embodiment of the present application, the second obtaining module 302 further includes:
A second acquisition module 302 configured to acquire a second set of image frames adjacent to the first image frame;
A second obtaining module 302, configured to obtain, using an epipolar constraint algorithm, a target to-be-matched point corresponding to each image frame in the second image frame set;
A second obtaining module 302, configured to calculate the similarity of the target points to be matched by using the cost function;
a second obtaining module 302, configured to obtain a plurality of candidate matching points based on the similarity of the target points to be matched;
A second obtaining module 302 is configured to obtain the first set of feature points of the first image frame based on the plurality of candidate matching points.
In another embodiment of the present application, the second obtaining module 302 further includes:
a second obtaining module 302, configured to calculate, using a forward intersection algorithm and a robust function algorithm, three-dimensional space coordinates corresponding to each first feature point in the first feature point set;
A second obtaining module 302, configured to calculate an initial appearance corresponding to each first feature point in the first feature point set;
A second obtaining module 302 configured to obtain a first orientation of the first image block using an optimization objective of homography matrix constraint minimization;
The second obtaining module 302 is configured to generate the first image block based on the three-dimensional space coordinate, the initial appearance and the first orientation corresponding to each first feature point in the first feature point set.
In another embodiment of the present application, the first generating module 303 further includes:
a first generating module 303 configured to calculate a difference sum value of each first feature point and other feature points in the first feature point set;
The first generating module 303 is configured to take a feature point with the smallest difference sum value as a second feature point, and take a neighborhood of a preset size of the second feature point as an initial appearance of the first image block.
In another embodiment of the present application, the first generating module 303 further includes:
a first generating module 303, configured to grid the first image frame to obtain a second image block;
A first generating module 303 configured to take the three-dimensional coordinates, the orientation initial value, and the reference frame of the first image block as the three-dimensional coordinates, the orientation initial value, and the reference frame of the second image block;
A first generating module 303, configured to perform a back projection operation by using the initial value of the second image block, and obtain an effective feature set according to the luminosity difference;
and the above steps are repeated until the three-dimensional space coordinates, the initial appearance and the second orientation corresponding to all the image frames in the first frame sequence are calculated.
In another embodiment of the present application, the first generating module 303 further includes:
A first generating module 303, configured to filter the first image blocks and the target number of second image blocks, and remove outlier points of the target number of second image blocks;
a first generating module 303, configured to generate the reconstructed three-dimensional grid according to the filtered first image blocks and the target number of second image blocks.
Fig. 5 is a block diagram of a logical structure of an electronic device, according to an example embodiment. For example, electronic device 400 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 5, an electronic device 400 may include one or more of the following components: a processor 401 and a memory 402.
Processor 401 may include one or more processing cores, such as a 4-core processor, an 8-core processor, etc. The processor 401 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). Processor 401 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit), while the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 401 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 401 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 402 may include one or more computer-readable storage media, which may be non-transitory. Memory 402 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 402 is used to store at least one instruction for execution by processor 401 to implement the method of reconstructing a three-dimensional image provided by the method embodiments of the present application.
In some embodiments, the electronic device 400 may further optionally include: a peripheral interface 403 and at least one peripheral. The processor 401, memory 402, and peripheral interface 403 may be connected by a bus or signal line. The individual peripheral devices may be connected to the peripheral device interface 403 via buses, signal lines or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 404, a touch display 405, a camera 406, audio circuitry 407, a positioning component 408, and a power supply 409.
Peripheral interface 403 may be used to connect at least one Input/Output (I/O) related peripheral to processor 401 and memory 402. In some embodiments, processor 401, memory 402, and peripheral interface 403 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 401, memory 402, and peripheral interface 403 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 404 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 404 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 404 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 404 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 404 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 404 may further include NFC (Near Field Communication) related circuits, which is not limited by the present application.
The display screen 405 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 405 is a touch display screen, the display screen 405 also has the ability to collect touch signals at or above the surface of the display screen 405. The touch signal may be input as a control signal to the processor 401 for processing. At this time, the display screen 405 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 405, providing a front panel of the electronic device 400; in other embodiments, there may be at least two display screens 405, disposed on different surfaces of the electronic device 400 or in a folded design; in still other embodiments, the display screen 405 may be a flexible display disposed on a curved surface or a folded surface of the electronic device 400. Even more, the display screen 405 may be arranged in an irregular pattern that is not rectangular, i.e. a shaped screen. The display screen 405 may be made of materials such as an LCD (Liquid Crystal Display) and an OLED (Organic Light-Emitting Diode).
The camera assembly 406 is used to capture images or video. Optionally, the camera assembly 406 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth camera can be fused to realize a background blurring function, or the main camera and the wide-angle camera can be fused to realize panoramic shooting and Virtual Reality (VR) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 406 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 407 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 401 for processing, or inputting the electric signals to the radio frequency circuit 404 for realizing voice communication. For purposes of stereo acquisition or noise reduction, the microphone may be multiple and separately disposed at different locations of the electronic device 400. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 401 or the radio frequency circuit 404 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, audio circuit 407 may also include a headphone jack.
The positioning component 408 is used to locate the current geographic location of the electronic device 400 to enable navigation or LBS (Location Based Service). The positioning component 408 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 409 is used to power the various components in the electronic device 400. The power supply 409 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When power supply 409 comprises a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the electronic device 400 further includes one or more sensors 410. The one or more sensors 410 include, but are not limited to: acceleration sensor 411, gyroscope sensor 412, pressure sensor 413, fingerprint sensor 414, optical sensor 415, and proximity sensor 416.
The acceleration sensor 411 may detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the electronic device 400. For example, the acceleration sensor 411 may be used to detect components of gravitational acceleration on three coordinate axes. The processor 401 may control the touch display screen 405 to display a user interface in a lateral view or a longitudinal view according to the gravitational acceleration signal acquired by the acceleration sensor 411. The acceleration sensor 411 may also be used for the acquisition of motion data of a game or a user.
The gyro sensor 412 may detect a body direction and a rotation angle of the electronic device 400, and the gyro sensor 412 may collect a 3D motion of the user on the electronic device 400 in cooperation with the acceleration sensor 411. The processor 401 may implement the following functions according to the data collected by the gyro sensor 412: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
The pressure sensor 413 may be disposed at a side frame of the electronic device 400 and/or at an underlying layer of the touch screen 405. When the pressure sensor 413 is disposed on a side frame of the electronic device 400, a grip signal of the user on the electronic device 400 may be detected, and the processor 401 performs a left-right hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 413. When the pressure sensor 413 is disposed at the lower layer of the touch display screen 405, the processor 401 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 405. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 414 is used to collect a fingerprint of the user, and the processor 401 identifies the identity of the user based on the fingerprint collected by the fingerprint sensor 414, or the fingerprint sensor 414 identifies the identity of the user based on the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the user is authorized by the processor 401 to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying for and changing settings, etc. The fingerprint sensor 414 may be provided on the front, back, or side of the electronic device 400. When a physical key or vendor Logo is provided on the electronic device 400, the fingerprint sensor 414 may be integrated with the physical key or vendor Logo.
The optical sensor 415 is used to collect the ambient light intensity. In one embodiment, the processor 401 may control the display brightness of the touch display screen 405 according to the ambient light intensity collected by the optical sensor 415. Specifically, when the intensity of the ambient light is high, the display brightness of the touch display screen 405 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 405 is turned down. In another embodiment, the processor 401 may also dynamically adjust the shooting parameters of the camera assembly 406 according to the ambient light intensity collected by the optical sensor 415.
A proximity sensor 416, also referred to as a distance sensor, is typically provided on the front panel of the electronic device 400. The proximity sensor 416 is used to collect distance between the user and the front of the electronic device 400. In one embodiment, when the proximity sensor 416 detects a gradual decrease in the distance between the user and the front of the electronic device 400, the processor 401 controls the touch display 405 to switch from the bright screen state to the off screen state; when the proximity sensor 416 detects that the distance between the user and the front surface of the electronic device 400 gradually increases, the processor 401 controls the touch display screen 405 to switch from the off-screen state to the on-screen state.
Those skilled in the art will appreciate that the structure shown in fig. 5 is not limiting of the electronic device 400 and may include more or fewer components than shown, or may combine certain components, or may employ a different arrangement of components.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium, such as the memory 402, including instructions executable by the processor 401 of the electronic device 400 to perform the above-described method of reconstructing a three-dimensional image, the method comprising: acquiring a target common view image, and determining a first frame sequence based on the target common view image, wherein each frame in the first frame sequence is sequentially ordered according to the weight sum of the corresponding connecting edges; acquiring a first image frame in the first frame sequence, and generating a first image block based on the first image frame, wherein the first image frame is the image frame with the highest sum of connecting-edge weights in the first frame sequence; obtaining a target number of second image blocks based on the first image block, wherein the second image blocks are adjacent to the first image block; and generating a reconstructed three-dimensional grid based on the first image block and the target number of second image blocks. Optionally, the above instructions may also be executed by the processor 401 of the electronic device 400 to perform the other steps involved in the above-described exemplary embodiments. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
In an exemplary embodiment, there is also provided an application/computer program product comprising one or more instructions executable by the processor 401 of the electronic device 400 to perform the above-described method of reconstructing a three-dimensional image, the method comprising: acquiring a target common view image, and determining a first frame sequence based on the target common view image, wherein each frame in the first frame sequence is sequentially ordered according to the weight sum of the corresponding connecting edges; acquiring a first image frame in the first frame sequence, and generating a first image block based on the first image frame, wherein the first image frame is the image frame with the highest sum of connecting-edge weights in the first frame sequence; obtaining a target number of second image blocks based on the first image block, wherein the second image blocks are adjacent to the first image block; and generating a reconstructed three-dimensional grid based on the first image block and the target number of second image blocks. Optionally, the above instructions may also be executed by the processor 401 of the electronic device 400 to perform the other steps involved in the above-described exemplary embodiments.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (7)

CN202010507773.4A | 2020-06-05 | 2020-06-05 | Method, device, electronic equipment and medium for reconstructing three-dimensional image | Active | CN111784841B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010507773.4A (CN111784841B (en)) | 2020-06-05 | 2020-06-05 | Method, device, electronic equipment and medium for reconstructing three-dimensional image

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010507773.4A (CN111784841B (en)) | 2020-06-05 | 2020-06-05 | Method, device, electronic equipment and medium for reconstructing three-dimensional image

Publications (2)

Publication Number | Publication Date
CN111784841A (en) | 2020-10-16
CN111784841B | 2024-06-11

Family

ID=72754031

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202010507773.4A (Active, CN111784841B (en)) | Method, device, electronic equipment and medium for reconstructing three-dimensional image | 2020-06-05 | 2020-06-05

Country Status (1)

Country | Link
CN (1) | CN111784841B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112396107B (en)* | 2020-11-18 | 2023-02-14 | 广州极飞科技股份有限公司 | Reconstructed image selection method and device and electronic equipment
CN114445633B (en)* | 2022-01-25 | 2024-09-06 | 腾讯科技(深圳)有限公司 | Image processing method, device and computer readable storage medium
CN114708399B (en)* | 2022-03-21 | 2024-09-06 | 北京百度网讯科技有限公司 | Three-dimensional reconstruction method, device, equipment, medium and product
CN115272618B (en)* | 2022-09-20 | 2022-12-20 | 深圳市其域创新科技有限公司 | Three-dimensional mesh optimization method, device and storage medium


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2019169540A1 (en)* | 2018-03-06 | 2019-09-12 | 斯坦德机器人(深圳)有限公司 | Method for tightly-coupling visual slam, terminal and computer readable storage medium
WO2020007483A1 (en)* | 2018-07-06 | 2020-01-09 | Nokia Technologies Oy | Method, apparatus and computer program for performing three dimensional radio model construction
WO2020092177A2 (en)* | 2018-11-02 | 2020-05-07 | Fyusion, Inc. | Method and apparatus for 3-d auto tagging
CN109583457A (en)* | 2018-12-03 | 2019-04-05 | 荆门博谦信息科技有限公司 | A kind of method and robot of robot localization and map structuring
CN110599545A (en)* | 2019-09-06 | 2019-12-20 | 电子科技大学中山学院 | Feature-based dense map construction system
CN111161347A (en)* | 2020-04-01 | 2020-05-15 | 亮风台(上海)信息科技有限公司 | Method and equipment for initializing SLAM

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Robust visual semi-semantic loop closure detection by a covisibility graph and CNN features; S. Cascianelli et al.; Robotics and Autonomous Systems; full text *
Research on 3D semantic surface reconstruction of large-scale scenes based on RGB-D video sequences; 代具亭; China Doctoral Dissertations Full-text Database, Information Science and Technology; full text *
Camera focal length self-calibration method with unknown principal point; 唐荣富 et al.; Application Research of Computers; full text *

Also Published As

Publication number | Publication date
CN111784841A (en) | 2020-10-16

Similar Documents

Publication | Title
US11205282B2 (en) | Relocalization method and apparatus in camera pose tracking process and storage medium
CN108245893B (en) | Method, device and medium for determining posture of virtual object in three-dimensional virtual environment
CN110097576B (en) | Motion information determination method of image feature point, task execution method and equipment
CN110148178B (en) | Camera positioning method, device, terminal and storage medium
CN111784841B (en) | Method, device, electronic equipment and medium for reconstructing three-dimensional image
CN111126182A (en) | Lane line detection method, lane line detection device, electronic device, and storage medium
CN110599593B (en) | Data synthesis method, device, equipment and storage medium
CN111768454A (en) | Pose determination method, device, equipment and storage medium
CN111680758B (en) | Image training sample generation method and device
CN110570460A (en) | Target tracking method and device, computer equipment and computer readable storage medium
CN111862148B (en) | Method, device, electronic equipment and medium for realizing visual tracking
CN109886208B (en) | Object detection method and device, computer equipment and storage medium
CN110335224B (en) | Image processing method, image processing device, computer equipment and storage medium
CN112308103B (en) | Method and device for generating training samples
CN112581358B (en) | Training method of image processing model, image processing method and device
CN111369684B (en) | Target tracking method, device, equipment and storage medium
CN113689484B (en) | Method and device for determining depth information, terminal and storage medium
CN112967261B (en) | Image fusion method, device, equipment and storage medium
CN111754564B (en) | Video display method, device, equipment and storage medium
CN109472855B (en) | Volume rendering method and device and intelligent device
CN114596215B (en) | Method, device, electronic equipment and medium for processing image
CN117911520A (en) | Camera internal parameter calibration method and automatic driving equipment
CN109685881B (en) | Volume rendering method and device and intelligent equipment
CN110443841B (en) | Method, device and system for measuring ground depth
CN111583339A (en) | Method, device, electronic equipment and medium for acquiring target position

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
