WO2025098625A1 - Generation of 3D point cloud with absolute scale - Google Patents

Generation of 3D point cloud with absolute scale

Info

Publication number
WO2025098625A1
Authority
WO
WIPO (PCT)
Prior art keywords
images
camera
point cloud
scale
processing device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2023/081427
Other languages
French (fr)
Inventor
Vladislav POLIANSKII
André MATEUS
Elijs Dima
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB
Priority to PCT/EP2023/081427 (WO2025098625A1)
Publication of WO2025098625A1
Legal status: Pending

Abstract

There is provided techniques for generating a 3D point cloud with absolute scale. A method is performed by an image processing device. The method comprises obtaining a set of 2D images of a 3D environment. The method comprises generating a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The method comprises arranging the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The method comprises calculating a scale factor based on a mapping between the clusters and the camera geometry. The method comprises obtaining the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.

Description

GENERATION OF 3D POINT CLOUD WITH ABSOLUTE SCALE
TECHNICAL FIELD
Embodiments presented herein relate to a method, an image processing device, a computer program, and a computer program product for generating a three-dimensional point cloud with absolute scale.
BACKGROUND
In general terms, accurate three-dimensional (3D) scene geometry, such as 3D point clouds, can be captured with 3D reconstruction techniques, for example based on the Structure-from-Motion (SfM) concept. Such techniques allow depth in the (physical) 3D environment to be estimated from an unstructured set of two-dimensional (2D) images of the 3D environment, as obtained from one or more cameras having scanned the environment. In some examples the scans are performed using 360-degree cameras. In some examples the scans are obtained by placing a 2D panoramic camera (such as a 360-degree camera) on a tripod at several locations within the 3D environment, where a respective set of 2D images is captured for each location. In some examples the reconstructed 3D geometry is visualized in a skybox image rendering environment.
Simply applying an SfM process on a set of 2D images will produce a reconstructed 3D geometry (in terms of a 3D point cloud) with arbitrary scale. That is, the result of image-based SfM has a scale ambiguity, making it impossible to perform measurements in absolute real-world scale in the reconstructed 3D geometry, which is essential for most applications. Therefore, some procedure needs to be applied to bring absolute scale to the reconstructed 3D geometry.
Existing procedures for this often impose constraints on how the scans are performed. These constraints could make the scanning process cumbersome, and there is also a risk that some of the constraints are not followed during the scanning. The result of this is an incorrect scaling of the reconstructed 3D geometry, or even that the process for reconstructing the 3D geometry fails and hence no reconstructed 3D geometry is generated.
Hence, there is still a need for improved 3D reconstruction techniques.
SUMMARY
An object of embodiments herein is to enable 3D reconstruction of a scene where the above issues are avoided, or at least mitigated or reduced.
A particular object is to provide 3D reconstruction of a scene with absolute scale.
A particular object is to provide 3D reconstruction of a scene with absolute scale and without imposing any specific constraints on how the scans are performed.
According to a first aspect there is presented a method for generating a 3D point cloud with absolute scale. The method is performed by an image processing device. The method comprises obtaining a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. The method comprises generating a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The method comprises arranging the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The method comprises calculating a scale factor based on a mapping between the clusters and the camera geometry. The method comprises obtaining the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a second aspect there is presented an image processing device for generating a 3D point cloud with absolute scale. The image processing device comprises processing circuitry. The processing circuitry is configured to cause the image processing device to obtain a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. The processing circuitry is configured to cause the image processing device to generate a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The processing circuitry is configured to cause the image processing device to arrange the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The processing circuitry is configured to cause the image processing device to calculate a scale factor based on a mapping between the clusters and the camera geometry. The processing circuitry is configured to cause the image processing device to obtain the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a third aspect there is presented an image processing device for generating a 3D point cloud with absolute scale. The image processing device comprises an obtain module configured to obtain a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. The image processing device comprises a generate module configured to generate a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The image processing device comprises an arrange module configured to arrange the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The image processing device comprises a calculate module configured to calculate a scale factor based on a mapping between the clusters and the camera geometry. The image processing device comprises a scale module configured to obtain the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a fourth aspect there is presented a computer program for generating a 3D point cloud with absolute scale. The computer program comprises computer code which, when run on processing circuitry of an image processing device, causes the image processing device to perform actions. One action comprises the image processing device obtaining a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. One action comprises the image processing device generating a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. One action comprises the image processing device arranging the 2D images into one cluster per camera position based on the optical center positions of the 2D images. One action comprises the image processing device calculating a scale factor based on a mapping between the clusters and the camera geometry. One action comprises the image processing device obtaining the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a fifth aspect there is presented a computer program product comprising a computer program according to the fourth aspect and a computer readable storage medium on which the computer program is stored. The computer readable storage medium could be a non-transitory computer readable storage medium.
Advantageously, these aspects provide 3D reconstruction of a scene where the above issues are avoided.
Advantageously, these aspects provide 3D reconstruction of a scene with absolute scale.
Advantageously, these aspects provide 3D reconstruction of a scene with absolute scale and without imposing any specific constraints on how the scans are performed.
Advantageously, these aspects do not require any specific actions from a user performing the scanning of the 3D environment with the camera. For example, the herein disclosed aspects do not require the use of any fiducial markers, and do not require any particular tripod setups of the cameras to be used. In turn, these aspects therefore enable a simple and fast scanning process, without being error-prone, to be used.
Advantageously, these aspects can utilize all 3D points in the 3D point cloud with relative scale, and do not require any 3D points to fall within an overlap area; in fact, these aspects do not even require the camera to have an overlap area between its lenses.
Advantageously, these aspects work with, and allow for, the use of a general-purpose state-of-the-art procedure for generating the 3D point cloud with relative scale. Advantageously, these aspects are applicable to unordered sets of 2D images, and do not require any manual pre-processing, for example to associate each 2D image with each specific lens in the camera.
Other objectives, features and advantages of the enclosed embodiments will be apparent from the following detailed disclosure, from the attached dependent claims as well as from the drawings.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the element, apparatus, component, means, module, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, module, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The inventive concept is now described, by way of example, with reference to the accompanying drawings, in which:
Fig. 1 is a block diagram of an image processing device according to an embodiment;
Fig. 2 is a flowchart of methods according to embodiments;
Fig. 3 is a schematic illustration of a camera according to an embodiment;
Fig. 4 is a schematic illustration of cameras according to embodiments;
Fig. 5 is a schematic illustration of a 3D environment according to an embodiment;
Fig. 6 illustrates a schematic representation of a 3D point cloud with relative scale according to an embodiment;
Fig. 7 illustrates a schematic representation of a 3D point cloud with absolute scale according to an embodiment;
Fig. 8 is a schematic diagram showing structural units of an image processing device according to an embodiment;
Fig. 9 is a schematic diagram showing functional modules of an image processing device according to an embodiment; and
Fig. 10 shows one example of a computer program product comprising computer readable storage medium according to an embodiment.
DETAILED DESCRIPTION
The inventive concept will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the inventive concept are shown. This inventive concept may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the inventive concept to those skilled in the art. Like numbers refer to like elements throughout the description. Any step or feature illustrated by dashed lines should be regarded as optional.
As noted above there is still a need for improved 3D reconstruction techniques.
The embodiments disclosed herein therefore relate to techniques for generating a 3D point cloud with absolute scale. In order to obtain such techniques there is provided an image processing device, a method performed by the image processing device, a computer program product comprising code, for example in the form of a computer program, that when run on an image processing device, causes the image processing device to perform the method.
Fig. 1 is a block diagram of an image processing device 200 according to an embodiment. The image processing device 200 takes as input a set of 2D images 110 of a 3D environment. The set of 2D images 110 might be per-lens images (fisheye or otherwise) as captured by a multi-lens camera (such as a 360-degree camera) at multiple scan positions. The image processing device 200 provides as output a 3D point cloud 120 with absolute scale of the 3D environment. The resulting scaled 3D point cloud 120 can be used for visualization, inspection, computer-aided design (CAD) model generation, building information model (BIM) generation and verification, etc. The image processing device 200 comprises a point cloud generator block 250 configured to recover the scene geometry, for example in terms of a 3D point cloud with relative (arbitrary) scale, together with the optical centers and viewing directions of the individual 2D images 110 in the coordinate system of the recovered scene geometry. The image processing device 200 comprises an image cluster block 255 configured to match the reconstructed optical centers and viewing directions to the corresponding optical centers and viewing directions of the camera, as given by the camera geometry known to the image processing device 200. The image processing device 200 comprises a scale factor calculator block 260 configured to determine a scale factor based on the matching. The image processing device 200 comprises a scale factor applicator block 265 configured to use the scale factor to scale the recovered scene geometry, for example the 3D point cloud with relative scale, into a 3D point cloud with absolute scale. Further details of the operations performed by the image processing device 200 for generating the 3D point cloud 120 with absolute scale will be disclosed next with reference to Fig. 2.
Fig. 2 is a flowchart illustrating embodiments of methods for generating a 3D point cloud 120, 700 with absolute scale. In this respect, absolute scale here refers to a certain scale, in the sense that distances can be measured in a certain absolute unit, such as meters, etc. The methods are performed by the image processing device 200. The methods are advantageously provided as computer programs 320.
S102: The image processing device 200 obtains a set of 2D images 110 of a 3D environment. The 2D images 110 have been captured from a set of camera positions 510 by at least one camera 300a:300d. Examples of cameras 300a:300d and camera positions 510 will be disclosed below. Each of the at least one camera 300a:300d has a camera geometry that is known to the image processing device 200. Examples of what could define, or constitute, the camera geometry will be disclosed below.
S104: The image processing device 200 generates a 3D point cloud 600 with relative scale from the set of 2D images 110. One optical center position in a coordinate system of the 3D point cloud 600 is recovered per each of the 2D images 110 as part of generating the 3D point cloud 600.

S106: The image processing device 200 arranges the 2D images 110 into one cluster per camera position 510 based on the optical center positions of the 2D images 110. Examples of how this clustering can be performed will be disclosed below.
S110: The image processing device 200 calculates a scale factor based on a mapping between the clusters and the camera geometry. Examples of this mapping will be disclosed below.
S112: The image processing device 200 obtains the 3D point cloud 120, 700 with absolute scale by scaling the 3D point cloud 600 with relative scale with the scale factor.
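Before the detailed embodiments, the overall flow of steps S102 to S112 can be illustrated with a small self-contained sketch. This is an illustration only: the two-lens rig, the 10 cm baseline, and all numbers are invented, and the arbitrary-scale SfM output of step S104 is simulated by applying an unknown rotation and scale to ground-truth metric lens centers.

```python
import numpy as np

rng = np.random.default_rng(0)
BASELINE = 0.10                          # known lens-to-lens distance [m]
cam_pos = rng.uniform(0.0, 5.0, (4, 3))  # four scan positions (cf. Fig. 5)
offset = np.array([BASELINE / 2, 0.0, 0.0])
centers_metric = np.vstack([cam_pos - offset, cam_pos + offset])

# Simulated S104 output: the same lens centers, but expressed in an
# arbitrary coordinate system (unknown rotation and unknown scale).
true_scale, theta = 3.7, 0.8
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
centers_rel = (centers_metric @ R.T) * true_scale

# S110: per scan position, the ratio of the known baseline to the
# reconstructed baseline; S112 would then multiply every reconstructed
# 3D point by the resulting scale factor s.
n = len(cam_pos)
ratios = [BASELINE / np.linalg.norm(centers_rel[i + n] - centers_rel[i])
          for i in range(n)]
s = float(np.median(ratios))
print(s, 1.0 / true_scale)  # s recovers the inverse of the unknown scale
```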
Embodiments relating to further details of generating a 3D point cloud 120, 700 with absolute scale as performed by the image processing device 200 will now be disclosed with continued reference to Fig. 2.
The disclosed method entails the metric scaling of a 3D model (in terms of a 3D point cloud) computed from the 2D images, by exploiting a known camera geometry. As specified above, the camera geometry of each camera 300a:300d used to capture the set of 2D images 110 is known to the image processing device 200. In this respect, the camera geometry could be obtained during calibration of the cameras 300a:300d or be provided by the camera manufacturer. In any case, the camera geometry can be explicitly provided to the image processing device 200 together with the set of 2D images 110. Alternatively, the camera geometry could be provided to the image processing device 200 as an index, assuming that the image processing device 200 has access to a database of different camera geometries, and where the index thus specifies one of the camera geometries in the database.
There could be different ways to define the camera geometry. Intermediate reference is here made to Fig. 3 and Fig. 4.
In Fig. 3 is shown a top view of an example camera 300a with two lenses 310a, 310b. The two lenses 310a, 310b are placed around the camera body and face opposite directions 330a, 330b. Each lens 310a, 310b has a respective field-of-view 320a, 320b. Further, each lens 310a, 310b has its own optical center 340a, 340b. That is, the optical centers 340a, 340b are not co-located. In Fig. 4 is shown a top view of four different cameras 300a, 300b, 300c, 300d. Camera 300a is the same as illustrated in Fig. 3 and thus is an example of a camera with two lenses. Camera 300b is an example of a camera with four lenses, each facing its own direction. Camera 300c is an example of a camera with eight lenses, with two lenses per direction. Camera 300d is an example of a camera with six lenses, each facing its own direction. Each of the cameras 300a:300d thus has its own layout in terms of lens placements and optical centers, and thus each camera 300a:300d has its own camera geometry.
In some embodiments, the camera geometry pertains to an optical center position of the at least one camera 300a:300d and a viewing direction of the at least one camera 300a:300d per camera position 510. In some embodiments, the at least one camera 300a:300d comprises at least two image sensors 310a, 310b, and the camera geometry pertains to a relative placement of the at least two image sensors 310a, 310b per camera 300a:300d. In some examples, the relative placement encompasses at least one of rotation and translation.
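As one possible representation of such a known camera geometry, consider the following sketch. The dataclass layout and the 6.5 cm back-to-back two-lens rig below (modeled on the camera of Fig. 3) are illustrative assumptions, not values from the embodiments.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class CameraGeometry:
    optical_centers: np.ndarray  # (L, 3) lens optical centers, rig frame [m]
    viewing_dirs: np.ndarray     # (L, 3) unit viewing directions per lens

    def baseline(self, j: int, k: int) -> float:
        """Absolute lens-to-lens distance; later used as the known
        metric distance when calculating the scale factor."""
        return float(np.linalg.norm(self.optical_centers[j]
                                    - self.optical_centers[k]))


# Two lenses facing opposite directions, 6.5 cm apart (cf. Fig. 3).
rig = CameraGeometry(
    optical_centers=np.array([[-0.0325, 0.0, 0.0], [0.0325, 0.0, 0.0]]),
    viewing_dirs=np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
)
print(rig.baseline(0, 1))  # 0.065
```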
Further aspects of how the set of 2D images 110 can be obtained will be disclosed next.
In general terms, the camera 300a:300d is placed in multiple positions of the 3D environment, and a scan (i.e., a set of images for each lens in the camera 300a:300d) is obtained for each position. Intermediate reference is here made to Fig. 5. In Fig. 5 is schematically illustrated a top view of a 3D environment 500 in terms of part of a building interior where a camera is placed at multiple positions 510. In the example of Fig. 5, there are eleven such positions, enumerated from “0” to “10”. The arrows indicate the viewing direction of each lens of the camera when placed at each of the eleven positions.
Further aspects of how the 3D point cloud 600 with relative scale can be generated from the set of 2D images 110 will be disclosed next.
In general terms, the 3D point cloud 600 with relative scale can be generated using either a sparse or a dense 3D reconstruction. Sparse reconstruction involves solving a Structure-from-Motion (SfM) problem that creates a sparse set of 3D points from matching points of interest among the 2D images. Dense reconstruction might incorporate or depend on SfM in a first step, followed by a Multiple-View Stereo method to compute a denser 3D point cloud. In both sparse and dense reconstruction, the poses of the 2D images (as positions of the optical center and the rotation of the viewing frustum) are reconstructed along with the set of 3D points, in the same coordinate system and scale.
Intermediate reference is here made to Fig. 6 in which is schematically illustrated a 3D reconstruction output from an SfM process, up to an arbitrary scale. Fig. 6 thereby provides a schematic representation of a 3D point cloud 600 with relative scale. The 3D environment is reconstructed as a 3D point cloud 600 (indicated by dotted lines), and the camera positions are reconstructed as individual positions (and viewing directions) of each lens. In the example illustrated in Fig. 6, 22 lens positions (enumerated from “0-a” to “10-a”, and from “0-b” to “10-b”) are reconstructed, where the arrows indicate the viewing direction of each lens, as in Fig. 5. That is, lens positions “0-a” and “0-b” correspond to the camera at position “0” in Fig. 5, and so on.
Since no constraints on image position are considered at this point, any general-purpose 3D reconstruction method can be used. COLMAP, OpenSfM and Metashape are three non-limiting examples of such 3D reconstruction methods. All of these methods reconstruct the 3D geometry up to an arbitrary scale.
In some examples, the 3D point cloud with relative scale is generated using the primary per-lens images (e.g., “fisheye images”) directly. In other examples, the 3D point cloud with relative scale is generated using secondary derived images (e.g., “perspective images”) produced from the primary per-lens images (e.g., “fisheye images”). In the latter case, the mapping used to generate the secondary images from the primary images is preserved such that any change in image optical center and viewing direction, created by the primary-to-secondary image generation, can be reversed after the SfM process recovers the 3D geometry and image positions. In that way, even if the SfM process recovers the positions of the secondary images, the positions of the primary images are obtained using the reversed known primary-to-secondary mapping.
Further aspects of how the 2D images 110 can be arranged in clusters will be disclosed next. In some aspects, a two-step procedure is used to arrange the 2D images 110 in clusters. In a first step, the image positions are clustered based on proximity. 2D images captured at the same camera position should be closer to each other than to 2D images captured at another camera position. In a second step, the relative rotation and/or viewing directions are used by the image processing device 200 to validate the clusters from the first step (i.e., to confirm that each of the 2D images belongs to the correct cluster). That is, in some examples, one viewing direction per each of the 2D images 110 is recovered as part of generating the 3D point cloud 600 with relative scale, and in some embodiments, the image processing device 200 further is configured to perform (optional) step S108:
S108: The image processing device 200 validates the clusters by comparing the optical center positions and the viewing directions of the 2D images 110 in each cluster to optical center positions 340a, 340b and viewing directions 330a, 330b of the camera positions 510, with one camera position 510 per cluster.
Since the common scale of the 3D model is not known, this clustering process can be performed iteratively, as indicated by a feedback loop from step S108 to step S106 in Fig. 2, where in each iteration the distance metric for clustering the per-lens optical centers is reduced. That is, in some embodiments, the 2D images 110 are arranged in the clusters by iteratively associating the 2D images 110 with the clusters according to a distance metric. In each iteration the distance metric is reduced. Here, the distance metric for clustering the per-lens optical centers can be reduced until all clusters pass the viewing direction validation step (i.e., step S108). That is, in some embodiments, the distance metric is reduced until all clusters are successfully validated.
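A minimal sketch of this iterative clustering, using SciPy hierarchical clustering, follows. The geometric-shrink schedule is an assumption, and for brevity the validation of step S108 is approximated by checking that every cluster contains exactly one image per lens; a full implementation would also compare viewing directions against the camera geometry.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_images(centers, lenses_per_camera, shrink=0.7, min_thresh=1e-9):
    """Cluster reconstructed per-image optical centers (N, 3) into one
    cluster per camera position by iteratively reducing the distance
    metric until the clusters pass validation."""
    Z = linkage(centers, method="single")
    thresh = float(np.ptp(centers))   # start: span of all image positions
    while thresh > min_thresh:
        labels = fcluster(Z, t=thresh, criterion="distance")
        sizes = np.bincount(labels)[1:]          # labels start at 1
        if np.all(sizes == lenses_per_camera):   # proxy for step S108
            return labels
        thresh *= shrink                         # reduce distance metric
    raise RuntimeError("no distance metric yields validated clusters")
```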
Further in this respect, if only some camera positions are recovered during the 3D reconstruction, then the clustering process can be performed from an iteration point where the distance metric encompasses the span of all reconstructed image positions, down to a distance metric just above the point where each image position is in its own cluster (i.e., the distance metric is reduced until all images except the closest two are in their own clusters). In such a case, the iteration with the largest number of image positions validated against the camera geometry can be used. If no iteration contains validated clusters, then the 3D reconstruction can be repeated using a different parametrization of the SfM method, or using a different available SfM method. If multiple iterations contain the same total number of validated clusters, then the iteration with the smallest distance metric can be used.
In some examples, the scans in the 3D environment are performed with at least two different types of cameras 300a:300d and thus more than one kind of camera geometry is used in the validation step (i.e., step S108). In such case, each cluster can be validated against each of the supported camera geometries to determine which cluster fits best to which camera geometry. That is, in some embodiments, the 2D images 110 have been captured by at least two cameras 300a:300d with different camera geometries, and the clusters are validated for each of the different camera geometries.
In some examples, the 2D images contain an explicit association to a specific camera lens. In such cases, this explicit or inferred association can be used to connect the 2D images to the relative positions and/or viewing directions of the camera geometry, or geometries. In particular, in some embodiments, each of the 2D images 110 comprises metadata specifying with which image sensor 310a, 310b the 2D image was captured, and the 2D images 110 are arranged into the clusters based on the metadata. The metadata can be provided e.g. in the exchangeable image file format (EXIF) data, or be inferred from the image filename, or from the image data itself (e.g., through detecting distinct image distortions unique to a specific lens).
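For instance, inferring the association from the image filename could look as follows. This is a sketch; the "posNN_lensX" naming pattern is an invented example, not a convention required by the embodiments.

```python
import re


def lens_from_filename(name: str) -> str:
    """Extract a lens identifier from names such as 'pos03_lensB.jpg'."""
    m = re.search(r"lens([A-Za-z0-9]+)", name)
    if m is None:
        raise ValueError(f"no lens identifier found in {name!r}")
    return m.group(1)


print(lens_from_filename("pos03_lensB.jpg"))  # prints: B
```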
In some examples, the camera geometry is received as user input. For example, the user might indicate the camera geometry as an abstract arrangement of lenses (e.g., six lenses equidistantly spaced in a circle, as for camera 300d), provided that the user also provides the lens-to-lens distances in the abstract arrangement.
Further aspects of how the scale factor can be calculated will be disclosed next.
In some embodiments, the mapping between the clusters and the camera geometry is defined by a relation between a relative distance between the optical center positions of 2D images 110 in the 3D point cloud 600 with relative scale and an absolute distance between the optical center positions 340a, 340b of 2D images 110 in a reference frame defined by the camera geometry. In some embodiments, the scale factor, s, is determined as:
$$s = \frac{\left\lVert o_j^i - o_k^i \right\rVert}{\left\lVert \tilde{o}_j^i - \tilde{o}_k^i \right\rVert} \quad (1)$$

where $o_j^i$ and $o_k^i$ represent the optical center positions of 2D images $j$ and $k$ for camera position $i$ in a reference frame defined by the camera geometry, and $\tilde{o}_j^i$ and $\tilde{o}_k^i$ represent the optical center positions of 2D images $j$ and $k$ for camera position $i$ in coordinates of the 3D point cloud 600 with relative scale.
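A direct transcription of Equation (1) could read as follows (the numeric values are invented for illustration):

```python
import numpy as np


def scale_factor(o_j, o_k, o_j_rel, o_k_rel):
    """s = ||o_j - o_k|| / ||o~_j - o~_k|| for one camera position i."""
    return float(np.linalg.norm(o_j - o_k)
                 / np.linalg.norm(o_j_rel - o_k_rel))


# Known geometry: two lenses 10 cm apart. SfM reconstructed the same
# two optical centers 0.42 units apart, so s maps 0.42 units to 10 cm.
s = scale_factor(np.array([-0.05, 0.0, 0.0]), np.array([0.05, 0.0, 0.0]),
                 np.array([0.00, 0.0, 0.0]), np.array([0.42, 0.0, 0.0]))
print(s)  # ~0.238; step S112 multiplies all relative-scale points by s
```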
In some embodiments, the 3D point cloud 120, 700 with absolute scale is obtained by scaling each point in the 3D point cloud 600 with relative scale with the scale factor. In some examples, the scale factor is applied to the points $\tilde{x}_n$ in the 3D point cloud 600 with relative scale as

$$x_n = s \, \tilde{x}_n$$

where $x_n$ is a point in the absolute scale. Intermediate reference is here made to Fig. 7 in which is schematically illustrated a 3D reconstruction output as in Fig. 6 but with absolute scale. Fig. 7 thereby provides a schematic representation of a 3D point cloud 700 with absolute scale. The 3D point cloud 700 is scaled back to real-world metric scale using the camera geometry, which describes the orientation and position of the camera lenses. Hence, in Fig. 7 is shown the same 22 lens positions (enumerated from “0-a” to “10-a”, and from “0-b” to “10-b”) as in Fig. 6, where again the arrows indicate the viewing direction of each lens. That is, lens positions “0-a” and “0-b” correspond to the camera at position “0” in Fig. 5, and so on. But the difference with respect to Fig. 6 is that the 3D point cloud 700 has absolute scale (whereas the 3D point cloud 600 does not).
In some embodiments, the scale factor is calculated for only one of the clusters. However, in other embodiments, one scale factor is calculated per each of the clusters, and the scale factor with which the 3D point cloud 600 with relative scale is scaled is a weighted average of all calculated scale factors. For example, for increased robustness, the scale factor can be computed as the mean of all ratios (as in Equation (1)) between the different 2D images at the same sensor positions. In the presence of outliers, the median can be used instead. In some examples, the scale factor is computed as a weighted average of all such ratios, where the relative weight of each ratio can be based on some fitment accuracy between the 2D images and the camera geometry.
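The aggregation variants described above could be sketched as follows. The fitment-accuracy weights are an assumption; their exact definition is left open here.

```python
import numpy as np


def aggregate_scale(ratios, weights=None, robust=False):
    """Combine per-cluster scale ratios into one scale factor."""
    ratios = np.asarray(ratios, dtype=float)
    if robust:                 # median: robust against outlier clusters
        return float(np.median(ratios))
    if weights is None:        # plain mean of all ratios
        return float(ratios.mean())
    w = np.asarray(weights, dtype=float)
    return float((w * ratios).sum() / w.sum())


print(aggregate_scale([0.27, 0.26, 0.28, 0.95], robust=True))  # 0.275
```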
In some examples, the camera geometry (e.g., as represented by the numerator of the ratio in Equation (1)) can be used as constraints in a pose graph optimization. That is, in some embodiments, the optical center positions $o_j^i$ and $o_k^i$ of 2D images j and k for camera position i in a reference frame defined by the camera geometry are used as constraints in a pose graph optimization during which the 3D point cloud 600 with relative scale is scaled with the scale factor. In this respect, the pose graph nodes represent the absolute pose, i.e., the pose with respect to the 3D model, and its edges represent the relative pose between the connected nodes. In accordance with the herein disclosed embodiments, the camera geometry can act as a constraint on the norm of the translation between 2D images captured by the same camera. After pose graph optimization, the 3D points in the 3D model can be re-triangulated given the scaled positions of the 2D images. Optionally, an additional bundle adjustment step can be performed to optimize both the poses and the 3D points.
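A toy version of such a constrained refinement is sketched below. It is a sketch only: positions are optimized without rotations, the whole SfM geometry is kept rigid up to the unknown scale, and the first node is anchored at the origin to fix the translation gauge; the pose graph described above additionally carries rotations and is followed by re-triangulation and bundle adjustment.

```python
import numpy as np
from scipy.optimize import least_squares


def refine(centers_rel, rig_pairs, baseline_m):
    """Jointly estimate metric image positions and the scale factor,
    with the known lens baseline as a constraint on translation norms."""
    n = len(centers_rel)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]

    def residuals(x):
        s, p = x[0], x[1:].reshape(n, 3)
        r = list(p[0])  # anchor node 0 at the origin (translation gauge)
        for i, j in edges:       # keep the SfM geometry, scaled by s
            r.extend(p[j] - p[i] - s * (centers_rel[j] - centers_rel[i]))
        for i, j in rig_pairs:   # known lens-to-lens distance constraint
            r.append(np.linalg.norm(p[j] - p[i]) - baseline_m)
        return np.asarray(r)

    x0 = np.concatenate(([1.0], centers_rel.ravel()))
    sol = least_squares(residuals, x0)
    return float(sol.x[0]), sol.x[1:].reshape(n, 3)
```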
Fig. 8 schematically illustrates, in terms of a number of structural units, the components of an image processing device 200 according to an embodiment. Processing circuitry 210 is provided using any combination of one or more of a suitable central processing unit (CPU), multiprocessor, microcontroller, digital signal processor (DSP), etc., capable of executing software instructions stored in a computer program product 1010 (as in Fig. 10), e.g. in the form of a storage medium 230. The processing circuitry 210 may further be provided as at least one application specific integrated circuit (ASIC), or field programmable gate array (FPGA).
Particularly, the processing circuitry 210 is configured to cause the image processing device 200 to perform a set of operations, or steps, as disclosed above. For example, the storage medium 230 may store the set of operations, and the processing circuitry 210 may be configured to retrieve the set of operations from the storage medium 230 to cause the image processing device 200 to perform the set of operations. The set of operations may be provided as a set of executable instructions.
Thus the processing circuitry 210 is thereby arranged to execute methods as herein disclosed. The storage medium 230 may also comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory. The image processing device 200 may further comprise a communications (comm.) interface 220 at least configured for communications with other entities, functions, nodes, and devices, as required for the image processing device 200 for generating a 3D point cloud 120, 700 with absolute scale in accordance with the herein disclosed embodiments. As such the communications interface 220 may comprise one or more transmitters and receivers, comprising analogue and digital components. The processing circuitry 210 controls the general operation of the image processing device 200 e.g. by sending data and control signals to the communications interface 220 and the storage medium 230, by receiving data and reports from the communications interface 220, and by retrieving data and instructions from the storage medium 230. Other components, as well as the related functionality, of the image processing device 200 are omitted in order not to obscure the concepts presented herein.
Fig. 9 schematically illustrates, in terms of a number of functional modules, the components of an image processing device 200 according to an embodiment. The image processing device 200 of Fig. 9 comprises a number of functional modules; an obtain module 210a configured to perform step S102, a generate module 210b configured to perform step S104, an arrange module 210c configured to perform step S106, a calculate module 210e configured to perform step S110, and a scale module 210f configured to perform step S112. The image processing device 200 of Fig. 9 may further comprise a number of optional functional modules, such as a validate module 210d configured to perform step S108. In general terms, each functional module 210a:210f may in one embodiment be implemented only in hardware and in another embodiment with the help of software, i.e., the latter embodiment having computer program instructions stored on the storage medium 230 which, when run on the processing circuitry, make the image processing device 200 perform the corresponding steps mentioned above in conjunction with Fig. 2. It should also be mentioned that even though the modules correspond to parts of a computer program, they do not need to be separate modules therein, but the way in which they are implemented in software is dependent on the programming language used. Preferably, one or more or all functional modules 210a:210f may be implemented by the processing circuitry 210, possibly in cooperation with the communications interface 220 and/or the storage medium 230. The processing circuitry 210 may thus be configured to fetch instructions from the storage medium 230 as provided by a functional module 210a:210f and to execute these instructions, thereby performing any steps as disclosed herein.
The image processing device 200 may be provided as a standalone device or as a part of at least one further device. A first portion of the instructions performed by the image processing device 200 may be executed in a first device, and a second portion of the instructions performed by the image processing device 200 may be executed in a second device; the herein disclosed embodiments are not limited to any particular number of devices on which the instructions performed by the image processing device 200 may be executed. Hence, the methods according to the herein disclosed embodiments are suitable to be performed by an image processing device 200 residing in a cloud computational environment. Therefore, although a single processing circuitry 210 is illustrated in Fig. 8, the processing circuitry 210 may be distributed among a plurality of devices, or nodes. The same applies to the functional modules 210a:210f of Fig. 9 and the computer program 1020 of Fig. 10.
Fig. 10 shows one example of a computer program product 1010 comprising computer readable storage medium 1030. On this computer readable storage medium 1030, a computer program 1020 can be stored, which computer program 1020 can cause the processing circuitry 210 and thereto operatively coupled entities and devices, such as the communications interface 220 and the storage medium 230, to execute methods according to embodiments described herein. The computer program 1020 and/or computer program product 1010 may thus provide means for performing any steps as herein disclosed.
In the example of Fig. 10, the computer program product 1010 is illustrated as an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc. The computer program product 1010 could also be embodied as a memory, such as a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or an electrically erasable programmable read-only memory (EEPROM) and more particularly as a non-volatile storage medium of a device in an external memory such as a USB (Universal Serial Bus) memory or a Flash memory, such as a compact Flash memory. Thus, while the computer program 1020 is here schematically shown as a track on the depicted optical disk, the computer program 1020 can be stored in any way which is suitable for the computer program product 1010.
The inventive concept has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the inventive concept, as defined by the appended patent claims.

Claims

1. A method for generating a 3D point cloud (120, 700) with absolute scale, wherein the method is performed by an image processing device (200), and wherein the method comprises: obtaining (S102) a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); generating (S104) a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); arranging (S106) the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); calculating (S110) a scale factor based on a mapping between the clusters and the camera geometry; and obtaining (S112) the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
2. The method according to claim 1, wherein the camera geometry pertains to an optical center position of the at least one camera (300a:300d) and a viewing direction of the at least one camera (300a:300d) per camera position (510).
3. The method according to claim 1 or 2, wherein the at least one camera (300a:300d) comprises at least two image sensors (310a, 310b), and wherein the camera geometry pertains to a relative placement of the at least two image sensors (310a, 310b) per camera (300a:300d).
4. The method according to claim 3, wherein the relative placement encompasses at least one of rotation and translation.
5. The method according to any preceding claim, wherein one viewing direction per each of the 2D images (110) is recovered as part of generating the 3D point cloud (600) with relative scale, and wherein the method further comprises: validating (S108) the clusters by comparing the optical center positions and the viewing directions of the 2D images (110) in each cluster to optical center positions (340a, 340b) and viewing direction (330a, 330b) of the camera positions (510), with one camera position (510) per cluster.
6. The method according to any preceding claim, wherein the 2D images (110) are arranged in the clusters by iteratively associating the 2D images (110) with the clusters according to a distance metric, wherein in each iteration the distance metric is reduced.
7. The method according to a combination of claims 5 and 6, wherein the distance metric is reduced until all clusters are successfully validated.
8. The method according to claim 5, 6, or 7, wherein the 2D images (110) have been captured by at least two cameras (300a:300d) with different camera geometries, and wherein the clusters are validated for each of the different camera geometries.
9. The method according to any preceding claim, wherein each of the 2D images (110) comprises metadata specifying with which image sensor (310a, 310b) the 2D image was captured, and wherein the 2D images (110) are arranged into the clusters based on the metadata.
10. The method according to any preceding claim, wherein the mapping between the clusters and the camera geometry is defined by a relation between a relative distance between the optical center positions of 2D images (110) in the 3D point cloud (600) with relative scale and an absolute distance between the optical center positions (340a, 340b) of 2D images (110) in a reference frame defined by the camera geometry.
11. The method according to any preceding claim, wherein the scale factor, s, is determined as:
$$s = \frac{\left\lVert o_j^i - o_k^i \right\rVert}{\left\lVert \tilde{o}_j^i - \tilde{o}_k^i \right\rVert}$$

where $o_j^i$ and $o_k^i$ represent the optical center positions of 2D images j and k for camera position i in a reference frame defined by the camera geometry, and $\tilde{o}_j^i$ and $\tilde{o}_k^i$ represent the optical center positions of 2D images j and k for camera position i in coordinates of the 3D point cloud (600) with relative scale.
12. The method according to any preceding claim, wherein the scale factor is calculated for only one of the clusters.
13. The method according to any of claims 1 to 11, wherein one scale factor is calculated per each of the clusters, and wherein the scale factor with which the 3D point cloud (600) with relative scale is scaled is a weighted average of all calculated scale factors.
14. The method according to any preceding claim, wherein the 3D point cloud (120, 700) with absolute scale is obtained by scaling each point in the 3D point cloud (600) with relative scale with the scale factor.
15. The method according to any preceding claim, wherein the optical center positions
$o_j^i$ and $o_k^i$
of 2D images j and k for camera position i in a reference frame defined by the camera geometry are used as constraints in a pose graph optimization during which the 3D point cloud (600) with relative scale is scaled with the scale factor.
16. An image processing device (200) for generating a 3D point cloud (120, 700) with absolute scale, the image processing device (200) comprising processing circuitry (210), the processing circuitry being configured to cause the image processing device (200) to: obtain a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); generate a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); arrange the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); calculate a scale factor based on a mapping between the clusters and the camera geometry; and obtain the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
17. An image processing device (200) for generating a 3D point cloud (120, 700) with absolute scale, the image processing device (200) comprising: an obtain module (210a) configured to obtain a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); a generate module (210b) configured to generate a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); an arrange module (210c) configured to arrange the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); a calculate module (210e) configured to calculate a scale factor based on a mapping between the clusters and the camera geometry; and a scale module (210f) configured to obtain the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
18. The image processing device (200) according to claim 16 or 17, further being configured to perform the method according to any of claims 2 to 15.
19. A computer program (1020) for generating a 3D point cloud (120, 700) with absolute scale, the computer program comprising computer code which, when run on processing circuitry (210) of an image processing device (200), causes the image processing device (200) to: obtain (S102) a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); generate (S104) a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); arrange (S106) the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); calculate (S110) a scale factor based on a mapping between the clusters and the camera geometry; and obtain (S112) the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
20. A computer program product (1010) comprising a computer program (1020) according to claim 19, and a computer readable storage medium (1030) on which the computer program is stored.
PCT/EP2023/081427, filed 2023-11-10 (priority date 2023-11-10): Generation of 3D point cloud with absolute scale. Pending. WO2025098625A1 (en).

Priority Applications (1)

PCT/EP2023/081427 (WO2025098625A1, en): priority date 2023-11-10, filing date 2023-11-10, "Generation of 3D point cloud with absolute scale"

Applications Claiming Priority (1)

PCT/EP2023/081427 (WO2025098625A1, en): priority date 2023-11-10, filing date 2023-11-10, "Generation of 3D point cloud with absolute scale"

Publications (1)

Publication number: WO2025098625A1

Family

ID=88837516

Family Applications (1)

PCT/EP2023/081427: WO2025098625A1 (en), pending

Country Status (1)

WO: WO2025098625A1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
US20200066034A1 (en) * Priority date: 2017-02-27; publication date: 2020-02-27; assignee: Katam Technologies AB; title: Improved forest surveying
WO2023102552A1 (en) * Priority date: 2021-12-03; publication date: 2023-06-08; assignee: Hover Inc.; title: System and methods for validating imagery pipelines

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
HESHENG YIN, "SLAM-Based Self-Calibration of a Binocular Stereo Vision Rig in Real-Time", Sensors, vol. 20, no. 3, 22 January 2020, p. 621, ISSN 1424-8220, DOI: 10.3390/s20030621 *
LANGPING LI, "Recovering absolute scale for Structure from Motion using the law of free fall", Optics and Laser Technology, vol. 112, 1 April 2019, pp. 514-523, ISSN 0030-3992, DOI: 10.1016/j.optlastec.2018.11.045 *


Legal Events

Code 121 (Ep): the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 23806191; Country of ref document: EP; Kind code of ref document: A1.

