WO2025098625A1 - Generation of 3D point cloud with absolute scale - Google Patents

Generation of 3D point cloud with absolute scale

Info

Publication number
WO2025098625A1
Authority
WO
WIPO (PCT)
Prior art keywords
images
camera
point cloud
scale
processing device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2023/081427
Other languages
French (fr)
Inventor
Vladislav POLIANSKII
André MATEUS
Elijs Dima
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB
Priority to PCT/EP2023/081427 (WO2025098625A1)
Publication of WO2025098625A1
Legal status: Pending

Abstract

There is provided techniques for generating a 3D point cloud with absolute scale. A method is performed by an image processing device. The method comprises obtaining a set of 2D images of a 3D environment. The method comprises generating a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The method comprises arranging the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The method comprises calculating a scale factor based on a mapping between the clusters and the camera geometry. The method comprises obtaining the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.

Description

GENERATION OF 3D POINT CLOUD WITH ABSOLUTE SCALE
TECHNICAL FIELD
Embodiments presented herein relate to a method, an image processing device, a computer program, and a computer program product for generating a three-dimensional point cloud with absolute scale.
BACKGROUND
In general terms, accurate three-dimensional (3D) scene geometry, such as 3D point clouds, can be captured with 3D reconstruction techniques, for example based on the Structure-from-Motion (SfM) concept. Such techniques allow depth in the (physical) 3D environment to be estimated from an unstructured set of two-dimensional (2D) images of the 3D environment, as obtained from one or more cameras having scanned the environment. In some examples the scans are performed using 360-degree cameras. In some examples the scans are obtained by placing a 2D panoramic camera (such as a 360-degree camera) on a tripod at several locations within the 3D environment, where a respective set of 2D images is captured for each location. In some examples the reconstructed 3D geometry is visualized in a skybox image rendering environment.
Simply applying an SfM process on a set of 2D images will produce a reconstructed 3D geometry (in terms of a 3D point cloud) with arbitrary scale. That is, the result of image-based SfM has a scale ambiguity, making it impossible to perform measurements in absolute real-world scale in the reconstructed 3D geometry, which is essential for most applications. Therefore, some procedure needs to be applied to bring absolute scale to the reconstructed 3D geometry.
Existing procedures for this often impose constraints on how the scans are performed. These constraints could make the scanning process cumbersome, and there is also a risk that some of the constraints are not followed during the scanning. The result of this is an incorrect scaling of the reconstructed 3D geometry, or even that the process for reconstructing the 3D geometry fails and hence no reconstructed 3D geometry is generated.
Hence, there is still a need for improved 3D reconstruction techniques.
SUMMARY
An object of embodiments herein is to enable 3D reconstruction of a scene where the above issues are avoided, or at least mitigated or reduced.
A particular object is to provide 3D reconstruction of a scene with absolute scale.
A particular object is to provide 3D reconstruction of a scene with absolute scale and without imposing any specific constraints on how the scans are performed.
According to a first aspect there is presented a method for generating a 3D point cloud with absolute scale. The method is performed by an image processing device. The method comprises obtaining a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. The method comprises generating a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The method comprises arranging the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The method comprises calculating a scale factor based on a mapping between the clusters and the camera geometry. The method comprises obtaining the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a second aspect there is presented an image processing device for generating a 3D point cloud with absolute scale. The image processing device comprises processing circuitry. The processing circuitry is configured to cause the image processing device to obtain a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. The processing circuitry is configured to cause the image processing device to generate a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The processing circuitry is configured to cause the image processing device to arrange the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The processing circuitry is configured to cause the image processing device to calculate a scale factor based on a mapping between the clusters and the camera geometry. The processing circuitry is configured to cause the image processing device to obtain the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a third aspect there is presented an image processing device for generating a 3D point cloud with absolute scale. The image processing device comprises an obtain module configured to obtain a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. The image processing device comprises a generate module configured to generate a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. The image processing device comprises an arrange module configured to arrange the 2D images into one cluster per camera position based on the optical center positions of the 2D images. The image processing device comprises a calculate module configured to calculate a scale factor based on a mapping between the clusters and the camera geometry. The image processing device comprises a scale module configured to obtain the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a fourth aspect there is presented a computer program for generating a 3D point cloud with absolute scale. The computer program comprises computer code which, when run on processing circuitry of an image processing device, causes the image processing device to perform actions. One action comprises the image processing device obtaining a set of 2D images of a 3D environment. The 2D images have been captured from a set of camera positions by at least one camera. Each of the at least one camera has a camera geometry that is known to the image processing device. One action comprises the image processing device generating a 3D point cloud with relative scale from the set of 2D images. One optical center position in a coordinate system of the 3D point cloud is recovered per each of the 2D images as part of generating the 3D point cloud. One action comprises the image processing device arranging the 2D images into one cluster per camera position based on the optical center positions of the 2D images. One action comprises the image processing device calculating a scale factor based on a mapping between the clusters and the camera geometry. One action comprises the image processing device obtaining the 3D point cloud with absolute scale by scaling the 3D point cloud with relative scale with the scale factor.
According to a fifth aspect there is presented a computer program product comprising a computer program according to the fourth aspect and a computer readable storage medium on which the computer program is stored. The computer readable storage medium could be a non-transitory computer readable storage medium.
Advantageously, these aspects provide 3D reconstruction of a scene where the above issues are avoided.
Advantageously, these aspects provide 3D reconstruction of a scene with absolute scale.
Advantageously, these aspects provide 3D reconstruction of a scene with absolute scale and without imposing any specific constraints on how the scans are performed.
Advantageously, these aspects do not require any specific actions from a user performing the scanning of the 3D environment with the camera. For example, the herein disclosed aspects do not require the use of any fiducial markers, and do not require any particular tripod setups of the cameras to be used. In turn, these aspects therefore enable a simple and fast scanning process, without being error-prone, to be used.
Advantageously, these aspects can utilize all 3D points in the 3D point cloud with relative scale, and do not require any 3D points to fall within an overlap area; in fact, these aspects do not even require the camera to have an overlap area between its lenses.
Advantageously, these aspects work with, and allow for, the use of a general-purpose state-of-the-art procedure for generating the 3D point cloud with relative scale. Advantageously, these aspects are applicable to unordered sets of 2D images, and do not require any manual pre-processing, for example to associate each 2D image with each specific lens in the camera.
Other objectives, features and advantages of the enclosed embodiments will be apparent from the following detailed disclosure, from the attached dependent claims as well as from the drawings.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the element, apparatus, component, means, module, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, module, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The inventive concept is now described, by way of example, with reference to the accompanying drawings, in which:
Fig. 1 is a block diagram of an image processing device according to an embodiment;
Fig. 2 is a flowchart of methods according to embodiments;
Fig. 3 is a schematic illustration of a camera according to an embodiment;
Fig. 4 is a schematic illustration of cameras according to embodiments;
Fig. 5 is a schematic illustration of a 3D environment according to an embodiment;
Fig. 6 illustrates a schematic representation of a 3D point cloud with relative scale according to an embodiment;
Fig. 7 illustrates a schematic representation of a 3D point cloud with absolute scale according to an embodiment;
Fig. 8 is a schematic diagram showing structural units of an image processing device according to an embodiment;
Fig. 9 is a schematic diagram showing functional modules of an image processing device according to an embodiment; and
Fig. 10 shows one example of a computer program product comprising computer readable storage medium according to an embodiment.
DETAILED DESCRIPTION
The inventive concept will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the inventive concept are shown. This inventive concept may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the inventive concept to those skilled in the art. Like numbers refer to like elements throughout the description. Any step or feature illustrated by dashed lines should be regarded as optional.
As noted above there is still a need for improved 3D reconstruction techniques.
The embodiments disclosed herein therefore relate to techniques for generating a 3D point cloud with absolute scale. In order to obtain such techniques there is provided an image processing device, a method performed by the image processing device, a computer program product comprising code, for example in the form of a computer program, that when run on an image processing device, causes the image processing device to perform the method.
Fig. 1 is a block diagram of an image processing device 200 according to an embodiment. The image processing device 200 takes as input a set of 2D images 110 of a 3D environment. The set of 2D images 110 might be per-lens images (fisheye or otherwise) as captured by a multi-lens camera (such as a 360-degree camera) at multiple scan positions. The image processing device 200 provides as output a 3D point cloud 120 with absolute scale of the 3D environment. The resulting scaled 3D point cloud 120 can be used for visualization, inspection, computer-aided design (CAD) model generation, building information model (BIM) generation and verification, etc. The image processing device 200 comprises a point cloud generator block 250 configured to recover the scene geometry, for example in terms of a 3D point cloud with relative (arbitrary) scale, together with the optical centers and viewing directions of the individual 2D images 110 in the coordinate system of the recovered scene geometry. The image processing device 200 comprises an image cluster block 255 configured to match the reconstructed optical centers and viewing directions to the corresponding optical centers and viewing directions of the camera, as given by the camera geometry known to the image processing device 200. The image processing device 200 comprises a scale factor calculator block 260 configured to determine a scale factor based on the matching. The image processing device 200 comprises a scale factor applicator block 265 configured to use the scale factor to scale the recovered scene geometry, for example the 3D point cloud with relative scale, into a 3D point cloud with absolute scale. Further details of the operations performed by the image processing device 200 for generating the 3D point cloud 120 with absolute scale will be disclosed next with reference to Fig. 2.
Fig. 2 is a flowchart illustrating embodiments of methods for generating a 3D point cloud 120, 700 with absolute scale. In this respect, absolute scale here refers to a certain scale, in the sense that distances can be measured in a certain absolute unit, such as meters, etc. The methods are performed by the image processing device 200. The methods are advantageously provided as computer programs 320.
S102: The image processing device 200 obtains a set of 2D images 110 of a 3D environment. The 2D images 110 have been captured from a set of camera positions 510 by at least one camera 300a:300d. Examples of cameras 300a:300d and camera positions 510 will be disclosed below. Each of the at least one camera 300a:300d has a camera geometry that is known to the image processing device 200. Examples of what could define, or constitute, the camera geometry will be disclosed below.
S104: The image processing device 200 generates a 3D point cloud 600 with relative scale from the set of 2D images 110. One optical center position in a coordinate system of the 3D point cloud 600 is recovered per each of the 2D images 110 as part of generating the 3D point cloud 600.

S106: The image processing device 200 arranges the 2D images 110 into one cluster per camera position 510 based on the optical center positions of the 2D images 110. Examples of how this clustering can be performed will be disclosed below.
S110: The image processing device 200 calculates a scale factor based on a mapping between the clusters and the camera geometry. Examples of this mapping will be disclosed below.
S112: The image processing device 200 obtains the 3D point cloud 120, 700 with absolute scale by scaling the 3D point cloud 600 with relative scale with the scale factor.
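Before the detailed embodiments, the overall flow of steps S102 to S112 can be illustrated with a small self-contained sketch. This is an illustration only: the two-lens rig, the 10 cm baseline, and all numbers are invented, and the arbitrary-scale SfM output of step S104 is simulated by applying an unknown rotation and scale to ground-truth metric lens centers.

```python
import numpy as np

rng = np.random.default_rng(0)
BASELINE = 0.10                          # known lens-to-lens distance [m]
cam_pos = rng.uniform(0.0, 5.0, (4, 3))  # four scan positions (cf. Fig. 5)
offset = np.array([BASELINE / 2, 0.0, 0.0])
centers_metric = np.vstack([cam_pos - offset, cam_pos + offset])

# Simulated S104 output: the same lens centers, but expressed in an
# arbitrary coordinate system (unknown rotation and unknown scale).
true_scale, theta = 3.7, 0.8
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
centers_rel = (centers_metric @ R.T) * true_scale

# S110: per scan position, the ratio of the known baseline to the
# reconstructed baseline; S112 would then multiply every reconstructed
# 3D point by the resulting scale factor s.
n = len(cam_pos)
ratios = [BASELINE / np.linalg.norm(centers_rel[i + n] - centers_rel[i])
          for i in range(n)]
s = float(np.median(ratios))
print(s, 1.0 / true_scale)  # s recovers the inverse of the unknown scale
```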
Embodiments relating to further details of generating a 3D point cloud 120, 700 with absolute scale as performed by the image processing device 200 will now be disclosed with continued reference to Fig. 2.
The disclosed method entails the metric scaling of a 3D model (in terms of a 3D point cloud) computed from the 2D images, by exploiting a known camera geometry. As specified above, the camera geometry of each camera 300a:300d used to capture the set of 2D images 110 is known to the image processing device 200. In this respect, the camera geometry could be obtained during calibration of the cameras 300a:300d or be provided by the camera manufacturer. In any case, the camera geometry can be explicitly provided to the image processing device 200 together with the set of 2D images 110. Alternatively, the camera geometry could be provided to the image processing device 200 as an index, assuming that the image processing device 200 has access to a database of different camera geometries, and where the index thus specifies one of the camera geometries in the database.
There could be different ways to define the camera geometry. Intermediate reference is here made to Fig. 3 and Fig. 4.
In Fig. 3 is shown a top view of an example camera 300a with two lenses 310a, 310b. The two lenses 310a, 310b are placed around the camera body and face opposite directions 330a, 330b. Each lens 310a, 310b has a respective field-of-view 320a, 320b. Further, each lens 310a, 310b has its own optical center 340a, 340b. That is, the optical centers 340a, 340b are not co-located. In Fig. 4 is shown a top view of four different cameras 300a, 300b, 300c, 300d. Camera 300a is the same as illustrated in Fig. 3 and thus is an example of a camera with two lenses. Camera 300b is an example of a camera with four lenses, each facing its own direction. Camera 300c is an example of a camera with eight lenses, with two lenses per direction. Camera 300d is an example of a camera with six lenses, each facing its own direction. Each of the cameras 300a:300d thus has its own layout in terms of lens placements and optical centers, and thus each camera 300a:300d has its own camera geometry.
In some embodiments, the camera geometry pertains to an optical center position of the at least one camera 300a:300d and a viewing direction of the at least one camera 300a:300d per camera position 510. In some embodiments, the at least one camera 300a:300d comprises at least two image sensors 310a, 310b, and the camera geometry pertains to a relative placement of the at least two image sensors 310a, 310b per camera 300a:300d. In some examples, the relative placement encompasses at least one of rotation and translation.
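As one possible representation of such a known camera geometry, consider the following sketch. The dataclass layout and the 6.5 cm back-to-back two-lens rig below (modeled on the camera of Fig. 3) are illustrative assumptions, not values from the embodiments.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class CameraGeometry:
    optical_centers: np.ndarray  # (L, 3) lens optical centers, rig frame [m]
    viewing_dirs: np.ndarray     # (L, 3) unit viewing directions per lens

    def baseline(self, j: int, k: int) -> float:
        """Absolute lens-to-lens distance; later used as the known
        metric distance when calculating the scale factor."""
        return float(np.linalg.norm(self.optical_centers[j]
                                    - self.optical_centers[k]))


# Two lenses facing opposite directions, 6.5 cm apart (cf. Fig. 3).
rig = CameraGeometry(
    optical_centers=np.array([[-0.0325, 0.0, 0.0], [0.0325, 0.0, 0.0]]),
    viewing_dirs=np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
)
print(rig.baseline(0, 1))  # 0.065
```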
Further aspects of how the set of 2D images 110 can be obtained will be disclosed next.
In general terms, the camera 300a:300d is placed in multiple positions of the 3D environment, and a scan (i.e., a set of images for each lens in the camera 300a:300d) is obtained for each position. Intermediate reference is here made to Fig. 5. In Fig. 5 is schematically illustrated a top view of a 3D environment 500 in terms of part of a building interior where a camera is placed at multiple positions 510. In the example of Fig. 5, there are eleven such positions, enumerated from “0” to “10”. The arrows indicate the viewing direction of each lens of the camera when placed at each of the eleven positions.
Further aspects of how the 3D point cloud 600 with relative scale can be generated from the set of 2D images 110 will be disclosed next.
In general terms, the 3D point cloud 600 with relative scale can be generated using either a sparse or a dense 3D reconstruction. Sparse reconstruction involves solving a Structure-from-Motion (SfM) problem that creates a sparse set of 3D points from matching points of interest among the 2D images. Dense reconstruction might incorporate or depend on SfM in a first step, followed by a Multiple-View Stereo method to compute a denser 3D point cloud. In both sparse and dense reconstruction, the poses of the 2D images (as positions of the optical center and the rotation of the viewing frustum) are reconstructed along with the set of 3D points, in the same coordinate system and scale.
Intermediate reference is here made to Fig. 6 in which is schematically illustrated a 3D reconstruction output from an SfM process, up to an arbitrary scale. Fig. 6 thereby provides a schematic representation of a 3D point cloud 600 with relative scale. The 3D environment is reconstructed as a 3D point cloud 600 (indicated by dotted lines), and the camera positions are reconstructed as individual positions (and viewing directions) of each lens. In the example illustrated in Fig. 6, 22 lens positions (enumerated from “0-a” to “10-a”, and from “0-b” to “10-b”) are reconstructed, where the arrows indicate the viewing direction of each lens, as in Fig. 5. That is, lens positions “0-a” and “0-b” correspond to the camera at position “0” in Fig. 5, and so on.
Since no constraints on image position are considered at this point, any general-purpose 3D reconstruction method can be used. COLMAP, OpenSfM and Metashape are three non-limiting examples of such 3D reconstruction methods. All of these methods reconstruct the 3D geometry up to an arbitrary scale.
In some examples, the 3D point cloud with relative scale is generated using the primary per-lens images (e.g., “fisheye images”) directly. In other examples, the 3D point cloud with relative scale is generated using secondary derived images (e.g., “perspective images”) produced from the primary per-lens images (e.g., “fisheye images”). In the latter case, the mapping used to generate the secondary images from the primary images is preserved such that any change in image optical center and viewing direction, created by the primary-to-secondary image generation, can be reversed after the SfM process recovers the 3D geometry and image positions. In that way, even if the SfM process recovers the positions of the secondary images, the positions of the primary images are obtained using the reversed known primary-to-secondary mapping.
Further aspects of how the 2D images 110 can be arranged in clusters will be disclosed next. In some aspects, a two-step procedure is used to arrange the 2D images 110 in clusters. In a first step, the image positions are clustered based on proximity. 2D images captured at the same camera position should be closer to each other than to 2D images captured at another camera position. In a second step, the relative rotation and/or viewing directions are used by the image processing device 200 to validate the clusters from the first step (i.e., to confirm that each of the 2D images belongs to the correct cluster). That is, in some examples, one viewing direction per each of the 2D images 110 is recovered as part of generating the 3D point cloud 600 with relative scale, and in some embodiments, the image processing device 200 further is configured to perform (optional) step S108:
S108: The image processing device 200 validates the clusters by comparing the optical center positions and the viewing directions of the 2D images 110 in each cluster to optical center positions 340a, 340b and viewing directions 330a, 330b of the camera positions 510, with one camera position 510 per cluster.
Since the common scale of the 3D model is not known, this clustering process can be performed iteratively, as indicated by a feedback loop from step S108 to step S106 in Fig. 2, where in each iteration the distance metric for clustering the per-lens optical centers is reduced. That is, in some embodiments, the 2D images 110 are arranged in the clusters by iteratively associating the 2D images 110 with the clusters according to a distance metric. In each iteration the distance metric is reduced. Here, the distance metric for clustering the per-lens optical centers can be reduced until all clusters pass the viewing direction validation step (i.e., step S108). That is, in some embodiments, the distance metric is reduced until all clusters are successfully validated.
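A minimal sketch of this iterative clustering, using SciPy hierarchical clustering, follows. The geometric-shrink schedule is an assumption, and for brevity the validation of step S108 is approximated by checking that every cluster contains exactly one image per lens; a full implementation would also compare viewing directions against the camera geometry.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_images(centers, lenses_per_camera, shrink=0.7, min_thresh=1e-9):
    """Cluster reconstructed per-image optical centers (N, 3) into one
    cluster per camera position by iteratively reducing the distance
    metric until the clusters pass validation."""
    Z = linkage(centers, method="single")
    thresh = float(np.ptp(centers))   # start: span of all image positions
    while thresh > min_thresh:
        labels = fcluster(Z, t=thresh, criterion="distance")
        sizes = np.bincount(labels)[1:]          # labels start at 1
        if np.all(sizes == lenses_per_camera):   # proxy for step S108
            return labels
        thresh *= shrink                         # reduce distance metric
    raise RuntimeError("no distance metric yields validated clusters")
```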
Further in this respect, if only some camera positions are recovered during the 3D reconstruction, then the clustering process can be performed from an iteration point where the distance metric encompasses the span of all reconstructed image positions, down to a distance metric just above the point where each image position is in its own cluster (i.e., the distance metric is reduced until all images except the closest two are in their own clusters). In such a case, the iteration with the largest number of image positions validated against the camera geometry can be used. If no iteration contains validated clusters, then the 3D reconstruction can be repeated using a different parametrization of the SfM method, or using a different available SfM method. If multiple iterations contain the same total number of validated clusters, then the iteration with the smallest distance metric can be used.
In some examples, the scans in the 3D environment are performed with at least two different types of cameras 300a:300d and thus more than one kind of camera geometry is used in the validation step (i.e., step S108). In such case, each cluster can be validated against each of the supported camera geometries to determine which cluster fits best to which camera geometry. That is, in some embodiments, the 2D images 110 have been captured by at least two cameras 300a:300d with different camera geometries, and the clusters are validated for each of the different camera geometries.
In some examples, the 2D images contain an explicit association to a specific camera lens. In such cases, this explicit or inferred association can be used to connect the 2D images to the relative positions and/or viewing directions of the camera geometry, or geometries. In particular, in some embodiments, each of the 2D images 110 comprises metadata specifying with which image sensor 310a, 310b the 2D image was captured, and the 2D images 110 are arranged into the clusters based on the metadata. The metadata can be provided e.g. in the exchangeable image file format (EXIF) data, or be inferred from the image filename, or from the image data itself (e.g., through detecting distinct image distortions unique to a specific lens).
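For instance, inferring the association from the image filename could look as follows. This is a sketch; the "posNN_lensX" naming pattern is an invented example, not a convention required by the embodiments.

```python
import re


def lens_from_filename(name: str) -> str:
    """Extract a lens identifier from names such as 'pos03_lensB.jpg'."""
    m = re.search(r"lens([A-Za-z0-9]+)", name)
    if m is None:
        raise ValueError(f"no lens identifier found in {name!r}")
    return m.group(1)


print(lens_from_filename("pos03_lensB.jpg"))  # prints: B
```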
In some examples, the camera geometry is received as user input. For example, the user might indicate the camera geometry as an abstract arrangement of lenses (e.g., six lenses equidistantly spaced in a circle, as for camera 300d), provided that the user also provides the lens-to-lens distances in the abstract arrangement.
Further aspects of how the scale factor can be calculated will be disclosed next.
In some embodiments, the mapping between the clusters and the camera geometry is defined by a relation between a relative distance between the optical center positions of 2D images 110 in the 3D point cloud 600 with relative scale and an absolute distance between the optical center positions 340a, 340b of 2D images 110 in a reference frame defined by the camera geometry. In some embodiments, the scale factor, s, is determined as:
$$s = \frac{\left\lVert o_j^i - o_k^i \right\rVert}{\left\lVert \tilde{o}_j^i - \tilde{o}_k^i \right\rVert} \quad (1)$$

where $o_j^i$ and $o_k^i$ represent the optical center positions of 2D images $j$ and $k$ for camera position $i$ in a reference frame defined by the camera geometry, and $\tilde{o}_j^i$ and $\tilde{o}_k^i$ represent the optical center positions of 2D images $j$ and $k$ for camera position $i$ in coordinates of the 3D point cloud 600 with relative scale.
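A direct transcription of Equation (1) could read as follows (the numeric values are invented for illustration):

```python
import numpy as np


def scale_factor(o_j, o_k, o_j_rel, o_k_rel):
    """s = ||o_j - o_k|| / ||o~_j - o~_k|| for one camera position i."""
    return float(np.linalg.norm(o_j - o_k)
                 / np.linalg.norm(o_j_rel - o_k_rel))


# Known geometry: two lenses 10 cm apart. SfM reconstructed the same
# two optical centers 0.42 units apart, so s maps 0.42 units to 10 cm.
s = scale_factor(np.array([-0.05, 0.0, 0.0]), np.array([0.05, 0.0, 0.0]),
                 np.array([0.00, 0.0, 0.0]), np.array([0.42, 0.0, 0.0]))
print(s)  # ~0.238; step S112 multiplies all relative-scale points by s
```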
In some embodiments, the 3D point cloud 120, 700 with absolute scale is obtained by scaling each point in the 3D point cloud 600 with relative scale with the scale factor. In some examples, the scale factor is applied to the points $\tilde{x}_n$ in the 3D point cloud 600 with relative scale as

$$x_n = s \, \tilde{x}_n$$

where $x_n$ is a point in the absolute scale. Intermediate reference is here made to Fig. 7 in which is schematically illustrated a 3D reconstruction output as in Fig. 6 but with absolute scale. Fig. 7 thereby provides a schematic representation of a 3D point cloud 700 with absolute scale. The 3D point cloud 700 is scaled back to real-world metric scale using the camera geometry, which describes the orientation and position of the camera lenses. Hence, in Fig. 7 is shown the same 22 lens positions (enumerated from “0-a” to “10-a”, and from “0-b” to “10-b”) as in Fig. 6, where again the arrows indicate the viewing direction of each lens. That is, lens positions “0-a” and “0-b” correspond to the camera at position “0” in Fig. 5, and so on. But the difference with respect to Fig. 6 is that the 3D point cloud 700 has absolute scale (whereas the 3D point cloud 600 does not).
In some embodiments, the scale factor is calculated for only one of the clusters. However, in other embodiments, one scale factor is calculated per each of the clusters, and the scale factor with which the 3D point cloud 600 with relative scale is scaled is a weighted average of all calculated scale factors. For example, for increased robustness, the scale factor can be computed as the mean of all ratios (as in Equation (1)) between the different 2D images at the same sensor positions. In the presence of outliers, the median can be used instead. In some examples, the scale factor is computed as a weighted average of all such ratios, where the relative weight of each ratio can be based on some fitment accuracy between the 2D images and the camera geometry.
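The aggregation variants described above could be sketched as follows. The fitment-accuracy weights are an assumption; their exact definition is left open here.

```python
import numpy as np


def aggregate_scale(ratios, weights=None, robust=False):
    """Combine per-cluster scale ratios into one scale factor."""
    ratios = np.asarray(ratios, dtype=float)
    if robust:                 # median: robust against outlier clusters
        return float(np.median(ratios))
    if weights is None:        # plain mean of all ratios
        return float(ratios.mean())
    w = np.asarray(weights, dtype=float)
    return float((w * ratios).sum() / w.sum())


print(aggregate_scale([0.27, 0.26, 0.28, 0.95], robust=True))  # 0.275
```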
In some examples, the camera geometry (e.g., as represented by the numerator of the ratio in Equation (1)) can be used as constraints in a pose graph optimization. That is, in some embodiments, the optical center positions $o_j^i$ and $o_k^i$ of 2D images j and k for camera position i in a reference frame defined by the camera geometry are used as constraints in a pose graph optimization during which the 3D point cloud 600 with relative scale is scaled with the scale factor. In this respect, the pose graph nodes represent the absolute pose, i.e., the pose with respect to the 3D model, and its edges represent the relative pose between the connected nodes. In accordance with the herein disclosed embodiments, the camera geometry can act as a constraint on the norm of the translation between 2D images captured by the same camera. After pose graph optimization, the 3D points in the 3D model can be re-triangulated given the scaled positions of the 2D images. Optionally, an additional bundle adjustment step can be performed to optimize both the poses and the 3D points.
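A toy version of such a constrained refinement is sketched below. It is a sketch only: positions are optimized without rotations, the whole SfM geometry is kept rigid up to the unknown scale, and the first node is anchored at the origin to fix the translation gauge; the pose graph described above additionally carries rotations and is followed by re-triangulation and bundle adjustment.

```python
import numpy as np
from scipy.optimize import least_squares


def refine(centers_rel, rig_pairs, baseline_m):
    """Jointly estimate metric image positions and the scale factor,
    with the known lens baseline as a constraint on translation norms."""
    n = len(centers_rel)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]

    def residuals(x):
        s, p = x[0], x[1:].reshape(n, 3)
        r = list(p[0])  # anchor node 0 at the origin (translation gauge)
        for i, j in edges:       # keep the SfM geometry, scaled by s
            r.extend(p[j] - p[i] - s * (centers_rel[j] - centers_rel[i]))
        for i, j in rig_pairs:   # known lens-to-lens distance constraint
            r.append(np.linalg.norm(p[j] - p[i]) - baseline_m)
        return np.asarray(r)

    x0 = np.concatenate(([1.0], centers_rel.ravel()))
    sol = least_squares(residuals, x0)
    return float(sol.x[0]), sol.x[1:].reshape(n, 3)
```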
Fig. 8 schematically illustrates, in terms of a number of structural units, the components of an image processing device 200 according to an embodiment. Processing circuitry 210 is provided using any combination of one or more of a suitable central processing unit (CPU), multiprocessor, microcontroller, digital signal processor (DSP), etc., capable of executing software instructions stored in a computer program product 1010 (as in Fig. 10), e.g. in the form of a storage medium 230. The processing circuitry 210 may further be provided as at least one application specific integrated circuit (ASIC), or field programmable gate array (FPGA).
Particularly, the processing circuitry 210 is configured to cause the image processing device 200 to perform a set of operations, or steps, as disclosed above. For example, the storage medium 230 may store the set of operations, and the processing circuitry 210 may be configured to retrieve the set of operations from the storage medium 230 to cause the image processing device 200 to perform the set of operations. The set of operations may be provided as a set of executable instructions.
Thus the processing circuitry 210 is thereby arranged to execute methods as herein disclosed. The storage medium 230 may also comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory. The image processing device 200 may further comprise a communications (comm.) interface 220 at least configured for communications with other entities, functions, nodes, and devices, as required for the image processing device 200 for generating a 3D point cloud 120, 700 with absolute scale in accordance with the herein disclosed embodiments. As such the communications interface 220 may comprise one or more transmitters and receivers, comprising analogue and digital components. The processing circuitry 210 controls the general operation of the image processing device 200 e.g. by sending data and control signals to the communications interface 220 and the storage medium 230, by receiving data and reports from the communications interface 220, and by retrieving data and instructions from the storage medium 230. Other components, as well as the related functionality, of the image processing device 200 are omitted in order not to obscure the concepts presented herein.
Fig. 9 schematically illustrates, in terms of a number of functional modules, the components of an image processing device 200 according to an embodiment. The image processing device 200 of Fig. 9 comprises a number of functional modules; an obtain module 210a configured to perform step S102, a generate module 210b configured to perform step S104, an arrange module 210c configured to perform step S106, a calculate module 210e configured to perform step S110, and a scale module 210f configured to perform step S112. The image processing device 200 of Fig. 9 may further comprise a number of optional functional modules, such as a validate module 210d configured to perform step S108. In general terms, each functional module 210a:210f may in one embodiment be implemented only in hardware and in another embodiment with the help of software, i.e., the latter embodiment having computer program instructions stored on the storage medium 230 which, when run on the processing circuitry, make the image processing device 200 perform the corresponding steps mentioned above in conjunction with Fig. 2. It should also be mentioned that even though the modules correspond to parts of a computer program, they do not need to be separate modules therein, but the way in which they are implemented in software is dependent on the programming language used. Preferably, one or more or all functional modules 210a:210f may be implemented by the processing circuitry 210, possibly in cooperation with the communications interface 220 and/or the storage medium 230. The processing circuitry 210 may thus be configured to fetch instructions from the storage medium 230 as provided by a functional module 210a:210f and to execute these instructions, thereby performing any steps as disclosed herein.
The image processing device 200 may be provided as a standalone device or as a part of at least one further device. A first portion of the instructions performed by the image processing device 200 may be executed in a first device, and a second portion of the instructions performed by the image processing device 200 may be executed in a second device; the herein disclosed embodiments are not limited to any particular number of devices on which the instructions performed by the image processing device 200 may be executed. Hence, the methods according to the herein disclosed embodiments are suitable to be performed by an image processing device 200 residing in a cloud computational environment. Therefore, although a single processing circuitry 210 is illustrated in Fig. 8, the processing circuitry 210 may be distributed among a plurality of devices, or nodes. The same applies to the functional modules 210a:210f of Fig. 9 and the computer program 1020 of Fig. 10.
Fig. 10 shows one example of a computer program product 1010 comprising computer readable storage medium 1030. On this computer readable storage medium 1030, a computer program 1020 can be stored, which computer program 1020 can cause the processing circuitry 210 and thereto operatively coupled entities and devices, such as the communications interface 220 and the storage medium 230, to execute methods according to embodiments described herein. The computer program 1020 and/or computer program product 1010 may thus provide means for performing any steps as herein disclosed.
In the example of Fig. 10, the computer program product 1010 is illustrated as an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc. The computer program product 1010 could also be embodied as a memory, such as a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or an electrically erasable programmable read-only memory (EEPROM) and more particularly as a non-volatile storage medium of a device in an external memory such as a USB (Universal Serial Bus) memory or a Flash memory, such as a compact Flash memory. Thus, while the computer program 1020 is here schematically shown as a track on the depicted optical disk, the computer program 1020 can be stored in any way which is suitable for the computer program product 1010.
The inventive concept has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the inventive concept, as defined by the appended patent claims.

Claims

1. A method for generating a 3D point cloud (120, 700) with absolute scale, wherein the method is performed by an image processing device (200), and wherein the method comprises: obtaining (S102) a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); generating (S104) a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); arranging (S106) the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); calculating (S110) a scale factor based on a mapping between the clusters and the camera geometry; and obtaining (S112) the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
2. The method according to claim 1, wherein the camera geometry pertains to an optical center position of the at least one camera (300a:300d) and a viewing direction of the at least one camera (300a:300d) per camera position (510).
3. The method according to claim 1 or 2, wherein the at least one camera (300a:300d) comprises at least two image sensors (310a, 310b), and wherein the camera geometry pertains to a relative placement of the at least two image sensors (310a, 310b) per camera (300a:300d).
4. The method according to claim 3, wherein the relative placement encompasses at least one of rotation and translation.
5. The method according to any preceding claim, wherein one viewing direction per each of the 2D images (110) is recovered as part of generating the 3D point cloud (600) with relative scale, and wherein the method further comprises: validating (S108) the clusters by comparing the optical center positions and the viewing directions of the 2D images (110) in each cluster to optical center positions (340a, 340b) and viewing direction (330a, 330b) of the camera positions (510), with one camera position (510) per cluster.
6. The method according to any preceding claim, wherein the 2D images (110) are arranged in the clusters by iteratively associating the 2D images (110) with the clusters according to a distance metric, wherein in each iteration the distance metric is reduced.
7. The method according to a combination of claims 5 and 6, wherein the distance metric is reduced until all clusters are successfully validated.
8. The method according to claim 5, 6, or 7, wherein the 2D images (110) have been captured by at least two cameras (300a:300d) with different camera geometries, and wherein the clusters are validated for each of the different camera geometries.
9. The method according to any preceding claim, wherein each of the 2D images (110) comprises metadata specifying with which image sensor (310a, 310b) the 2D image was captured, and wherein the 2D images (110) are arranged into the clusters based on the metadata.
10. The method according to any preceding claim, wherein the mapping between the clusters and the camera geometry is defined by a relation between a relative distance between the optical center positions of 2D images (110) in the 3D point cloud (600) with relative scale and an absolute distance between the optical center positions (340a, 340b) of 2D images (110) in a reference frame defined by the camera geometry.
11. The method according to any preceding claim, wherein the scale factor, s, is determined as:
$$s = \frac{\left\lVert o_j^i - o_k^i \right\rVert}{\left\lVert \tilde{o}_j^i - \tilde{o}_k^i \right\rVert}$$

where $o_j^i$ and $o_k^i$ represent the optical center positions of 2D images j and k for camera position i in a reference frame defined by the camera geometry, and $\tilde{o}_j^i$ and $\tilde{o}_k^i$ represent the optical center positions of 2D images j and k for camera position i in coordinates of the 3D point cloud (600) with relative scale.
12. The method according to any preceding claim, wherein the scale factor is calculated for only one of the clusters.
13. The method according to any of claims 1 to 11, wherein one scale factor is calculated per each of the clusters, and wherein the scale factor with which the 3D point cloud (600) with relative scale is scaled is a weighted average of all calculated scale factors.
14. The method according to any preceding claim, wherein the 3D point cloud (120, 700) with absolute scale is obtained by scaling each point in the 3D point cloud (600) with relative scale with the scale factor.
15. The method according to any preceding claim, wherein the optical center positions
$o_j^i$ and $o_k^i$
of 2D images j and k for camera position i in a reference frame defined by the camera geometry are used as constraints in a pose graph optimization during which the 3D point cloud (600) with relative scale is scaled with the scale factor.
16. An image processing device (200) for generating a 3D point cloud (120, 700) with absolute scale, the image processing device (200) comprising processing circuitry (210), the processing circuitry being configured to cause the image processing device (200) to: obtain a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); generate a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); arrange the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); calculate a scale factor based on a mapping between the clusters and the camera geometry; and obtain the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
17. An image processing device (200) for generating a 3D point cloud (120, 700) with absolute scale, the image processing device (200) comprising: an obtain module (210a) configured to obtain a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); a generate module (210b) configured to generate a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); an arrange module (210c) configured to arrange the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); a calculate module (210e) configured to calculate a scale factor based on a mapping between the clusters and the camera geometry; and a scale module (210f) configured to obtain the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
18. The image processing device (200) according to claim 16 or 17, further being configured to perform the method according to any of claims 2 to 15.
19. A computer program (1020) for generating a 3D point cloud (120, 700) with absolute scale, the computer program comprising computer code which, when run on processing circuitry (210) of an image processing device (200), causes the image processing device (200) to: obtain (S102) a set of 2D images (110) of a 3D environment, wherein the 2D images (110) have been captured from a set of camera positions (510) by at least one camera (300a:300d), and wherein each of the at least one camera (300a:300d) has a camera geometry that is known to the image processing device (200); generate (S104) a 3D point cloud (600) with relative scale from the set of 2D images (110), wherein one optical center position in a coordinate system of the 3D point cloud (600) is recovered per each of the 2D images (110) as part of generating the 3D point cloud (600); arrange (S106) the 2D images (110) into one cluster per camera position (510) based on the optical center positions of the 2D images (110); calculate (S110) a scale factor based on a mapping between the clusters and the camera geometry; and obtain (S112) the 3D point cloud (120, 700) with absolute scale by scaling the 3D point cloud (600) with relative scale with the scale factor.
20. A computer program product (1010) comprising a computer program (1020) according to claim 19, and a computer readable storage medium (1030) on which the computer program is stored.
PCT/EP2023/081427, filed 2023-11-10 (priority date 2023-11-10): Generation of 3D point cloud with absolute scale. Pending. WO2025098625A1 (en).

Priority Applications (1)

PCT/EP2023/081427 (WO2025098625A1, en): priority date 2023-11-10, filing date 2023-11-10, "Generation of 3D point cloud with absolute scale"

Applications Claiming Priority (1)

PCT/EP2023/081427 (WO2025098625A1, en): priority date 2023-11-10, filing date 2023-11-10, "Generation of 3D point cloud with absolute scale"

Publications (1)

Publication number: WO2025098625A1

Family

ID=88837516

Family Applications (1)

PCT/EP2023/081427: WO2025098625A1 (en), pending

Country Status (1)

WO: WO2025098625A1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
US20200066034A1 (en) * Priority date: 2017-02-27; publication date: 2020-02-27; assignee: Katam Technologies AB; title: Improved forest surveying
WO2023102552A1 (en) * Priority date: 2021-12-03; publication date: 2023-06-08; assignee: Hover Inc.; title: System and methods for validating imagery pipelines

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
HESHENG YIN, "SLAM-Based Self-Calibration of a Binocular Stereo Vision Rig in Real-Time", Sensors, vol. 20, no. 3, 22 January 2020, p. 621, ISSN 1424-8220, DOI: 10.3390/s20030621 *
LANGPING LI, "Recovering absolute scale for Structure from Motion using the law of free fall", Optics and Laser Technology, vol. 112, 1 April 2019, pp. 514-523, ISSN 0030-3992, DOI: 10.1016/j.optlastec.2018.11.045 *


Legal Events

Code 121 (Ep): the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 23806191; Country of ref document: EP; Kind code of ref document: A1.

