A SYSTEM AND METHOD TO CREATE THREE-DIMENSIONAL MODELS IN REAL-TIME FROM STEREOSCOPIC VIDEO PHOTOGRAPHS
FIELD OF THE INVENTION The present invention relates to the creation of three-dimensional (3-D) models, and more specifically to the creation of such models in real time from stereoscopic video photographs.
BACKGROUND OF THE INVENTION Three-dimensional photography is not new; it has been available for over a hundred years through stereoscopic cameras.
Panoramic photography, the taking of a photograph covering a field of view ranging from wide-angle to an entire 360-degree panorama, has a long history in photography.
US patent 5,646,679, issued on July 8, 1997, discloses an image processing method and apparatus that uses a pair of digital cameras to capture separate overlapping images. The overlapping portions of the images form combined image information, from which a single image covering a wide field of view may be created.
While achieving improved panoramic photography, these methods do not provide the visual image data necessary to produce a 3-dimensional image or model. US patent 7,724,379, issued on May 25, 2010, discloses a 3-dimensional shape measuring method and apparatus using a pattern projector and an image capturing device having a fixed relative positional relationship to each other. The range is calculated in accordance with the deformation of the projected light pattern as it appears in the image captured by the camera. US patent 7,463,280, issued on Dec. 9, 2008, discloses a digital 3D/360-degree camera system that uses several digital cameras to capture the image data necessary to create an accurate digital model of a 3-dimensional scene. None of the above inventions and patents, taken either singly or in combination, is seen to describe the instant invention as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS Fig. 1 is a block diagram of the system of the present invention.
Fig. 2 is a diagrammatic illustration of a stereoscopic field of view using two digital cameras that are part of the system of the present invention.
Fig. 3 is a block diagram of the laser or LED projector that is part of the system of the present invention. Fig. 4 is an example of light patterns (markers) projected by the laser or LED projector.
Figs. 5A-5C are a flow chart explaining the steps of the method of 3-D modeling in an embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS The present invention is a 3-D modeling system and method for creating 3-D models in real time from stereoscopic video photographs, designated generally as 10 in the drawings.
The 3-D modeling system and method may operate in two modes:
• standalone system mode, for building three-dimensional (3-D) models of objects for different uses such as advertising, product catalogs, etc.
• smart glasses system mode, wherein the system is embedded in smart glasses and measures the area in which the user moves, so that the smart glasses can recognize the surrounding environment and place augmented information at the correct range from the user, on the relevant real object.
As shown in Fig. 1, the 3-D modeling system 10 includes at least two digital cameras 12, 14, a texture video camera 20 and a laser or LED projector 30. Cameras 12, 14 are oriented so that each camera's field of view overlaps with the field of view of the other camera to form a stereoscopic field of view, and the field of view of the texture camera 20 overlaps the stereoscopic field of view, as detailed later in the description related to Fig. 2.
Both cameras 12 and 14 are medium-resolution (e.g. 480x640 VGA resolution) cameras, have the same optical properties, e.g. pixel size, distance between pixels and focal length, and are panchromatic cameras. Camera 20 is a high-resolution (e.g. 1,920x1,080 HD resolution or higher) color camera. The purpose of camera 20 is to record the texture of the objects in the field of view. This texture is later used in the building of the 3-D model and gives it a real-life presence. The cameras are under the control of a controller 32, which is depicted schematically in the figure. The controller is a system that includes a video processor 40, a command and control unit 44, a memory 48, a mass storage device 50, a clock 52, an external communication interface 54, a battery 58 and a camera interface 60. The controller could be a general-purpose computer system such as a Personal Computer (PC) with sufficient computing resources, or a custom-designed computer system. The memory is large enough to store the images captured by the cameras.
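By way of example and not limitation, the camera arrangement described above may be represented in software along the following lines; the pixel pitch and focal length values in this sketch are illustrative assumptions, not specifications of the invention.

```python
from dataclasses import dataclass

@dataclass
class CameraConfig:
    width: int              # horizontal resolution in pixels
    height: int             # vertical resolution in pixels
    pixel_pitch_um: float   # distance between pixel centers (assumed value)
    focal_length_mm: float  # focal length (assumed value)
    color: bool             # False = panchromatic, True = color

# Cameras 12 and 14: identical medium-resolution panchromatic pair.
stereo_camera = CameraConfig(width=640, height=480,
                             pixel_pitch_um=6.0, focal_length_mm=8.0,
                             color=False)

# Camera 20: high-resolution color texture camera.
texture_camera = CameraConfig(width=1920, height=1080,
                              pixel_pitch_um=3.0, focal_length_mm=8.0,
                              color=True)
```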
The cameras have a command and data interface that is connected to the camera interface. Commercially available cameras are typically equipped with a Universal Serial Bus (USB), FireWire, or another interface for command and data transfer. Additionally, it is desirable that the cameras be equipped with a digital command line that allows a digital signal to cause the camera to capture an image. Use of a single digital control line allows all the cameras to be commanded simultaneously, by a single digital control signal, to capture an image.
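A minimal sketch of such simultaneous triggering is given below; the gpio object and its set_high/set_low methods are hypothetical placeholders for whatever trigger hardware is actually used.

```python
import time

class TriggerLine:
    """Hypothetical wrapper around the single digital control line
    shared by all cameras; the real hardware API will differ."""

    def __init__(self, gpio):
        self.gpio = gpio  # placeholder handle to the digital line

    def pulse(self, width_s=0.001):
        """Raise the line briefly; every camera wired to it captures
        one frame on the rising edge. Returns the trigger time so the
        clock 52 can tag the resulting image files."""
        t = time.monotonic()
        self.gpio.set_high()
        time.sleep(width_s)
        self.gpio.set_low()
        return t
```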
The clock is used to schedule image capture, to tag the image data files that are captured and to synchronize commands to the cameras. The clock should have a resolution and accuracy of 0.01 msec or better.
The external communication interface 54 may be any data communication interface, and may employ a wired, fiber optic, wireless, or other method of connection with an external device, e.g. a photo engine reader, a smartphone, or the user interface of the smart glasses.
A computer software program, stored on the mass storage device 50 and executed in the memory, directs the controller to perform its various functions, such as commanding the cameras to capture image data and storing the image data files. It is also responsible for calculating the range from the digital cameras to any point in the area and, together with the images output from the texture video camera, for building the 3-D model of the target object.
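By way of illustration only, the control program might be organized as in the following skeleton; every method name on the controller object here is a hypothetical placeholder, not part of the claimed system.

```python
def main_loop(controller):
    # Illustrative skeleton of the control program described above.
    images = controller.capture_stereo_pair()       # cameras 12, 14
    texture = controller.capture_texture()          # camera 20
    controller.store(images + [texture])            # mass storage 50
    cloud = controller.build_point_cloud(images)    # range per pixel
    model = controller.apply_texture(cloud, texture)
    controller.send(model)                          # interface 54
```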
In traditional stereoscopic photography, two cameras in a stereoscopic pair are spaced a certain distance apart and each has the same field-of-view angle. As shown in Fig. 2, the two cameras may be spaced any distance 300 apart. The lenses should have a field-of-view overlap 310 of at least 90%. The projector and the texture camera are mounted midway between cameras 12 and 14. The cameras are rigidly mounted and their precise geometry is known.
The problem is to identify the same point of the landscape in the two images. Traditionally, this is done using an image processing procedure called correlation, but it requires high computing power and its precision depends on the lighting conditions. In order to overcome these issues, the system of the present invention projects a light pattern onto the landscape using a laser or LED projector, which is depicted schematically in Fig. 3. The projector projects a plurality of patterns onto the target object while, at the same time, the cameras capture it. The projector includes an infrared light source 300, a mask 310 and optical elements 320. In one embodiment, a lens is used as the optical elements. The mask is a component produced in photolithography technology, consisting of a pattern that blocks or transmits portions of the light beam being projected. It is a passive flat glass coated with an opaque material to create the needed pattern. The cameras include a filter that transmits the wavelength of the projector and rejects all other wavelengths. The purpose of the filter is to reduce ambient light and enhance the illumination markers to the cameras.
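By way of example, a mask pattern of the kind described above could be generated in software before being transferred to the photolithographic mask, as in the following sketch; it lays out the 4x4-pixel markers described below on a 5x5-pixel grid, and its use of plain sequential bit codes is an assumption made for brevity.

```python
import numpy as np

def build_mask(rows=96, cols=128, cell=5):
    """Create a binary mask holding one unique 4x4 marker per 5x5
    cell (a 4x4 pattern plus a one-pixel separating border).
    Successive 16-bit integers serve as marker codes here; a
    production mask would use codes chosen for optical robustness."""
    mask = np.zeros((rows * cell, cols * cell), dtype=np.uint8)
    code = 1  # skip the all-dark pattern
    for r in range(rows):
        for c in range(cols):
            bits = [(code >> i) & 1 for i in range(16)]
            marker = np.array(bits, dtype=np.uint8).reshape(4, 4)
            mask[r * cell:r * cell + 4, c * cell:c * cell + 4] = marker
            code += 1
    return mask
```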
If, at any time, all the markers are unique and do not repeat themselves, every image captured by the cameras will include the same marker, as the captured marker is the optical reflection of the marker projected by the projector. Based on this method, a large number of correspondences between the pixel coordinates on the captured images of the two cameras can be retrieved.
In order to be able to identify the markers and create a geometric structure that is unique, each marker has to be built from a collection of 4 pixels by 4 pixels, and the angular size of each pixel of the marker should be the same as the angular size of each pixel of the cameras. Assuming that the cameras have VGA resolution (480x640 pixels, i.e. 307,200 pixels), and assuming that each marker is separated from the neighboring marker by an empty pixel line, so that each marker occupies about 5x5 pixels, 12,288 different markers may be applied. An example of the markers is shown in Fig. 4.
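The figures quoted above can be verified with a few lines of arithmetic (a sketch assuming the markers are laid out on a regular 5x5-pixel grid):

```python
width, height = 640, 480                    # VGA resolution
cell = 5                                    # 4x4 marker + 1-pixel separation
print(width * height)                       # 307200 pixels in each image
print((width // cell) * (height // cell))   # 12288 distinct markers
```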
A search window of 6x6 pixels moving across the field of view enables the identification of the markers in each of the captured images. Once a marker is identified in the two pictures, the range from the object to every pixel of the marker on the camera focal plane can be calculated from the two images by solving the stereoscopic equation. As shown in Figs. 5A-5C, the process to build the 3-D model includes the following stages:
1. Capturing images from the two cameras 12, 14 (500).
2. Building a cloud of points (530).
3. Capturing an image from the texture video camera, synchronized with cameras 12, 14 (550).
4. Applying texture to the cloud of points (560).
5. Generating the three-dimensional and three-hundred-and-sixty-degree model (570).
1. Capturing images from the two digital cameras 12, 14 (500)
This step includes:
1.1. Capturing an image of the scene, with the target lit by the projector light, by camera 12 - image 1 (510).
1.2. Capturing an image of the scene, with the target lit by the projector light, by camera 14 - image 2 (520).
2. Building a cloud of points (530)
This step includes:
2.1. Going over all of image 1 and performing:
2.1.1. Searching for markers in image 1 by activating a search window of 6x6 pixels (532).
2.1.2. Building a list of the pixels relevant to each marker in image 1 (534).
2.2. Going over all of image 2 and performing:
2.2.1. Searching for markers in image 2 by activating a search window of 6x6 pixels (538).
2.2.2. Building a list of the pixels relevant to each marker in image 2 (540).
2.3. Calculating, for every pixel in the images, the range to the object (544), obtaining about 307,200 ranges, one for every pixel; each range is associated with a unique coordinate in the image.
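A minimal sketch of steps 2.1 through 2.3 follows, assuming a rectified, parallel camera pair so that the standard stereoscopic equation Z = f*B/d applies (f: focal length in pixels, B: the baseline 300 between cameras 12 and 14, d: the disparity of a matched marker); the decode() routine, which recognizes a 4x4 marker inside a 6x6 window and returns its identifier, is a hypothetical placeholder.

```python
def find_markers(image, decode):
    """Steps 2.1/2.2: slide a 6x6 search window across a 2-D image
    array and map each decoded marker id to the window's top-left
    pixel coordinate."""
    h, w = image.shape
    found = {}
    for y in range(h - 5):
        for x in range(w - 5):
            marker_id = decode(image[y:y + 6, x:x + 6])
            if marker_id is not None and marker_id not in found:
                found[marker_id] = (y, x)
    return found

def build_point_cloud(image1, image2, decode, focal_px, baseline_m):
    """Step 2.3: for every marker seen in both images, solve
    Z = f * B / d, here approximating all 4x4 pixels of a marker
    by the marker's shared disparity."""
    m1 = find_markers(image1, decode)
    m2 = find_markers(image2, decode)
    cloud = []
    for marker_id, (y1, x1) in m1.items():
        if marker_id not in m2:
            continue
        y2, x2 = m2[marker_id]
        d = x1 - x2                       # disparity in pixels
        if d <= 0:
            continue                      # reject impossible matches
        z = focal_px * baseline_m / d     # range to the object
        for dy in range(4):
            for dx in range(4):
                cloud.append((y1 + dy, x1 + dx, z))
    return cloud
```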
3. Capturing an image from the texture video camera (550)
This step includes:
3.1. Capturing an image of the scene, with the target lit by the projector light, by texture video camera 20 - image 3 (552). As mentioned earlier, all the cameras are synchronized.
4. Generating the reconstructed object (560)
This step includes:
4.1. Generating the reconstructed target object by stretching the texture image from stage 3 over the cloud of points from stage 2, as defined above.
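A minimal sketch of this stretching operation follows, assuming a hypothetical project() function, derived from the known, fixed geometry of texture camera 20, that maps a 3-D point to texture-image coordinates.

```python
import numpy as np

def apply_texture(points, texture_image, project):
    """Step 4.1: attach to every point of the cloud the color of the
    texture pixel it projects onto. `texture_image` is an HxWx3
    array from camera 20; `project` is a hypothetical placeholder."""
    textured = []
    for p in points:                 # p = (x, y, z)
        u, v = project(p)            # texture image coordinates
        color = texture_image[int(v), int(u)]
        textured.append((p, tuple(color)))
    return textured
```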
5. Generating the three-hundred-and-sixty-degree model (570)
The purpose is to build an all-around model of the target object, either by measuring the same scene while moving the camera set around it, or by measuring the same scene while rotating the target object with the cameras and the projector fixed.
This step includes, in an embodiment where the cameras are rotated:
5.1. Moving the cameras around the target object by a pre-defined angle (572) (e.g. 120 degrees).
5.2. Performing stages 1-4 (574).
5.3. Stitching the reconstructed object obtained at stage 4 at the current camera position to the reconstructed object obtained at stage 4 at the previous camera position (576).
5.4. Performing steps 5.1-5.3 to cover a full rotation (three hundred and sixty degrees) around the target object.
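The rotation loop of stage 5 may be sketched as follows, assuming rotation about the vertical axis; capture_and_reconstruct (stages 1-4) and rotate_rig (the mechanical motion 572) are hypothetical placeholders.

```python
import math

def build_full_model(capture_and_reconstruct, rotate_rig, step_deg=120):
    """Stage 5: repeat stages 1-4 at each camera position and stitch
    the partial reconstructions into one all-around model."""
    model = []
    for angle in range(0, 360, step_deg):    # e.g. 0, 120, 240 degrees
        partial = capture_and_reconstruct()  # stages 1-4 at this angle
        theta = math.radians(angle)
        for (x, y, z), color in partial:
            # rotate the partial cloud back into the common frame
            xr = x * math.cos(theta) - z * math.sin(theta)
            zr = x * math.sin(theta) + z * math.cos(theta)
            model.append(((xr, y, zr), color))
        rotate_rig(step_deg)                 # move to the next position
    return model
```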
In an embodiment of the present invention, the result of the method provides actual distances between the cameras and the objects perceived by said cameras in the 3-D model.