CN107077719B - Perspective correction based on depth map in digital photos - Google Patents

Perspective correction based on depth map in digital photos

Info

Publication number
CN107077719B
CN107077719B · CN201580057165.5A
Authority
CN
China
Prior art keywords
photograph
pixel
camera
depth
new
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201580057165.5A
Other languages
Chinese (zh)
Other versions
CN107077719A (en)
Inventor
T·斯沃特达尔
P·克雷恩
N·塔拉隆
J·迪马雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Polarite Corp
Original Assignee
Polarite Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Polarite Corp
Publication of CN107077719A
Application granted
Publication of CN107077719B
Status: Active
Anticipated expiration

Abstract

The present invention relates to post-processing of digital photographs to correct perspective distortion in the photograph. The correction technique uses a digital photograph of a scene and a depth map associated with the photograph, the depth map holding a depth for each pixel in the photograph, the depth being the distance between the part of the scene represented by the pixel and the position of the camera at the time the photograph was taken. The correction is performed locally, so that the correction of any pixel in the photograph depends on the depth of that pixel. The correction technique may be implemented by converting each pixel of the original photograph to a new position in the corrected photograph. Pixel values must then be calculated for the pixels of the corrected photograph using the original pixel values and their new positions. The invention is particularly relevant for photographs of objects or scenes involving significant magnification differences, such as self-photographs, close-up photographs and photographs in which the extension of a large object is not orthogonal to the optical axis of the camera (low/high-angle shots).

Description

Perspective correction based on depth map in digital photos
Technical Field
The present invention relates to post-processing of digital photographs, and more particularly to a method, a digital storage holding software, an integrated circuit, and a handheld device with a camera for performing such post-processing.
Background
Distortion of perspective
Photography provides a 2D representation of the 3D world. This 3D to 2D conversion is achieved by projecting a 3D scene onto a 2D sensor with a lens.
At great distances, and if the optical axis of the camera is perpendicular to the extension of the object, the perspective projection provides a photograph that is "pleasing to the eye" and looks "as expected". However, if the distances from the camera to the closest and farthest parts of the object differ greatly, the near part will appear at a different magnification than the far part, and this difference in magnification causes perspective distortion in the picture of the object.
There are many well-known perspective distortion effects. When the extension of a larger object is not orthogonal to the optical axis, parallel lines in the object are not parallel in the photograph, since the near end is magnified more than the far end. This is the case, for example, when shooting skyscrapers at a low angle. Another frequently encountered effect is that when the distance between the camera and the object is of the same order of magnitude as the depth of the topology of the object, the near parts appear disproportionate to the far parts. This is the case, for example, in a self-photograph (a picture the subject takes of himself or herself with a hand-held camera), where the arm's-length distance (30 to 50 cm) between the camera and the subject's head and the distance between the nose and the ears are of roughly the same order of magnitude, making the nose look unnaturally large.
Thus, perspective distortion can affect any photograph in which an object or scene involves a high magnification difference.
Perspective correction
The problem of perspective correction has been partially solved in some specific applications, most of the time using tools that require user interaction. Existing tools allow correction of photographs taken at low and high angles, but these corrections are based on knowledge of the camera position and orientation or on an overall correction of the assumed geometry of the scene (e.g. parallel lines in a building).
Most currently available correction schemes only correct optical distortions caused by the camera optics, such as fisheye, barrel or pincushion distortion. In these cases, the optical distortion is modeled and a global correction is applied.
DxO ViewPoint 2 (http://www.dxo.com/intl/photography/dxo-viewpoint/wide-angle-lens-software) represents the current state of the art in perspective distortion correction. The DxO ViewPoint application allows correction of perspective distortion caused by the camera optics when using a wide-angle lens. The application also allows correction of vanishing lines by performing different projections, but this correction is independent of the distance between the camera and the object or scene and cannot correct near and far distortions simultaneously. The applied correction is global and is applied independently of the topology of the scene or objects in the photograph.
The global correction is based on a smooth correction function determined by a few parameters relating to camera-extrinsic and camera-intrinsic parameters and/or to "prior" data or user-defined data. For example, Figs. 1A-1D illustrate image correction by DxO ViewPoint, applying the same correction technique to a photograph and a checkerboard pattern. Fig. 1A shows the original photograph and pattern; lens distortion correction is applied in Fig. 1B, perspective correction (natural mode) in Fig. 1C, and perspective correction (complete mode) in Fig. 1D. A problem with global correction techniques such as those in DxO ViewPoint is that they are applied over the entire image without regard to the subject. The correction may be adjusted to a given object in a given plane at a given location through user interaction, but if the scene includes objects that are not equidistant from the camera, the correction cannot be satisfactory for all of them.
There are smartphone applications such as SkinneePix and Facetune that allow users to improve the appearance of self-photographs. SkinneePix simply warps the picture geometrically, relying on the picture containing only one face at its center, i.e. a known geometry. Facetune allows local changes of shape, but is essentially a simplified Photoshop tool dedicated to facial photographs that lets the user control local distortion. Facetune is not a perspective correction tool and does not rely on image depth information.
Software exists to create 3D models from multiple cameras (e.g. RGB + TOF camera, stereo camera, or structured-light systems such as Microsoft Kinect). Another example is the paper "Kinect-Variety Fusion: A Novel Hybrid Approach for Artifacts-free 3DTV Content Generation" by Mansi Sharma et al., which describes the extraction of depth information by combining multiple sensors, using a Microsoft Kinect camera and structured-light technology to improve depth map extraction. The second part of the paper (Section B, page 2277) relates to content generation for 3DTV, generating new views of a scene from multiple images captured by cameras moving relative to the scene. These stereoscopic techniques do not provide a method to automatically correct perspective distortion in a single photograph. 3D graphics libraries such as OpenGL likewise provide no method for perspective correction using depth information.
To date, the tools available for perspective correction apply either global corrections, which presume some geometry of the object (straight lines, a face in the center) or recording conditions (low angle, close range), or local corrections, which require user interaction and the user's knowledge of the natural or desired appearance of the scene or object (commonly known as photoshopping).
Disclosure of Invention
It is an object of the present invention to provide a method, a digital storage holding software, an integrated circuit and a handheld device with a camera for eliminating or compensating local magnification differences throughout a photo by performing perspective correction using depth information for the photo. Another object is to perform such perspective correction automatically, without requiring a priori knowledge or user input about the natural appearance or topology of the scene or object in the photograph, or about the position or orientation of the camera.
To perform the perspective correction of the present invention, a photograph alone is not sufficient. The distance between the camera and the scene for each pixel in the photograph, i.e. a depth map, is required. Once the photograph and depth map are obtained, a photograph as taken by a virtual camera from a different perspective (a different angle, a farther distance) with the same field of view can be reconstructed using the image and appropriate processing, thus eliminating or compensating any perspective distortion. Instead of a global conversion applied to the entire image, a local correction of each pixel is performed according to the pixel's distance taken from the associated depth map.
Accordingly, in a first aspect, the present invention provides a method for performing perspective correction in a digital photograph of a scene recorded by a camera, the method comprising the steps of:
providing a digital photograph of the scene represented by pixel values P_(x,y) for an array of pixels (x, y), the photograph suffering from perspective distortion effects;
providing a depth map relating to the photograph and comprising a depth d_(x,y) for each pixel in the array of pixels, the depth d_(x,y) being the distance between the part of the scene represented by the pixel and the acquisition position of the camera when the photograph was taken; and
performing perspective correction using a photograph of the scene and its associated depth map, the correction of any pixel or region in the photograph depending on the depth of the pixel or the depth of the pixels in the region.
In this context, correction is to be understood as a process of making a change to solve a problem or produce a desired effect, and is not limited to a change made to make something accurate or correct according to a standard. The perspective correction may therefore also be a correction made to achieve a specific effect in the photograph. Also, the originally obtained photograph with perspective distortion, i.e. the original or source photograph, is often referred to as the "captured photograph", while the final or target photograph, on which the perspective correction has been performed, is generally referred to as the "processed photograph". In a more detailed definition, the depth is the distance between an object and the object principal plane of the camera lens. For a mobile phone, this plane is very close to the protection window of the camera. It should be noted that the depth map may have a different (and typically lower) resolution than the photograph. In this case, the depth map may be upscaled so that it has the same resolution and valid depth information is obtained for each pixel value.
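Where the depth map has a lower resolution than the photograph, the upscaling step might be sketched as follows (illustrative Python/OpenCV code, not part of the patent text; nearest-neighbour interpolation is an assumption chosen to avoid mixing depths across object edges):

import cv2

def upscale_depth_map(depth_map, photo_shape):
    # Resize the depth map to the photo resolution so that a valid
    # depth d(x, y) is available for every pixel (x, y).
    h, w = photo_shape[:2]
    return cv2.resize(depth_map, (w, h), interpolation=cv2.INTER_NEAREST)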
The perspective correction of the present invention is advantageous in that it can be performed without user interaction, so it can be implemented as a fully automatic correction that provides an immediate improvement of the photograph. Another advantage is that the perspective correction of the present invention is local, so that the local perspective distortion of each pixel or region in the photograph can be corrected. The correction can thus simultaneously and independently correct perspective distortion of different parts of the scene lying at different depths.
In a preferred embodiment, the step of performing perspective correction comprises: performing perspective correction of the photograph using the photograph of the scene and its associated depth map, i.e. for each pixel (x, y) in the photograph, determining a new position D_proc in the image plane of a virtual camera position at least from the depth d_acq(x, y) of the pixel, its position D_acq(x, y) in the image plane of the acquisition position, and a displacement C between the virtual camera position and the acquisition position. Preferably, the new position is calculated so as to maintain the magnification in a selected plane of the depth map.
In this relationship, the displacement is the difference between the final and original position (movement in the x, y, z direction) and direction (rotation about the x, y, z axis), i.e. here the difference between the camera position at the time of taking the picture and the virtual camera position. Furthermore, the magnification is the ratio between the real size of the object and the size of the object in the photograph, which for each pixel is equal to the ratio between the depth of the pixel and the effective focal length.
In an alternative, the step of performing perspective correction using the at least one photograph of the scene and its associated depth map preferably comprises the steps of:
using the depth of each pixel (x, y) in the photograph to calculate the magnification of that pixel; and
using the calculated magnification to calculate a new position (x', y') for each pixel in the photograph, such that all new positions have the same magnification.
In another embodiment, the step of performing perspective correction comprises determining the new position as follows:

D_proc = D_acq * d_acq/(d_acq + C)
Moreover, the step of performing perspective correction may further comprise using the depth d_acq_ref of a reference plane, selected so as to maintain the magnification within the selected plane of the depth map, to adjust the magnification of the new position. This embodiment may be implemented by determining the new position as follows:

D_proc = D_acq * d_acq * (d_acq_ref + C)/(d_acq_ref * (d_acq + C))
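As a minimal sketch of this per-pixel conversion (illustrative Python/NumPy code, not part of the patent text; the optical axis is assumed to pass through the image centre, and the focal length cancels out of the position ratio):

import numpy as np

def new_positions(depth, C, d_ref):
    # depth: depth map d_acq(x, y) in metres, upscaled to photo resolution.
    # C: displacement of the virtual camera along the optical axis (+ away).
    # d_ref: depth of the reference plane whose magnification is maintained.
    h, w = depth.shape
    y, x = np.mgrid[0:h, 0:w].astype(np.float64)
    xc, yc = x - w / 2.0, y - h / 2.0   # coordinates relative to the optical axis
    # Radial scaling per the conversion: D_proc = D_acq * d*(d_ref + C)/(d_ref*(d + C))
    scale = depth * (d_ref + C) / (d_ref * (depth + C))
    return xc * scale + w / 2.0, yc * scale + h / 2.0   # new (decimal) positions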
If a new position has been determined, the step of performing perspective correction preferably further comprises: using the pixel values P_(x,y) and the new positions (x', y') to determine new pixel values P_(i,j) for the array of pixels (i, j); the corrected photograph is rendered by, for each new position (x', y'), adding the corresponding pixel value P_(x,y) to the new pixel values P_(i,j) of the pixels (i, j) surrounding the new position, wherein the pixel value P_(x,y) is weighted by a weighting factor that is a function of the new position (x', y') and the relative position of each pixel (i, j).
If a new pixel value is generated by the addition of weighted values of one or more pixel values in the acquired picture, the perspective correction preferably further comprises the subsequent steps of: i.e. each new pixel value P(i,j)Divided by a normalization factor. In areas where the photograph is "stretched," some new pixels may not be near any new locations and thus may not have any pixel values assigned to them. Therefore, the method preferably further comprises the subsequent steps of: for having an indeterminate value P(i,j)According to the determined value P in the corrected picture(i,j)The interpolated pixel value is calculated.
In a preferred implementation, the displacement between the virtual camera position and the acquisition position is a linear displacement along the optical axis of the camera. It may also be preferred to place the virtual camera position at an infinite distance, so that the magnification becomes equal for all depths.
The depth map may be generated by a multi-camera or a single-camera setup. In a preferred embodiment, the steps of providing a photograph and providing the related depth map involve only a single camera with a single acquisition position. The single camera preferably has an adjustable lens or any other optical element with adjustable optical magnification.
The step of providing a depth map may involve generating the depth map using focus-based depth map estimation, such as depth from focus (DFF, also known as shape from focus) or depth from defocus (DFD). By definition, DFF or DFD provides a depth map that is fully consistent with a photograph taken by the same camera at the same position, thus offering the advantage of reduced processing complexity and eliminating any calibration between the photograph and the depth map.
In an alternative embodiment, the steps of providing a picture and providing a depth map involve the use of multiple cameras, for example:
using separate image and depth-map cameras;
generating the image and depth map using a stereo or array camera; or
generating the image and depth map using multiple cameras and from different perspectives.
Since not all photographs are perspective distorted to an extent where perspective correction is needed or preferred, and since perspective correction requires some processing power, it may be preferable to select the photographs for which perspective correction is needed. Thus, in a preferred embodiment, the steps of providing a photograph and providing a depth map may comprise providing a series of photographs and their associated depth maps, wherein the method further comprises the step of detecting and evaluating perspective distortion in the photographs in order to select photographs that would benefit from perspective correction. The evaluation of perspective distortion may be based, for example, on the distance to the nearest object in the scene and/or on an analysis of vanishing lines.
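A hedged sketch of one such selection heuristic, based on the nearest-object distance mentioned above (the limits are illustrative assumptions, not values from the patent):

def needs_correction(depth, near_limit=0.8, ratio_limit=1.5):
    # Flag a photo when the nearest object is close to the camera (metres)
    # and the near/far depth span within the scene is large.
    d_near = float(depth.min())
    d_far = float(depth.max())
    return d_near < near_limit and d_far > ratio_limit * d_near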
In a further embodiment, the same conversion used to correct perspective distortion in the photograph may be applied to the depth map itself, to generate a depth map associated with the corrected photograph. Here, the method of the first aspect further comprises performing perspective correction also on the depth map, i.e. for each pixel (x, y) in the depth map, determining a new position D_proc in the image plane of the virtual camera position at least from the depth d_acq(x, y) of the pixel, its position D_acq(x, y) in the image plane of the acquisition position, and a displacement C between the virtual camera position and the acquisition position. In this respect, a depth map is simply a single-channel image (all pixel values are depths), while a color photograph is typically a three-channel image (pixel values in one of three color channels, e.g. red, green and blue). The processing is equivalent; the only difference is that only the depth map itself is required, since the depth of each pixel is inherent to it. Preferably, the displacement C is added to each depth (pixel value) of the corrected depth map.
The perspective correction according to the first aspect of the invention may be performed immediately in the camera, or as post-processing in the camera or in other devices.
The invention may be implemented in hardware, software, firmware or any combination of these. Thus, in a second aspect, the invention provides a digital storage holding software which, when executed by one or more digital processors, performs the method of the first aspect. The digital storage may be any one or more readable media that can store digital code, such as a diskette, hard drive, RAM or ROM, and the software may reside on a single medium (e.g. the memory of a device with a camera) or be distributed over several media, for example the hard drives of different servers connected via a network, or other types of electronic storage.
In a third aspect, the invention provides an integrated circuit configured to perform the method of the first aspect.
Similarly, in a fourth aspect, the invention provides a hand-held or portable device with a camera comprising the data memory of the second aspect or the integrated circuit of the third aspect.
Drawings
The invention will be described in more detail with reference to the accompanying drawings. The drawings illustrate one way of carrying out the invention and are not intended to limit other possible embodiments within the scope of the appended claims.
Fig. 1A-D show the image correction applied by DxO ViewPoint 2, the correction applied to the photo and checkerboard pattern being the same.
Fig. 2 and 3 show an arrangement for explaining an applicable algebraic derivation according to an embodiment of the invention.
Fig. 4 shows the way in which the new pixel value is calculated.
Fig. 5A-C illustrate the use of an adaptive kernel for interpolating pixel values for pixels having undetermined pixel values.
FIG. 6 is a schematic system diagram illustrating an embodiment of the method of the present invention and a schematic representation of an overview of the operation of the computer program product of the present invention.
Detailed Description
The following description focuses mainly on the perspective distortion that occurs when the ratio between the distances from the camera to the farthest and closest parts of the scene is high and generates severe distortion. This occurs mainly in close-up or low-angle photography. However, the perspective correction of the present invention can be applied to any scene topology.
The perspective correction method of the present invention converts a photograph from the camera point of view (POV) at acquisition to a virtual POV, such that in the processed photograph the perspective distortion effect is weakened, negligible or absent. A further object is to convert to a new POV (distant or at infinity) while maintaining the same object size.
Therefore, for each pixel (x, y) in the acquired photograph, the new position (x', y') at which its pixel value P_(x,y) appears in the processed photograph must be calculated. The calculated new position is a function of the pixel position (x, y) in the acquired photograph and the distance d_(x,y) between the camera and the part of the scene seen in that pixel.
The following describes an embodiment in which the displacement between the camera position at the time of taking the photograph (also referred to as the original camera position) and the camera position of the virtual POV is a linear displacement along the camera's optical axis (here the Z-axis). More complex displacements (displacements in the x/y directions and rotations) can be applied, but the algebra for such displacements, although readily derived, is quite extensive.
Fig. 2 shows the setup with the object, the camera position at which the photograph is taken, and the virtual camera position; the camera has a lens with focal length f.
The following symbols will be used in the description and are shown in fig. 2 and 3:
d: depth of a pixel
D: distance of a pixel from the sensor center/optical axis
C: displacement along the Z-axis between the acquisition position and the virtual position; +: away from the scene; -: closer to the scene
Subscript "acq": refers to the acquired photograph
Subscript "proc": refers to the processed photograph from the virtual camera position
Coordinates/index (x, y): integer position of a pixel in the acquired photograph
P: pixel value, e.g. RGB or another color space
Coordinates/index (x', y'): new (decimal) position of the pixel value of pixel (x, y) after conversion
Coordinates/index (i, j): integer position of a pixel in the processed photograph
The following geometrical relationships can be taken from fig. 2:
d_acq/D = f/D_acq
d_proc/D = f/D_proc
d_proc = d_acq + C

=> D_proc/D_acq = d_acq/d_proc
=> D_proc = D_acq * d_acq/d_proc
=> D_proc = D_acq * d_acq/(d_acq + C)    (1)
As previously mentioned, the magnification is the ratio between the real size of the object and the size of the object in the photograph. With reference to Fig. 2, the magnification in the acquired photograph may be expressed as D/D_acq = d_acq/f, and the magnification in the processed photograph as D/D_proc = d_proc/f = (d_acq + C)/f.
The conversion (1) changes the magnification of the whole picture. If we want a reference plane in which the magnification factor is one, we need to calculate the magnification factor at that distance. This process is illustrated in Fig. 3. The reference plane is preferably selected close to the center of the object in the direction towards the camera. For example, for a face, the plane of the face contour (hair/ears) may be chosen as the reference plane, maintaining the head size while correcting the nose distortion.
The magnification on the reference plane is:
D_proc_ref/D_acq_ref = d_acq_ref/(d_acq_ref + C)    (2)

Substituting the reference magnification (2) into the conversion (1) yields:

D_proc = D_acq * d_acq * (d_acq_ref + C)/(d_acq_ref * (d_acq + C))    (3)
If C is infinite (same magnification for all objects), we get:

D_proc = D_acq * d_acq/d_acq_ref    (4)
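As a quick numeric illustration of (4) (with assumed values, not from the patent): in a selfie with the nose at d_acq = 0.30 m and the reference plane at the ears, d_acq_ref = 0.45 m, moving the virtual camera to infinity scales the radial positions of nose pixels by 0.30/0.45 ≈ 0.67 while leaving the reference plane unchanged, shrinking the rendered nose by about a third relative to the head.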
Because of the axial symmetry around the optical axis (Z-axis), the conversion expressed in (3) is naturally written in polar coordinates: D is the radial coordinate, and the angular coordinate φ is unaffected by the conversion. It should be noted that other expressions with the same or similar results as conversion (3) may be developed using, for example, other coordinate systems, other camera optics, or other conditions. An important feature of conversion (3) is that the conversion, and thus the correction, of any pixel in the photograph depends on the depth d_acq of that pixel. Therefore, if a photograph and an associated depth map are obtained with perspective distortion, and a virtual camera position is selected for which the perspective distortion is significantly reduced or absent, this perspective correction is in principle done.
The conversion (3) can be used to calculate a pixel having a pixel value P(x,y)If the photograph is taken with a camera at a virtual location, the pixel will be at that location (x ', y'). As can be inferred from (3), the new position is the pixel depth d(x,y)Position D and the displacement C between the virtual camera position and the position at which the photograph was taken. The new position maintains the magnification of the selected plane of the depth map by substituting the reference magnification in (2) into (3).
In a preferred embodiment, the conversion (3) is used as a forward conversion: the new position (x', y') of the pixel value P_(x,y) is computed from the original pixel position (x, y). The forward conversion involves some complications, as several pixels of the acquired photograph may contribute to a single pixel in the processed photograph, while some pixels in the processed photograph may receive no contribution at all. An inverse conversion could also be used, but its computational requirements are higher for perspective correction, so a forward mapping is preferably implemented.
In the forward conversion, the acquired photograph is scanned and new coordinates are calculated for each point. However, in order to represent the converted photograph in a standard digital format with a regular array of pixels of the same size, each holding a single value in some color space, the converted photograph requires more processing. Merely repositioning each pixel (x, y) to its new position (x', y') would create a picture with overlapping points (multi-source pixels) and with points holding no pixel value (black holes).
For each source pixel P_(x,y), x and y are integer values, while the coordinates x' and y' in the processed photograph are decimal values. These coordinates may be represented as an integer part (i, j) plus a fractional part (δx, δy):

x' = i + δx
y' = j + δy
First, a pixel value P(x,y)Are assigned to pixels within the pixel array in the processed photograph as explained in connection with fig. 4. P(x,y)'s is the pixel value in the target picture, and X's is the pixel value P in the processed picture(i,j)Is measured in the center of the pixel (i, j). For each calculated new position, the corresponding pixel value will act on the pixel values of nearby pixels in the processed picture. In a preferred embodiment, this is achieved as follows.
For each pixel in the acquired photograph, a new pixel value is determined by adding the corresponding pixel value P_(x,y) to the new pixel values P_(i,j) of the four pixels closest to the new position in the corrected photograph. When added, the pixel value P_(x,y) is weighted by a weighting factor that is a function of the new position (x', y') and the relative position of the pixel (i, j), resulting in a bilinear interpolation, as shown in Fig. 4:

P_(i,j) → P_(i,j) + P_(x,y) * (1 - δx) * (1 - δy)
P_(i+1,j) → P_(i+1,j) + P_(x,y) * δx * (1 - δy)
P_(i,j+1) → P_(i,j+1) + P_(x,y) * (1 - δx) * δy
P_(i+1,j+1) → P_(i+1,j+1) + P_(x,y) * δx * δy
Thus, in the processed photograph, weighted values of the original pixel values are accumulated in each pixel. To normalize the new pixel values in the processed photograph, each pixel value is subsequently divided by a normalization factor.
In practical applications, a "photo accumulation buffer" is created. First it is filled with zeros, and each time a pixel value P_(x,y) contributes to a pixel P_(i,j) of the processed photograph, the weighted value of P_(x,y) is accumulated in the buffer. At the same time, the sum of the weighting factors (e.g. (1 - δx)*(1 - δy)) is accumulated in a "weighting buffer". Once the forward mapping is complete, each pixel of the photo accumulation buffer is divided by the corresponding weighting factor of the weighting buffer to generate a "weighted perspective corrected photograph". This solves the problem that some pixels in the processed photograph receive information from several pixels of the acquired photograph through the forward mapping.
In the above, the acquired pixel values are accumulated in the surrounding pixels using bilinear interpolation and weighting factors such as δx*δy. However, other ways of distributing the acquired pixel values, and thus other weighting factors, are possible, and other interpolation methods such as bicubic or spline interpolation may also be used.
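A minimal sketch of this forward mapping with the two buffers (illustrative Python/NumPy code, looping for clarity; xp and yp are the new-position maps as returned by the new_positions sketch above):

import numpy as np

def forward_map(photo, xp, yp):
    # photo: (h, w, 3) acquired photo; xp, yp: decimal positions (x', y').
    h, w = photo.shape[:2]
    acc = np.zeros((h, w, 3), dtype=np.float64)  # photo accumulation buffer
    wgt = np.zeros((h, w), dtype=np.float64)     # weighting buffer
    for y in range(h):
        for x in range(w):
            i, j = int(np.floor(xp[y, x])), int(np.floor(yp[y, x]))
            dx, dy = xp[y, x] - i, yp[y, x] - j
            for ii, jj, wf in ((i, j, (1 - dx) * (1 - dy)),
                               (i + 1, j, dx * (1 - dy)),
                               (i, j + 1, (1 - dx) * dy),
                               (i + 1, j + 1, dx * dy)):
                if 0 <= ii < w and 0 <= jj < h:
                    acc[jj, ii] += wf * photo[y, x]  # accumulate weighted value
                    wgt[jj, ii] += wf                # accumulate weighting factor
    valid = wgt > 0
    out = np.zeros_like(acc)
    out[valid] = acc[valid] / wgt[valid][:, None]    # divide by the weighting buffer
    # The depth map (a single-channel image) can be warped the same way,
    # adding C to each warped depth value afterwards.
    return out, ~valid                               # corrected photo and hole mask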
The weighted perspective corrected photograph may contain "holes", i.e. pixels to which no pixel of the acquired photograph contributed and whose value is therefore undetermined. To fill the holes, interpolated pixel values are calculated for pixels with undetermined values P_(i,j) from the surrounding pixels of the corrected photograph that have determined values P_(i,j). In the preferred embodiment, the weighted perspective corrected photograph is scanned, and each time a hole is found, an interpolated value for that pixel is calculated by Inverse Distance Weighting (IDW) from the valid pixels. More information about IDW can be found at, e.g., http://en.wikipedia.org/wiki/Inverse_distance_weighting.
Since the size of a hole is unknown, the distance to the nearest determined pixel value is also unknown. To ensure fast processing and avoid unfilled holes, an adaptive kernel size may be used, as shown in Figs. 5A-C: in Figs. 5A and 5B a 3×3 kernel is used. In Fig. 5A, the value of the active pixel, i.e. of the hole, can be calculated by IDW from the 6 surrounding pixels that have determined pixel values. In Fig. 5B, however, the surrounding pixels provide no data for the IDW, so no determined value is available for the active pixel/hole, and the kernel size must be increased, for example to the 5×5 kernel of Fig. 5C. Here, the value of the active pixel/hole can be calculated by IDW from the 10 surrounding pixels that have determined values.
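A minimal sketch of the adaptive-kernel IDW hole filling (illustrative code; the kernel growth strategy and maximum radius are assumptions):

import numpy as np

def fill_holes_idw(img, holes, max_radius=7):
    # img: (h, w, 3) weighted perspective corrected photo; holes: boolean mask
    # of pixels with undetermined values.
    h, w = img.shape[:2]
    out = img.copy()
    for y, x in zip(*np.nonzero(holes)):
        r = 1                                   # start with a 3x3 kernel
        while r <= max_radius:
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            ok = ~holes[y0:y1, x0:x1]           # determined neighbours only
            if ok.any():
                d = np.hypot(yy[ok] - y, xx[ok] - x)
                wts = 1.0 / d                   # inverse distance weights
                out[y, x] = (wts[:, None] * img[y0:y1, x0:x1][ok]).sum(0) / wts.sum()
                break
            r += 1                              # grow the kernel: 5x5, 7x7, ...
    return out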
If a hole is too large for pixel interpolation, it can instead be filled by texture mapping. For example, in a selfie, the side of the nose may be an area where holes grow as the virtual distance increases; in that case, skin texture mapping may be used.
Fig. 6 shows the overall process of the perspective correction method.
As mentioned before, the depth map itself is a single-channel image whose pixel values are all depths, and it can be corrected by the same conversion process and steps as the photograph of the scene. The displacement C between the actual and virtual camera positions may be added to each depth/pixel value after the conversion. The corrected depth map is then associated with the corrected photograph and provides the distances from the different parts of the scene to the image plane of the virtual camera position.
Self-photographing application
The preferred embodiment of the present invention relates to perspective correction in self-photographs. A self-photograph is, by its very nature, a photograph taken at close range, for example with the built-in camera of a mobile phone, laptop or tablet (the largest possible distance usually being an arm's length). These close-range photographs most often show perspective distortion: the face is close to the camera, and the ratio between the distance from the camera to the farthest part (ears) and the distance to the closest part (nose) differs strongly from unity.
Thus, self-photographs are typically suited for an automatic perspective correction technique, and in order to pick out photographs that would benefit from the aforementioned perspective correction, automatic detection and evaluation of perspective distortion can be combined with pattern recognition that detects, in the depth map, the depth differences characteristic of a human face.
Also, in self-photography, the background (the part of the scene behind the photographed person) can be determined using the depth map information. Thus, in a preferred embodiment, the method of the invention involves detecting one or more foreground objects and the background in the photograph by analysing the depth map, identifying regions in which the depth changes rapidly and regions at least partially outlined by such regions; regions with a smaller average depth are identified as foreground objects and regions with a larger average depth as the background. In an alternative embodiment, the method may involve detecting one or more foreground objects and the background by analysing the depth map so that regions with a depth of less than 300 cm (e.g. less than 200 cm, 150 cm or 100 cm) are identified as foreground objects and regions with greater depth as the background; a sketch of this variant is given below.
After detecting the background and foreground parts of the photograph, the background can be replaced with other image content (a photograph, a painting, graphics, or any combination thereof) while the foreground objects are retained.
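A minimal sketch of the threshold-based variant and the background replacement (illustrative code; the depth map is assumed to be given in centimetres):

import numpy as np

def split_foreground(depth_cm, threshold=300.0):
    # Regions closer than the threshold become foreground, the rest background.
    fg = depth_cm < threshold
    return fg, ~fg

def replace_background(photo, new_bg, fg_mask):
    # Keep the foreground objects, substitute other image content behind them.
    return np.where(fg_mask[..., None], photo, new_bg)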
Technical implementation
The invention may be implemented in hardware, software, firmware or any combination thereof. The invention, or some of its features, may also be implemented as software running on one or more data processors and/or digital signal processors. Fig. 6 can also be regarded as a schematic system block diagram depicting an overview of the operation of an embodiment of the computer program product according to the second aspect of the present invention. The individual elements of a hardware embodiment of the invention may be physically, functionally and logically implemented in any suitable way, e.g. in a single unit, in a plurality of units, or as parts of separate functional units. The invention may be implemented in a single unit, or be physically and functionally distributed between different units and processors.
The integrated circuit of the third aspect of the invention may be a general-purpose ISP, a microprocessor or an application-specific integrated circuit (ASIC), or some part thereof. This may be advantageous especially for hand-held or portable devices with cameras as described in the fourth aspect of the invention, for which low cost, power consumption, weight, volume, heat generation, etc. are important. Handheld devices with cameras include digital cameras, mobile phones, tablet computers, mp3 players and the like. Portable devices with cameras include, for example, laptop computers.
While the invention has been described in connection with specific embodiments, the invention should not be construed as being limited in any way to the examples presented. The scope of the invention is to be construed in accordance with the substance defined by the following claims. In the context of the claims, the term "comprising" or "comprises" does not exclude other possible elements or steps. Furthermore, references to words such as "a" or "an" should not be construed as excluding the plural. The use of reference signs in the claims with respect to elements illustrated in the figures shall not be construed as limiting the scope of the invention. Furthermore, advantageous combinations of features may be obtained from the individual features mentioned in different claims, and the mentioning of these features in different claims does not exclude that a combination of features is possible and advantageous.

Claims (11)

1. A method of automatically performing perspective correction in a digital photograph of a scene recorded with a camera, the method comprising the steps of:
using only a single camera with a single acquisition position,
providing a digital photograph of the scene represented by pixel values P_(x,y) for an array of pixels (x, y), said photograph suffering from perspective distortion effects;
providing a depth map relating to the photograph and comprising a depth d_(x,y) which, for each pixel in the array of pixels, refers to the distance between the part of the scene represented by the pixel and the acquisition position of the camera at the time the photograph was taken; and
using the photograph of the scene from the single camera and acquisition position and its associated depth map as the only input recorded by the camera to perform a perspective correction of the photograph, i.e. for each pixel (x, y) in the photograph, determining a new position D_proc in the image plane of a virtual camera position at least from the depth d_acq(x, y) of the pixel, its position D_acq(x, y) in the image plane of the acquisition position, and a linear displacement C along the camera optical axis between the virtual camera position and said acquisition position, as follows:

D_proc = D_acq * d_acq/(d_acq + C)
2. The method of claim 1, wherein the step of performing perspective correction further comprises: also using the depth d_acq_ref of a reference plane to adjust the magnification of the new position, the reference plane being selected so as to maintain the magnification of the selected plane of the depth map.
3. The method of claim 2, wherein the step of performing perspective correction comprises determining the new position as follows:

D_proc = D_acq * d_acq * (d_acq_ref + C)/(d_acq_ref * (d_acq + C))
4. The method of claim 1, wherein the step of performing perspective correction further comprises: using said pixel values P_(x,y) and said new positions (x', y') to determine new pixel values P_(i,j) for the array of pixels (i, j), by, for each new position (x', y'), adding the corresponding pixel value P_(x,y) to the new pixel values P_(i,j) of the pixels (i, j) surrounding the new position, so as to render a corrected photograph, wherein the pixel value P_(x,y) is weighted by a weighting factor that is a function of the new position (x', y') and the relative position of each pixel (i, j).
5. The method of claim 4, further comprising: subsequently dividing each new pixel value P_(i,j) by a normalization factor.
6. The method of claim 4 or 5, further comprising: subsequently, for pixels having an undetermined value P_(i,j), calculating an interpolated pixel value from the surrounding pixels in the corrected photograph having determined values P_(i,j).
7. The method of claim 1, wherein the steps of providing a photograph and providing a depth map comprise providing a series of photographs and their associated depth maps; wherein the method further comprises the steps of: detecting and evaluating perspective distortion in the photographs based on the distance of the nearest object in the scene or on an analysis of vanishing lines; selecting photographs with perspective distortion that would benefit from perspective correction; and automatically performing perspective correction in the selected photographs.
8. The method of claim 1, further comprising: performing perspective correction also on the depth map, i.e. for each pixel (x, y) in the depth map, determining a new position D_proc in the image plane of the virtual camera position at least from the depth d_acq(x, y) of the pixel, its position D_acq(x, y) in the image plane of the acquisition position, and a displacement C between the virtual camera position and the acquisition position.
9. A computer-readable storage medium, on which a computer program is stored, the computer program being executable by a processor for performing the method of claim 1.
10. An integrated circuit configured to perform the method of any of claims 1-8.
11. A hand-held or portable device with a camera comprising a computer readable storage medium according to claim 9 or an integrated circuit according to claim 10.

Applications Claiming Priority (3)

Application Number · Priority Date · Filing Date · Title
EP14183766.6 · 2014-09-05
EP14183766 · 2014-09-05
PCT/EP2015/070246 (WO2016034709A1) · 2014-09-05 · 2015-09-04 · Depth map based perspective correction in digital photos

Publications (2)

Publication Number · Publication Date
CN107077719A (en) · 2017-08-18
CN107077719B (en) · 2020-11-13

Family

ID=51518566

Family Applications (1)

Application Number · Title · Priority Date · Filing Date
CN201580057165.5A (Active) · Perspective correction based on depth map in digital photos · 2014-09-05 · 2015-09-04

Country Status (5)

Country · Link
US: US10154241B2 (en)
EP: EP3189493B1 (en)
CN: CN107077719B (en)
DK: DK3189493T3 (en)
WO: WO2016034709A1 (en)

Also Published As

Publication number · Publication date
EP3189493B1 (en) · 2018-11-07
US10154241B2 (en) · 2018-12-11
US20170289516A1 (en) · 2017-10-05
EP3189493A1 (en) · 2017-07-12
CN107077719A (en) · 2017-08-18
WO2016034709A1 (en) · 2016-03-10
DK3189493T3 (en) · 2019-03-04

Legal Events

Date · Code · Title · Description
PB01 · Publication
SE01 · Entry into force of request for substantive examination
GR01 · Patent grant
