
Method for generating new visual angle image, related method and related product

Info

Publication number
CN117291947A
Authority
CN
China
Prior art keywords
image
depth
target
original
view angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311232786.5A
Other languages
Chinese (zh)
Inventor
刘家铭 (Liu Jiaming)
梁瑛平 (Liang Yingping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaohongshu Technology Co ltd
Original Assignee
Xiaohongshu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaohongshu Technology Co ltd
Priority to CN202311232786.5A
Publication of CN117291947A
Legal status: Pending

Abstract

The application discloses a method for generating a new view angle image, together with related methods and related products. The method comprises the following steps: acquiring an original image and depth information of the original image, wherein the original image is obtained by a camera shooting a target scene at an original view angle, and the target scene comprises a background and a moving target; processing the original image and the depth information with a neural network to obtain a plurality of original depth images of a multi-plane image at the original view angle; generating a first depth image and a second depth image based on the plurality of original depth images; and determining a target occlusion relationship based on the first depth image and the second depth image, wherein the target occlusion relationship is the occlusion relationship between the background and the moving target when the camera shoots the target scene at a new view angle.

Description

Method for generating new visual angle image, related method and related product
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a method for generating a new view angle image, a related method and a related product.
Background
With the rapid development of computer vision technology, mature techniques have emerged in recent years that can generate an image at a new view angle from an original image captured at an original view angle. However, when the original image contains a moving target in motion and a static background, determining the occlusion relationship between the moving target and the background at the new view angle is of great importance for improving the quality of the image at the new view angle.
Disclosure of Invention
The application provides a method for generating a new view angle image, related methods and related products. The related methods include a training method for an optical flow model and a path planning method. The related products include: an apparatus for generating a new view angle image, a training apparatus for an optical flow model, a vehicle, an electronic device, and a computer-readable storage medium.
In a first aspect, a method of generating a new view angle image is provided, the method comprising:
acquiring an original image and depth information of the original image, wherein the original image is an image obtained by shooting a target scene by a camera at an original view angle, and the target scene comprises a background and a moving target;
processing the original image and the depth information by using a neural network to obtain a plurality of original depth images of the multi-plane image (multiplane images, MPI) under the original view angle;
generating a first depth image and a second depth image based on the plurality of original depth images; the first depth image includes: in a case where the camera photographs the background at a new view angle, depth information of the background, the second depth image includes: the depth information of the moving target under the condition that the camera shoots the moving target at a new view angle;
And determining a target shielding relation based on the first depth image and the second depth image, wherein the target shielding relation is the shielding relation between the background and the moving target when the camera shoots the target scene at the new view angle.
In combination with any one of the embodiments of the present application, the method further includes:
processing the original image and the depth information by using a neural network to obtain the MPI;
generating a new image under the new view angle based on the MPI, wherein the new image is an image obtained by shooting the target scene under the new view angle by the camera;
and correcting the shielding relation between the background and the moving target by utilizing the target shielding relation aiming at the new image to obtain a target image under the new view angle.
In combination with any one of the embodiments of the present application, the determining, based on the first depth image and the second depth image, a target occlusion relationship includes:
generating a background reference image obtained by shooting the target scene by the camera in the first pose and a moving target reference image obtained by shooting the moving target by the camera in the second pose based on the original image; the first pose is a pose of the camera shooting the background at the new view angle, and the second pose is a pose of the camera shooting the moving target at the new view angle;
Determining a first position range of the background in the background reference image and a second position range of the moving target in the moving target reference image;
determining a position of an intersection of the first position range and the second position range as a third position range;
and determining an occlusion relation between the background and the moving target in the third position range based on the first depth image and the second depth image, and taking the occlusion relation as the target occlusion relation.
In combination with any one of the embodiments of the present application, the determining, based on the first depth image and the second depth image, an occlusion relationship between the background and the moving target in the third position range, as the target occlusion relationship, includes:
determining a first reference depth of the background in the third position range according to the first depth image;
determining a second reference depth of the moving target in the third position range according to the second depth image;
determining the target shielding relation as that the moving target is shielded by the background under the condition that the first reference depth is smaller than the second reference depth;
and under the condition that the first reference depth is larger than or equal to the second reference depth, determining the target shielding relation as that the background is shielded by the moving target.
In combination with any embodiment of the present application, the pose of the original image obtained by shooting with the camera is the original pose;
the determining a first position range of the background in the background reference image and a second position range of the moving target in the moving target reference image comprises:
acquiring an original mask, a first conversion relation and a second conversion relation of the moving target in the original image, wherein the first conversion relation is a conversion relation between the original pose and the first pose, and the second conversion relation is a conversion relation between the original pose and the second pose;
converting the original mask based on the first conversion relation to obtain a first reference mask of the moving target in the background reference image;
converting the original mask based on the second conversion relation to obtain a second reference mask of the moving target in the moving target reference image;
determining the first position range according to the first reference mask;
and determining the second position range according to the second reference mask.
In combination with any one of the embodiments of the present application, the generating a first depth image and a second depth image based on the plurality of original depth images includes:
Converting the original depth images into background depth images under the new view angle based on the first conversion relation;
converting the plurality of original depth images into a plurality of moving target depth images under the new view angle based on the second conversion relation;
performing volume rendering on the plurality of background depth images to obtain the first depth image;
and performing volume rendering on the multiple moving target depth images to obtain the second depth image.
In a second aspect, a method for training an optical flow model is provided, the method comprising:
acquiring an original image and a target image obtained according to any implementation of the first aspect;
acquiring dense optical flow between the original image and the target image;
taking the dense optical flow as a label of the original image and the target image;
training a model to be trained based on the original image, the target image and the label to obtain an optical flow model, wherein the optical flow model is used for estimating dense optical flow.
In a third aspect, a path planning method is provided, where the path planning method is applied to a vehicle, and the vehicle includes a camera, and the method includes:
In the running process of the vehicle, continuously shooting the environment where the vehicle is located through the camera to obtain a first image to be processed and a second image to be processed, wherein the first image to be processed and the second image to be processed both comprise target objects;
acquiring an optical flow model obtained by the method according to the second aspect;
determining an estimated optical flow between the first image to be processed and the second image to be processed using the optical flow model;
estimating a first travel path of the target object based on the estimated optical flow;
a second travel path of the vehicle is planned based on the first travel path.
In a fourth aspect, there is provided an apparatus for generating a new view angle image, the apparatus comprising:
the device comprises an acquisition unit, a display unit and a display unit, wherein the acquisition unit is used for acquiring an original image and depth information of the original image, the original image is an image obtained by shooting a target scene by a camera at an original view angle, and the target scene comprises a background and a moving target;
the processing unit is used for processing the original image and the depth information by utilizing a neural network to obtain a plurality of original depth images of the MPI under the original view angle;
a generation unit configured to generate a first depth image and a second depth image based on the plurality of original depth images; the first depth image includes: in a case where the camera photographs the background at a new view angle, depth information of the background, the second depth image includes: the depth information of the moving target under the condition that the camera shoots the moving target at a new view angle;
And the determining unit is used for determining a target shielding relation based on the first depth image and the second depth image, wherein the target shielding relation is the shielding relation between the background and the moving target under the condition that the camera shoots the target scene at the new view angle.
In combination with any one of the embodiments of the present application, the processing unit is further configured to process the original image and the depth information by using a neural network, so as to obtain the MPI;
the generating unit is further configured to generate a new image under the new view angle based on the MPI, where the new image is an image obtained by the camera shooting the target scene under the new view angle;
the generating unit is further configured to correct, for the new image, an occlusion relationship between the background and the moving target by using the target occlusion relationship, so as to obtain a target image under the new view angle.
In combination with any one of the embodiments of the present application, the determining unit is configured to:
generating a background reference image obtained by shooting the target scene by the camera in the first pose and a moving target reference image obtained by shooting the moving target by the camera in the second pose based on the original image; the first pose is a pose of the camera shooting the background at the new view angle, and the second pose is a pose of the camera shooting the moving target at the new view angle;
Determining a first position range of the background in the background reference image and a second position range of the moving target in the moving target reference image;
determining a position of an intersection of the first position range and the second position range as a third position range;
and determining an occlusion relation between the background and the moving target in the third position range based on the first depth image and the second depth image, and taking the occlusion relation as the target occlusion relation.
In combination with any one of the embodiments of the present application, the determining unit is configured to:
determining a first reference depth of the background in the third position range according to the first depth image;
determining a second reference depth of the moving target in the third position range according to the second depth image;
determining the target shielding relation as that the moving target is shielded by the background under the condition that the first reference depth is smaller than the second reference depth;
and under the condition that the first reference depth is larger than or equal to the second reference depth, determining the target shielding relation as that the background is shielded by the moving target.
In combination with any embodiment of the present application, the pose of the original image obtained by shooting with the camera is the original pose;
The determining unit is used for:
acquiring an original mask, a first conversion relation and a second conversion relation of the moving target in the original image, wherein the first conversion relation is a conversion relation between the original pose and the first pose, and the second conversion relation is a conversion relation between the original pose and the second pose;
converting the original mask based on the first conversion relation to obtain a first reference mask of the moving target in the background reference image;
converting the original mask based on the second conversion relation to obtain a second reference mask of the moving target in the moving target reference image;
determining the first position range according to the first reference mask;
and determining the second position range according to the second reference mask.
In combination with any one of the embodiments of the present application, the generating unit is configured to:
converting the original depth images into background depth images under the new view angle based on the first conversion relation;
converting the plurality of original depth images into a plurality of moving target depth images under the new view angle based on the second conversion relation;
Performing volume rendering on the plurality of background depth images to obtain the first depth image;
and performing volume rendering on the multiple moving target depth images to obtain the second depth image.
In a fifth aspect, there is provided a training apparatus for an optical flow model, the apparatus comprising:
an acquisition unit, configured to acquire an original image and a target image obtained according to any implementation of the first aspect;
the acquisition unit is used for acquiring dense optical flow between the original image and the target image;
a processing unit for taking the dense optical flow as a label of the original image and the target image;
the training unit is used for training the model to be trained based on the original image, the target image and the label to obtain an optical flow model, and the optical flow model is used for estimating dense optical flow.
In a sixth aspect, a vehicle is provided, the vehicle comprising:
the camera is used for continuously shooting the environment where the vehicle is located in the running process of the vehicle to obtain a first image to be processed and a second image to be processed, wherein the first image to be processed and the second image to be processed both comprise target objects;
an acquisition unit configured to acquire an optical flow model obtained by the method according to the second aspect;
A determining unit configured to determine an estimated optical flow between the first image to be processed and the second image to be processed using the optical flow model;
the estimating unit is used for estimating a first driving path of the target object based on the estimated optical flow;
and the planning unit is used for planning a second driving path of the vehicle based on the first driving path.
In a seventh aspect, there is provided an electronic device comprising: a processor and a memory for storing computer program code, the computer program code comprising computer instructions; when the computer instructions are executed by the processor, the processor is caused to perform the first aspect and any implementation thereof, the second aspect and any implementation thereof, or the third aspect and any implementation thereof as described above.
In an eighth aspect, there is provided another electronic device comprising: a processor, a transmitting device, an input device, an output device, and a memory for storing computer program code, the computer program code comprising computer instructions; when the processor executes the computer instructions, the electronic device performs the first aspect and any implementation thereof, the second aspect and any implementation thereof, or the third aspect and any implementation thereof as described above.
In a ninth aspect, there is provided a computer-readable storage medium having a computer program stored therein, the computer program comprising program instructions; when the program instructions are executed by a processor, the processor is caused to perform the first aspect and any implementation thereof, the second aspect and any implementation thereof, or the third aspect and any implementation thereof as described above.
In a tenth aspect, there is provided a computer program product comprising a computer program or instructions; when the computer program or instructions run on a computer, the computer is caused to perform the first aspect and any implementation thereof, the second aspect and any implementation thereof, or the third aspect and any implementation thereof as described above.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
In the application, the original image is an image obtained by shooting the target scene by the camera at the original view angle, and the generating device can process the original image and the depth information by utilizing the neural network after acquiring the original image and the depth information of the original image to obtain a plurality of original depth images of the MPI under the original view angle.
Since the target scene includes the background and the moving object, in converting the photographing view angle of the target scene from the original view angle to the new view angle, there is not only movement of the camera but also movement of the moving object, and thus, a change in the background from the depth under the original view angle to the depth under the new view angle is caused by movement of the camera, and a change in the moving object from the depth under the original view angle to the depth under the new view angle is caused by movement of the camera and movement of the moving object. Accordingly, the generating means generates, based on the plurality of original depth images, a first depth image including depth information of the background in the case where the background is photographed at the new view angle, and a second depth image including depth information of the moving object in the case where the moving object is photographed at the new view angle. That is, the depth of the background when the camera photographs the target scene at a new view angle can be determined based on the first depth image, and the depth of the moving target when the camera photographs the target scene at a new view angle can be determined based on the second depth image. Therefore, the generating device can determine the shielding relation between the background and the moving target when the camera shoots the target scene at the new view angle based on the first depth image and the second depth image.
Drawings
In order to more clearly describe the technical solutions in the embodiments or the background of the present application, the following description will describe the drawings that are required to be used in the embodiments or the background of the present application.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.
Fig. 1 is a flowchart of a method for generating a new perspective image according to an embodiment of the present application;
FIG. 2 is a flowchart of a training method of an optical flow model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an architecture for generating an optical flow dataset based on an original image according to an embodiment of the present application;
FIG. 4a is a schematic diagram of an original image according to an embodiment of the present application;
FIG. 4b is a schematic diagram of a target image generated based on FIG. 4a according to an embodiment of the present application;
FIG. 4c is a schematic diagram of a dense optical flow between FIG. 4a and FIG. 4b according to an embodiment of the present application;
FIG. 5a is a schematic diagram of an original image according to an embodiment of the present application;
FIG. 5b is a target image generated based on FIG. 5a according to an embodiment of the present application;
FIG. 5c is another image of a target generated based on FIG. 5a provided in an embodiment of the present application;
FIG. 5d is a further image of a target generated based on FIG. 5a provided in an embodiment of the present application;
fig. 6 is a flow chart of a path planning method according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an apparatus for generating a new view angle image according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a training device for an optical flow model according to an embodiment of the present disclosure;
FIG. 9 is a schematic structural view of a vehicle according to an embodiment of the present disclosure;
fig. 10 is a schematic hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will clearly and completely describe the technical solution in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The terms first, second and the like in the description and in the claims of the present application and in the above-described figures, are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The execution body of the embodiments of the present application is an apparatus for generating a new view angle image (hereinafter referred to simply as the generating device), where the generating device may be any electronic device capable of executing the technical solutions disclosed in the method embodiments of the present application. Optionally, the generating device may be one of the following: a computer, a server.
It should be understood that the method embodiments of the present application may also be implemented by way of a processor executing computer program code. Embodiments of the present application are described below with reference to the accompanying drawings in the embodiments of the present application. Referring to fig. 1, fig. 1 is a flowchart of a method for generating a new perspective image according to an embodiment of the present application.
101. The original image and the depth information of the original image are acquired.
In the embodiment of the present application, the original image is an image obtained by shooting the target scene with the original view angle by the camera. The camera may be any device having an imaging function, for example, the camera may be a mobile phone having an imaging function, and the camera may be a video camera. The original view angle may be an arbitrary photographing view angle, and the target scene may be an arbitrary scene.
The target scene comprises a background and a moving target, wherein the background comprises an object in a static state in the target scene, and the moving target is an object in a moving state in the target scene. For example, the moving object is a motor vehicle in a moving state, for example, the moving object is a non-motor vehicle in a moving state, and for example, the moving object is a pedestrian in a moving state.
In one implementation of capturing the original image, the generating device has a communication connection with the camera, through which the generating device captures the original image captured by the camera.
In another implementation of capturing an original image, the generating device receives the original image input by a user through an input component, where the input component includes: keyboard, mouse, touch screen, touch pad, audio input device.
In still another implementation manner of acquiring an original image, the generating device receives the original image sent by the terminal, where the terminal includes: cell phone, computer, panel computer, server.
In the embodiment of the application, the depth information of each pixel in the original image can be determined from the depth information of the original image. Optionally, the depth information of the original image is a depth image of the original image. In one implementation of acquiring the depth information of the original image, the generating device estimates a depth image of the original image through a monocular depth estimation network and takes this depth image as the depth information of the original image. In another implementation, the generating device receives the depth information of the original image input by a user through the input component.
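As an illustration only: a publicly available monocular depth network such as MiDaS can play the role of the depth estimation network; the snippet below follows the documented intel-isl/MiDaS torch.hub entry points and is a sketch, not something prescribed by the patent (the file name is hypothetical).

```python
import cv2
import torch

# Illustrative sketch: estimate depth information for the original image with MiDaS.
midas = torch.hub.load("intel-isl/MiDaS", "DPT_Large")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").dpt_transform

img = cv2.cvtColor(cv2.imread("original.jpg"), cv2.COLOR_BGR2RGB)  # hypothetical file name
with torch.no_grad():
    prediction = midas(transform(img))
    # Resize the (relative, inverse-depth-like) prediction back to the image resolution.
    depth_info = torch.nn.functional.interpolate(
        prediction.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze()
```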
It should be understood that the step of acquiring the original image and the step of acquiring the depth information of the original image may be performed simultaneously or may be performed separately, which is not limited in this application.
102. And processing the original image and the depth information by using a neural network to obtain a plurality of original depth images of the MPI under the original view angle.
In this embodiment of the present application, the MPI comprises a plurality of images: within each image all pixels share the same depth, and different images have different depths. The original depth images are the depth images of the images in the MPI, and the original depth images correspond to the images in the MPI one by one. Because the images in the MPI are generated from the original image, the generating device processes the original image with the neural network to extract color values and volume densities at a plurality of depths, generates the MPI at the original view angle based on the color values and the volume densities at the plurality of depths, and generates the original depth images based on the depths of the images in the MPI. The color values may be values in a color channel.
Optionally, the generating device obtains the depth of the original image using monocular depth estimation and forms an RGBD image, wherein the RGBD image combines the original image with its depth image. The neural network extracts color values and volume densities at a plurality of depths based on the context information of the RGBD image and generates the MPI at the original view angle from them, so that the original depth images can be generated based on the depths of the images in the MPI.
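For concreteness, an MPI can be held as N fronto-parallel planes, each carrying color values and a volume density at one fixed depth; the sketch below is an illustrative data structure only (the plane count and inverse-depth spacing are common conventions, not taken from the patent).

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class MultiplaneImage:
    """N fronto-parallel planes of the MPI at the original view angle."""
    colors: np.ndarray     # (N, H, W, 3) color values per plane
    densities: np.ndarray  # (N, H, W)   volume density per plane
    depths: np.ndarray     # (N,)        one fixed depth per plane

def make_empty_mpi(h, w, n_planes=32, d_near=1.0, d_far=100.0):
    # Planes spaced uniformly in inverse depth, a common (but assumed) convention.
    depths = 1.0 / np.linspace(1.0 / d_near, 1.0 / d_far, n_planes)
    return MultiplaneImage(
        colors=np.zeros((n_planes, h, w, 3), dtype=np.float32),
        densities=np.zeros((n_planes, h, w), dtype=np.float32),
        depths=depths.astype(np.float32),
    )
```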
103. Based on the plurality of original depth images, a first depth image and a second depth image are generated.
In an implementation of the present application, the first depth image includes: in the case where the camera photographs the background at a new view angle, depth information of the background. The new view angle may be any photographing view angle different from the original view angle. The second depth image includes: in the case where the camera photographs a moving object at a new view angle, depth information of the moving object.
It should be understood that since there is not only movement of the camera but also movement of the moving object in the process of converting the photographing view angle of the object scene from the original view angle to the new view angle, a change in the background from the depth at the original view angle to the depth at the new view angle is caused by movement of the camera, and a change in the moving object from the depth at the original view angle to the depth at the new view angle is caused by movement of the camera and movement of the moving object. Accordingly, the depth information of the background in the first depth image is determined based on the motion of the camera and the depth information of the background at the original view angle, and the depth information of the moving object in the second depth image is determined based on the motion of the camera, the motion of the moving object, and the depth information of the background at the original view angle.
In one possible implementation, the generating means determines depth information of a background in the plurality of original depth images and depth information of a moving object in the plurality of original depth images based on the depth information of the plurality of original depth images. A first depth image is generated based on a first amount of motion, which is an amount of motion of the camera in a case where a photographing angle of view of the camera is converted from an original angle of view to a new angle of view, and depth information of a background in the plurality of original depth images. The generation means generates a second depth image based on the first motion amount, a second motion amount, and depth information of the moving object in the plurality of original depth images, wherein the second motion amount is a motion amount of the moving object in a case where a photographing angle of view of the camera is converted from the original angle of view to a new angle of view.
104. And determining a target shielding relation based on the first depth image and the second depth image.
In the embodiment of the present application, the target shielding relationship is a shielding relationship between a background and a moving target when the camera shoots a target scene with a new view angle. Since the first depth image includes: in the case where the camera photographs the background at a new view angle, the depth information of the background, the second depth image includes: in the case where the camera photographs the moving object at the new view angle, the generating device may determine the depth of the background and the depth of the moving object in the case where the camera photographs the target scene at the new view angle based on the first depth image and the second depth image, and may further determine the object shielding relationship based on the depth of the background and the depth of the moving object.
In one possible implementation, the generating device determines, based on the position of the background in the first depth image, the position range of the background within the shooting range when the camera shoots at the new view angle, and determines, based on the position of the moving target in the second depth image, the position range of the moving target within the shooting range when the camera shoots at the new view angle. The intersection of these two position ranges is determined as the overlapping position range of the background and the moving target. For the part of the shooting range outside the overlapping position range, the target occlusion relationship is determined as the moving target being occluded by the background, that is, the camera shoots the background in that part. For the overlapping position range, the generating device determines the depth of the background according to the first depth image, determines the depth of the moving target according to the second depth image, and then determines the occlusion relationship between the background and the moving target by comparing the two depths. Specifically, when the depth of the background is smaller than the depth of the moving target, the generating device determines the target occlusion relationship as the moving target being occluded by the background, that is, the camera shoots the background at that position; when the depth of the background is larger than or equal to the depth of the moving target, the generating device determines the target occlusion relationship as the background being occluded by the moving target, that is, the camera shoots the moving target at that position.
In this embodiment of the present application, the original image is an image obtained by photographing, by a camera, a target scene with an original view angle, and after obtaining the original image and depth information of the original image, the generating device may process the original image and the depth information by using a neural network, so as to obtain multiple original depth images of the MPI under the original view angle.
Since the target scene includes the background and the moving object, in converting the photographing view angle of the target scene from the original view angle to the new view angle, there is not only movement of the camera but also movement of the moving object, and thus, a change in the background from the depth under the original view angle to the depth under the new view angle is caused by movement of the camera, and a change in the moving object from the depth under the original view angle to the depth under the new view angle is caused by movement of the camera and movement of the moving object. Accordingly, the generating means generates, based on the plurality of original depth images, a first depth image including depth information of the background in the case where the background is photographed at the new view angle, and a second depth image including depth information of the moving object in the case where the moving object is photographed at the new view angle. That is, the depth of the background when the camera photographs the target scene at a new view angle can be determined based on the first depth image, and the depth of the moving target when the camera photographs the target scene at a new view angle can be determined based on the second depth image. Therefore, the generating device can determine the shielding relation between the background and the moving target when the camera shoots the target scene at the new view angle based on the first depth image and the second depth image.
As an alternative embodiment, the generating means further performs the steps of:
2001. and processing the original image and the depth information by using a neural network to obtain the MPI under the original view angle.
The implementation of this step may be specifically referred to step 102, and will not be described herein.
2002. Based on the MPI at the original viewing angle, a new image at the new viewing angle is generated.
In the embodiment of the present application, the new image is an image obtained by the camera shooting the target scene at the new view angle. In one possible implementation, the pose at which the camera captures the original image at the original view angle is taken as the original pose, and the pose at which the camera shoots the target scene at the new view angle is taken as the first pose. The generating device determines a first conversion relation between the original pose and the first pose based on the original pose and the first pose.
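As an illustration, if the original pose and the first pose are both represented as 4x4 camera-to-world matrices (a representation the text does not mandate), the first conversion relation can be computed as a relative rigid transform:

```python
import numpy as np

def conversion_relation(T_original, T_new):
    """Relative transform taking original-camera coordinates to new-camera coordinates.

    T_original, T_new: 4x4 camera-to-world pose matrices (assumed convention).
    Returns R (3x3) and t (3,) such that X_new = R @ X_original + t.
    """
    T_rel = np.linalg.inv(T_new) @ T_original
    return T_rel[:3, :3], T_rel[:3, 3]
```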
And converting the MPI under the original view angle based on the first conversion relation to obtain the MPI under the new view angle. Specifically, the generating device uses the first conversion relation to convert each image in the MPI under the original view angle respectively, so as to obtain the MPI under the new view angle. In one possible implementation manner, the generating device converts the MPI under the original view angle based on the first conversion relation to obtain a plurality of images to be interpolated, and performs bilinear interpolation on the plurality of images to be interpolated to obtain the MPI under the new view angle.
Optionally, let (x_s, y_s) denote a pixel in the MPI at the original view angle and (x_t, y_t) denote the corresponding pixel in the MPI at the new view angle. Then (x_s, y_s) and (x_t, y_t) satisfy the plane-induced homography:

[x_s, y_s, 1]^T ∝ K (R - t n^T / d_n) K^{-1} [x_t, y_t, 1]^T

wherein R is the rotation matrix in the first conversion relation, t is the translation in the first conversion relation, K is the intrinsic matrix of the camera, n = [0, 0, 1]^T is the normal vector of the planes, and d_n is the depth of the n-th image in the MPI at the original view angle.
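A minimal sketch of warping a single MPI plane with the homography above (the helper name and the direction of R, t are assumptions; the text leaves the pose convention implicit):

```python
import numpy as np
import cv2

def warp_plane_to_new_view(plane, K, R, t, d_n):
    """Warp one MPI plane (H, W, C) from the original view angle to the new view angle.

    Builds the plane-induced homography as written above, i.e. a matrix that maps a
    new-view pixel (x_t, y_t) to the original-view pixel (x_s, y_s) it samples. The sign
    of the t n^T / d_n term and the direction of R, t depend on the pose convention,
    which the description leaves implicit; treat this as an illustrative sketch.
    """
    n = np.array([[0.0, 0.0, 1.0]])                       # plane normal, n = [0, 0, 1]^T
    H_mat = K @ (R - t.reshape(3, 1) @ n / d_n) @ np.linalg.inv(K)
    h, w = plane.shape[:2]
    # H_mat is a destination-to-source mapping, so pass WARP_INVERSE_MAP; INTER_LINEAR
    # gives the bilinear interpolation mentioned in the description.
    return cv2.warpPerspective(plane, H_mat, (w, h),
                               flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
```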
After obtaining the MPI under the new view angle, the generating device obtains a new image by performing volume rendering on the MPI under the new view angle. Optionally, the generating device performs volume rendering on the MPI under the new view angle, and generates the new image by the following expression:
I_t = Σ_{n=1}^{N} c'_n (1 - α'_n) Π_{m=n+1}^{N} α'_m

wherein I_t is the new image, N is the number of images in the MPI at the original view angle, c'_n is the color value of the n-th image in the MPI at the new view angle, α'_n = exp(-δ_n σ'_n), δ_n is the distance between the n-th image and the (n+1)-th image in the MPI at the original view angle, σ'_n is the volume density of the n-th image in the MPI at the new view angle, α'_m = exp(-δ_m σ'_m), δ_m is the distance between the m-th image and the (m+1)-th image in the MPI at the original view angle, and σ'_m is the volume density of the m-th image in the MPI at the new view angle.
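A numpy sketch of this compositing step (the back-to-front plane ordering is an assumption; the expression above only fixes the per-plane terms):

```python
import numpy as np

def composite_mpi(colors, densities, deltas):
    """Composite the warped MPI planes at the new view angle into a single image.

    colors:    (N, H, W, 3) color values c'_n
    densities: (N, H, W)    volume densities sigma'_n
    deltas:    (N,)         inter-plane distances delta_n
    Planes are assumed ordered from back (index 0, farthest) to front (index N-1).
    """
    alpha = np.exp(-deltas[:, None, None] * densities)           # alpha'_n = exp(-delta_n * sigma'_n)
    # Transmittance of all planes in front of plane n (those with larger index).
    rev = alpha[::-1]
    trans_front = np.cumprod(
        np.concatenate([np.ones_like(rev[:1]), rev[:-1]], axis=0), axis=0)[::-1]
    weights = (trans_front * (1.0 - alpha))[..., None]           # (N, H, W, 1)
    return (weights * colors).sum(axis=0)                        # the new image I_t
```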
When the color values and the volume densities of the pixels in the MPI at the new view angle are obtained from the real color values and real volume densities in the MPI at the original view angle, performing volume rendering on the MPI at the new view angle weights and fuses the color values and volume densities of the different images in the MPI to generate the new image. This improves the realism of the new image and makes it possible to generate a realistic image at the new view angle from a single real image.
2003. And correcting the shielding relation between the background and the moving target by using the target shielding relation aiming at the new image to obtain the target image under the new view angle.
Since a new image is generated based on the MPI at the original viewing angle, only the movement of the camera in the case where the photographing viewing angle of the camera is converted from the original viewing angle to the new viewing angle is considered, and the movement of the moving object is not considered, there is an error in the occlusion relationship of the moving object and the background in the new image. And the target occlusion relationship is determined based on the first depth image and the second depth image, so that the target occlusion relationship considers not only the movement of the camera but also the movement of the moving target. Thus, the occlusion relationship of the background in the new image and the moving target can be corrected by using the target occlusion relationship. Specifically, by correcting the new image by using the target shielding relationship, a target image in which the camera shoots the target scene with a new view angle can be obtained, wherein in the target image, the shielding relationship between the moving target and the background is the target shielding relationship.
In this embodiment, the generating device processes the original image and the depth information using a neural network to obtain the MPI at the original viewing angle. Based on the multi-plane image, a new image under a new view angle is generated, the new image is corrected by utilizing the target shielding relation, a target image of a camera shooting a target scene with the new view angle can be obtained, and the quality of the target image can be improved.
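One plausible form of the correction step, sketched under the assumption that the moving-target reference image and the occlusion masks are pixel-aligned with the new image at the new view angle (the patent does not spell out the compositing details):

```python
import numpy as np

def correct_occlusion(new_image, moving_target_ref, target_mask, occluded_by_background):
    """Apply the target occlusion relationship to the new image rendered from the MPI.

    new_image:              (H, W, 3) image rendered from the MPI (camera motion only)
    moving_target_ref:      (H, W, 3) moving-target reference image at the new view angle
    target_mask:            (H, W) bool, pixels covered by the moving target at the new view angle
    occluded_by_background: (H, W) bool, pixels where the background is in front of the
                            moving target according to the target occlusion relationship
    """
    target_image = new_image.copy()
    visible = target_mask & ~occluded_by_background      # moving target in front of the background
    target_image[visible] = moving_target_ref[visible]   # show the moving target there
    return target_image                                  # target image at the new view angle
```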
As an alternative embodiment, the generating means performs the following steps in performing step 104:
3001. and generating, based on the original image, a background reference image obtained by the camera photographing the target scene in the first pose and a moving target reference image obtained by the camera photographing the moving target in the second pose.
In this embodiment of the present application, the first pose is a pose in which the camera shoots the background at the new view angle, and it should be understood that, when the shooting view angle of the camera is converted from the original view angle to the new view angle, the camera moves, and the background does not move, so in step 2002, the first pose is a pose in which the camera shoots the target scene at the new view angle, and represents a pose in which the camera shoots the target scene at the new view angle without considering the movement of the moving target, which is the same as a pose in which the camera shoots the background at the new view angle. The second pose is a pose in which the camera shoots the moving target with a new view angle.
The pose of the camera when shooting the background at the original view angle and the pose of the camera when shooting the moving target at the original view angle are both original poses. The conversion of the shooting view angle of the background from the original view angle to the new view angle is caused by the movement of the camera, so that the change from the original pose to the first pose can represent the movement amount of the camera in the process of converting the shooting view angle of the background from the original view angle to the new view angle.
In addition, since the moving object is in a moving state, there is also a movement of the moving object in a process of converting a photographing view angle of the background from an original view angle to a new view angle, and therefore, the photographing view angle of the moving object is converted from the original view angle to the new view angle, not only the movement of the camera but also the movement of the moving object is involved. The change from the original pose to the second pose can be used for representing the movement quantity of the camera and the movement quantity of the moving target in the process of converting the shooting view angle of the moving target from the original view angle to a new view angle.
In the embodiment of the application, the background reference image is an image obtained by converting a shooting visual angle from an original visual angle to a new visual angle based on the movement of the camera and shooting the background through the camera. The moving target reference image is an image obtained by converting a shooting visual angle from an original visual angle to a new visual angle based on the movement of a camera and the movement of a moving target and shooting the moving target through the camera.
In one possible implementation, the generating means acquires an image generating neural network. And inputting the original image, the original pose and the first pose into an image generation neural network to generate a background reference image. And inputting the original image, the original pose and the second pose into an image generation neural network to generate a moving target reference image.
In another possible implementation, the generating means generates the background reference image based on the MPI technique using the original image, the original pose and the first pose. Based on MPI technology, the original image, the original pose and the second pose are utilized to generate a moving target reference image.
3002. A first range of positions of the background in the background reference image and a second range of positions of the moving object in the moving object reference image are determined.
Since the background reference image is an image obtained by photographing the background with a new view angle by the camera, the first position range is a position range of the background within the photographing range when the camera photographs the target scene with the new view angle. Since the moving target reference image is an image obtained by photographing the moving target with the new view angle by the camera, the second position range is a position range of the moving target within the photographing range when the camera photographs the target scene with the new view angle.
3003. And determining the intersection of the first position range and the second position range as a third position range.
In this embodiment of the present application, the third position range is an overlapping position range of the background and the moving target in the shooting range when the camera shoots the target scene with the new view angle.
In one possible implementation, the generating device acquires an original mask, a first conversion relationship, and a second conversion relationship of the moving object in the original image. And converting the original mask based on the first conversion relation to obtain a first reference mask of the moving target in the background reference image. And converting the original mask based on a second conversion relation to obtain a second reference mask of the moving target in the moving target reference image. A first range of locations is determined based on the first reference mask. A second range of positions is determined based on the second reference mask.
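A sketch of the mask conversion, assuming each conversion relation can be expressed as a 3x3 pixel homography at the moving target's depth (the helper and variable names are illustrative, not from the patent):

```python
import numpy as np
import cv2

def warp_mask(mask, H_pix, out_hw):
    """Warp a binary mask of the moving target with a 3x3 pixel homography.

    H_pix maps original-image pixels to reference-image pixels; nearest-neighbour
    sampling keeps the mask binary.
    """
    h, w = out_hw
    warped = cv2.warpPerspective(mask.astype(np.uint8), H_pix, (w, h),
                                 flags=cv2.INTER_NEAREST)
    return warped.astype(bool)

# Illustrative usage (H_first / H_second stand for the first and second conversion
# relations expressed as pixel homographies):
#   first_reference_mask  = warp_mask(original_mask, H_first,  (H, W))  # first position range
#   second_reference_mask = warp_mask(original_mask, H_second, (H, W))  # second position range
#   third_position_range  = first_reference_mask & second_reference_mask
```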
3004. And determining an occlusion relationship between the background and the moving target in the third position range as the target occlusion relationship based on the first depth image and the second depth image.
When the camera shoots the moving object with a new view angle, the overlapping position of the background and the moving object is a third position range, and the background and the moving object may have shielding in the third position range, so that the shielding relation between the background and the moving object in the third position range needs to be determined.
Since the first depth image includes: in the case where the camera photographs the background at a new view angle, the depth information of the background, the second depth image includes: in the case where the camera photographs a moving object at a new view angle, depth information of the moving object. Therefore, the generating device may determine, based on the first depth image and the second depth image, the depth of the background in the third position range and the depth of the moving object in the third position range in the case where the camera photographs the target scene at the new angle of view, and may further determine, as the target occlusion relationship, the occlusion relationship of the background and the moving object in the third position range based on the depth of the background in the third position range and the depth of the moving object in the third position range.
In one possible implementation, the generating device determines a first reference depth of the background in the third position range from the first depth image, and determines a second reference depth of the moving target in the third position range from the second depth image. When the first reference depth is smaller than the second reference depth, the target occlusion relationship is determined as the moving target being occluded by the background. When the first reference depth is larger than or equal to the second reference depth, the target occlusion relationship is determined as the background being occluded by the moving target.
Optionally, after acquiring the first reference mask, the depth image of the target image, the second reference mask, and the depth image of the moving target reference image, the generating device may determine the occluded region based on the following equation:

M_occ = M_t ⊙ \hat{M}_t ⊙ [D_t < \hat{D}_t]

wherein ⊙ denotes element-wise multiplication, [·] denotes the per-pixel indicator function, M_occ represents the region of the target image in which the moving target is occluded by the background, M_t represents the first reference mask, \hat{M}_t represents the second reference mask, D_t represents the depth image of the target image, and \hat{D}_t represents the depth image of the moving target reference image.
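The same relation written as array operations (a sketch; the mask and depth arrays are assumed to be pixel-aligned at the new view angle):

```python
import numpy as np

def occlusion_region(first_reference_mask, second_reference_mask, first_depth, second_depth):
    """Region where the moving target is occluded by the background (M_occ).

    first_reference_mask:  (H, W) bool mask of the moving target in the background reference image
    second_reference_mask: (H, W) bool mask of the moving target in the moving target reference image
    first_depth:           (H, W) first reference depth (background at the new view angle)
    second_depth:          (H, W) second reference depth (moving target at the new view angle)
    """
    third_position_range = first_reference_mask & second_reference_mask
    # The background occludes the moving target where its depth is smaller.
    return third_position_range & (first_depth < second_depth)
```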
In this embodiment, the first pose is a pose in which the camera photographs a background at a new angle of view, and the second pose is a pose in which the camera photographs a moving object at a new angle of view. Since the conversion of the shooting view angle of the background from the original view angle to the new view angle is caused by the movement of the camera, the movement amount of the camera can be characterized in the process of converting the shooting view angle of the background from the original view angle to the new view angle from the original pose to the first pose. The shooting view angle of the moving object is converted from the original view angle to the new view angle and is caused by the movement of the camera and the movement of the moving object, so that the movement quantity of the camera and the movement quantity of the moving object can be represented in the process of converting the shooting view angle of the background from the original view angle to the new view angle from the original pose to the second pose.
The generating device generates, based on the original image, a background reference image obtained by the camera photographing the target scene in the first pose and a moving target reference image obtained by the camera photographing the moving target in the second pose. In this way, the background in the background reference image is the background at the new view angle determined from the camera motion, and the moving target in the moving target reference image is the moving target at the new view angle determined from the camera motion and the moving target motion.
Therefore, the generating device determines a first position range of the background in the background reference image, namely, a position range of the background in the shooting range when the camera shoots the target scene at a new view angle. The generating device determines a second position range of the moving target in the moving target reference image, namely, the position range of the moving target in the shooting range when the camera shoots the target scene at a new visual angle. And then determining the position of the intersection of the first position range and the second position range as a third position range, wherein the third position range is the overlapping position range of the background and the moving target in the shooting range when the camera shoots the target scene at a new view angle.
Since the overlapping position of the background and the moving object is the third position range when the camera shoots the moving object with the new view angle, there may be shielding between the background and the moving object in the third position range. And based on the first depth image, a depth of the background in the third position range in a case where the camera photographs the target scene at the new angle of view may be determined, and based on the second depth image, a depth of the moving target in the third position range in a case where the camera photographs the target scene at the new angle of view may be determined. Thus, the generating means may determine, as the target occlusion relationship, an occlusion relationship of the background and the moving target in the third position range based on the first depth image and the second depth image.
As an alternative embodiment, the generating means performs the following steps in performing step 103:
4001. and converting the plurality of original depth images into a plurality of background depth images under the new view angle based on the first conversion relation.
The generating device respectively converts each image in the plurality of original depth images under the original view angle into a new view angle through a first conversion relation, so that a plurality of background depth images can be obtained, wherein the plurality of background depth images are the depth images of the MPI of the background reference image under the new view angle.
4002. And converting the plurality of original depth images into a plurality of moving target depth images under the new view angle based on the second conversion relation.
The generating device respectively converts each image in the plurality of original depth images under the original view angle into a new view angle through a second conversion relation, so that a plurality of moving target depth images can be obtained, wherein the plurality of moving target depth images are the depth images of the MPI of the moving target reference image under the new view angle.
4003. And performing volume rendering on the plurality of background depth images to obtain the first depth image.
Because the plurality of background depth images are the depth images of the MPI of the background reference image at the new view angle, performing volume rendering on them carries out a weighted fusion of the background depth images and yields the depth image of the background reference image (namely the first depth image).
4004. And performing volume rendering on the multiple moving target depth images to obtain the second depth image.
Because the plurality of moving target depth images are the depth images of the MPI of the moving target reference image at the new view angle, performing volume rendering on them carries out a weighted fusion of the moving target depth images and yields the depth image of the moving target reference image (namely the second depth image).
In this embodiment, the generating device converts the plurality of original depth images into a plurality of background depth images under the new view angle based on the first conversion relation, and may obtain the depth image of the MPI of the background reference image under the new view angle. Based on the second conversion relation, converting the plurality of original depth images into a plurality of moving target depth images under the new view angle, and obtaining the depth image of the MPI of the moving target reference image under the new view angle. And then performing volume rendering on the plurality of background depth images to obtain a first depth image of the background reference image, and performing volume rendering on the plurality of moving target depth images to obtain a second depth image of the moving target reference image.
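A compact sketch of steps 4003 and 4004, assuming each original depth image and its alpha map have already been warped to the new view angle with the corresponding conversion relation (steps 4001 and 4002, abstracted away here); the front-to-back over-compositing below is one standard form of MPI volume rendering and all names are illustrative.

```python
import numpy as np

def render_depth(warped_depth_layers, warped_alpha_layers):
    """Volume-render per-layer depth maps of an MPI into a single depth image.

    warped_depth_layers : list of (H, W) depth maps, ordered front to back
    warped_alpha_layers : list of (H, W) alpha maps in [0, 1], same order
    """
    h, w = warped_depth_layers[0].shape
    depth = np.zeros((h, w), dtype=np.float32)
    transmittance = np.ones((h, w), dtype=np.float32)  # light not yet absorbed
    for d, a in zip(warped_depth_layers, warped_alpha_layers):
        weight = a * transmittance   # contribution of this layer
        depth += weight * d          # weighted fusion of layer depths
        transmittance *= (1.0 - a)   # attenuate for layers behind
    return depth
```

Calling render_depth once with the layers warped by the first conversion relation and once with the layers warped by the second conversion relation would yield the first depth image and the second depth image, respectively.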
With the method for generating a new view angle image, the target image can be generated based on the original image. Once the dense optical flow between the original image and the target image is obtained, it can be used as the optical flow label of the original image and the target image, so that an optical flow data set can be constructed and used to train a model, giving the model the capability of estimating dense optical flow. Based on the above, the embodiment of the application also provides a training method of an optical flow model.
An execution subject of the technical solution disclosed in the embodiment of the training method of an optical flow model is a training device (hereinafter simply referred to as a training device) of the optical flow model. The training device may be any electronic device capable of executing the technical scheme disclosed in the embodiment of the training method of the optical flow model. Alternatively, the training device may be one of the following: computer, server.
It should be appreciated that embodiments of the training method of the optical flow model may also be implemented by means of a processor executing computer program code. Embodiments of the present application are described below with reference to the accompanying drawings in the embodiments of the present application. Referring to fig. 2, fig. 2 is a flowchart of a training method of an optical flow model according to an embodiment of the present application.
201. The original image and the target image in the method for generating the new view angle image are acquired.
The original image in this step is the original image in step 101, and the target image in this step is the target image generated by steps 2001 to 2003.
202. A dense optical flow between the original image and the target image is acquired.
In one possible implementation, the generating means generates the MPI at the original viewing angle based on the original image. The MPI at the original view angle is converted into the MPI at the new view angle using the first conversion relation. Calculating the optical flow between the MPI at the original view angle and the MPI at the new view angle results in a plurality of reference optical flows. For example, the MPI under the original view angle includes an image a and an image b, and the MPI under the new view angle includes an image c and an image d, wherein the image c is an image obtained by converting the image a with the first conversion relationship, and the image d is an image obtained by converting the image b with the first conversion relationship. At this time, the plurality of reference optical flows includes: dense optical flow between image a and image c, dense optical flow between image b and image d. By volume rendering the plurality of reference optical flows, a dense optical flow between the original image and the target image is obtained.
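As an illustrative sketch of this implementation, the plurality of reference optical flows (one per MPI plane) can be fused into a single dense optical flow with the same front-to-back over-compositing used for color and depth; all variable names are assumptions, not names used by the embodiment.

```python
import numpy as np

def render_flow(layer_flows, layer_alphas):
    """Fuse per-layer reference optical flows of an MPI into one dense flow field.

    layer_flows  : list of (H, W, 2) flow maps, one per MPI plane, front to back
    layer_alphas : list of (H, W) alpha maps in [0, 1], same order
    """
    h, w, _ = layer_flows[0].shape
    flow = np.zeros((h, w, 2), dtype=np.float32)
    transmittance = np.ones((h, w), dtype=np.float32)
    for f, a in zip(layer_flows, layer_alphas):
        weight = (a * transmittance)[..., None]
        flow += weight * f
        transmittance *= (1.0 - a)
    return flow  # dense optical flow between the original image and the target image
```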
203. And taking the dense optical flow as a label of the original image and the target image.
In the embodiment of the application, the dense optical flow label characterizes the dense optical flow between the original image and the target image, and may be used as the ground truth (GT) of the dense optical flow between the original image and the target image.
204. Training the model to be trained based on the original image, the target image and the label to obtain an optical flow model.
The model to be trained may be any model whose structure enables it to estimate dense optical flow, and the optical flow model obtained by training it is used to estimate dense optical flow. In one possible implementation, the model to be trained processes the original image and the target image to estimate the predicted optical flow between them. The loss of the model to be trained is determined based on the difference between the predicted optical flow and the label, where the difference is positively correlated with the loss. A back-propagated gradient of the model to be trained is then determined based on the loss, and the parameters of the model to be trained are updated based on the gradient until the loss converges, yielding the optical flow model.
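A minimal training-step sketch, assuming a PyTorch-style model that takes the two images and predicts a dense flow of shape (N, 2, H, W); the end-point-error loss shown here is one common choice consistent with the requirement that the loss be positively correlated with the difference between the predicted optical flow and the label, not a loss prescribed by this embodiment.

```python
import torch

def train_step(model, optimizer, original, target, flow_label):
    """One update of the model to be trained on an (original, target, label) triple."""
    optimizer.zero_grad()
    pred_flow = model(original, target)  # predicted dense optical flow, (N, 2, H, W)
    # End-point error: mean L2 distance between predicted and labeled flow vectors.
    loss = torch.norm(pred_flow - flow_label, dim=1).mean()
    loss.backward()    # back-propagated gradient
    optimizer.step()   # update the parameters of the model to be trained
    return loss.item()
```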
In the embodiment of the application, the original image is a real image, and the target image generated based on the original image therefore has high realism and quality. The training device trains the model to be trained with the original image, the target image and the dense optical flow between them, which improves the accuracy with which the resulting optical flow model estimates dense optical flow on real images. Moreover, with the method for generating a new view angle image, an optical flow data set covering diverse real scenes can be constructed at low cost; therefore, generating the target image with this method, constructing the optical flow data set for training from the original image and the target image, and training the model to be trained on this data set improves the training effect at a relatively low cost.
Referring to fig. 3, fig. 3 is a schematic diagram of an architecture for generating an optical flow dataset based on an original image according to an embodiment of the present application. As shown in fig. 3, the architecture includes the following two parts: 1. generating optical flow based on MPI (Optical Flow from MPI); 2. a depth-aware filling module (Depth-Aware Padding). After the original image (i.e., I_s in fig. 3) is obtained, a depth image of the original image (i.e., D_s in fig. 3) is obtained by performing depth estimation on the original image, and the MPI at the original view angle is output by inputting the original image to the neural network.
Based on the color values, volume densities and depth images provided by the MPI at the original view angle, together with the first pose, the second pose and volume rendering, the first depth image, the second depth image and a new image at the new view angle are generated. The depth-aware filling module determines the target occlusion relationship (i.e., M_occ in fig. 3) based on the first depth image and the second depth image, and uses the target occlusion relationship to correct the occlusion relationship between the background and the moving target in the new image, obtaining the target image (i.e., I_t in fig. 3).
In addition, the dense optical flow between the original image and the target image (i.e., F_s→t in fig. 3) can be generated based on the color values, volume densities and depth images provided by the MPI at the original view angle, together with the first pose, the second pose and volume rendering; this process is implemented as described in step 202. Finally, the original image, the target image and the dense optical flow between them are output as the optical flow dataset.
Referring to fig. 4a and fig. 4b, fig. 4a is a schematic diagram of an original image provided in an embodiment of the present application, and fig. 4b is a schematic diagram of a target image generated based on fig. 4a provided in an embodiment of the present application, where the target image shown in fig. 4b is obtained by a method of the present application. As shown in fig. 4b, it can be seen from the enlargement of the car that the quality and the reality of the target image are high, which proves that the quality and the reality of the target image generated based on the method of the present application are high. As can be seen by comparing fig. 4a and fig. 4b, the automobile not only moves relatively to the background, but also fig. 4b shows the shielding relationship between the automobile and the traffic sign correctly, which also illustrates that the target image generated based on the method of the present application can accurately determine the target shielding relationship between the moving target and the background under the new view angle in the case that the moving target exists in the target scene.
Referring to fig. 4c, fig. 4c is a schematic diagram of a dense optical flow between fig. 4a and fig. 4b according to an embodiment of the present application. The dense optical flow shown in fig. 4c is obtained by step 202. As shown in fig. 4c, the high degree of matching of the dense optical flow to the target image also demonstrates that with the method of the present application, an optical flow dataset can be generated with only one real image.
To further illustrate the high quality of optical flow datasets generated using the methods of the present application, embodiments of the present application also provide some comparative schematic diagrams of the methods of the present application with other methods.
Referring to fig. 5a, 5b, 5c and 5d, fig. 5a is a schematic diagram of an original image provided by an embodiment of the present application, and fig. 5b, 5c and 5d are all target images generated based on fig. 5a, where fig. 5b is generated by the method of learning optical flow from still images (Learning optical flow from still images), fig. 5c is generated by the RealFlow method (RealFlow: EM-based realistic optical flow dataset generation from videos), and fig. 5d is generated by the method of the present application. As can be seen by comparing the partial enlarged views in fig. 5b, 5c and 5d, the target image generated based on the method of the present application has higher detail quality than the target images generated by the other two methods.
The embodiment of the application also provides a path planning method. The execution subject of the technical solution disclosed in the embodiment of the path planning method is a vehicle, the vehicle includes a camera, and the vehicle may be a motor vehicle or a non-motor vehicle.
It will be appreciated that embodiments of the path planning method may also be implemented by means of a processor executing computer program code. Embodiments of the present application are described below with reference to the accompanying drawings in the embodiments of the present application. Referring to fig. 6, fig. 6 is a flow chart of a path planning method according to an embodiment of the present application.
601. And in the running process of the vehicle, continuously shooting the environment where the vehicle is located through the camera to obtain a first image to be processed and a second image to be processed.
In this embodiment of the application, the shooting time of the first to-be-processed image and the shooting time of the second to-be-processed image are different, and both images include a target object, where the target object is an object in the environment where the vehicle is located; for example, the target object may be a person in the environment where the vehicle is located, or another vehicle in that environment.
602. And acquiring an optical flow model obtained according to the training method of the optical flow model.
603. And determining an estimated optical flow between the first to-be-processed image and the second to-be-processed image by using the optical flow model.
The vehicle can estimate dense optical flow between the first to-be-processed image and the second to-be-processed image by utilizing the optical flow model, so as to obtain estimated optical flow.
604. Based on the estimated optical flow, a first travel path of the target object is estimated.
In one possible implementation, the vehicle determines motion information of the target object based on the estimated optical flow, where the motion information includes the speed of movement and the direction of movement. The vehicle can then estimate the future travel path of the target object, namely the first travel path, based on this motion information.
605. And planning a second travel path of the vehicle based on the first travel path.
The vehicle plans its future travel path based on the first travel path so that it does not collide with the target object; the second travel path is this planned future travel path of the vehicle.
In the embodiment of the application, the first to-be-processed image and the second to-be-processed image are obtained by shooting with the camera while the vehicle is travelling. After the optical flow model is acquired, the estimated optical flow between the first to-be-processed image and the second to-be-processed image is determined with the optical flow model, which improves the accuracy of the estimated optical flow. Based on the estimated optical flow, the first travel path of the target object in the two to-be-processed images is estimated, which improves the accuracy of the first travel path. Finally, the second travel path of the vehicle is planned based on the first travel path, which improves the accuracy of the second travel path.
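The sketch below illustrates steps 603 to 605 under simplifying assumptions: the target object's image motion is summarized by the mean estimated flow inside a hypothetical object mask, converted to a metric velocity by an assumed pixel-to-metre scale, and the vehicle's path is adjusted only by keeping a lateral safety margin from the extrapolated object path. None of these choices are mandated by the embodiment; they only make the data flow of steps 603 to 605 concrete.

```python
import numpy as np

def estimate_first_travel_path(flow, object_mask, dt, metres_per_pixel,
                               horizon_s=3.0, step_s=0.5):
    """Extrapolate the target object's future positions from the estimated dense flow.

    flow        : (H, W, 2) estimated optical flow between the two to-be-processed images
    object_mask : (H, W) bool mask of the target object
    dt          : time between the two shots, in seconds
    """
    # Movement speed and direction: mean flow of the object, converted to metres/second.
    velocity = flow[object_mask].mean(axis=0) * metres_per_pixel / dt
    # Current position: centroid of the object mask, converted to metres (x, y).
    position = np.argwhere(object_mask).mean(axis=0)[::-1] * metres_per_pixel
    steps = np.arange(step_s, horizon_s + 1e-6, step_s)
    return [position + velocity * t for t in steps]  # first travel path

def plan_second_travel_path(ego_path, first_travel_path, safety_margin=2.0):
    """Shift the vehicle's intended path sideways wherever it gets too close to the object."""
    planned = []
    for p_ego, p_obj in zip(ego_path, first_travel_path):
        offset = p_ego - p_obj
        dist = np.linalg.norm(offset)
        if dist < safety_margin:  # would pass too close to the target object
            p_ego = p_obj + offset / (dist + 1e-6) * safety_margin
        planned.append(p_ego)
    return planned  # second travel path
```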
It will be appreciated by those skilled in the art that in the above-described method of the specific embodiments, the written order of steps is not meant to imply a strict order of execution but rather should be construed according to the function and possibly inherent logic of the steps.
The foregoing details the method of embodiments of the present application, and the apparatus of embodiments of the present application is provided below.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an apparatus for generating a new view angle image according to an embodiment of the present application, where the apparatus 1 for generating a new view angle image includes: acquisition unit 11, processing unit 12, generation unit 13, determination unit 14, in particular:
an obtaining unit 11, configured to obtain an original image and depth information of the original image, where the original image is an image obtained by capturing, by a camera, a target scene with an original view angle, where the target scene includes a background and a moving target;
a processing unit 12, configured to process the original image and the depth information by using a neural network, so as to obtain a plurality of original depth images of the MPI under the original viewing angle;
a generation unit 13 for generating a first depth image and a second depth image based on the plurality of original depth images; the first depth image includes: in a case where the camera photographs the background at a new view angle, depth information of the background, the second depth image includes: the depth information of the moving target under the condition that the camera shoots the moving target at a new view angle;
A determining unit 14, configured to determine a target occlusion relationship based on the first depth image and the second depth image, where the target occlusion relationship is an occlusion relationship between the background and the moving target when the camera photographs the target scene at the new view angle.
In combination with any one of the embodiments of the present application, the processing unit 12 is further configured to process the original image and the depth information by using a neural network, so as to obtain the MPI;
the generating unit 13 is further configured to generate a new image under the new view angle based on the MPI, where the new image is an image obtained by the camera capturing the target scene under the new view angle;
the generating unit 13 is further configured to correct, for the new image, an occlusion relationship between the background and the moving target by using the target occlusion relationship, so as to obtain a target image under the new view angle.
In combination with any one of the embodiments of the present application, the determining unit 14 is configured to:
generating a background reference image obtained by shooting the target scene by the camera in the first pose and a moving target reference image obtained by shooting the moving target by the camera in the second pose based on the original image; the first pose is a pose of the camera shooting the background at the new view angle, and the second pose is a pose of the camera shooting the moving target at the new view angle;
Determining a first position range of the background in the background reference image and a second position range of the moving target in the moving target reference image;
determining a position of an intersection of the first position range and the second position range as a third position range;
and determining an occlusion relation between the background and the moving target in the third position range based on the first depth image and the second depth image, and taking the occlusion relation as the target occlusion relation.
In combination with any one of the embodiments of the present application, the determining unit 14 is configured to:
determining a first reference depth of the background in the third position range according to the first depth image;
determining a second reference depth of the moving target in the third position range according to the second depth image;
determining the target shielding relation as that the moving target is shielded by the background under the condition that the first reference depth is smaller than the second reference depth;
and under the condition that the first reference depth is larger than or equal to the second reference depth, determining the target shielding relation as that the background is shielded by the moving target.
In combination with any embodiment of the present application, the pose of the original image obtained by shooting with the camera is the original pose;
The determining unit 14 is configured to:
acquiring an original mask, a first conversion relation and a second conversion relation of the moving target in the original image, wherein the first conversion relation is a conversion relation between the original pose and the first pose, and the second conversion relation is a conversion relation between the original pose and the second pose;
converting the original mask based on the first conversion relation to obtain a first reference mask of the moving target in the background reference image;
converting the original mask based on the second conversion relation to obtain a second reference mask of the moving target in the moving target reference image;
determining the first position range according to the first reference mask;
and determining the second position range according to the second reference mask.
In combination with any embodiment of the present application, the generating unit 13 is configured to:
converting the original depth images into background depth images under the new view angle based on the first conversion relation;
converting the plurality of original depth images into a plurality of moving target depth images under the new view angle based on the second conversion relation;
Performing volume rendering on the plurality of background depth images to obtain the first depth image;
and performing volume rendering on the multiple moving target depth images to obtain the second depth image.
In this embodiment of the present application, the original image is an image obtained by photographing, by a camera, a target scene with an original view angle, and after obtaining the original image and depth information of the original image, the generating device may process the original image and the depth information by using a neural network, so as to obtain multiple original depth images of the MPI under the original view angle.
Since the target scene includes the background and the moving object, in converting the photographing view angle of the target scene from the original view angle to the new view angle, there is not only movement of the camera but also movement of the moving object, and thus, a change in the background from the depth under the original view angle to the depth under the new view angle is caused by movement of the camera, and a change in the moving object from the depth under the original view angle to the depth under the new view angle is caused by movement of the camera and movement of the moving object. Accordingly, the generating means generates, based on the plurality of original depth images, a first depth image including depth information of the background in the case where the background is photographed at the new view angle, and a second depth image including depth information of the moving object in the case where the moving object is photographed at the new view angle. That is, the depth of the background when the camera photographs the target scene at a new view angle can be determined based on the first depth image, and the depth of the moving target when the camera photographs the target scene at a new view angle can be determined based on the second depth image. Therefore, the generating device can determine the shielding relation between the background and the moving target when the camera shoots the target scene at the new view angle based on the first depth image and the second depth image.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an optical flow model training device according to an embodiment of the present application, where the optical flow model training device 2 includes: acquisition unit 21, processing unit 22, training unit 23, specifically:
an acquisition unit 21, configured to acquire the original image and the target image in the embodiment of the first aspect;
the acquisition unit 21 is configured to acquire a dense optical flow between the original image and the target image;
a processing unit 22 for taking the dense optical flow as a label of the original image and the target image;
the training unit 23 is configured to train a model to be trained based on the original image, the target image and the label, so as to obtain an optical flow model, where the optical flow model is used for estimating dense optical flow.
In the embodiment of the application, the original image is a real image, and the target image generated based on the original image therefore has high realism and quality. The training device trains the model to be trained with the original image, the target image and the dense optical flow between them, which improves the accuracy with which the resulting optical flow model estimates dense optical flow on real images. Moreover, with the method for generating a new view angle image, an optical flow data set covering diverse real scenes can be constructed at low cost; therefore, generating the target image with this method, constructing the optical flow data set for training from the original image and the target image, and training the model to be trained on this data set improves the training effect at a relatively low cost.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a vehicle according to an embodiment of the present application, where the vehicle 3 includes: camera 31, acquisition unit 32, determination unit 33, estimation unit 34, planning unit 35, in particular:
the camera 31 is configured to continuously shoot an environment where the vehicle is located during a running process of the vehicle, so as to obtain a first to-be-processed image and a second to-be-processed image, where the first to-be-processed image and the second to-be-processed image both include a target object;
an acquisition unit 32 for acquiring an optical flow model obtained according to the method of the second aspect;
a determining unit 33 for determining an estimated optical flow between the first image to be processed and the second image to be processed using the optical flow model;
an estimating unit 34 for estimating a first travel path of the target object based on the estimated optical flow;
a planning unit 35 for planning a second travel path of the vehicle based on the first travel path.
In the embodiment of the application, the first to-be-processed image and the second to-be-processed image are obtained by shooting with the camera while the vehicle is travelling. After the optical flow model is acquired, the estimated optical flow between the first to-be-processed image and the second to-be-processed image is determined with the optical flow model, which improves the accuracy of the estimated optical flow. Based on the estimated optical flow, the first travel path of the target object in the two to-be-processed images is estimated, which improves the accuracy of the first travel path. Finally, the second travel path of the vehicle is planned based on the first travel path, which improves the accuracy of the second travel path.
In some embodiments, functions or modules included in the apparatus provided in the embodiments of the present application may be used to perform the methods described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
Fig. 10 is a schematic hardware structure of an electronic device according to an embodiment of the present application. The electronic device 4 comprises a processor 41, a memory 42. Optionally, the electronic device 4 further comprises input means 43 and output means 44. The processor 41, memory 42, input device 43, and output device 44 are coupled by connectors including various interfaces, transmission lines or buses, etc., which are not limited in this application. It should be understood that in various embodiments of the present application, coupled is intended to mean interconnected by a particular means, including directly or indirectly through other devices, e.g., through various interfaces, transmission lines, buses, etc.
The processor 41 may comprise one or more processors, for example one or more central processing units (central processing unit, CPU), which in the case of a CPU may be a single-core CPU or a multi-core CPU. Alternatively, the processor 41 may be a processor group constituted by a plurality of CPUs, and the plurality of processors are coupled to each other through one or more buses. In the alternative, the processor may be another type of processor, and the embodiment of the present application is not limited.
Memory 42 may be used to store computer program instructions as well as various types of computer program code for performing aspects of the present application. Optionally, the memory includes, but is not limited to, a random access memory (random access memory, RAM), a read-only memory (ROM), an erasable programmable read-only memory (erasable programmable read only memory, EPROM), or a portable read-only memory (compact disc read-only memory, CD-ROM) for associated instructions and data.
The input means 43 are for inputting data and/or signals and the output means 44 are for outputting data and/or signals. The input device 43 and the output device 44 may be separate devices or may be an integral device.
It will be appreciated that in the embodiments of the present application, the memory 42 may be used to store not only relevant instructions, but also relevant data, and the embodiments of the present application are not limited to the data specifically stored in the memory.
It will be appreciated that fig. 10 shows only a simplified design of an electronic device. In practical applications, the electronic device may further include other necessary elements, including but not limited to any number of input/output devices, processors, memories, etc., and all electronic devices that may implement the embodiments of the present application are within the scope of protection of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein. It will be further apparent to those skilled in the art that the descriptions of the various embodiments herein are provided with emphasis, and that the same or similar parts may not be explicitly described in different embodiments for the sake of convenience and brevity of description, and thus, parts not described in one embodiment or in detail may be referred to in the description of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted via a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital versatile disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that implementing all or part of the above-described method embodiments may be accomplished by a computer program to instruct related hardware, the program may be stored in a computer readable storage medium, and the program may include the above-described method embodiments when executed. And the aforementioned storage medium includes: a read-only memory (ROM) or a random access memory (random access memory, RAM), a magnetic disk or an optical disk, or the like.

Claims (13)

Priority Applications (1)

Application number: CN202311232786.5A (published as CN117291947A)
Priority date / filing date: 2023-09-21
Title: Method for generating new visual angle image, related method and related product
Status: Pending


Publications (1)

Publication number: CN117291947A
Publication date: 2023-12-26

Family ID: 89240225



Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
