Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a reflected light removal method based on a two-stage reflected light elimination network and pixel loss. The method first constructs a training data set and a test data set from simulated data and real data; then sets the primary sub-network and the secondary sub-network of the generator in the two-stage reflected light elimination network; then sets the loss function of the generator in the two-stage reflected light elimination network based on pixel loss; then sets the loss function of the discriminator in the two-stage reflected light elimination network; then trains the two-stage reflected light elimination network until its parameters converge, obtaining the trained network; and finally uses the trained network to remove the reflected light from the images of the test data set and outputs the transmission maps after the image reflected light is removed.
To achieve this purpose, the invention adopts the following technical scheme: a reflected light removal method based on a two-stage reflected light elimination network and pixel loss, comprising the following steps:
step one, constructing a training data set and a test data set from simulated data and real data;
step two, setting a primary sub-network of a generator in a two-stage reflected light elimination network;
step three, setting a secondary sub-network of a generator in the two-stage reflected light elimination network;
step four, constructing a loss function of the generator in the two-stage reflected light elimination network based on simulated-data pixel loss, using the true transmission map and true reflection map of the simulated data in the training data set, the roughly estimated transmission map and reflection map, and the transmission map after the image reflected light is removed;
step five, constructing a loss function of the generator in the two-stage reflected light elimination network based on real-data pixel loss, using the true transmission map of the real data in the training data set, the roughly estimated transmission map, and the transmission map after the image reflected light is removed;
step six, taking the weighted sum of the generator loss function based on simulated-data pixel loss, the generator loss function based on real-data pixel loss, and the original generator adversarial loss function as the loss function of the generator in the two-stage reflected light elimination network;
step seven, setting a loss function of a discriminator in the two-stage reflected light elimination network;
step eight, training the two-stage reflected light elimination network: sequentially loading the Mth frame image in the training data set as the current frame image; inputting the current frame image into the primary sub-network of the generator to obtain a roughly estimated transmission map and reflection map; inputting the roughly estimated transmission map and reflection map into the secondary sub-network of the generator to obtain the transmission map after the image reflected light is removed; judging whether the current frame image is the last frame image of the training data set; if yes, finishing this round of training and entering step nine; if not, continuing to load the subsequent frame image for training, wherein M is an integer greater than or equal to one;
step nine, judging whether the parameters of the two-stage reflected light elimination network are converged; if yes, finishing all training and entering the step ten; if not, returning to the step eight, and continuing the next round of training until a trained two-stage reflected light elimination network is obtained;
step ten, removing the reflected light from the images of the test data set by using the trained two-stage reflected light elimination network, and outputting the transmission maps after the image reflected light is removed.
In order to optimize the technical scheme, the specific measures adopted further comprise:
further, the second step is specifically realized by the following steps:
S201, setting an 8-layer encoder-decoder, wherein the encoder-decoder has convolution blocks at 4 different scales;
S202, connecting the encoder and decoder layers of the same scale by using 4 convolutional block attention units, respectively;
S203, constructing a fully convolutional neural network, wherein each of the first seven layers has 64 channels and the eighth layer has 3 × 2 channels, forming two three-channel outputs;
S204, connecting the components set in steps S201 to S203 together to serve as the primary sub-network of the generator in the two-stage reflected light elimination network.
Further, the third step is realized by the following steps:
S301, setting 9 feature extraction layers based on a gated convolutional neural network;
S302, setting 1 convolutional feature extraction layer;
S303, connecting the components set in steps S301 to S302 together to serve as the secondary sub-network of the generator in the two-stage reflected light elimination network.
Further, the fourth step specifically includes: setting the loss function of the generator in the two-stage reflected light elimination network based on simulated-data pixel loss according to the following formula:
where $L_{pixelS}$ denotes the loss function of the generator in the two-stage reflected light elimination network based on simulated-data pixel loss, $\nabla$ denotes the gradient operator, $\|\cdot\|_2$ denotes the two-norm, $\eta$ denotes the constraint factor, $\lambda_1$ denotes a weight value, $\lambda_2$ denotes the gradient weight, $T$ denotes the true transmission map, and $R$ denotes the true reflection map; the remaining symbols of the formula denote the roughly estimated transmission map, the transmission map after the image reflected light is removed, and the roughly estimated reflection map.
Further, the fifth step specifically includes: setting a loss function of a generator in the two-stage reflected light elimination network based on real data pixel loss according to the following formula:
where $L_{pixelR}$ denotes the loss function of the generator in the two-stage reflected light elimination network based on real-data pixel loss.
Further, the sixth step specifically includes: the loss function L of the generator in the two-stage reflected light cancellation network is set according to the following formula:
$L = \alpha L_A + \beta L_{pixelS} + \chi L_{pixelR}$
$L_A = -E(D(I, G(I, \theta)))$
where $\alpha$, $\beta$ and $\chi$ are the weight coefficients of $L_A$, $L_{pixelS}$ and $L_{pixelR}$, respectively; $L_A$ is the original generator adversarial loss function; $E(\cdot)$ denotes the expectation operation; $D$ denotes the discriminator in the two-stage reflected light elimination network; $I$ denotes the input image; $G$ denotes the original generator; $D(I, G(I, \theta))$ denotes the probability, output by the discriminator, that $G(I, \theta)$ is a transmission map, given the input image $I$ and the image to be discriminated $G(I, \theta)$; and $\theta$ denotes the network parameters of the original generator.
Further, the seventh step specifically includes: the loss function of the discriminator in the two-stage reflected light cancellation network is set according to the following formula:
where $L_D$ denotes the loss function of the discriminator in the two-stage reflected light elimination network, $T$ denotes the true transmission map, and $\mu$ is a weight coefficient of the formula.
The invention has the beneficial effects that:
First, because the invention adopts the primary sub-network and the secondary sub-network of the generator in the two-stage reflected light elimination network described in steps two and three, the information in the estimated reflection map is used more fully to improve the estimation accuracy of the transmission map. This overcomes the loss of detail that appears in prior-art results, where the transmission map is estimated directly.
Second, because the invention adopts the pixel-loss-based loss function calculation of steps four to six, the network first produces a roughly estimated transmission map and reflection map, and then uses these two estimates together with the extracted features as the input of the secondary sub-network, further improving the accuracy of the transmission map. With this coarse-to-fine two-stage structure, the invention can effectively remove reflected light from images of various scenes and overcomes the color distortion that readily occurs in the prior art.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the invention provides a reflected light removing method based on a two-stage reflected light eliminating network and pixel loss, which comprises the following steps as shown in fig. 1 and fig. 2:
step 1: building a training data set and a testing data set by using the simulation data and the real data;
specifically, in one embodiment of the present invention, the training data used by the two-stage reflected light elimination network come from the Berkeley dataset; the constructed training data set contains 13,700 simulated samples, each with a transmission map and a reflection map, and 90 real samples, and the constructed test data set contains 20 real samples. Fig. 3 shows an input image of the simulated data in an embodiment of the invention. Fig. 4 shows the true transmission map of the simulated data in an embodiment of the invention. Fig. 5 shows the true reflection map of the simulated data in an embodiment of the invention.
Step 2: setting a primary sub-network of a generator in a two-stage reflected light elimination network;
the method is realized by the following steps:
step 201, setting an 8-layer encoder-decoder, wherein the encoder-decoder has convolution blocks at 4 different scales;
specifically, the channel numbers of the 8 convolutional layers of the encoder-decoder are set to {64, 128, 256, 512, 512, 256, 128, 64}, the convolution kernels are all 3 × 3, and each convolutional layer contains one LReLU activation layer and a batch normalization operation.
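The layer layout of step 201 can be written down as a small configuration. The sketch below is illustrative only: the names `ENCODER_DECODER_CHANNELS` and `same_scale_pairs` are hypothetical, and pairing layer i with layer 7 - i is an assumption drawn from the symmetric channel list, not something the embodiment states explicitly.

```python
# Channel counts of the 8 convolutional layers, as listed in the embodiment.
ENCODER_DECODER_CHANNELS = [64, 128, 256, 512, 512, 256, 128, 64]
KERNEL_SIZE = (3, 3)  # every layer uses a 3 x 3 kernel


def same_scale_pairs(channels):
    """Pair each encoder layer with the decoder layer of matching scale.

    Assumption: the first half of the list is the encoder, the second half
    the decoder, and layer i mirrors layer len(channels) - 1 - i.
    """
    n = len(channels)
    return [(i, n - 1 - i) for i in range(n // 2)]


pairs = same_scale_pairs(ENCODER_DECODER_CHANNELS)
for enc, dec in pairs:
    # Mirrored layers carry the same number of channels.
    print(enc, dec, ENCODER_DECODER_CHANNELS[enc], ENCODER_DECODER_CHANNELS[dec])
```

Under this assumption there are four mirrored pairs, one per scale, which is the number of same-scale connections the attention units would need.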
Step 202, connecting coding-decoding layers with the same scale by using 4 convolutional block attention units respectively;
specifically, the convolutional block attention unit enhances features in two steps. The first step is channel enhancement: maximum pooling and average pooling are applied to each channel, giving two feature vectors whose length equals the number of feature channels; the two vectors are then processed by a three-layer fully connected network with shared weights to obtain an enhancement vector; finally, each element of the enhancement vector is used as an enhancement coefficient and multiplied with the corresponding channel feature map, which realizes the channel enhancement of the features. The second step is spatial enhancement: spatial maximum pooling and average pooling are applied to the features, giving two feature maps; a spatial enhancement coefficient is then obtained through a convolution with shared parameters followed by Sigmoid activation; finally, the enhancement coefficient is multiplied with the values of all channels at the same position of the original feature map to obtain the final result.
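The two enhancement steps described above can be sketched in plain Python on nested lists. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the weights `w1`, `w2`, `w_max` and `w_avg` are hypothetical, the two outputs of the shared network are assumed to be summed before a sigmoid (as in the commonly published form of this attention unit), the spatial pooling is assumed to run across channels at each position, and a 1 × 1 convolution stands in for the unit's convolution.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, w1, w2):
    """Scale each channel of feat (C x H x W nested lists) by a coefficient.

    w1 (hidden x C) and w2 (C x hidden) are hypothetical weights of the
    shared fully connected network; biases are omitted for brevity.
    """
    def shared_mlp(vec):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    max_vec = [max(max(row) for row in ch) for ch in feat]
    avg_vec = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # Assumption: the two network outputs are summed before the sigmoid.
    coeffs = [sigmoid(a + b) for a, b in zip(shared_mlp(max_vec), shared_mlp(avg_vec))]
    return [[[c * v for v in row] for row in ch] for c, ch in zip(coeffs, feat)]


def spatial_attention(feat, w_max, w_avg):
    """Scale every position of feat by one coefficient shared by all channels.

    Simplification: a 1 x 1 convolution (weights w_max, w_avg) over the two
    pooled maps stands in for the convolution used by the real unit.
    """
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for i in range(height):
        for j in range(width):
            values = [feat[c][i][j] for c in range(channels)]
            pooled_max = max(values)
            pooled_avg = sum(values) / channels
            coeff = sigmoid(w_max * pooled_max + w_avg * pooled_avg)
            for c in range(channels):
                out[c][i][j] = coeff * feat[c][i][j]
    return out


feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [2.0, 1.0]]]
channel_enhanced = channel_attention(feat, [[0.5, -0.5]], [[1.0], [1.0]])
enhanced = spatial_attention(channel_enhanced, 1.0, 1.0)
```

Because every coefficient lies between 0 and 1, both steps only rescale the features; with all weights set to zero each coefficient is exactly 0.5.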
Step 203, constructing a fully convolutional neural network, wherein each of the first seven layers has 64 channels and the eighth layer has 3 × 2 channels, forming two three-channel outputs;
specifically, the number of channels of the first 7 layers of the fully convolutional sub-network is set to 64, and dilated convolution is introduced to enlarge the receptive field; the dilation rates are set to {2, 4, 8, 16, 32, 64, 1, 1}, all convolution windows are 3 × 3, and the activation and normalization settings of the first 7 layers are the same as those of the encoder-decoder sub-network. The output of the last layer has 3 × 2 channels and is treated as two three-channel RGB images, which represent the roughly estimated reflection map and transmission map, respectively.
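To see why the dilation rates enlarge the receptive field, the growth can be computed layer by layer. The sketch below assumes stride-1 convolutions throughout (the text does not state the stride); `receptive_field` and `split_outputs` are hypothetical helper names.

```python
DILATIONS = [2, 4, 8, 16, 32, 64, 1, 1]  # one rate per layer, from the embodiment


def receptive_field(dilations, kernel_size=3):
    """Receptive field (in pixels, per axis) of stacked stride-1 dilated convs.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for rate in dilations:
        field += (kernel_size - 1) * rate
    return field


def split_outputs(pixel):
    """Split one 6-channel output pixel into two RGB triples.

    The order (reflection first, transmission second) follows the order in
    which the text lists the two maps.
    """
    return pixel[:3], pixel[3:]


print(receptive_field(DILATIONS))  # 257
```

With ordinary (dilation 1) 3 × 3 convolutions, the same eight layers would cover only 17 pixels per axis.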
Step 204, connecting the components set in steps 201 to 203 together to serve as the primary sub-network of the generator in the two-stage reflected light elimination network.
Step 3: setting a secondary sub-network of a generator in the two-stage reflected light elimination network;
the method is realized by the following steps:
step 301, setting 9 feature extraction layers based on a gated convolutional neural network;
specifically, each of the 9 feature extraction layers based on the gated convolutional neural network has 32 feature channels, the dilation rates of the dilated convolutions are set to {1, 2, 4, 8, 16, 32, 64, 1, 1}, respectively, and the convolution window size is 3 × 3.
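The text does not spell out the gating mechanism. In the commonly used formulation of gated convolution, a second convolution produces a gate that is passed through a sigmoid and multiplied with the activated feature response. The one-dimensional sketch below follows that formulation as an assumption; the kernels, the leaky slope of 0.2, and the function names are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, stride-1 dilated convolution (cross-correlation form)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def gated_conv1d(signal, feature_kernel, gate_kernel, dilation=1, slope=0.2):
    """Leaky-ReLU feature response multiplied by a sigmoid gate."""
    features = dilated_conv1d(signal, feature_kernel, dilation)
    gates = dilated_conv1d(signal, gate_kernel, dilation)
    activated = [f if f > 0 else slope * f for f in features]
    return [a * sigmoid(g) for a, g in zip(activated, gates)]


# Identity feature kernel and an all-zero gate kernel: every gate is 0.5.
out = gated_conv1d([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0], dilation=2)
```

The gate lets each position decide how much of its feature response to pass on, which is the usual motivation given for gated layers in image restoration networks.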
Step 302, setting 1 convolutional feature extraction layer;
specifically, this final convolutional feature extraction layer is an ordinary convolutional layer without activation or normalization, and its output has 3 channels, namely the RGB transmission map after the reflected light is removed.
Step 303, connecting steps S301 to S302 together as a secondary sub-network of a generator in a two-stage reflected light cancellation network.
Step 4: constructing a loss function of the generator in the two-stage reflected light elimination network based on simulated-data pixel loss, using the true transmission map and true reflection map of the simulated data in the training data set, the roughly estimated transmission map and reflection map, and the transmission map after the image reflected light is removed. Specifically, the loss function of the generator in the two-stage reflected light elimination network based on simulated-data pixel loss is set according to the following formula:
where $L_{pixelS}$ denotes the loss function of the generator in the two-stage reflected light elimination network based on simulated-data pixel loss, $\nabla$ denotes the gradient operator, $\|\cdot\|_2$ denotes the two-norm, $\eta$ denotes the constraint factor, $\lambda_1$ denotes a weight value, $\lambda_2$ denotes the gradient weight, $T$ denotes the true transmission map, and $R$ denotes the true reflection map; the remaining symbols of the formula denote the roughly estimated transmission map, the transmission map after the image reflected light is removed, and the roughly estimated reflection map.
Specifically, in the experiment $\eta$ is 0.5, $\lambda_1$ is 0.2, and $\lambda_2$ is 0.4. The constraint factor is introduced to increase the weight of the error of the final transmission prediction, thereby improving accuracy.
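The symbol definitions name two ingredients of the pixel loss: a two-norm of an image difference and a two-norm of a gradient difference. The sketch below shows only these building blocks on tiny single-channel images; how they are combined with $\eta$, $\lambda_1$ and $\lambda_2$ is fixed by the formula of step 4, which this sketch does not attempt to reproduce. The forward-difference gradient and all function names are assumptions.

```python
import math


def l2_norm_of_difference(a, b):
    """Two-norm of the element-wise difference of two equally sized images."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def gradient(img):
    """Forward-difference gradient: returns (horizontal, vertical) maps."""
    gx = [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]
    gy = [[img[i + 1][j] - img[i][j] for j in range(len(img[0]))]
          for i in range(len(img) - 1)]
    return gx, gy


def gradient_error(a, b):
    """Two-norm of the difference between the gradients of two images."""
    (ax, ay), (bx, by) = gradient(a), gradient(b)
    return math.sqrt(l2_norm_of_difference(ax, bx) ** 2
                     + l2_norm_of_difference(ay, by) ** 2)


true_t = [[1.0, 2.0], [3.0, 4.0]]      # stand-in for the true transmission map
refined_t = [[1.0, 2.0], [3.0, 5.0]]   # stand-in for the network output
pixel_term = l2_norm_of_difference(true_t, refined_t)
gradient_term = gradient_error(true_t, refined_t)
```

A gradient term of this kind penalizes errors in edges and texture, which a plain pixel difference can under-weight.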
Step 5: constructing a loss function of the generator in the two-stage reflected light elimination network based on real-data pixel loss, using the true transmission map of the real data in the training data set, the roughly estimated transmission map, and the transmission map after the image reflected light is removed. Specifically, the loss function of the generator in the two-stage reflected light elimination network based on real-data pixel loss is set according to the following formula:
where $L_{pixelR}$ denotes the loss function of the generator in the two-stage reflected light elimination network based on real-data pixel loss.
Specifically, for real data, no reflection error term is included in the loss function because there is no reflection reference map.
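The difference between the simulated-data and real-data losses comes down to which estimate/reference pairs are available for supervision. A minimal sketch of that selection follows; the dictionary keys and the function name `supervised_pairs` are hypothetical.

```python
def supervised_pairs(sample):
    """List the (estimate, reference) pairs that can enter the pixel loss.

    Real samples carry no reflection reference, so the reflection pair is
    included only when a true reflection map is present.
    """
    pairs = [
        ("coarse_transmission", "true_transmission"),
        ("refined_transmission", "true_transmission"),
    ]
    if sample.get("true_reflection") is not None:
        pairs.append(("coarse_reflection", "true_reflection"))
    return pairs


simulated = {"true_transmission": "T_sim", "true_reflection": "R_sim"}
real = {"true_transmission": "T_real", "true_reflection": None}
print(len(supervised_pairs(simulated)), len(supervised_pairs(real)))  # 3 2
```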
Step 6: taking the weighted sum of the generator loss function based on simulated-data pixel loss, the generator loss function based on real-data pixel loss, and the original generator adversarial loss function as the loss function of the generator in the two-stage reflected light elimination network. Specifically, the loss function of the generator in the two-stage reflected light elimination network is set according to the following formula:
$L = \alpha L_A + \beta L_{pixelS} + \chi L_{pixelR}$
$L_A = -E(D(I, G(I, \theta)))$
where $\alpha$, $\beta$ and $\chi$ are the weight coefficients of $L_A$, $L_{pixelS}$ and $L_{pixelR}$, respectively; $L_A$ is the original generator adversarial loss function; $E(\cdot)$ denotes the expectation operation; $D$ denotes the discriminator in the two-stage reflected light elimination network; $I$ denotes the input image; $G$ denotes the original generator; $D(I, G(I, \theta))$ denotes the probability, output by the discriminator, that $G(I, \theta)$ is a transmission map, given the input image $I$ and the image to be discriminated $G(I, \theta)$; and $\theta$ denotes the network parameters of the original generator.
Specifically, α, β, and χ were all equal to 1 in the experiment.
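The weighted sum of step 6 is simple to compute once the three component losses are known. The sketch below approximates the expectation in $L_A$ by a mean over a batch of discriminator outputs, which is an assumption about how the expectation is estimated; the numeric inputs are made-up placeholders, not experimental values.

```python
def adversarial_loss(discriminator_scores):
    """L_A = -E(D(I, G(I, theta))), with the expectation taken as a batch mean."""
    return -sum(discriminator_scores) / len(discriminator_scores)


def generator_loss(l_a, l_pixel_s, l_pixel_r, alpha=1.0, beta=1.0, chi=1.0):
    """L = alpha * L_A + beta * L_pixelS + chi * L_pixelR (all weights 1 in the experiment)."""
    return alpha * l_a + beta * l_pixel_s + chi * l_pixel_r


scores = [0.2, 0.4]  # placeholder discriminator outputs for two generated images
l_a = adversarial_loss(scores)
total = generator_loss(l_a, l_pixel_s=1.0, l_pixel_r=0.5)
```

The negative sign means that the generator loss falls as the discriminator assigns higher probability to the generated transmission maps.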
Step 7: setting a loss function of the discriminator in the two-stage reflected light elimination network. Specifically, the loss function of the discriminator in the two-stage reflected light elimination network is set according to the following formula:
where $L_D$ denotes the loss function of the discriminator in the two-stage reflected light elimination network, and $\mu$ is a weight coefficient of the formula.
Step 8: training the two-stage reflected light elimination network: sequentially loading the Mth frame image in the training data set as the current frame image; inputting the current frame image into the primary sub-network of the generator to obtain a roughly estimated transmission map and reflection map; inputting the roughly estimated transmission map and reflection map into the secondary sub-network of the generator to obtain the transmission map after the image reflected light is removed; judging whether the current frame image is the last frame image of the training data set; if yes, finishing this round of training and entering step 9; if not, setting M = M + 1 and continuing to load the subsequent frame image for training, wherein M is an integer greater than or equal to one;
Fig. 6 shows the roughly estimated transmission map of the simulated data in an embodiment of the invention, Fig. 7 shows the roughly estimated reflection map of the simulated data, Fig. 8 shows the transmission map of the simulated data after the reflected light is removed, Fig. 9 shows an input image of the real data, Fig. 10 shows the true transmission map of the real data, Fig. 11 shows the roughly estimated transmission map of the real data, and Fig. 12 shows the transmission map of the real data after the reflected light is removed.
Step 9: judging whether the parameters of the two-stage reflected light elimination network have converged; if yes, finishing all training and entering step 10; if not, returning to step 8 with M = M + 1 and continuing with the next round of training, until the trained two-stage reflected light elimination network is obtained;
specifically, the two-stage reflected light elimination network of the present invention is trained on an Nvidia RTX Titan V with TensorFlow 1.9.0 for 150 rounds: 50 rounds at each of the learning rates 0.0001, 0.00003 and 0.00001.
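The control flow of steps 8 and 9, together with the learning-rate schedule, can be sketched as two nested loops. Everything passed into `train` is a hypothetical placeholder callable, not the embodiment's TensorFlow code; the sketch also assumes that the three learning rates are applied in the order listed and that each round starts again from the first frame.

```python
def learning_rate(round_index):
    """Learning rate for a 1-based training round (50 rounds per rate)."""
    schedule = [(50, 1e-4), (100, 3e-5), (150, 1e-5)]
    for last_round, rate in schedule:
        if round_index <= last_round:
            return rate
    raise ValueError("round_index exceeds the 150 scheduled rounds")


def train(frames, coarse_stage, refine_stage, update, converged, max_rounds=150):
    """Run training rounds until `converged` reports True or rounds run out."""
    for round_index in range(1, max_rounds + 1):
        rate = learning_rate(round_index)
        losses = []
        for frame in frames:  # step 8: load the frames one after another
            coarse_t, coarse_r = coarse_stage(frame)      # primary sub-network
            refined_t = refine_stage(coarse_t, coarse_r)  # secondary sub-network
            losses.append(update(frame, coarse_t, coarse_r, refined_t, rate))
        if converged(losses):  # step 9: convergence check after each round
            return round_index
    return max_rounds


# Dummy callables that only record which learning rate each update received.
seen_rates = []


def dummy_update(frame, coarse_t, coarse_r, refined_t, rate):
    seen_rates.append(rate)
    return 0.0


rounds = train([0, 1], lambda f: (f, f), lambda t, r: t, dummy_update,
               lambda losses: False)
```

Separating the convergence test into its own callable keeps the loop independent of whichever criterion is chosen, since the text does not specify one.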
Step 10: removing the reflected light from the images of the test data set by using the trained two-stage reflected light elimination network, and outputting the transmission maps after the image reflected light is removed.
In summary, the reflected light removal method based on the two-stage reflected light elimination network removes image reflected light as follows: it first sets the primary sub-network and the secondary sub-network of the generator in the two-stage reflected light elimination network; then sets the loss function of the generator; then sets the loss function of the discriminator; then trains the network until its parameters converge, obtaining the trained two-stage reflected light elimination network; and finally uses the trained network to remove the reflected light from the images of the test data set, producing transmission maps with the reflected light removed. The method effectively removes reflected light from images of various scenes and avoids color distortion and loss of detail.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; all technical solutions that follow the idea of the present invention fall within its protection scope. It should be noted that those skilled in the art may make modifications and refinements without departing from the principle of the invention, and these are also regarded as within the protection scope of the invention.