CN110378844B - A Blind Image Deblurring Method Based on Recurrent Multiscale Generative Adversarial Networks - Google Patents

A Blind Image Deblurring Method Based on Recurrent Multiscale Generative Adversarial Networks

Info

Publication number
CN110378844B
Authority
CN
China
Prior art keywords
image
layer
size
generator
convolution
Prior art date
Legal status
Expired - Fee Related
Application number
CN201910515590.4A
Other languages
Chinese (zh)
Other versions
CN110378844A (en)
Inventor
陈华华
陈富成
叶学义
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN201910515590.4A
Publication of CN110378844A
Application granted
Publication of CN110378844B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

(Translated from Chinese)

The invention discloses a blind image motion deblurring method based on a recurrent multi-scale generative adversarial network. The method uses a recurrent multi-scale encoder-decoder as the generator and constructs a corresponding discriminator. The adversarial loss between generated and sharp images, a multi-scale mean square error, and a multi-scale gradient error together form the loss function of the generative adversarial network, which is optimized by gradient descent. The generative adversarial network learns the relationship between a motion-blurred image and its corresponding sharp image, dispensing with the complicated blur-kernel estimation process. The method can extract the edge features of the image, has a simpler network structure with fewer parameters, and the network model is easier to train and achieves better restoration results.

Description

Blind image motion deblurring method based on a recurrent multi-scale generative adversarial network
Technical Field
The invention belongs to the technical field of image processing and relates to a blind image motion deblurring method based on a recurrent multi-scale generative adversarial network.
Background
Because it is difficult to keep the capture device and the imaged object relatively stationary, images often suffer from motion blur. Yet in daily life, traffic safety, medicine, military reconnaissance and other fields, obtaining a sharp image is very important.
Motion blur can be modeled as the convolution of a sharp image with a two-dimensional linear function, corrupted by additive noise. This linear function, called the point spread function or blur kernel, carries the blurring information of the image. Blind deblurring refers to restoring the original sharp image using only the information in the blurred image when the blurring process is unknown (i.e., the blur kernel is unknown). For a single motion-blurred image, both the blur kernel and its size are unknown, which affects the accuracy of blur-kernel estimation and in turn the final restoration quality.
Disclosure of Invention
The invention aims to provide a blind image motion deblurring method based on a recurrent multi-scale generative adversarial network that addresses the characteristics of image motion blur; the method can estimate a sharp image without estimating the blur kernel.
The invention specifically comprises the following steps:
step (1), constructing a discriminator D;
The discriminator D consists of nine convolutional layers, one fully connected layer and one Sigmoid activation layer, and takes a 256 × 256 color image as input.
Each convolutional layer uses LeakyReLU as the activation function: the first layer has 32 convolution kernels, each of size 5 × 5, with stride 2 and zero-padding width 2; the second layer has 64 kernels of size 5 × 5, stride 1, zero-padding 2; the third layer has 64 kernels of size 5 × 5, stride 2, zero-padding 2; the fourth layer has 128 kernels of size 5 × 5, stride 1, zero-padding 2; the fifth layer has 128 kernels of size 5 × 5, stride 4, zero-padding 2; the sixth layer has 256 kernels of size 5 × 5, stride 1, zero-padding 2; the seventh layer has 256 kernels of size 5 × 5, stride 4, zero-padding 2; the eighth layer has 512 kernels of size 5 × 5, stride 1, zero-padding 2; the ninth layer has 512 kernels of size 4 × 4, stride 4, zero-padding 0.
The convolution output of the last layer passes through a fully connected layer with 512 input channels and 1 output channel, producing a single scalar, which is activated by the Sigmoid function to output the decision probability.
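For concreteness, the discriminator can be sketched in a few lines of PyTorch; the layer widths, kernel sizes, strides and paddings follow the text above, while the LeakyReLU slope (0.2 here) is an assumption the patent does not state:

```python
import torch
import torch.nn as nn

# Minimal sketch of the nine-layer discriminator D described above.
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        cfg = [  # (out_channels, kernel, stride, padding), per the text
            (32, 5, 2, 2), (64, 5, 1, 2), (64, 5, 2, 2),
            (128, 5, 1, 2), (128, 5, 4, 2), (256, 5, 1, 2),
            (256, 5, 4, 2), (512, 5, 1, 2), (512, 4, 4, 0),
        ]
        layers, in_ch = [], 3
        for out_ch, k, s, p in cfg:
            layers += [nn.Conv2d(in_ch, out_ch, k, stride=s, padding=p),
                       nn.LeakyReLU(0.2, inplace=True)]  # slope assumed
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        self.fc = nn.Linear(512, 1)  # 512 input channels -> 1 scalar

    def forward(self, x):            # x: (N, 3, 256, 256)
        h = self.features(x)         # -> (N, 512, 1, 1)
        return torch.sigmoid(self.fc(h.flatten(1)))  # decision probability

# Sanity check: a 256x256 color image maps to one probability per sample.
# d = Discriminator(); print(d(torch.randn(2, 3, 256, 256)).shape)  # (2, 1)
```

With the listed strides (2, 1, 2, 1, 4, 1, 4, 1, 4), a 256 × 256 input shrinks to 1 × 1 at the ninth layer, which is what lets the 512-channel output feed the fully connected layer directly.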
Step (2), constructing a generator G;
The generator G comprises three cascaded scale sub-networks; each sub-network contains 1 input module, 2 encoding modules, 1 cascaded convolutional long short-term memory (ConvLSTM) module, 2 decoding modules and 1 output module. Each module contains residual modules; a residual module consists of a convolutional layer cascaded with a convolution kernel, where the convolutional layer uses the rectified linear unit (ReLU) as its activation function. The output of the cascaded convolution kernel is added to the residual module's input to give the residual module's output.
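A minimal residual module in PyTorch might look as follows; reading "a convolutional layer cascaded with a convolution kernel" as conv → ReLU → conv with a skip connection is an interpretation, and the channel count and kernel size are supplied by the enclosing module (e.g. 32 channels and 5 × 5 kernels in the input module):

```python
import torch.nn as nn

# Sketch of the residual module: a ReLU-activated convolution cascaded with
# a second convolution, whose output is added back to the module input.
class ResBlock(nn.Module):
    def __init__(self, channels, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2  # keeps the spatial size unchanged
        self.conv1 = nn.Conv2d(channels, channels, kernel_size, padding=pad)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size, padding=pad)

    def forward(self, x):
        return x + self.conv2(self.relu(self.conv1(x)))
```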
The input module comprises one independent convolutional layer and three identical residual modules; the convolution kernels of the independent convolutional layer and of the residual modules' convolutional layers number 32, with size 5 × 5, stride 1 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The first encoding module comprises one independent convolutional layer and three identical residual modules, with 64 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The second encoding module comprises one independent convolutional layer and three identical residual modules, with 128 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The cell-state output of the ConvLSTM module serves as the input of the decoding modules, and the hidden-state output of the ConvLSTM module is connected to the hidden-state input of the ConvLSTM module in the next scale's sub-network; at the last scale, the hidden-state output of the ConvLSTM module is not connected to any other module.
The structure of the convolutional long short-term memory (ConvLSTM) module follows Shi X., Chen Z., Wang H., et al., "Convolutional LSTM Network: A machine learning approach for precipitation nowcasting," International Conference on Neural Information Processing Systems, 2015, pp. 802-810.
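A minimal ConvLSTM cell in the spirit of that reference is sketched below; the peephole connections of the published formulation are omitted for brevity, so this is a simplification rather than the exact module:

```python
import torch
import torch.nn as nn

# Simplified ConvLSTM cell (Shi et al., 2015) without peephole terms.
class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hidden_ch, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2
        # One convolution produces all four gates at once.
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                               kernel_size, padding=pad)

    def forward(self, x, state):
        h, c = state  # hidden state and cell state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)),
                                 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c
```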
The first decoding module comprises three identical residual modules and one independent convolutional layer, with 128 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The second decoding module comprises three identical residual modules and one independent convolutional layer, with 64 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The output module comprises three identical residual modules and one independent convolutional layer, with 32 convolution kernels of size 5 × 5, stride 1 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The third-level scale outputs the generator image L3 of size 64 × 64; L3 is upsampled to a 128 × 128 image that serves as input to the second-level scale, which outputs the 128 × 128 generator image L2. L2 is upsampled to a 256 × 256 image that serves as input to the first-level scale, which outputs the 256 × 256 generator image L1, i.e. the deblurred result image. In the three cascaded scale sub-networks, the corresponding structures, channel numbers and convolution-kernel sizes of the three sub-networks are all identical, and the weights are shared across the three RGB channels of the color image.
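The coarse-to-fine flow across the three scales can be sketched as follows; `subnet` stands for the shared sub-network (input module, encoders, ConvLSTM, decoders, output module), and combining the upsampled previous output with the blurred input by channel concatenation is an assumption, since the text does not spell out how the two enter the next scale:

```python
import torch
import torch.nn.functional as F

# Sketch of the recurrent multi-scale generator pass, coarse to fine.
# blurred_pyramid: [B3 (64x64), B2 (128x128), B1 (256x256)].
# `subnet(x, state)` is assumed to return (output_image, new_state), with
# the ConvLSTM state carried from each scale to the next finer one.
def generator_forward(subnet, blurred_pyramid, init_state):
    outputs, state = [], init_state
    prev = blurred_pyramid[0]  # coarsest scale starts from the blurred input
    for b in blurred_pyramid:
        if prev.shape[-2:] != b.shape[-2:]:
            # upsample the previous scale's result to the current resolution
            prev = F.interpolate(prev, size=b.shape[-2:], mode='bilinear',
                                 align_corners=False)
        out, state = subnet(torch.cat([b, prev], dim=1), state)
        outputs.append(out)
        prev = out
    return outputs  # [L3, L2, L1]; L1 is the deblurred result
```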
Step (3): Randomly draw m (m ≥ 16) blurred images and the corresponding sharp images from the training data set T and randomly crop them into 256 × 256 square regions, forming a blurred image set B for training and the corresponding sharp image set S; B and S each contain m images, every image being a 256 × 256 3-channel color image. The blurred image set B is fed into the generator to obtain the generator output image set L, which contains m color images of size 256 × 256.
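Step (3) amounts to the following sampling sketch, assuming the training set is available as lists of (3, H, W) tensors with H, W ≥ 256; applying the same crop window to both images of a pair keeps them aligned, as the text implies:

```python
import random
import torch

# Draw m random blurred/sharp pairs and take aligned 256x256 crops.
def sample_batch(blur_imgs, sharp_imgs, m=16, crop=256):
    idx = random.sample(range(len(blur_imgs)), m)
    B, S = [], []
    for i in idx:
        _, h, w = blur_imgs[i].shape
        top, left = random.randint(0, h - crop), random.randint(0, w - crop)
        B.append(blur_imgs[i][:, top:top + crop, left:left + crop])
        S.append(sharp_imgs[i][:, top:top + crop, left:left + crop])
    return torch.stack(B), torch.stack(S)  # each (m, 3, 256, 256)
```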
Step (4): The generator output image set L and the corresponding sharp image set S are fed in turn to the discriminator, which outputs two groups of confidence results, each containing m probability values, to decide whether each input image is a sharp image or a generated image: if the probability value is greater than 0.5, it is judged a sharp image; if it is less than or equal to 0.5, it is judged a generated image.
Step (5): Construct the loss function for training the generator: $l_{db} = l_E + \alpha_1 l_{grad} + \alpha_2 l_{adv}$
Wherein alpha is1、α2Is a regular term coefficient greater than 0, lETo the generator, the mean square error between the image set L and the corresponding clear image set S is output, i.e.:
Figure BDA0002094940980000031
wherein L isi、SiRepresenting the generator output image and the sharp image, respectively, at the ith scale, NiThe number of pixels of all channels on the ith scale image is represented, i is 1,2 and 3; and (3) carrying out multi-scale down sampling on the image for 3 times to obtain the image with reduced size, wherein the first-level scale is the image with the original size, and from the second level, the size of each level of image is half of the width and the height of the previous-level image.
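A sketch of $l_E$ under these definitions, with the sharp pyramid built by bilinear downsampling (an assumption; the text does not name the resampling filter), and with PyTorch's mean reduction standing in for the per-scale $1/N_i$ normalization (it additionally averages over the batch):

```python
import torch.nn.functional as F

# Multi-scale mean square error: compare each generator output L_i with the
# sharp image resized to the same scale, summed over i = 1, 2, 3.
def loss_mse(outputs, sharp):  # outputs = [L3, L2, L1], coarse to fine
    total = 0.0
    for L_i in outputs:
        S_i = F.interpolate(sharp, size=L_i.shape[-2:], mode='bilinear',
                            align_corners=False)
        total = total + F.mse_loss(L_i, S_i, reduction='mean')
    return total
```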
$l_{grad}$ is the gradient error between the gradient images of $L_i$ and $S_i$, i.e.:

$$l_{grad} = \sum_{i=1}^{3} \frac{1}{N_i} \left( \left\| L_i(d_x) - S_i(d_x) \right\|_2^2 + \left\| L_i(d_y) - S_i(d_y) \right\|_2^2 \right)$$

where $L_i(d_x)$ and $L_i(d_y)$ denote the horizontal and vertical gradients of $L_i$, and $S_i(d_x)$ and $S_i(d_y)$ denote the horizontal and vertical gradients of $S_i$.
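A matching sketch of $l_{grad}$ using simple finite differences for the horizontal and vertical gradients (the text does not specify the gradient operator, so finite differences are an assumption):

```python
import torch.nn.functional as F

def image_gradients(x):
    # Horizontal (dx) and vertical (dy) finite differences.
    dx = x[..., :, 1:] - x[..., :, :-1]
    dy = x[..., 1:, :] - x[..., :-1, :]
    return dx, dy

# Gradient error between the gradient images of L_i and S_i, over 3 scales.
def loss_grad(outputs, sharp):
    total = 0.0
    for L_i in outputs:
        S_i = F.interpolate(sharp, size=L_i.shape[-2:], mode='bilinear',
                            align_corners=False)
        Ldx, Ldy = image_gradients(L_i)
        Sdx, Sdy = image_gradients(S_i)
        total = total + F.mse_loss(Ldx, Sdx) + F.mse_loss(Ldy, Sdy)
    return total
```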
$l_{adv}$ is the discrimination error over the generator output image set L and the corresponding sharp image set S:

$$l_{adv} = \mathbb{E}_{s \sim p(S)}\left[ \log D(s) \right] + \mathbb{E}_{b \sim p(B)}\left[ \log\left( 1 - D(G(b)) \right) \right]$$
where s ~ p(S) indicates that the sharp image s is drawn from the sharp image set S, and p(S) denotes the probability distribution of S; b ~ p(B) indicates that the blurred image b is drawn from the blurred image set B, and p(B) denotes the probability distribution of B;
D(s) denotes the discriminator's probability for the input image s, G(b) denotes the result image generated by the generator from the input image b, and E[·] denotes the expectation of the bracketed quantity.
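In code, the discriminator's side of $l_{adv}$ can be written directly from the expectation above; for the generator's term, the non-saturating form $-\log D(G(b))$ is used below, a common stand-in for the patent's literal minimax expression rather than its exact form:

```python
import torch

# Discriminator objective: maximize E[log D(s)] + E[log(1 - D(G(b)))],
# written as a loss to minimize; `eps` guards the logarithm.
def loss_d(D, sharp, generated, eps=1e-8):
    real = torch.log(D(sharp) + eps).mean()
    fake = torch.log(1.0 - D(generated.detach()) + eps).mean()
    return -(real + fake)

# Generator term: non-saturating -log D(G(b)) variant (an assumption; the
# patent states the minimax objective without naming a variant).
def loss_adv_g(D, generated, eps=1e-8):
    return -torch.log(D(generated) + eps).mean()
```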
Step (6): Feed the generated images together with the sharp images into the discriminator, iteratively update the weight parameters of every layer by gradient descent, and keep optimizing $l_{adv}$ until the discriminator can no longer tell whether an input image is a generated image or a sharp image, i.e. the change in the difference between the obtained probability value and 0.5 is less than thr, with 0.01 ≤ thr ≤ 0.08; discriminator training is then finished.
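One discriminator update of step (6) might then look as follows, reusing the `loss_d` sketch above; `opt_d` is an assumed optimizer over D's parameters (the patent specifies gradient descent but not a particular optimizer):

```python
import torch

# One discriminator step. G is assumed to return the list [L3, L2, L1];
# only the finest output L1 (256x256) is shown to the discriminator.
def train_discriminator_step(D, G, opt_d, blurred, sharp):
    with torch.no_grad():
        generated = G(blurred)[-1]      # finest-scale output L1
    opt_d.zero_grad()
    loss = loss_d(D, sharp, generated)  # reuses the sketch above
    loss.backward()
    opt_d.step()
    # Per the text, this repeats until |p - 0.5| changes by less than thr.
    return loss.item()
```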
Step (7): Train the generator with the loss function $l_{db} = l_E + \alpha_1 l_{grad} + \alpha_2 l_{adv}$: feed the blurred images into the generator, obtain the generated images by forward propagation, compare the generated images with the sharp images, iteratively update the weight parameters of every layer by gradient descent, and keep optimizing $l_{db}$ until the change of the total training-set loss $l_{db}$ in the generator training phase is less than a threshold Th, with 0.001 ≤ Th ≤ 0.01; generator training is then finished.
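And one generator update of step (7), combining the `loss_mse`, `loss_grad` and `loss_adv_g` sketches above; the default coefficients follow the embodiment below (α1 = 10⁻², α2 = 10⁻⁴):

```python
import torch

# One generator step against l_db = l_E + a1 * l_grad + a2 * l_adv.
def train_generator_step(G, D, opt_g, blurred, sharp, a1=1e-2, a2=1e-4):
    opt_g.zero_grad()
    outputs = G(blurred)                # [L3, L2, L1]
    l_db = (loss_mse(outputs, sharp)
            + a1 * loss_grad(outputs, sharp)
            + a2 * loss_adv_g(D, outputs[-1]))
    l_db.backward()
    opt_g.step()
    # Training stops once the change of l_db over the training set falls
    # below Th (0.001 <= Th <= 0.01 per the text).
    return l_db.item()
```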
Step (8): Repeat steps (3) to (7) of the training process until the change of the total training-set loss $l_{db}$ in the generator training phase is less than the threshold Th, i.e. the discriminator cannot decide whether an input image is a sharp image or a generated image; the generator and discriminator models are then deemed to have converged, and a blurred image fed into the generator yields the estimated deblurred image.
The method of the invention uses deep learning to learn the relationship between a motion-blurred image and the corresponding sharp image, dispensing with the complicated blur-kernel estimation process. Through adversarial training on a large number of blurred and sharp images, the resulting model can extract the edge features of the image; it has a simpler network structure with fewer parameters, is easier to train, and achieves better restoration results.
Detailed Description
The following further illustrates the practice of the present invention.
The blurred image set B is fed into the generator G; the resulting generator output image set L is used as input to the discriminator D to obtain the discriminator's decision. Likewise, the sharp image set S is also fed to the discriminator to obtain its decision. The decision indicates whether the input comes from the sharp image set or the generated image set: if the result is greater than 0.5, the input is judged to come from the sharp image set S; otherwise it is judged to come from the generator output image set L. The error between the decision result and the ground-truth labels is computed and the discriminator is optimized by gradient descent; then the mean error between the generated and sharp images is computed and the generator is optimized by gradient descent. Discriminator and generator are optimized alternately until the models converge. In the experiments of the invention, the model converged after a total of about 400,000 training iterations.
The specific steps of the blind image motion deblurring method based on a recurrent multi-scale generative adversarial network are as follows:
s1, constructing a discriminator D: the discriminator D is composed of nine convolutional layers, one full link layer, and one Sigmoid active layer, and inputs a color image having a size of 256 × 256.
Each convolutional layer uses LeakyReLU as the activation function: the first layer has 32 convolution kernels, each of size 5 × 5, with stride 2 and zero-padding width 2; the second layer has 64 kernels of size 5 × 5, stride 1, zero-padding 2; the third layer has 64 kernels of size 5 × 5, stride 2, zero-padding 2; the fourth layer has 128 kernels of size 5 × 5, stride 1, zero-padding 2; the fifth layer has 128 kernels of size 5 × 5, stride 4, zero-padding 2; the sixth layer has 256 kernels of size 5 × 5, stride 1, zero-padding 2; the seventh layer has 256 kernels of size 5 × 5, stride 4, zero-padding 2; the eighth layer has 512 kernels of size 5 × 5, stride 1, zero-padding 2; the ninth layer has 512 kernels of size 4 × 4, stride 4, zero-padding 0.
The convolution output of the last layer passes through a fully connected layer with 512 input channels and 1 output channel, producing a single scalar, which is activated by the Sigmoid function to output the decision probability.
S2. Construct the generator G: the generator G comprises three cascaded scale sub-networks; each sub-network contains 1 input module, 2 encoding modules, 1 cascaded convolutional long short-term memory (ConvLSTM) module, 2 decoding modules and 1 output module. Each module contains residual modules; a residual module consists of a convolutional layer cascaded with a convolution kernel, where the convolutional layer uses the rectified linear unit (ReLU) as its activation function. The output of the cascaded convolution kernel is added to the residual module's input to give the residual module's output.
The input module comprises one independent convolutional layer and three identical residual modules; the convolution kernels of the independent convolutional layer and of the residual modules' convolutional layers number 32, with size 5 × 5, stride 1 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The first encoding module comprises one independent convolutional layer and three identical residual modules, with 64 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The second encoding module comprises one independent convolutional layer and three identical residual modules, with 128 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The cell-state output of the ConvLSTM module serves as the input of the decoding modules, and the hidden-state output of the ConvLSTM module is connected to the hidden-state input of the ConvLSTM module in the next scale's sub-network; at the last scale, the hidden-state output of the ConvLSTM module is not connected to any other module.
The first decoding module comprises three identical residual modules and one independent convolutional layer, with 128 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The second decoding module comprises three identical residual modules and one independent convolutional layer, with 64 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The output module comprises three identical residual modules and one independent convolutional layer, with 32 convolution kernels of size 5 × 5, stride 1 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The third-level scale outputs the generator image L3 of size 64 × 64; L3 is upsampled to a 128 × 128 image that serves as input to the second-level scale, which outputs the 128 × 128 generator image L2. L2 is upsampled to a 256 × 256 image that serves as input to the first-level scale, which outputs the 256 × 256 generator image L1, i.e. the deblurred result image. In the three cascaded scale sub-networks, the corresponding structures, channel numbers and convolution-kernel sizes of the three sub-networks are all identical, and the weights are shared across the three RGB channels of the color image.
S3. Randomly draw m (m = 16) blurred images and the corresponding sharp images from the training data set T and randomly crop them into 256 × 256 square regions, forming a blurred image set B for training and the corresponding sharp image set S; the images of B and S are all 256 × 256 3-channel color images. The blurred image set B is fed into the generator to obtain the generator output image set L.
S4. The generator output image set L and the corresponding sharp image set S are fed in turn to the discriminator, which outputs two groups of confidence results, each containing 16 probability values, to decide whether each input image is a sharp image or a generated image: if the probability value is greater than 0.5, it is judged a sharp image; otherwise it is judged a generated image.
S5. Construct the loss function for training the generator: $l_{db} = l_E + \alpha_1 l_{grad} + \alpha_2 l_{adv}$, with regularization coefficients $\alpha_1 = 10^{-2}$ and $\alpha_2 = 10^{-4}$. $l_E$ is the mean square error between the generator output image set L and the corresponding sharp image set S:

$$l_E = \sum_{i=1}^{3} \frac{1}{N_i} \left\| L_i - S_i \right\|_2^2$$

where $L_i$ and $S_i$ denote the generator output image and the sharp image at the i-th scale, $N_i$ denotes the number of pixels over all channels of the i-th scale image, i = 1, 2, 3. The multi-scale images are obtained by downsampling the image three times; the first-level scale is the original-size image, and from the second level on, each level's image is half the width and half the height of the previous level's image.
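The three-level pyramid described here can be built with a short helper; bilinear interpolation is an assumption, as the text only specifies halving the width and height at each level:

```python
import torch.nn.functional as F

# Three-level pyramid for the multi-scale losses: level 1 is the original
# image; each further level halves both width and height.
def build_pyramid(img, levels=3):
    pyramid = [img]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape[-2:]
        pyramid.append(F.interpolate(pyramid[-1], size=(h // 2, w // 2),
                                     mode='bilinear', align_corners=False))
    return pyramid  # [full, half, quarter]
```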
$l_{grad}$ is the gradient error between the gradient images of $L_i$ and $S_i$, i.e.:

$$l_{grad} = \sum_{i=1}^{3} \frac{1}{N_i} \left( \left\| L_i(d_x) - S_i(d_x) \right\|_2^2 + \left\| L_i(d_y) - S_i(d_y) \right\|_2^2 \right)$$

where $L_i(d_x)$ and $L_i(d_y)$ denote the horizontal and vertical gradients of $L_i$, and $S_i(d_x)$ and $S_i(d_y)$ denote the horizontal and vertical gradients of $S_i$. $l_{adv}$ is the discrimination error over the generator output image set L and the corresponding sharp image set S:

$$l_{adv} = \mathbb{E}_{s \sim p(S)}\left[ \log D(s) \right] + \mathbb{E}_{b \sim p(B)}\left[ \log\left( 1 - D(G(b)) \right) \right]$$
where s ~ p(S) indicates that the sharp image s is drawn from the sharp image set S, and p(S) denotes the probability distribution of S; b ~ p(B) indicates that the blurred image b is drawn from the blurred image set B, and p(B) denotes the probability distribution of B;
D(s) denotes the discriminator's probability for the input image s, G(b) denotes the result image generated by the generator from the input image b, and E[·] denotes the expectation of the bracketed quantity.
S6. Feed the generated images together with the sharp images into the discriminator, iteratively update the weight parameters of every layer by gradient descent, and keep optimizing $l_{adv}$ until the discriminator can no longer tell whether an input image is a generated image or a sharp image, i.e. the change in the difference between the obtained probability value and 0.5 is less than the set threshold 0.05; discriminator training is then finished.
S7. Train the generator with the loss function $l_{db} = l_E + \alpha_1 l_{grad} + \alpha_2 l_{adv}$: feed the blurred images into the generator, obtain the generated images by forward propagation, compare the generated images with the sharp images, iteratively update the weight parameters of every layer by gradient descent, and keep optimizing $l_{db}$ until the change of the total training-set loss $l_{db}$ in the generator training phase is less than the set threshold 0.005; generator training is then finished.
S8. Repeat steps S3 to S7 of the training process until the change of the total training-set loss $l_{db}$ in the generator training phase is less than 0.005, i.e. the discriminator cannot decide whether an input image is a sharp image or a generated image; the generator and discriminator models are then deemed to have converged, and a blurred image fed into the generator yields the estimated deblurred image.

Claims (6)

(Translated from Chinese)

1. A blind image motion deblurring method based on a recurrent multi-scale generative adversarial network, characterized by the following specific steps:

Step (1). Construct the discriminator D: the discriminator D consists of nine convolutional layers, one fully connected layer and one Sigmoid activation layer, and takes a 256 × 256 color image as input. Each convolutional layer uses LeakyReLU as the activation function: the first layer has 32 convolution kernels, each of size 5 × 5, with stride 2 and zero-padding width 2; the second layer has 64 kernels of size 5 × 5, stride 1, zero-padding 2; the third layer has 64 kernels of size 5 × 5, stride 2, zero-padding 2; the fourth layer has 128 kernels of size 5 × 5, stride 1, zero-padding 2; the fifth layer has 128 kernels of size 5 × 5, stride 4, zero-padding 2; the sixth layer has 256 kernels of size 5 × 5, stride 1, zero-padding 2; the seventh layer has 256 kernels of size 5 × 5, stride 4, zero-padding 2; the eighth layer has 512 kernels of size 5 × 5, stride 1, zero-padding 2; the ninth layer has 512 kernels of size 4 × 4, stride 4, zero-padding 0. The convolution output of the last layer passes through a fully connected layer with 512 input channels and 1 output channel, producing a single scalar, which is activated by the Sigmoid function to output the decision probability.

Step (2). Construct the generator G: the generator G comprises three cascaded scale sub-networks; each sub-network contains 1 input module, 2 encoding modules, 1 cascaded convolutional long short-term memory (ConvLSTM) module, 2 decoding modules and 1 output module. Each module contains residual modules; a residual module consists of a convolutional layer cascaded with a convolution kernel, where the convolutional layer uses the rectified linear unit (ReLU) as its activation function; the output of the cascaded convolution kernel is added to the residual module's input to give the residual module's output. The input module comprises one independent convolutional layer and three identical residual modules; the convolution kernels of the independent convolutional layer and of the residual modules' convolutional layers number 32, with size 5 × 5, stride 1 and zero-padding width 2; the independent convolutional layer uses ReLU as its activation function. The first encoding module comprises one independent convolutional layer and three identical residual modules, with 64 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer uses ReLU as its activation function. The second encoding module comprises one independent convolutional layer and three identical residual modules, with 128 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer uses ReLU as its activation function. The cell-state output of the ConvLSTM module serves as the input of the decoding modules, and the hidden-state output of the ConvLSTM module is connected to the hidden-state input of the ConvLSTM module in the next scale's sub-network; at the last scale, the hidden-state output of the ConvLSTM module is not connected to any other module. The first decoding module comprises three identical residual modules and one independent convolutional layer, with 128 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses ReLU as its activation function. The second decoding module comprises three identical residual modules and one independent convolutional layer, with 64 convolution kernels of size 5 × 5, stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses ReLU as its activation function. The output module comprises three identical residual modules and one independent convolutional layer, with 32 convolution kernels of size 5 × 5, stride 1 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses ReLU as its activation function. The third-level scale outputs the generator image L3 of size 64 × 64; L3 is upsampled to a 128 × 128 image that serves as input to the second-level scale, which outputs the 128 × 128 generator image L2; L2 is upsampled to a 256 × 256 image that serves as input to the first-level scale, which outputs the 256 × 256 generator image L1, i.e. the deblurred result image.

Step (3). Randomly draw m blurred images and the corresponding sharp images from the training data set T and randomly crop them into 256 × 256 square regions, forming a blurred image set B for training and the corresponding sharp image set S; B and S each contain m images, every image being a 256 × 256 3-channel color image. Feed the blurred image set B into the generator to obtain the generator output image set L, which contains m color images of size 256 × 256.

Step (4). Feed the generator output image set L and the corresponding sharp image set S in turn to the discriminator, which outputs two groups of confidence results, each containing m probability values, to decide whether each input image is a sharp image or a generated image: if the probability value is greater than 0.5, it is judged a sharp image; if the probability value is less than or equal to 0.5, it is judged a generated image.

Step (5). Construct the loss function for training the generator: $l_{db} = l_E + \alpha_1 l_{grad} + \alpha_2 l_{adv}$, where $\alpha_1$ and $\alpha_2$ are regularization coefficients greater than 0 and $l_E$ is the mean square error between the generator output image set L and the corresponding sharp image set S, i.e.:

$$l_E = \sum_{i=1}^{3} \frac{1}{N_i} \left\| L_i - S_i \right\|_2^2$$

$L_i$ and $S_i$ denote the generator output image and the sharp image at the i-th scale, $N_i$ denotes the number of pixels over all channels of the i-th scale image, i = 1, 2, 3; the multi-scale images are obtained by downsampling the image 3 times. $l_{grad}$ is the gradient error between the gradient images of $L_i$ and $S_i$, i.e.:

$$l_{grad} = \sum_{i=1}^{3} \frac{1}{N_i} \left( \left\| L_i(d_x) - S_i(d_x) \right\|_2^2 + \left\| L_i(d_y) - S_i(d_y) \right\|_2^2 \right)$$

$L_i(d_x)$ and $L_i(d_y)$ denote the horizontal and vertical gradients of $L_i$; $S_i(d_x)$ and $S_i(d_y)$ denote the horizontal and vertical gradients of $S_i$. $l_{adv}$ is the discrimination error over the generator output image set L and the corresponding sharp image set S, i.e.:

$$l_{adv} = \mathbb{E}_{s \sim p(S)}\left[ \log D(s) \right] + \mathbb{E}_{b \sim p(B)}\left[ \log\left( 1 - D(G(b)) \right) \right]$$

s ~ p(S) indicates that the sharp image s is drawn from the sharp image set S, and p(S) denotes the probability distribution of S; b ~ p(B) indicates that the blurred image b is drawn from the blurred image set B, and p(B) denotes the probability distribution of B; D(s) denotes the discriminator's probability for the input image s, G(b) denotes the result image generated by the generator from the input image b, and E[·] denotes the expectation of the bracketed quantity.

Step (6). Feed the generated images together with the sharp images into the discriminator, iteratively update the weight parameters of every layer by gradient descent, and keep optimizing $l_{adv}$ until the discriminator can no longer tell whether an input image is a generated image or a sharp image, i.e. the change in the difference between the obtained probability value and 0.5 is less than thr; discriminator training is then finished.

Step (7). Train the generator with the loss function $l_{db} = l_E + \alpha_1 l_{grad} + \alpha_2 l_{adv}$: feed the blurred images into the generator, obtain the generated images by forward propagation, compare the generated images with the sharp images, iteratively update the weight parameters of every layer by gradient descent, and keep optimizing $l_{db}$ until the change of the total training-set loss $l_{db}$ in the generator training phase is less than a threshold Th; generator training is then finished.

Step (8). Repeat steps (3) to (7) of the training process until the change of the total training-set loss $l_{db}$ in the generator training phase is less than the threshold Th, i.e. the discriminator cannot decide whether an input image is a sharp image or a generated image; the generator and discriminator models are then deemed to have converged, and a blurred image fed into the generator yields the estimated deblurred image.

2. The blind image motion deblurring method based on a recurrent multi-scale generative adversarial network according to claim 1, characterized in that: in the three cascaded scale sub-networks of step (2), the corresponding structures, channel numbers and convolution-kernel sizes of the three sub-networks are all identical, and the weights are shared across the three RGB channels of the color image.

3. The blind image motion deblurring method based on a recurrent multi-scale generative adversarial network according to claim 1, characterized in that: in steps (3) and (4), m ≥ 16.

4. The blind image motion deblurring method based on a recurrent multi-scale generative adversarial network according to claim 1, characterized in that: in step (5), the multi-scale images are obtained by downsampling the image multiple times; the first-level scale is the original-size image, and from the second level on, each level's image is half the width and half the height of the previous level's image.

5. The blind image motion deblurring method based on a recurrent multi-scale generative adversarial network according to claim 1, characterized in that: in step (6), 0.01 ≤ thr ≤ 0.08.

6. The blind image motion deblurring method based on a recurrent multi-scale generative adversarial network according to claim 1, characterized in that: in steps (7) and (8), 0.001 ≤ Th ≤ 0.01.
CN201910515590.4A | Filed 2019-06-14 | A Blind Image Deblurring Method Based on Recurrent Multiscale Generative Adversarial Networks | Expired - Fee Related | CN110378844B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910515590.4A | 2019-06-14 | 2019-06-14 | CN110378844B (en): A Blind Image Deblurring Method Based on Recurrent Multiscale Generative Adversarial Networks

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910515590.4A | 2019-06-14 | 2019-06-14 | CN110378844B (en): A Blind Image Deblurring Method Based on Recurrent Multiscale Generative Adversarial Networks

Publications (2)

Publication Number | Publication Date
CN110378844A (en) | 2019-10-25
CN110378844B (en) | 2021-04-09

Family

ID=68250306

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910515590.4A (Expired - Fee Related, CN110378844B (en)) | A Blind Image Deblurring Method Based on Recurrent Multiscale Generative Adversarial Networks | 2019-06-14 | 2019-06-14

Country Status (1)

Country | Link
CN | CN110378844B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111340716B (en)* | 2019-11-20 | 2022-12-27 | 电子科技大学成都学院 | Image deblurring method for improving double-discrimination countermeasure network model
CN110910442B (en)* | 2019-11-29 | 2023-09-29 | 华南理工大学 | High-speed moving object machine vision size detection method based on kernel-free image restoration
CN111028177B (en)* | 2019-12-12 | 2023-07-21 | 武汉大学 | An edge-based deep learning image de-blurring method
CN111199522B (en)* | 2019-12-24 | 2024-02-09 | 芽米科技(广州)有限公司 | Single-image blind removal motion blurring method for generating countermeasure network based on multi-scale residual error
CN111223062B (en)* | 2020-01-08 | 2023-04-07 | 西安电子科技大学 | Image deblurring method based on generation countermeasure network
CN111292262B (en)* | 2020-01-19 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Image processing method, device, electronic equipment and storage medium
CN111489304B (en)* | 2020-03-27 | 2022-04-26 | 天津大学 | An Image Deblurring Method Based on Attention Mechanism
CN111583143A (en)* | 2020-04-30 | 2020-08-25 | 广州大学 | A deblurring method for complex images
CN111681188B (en)* | 2020-06-15 | 2022-06-17 | 青海民族大学 | Image Deblurring Method Based on Combining Image Pixel Prior and Image Gradient Prior
CN111986275B (en)* | 2020-07-31 | 2024-06-11 | 广州嘉尔日用制品有限公司 | Inverse halftoning method for multi-mode halftone image
CN112541864A (en)* | 2020-09-25 | 2021-03-23 | 中国石油大学(华东) | Image restoration method based on multi-scale generation type confrontation network model
CN112241939B (en)* | 2020-10-15 | 2023-05-30 | 天津大学 | Multi-scale and non-local-based light rain removal method
CN112329932B (en)* | 2020-10-30 | 2024-07-23 | 深圳市优必选科技股份有限公司 | Training method and device for generating countermeasure network and terminal equipment
CN112435187A (en)* | 2020-11-23 | 2021-03-02 | 浙江工业大学 | Single-image blind motion blur removing method for generating countermeasure network based on aggregation residual
CN112258425A (en)* | 2020-11-24 | 2021-01-22 | 中电万维信息技术有限责任公司 | Two-dimensional code image sharpening and deblurring processing method
CN112508817B (en)* | 2020-12-16 | 2024-05-14 | 西北工业大学 | Image motion blind deblurring method based on cyclic generation countermeasure network
CN112686119B (en)* | 2020-12-25 | 2022-12-09 | 陕西师范大学 | License plate motion blurred image processing method based on self-attention generation countermeasure network
CN112614072B (en)* | 2020-12-29 | 2022-05-17 | 北京航空航天大学合肥创新研究院 | Image restoration method and device, image restoration equipment and storage medium
CN112634163B (en)* | 2020-12-29 | 2024-10-15 | 南京大学 | Method for removing image motion blur based on improved cyclic generation countermeasure network
CN113012074B (en)* | 2021-04-21 | 2023-03-24 | 山东新一代信息产业技术研究院有限公司 | Intelligent image processing method suitable for low-illumination environment
CN113129237B (en)* | 2021-04-26 | 2022-10-28 | 广西师范大学 | Depth image deblurring method based on multi-scale fusion coding network
CN117078520A (en)* | 2022-05-09 | 2023-11-17 | 广州导远电子科技有限公司 | Training method and device for motion debounce model

Patent Citations (2)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN108320274A (en)* | 2018-01-26 | 2018-07-24 | 东华大学 | Infrared video colorization method based on a dual-channel recurrent generative adversarial network
CN109035149A (en)* | 2018-03-13 | 2018-12-18 | 杭州电子科技大学 | License plate image motion deblurring method based on deep learning

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party

Yi Chen et al.: "Motion Deblurring Via Using Generative Adversarial Networks For Space-Based Imaging", SERA 2018, 2018-06-15, pp. 37-41 *
Xin Tao et al.: "Scale-recurrent Network for Deep Image Deblurring", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 8174-8182 *
Mao Yong et al.: "License plate image motion deblurring based on deep learning" (基于深度学习的车牌图像去运动模糊技术), Journal of Hangzhou Dianzi University (Natural Science Edition), 2018-09, vol. 38, no. 5, I138-3182 *
Chen Fucheng et al.: "Blind image motion deblurring algorithm based on generative adversarial networks" (基于生成对抗网络的图像盲去运动模糊算法), Software Guide (软件导刊), 2019-03-26, pp. 26-29 *
Mao Yong: "Research on blind deblurring algorithms for blurred license plate images" (模糊车牌图像的盲去模糊算法研究), China Master's Theses Full-text Database, Information Science and Technology, 2019-01, no. 1, pp. 29-33 *
Bao Zongpao: "Research on blind deblurring of motion-blurred images" (运动图像盲去模糊技术研究), China Master's Theses Full-text Database, Information Science and Technology, 2018-02, no. 2, I138-2500 *

Also Published As

Publication number | Publication date
CN110378844A (en) | 2019-10-25

Similar Documents

Publication | Title
CN110378844B (en) | A Blind Image Deblurring Method Based on Recurrent Multiscale Generative Adversarial Networks
CN111583109B (en) | Image super-resolution method based on generation of countermeasure network
CN112016507B (en) | Super-resolution-based vehicle detection method, device, equipment and storage medium
CN113688723B (en) | A pedestrian target detection method in infrared images based on improved YOLOv5
CN109035149B (en) | A deep learning-based motion blurring method for license plate images
CN110232394B (en) | Multi-scale image semantic segmentation method
CN112884671B (en) | Fuzzy image restoration method based on unsupervised generation countermeasure network
CN109685716B (en) | An Image Super-Resolution Reconstruction Method Based on Gaussian Coding Feedback Generative Adversarial Networks
CN111062872A (en) | A method and system for image super-resolution reconstruction based on edge detection
CN112634163A (en) | Method for removing image motion blur based on improved cycle generation countermeasure network
CN105981050A (en) | Method and system for exacting face features from data of face images
CN110223234A (en) | Depth residual error network image super resolution ratio reconstruction method based on cascade shrinkage expansion
CN112288632A (en) | Single-image super-resolution method and system based on simplified ESRGAN
CN112862689A (en) | Image super-resolution reconstruction method and system
CN115880158B (en) | A blind image super-resolution reconstruction method and system based on variational autoencoding
CN117576402B (en) | Deep learning-based multi-scale aggregation transducer remote sensing image semantic segmentation method
CN111833277A (en) | A sea image dehazing method with unpaired multi-scale hybrid encoder-decoder structure
CN114820389B (en) | Face image deblurring method based on unsupervised decoupling representation
CN116977188A (en) | Infrared image enhancement method based on depth full convolution neural network
CN114862699B (en) | Face repairing method, device and storage medium based on generation countermeasure network
CN116309178A (en) | A Visible Light Image Denoising Method Based on Adaptive Attention Mechanism Network
CN111738919A (en) | A realistic illusion method for low-definition small faces based on linear multi-step residual dense network
CN112734678B (en) | Image motion blur removing method based on depth residual shrinkage network and generation countermeasure network
CN114140323A (en) | Image super-resolution method for generating countermeasure network based on progressive residual errors
CN117788293B (en) | Feature aggregation image super-resolution reconstruction method and system

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2021-04-09)
