Image blind motion deblurring method based on a recurrent multi-scale generative adversarial network
Technical Field
The invention belongs to the technical field of image processing, and relates to an image blind motion deblurring method based on a recurrent multi-scale generative adversarial network.
Background
Because it is difficult to keep the photographing device and the imaged object relatively stationary, images often suffer from motion blur. Yet in daily life, traffic safety, medicine, military reconnaissance, and other fields, obtaining sharp images is very important.
Motion blur in an image can be modeled as the convolution of a sharp image with a two-dimensional linear function, followed by corruption with additive noise. This linear function, called the point spread function or blur kernel, encodes the blur information of the image. Blind deblurring refers to restoring the original sharp image by relying only on the information in the blurred image, when the blur process (i.e., the blur kernel) is unknown. In blind deblurring of a single motion-blurred image, both the blur kernel and its size are unknown, which limits the accuracy of blur-kernel estimation and in turn the final restoration quality.
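For reference, this degradation model is commonly written as follows (a standard formulation with assumed notation, not quoted from the text above):

```latex
% Standard motion-blur degradation model (assumed notation):
%   B : observed blurred image,  S : latent sharp image,
%   k : blur kernel (point spread function),  n : additive noise,
%   * : two-dimensional convolution.
B = S * k + n
```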
Disclosure of Invention
The invention aims to provide an image blind motion deblurring method based on a recurrent multi-scale generative adversarial network, tailored to the characteristics of image motion blur; the method can estimate a sharp image without estimating a blur kernel.
The invention specifically comprises the following steps:
Step (1), constructing a discriminator D;
The discriminator D consists of nine convolutional layers, a fully connected layer, and a Sigmoid activation layer, and takes as input a color image of size 256 × 256.
Each convolutional layer uses LeakyReLU as its activation function. The first layer has 32 convolution kernels, each of size 5 × 5, with stride 2 and zero-padding width 2; the second layer has 64 kernels of size 5 × 5, stride 1, zero-padding width 2; the third layer has 64 kernels of size 5 × 5, stride 2, zero-padding width 2; the fourth layer has 128 kernels of size 5 × 5, stride 1, zero-padding width 2; the fifth layer has 128 kernels of size 5 × 5, stride 4, zero-padding width 2; the sixth layer has 256 kernels of size 5 × 5, stride 1, zero-padding width 2; the seventh layer has 256 kernels of size 5 × 5, stride 4, zero-padding width 2; the eighth layer has 512 kernels of size 5 × 5, stride 1, zero-padding width 2; the ninth layer has 512 kernels of size 4 × 4, stride 4, zero-padding width 0.
The convolutional output of the last layer is passed through a fully connected layer with 512 input channels and 1 output channel to obtain a single scalar, which is activated by a Sigmoid function to output the discrimination probability.
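For illustration, a minimal PyTorch sketch of a discriminator with the layer sizes listed above is given below; the LeakyReLU slope, the module name, and training details are assumptions, not part of the invention's specification.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # (out_channels, stride) for the first eight 5x5 convolutions.
        cfg = [(32, 2), (64, 1), (64, 2), (128, 1),
               (128, 4), (256, 1), (256, 4), (512, 1)]
        layers, in_ch = [], 3
        for out_ch, stride in cfg:
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=5,
                                 stride=stride, padding=2),
                       nn.LeakyReLU(0.2)]  # slope 0.2 is an assumption
            in_ch = out_ch
        # Ninth layer: 512 kernels of size 4x4, stride 4, zero-padding 0.
        layers += [nn.Conv2d(in_ch, 512, kernel_size=4, stride=4, padding=0),
                   nn.LeakyReLU(0.2)]
        self.features = nn.Sequential(*layers)
        self.fc = nn.Linear(512, 1)  # 512 input channels -> 1 output

    def forward(self, x):                 # x: (N, 3, 256, 256)
        f = self.features(x)              # -> (N, 512, 1, 1)
        return torch.sigmoid(self.fc(f.flatten(1)))  # discrimination probability
```

With a 256 × 256 input, the strides (2, 1, 2, 1, 4, 1, 4, 1, 4) reduce the feature map to 1 × 1 × 512 before the fully connected layer.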
Step (2), constructing a generator G;
The generator G comprises cascaded sub-networks at three scales. Each sub-network comprises, in cascade, 1 input module, 2 encoding modules, 1 convolutional long short-term memory (ConvLSTM) module, 2 decoding modules, and 1 output module. Each module contains residual modules; a residual module is formed by cascaded convolutional layers, each taking the Rectified Linear Unit (ReLU) as its activation function, and the output of the cascaded convolutional layers is added to the residual module's input to form the residual module's output.
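A minimal sketch of such a residual module might look as follows (PyTorch); the use of exactly two convolutional layers per module is an assumption consistent with common residual designs.

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, channels, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size, padding=pad),
            nn.ReLU(inplace=True),        # ReLU activation, as in the text
            nn.Conv2d(channels, channels, kernel_size, padding=pad),
        )

    def forward(self, x):
        # Output of the cascaded convolutions is added to the module input.
        return x + self.body(x)
```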
The input module comprises an independent convolutional layer and three identically structured residual modules. The convolutional layers of the independent layer and of the residual modules each have 32 convolution kernels of size 5 × 5, with stride 1 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The first encoding module comprises an independent convolutional layer and three identically structured residual modules. The convolutional layers have 64 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The second encoding module comprises an independent convolutional layer and three identically structured residual modules. The convolutional layers have 128 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The cell-state output of the ConvLSTM module serves as the input to the decoding module, and the hidden-state output of the ConvLSTM module is connected to the hidden-state input of the ConvLSTM module in the next-scale sub-network; for the last scale, the hidden-state output of the ConvLSTM module is not connected to any other module.
The structure of the convolutional long short-term memory (ConvLSTM) module follows Shi X, Chen Z, Wang H, et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting [C]// Advances in Neural Information Processing Systems, 2015: 802-810.
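For reference, a compact ConvLSTM cell in the spirit of Shi et al. can be sketched as follows (PyTorch); fusing the four gates into a single convolution and omitting peephole connections are simplifying assumptions.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hidden_ch, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2
        # One convolution computes all four gates from [input, hidden].
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                               kernel_size, padding=pad)
        self.hidden_ch = hidden_ch

    def forward(self, x, state=None):
        if state is None:
            zeros = x.new_zeros(x.size(0), self.hidden_ch, x.size(2), x.size(3))
            state = (zeros, zeros)                 # (hidden h, cell c)
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g                          # new cell state
        h = o * c.tanh()                           # new hidden state
        return h, (h, c)
```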
The first decoding module comprises three identically structured residual modules followed by an independent convolutional layer. The convolutional layers have 128 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The second decoding module comprises three identically structured residual modules followed by an independent convolutional layer. The convolutional layers have 64 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The output module comprises three identically structured residual modules followed by an independent convolutional layer. The convolutional layers have 32 convolution kernels of size 5 × 5, with stride 1 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The third-level scale outputs the generator image L3 of size 64 × 64; L3 is upsampled to a 128 × 128 image that serves as input to the second-level scale, which outputs the 128 × 128 generator image L2. L2 is upsampled to a 256 × 256 image that serves as input to the first-level scale, which outputs the 256 × 256 generator image L1, i.e., the final deblurred image. Across the three cascaded sub-networks, the corresponding structures, channel numbers, and convolution kernel sizes are identical, and weights are shared across the three RGB channels of the color image.
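The coarse-to-fine recurrence can be summarized roughly as follows; here `net` stands for the shared-weight sub-network described above, and concatenating the upsampled previous output with the blurred input at each scale is an assumption in the spirit of scale-recurrent designs (the text specifies only that the upsampled output serves as the next scale's input).

```python
import torch
import torch.nn.functional as F

def generator_forward(net, blurred_pyramid):
    """blurred_pyramid: blurred inputs, coarsest first, e.g. 64, 128, 256 px."""
    outputs, state, prev = [], None, None
    for blurred in blurred_pyramid:
        if prev is None:
            up = blurred                  # coarsest scale: no previous estimate
        else:
            up = F.interpolate(prev, scale_factor=2, mode='bilinear',
                               align_corners=False)
        inp = torch.cat([blurred, up], dim=1)   # 6-channel input (assumption)
        # One pass through the shared-weight sub-network; the ConvLSTM hidden
        # state is carried to the next scale (state resizing omitted here).
        prev, state = net(inp, state)
        outputs.append(prev)
    return outputs                        # [L3, L2, L1], finest last
```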
Step (3), randomly extract m (m ≥ 16) blurred images and their corresponding sharp images from the training data set T, and randomly crop them into 256 × 256 square regions to form a training blurred image set B and a corresponding sharp image set S; B and S each contain m images, every image being a 3-channel 256 × 256 color image. Input the blurred image set B into the generator to obtain the generator output image set L, which contains m color images of size 256 × 256.
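Step (3) might be prepared along the following lines (PyTorch); the tensor shapes, the cropping helper, and the construction of the 3-level pyramid are assumptions for illustration.

```python
import random
import torch.nn.functional as F

def random_crop_pair(blurred, sharp, size=256):
    # blurred, sharp: (C, H, W) tensors of the same spatial size.
    _, H, W = blurred.shape
    top = random.randint(0, H - size)
    left = random.randint(0, W - size)
    return (blurred[:, top:top + size, left:left + size],
            sharp[:, top:top + size, left:left + size])

def pyramid(img, levels=3):
    # Returns images coarsest first: 64, 128, 256 px for a 256x256 crop.
    imgs = [img]
    for _ in range(levels - 1):
        imgs.append(F.interpolate(imgs[-1].unsqueeze(0), scale_factor=0.5,
                                  mode='bilinear',
                                  align_corners=False).squeeze(0))
    return imgs[::-1]
```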
Step (4), feed the generator output image set L and the corresponding sharp image set S in turn to the discriminator, which outputs two groups of confidence results, each containing m probability values, so that each input image is judged to be a sharp image or a generated image: if the probability value is greater than 0.5, the image is judged to be sharp; if the probability value is less than or equal to 0.5, it is judged to be a generated image.
Step (5), constructing the loss function for training the generator:

l_db = l_E + α1 · l_grad + α2 · l_adv

where α1 and α2 are regularization coefficients greater than 0, and l_E is the mean squared error between the generator output image set L and the corresponding sharp image set S, i.e.:

l_E = Σ_{i=1}^{3} (1/N_i) · ||L_i − S_i||²

where L_i and S_i denote the generator output image and the sharp image at the i-th scale, respectively, and N_i denotes the number of pixels over all channels of the i-th scale image, i = 1, 2, 3. The multi-scale images are obtained by downsampling the image to 3 scales: the first-level scale is the original-size image, and from the second level on, each level is half the width and half the height of the previous level.

l_grad is the gradient error between the gradient images of L_i and S_i, i.e.:

l_grad = Σ_{i=1}^{3} (1/N_i) · ( ||L_i(dx) − S_i(dx)||² + ||L_i(dy) − S_i(dy)||² )

where L_i(dx) and L_i(dy) denote the horizontal and vertical gradients of L_i, and S_i(dx) and S_i(dy) denote the horizontal and vertical gradients of S_i.

l_adv is the discrimination error over the generator output image set L and the corresponding sharp image set S:

l_adv = E_{s∼p(S)}[log D(s)] + E_{b∼p(B)}[log(1 − D(G(b)))]

where s∼p(S) indicates that the sharp image s is drawn from the sharp image set S, with p(S) the probability distribution of S; b∼p(B) indicates that the blurred image b is drawn from the blurred image set B, with p(B) the probability distribution of B; D(s) denotes the discriminator's output probability for input image s; G(b) denotes the result image generated by the generator from input image b; and E[·] denotes the expectation of the bracketed quantity.
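A sketch of the generator loss l_db in PyTorch follows; the forward-difference gradient operator, the mean-squared reductions (which absorb the 1/N_i factors), and the default coefficient values (taken from the embodiment described later) are assumptions.

```python
import torch
import torch.nn.functional as F

def image_gradients(img):
    # Horizontal (dx) and vertical (dy) forward differences.
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return dx, dy

def generator_loss(outputs, sharps, d_fake, alpha1=1e-2, alpha2=1e-4):
    """outputs, sharps: per-scale image lists (same order); d_fake: D(G(b))."""
    l_E = sum(F.mse_loss(L, S) for L, S in zip(outputs, sharps))
    l_grad = 0.0
    for L, S in zip(outputs, sharps):
        Ldx, Ldy = image_gradients(L)
        Sdx, Sdy = image_gradients(S)
        l_grad = l_grad + F.mse_loss(Ldx, Sdx) + F.mse_loss(Ldy, Sdy)
    # Generator term of the adversarial loss: push D(G(b)) toward 1
    # by minimizing log(1 - D(G(b))).
    l_adv = torch.log(1.0 - d_fake + 1e-8).mean()
    return l_E + alpha1 * l_grad + alpha2 * l_adv
```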
Step (6), input the generated images and the sharp images into the discriminator, update the weight parameters of each layer by gradient-descent iteration, and continually optimize l_adv until the discriminator can no longer tell whether an input image is a generated image or a sharp image, i.e., until the difference between the output probability and 0.5 changes by less than thr, with 0.01 ≤ thr ≤ 0.08; discriminator training is then complete.
Step (7), train the generator according to the loss function l_db = l_E + α1 · l_grad + α2 · l_adv: input the blurred images into the generator, obtain the generated images by forward propagation, compare the generated images with the sharp images, and update the weight parameters of each layer by gradient-descent iteration, continually minimizing l_db until the change in the training-set loss value l_db is less than a threshold Th, with 0.001 ≤ Th ≤ 0.01; generator training is then complete.
Step (8), repeat training steps (3) to (7) until the change in the generator's training-set loss value l_db is less than the threshold Th and the discriminator cannot judge whether an input image is a sharp image or a generated image; the generator and discriminator models are then considered converged, and inputting a blurred image into the generator yields the estimated deblurred image.
The method of the invention uses deep learning to learn the relationship between motion-blurred images and their corresponding sharp images, omitting the complex blur-kernel estimation process. Trained by comparing a large number of blurred images against sharp images, the resulting model can extract image edge features; it has a simpler network structure and fewer parameters, is easier to train, and achieves a better restoration effect.
Detailed Description
The following further illustrates the practice of the present invention.
The blurred image set B is input to the generator G to obtain the generator output image set L, which serves as input to the discriminator D to obtain a discrimination result. Similarly, the sharp image set S is also fed to the discriminator to obtain its discrimination result. The discrimination result indicates whether the input comes from the sharp image set or the generated image set: if the result is greater than 0.5, the input is judged to come from the sharp image set S; otherwise, from the generator output image set L. The error between the discrimination results and the true labels is computed and the discriminator is optimized by gradient descent; then the mean error between the generated images and the sharp images is computed and the generator is optimized by gradient descent. The discriminator and generator are optimized alternately until the model converges. In the experiments of the invention, the model converged after a total of 400,000 training iterations.
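This alternating optimization can be sketched as follows, reusing the `generator_forward` and `generator_loss` sketches above; the binary cross-entropy formulation for the discriminator step and the optimizer handling are assumptions.

```python
import torch
import torch.nn.functional as F

def train_step(G, D, opt_G, opt_D, blurred_pyr, sharp_pyr):
    # --- Discriminator step: sharp images labeled 1, generated images 0,
    #     matching the 0.5 decision rule described above.
    with torch.no_grad():
        fakes = generator_forward(G, blurred_pyr)       # [L3, L2, L1]
    d_real = D(sharp_pyr[-1])                           # full-resolution S1
    d_fake = D(fakes[-1])                               # full-resolution L1
    loss_D = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # --- Generator step: minimize l_db = l_E + a1*l_grad + a2*l_adv.
    outputs = generator_forward(G, blurred_pyr)
    loss_G = generator_loss(outputs, sharp_pyr, D(outputs[-1]))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```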
The specific steps of the image blind motion deblurring method based on the recurrent multi-scale generative adversarial network are as follows:
s1, constructing a discriminator D: the discriminator D is composed of nine convolutional layers, one full link layer, and one Sigmoid active layer, and inputs a color image having a size of 256 × 256.
Each convolutional layer used a LeakyReLU as the activation function: the first layer has 32 convolution kernels, each convolution kernel has a size of 5 × 5, a step size of 2, and a zero-filling width of 2; the second layer has 64 convolution kernels, each convolution kernel is 5 × 5 in size, 1 in step size and 2 in zero-filling width; the third layer has 64 convolution kernels, each convolution kernel is 5 × 5 in size, 2 in step size and 2 in zero-filling width; the fourth layer has 128 convolution kernels, each convolution kernel is 5 × 5 in size, 1 in step size and 2 in zero-filling width; the fifth layer has 128 convolution kernels, each convolution kernel size is 5 × 5, step size is 4, and zero filling width is 2; the sixth layer has 256 convolution kernels, each convolution kernel has a size of 5 × 5, a step size of 1, and a zero filling width of 2; the seventh layer has 256 convolution kernels, each convolution kernel has a size of 5 × 5, a step size of 4, and a zero filling width of 2; the eighth layer has 512 convolution kernels, each convolution kernel is 5 × 5 in size, the step size is 1, and the zero filling width is 2; the ninth layer has 512 convolution kernels, each with a size of 4 x 4, step size of 4, and zero-fill width of 0.
And the convolution output of the last layer is subjected to full-connection layers with the number of input channels being 512 and the number of output channels being 1 to obtain 1 constant, and the probability of judgment is output after being activated by a Sigmoid function.
S2, constructing a generator G: the generator G comprises cascaded sub-networks at three scales. Each sub-network comprises, in cascade, 1 input module, 2 encoding modules, 1 convolutional long short-term memory (ConvLSTM) module, 2 decoding modules, and 1 output module. Each module contains residual modules; a residual module is formed by cascaded convolutional layers, each taking the Rectified Linear Unit (ReLU) as its activation function, and the output of the cascaded convolutional layers is added to the residual module's input to form the residual module's output.
The input module comprises an independent convolutional layer and three identically structured residual modules. The convolutional layers of the independent layer and of the residual modules each have 32 convolution kernels of size 5 × 5, with stride 1 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The first encoding module comprises an independent convolutional layer and three identically structured residual modules. The convolutional layers have 64 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The second encoding module comprises an independent convolutional layer and three identically structured residual modules. The convolutional layers have 128 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer uses the ReLU function as its activation.
The cell-state output of the ConvLSTM module serves as the input to the decoding module, and the hidden-state output of the ConvLSTM module is connected to the hidden-state input of the ConvLSTM module in the next-scale sub-network; for the last scale, the hidden-state output of the ConvLSTM module is not connected to any other module.
The first decoding module comprises three identically structured residual modules followed by an independent convolutional layer. The convolutional layers have 128 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The second decoding module comprises three identically structured residual modules followed by an independent convolutional layer. The convolutional layers have 64 convolution kernels of size 5 × 5, with stride 2 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The output module comprises three identically structured residual modules followed by an independent convolutional layer. The convolutional layers have 32 convolution kernels of size 5 × 5, with stride 1 and zero-padding width 2; the independent convolutional layer cascaded after the residual modules uses the ReLU function as its activation.
The third-level scale outputs the generator image L3 of size 64 × 64; L3 is upsampled to a 128 × 128 image that serves as input to the second-level scale, which outputs the 128 × 128 generator image L2. L2 is upsampled to a 256 × 256 image that serves as input to the first-level scale, which outputs the 256 × 256 generator image L1, i.e., the final deblurred image. Across the three cascaded sub-networks, the corresponding structures, channel numbers, and convolution kernel sizes are identical, and weights are shared across the three RGB channels of the color image.
S3, randomly extract m (m = 16) blurred images and their corresponding sharp images from the training data set T, and randomly crop them into 256 × 256 square regions to form a training blurred image set B and a corresponding sharp image set S; the images in B and S are all 3-channel 256 × 256 color images. Input the blurred image set B into the generator to obtain the generator output image set L.
S4, feed the generator output image set L and the corresponding sharp image set S in turn to the discriminator, which outputs two groups of confidence results, each containing 16 probability values, so that each input image is judged to be a sharp image or a generated image: if the probability value is greater than 0.5, the image is judged to be sharp; otherwise, it is judged to be a generated image.
S5, constructing the loss function for training the generator:

l_db = l_E + α1 · l_grad + α2 · l_adv

where α1 and α2 are regularization coefficients, α1 = 10⁻², α2 = 10⁻⁴. l_E is the mean squared error between the generator output image set L and the corresponding sharp image set S:

l_E = Σ_{i=1}^{3} (1/N_i) · ||L_i − S_i||²

where L_i and S_i denote the generator output image and the sharp image at the i-th scale, respectively, and N_i denotes the number of pixels over all channels of the i-th scale image, i = 1, 2, 3. In the multi-scale scheme, reduced-size images are obtained by downsampling the image to three scales: the first-level scale is the original-size image, and from the second level on, each level is half the width and half the height of the previous level.

l_grad is the gradient error between the gradient images of L_i and S_i, i.e.:

l_grad = Σ_{i=1}^{3} (1/N_i) · ( ||L_i(dx) − S_i(dx)||² + ||L_i(dy) − S_i(dy)||² )

where L_i(dx) and L_i(dy) denote the horizontal and vertical gradients of L_i, and S_i(dx) and S_i(dy) denote the horizontal and vertical gradients of S_i. l_adv is the discrimination error over the generator output image set L and the corresponding sharp image set S:

l_adv = E_{s∼p(S)}[log D(s)] + E_{b∼p(B)}[log(1 − D(G(b)))]

where s∼p(S) indicates that the sharp image s is drawn from the sharp image set S, with p(S) the probability distribution of S; b∼p(B) indicates that the blurred image b is drawn from the blurred image set B, with p(B) the probability distribution of B; D(s) denotes the discriminator's output probability for input image s; G(b) denotes the result image generated by the generator from input image b; and E[·] denotes the expectation of the bracketed quantity.
S6, input the generated images and the sharp images into the discriminator, update the weight parameters of each layer by gradient-descent iteration, and continually optimize l_adv until the discriminator can no longer tell whether an input image is a generated image or a sharp image, i.e., until the difference between the output probability and 0.5 changes by less than the set threshold 0.05; discriminator training is then complete.
S7, train the generator according to the loss function l_db = l_E + α1 · l_grad + α2 · l_adv: input the blurred images into the generator, obtain the generated images by forward propagation, compare the generated images with the sharp images, and update the weight parameters of each layer by gradient-descent iteration, continually minimizing l_db until the change in the training-set loss value l_db is less than the set threshold 0.005; generator training is then complete.
S8, repeat training steps S3 to S7 until the change in the generator's training-set loss value l_db is less than 0.005 and the discriminator cannot judge whether an input image is a sharp image or a generated image; the generator and discriminator models are then considered converged, and inputting a blurred image into the generator yields the estimated deblurred image.