




Technical Field
The invention relates to an image super-resolution reconstruction method, and in particular to an image super-resolution reconstruction method based on a multi-column convolutional neural network, belonging to the field of image processing and reconstruction technology.
Background Art
With the development of information technology, images, as a principal medium of information dissemination, have been widely used in a variety of scenarios. In many fields people have high requirements for image quality, so in the rapidly developing information age low-quality images can hardly meet the needs of specific scenarios. Image resolution is an important indicator of image quality: the higher the resolution, the more detail an image contains. Image super-resolution (SR) reconstruction is an image processing technique that reconstructs a high-resolution (HR) image from a low-resolution (LR) image. Image super-resolution reconstruction has a wide range of applications, such as face recognition, medical imaging and remote sensing.
At present, convolutional neural networks (CNNs) have made significant progress in computer vision tasks such as object detection, human behavior recognition and image segmentation. In particular, super-resolution methods based on convolutional neural networks achieve better reconstruction than traditional methods such as dictionary learning, local linear regression and random forests. In 2014, Dong et al. used a convolutional neural network to achieve image super-resolution reconstruction (Super-Resolution Convolutional Neural Network, SRCNN); see Dong, Chao, et al. "Image super-resolution using deep convolutional networks." IEEE Transactions on Pattern Analysis and Machine Intelligence 38.2 (2016): 295-307. A low-resolution image preprocessed by bicubic interpolation is fed into an end-to-end deep convolutional neural network, which gradually learns the mapping from low-resolution to high-resolution images. Thanks to end-to-end training in the deep learning framework, this method significantly improves image super-resolution reconstruction compared with traditional methods.
Although the image super-resolution algorithms based on convolutional neural networks proposed so far solve problems of traditional super-resolution reconstruction algorithms such as weak robustness and high computational complexity, the existing CNN-based methods must first enlarge the low-resolution image to the size of the target high-resolution image by bicubic interpolation before feature extraction, and then extract features from the interpolated image. The bicubic-interpolated image introduces a great deal of redundant information, which does not help feature extraction. As a result, existing methods still suffer from poor reconstruction ability and poor visual quality on images rich in detail.
Summary of the Invention
The purpose of the present invention is to reconstruct low-resolution images with higher quality. To this end, an image super-resolution method based on a multi-column convolutional neural network is proposed: by extracting multi-scale features from the low-resolution image, the reconstructed high-resolution image recovers more image detail and has sharper edges. The method of the present invention effectively improves the peak signal-to-noise ratio and structural similarity of the super-resolution reconstructed image, and also gives better results in subjective visual quality. In addition, the present invention provides an important reference for the application of convolutional neural networks to image super-resolution.
To achieve the above object, the concept of the present invention is as follows:
First, a multi-column convolutional neural network model is designed according to a deep learning algorithm, comprising a feature extraction part and an image reconstruction part. Then, the original images are cut into small patches, and these high-resolution patches are downsampled to obtain low-resolution patches; the resulting low-resolution/high-resolution patch pairs are used to build the training set. Finally, the model is trained with stochastic gradient descent to obtain a model that reconstructs high-resolution images from low-resolution images, i.e., the image super-resolution reconstruction model based on a multi-column convolutional neural network described in the present invention.
Based on the above concept, the present invention adopts the following technical solution:
An image super-resolution method based on a multi-column convolutional neural network comprises the following steps:
Step 1, multi-column convolutional neural network model construction: design a multi-column convolutional neural network model according to a deep learning algorithm, comprising a feature extraction part and an image reconstruction part;
Step 2, image augmentation: large-scale datasets are a prerequisite for successfully training deep networks; image augmentation applies a series of random changes to the training images to generate similar but different training samples, thereby enlarging the training dataset; enlarging the training set in this way reduces the model's dependence on particular attributes and improves its generalization ability; the augmentation methods used are rotation, scaling and mirroring;
Step 3, training set construction: on the enlarged training set obtained in step 2, cut the original images into small patches and downsample these high-resolution patches to obtain low-resolution patches; use these low-resolution/high-resolution patch pairs to build the training set;
Step 4, multi-column convolutional neural network model training: train the image super-resolution reconstruction model on the training set obtained in step 3, using stochastic gradient descent as the optimization algorithm; after training, a model that reconstructs high-resolution images from low-resolution images is obtained;
Step 5, image super-resolution reconstruction: the model trained in step 4 reconstructs the input low-resolution image into the corresponding high-resolution image.
The method of the present invention mainly takes into account the multi-scale nature of images: with a multi-column convolutional neural network model, multi-scale features can be effectively extracted from the image and then fused. Features are extracted directly from the low-resolution image, which reduces computation and speeds up reconstruction. To accelerate convergence of the super-resolution reconstruction model, the extracted multi-scale features are used to predict the residual between the high-resolution image and its bicubic-interpolated counterpart rather than to reconstruct the high-resolution image directly from the features, which reduces the difficulty of training the network and improves the quality of super-resolution reconstruction.
Compared with the prior art, the present invention has the following obvious substantial features and significant advantages:
1. The method of the present invention fully considers the multi-scale nature of images, i.e., that objects in an image appear at different scales, and proposes an image super-resolution reconstruction model based on a multi-column convolutional neural network.
2. The method of the present invention extracts features directly from the low-resolution image without preprocessing, which reduces computation and thus increases the reconstruction speed of the model.
3. The method of the present invention uses the extracted multi-scale features to predict the residual image between the high-resolution image and the bicubic-interpolated image, rather than reconstructing the high-resolution image directly from the features, which reduces the difficulty of training the model and improves the quality of super-resolution reconstruction.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of the network structure of the image super-resolution reconstruction method based on a multi-column convolutional neural network according to the present invention.
FIG. 2 compares super-resolution reconstruction results on "butterfly" from the Set5 test set at a magnification factor of 2.
FIG. 3 compares super-resolution reconstruction results on "21077" from the BSDS100 test set at a magnification factor of 3.
FIG. 4 compares super-resolution reconstruction results on "img023" from the Urban100 test set at a magnification factor of 4.
FIG. 5 compares super-resolution reconstruction results on "UltraEleven" from the Manga109 test set at a magnification factor of 4.
DETAILED DESCRIPTION
Preferred embodiments of the present invention are described in detail below in conjunction with the accompanying drawings:
The multi-column convolutional neural network structure of this embodiment is shown in FIG. 1. The method is implemented by programmed simulation in a PyTorch environment under Ubuntu 16.04. First, a multi-column convolutional neural network model is designed according to a deep learning algorithm, comprising a feature extraction part and an image reconstruction part. Then, the original images are cut into small patches, and these high-resolution patches are downsampled to obtain low-resolution patches; the resulting low-resolution/high-resolution patch pairs are used to build the training set. Finally, the model is trained with stochastic gradient descent to obtain a model that reconstructs high-resolution images from low-resolution images, i.e., the image super-resolution reconstruction model based on a multi-column convolutional neural network described in the present invention.
The method specifically comprises the following steps:
Step 1, multi-column convolutional neural network model construction: design a multi-column convolutional neural network model according to a deep learning algorithm, comprising a feature extraction part and an image reconstruction part;
Step 2, image augmentation: large-scale datasets are a prerequisite for successfully training deep networks; image augmentation applies a series of random changes to the training images to generate similar but different training samples, thereby enlarging the training dataset; enlarging the training set in this way reduces the model's dependence on particular attributes and improves its generalization ability; the augmentation methods used are rotation, scaling and mirroring;
Step 3, training set construction: on the enlarged training set obtained in step 2, cut the original images into small patches and downsample these high-resolution patches to obtain low-resolution patches; use these low-resolution/high-resolution patch pairs to build the training set;
Step 4, multi-column convolutional neural network model training: train the image super-resolution reconstruction model on the training set obtained in step 3, using stochastic gradient descent as the optimization algorithm; after training, a model that reconstructs high-resolution images from low-resolution images is obtained;
Step 5, image super-resolution reconstruction: the model trained in step 4 reconstructs the input low-resolution image into the corresponding high-resolution image.
In step 1, a cascaded multi-column convolutional neural network is proposed to extract multi-scale features from the low-resolution image and then reconstruct the corresponding high-resolution image; the network structure is shown in FIG. 1. The proposed framework uses several multi-column blocks, each consisting of three columns of convolutional layers with different kernel sizes. From the input low-resolution image, the proposed model predicts the residual image between the bicubic-interpolated image and the target high-resolution image. The model is divided into two parts: a feature extraction part and an image reconstruction part.
In the feature extraction part, a convolutional layer with 64 kernels of size 3×3 is first used to extract coarse features. Then, three cascaded multi-column blocks are used to extract multi-scale features. No bias is used in this model, so a convolutional layer computes
F(x) = σ(Wl * x), (1)
where Wl and x denote the learnable weights and the input of the convolutional layer, respectively, * denotes the convolution operation, and σ denotes the activation function; a Leaky Rectified Linear Unit (Leaky ReLU) is used in this model.
Finally, a deconvolution layer is used to upsample the extracted features, followed by a 3×3 convolutional layer that produces the residual image. The output size of the deconvolution layer is computed as
Xout = (Xin − 1)×λ − 2×ρ + κ, (2)
where Xin and Xout are the input and output sizes of the deconvolution layer, respectively, λ is the deconvolution stride, ρ is the number of zero rows padded on each side of the input, and κ is the size of the deconvolution kernel. Clearly, λ must be set equal to the magnification factor. Table 1 gives the parameter settings of the deconvolution layer at different magnification factors.
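By way of illustration, the consistency of Equation (2) can be checked with a short computation. The kernel sizes and paddings below are illustrative assumptions rather than the actual values of Table 1; they are chosen so that the deconvolution enlarges a 41×41 low-resolution patch exactly to the high-resolution patch sizes used in step 3.

```python
def deconv_output_size(x_in, stride, padding, kernel):
    # Eq. (2): Xout = (Xin - 1) * lambda - 2 * rho + kappa
    return (x_in - 1) * stride - 2 * padding + kernel

# the (scale, kernel, padding) triples are assumptions, not the values of Table 1
for scale, kernel, padding in [(2, 6, 2), (3, 9, 3), (4, 8, 2)]:
    x_in = 41  # low-resolution patch size used in this embodiment
    print(scale, deconv_output_size(x_in, scale, padding, kernel))
    # prints 82, 123 and 164, matching the high-resolution patch sizes of step 3
```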
Table 1
In the proposed model, convolution kernels of different sizes are used in each column to extract features; the detailed structure is shown in FIG. 1. The receptive field γ of a column of convolutional layers is computed as
γ = κ + (κ − 1)×(n − 1), (3)
where κ is the kernel size and n is the number of convolutional layers in the column. According to this formula, the multi-column block uses six convolutional layers with 3×3 kernels, three layers with 5×5 kernels and two layers with 7×7 kernels, so that all three columns have the same receptive field.
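Equation (3) can be verified directly for the three columns: six 3×3 layers, three 5×5 layers and two 7×7 layers all yield a receptive field of 13.

```python
def receptive_field(kernel, n_layers):
    # Eq. (3): gamma = kappa + (kappa - 1) * (n - 1)
    return kernel + (kernel - 1) * (n_layers - 1)

print(receptive_field(3, 6), receptive_field(5, 3), receptive_field(7, 2))  # 13 13 13
```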
To obtain more reliable features, the features extracted by the different columns must be fused over the same receptive field. Fusion is performed by adding a 1×1 convolutional layer as the last layer of each column and then summing the feature maps of the columns element-wise. The benefit of the added 1×1 convolutional layer is that it allows more complex combinations of the multi-scale features. In general, more multi-column blocks give better performance; as a trade-off between performance and efficiency, three multi-column blocks are used in this embodiment.
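The following PyTorch sketch shows one way such a multi-column block could be organized: three columns with 3×3, 5×5 and 7×7 kernels (six, three and two layers respectively), a 1×1 convolution at the end of each column, and element-wise addition for fusion. The channel width of 64 and the Leaky ReLU slope are assumptions for illustration, not values fixed by the text.

```python
import torch.nn as nn

class MultiColumnBlock(nn.Module):
    """Illustrative sketch of a multi-column block; channel width and
    activation slope are assumptions, not values fixed by the text."""
    def __init__(self, channels=64):
        super().__init__()
        def column(kernel, n_layers):
            pad = kernel // 2  # keep the spatial size unchanged
            layers = []
            for _ in range(n_layers):
                layers += [nn.Conv2d(channels, channels, kernel, padding=pad, bias=False),
                           nn.LeakyReLU(0.2, inplace=True)]
            layers += [nn.Conv2d(channels, channels, 1, bias=False)]  # 1x1 conv before fusion
            return nn.Sequential(*layers)
        self.col3 = column(3, 6)  # six 3x3 layers   -> receptive field 13
        self.col5 = column(5, 3)  # three 5x5 layers -> receptive field 13
        self.col7 = column(7, 2)  # two 7x7 layers   -> receptive field 13

    def forward(self, x):
        # element-wise addition fuses the multi-scale features of the three columns
        return self.col3(x) + self.col5(x) + self.col7(x)
```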
In the image reconstruction part, a 3×3 convolutional layer is used to predict the residual image between the high-resolution image and the bicubic-interpolated image. The residual image predicted by the network is added element-wise to the bicubic-interpolated image to reconstruct the corresponding high-resolution image. The output image is computed as
ŷ = fB(x) + fM(x), (4)
where x and ŷ denote the low-resolution input and the high-resolution output of the model, respectively, fB denotes bicubic interpolation and fM denotes the proposed model, whose output is the predicted residual image.
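The reconstruction path of Equation (4) could be wired up roughly as in the following sketch, which reuses the MultiColumnBlock class sketched above: coarse features, three cascaded multi-column blocks, deconvolution upsampling, a 3×3 convolution producing the residual image, and element-wise addition with the bicubic-interpolated input. Single-channel (luminance) input, the layer widths and the default deconvolution kernel/padding (valid for a magnification factor of 2) are assumptions for illustration.

```python
import torch.nn as nn
import torch.nn.functional as F

class MCSRNet(nn.Module):
    """Illustrative end-to-end sketch of the proposed structure (Eq. (4));
    channel counts and deconvolution parameters are assumptions."""
    def __init__(self, scale=2, channels=64, deconv_kernel=6, deconv_pad=2):
        super().__init__()
        self.scale = scale
        self.head = nn.Conv2d(1, channels, 3, padding=1, bias=False)   # coarse features
        self.body = nn.Sequential(*[MultiColumnBlock(channels) for _ in range(3)])
        # deconv_kernel and deconv_pad must satisfy Eq. (2) for the chosen scale
        self.up = nn.ConvTranspose2d(channels, channels, deconv_kernel,
                                     stride=scale, padding=deconv_pad)
        self.tail = nn.Conv2d(channels, 1, 3, padding=1, bias=False)   # residual image

    def forward(self, x):
        bicubic = F.interpolate(x, scale_factor=self.scale,
                                mode='bicubic', align_corners=False)
        residual = self.tail(self.up(self.body(self.head(x))))
        return bicubic + residual                                       # Eq. (4)
```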
In step 2, the training set consists of the 91 images of Yang et al. and the 200 images of BSDS. See Yang, Jianchao, et al. "Image super-resolution via sparse representation." IEEE Transactions on Image Processing 19.11 (2010): 2861-2873, and Martin, David, et al. "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics." Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), Vol. 2, IEEE, 2001. The image augmentation methods are scaling, rotation and mirroring: the scaling factors are 1, 0.7 and 0.5; the rotation angles are 0°, 90°, 180° and 270°; mirroring is either horizontal mirroring or keeping the original orientation. After augmentation, 23 additional versions are obtained in addition to the original image.
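A minimal sketch of this augmentation scheme, using PIL as an assumed toolchain, is shown below; it produces 3 scales × 4 rotations × 2 mirror states = 24 versions per image, i.e., the original plus 23 additional ones.

```python
from PIL import Image

def augment(img):
    """Return the 24 augmented versions of a PIL image (original included)."""
    versions = []
    for scale in (1.0, 0.7, 0.5):
        w, h = img.size
        scaled = img if scale == 1.0 else img.resize(
            (round(w * scale), round(h * scale)), Image.BICUBIC)
        for angle in (0, 90, 180, 270):
            rotated = scaled if angle == 0 else scaled.rotate(angle, expand=True)
            versions.append(rotated)                                   # original orientation
            versions.append(rotated.transpose(Image.FLIP_LEFT_RIGHT))  # horizontal mirror
    return versions
```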
In step 3, on the enlarged training set obtained in step 2, the original images are cut into small patches and these high-resolution patches are downsampled to obtain low-resolution patches; the low-resolution/high-resolution patch pairs are used to build the training set. When the magnification factor is 2, the patch size is 82×82, the stride is 64, and the downsampling factor is the reciprocal of the magnification factor, i.e., 1/2. Similarly, when the magnification factor is 3, the patch size is 123×123, the stride is 48, and the downsampling factor is 1/3; when the magnification factor is 4, the patch size is 164×164, the stride is 32, and the downsampling factor is 1/4. The input low-resolution patches are all of size 41×41.
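The patch construction can be sketched as follows: high-resolution patches are cropped on a regular grid with the patch size and stride listed above, and each is downsampled by bicubic interpolation to a 41×41 low-resolution patch. The use of PIL for cropping and downsampling is an assumption.

```python
from PIL import Image

# magnification factor -> (HR patch size, stride); the LR patch is always 41 x 41
PATCH_SETTINGS = {2: (82, 64), 3: (123, 48), 4: (164, 32)}

def make_pairs(hr_img, scale):
    """Cut an HR image into patches and build (LR, HR) training pairs."""
    patch, stride = PATCH_SETTINGS[scale]
    pairs = []
    w, h = hr_img.size
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            hr_patch = hr_img.crop((left, top, left + patch, top + patch))
            lr_patch = hr_patch.resize((patch // scale, patch // scale), Image.BICUBIC)
            pairs.append((lr_patch, hr_patch))  # lr_patch is 41 x 41 for every scale
    return pairs
```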
In step 4, the image super-resolution reconstruction model is trained on the training set obtained in step 3. The optimization algorithm is stochastic gradient descent, the batch size is set to 64, the momentum is set to 0.9, the weight decay is set to 10^-4, and the learning rate is set to 0.1 and decreased by a factor of 10 every 20 epochs. Because the initial learning rate is relatively large, gradient clipping is used to prevent gradient explosion, with the clipping threshold set to 0.4. After training, a model that reconstructs high-resolution images from low-resolution images is obtained.
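The optimizer, learning-rate schedule and gradient clipping described above could be configured in PyTorch roughly as follows. The loss function, the total number of epochs, the data loader (train_loader) and the use of norm-based rather than value-based clipping are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

model = MCSRNet(scale=2)
criterion = nn.MSELoss()                     # loss function is not stated; assumed here
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
# learning rate drops by a factor of 10 every 20 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)

for epoch in range(80):                      # total number of epochs assumed
    for lr_batch, hr_batch in train_loader:  # batches of 64 (LR, HR) patch pairs
        optimizer.zero_grad()
        loss = criterion(model(lr_batch), hr_batch)
        loss.backward()
        # clip gradients at 0.4 to prevent explosion under the large initial learning rate
        torch.nn.utils.clip_grad_norm_(model.parameters(), 0.4)
        optimizer.step()
    scheduler.step()
```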
In step 5, the model trained in step 4 reconstructs the input low-resolution image into the corresponding high-resolution image.
Experiments are carried out on five image databases, Set5, Set14, BSDS100, Urban100 and Manga109, to evaluate the image super-resolution reconstruction method based on a multi-column convolutional neural network proposed by the present invention. Set5, Set14 and BSDS100 contain natural images; Urban100 contains urban scene images; Manga109 contains manga (comic) images. The experimental environment is the PyTorch platform under the Ubuntu 16.04 operating system, with 16 GB of memory and a GeForce 1070 GPU. Peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) are used as evaluation metrics for the super-resolution reconstruction model: the larger the PSNR and the closer the SSIM is to 1, the better the reconstruction matches the original image and the higher the accuracy. The results are shown in Table 2. FIGS. 2-5 compare the reconstruction results of different algorithms on these test sets.
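For reference, PSNR and SSIM on a reconstructed image could be computed as in the sketch below, using scikit-image; evaluating on the luminance channel and shaving a few border pixels, as is common practice in super-resolution benchmarks, are assumptions rather than requirements of the text.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(sr, hr, border=4):
    """PSNR / SSIM between a super-resolved image and its ground truth.
    Inputs are uint8 luminance arrays; `border` pixels are shaved from each side."""
    sr = sr[border:-border, border:-border].astype(np.float64)
    hr = hr[border:-border, border:-border].astype(np.float64)
    psnr = peak_signal_noise_ratio(hr, sr, data_range=255)
    ssim = structural_similarity(hr, sr, data_range=255)
    return psnr, ssim
```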
Table 2
The following references are cited in Table 2:
4. Timofte, Radu, Vincent De Smet, and Luc Van Gool. "A+: Adjusted anchored neighborhood regression for fast super-resolution." Asian Conference on Computer Vision. Springer, Cham, 2014.
5. Huang, Jia-Bin, Abhishek Singh, and Narendra Ahuja. "Single image super-resolution from transformed self-exemplars." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
6. Schulter, Samuel, Christian Leistner, and Horst Bischof. "Fast and accurate image upscaling with super-resolution forests." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
7. Dong, Chao, Chen Change Loy, and Xiaoou Tang. "Accelerating the super-resolution convolutional neural network." European Conference on Computer Vision. Springer, Cham, 2016.
8. Kim, Jiwon, Jung Kwon Lee, and Kyoung Mu Lee. "Accurate image super-resolution using very deep convolutional networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
9. Kim, Jiwon, Jung Kwon Lee, and Kyoung Mu Lee. "Deeply-recursive convolutional network for image super-resolution." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
10. Lai, Wei-Sheng, et al. "Deep Laplacian pyramid networks for fast and accurate super-resolution." IEEE Conference on Computer Vision and Pattern Recognition. Vol. 2. No. 3. 2017.
The best experimental result for each setting is shown in bold and the second best is underlined. As can be seen from the table, the method of the present invention shows good robustness and accuracy on all five databases. The above experiments show that the method of the present invention indeed achieves good robustness and accuracy in image super-resolution reconstruction with low computational complexity, and is therefore well suited to real-time video quality monitoring.