A method for restoring facial images
Technical field
The present invention relates to the field of image processing, and more particularly to a method for restoring facial images.
Background technology
Low-resolution face image restoration refers to restoring a clear, high-resolution face image from one or several low-resolution face images. In many images and videos, clear faces carry important information and value. In recent years in particular, with the widespread adoption of road monitoring, driving recorders, and security surveillance, clear face images in surveillance video have received increasing attention. Face images play an extremely important role in many applications, such as identity verification, crowd analysis, and human tracking. In practical applications, however, the demand for high-resolution face images often conflicts with the low resolution of surveillance video. The blurriness and insufficient definition of face images in surveillance video thus bring many obstacles and inconveniences to the practical application of video monitoring. Under current technical constraints, high-resolution optical sensors are not yet ubiquitous. Although problems such as blurred faces in video can be solved by upgrading equipment such as optical sensors, this often increases purchase and maintenance costs and cannot improve the definition of video that has already been recorded. Moreover, during use there are many sources of interference, such as motion and long distances, that affect the quality of the recorded video. Therefore, obtaining the desired information from restored high-resolution images by technical means is in great practical demand.
At the present stage, when analyzing video, people often need to check the information in surveillance footage repeatedly and observe key parts over and over again, and face images are often one of the key pieces of information in a video. Because faces in surveillance video are often far away and occupy a small proportion of the frame, the resolution of the face image tends to be low when the camera is positioned far from the subject. For faces of insufficient definition in video, the common approach is to first enlarge the image directly by interpolation and then analyze it. Interpolation is fast and widely applied, but its enlargement quality is poor: the high-frequency information of the image is damaged, causing the image to blur, which brings great difficulty to the recognition and restoration of face images in video.
With the development of computer vision, many computer vision techniques have been applied to low-resolution face image restoration. At present, the more mature techniques include interpolation methods, dictionary learning methods, and deep convolutional neural networks. Dictionary learning methods build two dictionaries, one for low-resolution images and one for high-resolution images, and learn different mapping relations to achieve a mapping from low resolution to high resolution. Interpolation methods build a more optimized up-sampling function model to enlarge the image while keeping the high-frequency information intact. Deep learning methods perform a "sparse representation - mapping - reconstruction" process from low resolution to high resolution through neural networks to obtain a high-resolution image. Although many methods exist for low-resolution face image restoration, most of them target face images under controlled conditions, i.e., faces must be captured under strict angle, illumination, and expression constraints.
Content of the invention
The object of the present invention is to address the above problems in the prior art by providing a method for restoring facial images, so as to restore blurred face images captured in uncontrolled environments into clear images.
To achieve the above object, the present invention adopts the following technical scheme:
A method for restoring facial images, comprising the following steps:
S1: obtaining a group of face image pairs, each face image pair comprising a clear image and a blurred image of the same person's face;
S2: taking the blurred image as the initial input image and feeding it into a policy network;
S3: selecting a region from the input image by means of the policy network;
S4: restoring, by means of an enhancement network, the region of the input image selected in S3;
S5: taking the whole image obtained after the region restoration in S4 as the input image of S3, and iteratively performing S3 to S4 several times, the image obtained by the last repetition of S4 being the restored image;
S6: calculating the similarity between the restored image obtained in S5 and the clear image obtained in S1, training the policy network of S3 with a reinforcement learning algorithm, and training the enhancement network of S4 with back-propagation and gradient descent;
S7: initializing the policy network and the enhancement network with the parameters obtained by training in S6;
S8: taking a face image to be restored as the initial input image, feeding it into the policy network, and repeating S3 to S5 to obtain the restored face image.
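The iterative loop of S3 to S5 can be sketched as follows. This is a minimal illustration, not the patented implementation: `select_region` and `restore_region` are hypothetical callables standing in for the policy network and the enhancement network, and the 60*45 region size is taken from the embodiment described later.

```python
import numpy as np

def iterative_restore(blurred, select_region, restore_region, n_iters=25,
                      region_h=45, region_w=60):
    """Sketch of steps S3-S5: repeatedly pick a region and restore it in place."""
    image = blurred.copy()
    for _ in range(n_iters):
        # S3: the policy picks the top-left corner of a region_h x region_w patch
        top, left = select_region(image)
        patch = image[top:top + region_h, left:left + region_w]
        # S4: the enhancement network restores the selected patch
        restored_patch = restore_region(image, patch)
        # S5: the restored patch replaces the region; the whole image feeds the next round
        image[top:top + region_h, left:left + region_w] = restored_patch
    return image
```

After the loop, the image returned by the last iteration is the restored image of S5.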
Further, the policy network comprises a fully connected layer and a long short-term memory (LSTM) network; the LSTM network is used to record and encode the regions selected in previous iterations of S3 and to pass them on to the next iteration in the form of a hidden vector.
Further, the input image of the policy network in step S3 is the blurred image or the image obtained in step S4 of the previous iteration round, and its output is a probability map of the same size as the input image. In S8, when S3 is executed, the point of highest probability in the probability map is taken as the center point, and the rectangular region of fixed size cropped from the input image at the corresponding position is the region selected by step S3.
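The crop described above can be sketched as follows. This is an illustrative sketch: the clamping of the center point at the image border is an assumption, since the text does not specify the boundary behavior, and the 60*45 region size is taken from the embodiment.

```python
import numpy as np

def crop_at_peak(image, prob_map, region_h=45, region_w=60):
    """Crop a fixed-size region centered on the highest-probability point.

    The center is clamped so that the crop stays inside the image
    (an assumption; the boundary behavior is not specified in the text)."""
    cy, cx = np.unravel_index(np.argmax(prob_map), prob_map.shape)
    top = min(max(cy - region_h // 2, 0), image.shape[0] - region_h)
    left = min(max(cx - region_w // 2, 0), image.shape[1] - region_w)
    return image[top:top + region_h, left:left + region_w], (top, left)
```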
Further, before S8 (i.e., during training), when S3 is executed, a point is randomly sampled from the probability map as the center point, and the rectangular region of fixed size cropped from the input image at the corresponding position is the region selected by step S3.
Further, the enhancement network comprises a convolutional neural network and a plurality of fully connected layers, the convolutional neural network consisting of 8 convolutional layers.
Further, in S6, the similarity between the restored image obtained in S5 and the clear image obtained in S1 is measured by the mean squared error between the two images, i.e., the squared differences of the pixels at corresponding positions of the two images are computed and all resulting values are summed.
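The similarity measure just described can be sketched in a few lines; as in the text, the squared pixel differences are summed rather than averaged, so a lower value means the two images are more similar.

```python
import numpy as np

def image_similarity(restored, clear):
    """Similarity measure of step S6: squared pixel-wise differences,
    summed over the whole image (lower means more similar)."""
    diff = restored.astype(np.float64) - clear.astype(np.float64)
    return np.sum(diff ** 2)
```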
Further, in S6, the method for training the policy network with a reinforcement learning algorithm is specifically: negating the image similarity obtained in step S6 and using it as the reward signal in the reinforcement learning method; computing the gradient of the reward signal with respect to the policy network by the REINFORCE algorithm; and updating the parameters of the policy network by back-propagation and gradient descent.
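The gradient computation can be sketched as follows, under the rule given in the embodiment: with reward R equal to the negated similarity and P the probability of the randomly sampled point, the gradient of the probability map is R/P at that point and 0 everywhere else. This is a sketch of that single rule, not a full REINFORCE trainer.

```python
import numpy as np

def reinforce_gradient(prob_map, sampled_point, similarity):
    """Sketch of the REINFORCE step: reward R = -similarity; the gradient
    w.r.t. the probability map is R / P at the sampled point, 0 elsewhere."""
    reward = -similarity
    grad = np.zeros_like(prob_map)
    y, x = sampled_point
    grad[y, x] = reward / prob_map[y, x]
    return grad
```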
Further, S7 also includes obtaining a plurality of groups of face image pairs and iteratively performing S2 to S7 for each group of face image pairs in turn.
Compared with the prior art, the beneficial effects of the present invention are as follows: the method for restoring facial images provided by the present invention can autonomously and preferentially select the less distorted regions of a blurred face image, restore these regions first, and then use the additional information from the restored regions to assist the restoration of the remaining distorted regions, thereby achieving a better restoration effect than the prior art.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a method for restoring facial images provided by the present invention.
Fig. 2 is a schematic diagram of a face image pair in the present invention.
Fig. 3 is a schematic flowchart of S3 to S4 in the present invention.
Fig. 4 is a schematic flowchart of S5 in the present invention.
Fig. 5 is an example of face image restoration performed with the method of the present invention.
Embodiment
The technical scheme of the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The method for restoring facial images provided by the present invention can restore a blurred face image into a clear image, and mainly comprises two parts: neural network training and face image restoration.
Specifically, as shown in Fig. 1, the method for restoring facial images provided by the present invention comprises the following steps:
S1: obtaining a group of face image pairs; as shown in Fig. 2, each face image pair comprises a clear image and a blurred image of the same person's face;
S2: taking the blurred image as the initial input image and feeding it into a policy network;
S3: selecting a region from the input image by means of the policy network;
S4: restoring, by means of an enhancement network, the region of the input image selected in S3;
S5: taking the whole image obtained after the region restoration in S4 as the input image of S3, and iteratively performing S3 to S4 several times, the image obtained by the last repetition of S4 being the restored image;
S6: calculating the similarity between the restored image obtained in S5 and the clear image obtained in S1, training the policy network of S3 with a reinforcement learning algorithm, and training the enhancement network of S4 with back-propagation and gradient descent;
S7: initializing the policy network and the enhancement network with the parameters obtained by training in S6;
S8: taking a face image to be restored as the initial input image, feeding it into the policy network, and repeating S3 to S5 to obtain the restored face image.
Among these, S1 to S7 constitute the neural network training process, and S8 is the face image restoration process.
Before training the neural networks, the parameters of the policy network and the enhancement network may first be randomly initialized from a normal distribution with mean 0 and variance 0.01. The policy network comprises a fully connected layer and an LSTM network; the enhancement network comprises a convolutional neural network and a plurality of fully connected layers, the convolutional neural network consisting of 8 convolutional layers.
Further, the processing of steps S3 to S4 is shown in Fig. 3, and that of S5 is shown in Fig. 4. The details are as follows. In S5, each round of iteration performs S3 to S4 and outputs a new "state". A "state" comprises two parts. One part is the image after the region output by S4 has been restored; since this image contains the restoration results of the regions of all previous "states", the policy network can learn which regions of the image are clear and which regions remain blurred, and can decide which region should currently be restored based on the regions already restored. The other part is the hidden vector produced by the LSTM network of the policy network in S3; the LSTM network has the ability to memorize long-term information, and is used here to record and encode the locations of the regions selected in previous iterations and to pass them to the next iteration in the form of a hidden vector.
From the second round of iteration onward, the input of the policy network in S3 is the "state" produced by the previous round (i.e., the whole image obtained after the region restoration in the previous round's S4). The first layer of the policy network is a fully connected layer whose input is the image. Assuming the image size is 128*128, this fully connected layer flattens the input image into a 16384-dimensional vector and outputs a 256-dimensional vector. This 256-dimensional vector, together with the hidden vector obtained in the previous round, is fed into the LSTM network. The LSTM network then outputs a 512-dimensional hidden variable, from which a probability map of size 128*128 is produced through a fully connected layer. Each point in the probability map represents the probability that the policy network selects a region of fixed size centered on that point of the input image. Because this is still the training process, we do not yet need to select the region of highest probability; therefore a point is randomly sampled from the probability map, and the rectangular region of size 60*45 centered on this point is output as the region selected by step S3.
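The layer dimensions described above can be checked with a shape-only sketch. This is not the patented network: the weights are random, a single tanh layer stands in for the LSTM cell, and a softmax over the final layer is an assumption about how the probability map is normalized.

```python
import numpy as np

def policy_forward(image, hidden, rng):
    """Shape-only sketch of the policy network forward pass: random weights,
    a tanh layer standing in for the LSTM cell. Checks dimensions only."""
    x = image.reshape(-1)                                     # 128*128 -> 16384
    w1 = rng.standard_normal((256, x.size)) * 0.01
    f1 = np.tanh(w1 @ x)                                      # 16384 -> 256
    w2 = rng.standard_normal((512, 256 + hidden.size)) * 0.01
    new_hidden = np.tanh(w2 @ np.concatenate([f1, hidden]))   # -> 512
    w3 = rng.standard_normal((image.size, 512)) * 0.01
    logits = w3 @ new_hidden                                  # 512 -> 16384
    probs = np.exp(logits - logits.max())
    prob_map = (probs / probs.sum()).reshape(image.shape)     # 128*128 map
    return prob_map, new_hidden
```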
Assume the image to be restored is of size 128*128 and the extracted image region is of size 60*45. In step S4, the whole image is flattened into a 16384-dimensional vector, passed through the first fully connected layer to obtain a 256-dimensional vector, through the second fully connected layer to obtain another 256-dimensional vector, and finally through the third fully connected layer to obtain a feature map of size 60*45. This 60*45 feature map is merged with the extracted image region to form a feature map of size 2*60*45. This feature map is passed through a convolutional neural network to obtain the restored 60*45 region image. The convolutional neural network consists of 8 convolutional layers: the convolution kernel size of the first and second layers is 5*5 with output size 60*45*16; the convolution kernel size of the third, fourth, and fifth layers is 7*7 with output size 60*45*64; the convolution kernel size of the sixth and seventh layers is 7*7 with output size 60*45*32; and the convolution kernel size of the eighth layer is 5*5 with output size 60*45*1, which is the restored region image. The restored region image replaces the corresponding region of the image obtained in the previous iteration, and the whole image formed after the replacement serves as the input of the next iteration round.
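The enhancement network's shapes can likewise be traced with a random-weight sketch. This is a shape check, not the patented network: the layer ordering of the 7*7 stages is an assumption (the translated text is ambiguous about which layers output 32 versus 64 channels), tanh activations are assumed, and "same" padding is assumed so every layer preserves the 60*45 spatial size.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_same(x, w):
    """'Same'-padded 2D convolution. x: (C_in, H, W); w: (C_out, C_in, k, k)."""
    k = w.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    win = sliding_window_view(xp, (k, k), axis=(1, 2))   # (C_in, H, W, k, k)
    return np.einsum('chwij,ocij->ohw', win, w, optimize=True)

def enhance_region(image, region, rng):
    """Shape-only sketch of the S4 enhancement network with random weights:
    three FC layers map the whole image to a 60*45 feature map, which is
    stacked with the extracted region and passed through 8 conv layers."""
    x = image.reshape(-1)                                         # 16384
    f = np.tanh(rng.standard_normal((256, x.size)) * 0.01 @ x)    # -> 256
    f = np.tanh(rng.standard_normal((256, 256)) * 0.01 @ f)       # -> 256
    feat = (rng.standard_normal((45 * 60, 256)) * 0.01 @ f).reshape(45, 60)
    h = np.stack([region, feat])                                  # 2 x 45 x 60
    # (kernel, out_channels) per conv layer; ordering of 7*7 stages assumed
    specs = [(5, 16), (5, 16), (7, 64), (7, 64), (7, 64), (7, 32), (7, 32), (5, 1)]
    for k, c_out in specs:
        w = rng.standard_normal((c_out, h.shape[0], k, k)) * 0.01
        h = np.tanh(conv_same(h, w))
    return h[0]                                                   # 45 x 60 region
```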
By iteratively performing S3 to S4 several times in S5, an image restored from the blurred image, here called the restored image, is finally obtained. By comparing the restored image with the clear image, the policy network and the enhancement network can then be trained.
Specifically, in S6, the similarity between the restored image obtained in S5 and the clear image obtained in S1 is measured by the mean squared error between the two, i.e., the squared differences of the pixels at corresponding positions of the two images are computed and all resulting values are summed.
Further, the enhancement network is trained with the conventional method for training neural networks, i.e., using the mean squared error as the loss function and updating the network parameters by back-propagation and gradient descent. The policy network, by contrast, uses a reinforcement learning algorithm: it tries to select different regions each time, and according to the quality of the final reward signal, the selections made throughout the sequence are encouraged or suppressed.
In S6, the method for training the policy network with the reinforcement learning algorithm is specifically: negating the mean squared error computed above and using it as the reward signal in the reinforcement learning method; computing the gradient of the reward signal with respect to the policy network by the REINFORCE algorithm; and updating the parameters of the policy network by back-propagation and gradient descent. In this embodiment, assuming the value of the reward signal is R and the probability of the point randomly sampled in a given iteration is P, the gradient of the policy network at this point in this step is R/P and is 0 at all unselected points; this gradient is used to update the parameters of the policy network by back-propagation and gradient descent.
As an improvement, S7 also includes obtaining a plurality of groups of face image pairs and iteratively performing S2 to S7 for each group of face image pairs in turn. Using multiple groups of face image pairs as samples to iteratively train the policy network and the enhancement network can improve their training effect; the more sample groups of face image pairs, the better the effect. Within each face image pair, the blurred image can be obtained by shrinking the clear image via bilinear interpolation and then enlarging it back to full size, which simplifies the sample acquisition process.
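The sample-degradation procedure just described can be sketched as follows. This is an illustrative implementation of plain bilinear resizing (align-corners style, an assumption, since the text does not specify the interpolation details); the downscale factor of 4 is also an assumed example value.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Plain bilinear interpolation resize (align-corners style)."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

def make_blurred_pair(clear, factor=4):
    """Build a blurred training sample as described: downscale the clear
    image by bilinear interpolation, then upscale back to full size."""
    h, w = clear.shape
    small = bilinear_resize(clear, h // factor, w // factor)
    return bilinear_resize(small, h, w)
```

The down-then-up round trip removes high-frequency detail while keeping the image at full size, so the pair shares the same dimensions for pixel-wise comparison.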
After the training of the policy network and the enhancement network is completed and their parameters are initialized, the restoration of a face image can be realized by taking the face image to be restored as the initial input image. In S8, after S3 to S4 have been iteratively performed 25 times or some other fixed number of times, the image finally obtained is the face image after multiple regions have been restored many times. As shown in Fig. 5, each of the 25 face images is the output image of the current iteration of steps S3 to S4, and above each face image is the face region selected by step S3 of the current iteration. The output image of the last iteration is the output of this method.
It should be noted that in S8, when S3 is executed, the way the image region is selected according to the probability map output by the policy network differs slightly from that used during neural network training. When actually restoring a single face image, the point of highest probability in the probability map should be taken as the center point, and the rectangular region of fixed size cropped from the input image at the corresponding position is the region selected by step S3.
The embodiments described above express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the claims of the present invention. It should be pointed out that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present invention, and these all belong to the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.