Disclosure of Invention
The technical problem to be solved by the present invention is to provide a watermark removal method based on a generative adversarial network, which realizes automatic watermark removal and improves the watermark removal effect.
The invention is realized as follows: a watermark removal method based on a generative adversarial network comprises the following steps:
step S10, building a generator based on an attentive recurrent network and a contextual autoencoder;
step S20, building a discriminator based on the attentive recurrent network and PatchGAN;
step S30, inputting a plurality of watermark sample pictures into a conditional generative adversarial network composed of the generator and the discriminator for adversarial training;
and step S40, inputting a watermark picture into the adversarially trained generator to generate a de-watermarked picture.
Further, in step S10,
each time step of the attentive recurrent network comprises at least two ResNet layers, a convolutional LSTM (ConvLSTM) unit, and convolutional layers (Convs) for generating an attention map $A_N$, where N is a positive integer; the attentive recurrent network is used for locating the regions from which the watermark is to be removed;
the contextual autoencoder consists of a U-Net structure of 16 Conv-ReLU blocks and removes the watermark from the regions located by the attentive recurrent network.
Further, in step S10, the loss function of the generative network of the generator is:
$L_G = 10^{-2} \cdot L_{GAN}(O) + L_{ATT}(\{A\}, M) + L_M(\{S\}, \{T\}) + L_P(O, T)$;
$L_{GAN}(O) = \log(1 - D(O))$;
$L_P(O, T) = L_{MSE}(VGG(O), VGG(T))$;
where $L_G$ represents the loss value of the generative network; O represents the de-watermarked picture generated by the generator; T represents the watermark-free picture corresponding to O; D represents the discriminative network; M represents a binary mask; $L_{GAN}(O)$ represents the adversarial loss of the generative network; $L_{ATT}(\{A\}, M)$ represents the loss function of the attentive recurrent network, i.e. the mean square error between the attention map $A_t$ output at time step t and the binary mask M, with N = 5 and θ = 0.9; $L_{MSE}(\cdot)$ represents the mean square error; $L_M(\{S\}, \{T\})$ represents the multi-scale loss function of the contextual autoencoder, where $S_i$ represents the i-th output extracted from the contextual autoencoder, $T_i$ represents the watermark-free picture scaled down to the same size as $S_i$, and $\lambda_i$ represents the weights for pictures of different sizes; $L_P(O, T)$ represents the perceptual loss function of the contextual autoencoder: several features are extracted from pictures O and T with a pre-trained VGG feature network, their mean square errors are computed and then summed.
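For concreteness, one plausible expanded form of the attention and multi-scale losses, consistent with the definitions above (the exact summations are assumptions; the text itself only fixes N = 5, θ = 0.9 and the per-scale weights $\lambda_i$):
$L_{ATT}(\{A\}, M) = \sum_{t=1}^{N} \theta^{\,N-t} \, L_{MSE}(A_t, M)$;
$L_M(\{S\}, \{T\}) = \sum_{i} \lambda_i \, L_{MSE}(S_i, T_i)$.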
Further, in step S20, the loss function of the discriminative network of the discriminator is:
$L_D(T, O) = -\log(D(T)) - \log(1 - D(O)) + \gamma \cdot L_{map}(O, T, A_N)$;
$L_{map}(O, T, A_N) = L_{MSE}(D_{map}(O), A_N) + L_{MSE}(D_{map}(T), 0)$;
where $L_D(T, O)$ represents the loss value of the discriminative network; γ represents the weight of the $L_{map}(O, T, A_N)$ loss; $L_{map}(O, T, A_N)$ represents the difference between the attention mask generated by an interior layer of the discriminator and the attention map; $D_{map}(\cdot)$ represents the process by which the discriminator generates the attention mask; 0 represents an attention map containing only zero values.
Further, step S30 specifically includes:
step S31, inputting a plurality of watermark sample pictures into the generator to generate de-watermarked sample pictures;
step S32, inputting the watermark sample pictures and the de-watermarked sample pictures into the discriminator;
step S33, the discriminator judging whether all the de-watermarked sample pictures are real and whether they match the corresponding watermark sample pictures; if so, the adversarial training is finished and the method proceeds to step S40; if not, the method returns to step S31 to continue the adversarial training.
Further, in step S30, the objective function of the conditional generative adversarial network is:
$L_{cGAN}(G, D) = \mathbb{E}_{s,x}[\log D(s, x)] + \mathbb{E}_{s}[\log(1 - D(s, G(s)))]$;
where $L_{cGAN}(G, D)$ represents the objective function of the conditional generative adversarial network; s represents a watermark sample picture; x represents the real picture corresponding to the watermark sample picture; D(s, x) represents inputting the watermark sample picture and the real picture into the discriminator; D(s, G(s)) represents inputting the watermark sample picture and the de-watermarked sample picture generated by the generator into the discriminator; the first expectation is taken over the joint distribution of watermark sample pictures and their corresponding real pictures; the second expectation is taken over the distribution of the de-watermarked sample pictures.
The invention has the advantages that:
1. Watermark sample pictures are input into a conditional generative adversarial network composed of a generator and a discriminator for adversarial training, and watermark pictures are then input into the adversarially trained generator to generate de-watermarked pictures; that is, the adversarially trained generator removes watermarks automatically and in batches, which improves de-watermarking efficiency and suits the processing of large batches of images with complex backgrounds and complex watermarks. The generator is built on the attentive recurrent network and the contextual autoencoder, and the discriminator on the attentive recurrent network and PatchGAN: the generator produces an attention map through the attentive recurrent network to locate the regions from which the watermark is to be removed, the contextual autoencoder removes the watermark in the located regions, and the discriminator concentrates its attention on those regions based on the attention map. Unlike the prior art, the watermark region does not need to be marked in advance; all watermarked regions are noticed automatically, which suits pictures with complex watermarks, finally realizes automatic watermark removal, and greatly improves the removal effect.
2. A conditional generative adversarial network (C-GAN) replaces the traditional generative adversarial network (GAN): the watermark sample picture and the de-watermarked sample picture produced by the generator are both input into the discriminator, which must judge not only whether the de-watermarked sample picture is real but also whether it matches the watermark sample picture, greatly improving the de-watermarking effect.
3. PatchGAN replaces the traditional GAN in building the discriminator, taking into account the influence of different parts of the image on the discriminator, so the trained model pays more attention to image detail; this yields a representation of the overall difference that is more accurate than a single scalar output and fuses the local and global features of the image.
Detailed Description
The general idea of the technical scheme in the embodiments of the present application is as follows: the traditional picture de-watermarking problem is converted into an image-to-image translation task that converts a watermarked picture into a de-watermarked picture; through continuous adversarial training between the generator and the discriminator, the de-watermarked picture generated by the generator becomes realistic enough, thereby achieving the desired de-watermarking effect.
Referring to fig. 1 to 4, a preferred embodiment of a watermark removal method based on a generative adversarial network according to the present invention includes the following steps:
step S10, building a generator (Generator) based on the attentive recurrent network and the contextual autoencoder; the inputs of the contextual autoencoder are the watermark picture and the attention map generated by the attentive recurrent network, and its output is the de-watermarked picture;
step S20, building a discriminator (Discriminator) based on the attentive recurrent network and PatchGAN; the generator is used for generating the de-watermarked picture, and the discriminator is used for judging whether the de-watermarked picture is real;
The discriminator of a conventional GAN maps its input to a single real number, i.e. the probability that the input sample is real, whereas the PatchGAN discriminator maps its input to an N × N patch matrix X, in which the value of $X_{i,j}$ represents the probability that patch (i, j) is real, and the mean of all $X_{i,j}$ is the final output of the discriminator. $X_{i,j}$ is in essence a feature map output by a convolutional layer: each position of the feature map can be traced back to a region of the original image, and its influence on the final result can be read off the discriminator output, so the discriminator pays more attention to the details of the generated image, i.e. it is more sensitive to high frequencies. For this reason PatchGAN is used instead of the traditional GAN to build the discriminator, as the sketch below illustrates.
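A minimal PatchGAN-style discriminator sketch in PyTorch; the layer widths and depth here are illustrative assumptions, not the patent's exact configuration:

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 1, 4, stride=1, padding=1),  # patch matrix X
            nn.Sigmoid(),
        )

    def forward(self, img):
        x = self.net(img)  # X[i, j]: probability that patch (i, j) is real
        return x, x.mean(dim=(1, 2, 3))  # mean of X[i, j]: final output

# Each X[i, j] has a limited receptive field tracing back to one region of
# the input, so the loss penalizes unrealistic local detail (high frequency).
d = PatchDiscriminator()
patch_probs, score = d(torch.randn(1, 3, 256, 256))
```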
Step S30, inputting a plurality of watermark sample pictures into a conditional generative adversarial network (C-GAN) composed of the generator and the discriminator for adversarial training;
a traditional generative adversarial network (GAN) only judges whether the de-watermarked sample picture is real; it cannot guarantee that an input watermark sample picture produces the corresponding de-watermarked sample picture, leaving a loophole that is easy to exploit;
and step S40, inputting the watermark picture into the adversarially trained generator to generate the de-watermarked picture.
In step S10,
each time step of the attentive recurrent network comprises at least two ResNet layers, a convolutional LSTM (ConvLSTM) unit, and convolutional layers (Convs) for generating an attention map $A_N$, where N is a positive integer. The attentive recurrent network locates the regions from which the watermark is to be removed, so that the generative network pays more attention to the watermark region and its surrounding structure, and the discriminative network can better evaluate the local consistency of the restored region.
The attention map is a matrix of values from 0 to 1, with larger values indicating more attention. It is a non-binary map representing attention that increases gradually from the non-watermarked region to the watermarked region; even within the watermarked region the attention varies, because the transparency of the watermark differs from place to place, and parts of the watermark that do not completely occlude the background still convey some background information. A sketch of one such network follows.
The contextual autoencoder consists of a U-Net structure of 16 Conv-ReLU blocks and removes the watermark from the regions located by the attentive recurrent network.
The U-Net network is an encoder-decoder, but differs from the traditional encoder-decoder in its feature skip-layer connections. In the traditional GAN generative network structure, all information must flow through every layer from input to output, which undoubtedly lengthens training time. For the image de-watermarking task, although the input image must undergo a complex conversion into the target image, the input and output images share essentially the same structure; that is, low-level information is shared between them during the conversion and does not itself need to be converted, so the traditional GAN generative network structure is wasteful here. Adjusting the network structure to the needs of image conversion, a U-Net structure realizes this information sharing between input and output: connections between encoder and decoder parts of the same size, also known as skip connections, give the generative model the ability to skip some subsequent steps, so that low-level detail at each resolution is preserved and part of the information can be transmitted directly through the connections during training. A toy example follows.
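A toy U-Net sketch of the skip-connection idea; the patent's contextual autoencoder has 16 Conv-ReLU blocks, whereas the shallow depth and channel widths here are illustrative only. The input is the watermark picture concatenated with the attention map from the attentive recurrent network:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(4, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = nn.Sequential(nn.Conv2d(128, 64, 3, padding=1), nn.ReLU())
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = nn.Sequential(nn.Conv2d(64, 32, 3, padding=1), nn.ReLU())
        self.out = nn.Conv2d(32, 3, 3, padding=1)

    def forward(self, img, attention):
        e1 = self.enc1(torch.cat([img, attention], dim=1))
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        d2 = self.dec2(torch.cat([self.up2(e3), e2], dim=1))  # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.out(d1)  # de-watermarked picture
```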
In step S10, the loss function of the generative network of the generator is:
$L_G = 10^{-2} \cdot L_{GAN}(O) + L_{ATT}(\{A\}, M) + L_M(\{S\}, \{T\}) + L_P(O, T)$;
$L_{GAN}(O) = \log(1 - D(O))$;
$L_P(O, T) = L_{MSE}(VGG(O), VGG(T))$;
where $L_G$ represents the loss value of the generative network; O represents the de-watermarked picture generated by the generator; T represents the watermark-free picture corresponding to O; D represents the discriminative network; M represents a binary mask; $L_{GAN}(O)$ represents the adversarial loss of the generative network; $L_{ATT}(\{A\}, M)$ represents the loss function of the attentive recurrent network, i.e. the mean square error between the attention map $A_t$ output at time step t and the binary mask M, with N = 5 and θ = 0.9; a larger N is expected to produce a better attention map, but a very large N requires more video memory, so N is set to 5; $L_{MSE}(\cdot)$ represents the mean square error; $L_M(\{S\}, \{T\})$ represents the multi-scale loss function of the contextual autoencoder, where $S_i$ represents the i-th output extracted from the contextual autoencoder, $T_i$ represents the watermark-free picture scaled down to the same size as $S_i$, and $\lambda_i$ represents the weights for pictures of different sizes; the values of $\lambda_i$ are set to 0.6, 0.8 and 1 respectively, so that the output pictures of the last, third-from-last and fifth-from-last layers of the contextual autoencoder are 1/4, 1/2 and 1 times the original size respectively; $L_P(O, T)$ represents the perceptual loss function of the contextual autoencoder: several features are extracted from pictures O and T with a pre-trained VGG feature network, their mean square errors are computed and then summed.
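A sketch of the generator loss under one plausible reading of the definitions above; the θ-decayed summation over time steps and the VGG16 feature tap point are assumptions, while the $10^{-2}$ factor, N = 5, θ = 0.9 and $\lambda_i$ = 0.6/0.8/1.0 follow the text:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

vgg_features = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)  # frozen, pre-trained feature network

def generator_loss(d_out, attention_maps, mask, multi_scale_outs, target,
                   theta=0.9, lambdas=(0.6, 0.8, 1.0)):
    n = len(attention_maps)  # N = 5 time steps
    # L_GAN(O) = log(1 - D(O)): minimized when the discriminator is fooled
    l_gan = torch.log(1.0 - d_out + 1e-8).mean()
    # L_ATT: MSE between each A_t and the binary mask M, decayed by theta
    l_att = sum(theta ** (n - 1 - t) * F.mse_loss(a, mask)
                for t, a in enumerate(attention_maps))
    # L_M: multi-scale MSE against the clean picture resized to each S_i
    l_m = sum(lam * F.mse_loss(s, F.interpolate(target, size=s.shape[-2:]))
              for lam, s in zip(lambdas, multi_scale_outs))
    # L_P = L_MSE(VGG(O), VGG(T)): perceptual loss on VGG features
    l_p = F.mse_loss(vgg_features(multi_scale_outs[-1]), vgg_features(target))
    return 1e-2 * l_gan + l_att + l_m + l_p  # L_G
```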
In step S20, the loss function of the discriminative network of the discriminator is:
$L_D(T, O) = -\log(D(T)) - \log(1 - D(O)) + \gamma \cdot L_{map}(O, T, A_N)$;
$L_{map}(O, T, A_N) = L_{MSE}(D_{map}(O), A_N) + L_{MSE}(D_{map}(T), 0)$;
where $L_D(T, O)$ represents the loss value of the discriminative network; γ represents the weight of the $L_{map}(O, T, A_N)$ loss; $L_{map}(O, T, A_N)$ represents the difference between the attention mask generated by an interior layer of the discriminator and the attention map (Attention Map); $D_{map}(\cdot)$ represents the process by which the discriminator generates the attention mask; 0 represents an attention map containing only zero values.
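A matching sketch of the discriminator loss; the value of γ is not fixed by the text, so the 0.05 default here is purely illustrative, and `d_map_real`/`d_map_fake` stand for the interior-layer masks $D_{map}(T)$ and $D_{map}(O)$:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d_real, d_fake, d_map_fake, d_map_real, a_n, gamma=0.05):
    # -log D(T) - log(1 - D(O)): classify real as real, fake as fake
    l_real = -torch.log(d_real + 1e-8).mean()
    l_fake = -torch.log(1.0 - d_fake + 1e-8).mean()
    # L_map: the mask from the fake input should match A_N; from the real
    # input it should be all zeros (no watermark region to attend to)
    l_map = (F.mse_loss(d_map_fake, a_n)
             + F.mse_loss(d_map_real, torch.zeros_like(d_map_real)))
    return l_real + l_fake + gamma * l_map  # L_D(T, O)
```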
Step S30 specifically includes:
step S31, inputting a plurality of watermark sample pictures into the generator to generate de-watermarked sample pictures;
step S32, inputting the watermark sample pictures and the de-watermarked sample pictures into the discriminator;
step S33, the discriminator judging whether all the de-watermarked sample pictures are real and whether they match the corresponding watermark sample pictures; if so, the adversarial training is finished and the method proceeds to step S40; if not, the method returns to step S31 to continue the adversarial training (see the training-loop sketch after these steps).
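In practice, steps S31 to S33 are realized as alternating gradient updates of the discriminator and the generator; a minimal training-loop sketch, assuming the model and loss sketches above plus a hypothetical paired `dataloader` yielding (watermarked picture, clean picture, binary mask M) triples:

```python
import torch

# generator / discriminator: placeholder models per the sketches above;
# the discriminator is assumed to return (score, interior attention mask)
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for watermarked, clean, mask in dataloader:
    # S31: generate de-watermarked samples (attention maps, multi-scale
    # outputs, and the final picture, per the generator sketches above)
    att_maps, scales, fake = generator(watermarked)
    # S32: feed (watermark picture, candidate) pairs to the discriminator
    d_real, map_real = discriminator(watermarked, clean)
    d_fake, map_fake = discriminator(watermarked, fake.detach())
    d_opt.zero_grad()
    discriminator_loss(d_real, d_fake, map_fake, map_real,
                       att_maps[-1].detach()).backward()
    d_opt.step()
    # S33, in effect: push G until its outputs pass as real and matched
    d_fake2, _ = discriminator(watermarked, fake)
    g_opt.zero_grad()
    generator_loss(d_fake2, att_maps, mask, scales, clean).backward()
    g_opt.step()
```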
In step S30, the objective function of the conditional generative adversarial network is:
$L_{cGAN}(G, D) = \mathbb{E}_{s,x}[\log D(s, x)] + \mathbb{E}_{s}[\log(1 - D(s, G(s)))]$;
where $L_{cGAN}(G, D)$ represents the objective function of the conditional generative adversarial network; s represents a watermark sample picture; x represents the real picture corresponding to the watermark sample picture; D(s, x) represents inputting the watermark sample picture and the real picture into the discriminator; D(s, G(s)) represents inputting the watermark sample picture and the de-watermarked sample picture generated by the generator into the discriminator; the first expectation is taken over the joint distribution of watermark sample pictures and their corresponding real pictures; the second expectation is taken over the distribution of the de-watermarked sample pictures.
The generator of the C-GAN algorithm ordinarily generates an image from random noise, but here the random noise would be drowned out by the watermark sample picture, so the random-noise input is omitted.
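The pairing in D(s, x) and D(s, G(s)) is commonly realized by concatenating the condition s with the candidate picture along the channel axis before the discriminator's first convolution; a minimal illustration (this channel-concatenation convention is an assumption, not stated in the text):

```python
import torch

def conditional_d_input(s, candidate):
    return torch.cat([s, candidate], dim=1)  # 6-channel discriminator input

s = torch.randn(1, 3, 256, 256)        # watermark sample picture
x = torch.randn(1, 3, 256, 256)        # corresponding real picture x
d_in_real = conditional_d_input(s, x)  # pairing for D(s, x)
```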
In summary, the invention has the advantages that:
1. Watermark sample pictures are input into a conditional generative adversarial network composed of a generator and a discriminator for adversarial training, and watermark pictures are then input into the adversarially trained generator to generate de-watermarked pictures; that is, the adversarially trained generator removes watermarks automatically and in batches, which improves de-watermarking efficiency and suits the processing of large batches of images with complex backgrounds and complex watermarks. The generator is built on the attentive recurrent network and the contextual autoencoder, and the discriminator on the attentive recurrent network and PatchGAN: the generator produces an attention map through the attentive recurrent network to locate the regions from which the watermark is to be removed, the contextual autoencoder removes the watermark in the located regions, and the discriminator concentrates its attention on those regions based on the attention map. Unlike the prior art, the watermark region does not need to be marked in advance; all watermarked regions are noticed automatically, which suits pictures with complex watermarks, finally realizes automatic watermark removal, and greatly improves the removal effect.
2. A conditional generative adversarial network (C-GAN) replaces the traditional generative adversarial network (GAN): the watermark sample picture and the de-watermarked sample picture produced by the generator are both input into the discriminator, which must judge not only whether the de-watermarked sample picture is real but also whether it matches the watermark sample picture, greatly improving the de-watermarking effect.
3. PatchGAN replaces the traditional GAN in building the discriminator, taking into account the influence of different parts of the image on the discriminator, so the trained model pays more attention to image detail; this yields a representation of the overall difference that is more accurate than a single scalar output and fuses the local and global features of the image.
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.