CN111986142B - An unsupervised data enhancement method for hot-rolled coil surface defect images - Google Patents

An unsupervised data enhancement method for hot-rolled coil surface defect images

Info

Publication number
CN111986142B
CN111986142B
Authority
CN
China
Prior art keywords
generator
discriminator
function
data
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010445141.XA
Other languages
Chinese (zh)
Other versions
CN111986142A (en)
Inventor
杨永刚
张云贵
邓泽先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Automation Research and Design Institute of Metallurgical Industry
Original Assignee
Automation Research and Design Institute of Metallurgical Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Automation Research and Design Institute of Metallurgical Industry
Priority to CN202010445141.XA
Publication of CN111986142A
Application granted
Publication of CN111986142B
Legal status: Active (current)
Anticipated expiration

Abstract

An unsupervised data enhancement method for hot-rolled coil surface defect image data belongs to the technical field of metallurgical hot-rolled coil defect detection. The generative adversarial network consists of two deep neural networks that compete against each other; it is a probabilistic generative model in which samples are generated by forward propagation through a generator and, after evaluation by a discriminator, the networks are optimized by gradient back-propagation, and the model does not depend on any prior assumption. During training of the generative adversarial network, the two networks are iteratively optimized in a mutual game: the generator learns to produce more realistic samples, while the discriminator network tries as hard as possible to tell whether a data sample is real or fake. The two sides compete until neither can improve further, and the two networks finally reach a dynamic equilibrium, i.e., the image distribution produced by the generator approaches the real image distribution and the discriminator can no longer tell real images from fake ones. The method has the advantage of solving the problems of scarce, hard-to-collect hot-rolled coil surface defect picture data and the resulting difficulty of accurate recognition in the metallurgical steel rolling field.

Description

Unsupervised data enhancement method for hot-rolled coil surface defect images
Technical Field
The invention belongs to the technical field of metallurgical hot-rolled coil defect detection, and relates to a method for generating hot-rolled coil surface defect image data based on an artificial intelligence network model.
Background
In an era of continual industrial development, steel has become the industrial "grain", one of the most important materials of modern times. However, in the actual production of steel coils at steel mills, hot-rolled coils with surface defects are often produced owing to equipment problems and processing techniques. If coil surface defects cannot be detected and identified effectively, the quality, appearance, economic value and usability of the coil as a steel product are seriously affected, so research on coil surface defect identification has great practical significance. Early defect detection relied on visual inspection by quality inspectors, which is labour-intensive, inefficient and time-consuming. Machine vision based on traditional machine learning was later adopted; although the recognition rate improved, manual judgement was still required, and features had to be designed by hand through feature engineering, which is cumbersome. In recent years, the rise of deep learning has brought a qualitative leap to image recognition. Deep learning developed from research on artificial neural networks; a deep neural network contains multiple hidden layers. By combining low-level features into more abstract high-level representation attributes or features, deep learning discovers distributed feature representations of the data. Deep learning requires no hand-crafted features and its pipeline is simple: an image is fed in at one end of the network and a prediction comes out at the other end. The network automatically learns image texture features from large amounts of data, combines and abstracts them, can better describe the rich internal information of the data, and has greater learning capacity than traditional networks, which in turn can improve the accuracy of hot-rolled coil surface defect image recognition.
Using deep learning to detect hot-rolled coil surface defects first requires training a corresponding neural network model, which usually relies on massive amounts of labelled data, so the requirements on training samples are very high. The larger the data volume, the better the generalization of the trained model and the higher the accuracy of hot-rolled coil surface defect recognition. However, collecting hot-rolled coil surface defect data in the metallurgical steel rolling field is very difficult, which makes such industrial data a precious resource. Defects appear on the surfaces of hot-rolled coils at random, and after collection, steel rolling experts must spend a great deal of time labelling the defect data, which is time-consuming and labour-intensive. When the data are fed into a deep neural network, the coil picture data and their features often determine the upper limit of the final recognition result; selection and optimization of the model and algorithm only gradually approach that upper limit, so data quality and quantity play a decisive role in the final training performance of the model. Data enhancement technology developed against this background. Data enhancement, also called data augmentation, applies a series of transformations to the existing original data without collecting additional data, generating more new data so that limited data produce more value, while improving the recognition accuracy and generalization ability of the classification model.
An unsupervised learning model is an artificial intelligence model whose characteristic is that, given input examples, it automatically discovers the latent rules in those examples. Unsupervised learning learns the essential characteristics of real data, thereby describing the distribution of the sample data and generating new data similar to the training samples. The number of model parameters is much smaller than the amount of training data, so the model must discover and effectively internalize the essence of the data in order to generate it. The idea of the generative adversarial network (GAN, Generative Adversarial Networks) comes from game theory; it belongs to unsupervised learning and was proposed by Ian Goodfellow in 2014. The GAN model ultimately produces good output through mutual game-playing between two modules in the framework, the Generator and the Discriminator. The generator produces observation data from some implicit information given to it at random. In the invention, the input of the generator network is a random noise variable and the output is computer-generated image data of hot-rolled coil surface defects. The generator continuously learns to fit the true high-dimensional data distribution starting from a low-dimensional data distribution, while the discriminator's main task is to distinguish whether the data originate from real images or from images produced by the generator. The discriminator model takes an image as input and makes predictive decisions through the model's loss function. The generator and discriminator are trained alternately: first the discriminator is trained by feeding it real data and generated data until it performs well, and then the trained discriminator is used to train the generator so that the data produced by the generator deceive the discriminator as far as possible. The two sides confront and learn from each other until the generated pictures cannot be distinguished from real pictures, finally reaching a Nash equilibrium (an equilibrium in game theory in which no party can improve its own payoff by unilaterally changing its strategy).
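The alternating training scheme just described can be made concrete with a minimal sketch. The tiny fully connected generator and discriminator, the latent size and the hyperparameters below are illustrative stand-ins rather than the networks of the invention; the sketch only shows the discriminator-then-generator update order described above.

```python
# Minimal sketch of alternating GAN training (illustrative stand-in networks,
# not the patent's generator/discriminator sequences).
import torch
import torch.nn as nn

latent_dim, img_dim = 128, 64 * 64  # assumed sizes for the sketch

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, img_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):
    """One alternating update; `real` is a (batch, img_dim) tensor of real images."""
    batch = real.size(0)

    # 1) train the discriminator: real samples labelled 1, generated samples labelled 0
    fake = G(torch.randn(batch, latent_dim)).detach()
    loss_D = bce(D(real), torch.ones(batch, 1)) + bce(D(fake), torch.zeros(batch, 1))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # 2) train the generator with the discriminator fixed: make fakes score as real
    fake = G(torch.randn(batch, latent_dim))
    loss_G = bce(D(fake), torch.ones(batch, 1))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```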
Disclosure of Invention
The invention aims to provide a method for stably and effectively generating hot-rolled coil surface defect image data, which solves the problem that, in the metallurgical steel rolling field, hot-rolled coil surface defect image data are scarce and difficult to acquire, so that defect recognition with artificial intelligence machine vision has low accuracy and product quality is seriously affected. The invention differs from existing supervised data enhancement. Traditional data enhancement applies fixed preset rules to existing images, such as simple geometric transformations (flipping, rotation, cropping, deformation, scaling, etc.) and colour transformations (noise, blurring, brightness, erasure, dithering, filling, etc.); it does not take into account differences between tasks, the rich diversity of the samples, or the quality of the images after enhancement, and it is not efficient or stable enough. For contrast, a short sketch of such fixed-rule augmentation is given below.
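The sketch below expresses the fixed-rule, preset-transform augmentation criticised above in a few torchvision transforms; the particular operations and parameter values are illustrative only and are not settings taken from the patent.

```python
# Traditional fixed-rule augmentation (the baseline the invention contrasts with);
# parameter values are illustrative.
from torchvision import transforms

traditional_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                 # flipping
    transforms.RandomRotation(degrees=10),                  # rotation
    transforms.RandomResizedCrop(128, scale=(0.8, 1.0)),    # cropping / scaling
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # brightness jitter
    transforms.GaussianBlur(kernel_size=3),                 # blurring
    transforms.ToTensor(),
])
```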
The original GAN theory does not require both the generator and the discriminator to be neural networks; it only requires functions that can fit the corresponding generation and discrimination mappings. The generative adversarial network consists of two deep neural networks competing against each other. It is a probabilistic generative model: samples are generated by forward propagation through the generator, and after evaluation by the discriminator the networks are optimized by gradient back-propagation, and the model does not depend on any prior assumption. Many earlier data enhancement methods assumed, often with great complexity, that the data follow a certain distribution and then used maximum likelihood to estimate that distribution. The generator in a GAN takes random noise as input and produces a sample, while the discriminator judges whether the data produced by the generating network are real or fake. During training of the generative adversarial network, the two networks are iteratively optimized in a mutual game: the generator learns to produce more realistic samples, and the discriminator network tries as hard as possible to tell whether a data sample is real or fake. The two sides compete until neither can improve, and the two networks finally reach a dynamic equilibrium, i.e., the image distribution produced by the generator approaches the real image distribution and the discriminator can no longer tell real images from fake ones. The earliest generators and discriminators were not deep neural networks but perceptrons, and they optimized the objective function shown in the following equation:

min_G max_D V(D, G) = E_{x~p(x)}[log D(x)] + E_{z~p(z)}[log(1 − D(G(z)))]
where x denotes a real sample, z denotes the random noise input to the generator G, and G(z) denotes the image generated by the generator G. D(x) denotes the probability that the discriminator D judges a picture to be a real picture, and D(G(z)) is the probability that the discriminator D judges the picture generated by G to be real. p(x) denotes the distribution of the real data, and p(z) denotes the distribution of the noise from which data are generated. The formula requires maximizing the discriminator's probability estimate for real data while minimizing its probability estimate for generated data. The generative adversarial network is unstable during training, can only capture local variance characteristics of the data distribution, and is prone to mode collapse. It suffers from problems such as difficult training, generator and discriminator losses that do not indicate training progress, and generated samples lacking diversity, and the generated pictures are blurry and of poor quality.
In view of the above problems, the present invention improves and optimizes the original generative adversarial network, which consists of two deep neural networks competing against each other: a generator sequence network and a discriminator sequence network.
Step 1. The invention defines the generator function as a generator sequence GEN(z) consisting of generator modules, where the initial module is defined as:

Z = R^512, z ~ N(0, 1), A0 = R^(4×4×512)

where A is the image space generated by mapping from the noise space Z, and Z is the space consisting of Gaussian noise latent vectors (the latent vector input at the left of Fig. 3) following a normal distribution. N(0, 1) is the standard normal (Gaussian) distribution, R denotes the spatial dimension, → denotes the mapping, and a0 denotes the image first generated by the noise space mapping.
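A sketch of this initial block follows. The patent fixes only the shapes (a 512-dimensional Gaussian latent vector mapped to a 4×4×512 feature map); the specific layers here (a 4×4 transposed convolution followed by a 3×3 convolution, echoing the two convolution modules mentioned in the description of Fig. 3, with LeakyReLU activations) are assumptions for illustration.

```python
# Sketch of the Step 1 initial generator block: z in R^512, z ~ N(0,1), mapped
# to a 4x4x512 feature map a0. Layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class InitialBlock(nn.Module):
    def __init__(self, latent_dim=512, channels=512):
        super().__init__()
        self.block = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, channels, kernel_size=4),  # 1x1 -> 4x4
            nn.LeakyReLU(0.2),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # 3x3 refinement
            nn.LeakyReLU(0.2),
        )

    def forward(self, z):                               # z: (batch, 512)
        return self.block(z.view(z.size(0), -1, 1, 1))  # a0: (batch, 512, 4, 4)

a0 = InitialBlock()(torch.randn(8, 512))                # torch.Size([8, 512, 4, 4])
```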
Step 2. Let the generator sequence have k layers (k is set by the model designer; in the invention k is set to 7), and let Gk be a general function:

Gk: Ai → Ai+1, i ∈ N

It represents the basic generator module; in the implementation the function includes an upsampling operation so that the image resolution of each layer is larger than that of the previous one. Ai+1 denotes the image generated from Ai by the upsampling operation in the generator module, and N is the set of natural numbers {0, 1, 2, 3, ...}.
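A hedged sketch of one such module Gk follows; the 2× nearest-neighbour upsampling and the pair of 3×3 convolutions are assumptions, since the text only requires that each module increase the image resolution.

```python
# Sketch of a generic generator module Gk from Step 2: A_i -> A_{i+1}, with an
# upsampling operation so each layer's resolution exceeds the previous one.
import torch.nn as nn

class GeneratorBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),               # assumed 2x upsampling
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.LeakyReLU(0.2),
        )

    def forward(self, a_i):
        return self.block(a_i)   # a_{i+1}
```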
Step 3. Define k such generator modules and compose them into the final generator sequence:

GEN(z) = Gk(Gk−1(⋯ G1(g0(z)) ⋯))

where gi(z) = ai denotes the output of the i-th intermediate layer (see Step 5).
Step 4. Define a function r (represented by the horizontal black rectangular boxes in Fig. 3) whose purpose is to generate images of different resolutions at different stages of the different modules in the generator sequence. r is modelled as a module consisting of a 1×1 convolution and an activation function (a function that increases the nonlinearity of a neural network model); it greatly increases the nonlinear capacity of the module while keeping the scale of the feature map unchanged (i.e., without losing resolution), and the subsequent nonlinear activation function converts the intermediate convolution activations into an image, with the module's output Oi corresponding to a downsampled version of the final output image. The formula defining r is as follows:

ri: Ai → Oi

where the function ri acts as a regularizer (a rule function that improves the generalization capability of the model), projecting the learned feature map into RGB image space to generate an intermediate-layer image.
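The ri module can be sketched directly from this description: a 1×1 convolution followed by a nonlinearity that projects a feature map into image space without changing its spatial size. The Tanh activation and the three RGB output channels are assumptions.

```python
# Sketch of the Step 4 "to-image" function r_i: a 1x1 convolution plus an
# activation, projecting a feature map into image space at the same resolution.
import torch.nn as nn

class ToImage(nn.Module):
    def __init__(self, in_ch, img_ch=3):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, img_ch, kernel_size=1)  # keeps H x W unchanged
        self.act = nn.Tanh()                                  # assumed nonlinearity

    def forward(self, a_i):
        return self.act(self.proj(a_i))                       # o_i = r_i(a_i)
```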
Step 5. From Step 1 we have gi(z) = ai, so ri(gi(z)) = ri(ai) = oi
where oi is the image generated by the i-th intermediate layer of the generator. The generated images oi at all of these different resolutions are sent to the discriminator, so that gradient information flows into the current layer during back-propagation, strengthening information interaction and preventing vanishing gradients.
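Putting Steps 1-5 together, the generator sequence can be sketched as below, reusing the InitialBlock, GeneratorBlock and ToImage classes from the earlier sketches. The constant channel width and the exact doubling schedule are assumptions; how the k = 7 modules map onto the six resolutions shown in Fig. 4 is not spelled out on this page.

```python
# Sketch of the generator sequence GEN(z) from Steps 3-5: the initial 4x4 block
# followed by k upsampling modules, with every intermediate feature map a_i
# projected by r_i into an image o_i that is handed to the discriminator.
# Assumes InitialBlock, GeneratorBlock and ToImage from the earlier sketches.
import torch
import torch.nn as nn

class GeneratorSequence(nn.Module):
    def __init__(self, k=7, latent_dim=512, base_ch=512, img_ch=3):
        super().__init__()
        self.initial = InitialBlock(latent_dim, base_ch)
        self.blocks = nn.ModuleList([GeneratorBlock(base_ch, base_ch) for _ in range(k)])
        self.to_img = nn.ModuleList([ToImage(base_ch, img_ch) for _ in range(k + 1)])

    def forward(self, z):
        a = self.initial(z)                      # a_0, 4x4 feature map
        outputs = [self.to_img[0](a)]            # o_0
        for block, to_img in zip(self.blocks, self.to_img[1:]):
            a = block(a)                         # a_{i+1}, resolution grows
            outputs.append(to_img(a))            # o_{i+1}
        return outputs                           # images at every intermediate resolution
```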
Step 6. Define the discriminator sequence. We denote by the letter D the discriminator sequence consisting of all the discriminator modules, define the last layer of the discriminator as d(z') and the output of the first layer of the discriminator as d0, denote a real sample by y and a generated sample image by y', and similarly define dj as the j-th intermediate-layer function of the discriminator. When j = k − i, i and j are always related to each other. The output of the j-th intermediate-layer function of the discriminator is thus defined as:
a'j = dj(combine(ok−j, a'j−1)) = dj(combine(oi, a'j−1))
where combine applies a 1×1 convolution to the channel-wise concatenation (channel-cascade combination) of the output oi of the i-th intermediate layer of the generator and the corresponding output of the (j−1)-th intermediate layer of the discriminator, integrating the feature information; see the illustration in Fig. 3. Because the final discrimination loss of the discriminator is a function not only of the generator's final output but also of the intermediate-layer outputs, the intermediate-layer connections allow gradient information to flow from the intermediate layers of the discriminator to the intermediate layers of the generator during training, making network training more stable.
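The combine operation can be sketched as a channel-wise concatenation followed by a 1×1 convolution; the channel counts are illustrative, and it is assumed that oi and the discriminator feature map a'j−1 share the same spatial size at the point where they are fused.

```python
# Sketch of the Step 6 "combine" operation: concatenate the generator's
# intermediate image o_i with the discriminator's previous intermediate output
# a'_{j-1} along the channel axis, then fuse with a 1x1 convolution.
import torch
import torch.nn as nn

class Combine(nn.Module):
    def __init__(self, img_ch, feat_ch, out_ch):
        super().__init__()
        self.fuse = nn.Conv2d(img_ch + feat_ch, out_ch, kernel_size=1)

    def forward(self, o_i, a_prev):              # both (batch, C, H, W) at the same H x W
        return self.fuse(torch.cat([o_i, a_prev], dim=1))   # channel cascade + 1x1 fusion
```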
Step 7. Finally, the formula of the whole discriminator sequence is written as:
Step 8. The invention improves the structure of the traditional generative adversarial network and defines a new loss function, as follows:
where LD denotes the loss function of the discriminator D, LG the loss function of the generator G, xf the image data produced by the generator, xt the real image data, and E the expectation (averaging) function. The new loss function evaluates the probability that, on average, given real data is more realistic than randomly sampled fake data, which enables the whole network to generate stable, high-quality (finer edges, richer textures) data images from smaller sample sets and reduces the time required for model training;
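The formula images for LD and LG are not reproduced on this page. The wording, "the probability that, on average, given real data is more realistic than randomly sampled fake data," matches a relativistic-average formulation, so the sketch below implements that reading; it should be treated as an assumption rather than the patent's exact loss.

```python
# Hedged sketch of a relativistic-average loss consistent with the Step 8
# description; d_real and d_fake are discriminator logits for x_t and x_f.
import torch
import torch.nn.functional as F

def discriminator_loss(d_real, d_fake):
    # real samples should score above the average fake, fakes below the average real
    return (F.binary_cross_entropy_with_logits(d_real - d_fake.mean(), torch.ones_like(d_real))
          + F.binary_cross_entropy_with_logits(d_fake - d_real.mean(), torch.zeros_like(d_fake)))

def generator_loss(d_real, d_fake):
    # the generator pushes fakes above the average real score, and reals below the average fake
    return (F.binary_cross_entropy_with_logits(d_fake - d_real.mean(), torch.ones_like(d_fake))
          + F.binary_cross_entropy_with_logits(d_real - d_fake.mean(), torch.zeros_like(d_real)))
```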
In deep learning, the loss function measures the difference between predictions and ground truth, and this difference is reduced through an optimization strategy. The model optimization rule adopted in this patent is a gradient descent method that computes first-moment and second-moment estimates of the gradient with a per-parameter adaptive learning rate, unlike plain stochastic gradient descent, which corrects errors using the gradient alone. The learning rate can be adjusted according to the history of the gradient, which reduces the memory required for updating the neural network weights from the training data and makes training faster.
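The optimizer described here (first- and second-moment estimates of the gradient with a per-parameter adaptive learning rate) corresponds to Adam. A minimal configuration sketch follows; the placeholder modules are illustrative, the learning rate is the 0.003 value quoted in the description of Fig. 4, and the betas are PyTorch defaults.

```python
# Adam optimizers for generator and discriminator (placeholder modules; the
# 0.003 learning rate is the value quoted in the description of Fig. 4).
import torch
import torch.nn as nn

G = nn.Linear(512, 4 * 4 * 512)   # placeholder generator module for the sketch
D = nn.Linear(4 * 4 * 512, 1)     # placeholder discriminator module

opt_G = torch.optim.Adam(G.parameters(), lr=0.003)  # betas default to (0.9, 0.999)
opt_D = torch.optim.Adam(D.parameters(), lr=0.003)
```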
The invention has the advantage of providing a novel unsupervised data enhancement method for generating hot-rolled coil surface defect image data at different resolutions (shown in Fig. 4). Unlike other artificial intelligence networks, the generative adversarial network described here can stably generate hot-rolled coil surface defect images at various resolutions, producing high-quality image sample data and resolving the training instability of the original generative adversarial network. The network architecture differs in the connections between the intermediate layers of the generator and the discriminator, and it introduces a regularization term between the input vector and the generated result to ensure that similar input vectors do not always fall into a single mapping region. The connected layers are trained simultaneously, which saves a great deal of network training time; the network is not trained in the traditional progressive layer-by-layer growing manner, which improves image generation efficiency and suits rapid engineering deployment. A further advantage of the invention is that the discriminator's result is determined jointly by the generator's final output and the intermediate-layer outputs, and gradient information features are transferred to images of different resolutions. Compared with the prior art, the method further alleviates the difficulty of acquiring training sets of hot-rolled coil surface defect images and lays a better data foundation for improving the recognition accuracy and generalization ability of AI models. The data generated by this new deep generative adversarial network are a good AI model training resource, break through the data barrier to a certain extent, and provide a reference for obtaining desired image data samples for more basic research and practical project applications.
Drawings
Fig. 1 is a graph of the generated data distribution fitting the real data distribution. Line b is the Gaussian distribution of the real data; line c is the data distribution learned by the generative adversarial network, which is randomly initialized at the start. Line a is the probability that the discriminator network judges an image to be real data. The goal of the generative adversarial network is to make curve c gradually fit curve b by continuously training the generator and discriminator networks, i.e., the process from left to right in Fig. 1; finally, when the distributions of the generated samples and the real samples completely coincide, the discriminator judges a sample to be real or fake with equal probability. The horizontal line labelled x represents the sampling space following distribution x, and the horizontal line labelled z represents the sampling space following distribution z. The arrows from the z-axis to the x-axis indicate that the generative adversarial network learns the mapping from z-space to x-space;
Fig. 2 is a conceptual diagram of the generative adversarial network. The generator network takes the random noise vector z as input to generate an image x; the discriminator is trained as a binary classifier on two kinds of data, real data samples all labelled 1 and data from the generator labelled 0. The real and generated data are mixed and fed in as samples, and the discriminator outputs the probability that the input image is a real image; the discriminator is used to determine whether an input image is real or generated. Ideally, the generator network can produce "fake" pictures x convincing enough that the discriminator network can hardly determine whether the pictures generated by the generator network are real or generated. Once such a generative adversarial model is obtained, it can be used to generate target pictures for unsupervised data enhancement;
Fig. 3 is a diagram of the model network architecture. This architecture connects the intermediate layers of the generator sequence with the intermediate layers of the discriminator sequence (which differs from conventional generative adversarial networks) and passes the multi-resolution images from the generator's intermediate layers to the discriminator together with the corresponding activations obtained from the preceding convolutional layer in the discriminator. In the invention, the discriminator therefore receives not only the final (highest-resolution) output of the generator but also the outputs of the intermediate layers. This connection operation allows the generative adversarial network to adjust its parameters better and improves the stability of network training. The leftmost input of the figure is the latent noise vector fed to the horizontal rectangular module of the generator sequence; it then passes through the two convolution modules of the vertical rectangular box (one 4×4 convolution and one 3×3 convolution) and is then upsampled through the intermediate convolution layers, such as g1, g2 in the figure, until an image sample of the specified highest resolution is generated (the defect image of a hot-rolled coil surface break in Fig. 3). The generated images and the training images are then input to the discriminator sequence network on the right, which repeatedly downsamples through convolution modules (downsample), such as dk-2, dk-1 at the right of the figure; during this process the generator intermediate layers are connected in, and the images of different resolutions are combined with the corresponding activation values from the previous convolutional layer through the combine modules (the vertical modules at the right of the figure). A MinibatchStd module then computes the difference information between the feature maps of a small batch of samples at a given layer of the discriminator network and feeds it as input to the next layer, which helps counteract mode collapse; after further downsampling, the features are finally passed to a fully connected layer that outputs the discrimination result;
FIG. 4 shows examples of generated hot-rolled coil surface rolling defects. The resolutions of the images are 4×4, 8×8, 16×16, 32×32, 64×64 and 128×128, in order from left to right and from top to bottom. In theory, the generated image resolution could also be 256×256, 512×512 or 1024×1024; this depends on the resolution of the training image data set, the performance of the training machine, the number of training rounds (iterations), the training time, the neural network hyperparameter settings, and the number of layers k in the generator sequence. The generator and the discriminator in the invention use the same hyperparameter settings: the model learning rate is 0.003 and the batch size is 120, so training is simpler than for other neural networks and convenient to implement. The resolution of the training data set used in the invention is 128×128, the number of training rounds is 57290, and the training time is one week. The machine performance is described in the detailed description;
FIG. 5 shows another example of generated hot-rolled coil surface rolling defects. The resolutions of the images are 4×4, 8×8, 16×16, 32×32, 64×64 and 128×128, in order from left to right and from top to bottom. The generation principle is the same as for Fig. 4; both Fig. 4 and Fig. 5 show multi-resolution hot-rolled coil surface defect data images randomly generated by the trained generative adversarial network model (i.e., one that has fitted the true distribution of the real data).
Detailed Description
1. The method is implemented on a computer server for deep learning with 2 Tesla GPU graphics cards (version V100), each with 32 GB of graphics memory, 128 GB of system memory, and a 1 TB solid-state disk, as the hardware platform;
2. Install the Linux 64-bit (Ubuntu 16.04) operating system, the Anaconda software library, the PyTorch deep learning framework, CUDA Toolkit version 10.2, graphics card driver version NVIDIA-440.33.01 and cuDNN acceleration package version 7.6.5 as the software environment, using the Python programming language;
3. Activate the deep learning development environment using the command source activate pytorch;
4. Prepare the data set. The picture training set comes from an actual hot-rolled coil production line at a steel mill and is captured by a surface defect detector; the labelled data set is sorted by workers and experts and cleaned to remove pictures that do not match the recognition scene, are in the wrong format, contain noise, or are damaged, and to unify the picture size. After preprocessing, 325 rolling-defect images are obtained with a resolution of 128×128 in jpg format (see the data-loading sketch after this list);
5. Define the structure of the generative adversarial network, the dimensions (shape) of the input data, the shape and initialization of each layer, and the loss functions and Gaussian noise distribution for the two networks, the discriminator D and the generator G;
6. Initialize the parameters of both the generator G and the discriminator D. Train the generator to produce image samples using the defined noise distribution and read in data from the training set;
7. Fix the parameters of the generator G and train the discriminator: feed the generated pictures and the real pictures into the discriminator and distinguish true from false pictures as well as possible through the loss function;
8. After the discriminator can separate the real data from the generated data, fix the parameters of the discriminator and train the generator so that it generates picture data again;
9. Select the optimization strategy to optimize the model. After k rounds of cyclic updating, the discriminator can no longer distinguish true from false samples; when the Nash equilibrium target is reached, the generator can be considered to have captured the true distribution of the real data;
10. Examples of images generated using the trained generative adversarial model are shown in Fig. 4 and Fig. 5, as described with reference to the accompanying drawings.
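A hedged sketch of the data preparation referred to in step 4 of this list is given below. The directory name is hypothetical, ImageFolder is assumed to find the defect images sorted into subfolders, and the centre crop and normalization to [-1, 1] are conventional choices rather than settings stated in the patent; the resolution (128×128) and batch size (120) follow the values given above.

```python
# Sketch of loading the 128x128 hot-rolled coil defect images for GAN training.
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(128),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),   # map pixel values to [-1, 1]
])

# hypothetical path; ImageFolder expects images grouped into subdirectories
dataset = datasets.ImageFolder("data/hot_rolled_defects", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=120, shuffle=True, drop_last=True)
```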

Claims (1)

Translated from Chinese

1. A method for unsupervised data enhancement of hot-rolled coil surface defect images, characterized in that it is composed of two mutually competing deep neural networks, a generator sequence network and a discriminator sequence network, and comprises the following steps:

Step 1: Define the generator function as a generator sequence GEN(z) consisting of generator modules, where the initial module is defined as:

Z = R^512, z ~ N(0, 1), A0 = R^(4×4×512)

where A is the image space generated by mapping from the noise space Z, and Z is the space composed of Gaussian noise latent vectors following a normal distribution; N(0, 1) is the standard normal Gaussian distribution, R denotes the spatial dimension, → is the symbol denoting the mapping, and A0 denotes the image first generated by the noise space mapping;

Step 2: Let the generator sequence have k layers, where k is set by the model designer and is set to 7, and let Gk be a general function:

Gk: Ai → Ai+1, i ∈ N

where Ai → Ai+1 denotes the generation process in which an upsampling operation is implemented in the generator module so that each layer's image resolution is larger than before, Ai+1 denotes the image generated from Ai by the upsampling operation in the generator module, and N is the set of natural numbers {0, 1, 2, 3, ...};

Step 3: Define k such generator modules and compose them into the final generator sequence;

Step 4: Define a function r that can generate images of different resolutions at different stages of different modules in the generator sequence. Model r as a module consisting of a 1×1 convolution and an activation function, which greatly increases the nonlinearity of the module while keeping the scale of the feature map unchanged; the subsequent nonlinear activation function converts the intermediate convolution activations into an image, and its output Oi corresponds to different downsampled versions of the final output image, improving the expressive capacity of the neural network model. The formula defining r is as follows:

where the function ri acts as a regularizer, i.e., a rule function that improves the generalization capability of the model, projecting the learned feature map into RGB image space to generate an intermediate-layer image;

Step 5: From Step 1, gi(z) = ai, so ri(gi(z)) = ri(ai) = oi, where oi is the image generated by the i-th intermediate layer of the generator; all the generated images oi of different sizes and resolutions are sent to the discriminator, so that gradient information flows into the current layer during back-propagation to strengthen information interaction and prevent vanishing gradients;

Step 6: Define the discriminator sequence. Let D denote the discriminator sequence composed of all discriminator modules, define the last layer of the discriminator as d(z'), define the output of the first layer of the discriminator as d0, let y denote a real sample and y' a generated sample image, and similarly define dj as an intermediate-layer function of the discriminator; when j = k − i, i and j are always related to each other; therefore the output of the j-th intermediate-layer function of the discriminator is defined as:

a'j = dj(combine(ok−j, a'j−1)) = dj(combine(oi, a'j−1))

where combine performs a 1×1 convolution on the output oi of the i-th intermediate layer of the generator and the corresponding output of the (j−1)-th intermediate layer of the discriminator to integrate the feature information; since the final discrimination loss function of the discriminator is a function not only of the generator's final output but also of the intermediate-layer outputs, the intermediate-layer connections allow gradient information to flow from the intermediate layers of the discriminator to the intermediate layers of the generator during model training;

Step 7: Finally, the formula of the whole discriminator sequence is written down;

Step 8: Improve the structure of the traditional generative adversarial network and define a new loss function, where LD denotes the loss function of the discriminator D, LG the loss function of the generator G, xf the image data produced by the generator, xt the real image data, and E the function taking the expected mean; the new loss function evaluates the probability that, on average, given real data is more realistic than randomly sampled fake data, enabling the whole network to generate stable, high-quality images and reducing model training time;

Step 9: Optimize the model by a gradient descent method that computes first- and second-moment estimates of the gradient with a parameter-adaptive learning rate.
CN202010445141.XA | 2020-05-23 | 2020-05-23 | An unsupervised data enhancement method for hot-rolled coil surface defect images | Active | CN111986142B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010445141.XA CN111986142B (en) | 2020-05-23 | 2020-05-23 | An unsupervised data enhancement method for hot-rolled coil surface defect images

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010445141.XA CN111986142B (en) | 2020-05-23 | 2020-05-23 | An unsupervised data enhancement method for hot-rolled coil surface defect images

Publications (2)

Publication Number | Publication Date
CN111986142A (en) | 2020-11-24
CN111986142B (en) | 2025-01-07

Family

ID=73441996

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202010445141.XA (Active, CN111986142B (en)) | An unsupervised data enhancement method for hot-rolled coil surface defect images | 2020-05-23 | 2020-05-23

Country Status (1)

Country | Link
CN (1) | CN111986142B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112488130B (en)* | 2020-12-17 | 2023-08-15 | 苏州聚悦信息科技有限公司 | An AI-based Microhole Wall Detection Method
CN112581462A (en)* | 2020-12-25 | 2021-03-30 | 北京邮电大学 | Method and device for detecting appearance defects of industrial products and storage medium
CN112669284A (en)* | 2020-12-29 | 2021-04-16 | 天津大学 | Method for realizing pulmonary nodule detection by generating confrontation network
JP7596803B2 (en)* | 2021-01-14 | 2024-12-10 | オムロン株式会社 | Parts Inspection Equipment
CN113298190B (en)* | 2021-07-05 | 2023-04-07 | 四川大学 | Weld image recognition and classification algorithm based on large-size unbalanced samples
CN113920087A (en)* | 2021-10-09 | 2022-01-11 | 东北林业大学 | Micro component defect detection system and method based on deep learning
CN114785423B (en)* | 2022-06-20 | 2022-11-01 | 中国海洋大学三亚海洋研究院 | Method for building underwater laser digital information transmission enhancement model
CN115661062B (en)* | 2022-10-19 | 2025-10-03 | 浙大宁波理工学院 | Industrial defect sample generation method and system based on generative adversarial network
TWI822454B (en)* | 2022-11-10 | 2023-11-11 | 州巧科技股份有限公司 | Defect-detection system and method for training with blended defects
CN116503275B (en)* | 2023-04-17 | 2025-09-12 | 浙江大学 | Small sample slab defect data enhancement and identification method based on RADS model
CN116663619B (en)* | 2023-07-31 | 2023-10-13 | 山东科技大学 | Data enhancement method, device and medium based on GAN network
CN117173461B (en)* | 2023-08-29 | 2024-10-01 | 湖北盛林生物工程有限公司 | Multi-visual task filling container defect detection method, system and medium
CN117115185B (en)* | 2023-09-07 | 2024-11-22 | 淮安市第二人民医院 | A tumor segmentation method and system based on multimodal generative adversarial network
CN117607155B (en)* | 2024-01-24 | 2024-04-19 | 山东大学 | A strain gauge appearance defect detection method and system
CN118071749B (en)* | 2024-04-22 | 2024-08-16 | 江西师范大学 | A training method and system for steel surface defect detection model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN107563510A (en)* | 2017-08-14 | 2018-01-09 | 华南理工大学 | A kind of WGAN model methods based on depth convolutional neural networks
CN108665005A (en)* | 2018-05-16 | 2018-10-16 | 南京信息工程大学 | A method of it is improved based on CNN image recognition performances using DCGAN

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US10614557B2 (en)* | 2017-10-16 | 2020-04-07 | Adobe Inc. | Digital image completion using deep learning
CN110097543B (en)* | 2019-04-25 | 2023-01-13 | 东北大学 | Hot-rolled strip steel surface defect detection method based on generation type countermeasure network
CN110322433B (en)* | 2019-05-27 | 2021-03-12 | 苏州佳赛特智能科技有限公司 | Data set amplification method for visual inspection of appearance defects


Also Published As

Publication number | Publication date
CN111986142A (en) | 2020-11-24


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
