Technical Field
The present invention relates to the technical field of visual object detection, and more particularly to a visual object detection method using a denoising stacked autoencoder network.
Background
There are three main approaches to detecting a target in a scene image: 1) remove the background, so that what remains is the target; 2) enhance the target by template convolution and then locate it directly through image segmentation; 3) transform the image into a feature space using features that suppress the background and highlight the target, and judge whether a defect exists by machine-learning pattern recognition. Whether the background is removed, the target is located directly, or the image is converted to a feature space, a threshold is typically used to distinguish defects, background, and interference, yet a fixed threshold can hardly adapt to complex and changeable scenes. Edge detection struggles with blurred edges and low-contrast objects; morphological methods are sensitive to non-uniform illumination and contrast; template matching has difficulty accommodating target deformation and choosing suitable scale parameters. These methods therefore perform poorly under intra-class variation, inter-class similarity, and complex interference, and lack robustness in complex environments.
An autoencoder captures the most important factors that represent the input data in order to reproduce the input signal; like PCA, it finds the principal components of the original information, and these components are the features of the input signal, i.e., the output of the intermediate layer. A denoising autoencoder sets the model to reconstruct the noise-free original input from input data that contains partial noise; it finds the relationships between different dimensions of the samples and recovers lost information from partial data, so the model is robust to noise and occlusion.
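The denoising behavior described above can be sketched in a few lines. The following toy example is an illustration only, not part of the invention: it trains a one-hidden-layer autoencoder in numpy to reconstruct clean vectors from inputs whose entries are randomly masked, the masking playing the role of the partial noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples of 8-dim vectors near a 3-dim subspace, so a
# small hidden layer can capture the "principal components".
basis = rng.normal(size=(3, 8))
X = rng.normal(size=(200, 3)) @ basis

n_in, n_hid = 8, 5
W1 = rng.normal(scale=0.1, size=(n_in, n_hid))  # encoder weights
b1 = np.zeros(n_hid)
W2 = rng.normal(scale=0.1, size=(n_hid, n_in))  # decoder weights
b2 = np.zeros(n_in)

def forward(Xin):
    H = np.tanh(Xin @ W1 + b1)      # encode to the intermediate layer
    return H, H @ W2 + b2           # decode (linear output)

loss_before = np.mean((forward(X)[1] - X) ** 2)

lr = 0.05
for _ in range(2000):
    X_noisy = X * (rng.random(X.shape) > 0.3)    # mask ~30% of entries
    H, X_rec = forward(X_noisy)
    err = X_rec - X                              # target is the CLEAN input
    # Gradients of the mean squared reconstruction error
    gW2 = H.T @ err / len(X)
    gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1.0 - H ** 2)           # tanh derivative
    gW1 = X_noisy.T @ dH / len(X)
    gb1 = dH.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

loss_after = np.mean((forward(X)[1] - X) ** 2)
```

Because the target of the squared error is always the clean input, the network must learn the relations between dimensions in order to fill in the masked entries, which is exactly the anti-noise, anti-occlusion property relied on below.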
Therefore, substantial improvement over the prior art is urgently needed.
Summary of the Invention
The technical problem to be solved by the present invention is, in view of the above defects of the prior art, to provide a visual object detection method using a denoising stacked autoencoder network, comprising the steps of:
S1. Take the scene image of a training sample and the label image marking the target position as a joint input; after multi-layer encoding and decoding, obtain the same output, and take the label image in the output as the object detection result;
S2. The denoising stacked autoencoder network comprises multiple layers. The first layer serves as the input and output end and performs plain encoding and decoding without the denoising function. The intermediate layers, through repeated encoding and decoding, find the relationships between different dimensions and learn from the samples how to recover the missing label image from the scene image, thereby obtaining the label image of the scene image;
S3. The denoising stacked autoencoder network extracts features layer by layer and recovers lost information.
Step S2 further comprises the steps of:
A1. Generate the first-layer autoencoder, which encodes and decodes the input to reproduce it as output. Take the scene image of the training sample and the label image of the target position as the joint input F1; encoding O1 = s1(W1F1 + b1) produces the intermediate layer O1, which is then decoded and reconstructed as F1' = s1(W2O1 + b2). The model parameters should make the reconstructed data approximate the original vector as closely as possible, i.e. minimize Loss = ||F1 - F1'||^2.
Taking the squared difference as the loss between the reconstructed data and the original vector, and adding an L1 constraint, i.e. the sparsity requirement that most nodes in each layer be zero and only a few be nonzero, the above expression becomes Loss = ||F1 - F1'||^2 + λ||O1||_1, where λ weights the sparsity term.
With clean, noise-free original data, W2 ≈ W1^T.
A2. Treat the output of the first-layer encoder as the input of the second-layer denoising autoencoder, and likewise minimize the reconstruction error of the second-layer denoising autoencoder, so that the output reconstructed by the second layer after encoding and decoding equals the second layer's input;
A3. Generate the intermediate layers of denoising autoencoders in the same way;
A4. Stack the layers of denoising autoencoders. The input passes in turn through the first-layer encoding, the second-layer encoding, and so on up to the n-th-layer encoding, then back through the n-th-layer decoding, down to the second-layer decoding and the first-layer decoding, and the output is the same information as the input;
A5. In use, take a scene image and a blank label image as the joint input, treating the label image as information lost to noise or occlusion. Through the multi-layer denoising autoencoders, the lost information is recovered from the scene image; the last layer yields both the scene image and the label image, but only the label image is taken as the output.
Implementing the visual object detection method of the present invention using the denoising stacked autoencoder network, the network comprises multiple layers, extracts features layer by layer, and recovers lost information, which improves detection accuracy. The method can be widely applied in license plate detection, character detection in natural scenes, pedestrian detection, defect detection, and other detection applications.
Brief Description of the Drawings
The present invention will be further described below with reference to the accompanying drawings and embodiments, in which:
FIG. 1 is a flow chart of a first embodiment of the visual object detection method using a denoising stacked autoencoder network according to the present invention.
FIG. 2 is a flow chart of the sub-steps of step S2 in FIG. 1.
Detailed Description
Please refer to FIG. 1, which is a flow chart of the first embodiment of the visual object detection method using a denoising stacked autoencoder network according to the present invention. FIG. 2 is a flow chart of the sub-steps of step S2 in FIG. 1. As shown in FIG. 1 and FIG. 2, the visual object detection method using a denoising stacked autoencoder network provided by the first embodiment of the present invention comprises at least the following steps:
S1. Take the scene image of a training sample and the label image marking the target position as a joint input; after multi-layer encoding and decoding, obtain the same output, and take the label image in the output as the object detection result;
S2. The denoising stacked autoencoder network comprises multiple layers. The first layer serves as the input and output end and performs plain encoding and decoding without the denoising function. The intermediate layers, through repeated encoding and decoding, find the relationships between different dimensions and learn from the samples how to recover the missing label image from the scene image, thereby obtaining the label image of the scene image;
In a specific implementation, S2 further comprises the steps of:
A1. Generate the first-layer autoencoder, which encodes and decodes the input to reproduce it as output. Take the scene image of the training sample and the label image of the target position as the joint input F1; encoding O1 = s1(W1F1 + b1) produces the intermediate layer O1, which is then decoded and reconstructed as F1' = s1(W2O1 + b2). The model parameters should make the reconstructed data approximate the original vector as closely as possible, i.e. minimize Loss = ||F1 - F1'||^2.
Taking the squared difference as the loss between the reconstructed data and the original vector, and adding an L1 constraint, i.e. the sparsity requirement that most nodes in each layer be zero and only a few be nonzero, the above expression becomes Loss = ||F1 - F1'||^2 + λ||O1||_1, where λ weights the sparsity term.
With clean, noise-free original data, W2 ≈ W1^T.
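As a concrete reading of the loss in step A1, the squared reconstruction error plus the L1 sparsity penalty can be computed as follows. This is an illustrative sketch; the weighting factor lam is an assumed hyperparameter, not specified in the text.

```python
import numpy as np

def sparse_recon_loss(F1, F1_rec, O1, lam=1e-3):
    """Squared difference between input F1 and reconstruction F1_rec,
    plus an L1 penalty that drives most entries of the code O1 to zero."""
    recon = np.sum((F1 - F1_rec) ** 2)      # ||F1 - F1'||^2
    sparsity = lam * np.sum(np.abs(O1))     # lam * ||O1||_1
    return recon + sparsity
```

The L1 term is what enforces the constraint that most nodes in each layer are zero while only a few remain active.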
A2. Treat the output of the first-layer encoder as the input of the second-layer denoising autoencoder, and likewise minimize the reconstruction error of the second-layer denoising autoencoder, so that the output reconstructed by the second layer after encoding and decoding equals the second layer's input;
A3. Generate the intermediate layers of denoising autoencoders in the same way;
A4. Stack the layers of denoising autoencoders. The input passes in turn through the first-layer encoding, the second-layer encoding, and so on up to the n-th-layer encoding, then back through the n-th-layer decoding, down to the second-layer decoding and the first-layer decoding, and the output is the same information as the input;
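The mirrored encode/decode order of step A4 can be expressed directly. In this sketch, `encoders[i]` and `decoders[i]` are assumed placeholders for the i-th layer's trained encode and decode functions:

```python
def stacked_forward(x, encoders, decoders):
    """Apply encoders 1..n in order, then decoders n..1 in reverse,
    so that decoder i undoes encoder i and the output mirrors the input."""
    for encode in encoders:           # first-layer encoding ... n-th-layer encoding
        x = encode(x)
    for decode in reversed(decoders): # n-th-layer decoding ... first-layer decoding
        x = decode(x)
    return x
```

With well-trained layers, each decoder approximately inverts its paired encoder, so the stack reproduces its input end to end.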
A5. In use, take a scene image and a blank label image as the joint input, treating the label image as information lost to noise or occlusion. Through the multi-layer denoising autoencoders, the lost information is recovered from the scene image; the last layer yields both the scene image and the label image, but only the label image is taken as the output;
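The inference procedure of step A5 can be sketched as follows. This is illustrative only: `network` stands in for the trained stacked model, and the images are flattened as they would be for a fully connected network.

```python
import numpy as np

def detect(scene, network, label_shape):
    """Concatenate the scene image with an all-zero (blank) label image,
    run the joint vector through the stacked denoising autoencoder, and
    return only the reconstructed label half as the detection result."""
    blank = np.zeros(label_shape)                         # label treated as lost information
    joint = np.concatenate([scene.ravel(), blank.ravel()])
    out = network(joint)                                  # encode 1..n, decode n..1
    return out[scene.size:].reshape(label_shape)          # keep only the label image
```

The scene half of the output is discarded; only the recovered label image is reported as the detection result.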
S3. The denoising stacked autoencoder network extracts features layer by layer and recovers lost information. Extracting features layer by layer and recovering lost information improves the detection accuracy of the denoising stacked autoencoder network.
Through the design of the above embodiments, the present invention provides a denoising stacked autoencoder network that comprises multiple layers, extracts features layer by layer, and recovers lost information, thereby improving detection accuracy; it can be widely applied in license plate detection, character detection in natural scenes, pedestrian detection, defect detection, and other detection applications.
The present invention has been described with reference to specific embodiments, but those skilled in the art will appreciate that various changes and equivalent substitutions may be made without departing from its scope. In addition, many modifications may be made to adapt the invention to particular situations without departing from its protection scope. Therefore, the invention is not limited to the particular embodiments disclosed herein, but includes all embodiments falling within the scope of the appended claims.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610959069.6A | 2016-11-03 | 2016-11-03 | Visual object detection method employing de-noising stacked automatic encoder network |
| Publication Number | Publication Date |
|---|---|
| CN106529589A | 2017-03-22 |
Patent Citations

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104361328A* | 2014-11-21 | 2015-02-18 | Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences | Facial image normalization method based on self-adaptive multi-column depth model |
| CN104641644A* | 2012-05-14 | 2015-05-20 | Luca Rossato | Encoding and decoding based on mixing of sample sequences along time |
Non-Patent Citations

| Title |
|---|
| PASCAL VINCENT et al.: "Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion", Journal of Machine Learning Research* |
| XUGANG LU et al.: "Speech Enhancement Based on Deep Denoising Autoencoder", https://www.researchgate.net/publication/283600839* |
| 王宪保 et al.: "基于堆叠降噪自动编码器的胶囊缺陷检测方法" (Capsule defect detection method based on stacked denoising autoencoders), 《计算机科学》 (Computer Science)* |
Cited By

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107194418A* | 2017-05-10 | 2017-09-22 | Hefei Institutes of Physical Science, Chinese Academy of Sciences | Rice aphid detection method based on adversarial feature learning |
| CN107194418B* | 2017-05-10 | 2021-09-28 | Hefei Institutes of Physical Science, Chinese Academy of Sciences | Rice aphid detection method based on adversarial feature learning |
| US10726525B2 | 2017-09-26 | 2020-07-28 | Samsung Electronics Co., Ltd. | Image denoising neural network architecture and method of training the same |
| CN109886210A* | 2019-02-25 | 2019-06-14 | Baidu Online Network Technology (Beijing) Co., Ltd. | Traffic image recognition method and apparatus, computer device, and medium |
| CN109886210B* | 2019-02-25 | 2022-07-19 | Baidu Online Network Technology (Beijing) Co., Ltd. | Traffic image recognition method and apparatus, computer device, and medium |
| CN112861625A* | 2021-01-05 | 2021-05-28 | Shenzhen Technology University | Method for determining a stacked denoising autoencoder model |
| CN112861625B* | 2021-01-05 | 2023-07-04 | Shenzhen Technology University | Method for determining a stacked denoising autoencoder model |
| WO2023050433A1* | 2021-09-30 | 2023-04-06 | Zhejiang University | Video encoding and decoding method, encoder, decoder and storage medium |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB02 | Change of applicant information | Address after: 325000 Science Park, Dongfang South Road, Ouhai District, Wenzhou, Zhejiang. Applicant after: Wenzhou University. Address before: 325000 Wenzhou Higher Education Park (Chashan Town, Ouhai District), Wenzhou, Zhejiang. Applicant before: Wenzhou University. |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2017-03-22 |