CN117274823B - Visual Transformer landslide identification method based on DEM feature enhancement - Google Patents

Visual Transformer landslide identification method based on DEM feature enhancement

Info

Publication number
CN117274823B
CN117274823B (application CN202311555261.5A)
Authority
CN
China
Prior art keywords
dem
image
landslide
remote sensing
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311555261.5A
Other languages
Chinese (zh)
Other versions
CN117274823A (en)
Inventor
冷小鹏
贾璐
赵富鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengli Zhiyuan Technology Chengdu Co ltd
Chengdu University of Technology
Original Assignee
Chengli Zhiyuan Technology Chengdu Co ltd
Chengdu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengli Zhiyuan Technology Chengdu Co ltd and Chengdu University of Technology
Priority to CN202311555261.5A (patent CN117274823B/en)
Publication of CN117274823A/en
Application granted
Publication of CN117274823B/en
Priority to KR1020240089945A (patent KR102797774B1/en)
Legal status: Active
Anticipated expiration

Links

Classifications

Landscapes

Abstract

Translated from Chinese

The invention discloses a visual Transformer landslide identification method based on DEM feature enhancement. The method obtains a data set comprising a remote sensing data set D1 and a DEM image data set D2: D1 contains a large number of remote sensing images annotated with landslide areas, and D2 contains DEM images in one-to-one correspondence with the remote sensing image blocks. A landslide identification network is constructed, comprising a DEM feature extraction unit and a Swin-Transformer network; the network is trained to obtain an identification model, which is then used for target identification. The DEM feature extraction unit produces a mask matrix that is superimposed on the output of the patch-partition unit of the Swin-Transformer network. By fully combining DEM image information, the method improves landslide identification, with clear gains in recall, accuracy, and F1 score.

Description

Translated from Chinese
Visual Transformer landslide identification method based on DEM feature enhancement

Technical field

The invention relates to an image recognition method, and in particular to a visual Transformer landslide identification method based on DEM feature enhancement.

Background art

With the rapid development of Earth observation technology, remote sensing is increasingly applied to large-scale geological disaster surveys. Rapid, comprehensive, and accurate identification of landslides from large volumes of remote sensing data is of great significance for disaster prevention and mitigation.

Song Yuyang et al. (2022) proposed a method combining Google open-source data with support vector machines (SVM) and, using western Sichuan as the experimental area, achieved accurate identification of landslides in remote sensing images.

Ji Shunping et al. proposed combining a convolutional neural network with a 3D channel attention mechanism for landslide classification and identification; on the Bijie landslide data set they constructed, the final experiments reached a precision of 98.16% and an F1 score of 96.62%.

Yang Zhaoying et al. proposed a convolutional neural network classification model that fuses DEM data and verified it experimentally on a loess landslide database of their own making, with a final precision of 95.7.

The SVM method suffers from limited recognition accuracy; Ji Shunping et al. did not introduce DEM features; and the method of Yang Zhaoying et al. does not incorporate an attention mechanism. Since the rise of the Vision Transformer family of baseline network models in 2021, which outperform earlier convolutional neural networks on multiple data sets, no work has yet combined a Vision Transformer with DEM features and applied it to landslide classification and identification.

Glossary:

DEM, Digital Elevation Model; a DEM image refers to a digital elevation model map.

Summary of the invention

The purpose of the present invention is to provide a visual Transformer landslide identification method based on DEM feature enhancement that solves the above problems and improves landslide identification performance.

To achieve this purpose, the present invention adopts the following technical solution: a visual Transformer landslide identification method based on DEM feature enhancement, comprising the following steps:

(1) Obtain the data set;

The data set comprises a remote sensing data set D1 and a DEM image data set D2;

D1 contains a large number of remote sensing images annotated with landslide areas, of size M×M×3, where M is the length and width in pixels and M≥128;

D2 contains DEM images in one-to-one correspondence with the remote sensing image blocks, of size M×M×1;

A remote sensing image and its corresponding DEM image form a sample data pair;

(2) Construct a landslide identification network comprising a DEM feature extraction unit and a Swin-Transformer network;

The DEM feature extraction unit treats each 4×4 block of pixels as a sub-region, computes the roughness of each sub-region to form a roughness matrix, and then compares each roughness value with a preset threshold: in the roughness matrix, values smaller than the threshold are replaced with a negative number of effectively infinite magnitude, and the remaining values are replaced with 0, yielding the mask matrix;
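A minimal sketch of the unit described above, assuming elevations arrive as a NumPy array; the function name `dem_mask` and the sentinel value `-1e9` (standing in for the "effectively infinite" negative number) are illustrative, not from the patent:

```python
import numpy as np

def dem_mask(dem, threshold, patch=4, neg=-1e9):
    """Compute per-sub-region roughness, then threshold it into a mask.

    dem: (M, M) elevation array. Each patch x patch sub-region gets
    roughness = max elevation - min elevation (formula (1)); flat
    sub-regions (roughness < threshold) receive a large negative value,
    rough sub-regions receive 0, as the patent describes.
    """
    m = dem.shape[0] // patch
    # group pixels into (m, m, patch, patch) sub-regions
    sub = dem[:m * patch, :m * patch].reshape(m, patch, m, patch).transpose(0, 2, 1, 3)
    rough = sub.max(axis=(2, 3)) - sub.min(axis=(2, 3))  # roughness matrix
    return np.where(rough < threshold, neg, 0.0)         # mask matrix
```

For a 224×224 DEM this produces the 56×56 mask that is later superimposed on the patch embeddings.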

The Swin-Transformer network comprises, connected in sequence, a patch-partition unit, four downsampling units, and a classification head, with an image superposition unit placed between the patch-partition unit and the first downsampling unit;

The image superposition unit superimposes the mask matrix on the output of the patch-partition unit to obtain a superimposed image;

(3) Train the landslide identification network to obtain the identification model;

(31) Set the number of training epochs and the number of sample data pairs fed into the landslide identification network per training step;

(32) Feed the sample data pairs into the landslide identification network. For each sample data pair, the DEM feature extraction unit processes the DEM image to obtain a mask matrix of size M/4×M/4×1; the Swin-Transformer network divides the remote sensing image into M/4×M/4×96 image blocks via the patch-partition unit and superimposes them with the mask matrix to obtain the superimposed image, which then passes through the four downsampling units and the classification head to output the target identification result;
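A shape-level sketch of the superposition step for M = 224; the broadcast-add standing in for "superimpose" is an assumption about how the single-channel mask combines with the 96-channel patch embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
embeds = rng.standard_normal((56, 56, 96)).astype(np.float32)  # patch-partition output, M/4 = 56
mask = np.zeros((56, 56), dtype=np.float32)                    # DEM branch output (M/4 x M/4 x 1)
mask[:10, :10] = -1e9                                          # flat terrain to be suppressed
overlaid = embeds + mask[:, :, None]                           # broadcast over the 96 channels
```

Positions marked by the mask are pushed to large negative values, so the downstream attention layers effectively ignore flat terrain, while unmasked embeddings pass through unchanged.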

(33) Compute the loss function for this training step and update the parameters of the landslide identification network with the Lion optimizer;

(34) Repeat steps (32) and (33) until the parameters converge, yielding the identification model;

(4) Use the identification model for target identification;

(41) Obtain the remote sensing image of the area to be identified, construct the corresponding DEM image, crop the remote sensing image into M×M×3 remote sensing image blocks and the DEM image into M×M×1 DEM image blocks, and pair the remote sensing image blocks with the DEM image blocks one-to-one to form sample pairs;

(42) Feed the sample pairs into the identification model and output the target identification results.

Preferably, in the DEM image, the roughness Ri of sub-region i is computed by the following formula:

Ri = Max(ei) - Min(ei)   (1),

In formula (1), Max(ei) and Min(ei) are respectively the maximum and minimum elevation values of the pixels within sub-region i.

Preferably, in step (2), the threshold is preset.

Preferably, the Lion optimizer updates the parameters according to the following formulas:

ut = sign(β1·mt-1 + (1 - β1)·gt)   (2),

θt = θt-1 - η·(ut + λ·θt-1)   (3),

mt = β2·mt-1 + (1 - β2)·gt   (4),

In formula (2), ut is an intermediate quantity, sign( ) is the sign function, β1 is a preset first parameter with β1 = 0.9, gt is the gradient of the loss function at the t-th training step, and mt-1 is the exponential moving average at the (t-1)-th training step;

In formula (3), θt and θt-1 are the parameters of the landslide identification network at the t-th and (t-1)-th training steps respectively, λ is the weight decay, and η is the learning rate;

In formula (4), mt is the exponential moving average at the t-th training step, and β2 is a preset second parameter with β2 = 0.99.
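The three formulas can be checked with a one-step NumPy sketch; `lion_step` is an illustrative name, and in practice a PyTorch Lion implementation would be used:

```python
import numpy as np

def lion_step(theta, m, grad, lr=1e-5, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update matching formulas (2)-(4): the sign of an
    interpolated momentum drives the step, then the EMA m is refreshed."""
    u = np.sign(beta1 * m + (1.0 - beta1) * grad)   # formula (2)
    theta = theta - lr * (u + wd * theta)           # formula (3)
    m = beta2 * m + (1.0 - beta2) * grad            # formula (4)
    return theta, m
```

Note the ordering implied by the text: the parameter update in (3) uses the previous EMA mt-1 via (2), and only afterwards does (4) produce mt.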

Regarding the DEM feature extraction unit: its purpose is to obtain the mask matrix. Its processing comprises roughness computation, construction of the roughness matrix, and construction of the mask matrix. Because landslides are typically accompanied by large elevation changes, the roughness in the present invention is computed as the elevation difference between the maximum and minimum elevation values within a sub-region, so the roughness reflects how sharply the terrain varies, and the roughness matrix built from it reflects the severity of terrain variation across the entire DEM image. The elements of the roughness matrix are then labelled against the preset threshold: values smaller than the threshold are replaced with a negative number of effectively infinite magnitude and the remaining values with 0, yielding the mask matrix.

The present invention also fuses the DEM feature extraction unit with the Swin-Transformer network. In step (32), the Swin-Transformer network divides the remote sensing image into M/4×M/4×96 image blocks via the patch-partition unit and superimposes them with the mask matrix to obtain the superimposed image. Compared with the original remote sensing image, the superimposed image filters out the data corresponding to regions of flat, low roughness, and the remaining data are fed into the neural network for classification.

Compared with the prior art, the advantage of the present invention is that the mask matrix obtained by the DEM feature extraction unit is superimposed on the output of the patch-partition unit of the Swin-Transformer network and then fed into the subsequent modules of the network. Because the superimposed image fully combines the DEM image information, landslide identification is improved, with clear gains in recall, accuracy, and F1 score.

Description of the drawings

Figure 1 is a structural diagram of the landslide identification network of the present invention;

Figure 2 is a structural diagram of the Swin-Transformer network;

Figure 3 is a structural diagram of the feature extraction module in the Swin-Transformer network;

Figure 4 is a schematic diagram of obtaining the superimposed image.

Detailed description of the embodiments

The present invention will be further described below in conjunction with the accompanying drawings.

Embodiment 1: Referring to Figures 1 to 4, a visual Transformer landslide identification method based on DEM feature enhancement comprises the following steps:

(1) Obtain the data set;

The data set comprises a remote sensing data set D1 and a DEM image data set D2;

D1 contains a large number of remote sensing images annotated with landslide areas, of size M×M×3, where M is the length and width in pixels and M≥128;

D2 contains DEM images in one-to-one correspondence with the remote sensing image blocks, of size M×M×1;

A remote sensing image and its corresponding DEM image form a sample data pair;

(2) Construct a landslide identification network comprising a DEM feature extraction unit and a Swin-Transformer network;

The DEM feature extraction unit treats each 4×4 block of pixels as a sub-region, computes the roughness of each sub-region to form a roughness matrix, and then compares each roughness value with a preset threshold: in the roughness matrix, values smaller than the threshold are replaced with a negative number of effectively infinite magnitude, and the remaining values are replaced with 0, yielding the mask matrix;

The Swin-Transformer network comprises, connected in sequence, a patch-partition unit, four downsampling units, and a classification head, with an image superposition unit placed between the patch-partition unit and the first downsampling unit;

The image superposition unit superimposes the mask matrix on the output of the patch-partition unit to obtain a superimposed image;

(3) Train the landslide identification network to obtain the identification model;

(31) Set the number of training epochs and the number of sample data pairs fed into the landslide identification network per training step;

(32) Feed the sample data pairs into the landslide identification network. For each sample data pair, the DEM feature extraction unit processes the DEM image to obtain a mask matrix of size M/4×M/4×1; the Swin-Transformer network divides the remote sensing image into M/4×M/4×96 image blocks via the patch-partition unit and superimposes them with the mask matrix to obtain the superimposed image, which then passes through the four downsampling units and the classification head to output the target identification result;

(33) Compute the loss function for this training step and update the parameters of the landslide identification network with the Lion optimizer;

(34) Repeat steps (32) and (33) until the parameters converge, yielding the identification model;

(4) Use the identification model for target identification;

(41) Obtain the remote sensing image of the area to be identified, construct the corresponding DEM image, crop the remote sensing image into M×M×3 remote sensing image blocks and the DEM image into M×M×1 DEM image blocks, and pair the remote sensing image blocks with the DEM image blocks one-to-one to form sample pairs;

(42) Feed the sample pairs into the identification model and output the target identification results.

In the DEM image, the roughness Ri of sub-region i is computed by the following formula:

Ri = Max(ei) - Min(ei)   (1),

In formula (1), Max(ei) and Min(ei) are respectively the maximum and minimum elevation values of the pixels within sub-region i.

In step (2), the threshold is preset based on experience.

The Lion optimizer updates the parameters according to the following formulas:

ut = sign(β1·mt-1 + (1 - β1)·gt)   (2),

θt = θt-1 - η·(ut + λ·θt-1)   (3),

mt = β2·mt-1 + (1 - β2)·gt   (4),

In formula (2), ut is an intermediate quantity, sign( ) is the sign function, β1 is a preset first parameter with β1 = 0.9, gt is the gradient of the loss function at the t-th training step, and mt-1 is the exponential moving average at the (t-1)-th training step;

In formula (3), θt and θt-1 are the parameters of the landslide identification network at the t-th and (t-1)-th training steps respectively, λ is the weight decay, and η is the learning rate;

In formula (4), mt is the exponential moving average at the t-th training step, and β2 is a preset second parameter with β2 = 0.99.

Embodiment 2: see Figures 1 to 3;

The Swin-Transformer network, Swin-T for short, consists of patch partitioning, four stages of feature extraction, and classification and merging operations.

The patch-partition unit divides the remote sensing image into several image blocks via the patch partition operation, as shown in Figure 1. Let M = 224; an RGB three-channel remote sensing image then has size 224×224×3;

The remote sensing image is fed into the patch-partition unit. The patch partition operation groups every 4×4 adjacent pixels into one block, so each block contains 4×4 = 16 pixels; each pixel has three values (R, G, B), giving a flattened dimension of 16×3 = 48. The spatial size shrinks from 224 to 224/4 = 56, so after the patch partition operation the remote sensing image has dimensions 56×56×48, which is then projected to 56×56×96 image blocks. The value 96 is preset and controls the complexity of the whole computation: the more dimensions, the larger the amount of computation.
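The dimension bookkeeping above can be reproduced directly; this is a sketch in which the random matrix `W` stands in for the learned linear embedding:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.standard_normal((224, 224, 3))   # M = 224, RGB remote sensing image

# group every 4x4 neighbourhood: (56, 4, 56, 4, 3) -> (56, 56, 4, 4, 3) -> flatten to 48
patches = img.reshape(56, 4, 56, 4, 3).transpose(0, 2, 1, 3, 4).reshape(56, 56, 48)

W = rng.standard_normal((48, 96))          # learned embedding in the real network
tokens = patches @ W                       # 56 x 56 x 96 image blocks
```

Each 4×4×3 neighbourhood flattens to 48 values, and the projection to 96 channels gives exactly the 56×56×96 tensor that the mask is superimposed on.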

The mask matrix is superimposed on the 56×56×96 image blocks to obtain the superimposed image, of size 56×56×96. This is then processed by four downsampling units. The prior art provides a series arrangement of four downsampling units, as shown in Figure 3: the first downsampling unit stacks 2 Swin Transformer Blocks and 1 patch merging operation; the second stacks 2 Swin Transformer Blocks and 1 patch merging; the third stacks 6 Swin Transformer Blocks and 1 patch merging; and the fourth stacks 2 Swin Transformer Blocks, with its output connected directly to the classification head. The Swin Transformer Block is the feature extraction module used for feature extraction (see Figure 3 for its schematic); patch merging is the block-merging operation.

Figure 3 shows the structure of one Swin Transformer Block. Let the input image be T. T passes through layer normalization and the window multi-head self-attention layer, and the result is added to T to give the first residual output, denoted Z1; Z1 then passes through layer normalization and the multi-layer perceptron, and the result is added to Z1 to give the second residual result, denoted Z2; Z2 passes through layer normalization and the shifted-window multi-head self-attention layer, and the result is added to Z2 to give the third residual result, denoted Z3; Z3 passes through layer normalization and the multi-layer perceptron, and the result is added to Z3 to give the final output of the Swin Transformer Block.

Figure 4 is a schematic diagram of obtaining the superimposed image from the DEM image and the remote sensing image in this embodiment. In Figure 4, pixel-level roughness computation on the DEM image locates regions with small roughness values, which can be mapped onto the remote sensing image; see the second picture in the first row of Figure 4, where the rectangular box marks the position in the remote sensing image of the region whose roughness is below the threshold.

Embodiment 3: Referring to Figures 1 to 4, on the basis of Embodiment 1 we propose a visual Transformer landslide identification method based on DEM feature enhancement; the method is the same as in Embodiment 1, with the following specific settings:

In step (1), the Bijie landslide data set is used as the remote sensing data set D1. The area covered by this data set lies in the subtropics, with a typical subtropical monsoon climate and relatively abundant precipitation throughout the year. It is located on the Yunnan-Guizhou Plateau, at the junction of the second and third terrain steps, with large relative elevation differences within the region. Owing to the special terrain and abundant precipitation, landslides are relatively likely. The data set mainly comprises landslide images and non-landslide images. To speed up training, we preprocess the remote sensing images in the Bijie landslide data set: the image size is uniformly set to 224×224×3, followed by normalization, random rotation, Gaussian noise, and similar operations.

In step (1), the DEM images in the data set D2 are constructed for the area corresponding to D1 by means such as interferometric synthetic aperture radar and airborne LiDAR, and are cropped to 224×224×1 in correspondence with the remote sensing images. In this embodiment, the remote sensing images in D1 have a resolution of 0.8 m and the DEM has a resolution of 2 m.

In step (31), the number of training epochs is set to Epoch = 50, and the number of sample data pairs fed into the landslide identification network per step to Batch_size = 32. During model training, the Lion optimizer learning rate is set to 10^-5 for the first 10 epochs and 10^-6 for the remaining epochs; β1 and β2 take the default values (0.9, 0.99), and the weight decay is 0.

The hardware of the experimental environment in this embodiment consists mainly of an i7-10700 processor, a GTX 3070 (8 GB) graphics card, and 16 GB of memory; the experimental code is built on the PyTorch environment. The experimental results use the quantities TP (true positive), FP (false positive), FN (false negative), and TN (true negative), and are reported as Precision, Recall, Accuracy, and F1-score.

TP (true positive) denotes the number of landslide images in the test set correctly identified as landslides; TN (true negative) denotes the number of non-landslide images correctly identified as non-landslides; FP (false positive) denotes the number of images in the test set that are not landslides but are identified as landslides; FN (false negative) denotes the number of landslide images incorrectly identified as non-landslides. Each experiment is repeated three times and the final result is the average. Precision, Recall, Accuracy, and F1-score are computed as follows:

Precision = TP / (TP + FP)   (5),

Recall = TP / (TP + FN)   (6),

Accuracy = (TP + TN) / (TP + TN + FP + FN)   (7),

F1 = 2 × Precision × Recall / (Precision + Recall)   (8),
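For reference, formulas (5)-(8) in code form (a straightforward sketch; the function name `metrics` is illustrative):

```python
def metrics(tp, fp, fn, tn):
    """Precision, Recall, Accuracy and F1-score per formulas (5)-(8)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, accuracy, f1
```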

Finally, to illustrate the effect of the present invention, this embodiment compares the experimental results obtained on the same data set by two methods. Method 1 is the landslide identification approach proposed by Ji Shunping et al., combining ResNet-50 with a 3D channel attention mechanism; Method 2 is the method of the present invention. The comparison of experimental results is shown in Table 1.

Table 1. Comparison of the experimental results of the comparative experiment in Embodiment 3

As Table 1 shows, with roughly the same number of model parameters, recall, accuracy, and F1 score are clearly improved.

Embodiment 4: Referring to Figures 1 to 4, on the basis of Embodiment 3 our data set is the Landslide4Sense data set. Landslide4Sense is drawn from different landslide-affected areas around the world between 2015 and 2021; all bands in the data set are resampled to a resolution of about 10 m per pixel, and each image patch is 128×128 pixels. The data set includes a training set and a test set. The training set contains 3799 image patches, each with 14 bands, including multispectral data from the Sentinel-2 satellite and landslide data from ALOS PALSAR. This experiment extracts the four channels R, G, B, and DEM: the R, G, and B channels are used to construct the remote sensing data set D1, and the DEM channel is used to construct the DEM image data set D2. The training and test sets are then divided in a 7:3 ratio, giving the data set of this embodiment.
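The channel extraction and 7:3 split described above might look like the following sketch. The band indices here are hypothetical (the actual R, G, B, and DEM positions in the 14-band Landslide4Sense layout must be taken from the dataset documentation), and a tiny random array stands in for the real 3799 patches:

```python
import numpy as np

rng = np.random.default_rng(0)
patches = rng.random((10, 128, 128, 14)).astype(np.float32)  # stand-in for the real patches
R_IDX, G_IDX, B_IDX, DEM_IDX = 3, 2, 1, 13    # hypothetical band positions
d1 = patches[..., [R_IDX, G_IDX, B_IDX]]      # remote sensing data set D1 (RGB)
d2 = patches[..., DEM_IDX:DEM_IDX + 1]        # DEM image data set D2
split = int(0.7 * len(patches))               # 7:3 train/test division
train_d1, test_d1 = d1[:split], d1[split:]
```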

Likewise, to illustrate the effect of the present invention, this embodiment selects mainstream network models such as ResNet-50, VGG-16, DenseNet-121, ViT (base), and ConvNet (tiny) as the control group. The comparison of experimental results obtained in the control experiments on this embodiment's data set is shown in Table 2.

Table 2. Comparison of the experimental results of the comparative experiment in Embodiment 4

As Table 2 shows, the present invention clearly improves recall, accuracy, and F1 score.

In summary, the present invention performs well at landslide identification, with clear improvements in accuracy and other metrics over traditional neural networks.

The above are merely preferred embodiments of the present invention and are not intended to limit it; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (3)

CN202311555261.5A, filed 2023-11-21 (priority 2023-11-21): Visual transducer landslide identification method based on DEM feature enhancement. Status: Active. Granted publication: CN117274823B (en).

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
CN202311555261.5A (CN117274823B) | 2023-11-21 | 2023-11-21 | Visual transducer landslide identification method based on DEM feature enhancement
KR1020240089945A (KR102797774B1) | 2023-11-21 | 2024-07-08 | DEM feature enhancement-based visual transform landslide identification method

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202311555261.5A (CN117274823B) | 2023-11-21 | 2023-11-21 | Visual transducer landslide identification method based on DEM feature enhancement

Publications (2)

Publication Number | Publication Date
CN117274823A | 2023-12-22
CN117274823B | 2024-01-26

Family

ID=89218086

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date
CN202311555261.5A | Active | CN117274823B (en) | 2023-11-21 | 2023-11-21

Country Status (2)

Country | Link
KR | KR102797774B1 (en)
CN | CN117274823B (en)

Citations (23)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111814844A (en)* | 2020-03-17 | 2020-10-23 | Tongji University | A Dense Video Description Method Based on Positional Coding Fusion
WO2021211587A1 (en)* | 2020-04-13 | 2021-10-21 | Carrington Charles C | Georeferencing a generated floorplan and generating structural models
CN113887515A (en)* | 2021-10-28 | 2022-01-04 | China Aero Geophysical Survey and Remote Sensing Center for Natural Resources | A method and system for remote sensing landslide recognition based on convolutional neural network
WO2022074643A1 (en)* | 2020-10-08 | 2022-04-14 | Edgy Bees Ltd. | Improving geo-registration using machine-learning based object identification
CN114581864A (en)* | 2022-03-04 | 2022-06-03 | Harbin Engineering University | Transformer-based dynamic densely aligned vehicle re-identification technology
CN114926646A (en)* | 2022-05-19 | 2022-08-19 | PLA Strategic Support Force Information Engineering University | Remote sensing image elevation extraction method fusing context information
CN115221846A (en)* | 2022-06-08 | 2022-10-21 | Huawei Technologies Co., Ltd. | A data processing method and related equipment
CN115761335A (en)* | 2022-11-17 | 2023-03-07 | Xidian University | Landslide risk point classification method based on multimodal decision fusion
WO2023030513A1 (en)* | 2021-09-05 | 2023-03-09 | 汉熵通信有限公司 | Internet of things system
CN115861591A (en)* | 2022-12-09 | 2023-03-28 | Nanjing University of Aeronautics and Astronautics | Unmanned aerial vehicle positioning method based on Transformer key texture coding matching
CN116206098A (en)* | 2023-03-21 | 2023-06-02 | Jilin University | Moon surface safety landing zone selection system and method thereof
CN116206210A (en)* | 2022-12-08 | 2023-06-02 | Liaoning Technical University | A NAS-Swin-based method for extracting agricultural greenhouses from remote sensing images
CN116206214A (en)* | 2023-03-08 | 2023-06-02 | Xidian University | A method, system, device and medium for automatically identifying landslides based on lightweight convolutional neural network and double attention
WO2023097362A1 (en)* | 2021-12-03 | 2023-06-08 | Annalise-Ai Pty Ltd | Systems and methods for analysis of computed tomography (CT) images
CN116310187A (en)* | 2023-05-17 | 2023-06-23 | China University of Geosciences (Wuhan) | A small-scale and short-period fine tidal flat modeling method
CN116309348A (en)* | 2023-02-15 | 2023-06-23 | PLA Strategic Support Force Space Engineering University | Lunar south pole impact pit detection method based on improved TransUnet network
WO2023126914A2 (en)* | 2021-12-27 | 2023-07-06 | Yeda Research And Development Co. Ltd. | Method and system for semantic appearance transfer using splicing ViT features
CN116402851A (en)* | 2023-03-17 | 2023-07-07 | North University of China | A method of infrared weak and small target tracking in complex background
CN116432019A (en)* | 2022-12-09 | 2023-07-14 | Huawei Technologies Co., Ltd. | Data processing method and related equipment
CN116524258A (en)* | 2023-04-25 | 2023-08-01 | Yunnan Normal University | A landslide detection method and system based on multi-label classification
CN116597287A (en)* | 2023-07-17 | 2023-08-15 | Yunnan Transportation Planning and Design Research Institute Co., Ltd. | Remote sensing image landslide recognition method based on deep learning method
WO2023160472A1 (en)* | 2022-02-22 | 2023-08-31 | Huawei Technologies Co., Ltd. | Model training method and related device
CN116977632A (en)* | 2023-06-30 | 2023-10-31 | Nanhu Laboratory | Landslide extraction method for improving U-Net network based on asymmetric convolution

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
DE102020209281A1 (en)* | 2020-07-23 | 2022-01-27 | Robert Bosch Gesellschaft mit beschränkter Haftung | Method and device for learning a strategy and operating the strategy
CN112668338B (en)* | 2021-03-22 | 2021-06-08 | National University of Defense Technology | Clarification question generation method, apparatus and electronic device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Chen, Xin et al. Conv-trans dual network for landslide detection of multi-channel optical remote sensing images. Frontiers in Earth Science, Vol. 11, pp. 1-14. *
Zhang Siyuan et al. Landslide segmentation algorithm based on improved Swin Transformer. Foreign Electronic Measurement Technology, Vol. 42, No. 11, pp. 49-56. *

Also Published As

Publication number | Publication date
KR102797774B1 (en) | 2025-04-22
CN117274823A (en) | 2023-12-22

Similar Documents

Publication | Publication Date | Title
CN110852316B (en) | Image tampering detection and positioning method adopting convolution network with dense structure
CN104794504B (en) | Pictorial pattern character detecting method based on deep learning
CN114266794B (en) | Pathological section image cancer region segmentation system based on full convolution neural network
CN112434672A (en) | Offshore human body target detection method based on improved YOLOv3
CN107154023A (en) | Face super-resolution reconstruction method based on generation confrontation network and sub-pix convolution
CN105512680A (en) | Multi-view SAR image target recognition method based on depth neural network
CN112215188B (en) | Traffic police gesture recognition method, device, equipment and storage medium
CN115457509A (en) | Traffic sign image segmentation algorithm based on improved space-time image convolution
CN114708255A (en) | Multi-center children X-ray chest image lung segmentation method based on TransUNet model
CN114494786A (en) | Fine-grained image classification method based on multilayer coordination convolutional neural network
CN115063786A (en) | High-order distant view fuzzy license plate detection method
CN117132811B (en) | Cross-region same-earthquake landslide automatic identification method and computer readable storage medium
CN114943894A (en) | ConvCRF-based high-resolution remote sensing image building extraction optimization method
CN114782254A (en) | Infrared image super-resolution reconstruction system and method based on edge information fusion
CN117765507A (en) | Foggy day traffic sign detection method based on deep learning
CN117809310A (en) | Port container number identification method and system based on machine learning
CN117315370A (en) | Floating object detection method, device, storage medium and equipment based on remote sensing image
CN117274823B (en) | Visual transducer landslide identification method based on DEM feature enhancement
CN115359091A (en) | An Armor Plate Detection and Tracking Method for Mobile Robots
CN114091519A (en) | An occluded pedestrian re-identification method based on multi-granularity occlusion perception
CN111986109A (en) | Remote sensing image defogging method based on full convolution network
CN112241695A (en) | Method for recognizing portrait without safety helmet and with face recognition function
CN114387170A (en) | An Image Inpainting Method to Improve the Edge Disjointness of Inpainted Regions
CN117671602A (en) | Farmland forest smoke fire prevention detection method and device based on image recognition
CN117253217A (en) | Charging station vehicle identification method and device, electronic equipment and storage medium

Legal Events

Date | Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
