CN118537733A - Remote sensing image road extraction method based on feature consistency perception - Google Patents


Info

Publication number
CN118537733A
Authority
CN
China
Prior art keywords
road
information
feature
loss
consistency
Prior art date
Legal status
Pending
Application number
CN202410671607.6A
Other languages
Chinese (zh)
Inventor
吴艳兰
王彪
杨辉
侯恩兵
任光耀
Current Assignee
Anhui Second Surveying And Mapping Institute
Anhui University
Original Assignee
Anhui Second Surveying And Mapping Institute
Anhui University
Application filed by Anhui Second Surveying And Mapping Institute, Anhui University
Priority to CN202410671607.6A
Publication of CN118537733A
Status: Pending

Abstract

The invention belongs to the technical field of photogrammetric data processing, and specifically relates to a remote sensing image road extraction method based on feature consistency perception, comprising the following steps: dataset acquisition, sample set preparation, initial road extraction network structure design, feature enhancement and feature consistency perception module, loss function optimization of the model, and model result evaluation. The invention designs a feature enhancement and feature consistency perception network, which consists of an initial road extraction network and a feature enhancement and feature consistency perception module. By combining the initial road information with multi-scale feature information, the network makes better use of multi-scale information, reduces the loss of feature information, and improves network performance. The method enhances the consistency of road features, enriches road feature information, alleviates the road disconnection and fragmentation caused by building shadows or other ground cover, and strengthens the connectivity and integrity of the road extraction results.

Description

Translated from Chinese
Road extraction method from remote sensing images based on feature consistency perception

TECHNICAL FIELD

The invention belongs to the technical field of photogrammetry data processing, and in particular relates to a remote sensing image road extraction method based on feature consistency perception.

BACKGROUND ART

Extracting road information from remote sensing images is becoming increasingly important in fields such as urban design, lane detection, and the updating of geographic information databases. In remote sensing images, roads are affected by shadows, vegetation, and occlusion by buildings and vehicles. These problems lead to discontinuous roads and missed narrow roads when deep learning methods are used for road extraction.

Commonly used classification algorithms based on manual feature extraction can be divided into threshold segmentation, edge detection, and machine learning. Threshold segmentation determines an appropriate threshold for each usage scenario and then uses it to decide whether a pixel belongs to the road or the background. Edge detection uses operators such as Canny, Sobel, and Roberts to identify road boundaries. Machine learning methods rely on manually extracted features and apply techniques such as maximum likelihood, support vector machines, and Markov random fields. Because of the spectral similarity between roads, buildings, and parking lots, these methods are inefficient in complex road scenes and cannot obtain satisfactory results.

Fully convolutional networks (FCNs) provide a new approach to road extraction. However, an FCN loses feature information during downsampling and lacks a mechanism that accounts for the consistency of global feature information. As a result, FCN-based methods are affected by complex road background information and cannot use feature information effectively, leading to fragmented road extraction results and missed roads.

How to use deep learning to obtain more road context information, improve the consistency of road features, and raise road extraction accuracy through better feature utilization is a major challenge in the road extraction field at this stage.

SUMMARY OF THE INVENTION

The purpose of the present invention is to overcome the above-mentioned problems in the prior art and to provide a remote sensing image road extraction method based on feature consistency perception.

To achieve the above technical objectives and effects, the present invention is implemented through the following technical solution:

The present invention provides a remote sensing image road extraction method based on feature consistency perception, comprising the following steps:

Step 1: dataset acquisition and sample set preparation;

Step 2: initial road extraction network structure;

Step 3: feature enhancement and feature consistency perception module;

Step 4: loss function optimization of the model;

Step 5: model result evaluation.

Furthermore, step 1 comprises the following sub-steps:

1) Obtain the Massachusetts dataset, the DeepGlobal dataset, and the CHT dataset;

2) Manually annotate road labels on the remote sensing images, and crop all images and labels to a uniform size of 512×512;

3) Divide all data into a training sample set and a validation sample set at a ratio of 4:1.

Furthermore, in step 2, the initial road extraction network, CRE-Net, is designed by combining the concept of residual structures with the idea of connecting feature layers of different scales. The network performs feature extraction based on an encoder-decoder structure and combines multi-scale features during encoding and decoding to obtain richer semantic information and reduce feature loss.

Furthermore, in step 2, the encoding part consists of four residual stacking blocks, six convolutional layers, and a densely connected dilated pyramid module, and is used to capture road context information. The encoding part converts the input image into feature information of size 16×16×512, which reduces parameter computation and enlarges the receptive field. Each convolutional layer contains a BN layer and a ReLU function. A residual stacking block structure is introduced to address the vanishing gradient problem in deep networks, and a DenseASPP module is used so that the encoding structure obtains richer feature information. DenseASPP connects a group of convolutional layers in a densely connected manner, effectively generating dense spatial sampling and enlarging the receptive field, thereby obtaining multi-scale road feature information, reducing the loss of context information, and better preserving road features.

Furthermore, in step 2, the decoding part consists of four convolutional layers and five residual stacking blocks, and is used to obtain the road segmentation result. The decoding structure converts the output features of the encoding structure into a feature map of size 512×512×16. During the upsampling process of the decoding part, feature information of different scales is fused: the 16×16×512 features of the deepest layer are converted to a size of 256×256×32 and then fused with the same-scale features from the encoding process. This operation weakens the impact of information loss in the network and integrates the spatial details of road features into the semantic information, which facilitates the extraction of road information from the image. CRE-Net converts the input remote sensing image into feature information and captures more contextual information about roads, thereby performing a preliminary extraction of road features and preparing feature information at different scales for the subsequent FECP module.

Furthermore, in step 3: during the downsampling process of the initial road extraction network, feature information is lost, leaving insufficient context information for road feature extraction. The design of the road extraction network therefore considers the extraction of road context information while minimizing information loss. To this end, the present invention designs a feature enhancement and consistency perception module (FECP module). In the FECP module, to reduce the loss of feature information, the 512×512×16 feature information is first downsampled to 256×256×32, matching the size of the previous feature layer. These two sets of features are then fused to obtain feature information of size 256×256×32, denoted p1. To include more spatial information, p1 is merged with downsampled feature information from CRE-Net to obtain a size of 128×128×64, denoted p2; p2 contains not only rich semantic information but also the spatial information of the road. To further enrich the semantic information, p2 is downsampled to 64×64×64 and convolved with feature information of the same scale, finally yielding feature information of size 64×64×64, denoted p3. The fused multi-scale feature p3 serves as the output of the module, denoted pc, through which the road segmentation result is obtained.

Furthermore, in step 4, to address the problem of insufficient road background information in high-resolution remote sensing images, the present invention adopts two different types of loss functions in the CRE-Net and FECP modules. The loss function used in CRE-Net is called Loss1, and the loss function in the FECP module is called Loss2; both are composed of binary cross-entropy, the Dice coefficient, and an IoU loss. During the training of FECP-Net, Loss1 and Loss2 are combined with different weights to highlight road features, enhance the consistency of the extracted roads, repair the preliminary extraction results, and improve the integrity of the road extraction results. The specific operations are as follows:

Loss1 = W1*BCE_loss + W2*DICE_loss + W3*IoU_loss (1)

Loss2 = W4*BCE_loss + W5*DICE_loss + W6*IoU_loss (2)

Loss = Loss1 + Loss2 (3)

BCE_loss(SP, SL) = -W(SL*ln(SP) + (1-SL)*ln(1-SP)) (4)

where SP denotes the predicted value, SL the label, SI the intersection of the prediction and the label, and SU their union. In practice, W1 = W2 = 0.4, W3 = 0.2, W4 = W5 = 0.2, and W6 = 0.6. This weight allocation highlights the importance of road information in remote sensing images, improves the distinction between roads and background, and improves the model's road extraction ability.

Furthermore, in step 5, three indicators, recall, IoU, and F1 score, are selected to evaluate the performance of the model in the road extraction process. Recall and IoU are used to evaluate the quality of the road extraction results, the latter mainly assessing the shape and area of the extracted roads. Recall is the proportion of correctly predicted road pixels among all actual road pixels; the F1 score is defined as the harmonic mean of precision and recall; IoU is the ratio of the intersection of the ground truth and the prediction to their union. The road extraction task is a binary classification problem in which the prediction results are divided into road and background. The evaluation indicators are computed as follows:

Recall = TP/(TP+FN)

Precision = TP/(TP+FP)

F1 = 2*Precision*Recall/(Precision+Recall)

IoU = TP/(TP+FP+FN)

The beneficial effects of the present invention are:

1. The present invention provides a remote sensing image road extraction method based on feature consistency perception. The feature consistency perception network consists of two parts: an initial road extraction network (CRE-Net) and a feature enhancement and feature consistency perception module (FECP module). CRE-Net obtains the initial information and features of the roads, while the FECP module combines the coarse road information with multi-scale road feature information to minimize the loss of feature information and improve the network's road extraction performance. FECP-Net enhances the consistency of road features, enriches road feature information, alleviates the road disconnection and fragmentation caused by building shadows or occlusion by other ground objects, and strengthens the connectivity and integrity of the road extraction results.

2. The FECP module introduced by the present invention merges feature information of different scales; by combining the initial road information with multi-scale feature information, it makes better use of multi-scale information, reduces feature information loss, and improves network performance.

3. The loss function designed by the present invention highlights the difference between road boundary information and background information, enabling the network to better identify road features.

BRIEF DESCRIPTION OF THE DRAWINGS

To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

FIG. 1 is a diagram of the overall model structure of an embodiment of the present invention;

FIG. 2 shows the road extraction results of different models according to an embodiment of the present invention.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.

Referring to FIG. 1 and FIG. 2, this embodiment provides a remote sensing image road extraction method based on feature consistency perception, comprising the following steps:

Step 1: dataset acquisition and sample set preparation;

Step 2: initial road extraction network structure;

Step 3: feature enhancement and feature consistency perception module;

Step 4: loss function optimization of the model;

Step 5: model result evaluation.

Step 1 specifically includes the following:

1) Obtain the Massachusetts dataset, the DeepGlobal dataset, and the CHT dataset;

2) Manually annotate road labels on the remote sensing images, and crop all images and labels to a uniform size of 512×512;

3) Divide all data into a training sample set and a validation sample set at a ratio of 4:1.
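The tiling and 4:1 split above can be sketched as follows. This is a minimal illustration: the patent does not specify whether crops overlap, how edge remainders are handled, or whether the split is shuffled, so `tile_coords` and `split_4_to_1` are assumed helpers, not the authors' code.

```python
import random

TILE = 512  # crop size stated in the patent

def tile_coords(width, height, tile=TILE):
    """Top-left corners of non-overlapping tile×tile crops that fit fully inside the image."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]

def split_4_to_1(samples, seed=0):
    """Shuffle and split samples into training and validation sets at a 4:1 ratio (assumed random split)."""
    rng = random.Random(seed)
    samples = samples[:]
    rng.shuffle(samples)
    n_train = len(samples) * 4 // 5
    return samples[:n_train], samples[n_train:]

# e.g. a hypothetical 1500×1500 scene yields 2×2 = 4 full 512×512 tiles
coords = tile_coords(1500, 1500)
train, val = split_4_to_1(list(range(100)))
```

Cropping label masks with the same `tile_coords` keeps each image tile aligned with its label tile.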

Step 2 specifically includes the following:

This embodiment designs the initial road extraction network, CRE-Net, by combining the concept of residual structures with the idea of connecting feature layers of different scales. The network performs feature extraction based on an encoder-decoder structure and combines multi-scale features during encoding and decoding to obtain richer semantic information and reduce feature loss.

The encoding part consists of four residual stacking blocks, six convolutional layers, and a densely connected dilated pyramid module, and is used to capture road context information. It converts the input image into feature information of size 16×16×512, which reduces parameter computation and enlarges the receptive field. Each convolutional layer contains a BN layer and a ReLU function. In addition, a residual stacking block structure is introduced to address the vanishing gradient problem in deep networks. To allow the encoding structure to obtain richer feature information, a DenseASPP module is also used. DenseASPP connects a group of convolutional layers in a densely connected manner, effectively generating dense spatial sampling and enlarging the receptive field, thereby obtaining multi-scale road feature information, reducing the loss of context information, and better preserving road features.

The decoding part consists of four convolutional layers and five residual stacking blocks, and aims to obtain the road segmentation result. It converts the output features of the encoding structure into a feature map of size 512×512×16. During the upsampling process, feature information of different scales is fused: for example, the 16×16×512 features of the deepest layer are converted to a size of 256×256×32 and then fused with the same-scale features from the encoding process. This operation weakens the impact of information loss in the network and integrates the spatial details of road features into the semantic information, which facilitates the extraction of road information from the image. The purpose of CRE-Net is to convert the input remote sensing image into feature information and capture more contextual information about roads, thereby performing a preliminary extraction of road features and preparing feature information at different scales for the subsequent FECP module.
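The shape arithmetic of the decoder path described above can be traced with a small sketch. Only tensor shapes are tracked; the actual layer composition (kernel sizes, number of residual blocks per stage) is not specified at this level of detail, so `upsample`, `conv`, and `fuse` are illustrative stand-ins.

```python
def upsample(shape, factor):
    """Spatial upsampling: (H, W, C) -> (H*factor, W*factor, C)."""
    h, w, c = shape
    return (h * factor, w * factor, c)

def conv(shape, out_channels):
    """A same-padding convolution changes only the channel count."""
    h, w, _ = shape
    return (h, w, out_channels)

def fuse(a, b):
    """Fusing encoder and decoder features requires matching spatial scale."""
    assert a[:2] == b[:2], "features to fuse must share spatial size"
    return a

# decoder path per the patent: 16×16×512 -> 256×256×32, fused with the
# same-scale encoder features, then brought to the 512×512×16 output map
bottleneck = (16, 16, 512)
d = conv(upsample(bottleneck, 16), 32)   # 256×256×32
d = fuse(d, (256, 256, 32))              # skip connection from the encoder
out = conv(upsample(d, 2), 16)           # 512×512×16
```

The `assert` inside `fuse` captures why the 16×16×512 features must be resized before they can be combined with encoder features.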

Step 3 specifically includes the following:

During the downsampling process of the initial road extraction network, feature information is lost, leaving insufficient context information for road feature extraction. The design of the road extraction network therefore considers the extraction of road context information while minimizing information loss. To this end, this embodiment designs a feature enhancement and consistency perception module (FECP module). In the FECP module, to reduce the loss of feature information, the 512×512×16 feature information is first downsampled to 256×256×32, matching the size of the previous feature layer. These two sets of features are then fused to obtain feature information of size 256×256×32, denoted p1. To include more spatial information, p1 is merged with downsampled feature information from CRE-Net to obtain a size of 128×128×64, denoted p2; p2 contains not only rich semantic information but also the spatial information of the road. To further enrich the semantic information, p2 is downsampled to 64×64×64 and convolved with feature information of the same scale, finally yielding feature information of size 64×64×64, denoted p3. The fused multi-scale feature p3 serves as the output of the module, denoted pc, through which the road segmentation result is obtained.
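The p1 → p2 → p3 fusion chain above amounts to repeated stride-2 downsampling with channel changes; a shape-only sketch makes the sizes explicit. The `downsample` helper is an assumption standing in for the unspecified downsample-plus-fuse layers.

```python
def downsample(shape, out_channels):
    """Stride-2 downsampling: halve the spatial size, set the channel count."""
    h, w, _ = shape
    return (h // 2, w // 2, out_channels)

# FECP fusion chain as described in the patent; shapes are (H, W, C)
f0 = (512, 512, 16)         # CRE-Net output features
p1 = downsample(f0, 32)     # 256×256×32, fused with the previous feature layer
p2 = downsample(p1, 64)     # 128×128×64, merged with downsampled CRE-Net features
p3 = downsample(p2, 64)     # 64×64×64, convolved with same-scale features
pc = p3                     # module output used for the final segmentation
```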

Step 4 specifically includes the following:

To address the problem of insufficient road background information in high-resolution remote sensing images, this embodiment uses two different types of loss functions in the CRE-Net and FECP modules. The loss function used in CRE-Net is called Loss1, and the loss function in the FECP module is called Loss2. Both are composed of binary cross-entropy, the Dice coefficient, and an IoU loss. During the training of FECP-Net, Loss1 and Loss2 are combined with different weights to highlight road features, enhance the consistency of the extracted roads, repair the preliminary extraction results, and improve the integrity of the road extraction results. The specific operations are as follows:

Loss1 = W1*BCE_loss + W2*DICE_loss + W3*IoU_loss (1)

Loss2 = W4*BCE_loss + W5*DICE_loss + W6*IoU_loss (2)

Loss = Loss1 + Loss2 (3)

BCE_loss(SP, SL) = -W(SL*ln(SP) + (1-SL)*ln(1-SP)) (4)

where SP denotes the predicted value, SL the label, SI the intersection of the prediction and the label, and SU their union. In practice, W1 = W2 = 0.4, W3 = 0.2, W4 = W5 = 0.2, and W6 = 0.6. This weight allocation highlights the importance of road information in remote sensing images, improves the distinction between roads and background, and improves the model's road extraction ability.
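Equations (1)-(4) with the stated weights can be sketched as plain functions over flat pixel lists. Per-pixel averaging in the BCE term, W = 1, and the epsilon smoothing in the Dice and IoU terms are assumptions the patent does not spell out.

```python
import math

def bce(pred, label, w=1.0, eps=1e-7):
    """Weighted binary cross-entropy, averaged over pixels (equation (4); W=1 assumed)."""
    n = len(pred)
    return -w * sum(l * math.log(p + eps) + (1 - l) * math.log(1 - p + eps)
                    for p, l in zip(pred, label)) / n

def dice_loss(pred, label, eps=1e-7):
    """1 - Dice coefficient; eps avoids division by zero on empty masks."""
    inter = sum(p * l for p, l in zip(pred, label))
    return 1 - (2 * inter + eps) / (sum(pred) + sum(label) + eps)

def iou_loss(pred, label, eps=1e-7):
    """1 - intersection-over-union of the soft prediction and the label."""
    inter = sum(p * l for p, l in zip(pred, label))
    union = sum(pred) + sum(label) - inter
    return 1 - (inter + eps) / (union + eps)

def total_loss(pred1, pred2, label):
    """Loss = Loss1 + Loss2 with W1=W2=0.4, W3=0.2, W4=W5=0.2, W6=0.6 from the patent."""
    loss1 = 0.4 * bce(pred1, label) + 0.4 * dice_loss(pred1, label) + 0.2 * iou_loss(pred1, label)
    loss2 = 0.2 * bce(pred2, label) + 0.2 * dice_loss(pred2, label) + 0.6 * iou_loss(pred2, label)
    return loss1 + loss2
```

`pred1` and `pred2` stand for the CRE-Net and FECP outputs respectively; a perfect prediction drives the total toward zero, while a fully wrong one is heavily penalized by all three terms.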

Step 5 specifically includes the following:

Three indicators, recall, IoU, and F1 score, are selected to evaluate the performance of the model in the road extraction process. Recall and IoU are used to evaluate the quality of the road extraction results, the latter mainly assessing the shape and area of the extracted roads. Recall is the proportion of correctly predicted road pixels among all actual road pixels; the F1 score is defined as the harmonic mean of precision and recall; IoU is the ratio of the intersection of the ground truth and the prediction to their union. The road extraction task in this embodiment is a binary classification problem in which the prediction results are divided into road and background. The evaluation indicators are computed as follows:

Recall = TP/(TP+FN)

Precision = TP/(TP+FP)

F1 = 2*Precision*Recall/(Precision+Recall)

IoU = TP/(TP+FP+FN)
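The three metrics follow directly from pixel counts; a minimal sketch over flat 0/1 mask lists (it assumes at least one positive in both masks, since the degenerate all-background case would divide by zero):

```python
def metrics(pred, label):
    """Recall, IoU and F1 for binary masks given as flat 0/1 lists."""
    tp = sum(1 for p, l in zip(pred, label) if p and l)
    fp = sum(1 for p, l in zip(pred, label) if p and not l)
    fn = sum(1 for p, l in zip(pred, label) if not p and l)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return recall, iou, f1
```

For example, a prediction sharing half its road pixels with the label yields recall 0.5, IoU 1/3, and F1 0.5.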

A specific application of this embodiment is as follows:

The Massachusetts, DeepGlobal, and CHT datasets were obtained, manually annotated, and cropped, yielding 10854, 24904, and 12908 samples of size 512×512, respectively. All samples were divided into training and validation sets at a ratio of 4:1. TensorFlow 1.13 was used as the deep learning framework, the development platform was JetBrains PyCharm 2020, and the programming language was Python 3.6. All models were run on a computer equipped with an Intel Xeon Gold 6148 CPU and an NVIDIA Tesla V100-PCIE graphics card. FECP-Net used Adam as the adaptive network optimizer, with 512×512 input images, a batch size of 4, 80 training epochs, 3000 iterations per epoch, and an initial learning rate of 1e-3. During training, the learning rate automatically decreases as the number of epochs increases, helping the model converge faster.
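The training budget and the decaying learning rate can be made concrete with a small sketch. The patent states only that the rate decreases with epochs; the exponential schedule and its decay factor below are assumptions, not the authors' actual schedule.

```python
BASE_LR, EPOCHS, ITERS_PER_EPOCH, BATCH = 1e-3, 80, 3000, 4

def lr_schedule(epoch, base_lr=BASE_LR, decay=0.95):
    """Assumed exponential decay from the initial rate of 1e-3."""
    return base_lr * decay ** epoch

# total image presentations over the run stated in the patent
total_images = EPOCHS * ITERS_PER_EPOCH * BATCH  # 80 * 3000 * 4
```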

The accuracy verification results of the different models are compared in Table 1:

Table 1. Comparison of accuracy verification results of different models

Compared with the other networks, the IoU of the method in this embodiment is improved by 14.82%, 11.31%, and 11.08%, recall by 9.76%, 7.38%, and 7.20%, and F1 score by 8.12%, 8.56%, and 5.10%, respectively. As shown in FIG. 2, although DGRN, D-LinkNet, and U-Net can extract the basic outline of the roads, FECP-Net produces smoother, more complete extraction results that are closer to the ground truth. In addition, FECP-Net also improves on each accuracy indicator compared with the other methods, further verifying its feasibility and superiority for road extraction tasks.

The preferred embodiments of the present invention disclosed above are intended only to help explain the present invention. The preferred embodiments neither describe all the details exhaustively nor limit the invention to the specific implementations described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were selected and specifically described in order to better explain the principles and practical applications of the present invention, so that those skilled in the art can understand and use the present invention well. The present invention is limited only by the claims and their full scope and equivalents.

Claims (8)

4. The method for extracting a road from a remote sensing image based on feature consistency perception according to claim 3, wherein in step 2 the encoding part consists of four residual stacking blocks, six convolutional layers, and one densely connected dilated pyramid module, and is used to capture road context information; the encoding part converts the input image into feature information of size 16×16×512, so that parameter computation is reduced and the receptive field is enlarged; each convolutional layer contains a BN layer and a ReLU function; a residual stacking block structure is introduced to address the vanishing gradient problem in deep learning networks, and a DenseASPP module is used to enable the encoding structure to obtain richer feature information; DenseASPP connects a group of convolutional layers in a densely connected manner, effectively generating dense spatial sampling and enlarging the receptive field, thereby obtaining multi-scale road feature information, reducing the loss of context information, and better preserving road features.
5. The method for extracting roads from remote sensing images based on feature consistency perception according to claim 4, wherein in the second step, the decoding part consists of four convolution layers and five residual stacking blocks, and is used to obtain the road segmentation result; the decoding structure converts the output features of the encoding structure into feature maps of size 512×512×16; during upsampling in the decoding part, feature information at different scales is fused: the 16×16×512 features of the deepest layer are converted into 256×256×32 and then fused with the feature information of the same scale from the encoding process, which weakens the influence of information loss in the network and, at the same time, fuses the spatial details of the road features into the semantic information, aiding the extraction of road information from the image; CRE-Net converts the input remote sensing image into feature information and captures more context information about the road, thereby preliminarily extracting road feature information and preparing feature information at different scales for the subsequent FECP module.
6. The method for extracting roads from remote sensing images based on feature consistency perception according to claim 5, wherein in the third step, the network module of the feature enhancement and consistency perception (FECP) module is designed; in the FECP module, to reduce the loss of feature information, the feature information of size 512×512×16 is first downsampled to 256×256×32, making its size consistent with that of the previous feature layer; the two sets of feature information are then fused to obtain feature information of size 256×256×32, denoted p1; to include more spatial information in the features, p1 is combined with the downsampled feature information from CRE-Net to yield a size of 128×128×64, denoted p2; p2 contains not only rich semantic information but also the spatial information of the road; to further enrich the semantic information in the features, p2 is downsampled to 64×64×128 and convolved with the feature information of the same scale, finally yielding feature information of size 64×64×128, denoted p3; finally, the fused multi-scale feature p3 is taken as the output of the module, denoted pc, from which the segmentation result of the road is obtained.
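The fusion path of the FECP module can be checked as pure shape arithmetic. The sketch below only models tensor shapes (H, W, C) following the sizes stated in the claim; the names p1, p2, p3 mirror the claim, while `downsample` and `fuse` are hypothetical stand-ins (a stride-2 convolution that doubles channels, and a same-scale merge) rather than the patented operations themselves.

```python
def downsample(shape):
    """Halve the spatial size and double the channels (assumed stride-2 conv)."""
    h, w, c = shape
    return (h // 2, w // 2, c * 2)

def fuse(a, b):
    """Fusing two feature maps requires a shared scale and keeps it (assumed)."""
    assert a == b, "feature maps must share a scale before fusion"
    return a

f0 = (512, 512, 16)                      # initial feature map from CRE-Net
p1 = fuse(downsample(f0), (256, 256, 32))  # fuse with the previous feature layer
p2 = fuse(downsample(p1), (128, 128, 64))  # add spatial info from CRE-Net features
p3 = fuse(downsample(p2), (64, 64, 128))   # convolve with same-scale features
print(p1, p2, p3)
```

Running the sketch reproduces the progression 256×256×32, 128×128×64, 64×64×128 described in the claim, confirming that each fusion step operates on scale-matched feature maps.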
8. The method for extracting roads from remote sensing images based on feature consistency perception according to claim 7, wherein in the fifth step, three indices, namely recall, IoU and the F1 score, are selected to evaluate the performance of the model in the road extraction process; recall and IoU are used to evaluate the quality of the road extraction result, the latter mainly evaluating the shape and area of the extracted roads; recall represents the ratio of correctly predicted road pixels to the actual road pixels; the F1 score is defined as the harmonic mean of precision and recall; IoU denotes the ratio of the intersection of the ground truth and the prediction to their union; the road extraction task is a binary classification problem whose prediction result is divided into road and background, and the evaluation indices are calculated as follows: Recall = TP / (TP + FN); Precision = TP / (TP + FP); F1 = 2 × Precision × Recall / (Precision + Recall); IoU = TP / (TP + FP + FN), where TP, FP and FN denote the numbers of true-positive, false-positive and false-negative road pixels, respectively.
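The three metrics named in the claim follow directly from pixel-level confusion counts. The sketch below implements the standard definitions the claim describes in prose; the toy counts at the end are illustrative values, not results from the patent.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual road pixels that were correctly predicted."""
    return tp / (tp + fn)

def precision(tp: int, fp: int) -> float:
    """Fraction of predicted road pixels that are actually road."""
    return tp / (tp + fp)

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

def iou(tp: int, fp: int, fn: int) -> float:
    """Intersection over union of ground truth and prediction."""
    return tp / (tp + fp + fn)

# toy counts: 80 road pixels correctly found, 10 false alarms, 20 missed
print(recall(80, 20), iou(80, 10, 20), f1_score(80, 10, 20))
```

Note that F1 can also be written directly from the counts as 2·TP / (2·TP + FP + FN), which is algebraically identical to the harmonic-mean form above.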
Application CN202410671607.6A, filed 2024-05-28 (priority date 2024-05-28): Remote sensing image road extraction method based on feature consistency perception. Status: Pending. Publication: CN118537733A (en).


Publication: CN118537733A, published 2024-08-23.

Family ID: 92382217. Country status: CN, publication CN118537733A (en).

Cited By (2) (* cited by examiner, † cited by third party):
- CN119863711A (en)*, priority 2025-03-21, published 2025-04-22, 中国科学院空天信息创新研究院: Forest road network model construction method and device considering terrain constraint and crown shielding
- CN120495654A (en)*, priority 2025-04-27, published 2025-08-15, 耕宇牧星(北京)空间科技有限公司: Image segmentation method based on lightweight feature extraction and multi-receptive field feature interaction



Legal Events:
- PB01: Publication
- SE01: Entry into force of request for substantive examination
