




Technical Field
The present invention relates to the field of image segmentation, and in particular to an image segmentation method based on full-scale fusion and flow field attention.
Background Art
In recent years, deep-learning-based segmentation methods have made remarkable progress on image segmentation tasks. Their success stems from the powerful image feature extraction capability of deep neural networks, which process image feature information to produce refined segmentation results. Deep neural network models include convolutional neural networks (CNNs), multilayer perceptrons (MLPs), and Transformers. Convolutional neural networks have already achieved notable results in image segmentation, for example the fully convolutional network (FCN), U-Net, and SegNet. Among them, U-Net is a representative image segmentation network: in the encoding stage it extracts image features with a hierarchical CNN, and in the decoding stage it uses deconvolution and skip connections to fuse encoder and decoder features interactively, yielding a good segmentation prediction map. Building on U-Net, researchers have proposed Attention U-Net, which incorporates an attention mechanism, and DFM, which adds feature refinement; both improve segmentation accuracy. However, the inherent locality of the convolution operation leaves CNN-based methods unable to model global feature correlations. Transformers and multilayer perceptrons can capture global image information through self-attention and fully connected networks, extracting features that are difficult for CNNs to obtain. For example, TransUNet replaces the deepest features of the U-Net encoder with features extracted by a ViT, and UNetXt, which integrates convolution and multilayer perceptrons in a medical image segmentation network, improves prediction accuracy.
However, most current deep-learning-based image segmentation methods still have shortcomings. First, they impose restrictions on the size of the input image: most Transformer-based methods require training and test images of regular, square size, whereas widely collected real image data are mostly of arbitrary size, and resizing them introduces deformation and distortion that degrade the segmentation result. Second, the image feature information mined by a single type of network in the encoding and decoding stages may be insufficient: multilayer perceptrons extract global feature information, while the strength of convolutional networks lies in local information, so reasonably combining different types of networks can yield more comprehensive feature information and better network performance. Third, the simple skip connection structure of U-Net struggles to fuse effectively the coarse-grained and fine-grained information carried by the skip features at different scales, and the semantic differences between these features limit the segmentation performance of the network.
An image segmentation method based on full-scale fusion and flow field attention is therefore urgently needed to solve the above problems.
Summary of the Invention
The purpose of the present invention is to provide an image segmentation method based on full-scale fusion and flow field attention, so as to improve the applicability of image segmentation to different tasks and the segmentation accuracy, while also improving the robustness of image segmentation.
To solve the above technical problems, the present invention provides an image segmentation method based on full-scale fusion and flow field attention, comprising the following steps:
acquiring an image dataset and preprocessing the data images;
building an image segmentation model with U-Net as the backbone network;
inputting the training images of the dataset into the image segmentation model for training;
tuning the model to its best performance by selecting suitable parameters and a loss function, and saving it;
inputting the validation images of the dataset into the trained image segmentation model to obtain the segmentation prediction results.
Further, the image segmentation model comprises a feature encoder, a full-scale feature fusion module, and a feature decoder connected in sequence.
Further, the data processing of the image segmentation model comprises the following steps:
the feature encoder extracts hierarchical feature information and global feature information of the image;
the full-scale feature fusion module interactively fuses the feature information of each level extracted by the feature encoder with the global feature information;
the feature decoder refines the feature maps of each scale to obtain the segmentation result.
Further, the feature encoder comprises a U-Net backbone network and a convolutional MLP module connected in sequence.
Further, the U-Net backbone network is used to obtain five feature maps F1 to F5 of the input image at different scales, and the convolutional MLP module is used to extract the global feature information of the input image and concatenate it with the bottom-level feature map F1 to obtain the fused feature T1.
Further, the full-scale feature fusion module performs feature fusion along the feature height, width, and channel dimensions to generate the fused feature T_i of each branch, where the fused feature of the i-th layer is generated as
T_i = F_i + C_{θ_i}(Cat(uc(F_1), ..., uc(F_{i-1}), F_i, dc(F_{i+1}), ..., dc(F_5)))
where C_{θ_i} denotes the i-th layer convolution operation used to adjust the number of channels of the feature map, dc and uc denote convolutional downsampling and convolutional upsampling, respectively, and Cat denotes channel-wise concatenation.
Further, the feature decoder takes the output T1 of the feature encoder and the outputs T2 to T5 of the full-scale feature fusion module as inputs stage by stage, and outputs progressively refined feature maps P1 to P5.
Further, the processing flow of the feature decoder comprises the following steps:
upsampling the feature map P_{i-1} to the same size as the fused feature T_i, concatenating the two, and obtaining a feature flow field φ by a convolution operation to guide the warping of the feature map P_{i-1};
concatenating the warped feature map P_{i-1} with T_i, feeding the result into the feature decoder, and outputting the feature P_i;
mapping the number of channels of the feature map P5 to the number of segmentation classes to obtain the final segmentation result.
Further, the loss function is constructed by fusing the GDL loss function and the cross-entropy loss function:
L = L_CE + 1.1 L_GDL;
where L_CE is the cross-entropy loss function and L_GDL is the GDL loss function.
Further, the GDL loss function and the cross-entropy loss function are, respectively:
L_GDL = 1 - 2 (Σ_{m=1}^{M} w_m Σ_n g_mn p_mn) / (Σ_{m=1}^{M} w_m Σ_n (g_mn + p_mn)),
L_CE = -(1/N) Σ_n Σ_{m=1}^{M} g_mn log p_mn;
where w_m denotes the weight of the m-th class among the M classes, g_mn denotes the ground-truth value of class m at the pixel in the n-th position, and p_mn denotes the corresponding predicted value.
Compared with the prior art, the present invention has at least the following beneficial effects:
The present invention improves the encoder and decoder structures of the traditional network. By combining network structures with specific functions and improving the network structure, it compensates for the general U-shaped network's limited ability to capture global image feature information and for the distortion introduced during the upsampling of image features, thereby improving the adaptability of the method to different segmentation tasks and the accuracy of image segmentation.
Meanwhile, the full-scale feature fusion module proposed by the present invention fuses coarse-grained and fine-grained features on the skip connections at all levels, which reduces the semantic differences between features of different scales and highlights the key feature information of the image, so that both the performance and the robustness of the network are significantly improved.
Brief Description of the Drawings
Fig. 1 is a flowchart of the image segmentation method based on full-scale fusion and flow field attention according to the present invention;
Fig. 2 is a schematic diagram of the network structure of the image segmentation model of the method;
Fig. 3 is a schematic diagram of the structure of the convolutional MLP module of the method;
Fig. 4 is a schematic diagram of a single branch of the full-scale feature fusion module of the method;
Fig. 5 is a schematic diagram of the structure of the flow field attention module of the method.
Detailed Description of the Embodiments
The image segmentation method based on full-scale fusion and flow field attention of the present invention will be described in more detail below with reference to the schematic drawings, which show preferred embodiments of the present invention. It should be understood that those skilled in the art can modify the invention described here while still achieving its advantageous effects. The following description should therefore be understood as being widely known to those skilled in the art, and not as a limitation of the present invention.
In the following paragraphs, the present invention is described more specifically by way of example with reference to the accompanying drawings. The advantages and features of the present invention will become clearer from the following description and the claims. It should be noted that the drawings are all in a greatly simplified form and use imprecise scales, and are only intended to assist in illustrating the embodiments of the present invention conveniently and clearly.
As shown in Fig. 1, an embodiment of the present invention provides an image segmentation method based on full-scale fusion and flow field attention, comprising the following steps:
acquiring an image dataset and preprocessing the data images;
building an image segmentation model with U-Net as the backbone network;
inputting the training images of the dataset into the image segmentation model for training;
tuning the model to its best performance by selecting suitable parameters and a loss function, and saving it;
inputting the validation images of the dataset into the trained image segmentation model to obtain the segmentation prediction results.
A preferred embodiment of the image segmentation method based on full-scale fusion and flow field attention applied to medical image segmentation is given below to clearly illustrate the content of the present invention. It should be clear that the content of the present invention is not limited to the following embodiment; other improvements made by those of ordinary skill in the art using conventional technical means are also within the scope of the present invention.
S100: acquire a medical image dataset and preprocess the data images.
Specifically, if the medical image dataset consists of three-dimensional images, the images can be sliced along the axial direction at 1 mm intervals to form two-dimensional slices; if the dataset consists of two-dimensional images, no slicing is performed; if the acquired brain medical images contain the skull, the skull can be removed by algorithmic processing. After preprocessing, the data images are normalized so that the input image pixels have a mean of 0 and a variance of 1, and the amount of data is expanded by random rotation, random flipping, and similar operations. The whole dataset is divided into a training set and a test set at a ratio of 6:4.
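By way of illustration only, the following is a minimal NumPy sketch of the preprocessing described above. The function names, the use of NumPy, and the restriction of rotations to multiples of 90 degrees are assumptions made for the example and are not specified in this embodiment.

```python
import numpy as np

def normalize_slice(img: np.ndarray) -> np.ndarray:
    """Normalize a 2D slice to zero mean and unit variance, as described above."""
    img = img.astype(np.float32)
    return (img - img.mean()) / (img.std() + 1e-8)

def augment(img: np.ndarray, label: np.ndarray, rng: np.random.Generator):
    """Expand the data with a random rotation and random flips."""
    k = int(rng.integers(0, 4))                  # random rotation by k * 90 degrees
    img, label = np.rot90(img, k), np.rot90(label, k)
    if rng.random() < 0.5:                       # random horizontal flip
        img, label = np.fliplr(img), np.fliplr(label)
    if rng.random() < 0.5:                       # random vertical flip
        img, label = np.flipud(img), np.flipud(label)
    return img.copy(), label.copy()

def split_dataset(items, seed=0):
    """Split the whole dataset into training and test sets at a 6:4 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(items))
    cut = int(0.6 * len(items))
    return [items[i] for i in idx[:cut]], [items[i] for i in idx[cut:]]
```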
S200: with U-Net as the backbone network, construct an image segmentation model containing the convolutional MLP module, the full-scale feature fusion module, and the flow field attention decoding module.
Specifically, as shown in Fig. 2, the image segmentation model of this method is divided into three processing stages. In the first stage, a feature encoder integrating the U-Net backbone network and the convolutional MLP module extracts the hierarchical feature information and global feature information of the image. In the second stage, the full-scale feature fusion module interactively fuses the feature information of each level extracted in the first stage and generates the feature maps used for the skip connections at each level. The third stage uses a feature decoder composed of flow field attention modules (Flow and Attention Decoding Unit, FADU) that combine a flow field transformation with an attention mechanism; the decoder refines the feature maps input at each scale and obtains the predicted segmentation result through a channel mapping.
The feature encoder consists of two parts: the U-Net backbone network and the convolutional MLP module. Given an input image I, five feature maps F1 to F5 at different scales are obtained through U-Net. At the same time, the convolutional MLP module is used to extract the global feature information of the input image I, which is concatenated with the bottom-level feature map F1 to obtain T1. As shown in Fig. 3, the convolutional MLP module contains one convolutional block composed of 3×3 convolutional downsampling, 3×3 depthwise separable convolution, and pooling downsampling, followed by three convolutional MLP blocks each containing an MLP and convolutional downsampling. The feature encoder thus extracts the local spatial feature information of the image while retaining its global feature information.
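The following is a minimal PyTorch sketch of the convolutional MLP branch described above. The channel widths, the realization of the MLP as two 1×1 convolutions with GELU, the residual connection, and the exact number of downsampling steps are assumptions made for illustration; the embodiment only fixes the composition of the blocks.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, ch):
        super().__init__()
        self.dw = nn.Conv2d(ch, ch, 3, padding=1, groups=ch)
        self.pw = nn.Conv2d(ch, ch, 1)
    def forward(self, x):
        return self.pw(self.dw(x))

class ConvMLPBlock(nn.Module):
    """One conv-MLP block: a channel MLP (two 1x1 convs) followed by strided-conv downsampling."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(in_ch, in_ch * 2, 1), nn.GELU(),
            nn.Conv2d(in_ch * 2, in_ch, 1),
        )
        self.down = nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1)
    def forward(self, x):
        return self.down(x + self.mlp(x))        # channel mixing, then downsample

class ConvMLP(nn.Module):
    """Global-feature branch: one convolutional block (strided 3x3 conv, depthwise
    separable conv, pooling downsampling) followed by three conv-MLP blocks."""
    def __init__(self, in_ch=1, dims=(32, 64, 128, 256)):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, dims[0], 3, stride=2, padding=1),   # 3x3 conv downsampling
            DepthwiseSeparableConv(dims[0]),                     # 3x3 depthwise separable conv
            nn.MaxPool2d(2),                                     # pooling downsampling
        )
        self.blocks = nn.Sequential(
            ConvMLPBlock(dims[0], dims[1]),
            ConvMLPBlock(dims[1], dims[2]),
            ConvMLPBlock(dims[2], dims[3]),
        )
    def forward(self, x):
        return self.blocks(self.stem(x))         # global features, later concatenated with F1
```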
The full-scale feature fusion module consists of four independent branches, each performing feature fusion at a specific feature height, width, and number of channels to generate the fused feature of its branch, i.e. the feature maps used for the skip connections at each level. The workflow of a single branch is illustrated with reference to Fig. 4: down and up denote downsampling and upsampling a feature map to the height and width of the feature F2, respectively, and null denotes no resampling. C denotes a convolution operation, here uniformly a 3×3 convolution with 64 output channels. After the Cat operation, the five input feature maps are concatenated along the channel dimension into a single feature map with 320 channels; the convolution C_{θ_2} then adjusts the number of channels to the channel size of the corresponding feature map F2, and the result is finally added to the feature map F2 to obtain the skip-connection feature T2. The fused feature T_i of the i-th layer is therefore generated as:
T_i = F_i + C_{θ_i}(Cat(uc(F_1), ..., uc(F_{i-1}), F_i, dc(F_{i+1}), ..., dc(F_5)))    (1)
where C_{θ_i} in formula (1) denotes the i-th layer convolution operation used to adjust the number of channels of the feature map to the channel size of the corresponding feature map F_i, dc and uc denote convolutional downsampling and convolutional upsampling, respectively, and Cat denotes the channel-wise concatenation operation.
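A minimal PyTorch sketch of one fusion branch is given below. The use of bilinear interpolation for resizing (the embodiment specifies convolutional up- and downsampling), the argument names, and the requirement that out_ch equal the channel count of F_i are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FullScaleFusionBranch(nn.Module):
    """One branch of the full-scale fusion module: fuses all five encoder maps
    into the skip feature T_i at the resolution and channel width of F_i."""
    def __init__(self, level, in_chs, out_ch):
        super().__init__()
        self.level = level                        # index of the target scale (0-based)
        self.reduce = nn.ModuleList(
            [nn.Conv2d(c, 64, 3, padding=1) for c in in_chs]   # 3x3 convs, 64 channels each
        )
        self.adjust = nn.Conv2d(64 * len(in_chs), out_ch, 3, padding=1)  # C_theta_i

    def forward(self, feats):
        target = feats[self.level]
        h, w = target.shape[-2:]
        resized = []
        for conv, f in zip(self.reduce, feats):
            f = conv(f)                           # reduce to 64 channels
            if f.shape[-2:] != (h, w):            # resample to the size of F_i
                f = F.interpolate(f, size=(h, w), mode="bilinear", align_corners=False)
            resized.append(f)
        fused = self.adjust(torch.cat(resized, dim=1))   # Cat (5 x 64 = 320 channels) + adjust
        return target + fused                     # residual addition gives T_i
```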
The feature decoder is composed of flow field attention modules (FADU). The decoder takes the output T1 of the feature encoder and the outputs T2 to T5 of the full-scale feature fusion module as inputs stage by stage and outputs progressively refined feature maps P1 to P5, achieving feature refinement while avoiding information redundancy. As shown in Fig. 5, the flow field attention decoding module works as follows. First, the input feature P_{i-1} is upsampled to the same size as T_i; the two are concatenated and a convolution operation produces the feature flow field φ, which guides the warping of the feature P_{i-1}. The warping formula is:
warp(P_{i-1})(p) = P_{i-1}(p_x + φ(p)_x, p_y + φ(p)_y)    (2)
where in formula (2) the subscripts x and y of p are the coordinates of the pixel; the flow field warping reduces the distortion of the feature map during upsampling. The warped feature P_{i-1} is then concatenated with T_i and fed into a 3×3 convolutional block, which outputs the feature P_i′. The convolution operation is:
P_i′ = σ(C(W_P P_{i-1}) + C(W_T T_i))    (3)
where in formula (3) σ denotes the ReLU activation function, C denotes the convolution operation, and W_P and W_T denote the weight matrices of the hidden state P_{i-1} and the skip-connection feature T_i, respectively. Since simply fusing high-level and shallow features usually introduces information redundancy and confusion, the feature P_i′ is fed into a convolutional block attention module (CBAM) to obtain P_i; the attention mechanism improves the module's efficiency in exploiting useful information. Overall, P_i can be generated layer by layer from the inputs P_{i-1} and T_i as:
P_i = FADU(P_{i-1}, T_i; φ)    (4)
where in formula (4) i takes values from 1 to 5 and P_0 is a tensor initialized to zero. At the end of the decoding process, a 3×3 convolution maps the number of channels of the feature map P5 to the number of segmentation classes, giving the final segmentation result.
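A minimal PyTorch sketch of the FADU is given below. The bilinear upsampling, the channel widths, and the simplified channel-plus-spatial attention standing in for CBAM are assumptions made for the example; the embodiment only fixes the flow-field warping of formula (2), the fusion of formula (3), and the use of CBAM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flow_warp(x, flow):
    """Warp feature map x with a per-pixel offset field `flow` (B, 2, H, W), in pixels (formula (2))."""
    b, _, h, w = x.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=x.device, dtype=x.dtype),
        torch.arange(w, device=x.device, dtype=x.dtype),
        indexing="ij",
    )
    grid_x = xs + flow[:, 0]                      # p_x + phi(p)_x
    grid_y = ys + flow[:, 1]                      # p_y + phi(p)_y
    grid = torch.stack(                           # normalise to [-1, 1] for grid_sample
        (2.0 * grid_x / max(w - 1, 1) - 1.0, 2.0 * grid_y / max(h - 1, 1) - 1.0), dim=-1
    )
    return F.grid_sample(x, grid, mode="bilinear", padding_mode="border", align_corners=True)

class SimpleCBAM(nn.Module):
    """Simplified convolutional block attention: channel attention then spatial attention."""
    def __init__(self, ch, reduction=8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())
    def forward(self, x):
        x = x * self.channel(x)
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.spatial(s)

class FADU(nn.Module):
    """Flow and Attention Decoding Unit: upsample P_{i-1}, predict a flow field from
    [P_{i-1}, T_i], warp P_{i-1}, fuse with T_i by two 3x3 convs, then apply attention."""
    def __init__(self, p_ch, t_ch, out_ch):
        super().__init__()
        self.flow = nn.Conv2d(p_ch + t_ch, 2, 3, padding=1)   # feature flow field phi
        self.conv_p = nn.Conv2d(p_ch, out_ch, 3, padding=1)   # C(W_P P)
        self.conv_t = nn.Conv2d(t_ch, out_ch, 3, padding=1)   # C(W_T T)
        self.attn = SimpleCBAM(out_ch)
    def forward(self, p_prev, t):
        p_prev = F.interpolate(p_prev, size=t.shape[-2:], mode="bilinear", align_corners=False)
        phi = self.flow(torch.cat([p_prev, t], dim=1))
        p_warp = flow_warp(p_prev, phi)                        # formula (2)
        p = F.relu(self.conv_p(p_warp) + self.conv_t(t))       # formula (3)
        return self.attn(p)                                    # P_i, formula (4)
```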
S300: input the training images of the dataset into the image segmentation model for training.
Specifically, during training the model uses the Adam optimization algorithm to update the network parameters by driving the loss function toward its minimum. The initial learning rate is set to 0.0006 and the weight decay to 0.0005. The batch size during training is set to 1 and the total number of iterations is 30000.
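A minimal training-loop sketch with these settings is given below; it assumes a dataset yielding (image, label) pairs and a criterion implementing the fused loss of step S400, both passed in as arguments.

```python
import torch
from torch.utils.data import DataLoader

def train(model, train_dataset, criterion, total_iters=30000, device="cpu"):
    """Training loop following the settings above: Adam, initial learning rate 6e-4,
    weight decay 5e-4, batch size 1, 30000 iterations."""
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=6e-4, weight_decay=5e-4)
    loader = DataLoader(train_dataset, batch_size=1, shuffle=True)
    it = 0
    while it < total_iters:
        for image, label in loader:
            image, label = image.to(device), label.to(device)
            optimizer.zero_grad()
            loss = criterion(model(image), label)   # L = L_CE + 1.1 * L_GDL
            loss.backward()
            optimizer.step()
            it += 1
            if it >= total_iters:
                return model
    return model
```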
S400: tune the model to its best performance by selecting suitable parameters and a loss function, and save it.
Specifically, the overall performance of the image segmentation model depends not only on the design of the network structure; the loss function also plays a key role. The loss function of the network is constructed by fusing the cross-entropy loss function and the GDL loss function. The cross-entropy loss function is
L_CE = -(1/N) Σ_n Σ_{m=1}^{M} g_mn log p_mn
where g_mn denotes the ground-truth value of class m at the pixel in the n-th position and p_mn denotes the corresponding predicted value, and the GDL loss function is
L_GDL = 1 - 2 (Σ_{m=1}^{M} w_m Σ_n g_mn p_mn) / (Σ_{m=1}^{M} w_m Σ_n (g_mn + p_mn))
where w_m denotes the weight of the m-th class among the M classes. The final fused loss function is L = L_CE + 1.1 L_GDL.
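The following is a minimal PyTorch sketch of this fused loss. The class-weight choice w_m = 1 / (Σ_n g_mn)^2 is the common generalized-Dice convention and is an assumption here, since this embodiment only states that w_m is the weight of the m-th class; the epsilon terms are added for numerical stability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusedLoss(nn.Module):
    """Fusion of cross-entropy and generalized Dice loss: L = L_CE + 1.1 * L_GDL."""
    def __init__(self, gdl_weight=1.1, eps=1e-6):
        super().__init__()
        self.gdl_weight = gdl_weight
        self.eps = eps

    def forward(self, logits, target):
        # logits: (B, M, H, W); target: (B, H, W) with integer class labels
        ce = F.cross_entropy(logits, target)                       # L_CE

        num_classes = logits.shape[1]
        probs = torch.softmax(logits, dim=1)                       # p_mn
        onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()  # g_mn

        dims = (0, 2, 3)                                           # sum over batch and pixels
        w = 1.0 / (onehot.sum(dims) ** 2 + self.eps)               # class weights w_m (assumed form)
        intersect = (w * (probs * onehot).sum(dims)).sum()
        union = (w * (probs + onehot).sum(dims)).sum()
        gdl = 1.0 - 2.0 * intersect / (union + self.eps)           # L_GDL

        return ce + self.gdl_weight * gdl
```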
S500: input the validation images of the dataset into the trained image segmentation model to obtain the segmentation prediction results.
Compared with the prior art, the present invention has at least the following beneficial effects:
The present invention improves the encoder and decoder structures of the traditional network. By combining network structures with specific functions and improving the network structure, it compensates for the general U-shaped network's limited ability to capture global image feature information and for the distortion introduced during the upsampling of image features, thereby improving the adaptability of the method to different segmentation tasks and the accuracy of image segmentation.
Meanwhile, the full-scale feature fusion module proposed by the present invention fuses coarse-grained and fine-grained features on the skip connections at all levels, which reduces the semantic differences between features of different scales and highlights the key feature information of the image, so that both the performance and the robustness of the network are significantly improved.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.