CN115082496A - Image segmentation method and device - Google Patents

Image segmentation method and device
Download PDF

Info

Publication number
CN115082496A
CN115082496A
Authority
CN
China
Prior art keywords
image
sample
map
mask
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110280836.1A
Other languages
Chinese (zh)
Inventor
翟世平
陈维强
高雪松
曲磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Group Holding Co Ltd
Original Assignee
Hisense Group Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Group Holding Co Ltd
Priority to CN202110280836.1A
Publication of CN115082496A
Legal status: Pending (Current)

Abstract

The present application provides an image segmentation method and apparatus. The method includes: performing a first segmentation process on an original image to determine a first mask image of the original image; performing feature extraction on the original image and the first mask image to obtain a fusion feature map; passing the fusion feature map through a first attention mechanism to obtain an edge attention weight; passing the fusion feature map through a second attention mechanism to obtain a feathering attention weight; and obtaining a second mask image of the original image from the edge attention weight, the feathering attention weight, and the original image. In the prior art, brute-force upsampling of the image segmentation result produces obvious jagging along the edges, leaving the segmented result with incomplete and unsmooth edges. By assigning different edge attention weights and feathering attention weights to different pixels of the original image, this solution reduces the jagging of the mask image's edges and ensures the completeness, smoothness, and clarity of those edges.

Description

Translated from Chinese
An image segmentation method and apparatus

Technical Field

The present application relates to the technical field of image processing, and in particular, to an image segmentation method and apparatus.

Background Art

At present, in pursuit of personalized presentation of instances (such as portraits and objects), people usually segment the instances out of existing images (such as video frames or picture images) and then fuse the segmented instances with a new background to form a new image, thereby achieving diversified display of the instances.

Separating an instance from an image is generally achieved through a mask image. When generating a mask image, the most critical part is the handling of the edge between the instance and the rest of the image. The edges extracted by existing solutions are often incomplete and unsmooth, with noticeable burrs, which also affects the subsequent display of the instance.

In conclusion, there is an urgent need for an image segmentation method that ensures the completeness and smoothness of the edges of the mask image.

Summary of the Invention

Exemplary embodiments of the present application provide an image segmentation method and apparatus to ensure the completeness and smoothness of the edges of a mask image.

In a first aspect, an exemplary embodiment of the present application provides an image segmentation method, including:

performing a first segmentation process on an original image to determine a first mask image of the original image;

performing feature extraction on the original image and the first mask image to obtain a fusion feature map;

passing the fusion feature map through a first attention mechanism to obtain an edge attention weight;

passing the fusion feature map through a second attention mechanism to obtain a feathering attention weight; and

obtaining a second mask image of the original image from the edge attention weight, the feathering attention weight, and the original image.

In the above technical solution, a first segmentation process is performed on the original image to determine its first mask image, and feature extraction is performed on the original image and the first mask image to obtain a fusion feature map. The fusion feature map is then passed through the first attention mechanism to obtain the edge attention weight, and through the second attention mechanism to obtain the feathering attention weight. Finally, the second mask image of the original image is obtained from the edge attention weight, the feathering attention weight, and the original image. Because the prior art brute-force upsamples the image segmentation result, the edges of the result show obvious jagging, leaving them incomplete and unsmooth. By assigning different edge attention weights and feathering attention weights to different pixels of the original image, this solution achieves different feathering effects for different edge pixels, reduces the jagging of the generated mask image's edges, and thereby ensures the completeness, smoothness, and clarity of those edges.
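Read as a data flow, the five steps above can be sketched as follows. This is a minimal illustration assuming PyTorch tensors; `segment_net`, `fusion_net`, `edge_attention`, `feather_attention`, and `fuse` are hypothetical callables standing in for the networks described later, not names used by the patent.

```python
import torch

def segment_with_edge_refinement(original: torch.Tensor,
                                 segment_net, fusion_net,
                                 edge_attention, feather_attention,
                                 fuse) -> torch.Tensor:
    """Sketch of the claimed five-step flow; every module here is a placeholder."""
    first_mask = segment_net(original)                 # first segmentation -> first mask image
    fused = fusion_net(original, first_mask)           # joint feature extraction -> fusion feature map
    edge_w = edge_attention(fused)                     # first attention mechanism -> edge attention weight
    feather_w = feather_attention(fused)               # second attention mechanism -> feathering attention weight
    second_mask = fuse(original, edge_w, feather_w)    # fused with the original image -> second mask image
    return second_mask
```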

In some exemplary embodiments, performing the first segmentation process on the original image to determine the first mask image of the original image includes:

extracting features of each pixel in the original image;

determining, according to the features of each pixel, the category label corresponding to each pixel; and

determining the first mask image of the original image according to the category labels corresponding to the pixels.

In the above technical solution, each pixel can be assigned a corresponding category label according to its features, so that the first mask image of the original image can be obtained promptly and effectively based on the category labels of the pixels.
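As an illustration of this labelling step, the following minimal sketch assumes the segmentation network outputs a per-pixel class-score tensor and that label 0 denotes the background; both are assumptions, not details given in the patent.

```python
import torch

def first_mask_from_logits(class_logits: torch.Tensor) -> torch.Tensor:
    """class_logits: (C, H, W) per-pixel class scores from the segmentation network."""
    labels = class_logits.argmax(dim=0)     # category label of each pixel, shape (H, W)
    first_mask = (labels != 0).float()      # 1 for instance pixels, 0 for background (label 0 assumed)
    return first_mask

# usage sketch: two classes (background, portrait) on a 480x640 image
mask = first_mask_from_logits(torch.randn(2, 480, 640))
```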

In some exemplary embodiments, performing feature extraction on the original image and the first mask image to obtain the fusion feature map includes:

inputting the original image and the first mask image into a multi-layer perceptron network to obtain the fusion feature map.

In the above technical solution, in order to make better use of the features of the original image, the original image and the first mask image are input together into a multi-layer perceptron network for processing to obtain their fusion feature map. This provides support for subsequently obtaining more complete and smoother mask edges with higher accuracy, and thus a clearer mask image.
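The patent does not disclose the internals of the multi-layer perceptron network, so the sketch below is only one way to realize this step: a per-pixel perceptron implemented with 1x1 convolutions extracts features from the original image and the first mask image and concatenates them into the fusion feature map; the channel widths are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusionMLP(nn.Module):
    """Per-pixel feature extraction for the original image and the first mask image,
    followed by channel-wise concatenation into a fusion feature map."""
    def __init__(self, feat_ch: int = 32):
        super().__init__()
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, feat_ch, kernel_size=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, kernel_size=1), nn.ReLU(),
        )
        self.mask_branch = nn.Sequential(
            nn.Conv2d(1, feat_ch, kernel_size=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, kernel_size=1), nn.ReLU(),
        )

    def forward(self, image: torch.Tensor, first_mask: torch.Tensor) -> torch.Tensor:
        img_feat = self.image_branch(image)              # (N, feat_ch, H, W)
        mask_feat = self.mask_branch(first_mask)         # (N, feat_ch, H, W)
        return torch.cat([img_feat, mask_feat], dim=1)   # fusion feature map, (N, 2*feat_ch, H, W)

# usage sketch
fused = FusionMLP()(torch.rand(1, 3, 256, 256), torch.rand(1, 1, 256, 256))  # (1, 64, 256, 256)
```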

In some exemplary embodiments, the first attention mechanism is determined as follows:

inputting the sample fusion feature map of a first sample image into an initial first attention mechanism to obtain a sample edge attention weight;

determining a first sample mask image from the first sample image and the sample edge attention weight; and

adjusting the initial first attention mechanism through the first sample mask image and the first label mask image of the first sample image until a first set condition is satisfied, to obtain the first attention mechanism.

In the above technical solution, on the basis of making better use of the features of the first sample image, the sample fusion feature map of the first sample image is input into the initial first attention mechanism, so that the sample edge attention weight can be obtained accurately. The initial first attention mechanism is then continuously updated and adjusted based on the first sample mask image and the first label mask image until a first attention mechanism satisfying the set condition is obtained, which supports subsequently determining more complete and smoother mask edges based on that mechanism. In addition, adjusting the initial first attention mechanism with a first sample mask image determined by combining the first sample image and the sample edge attention weight further improves the model accuracy of the first attention mechanism.

In some exemplary embodiments, the second attention mechanism is determined as follows:

inputting the sample fusion feature map of a second sample image into an initial second attention mechanism to obtain a sample feathering attention weight;

determining a second sample mask image from the second sample image and the sample feathering attention weight; and

adjusting the initial second attention mechanism through the second sample mask image and the second label mask image of the second sample image until a second set condition is satisfied, to obtain the second attention mechanism.

In the above technical solution, on the basis of making better use of the features of the second sample image, the sample fusion feature map of the second sample image is input into the initial second attention mechanism, so that the sample feathering attention weight can be obtained accurately. The initial second attention mechanism is then continuously updated and adjusted based on the second sample mask image and the second label mask image until a second attention mechanism satisfying the set condition is obtained, which supports subsequently determining more complete and smoother mask edges based on that mechanism. In addition, adjusting the initial second attention mechanism with a second sample mask image determined by combining the second sample image and the sample feathering attention weight further improves the model accuracy of the second attention mechanism.

In some exemplary embodiments, the first attention mechanism and the second attention mechanism are determined as follows:

inputting the sample fusion feature map of a third sample image into an initial first attention mechanism to obtain a sample edge attention weight;

inputting the sample fusion feature map of the third sample image into an initial second attention mechanism to obtain a sample feathering attention weight;

determining a third sample mask image from the third sample image, the sample edge attention weight, and the sample feathering attention weight; and

adjusting the initial first attention mechanism and the initial second attention mechanism through the third sample mask image and the third label mask image of the third sample image until a third set condition is satisfied, to obtain the first attention mechanism and the second attention mechanism.

In the above technical solution, on the basis of making better use of the features of the third sample image, the sample fusion feature map of the third sample image is simultaneously input into the initial first attention mechanism and the initial second attention mechanism, so that the sample edge attention weight and the sample feathering attention weight can be obtained accurately. The initial first and second attention mechanisms are then continuously updated and adjusted based on the third sample mask image and the third label mask image until first and second attention mechanisms satisfying the set condition are obtained, which supports subsequently determining more complete and smoother mask edges based on those mechanisms. In addition, adjusting the initial mechanisms with a third sample mask image determined by combining the third sample image with the sample edge attention weight and the sample feathering attention weight further improves the model accuracy of the first and second attention mechanisms.

In some exemplary embodiments, after obtaining the second mask image of the original image, the method further includes:

performing image segmentation on the original image through the second mask image to obtain a target sub-image of the original image.

In the above technical solution, by performing image segmentation on the original image based on the second mask image, a clearer target sub-image of the original image can be obtained accurately. In addition, because the segmentation is based on the second mask image, the jagging of the target sub-image's edges is reduced and their completeness and smoothness are ensured, which makes subsequent image processing research based on the target sub-image more convenient.
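One simple reading of this segmentation step is to apply the (possibly soft) second mask to the original image channel-wise, so that edge pixels are blended rather than hard-cut; the exact compositing operation is not spelled out in the patent, so this is only an assumption.

```python
import torch

def extract_target(original: torch.Tensor, second_mask: torch.Tensor) -> torch.Tensor:
    """original: (3, H, W) image; second_mask: (H, W) with values in [0, 1].
    Returns the target sub-image with the background suppressed."""
    return original * second_mask.unsqueeze(0)
```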

In a second aspect, an exemplary embodiment of the present application provides an image segmentation apparatus, including:

a determining unit, configured to perform a first segmentation process on an original image and determine a first mask image of the original image; and

a processing unit, configured to perform feature extraction on the original image and the first mask image to obtain a fusion feature map; pass the fusion feature map through a first attention mechanism to obtain an edge attention weight; pass the fusion feature map through a second attention mechanism to obtain a feathering attention weight; and obtain a second mask image of the original image from the edge attention weight, the feathering attention weight, and the original image.

In some exemplary embodiments, the determining unit is specifically configured to:

extract features of each pixel in the original image;

determine, according to the features of each pixel, the category label corresponding to each pixel; and

determine the first mask image of the original image according to the category labels corresponding to the pixels.

In some exemplary embodiments, the processing unit is specifically configured to:

input the original image and the first mask image into a multi-layer perceptron network to obtain the fusion feature map.

In some exemplary embodiments, the processing unit is specifically configured to determine the first attention mechanism as follows:

inputting the sample fusion feature map of a first sample image into an initial first attention mechanism to obtain a sample edge attention weight;

determining a first sample mask image from the first sample image and the sample edge attention weight; and

adjusting the initial first attention mechanism through the first sample mask image and the first label mask image of the first sample image until a first set condition is satisfied, to obtain the first attention mechanism.

In some exemplary embodiments, the processing unit is specifically configured to determine the second attention mechanism as follows:

inputting the sample fusion feature map of a second sample image into an initial second attention mechanism to obtain a sample feathering attention weight;

determining a second sample mask image from the second sample image and the sample feathering attention weight; and

adjusting the initial second attention mechanism through the second sample mask image and the second label mask image of the second sample image until a second set condition is satisfied, to obtain the second attention mechanism.

In some exemplary embodiments, the processing unit is specifically configured to determine the first attention mechanism and the second attention mechanism as follows:

inputting the sample fusion feature map of a third sample image into an initial first attention mechanism to obtain a sample edge attention weight;

inputting the sample fusion feature map of the third sample image into an initial second attention mechanism to obtain a sample feathering attention weight;

determining a third sample mask image from the third sample image, the sample edge attention weight, and the sample feathering attention weight; and

adjusting the initial first attention mechanism and the initial second attention mechanism through the third sample mask image and the third label mask image of the third sample image until a third set condition is satisfied, to obtain the first attention mechanism and the second attention mechanism.

In some exemplary embodiments, the processing unit is further configured to:

after obtaining the second mask image of the original image, perform image segmentation on the original image through the second mask image to obtain a target sub-image of the original image.

In a third aspect, an embodiment of the present application provides a computing device, including at least one processor and at least one memory, where the memory stores a computer program that, when executed by the processor, causes the processor to perform the image segmentation method of any one of the first aspect above.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program executable by a computing device; when the program runs on the computing device, it causes the computing device to perform the image segmentation method of any one of the first aspect above.

Brief Description of the Drawings

To describe the technical solutions of the present application more clearly, the following briefly introduces the accompanying drawings needed in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an image segmentation system architecture provided by some embodiments of the present application;

FIG. 2 is a schematic flowchart of an image segmentation method provided by some embodiments of the present application;

FIG. 3 is a schematic structural diagram of attention optimization of the edges of instance objects in a segmented image segmentation result, provided by some embodiments of the present application;

FIG. 4 is a schematic structural diagram of an image segmentation apparatus provided by some embodiments of the present application;

FIG. 5 is a schematic structural diagram of a computing device provided by some embodiments of the present application.

Detailed Description of the Embodiments

To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.

To facilitate understanding of the embodiments of the present application, the image segmentation system architecture applicable to these embodiments is first described, taking the system structure shown in FIG. 1 as an example. The image segmentation system architecture can be applied to portrait segmentation, object segmentation, and the like, where the portrait or object may be one appearing in a video or in a picture, which is not limited in the embodiments of the present application. As shown in FIG. 1, the system architecture may include a terminal device 100 and a service device 200.

The terminal device 100 includes, but is not limited to, a terminal with data processing capability, including but not limited to electronic devices such as smartphones, tablet computers, desktop computers, and notebook computers, and may also be a terminal device such as a smart mirror (for example, a smart dressing mirror or a smart makeup mirror) or a vehicle-mounted terminal (for example, a dashboard camera or a vehicle-mounted camera).

The service device 200 has information processing and information forwarding functions and may be a single server or a server cluster. For example, the service device may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud computing, cloud functions, cloud storage, cloud communication, domain name services, security services, big data, and artificial intelligence platforms.

The terminal device 100 and the service device 200 may be communicatively connected through one or more networks. The network may be a wired network or a wireless network; for example, the wireless network may be a wireless fidelity (WiFi) network, a mobile cellular network, or another possible network, which is not limited in the embodiments of the present application.

Illustratively, the application scenario of the embodiments of the present application is described by taking the terminal device 100 being a smartphone as an example. The smartphone is equipped with an image capture apparatus (such as a phone camera), and the user takes a picture containing a portrait with the smartphone. After taking the picture, the user wants to replace its background according to his or her own needs, and therefore sends the picture and the desired background picture to the service device 200 for processing. After receiving the picture, the service device 200 segments it based on the image segmentation method to obtain a clearer segmented image (a human-body segmentation image), and then fuses the segmented image with the desired background picture; the fused image is the new picture the user wants.

It should be noted that the structure shown in FIG. 1 above is only an example. In addition to being applied in a service device as in the example above, the image segmentation method in the embodiments of the present application may also be applied directly in a terminal device, with the terminal device performing the corresponding image segmentation processing based on the method, which is not limited in the embodiments of the present application.

Based on the above description, FIG. 2 exemplarily shows the flow of an image segmentation method provided by an embodiment of the present application, which may be executed by an image segmentation apparatus.

As shown in FIG. 2, the flow specifically includes:

Step 201: Perform a first segmentation process on the original image to determine a first mask image of the original image.

Step 202: Perform feature extraction on the original image and the first mask image to obtain a fusion feature map.

Step 203: Pass the fusion feature map through the first attention mechanism to obtain an edge attention weight.

Step 204: Pass the fusion feature map through the second attention mechanism to obtain a feathering attention weight.

Step 205: Obtain a second mask image of the original image from the edge attention weight, the feathering attention weight, and the original image.

In step 201 above, a first segmentation process is performed on the original image to determine its first mask image. That is, the features of each pixel in the original image are first extracted; then, according to the features of each pixel, the category label corresponding to each pixel is determined; and the first mask image of the original image is determined according to the category labels of the pixels. Specifically, the original image is input into an image segmentation network (for example, semantic segmentation, instance segmentation, or panoptic segmentation), which assigns a category label to every pixel of the original image and outputs the segmented instance objects and the segmented image background (that is, the first mask image of the original image).

However, because existing image segmentation algorithms cannot accurately compute the edge between the instance object and the image background when segmenting image data, the edges of the segmented instance objects are incomplete, unsmooth, and unclear. The first mask image obtained after the original image is processed by the image segmentation network is therefore coarse (its edges are incomplete, unsmooth, and unclear), which affects its use in subsequent image applications. For this reason, the embodiments of the present application further optimize the edges of this coarse first mask image, thereby obtaining a complete, smooth, and clear second mask image.

Based on this, in step 202 of the embodiments of the present application, feature extraction is first performed on the original image and the first mask image to obtain a fusion feature map. That is, the original image and the first mask image are input into a multi-layer perceptron network to obtain the fusion feature map, which provides support for subsequently obtaining more complete and smoother mask edges with higher accuracy and hence a clearer mask image. Specifically, the original image and the first mask image are input together into the multi-layer perceptron network, which performs feature extraction on each of them, extracting the features of every pixel in the original image and the features of every pixel in the first mask image. The features of each pixel in the first mask image are then connected with the corresponding features of each pixel in the original image, yielding the fusion feature map of the first mask image and the original image.

In steps 203 and 204, the fusion feature map is passed through the first attention mechanism to obtain the edge attention weight, and through the second attention mechanism to obtain the feathering attention weight. When training the first and second attention mechanisms, the embodiments of the present application may use two training schemes to obtain first and second attention mechanisms that satisfy the set conditions. In the first scheme, the first attention mechanism and the second attention mechanism are trained separately and then combined. For the first attention mechanism, the sample fusion feature map of a first sample image is input into the initial first attention mechanism to obtain a sample edge attention weight, and a first sample mask image is determined from the first sample image and the sample edge attention weight; the initial first attention mechanism is then adjusted using the first sample mask image and the first label mask image of the first sample image until a first set condition is satisfied, yielding the first attention mechanism. For the second attention mechanism, the sample fusion feature map of a second sample image is input into the initial second attention mechanism to obtain a sample feathering attention weight, and a second sample mask image is determined from the second sample image and the sample feathering attention weight; the initial second attention mechanism is then adjusted using the second sample mask image and the second label mask image of the second sample image until a second set condition is satisfied, yielding the second attention mechanism. This supports subsequently determining more complete and smoother mask edges based on the first and second attention mechanisms.

In the second scheme, the first attention mechanism and the second attention mechanism are trained together. The sample fusion feature map of a third sample image is input into the initial first attention mechanism to obtain a sample edge attention weight, and into the initial second attention mechanism to obtain a sample feathering attention weight. A third sample mask image is then determined from the third sample image, the sample edge attention weight, and the sample feathering attention weight, and the initial first and second attention mechanisms are adjusted using the third sample mask image and the third label mask image of the third sample image until a third set condition is satisfied, yielding the first attention mechanism and the second attention mechanism. This likewise supports subsequently determining more complete and smoother mask edges. In addition, adjusting the initial mechanisms with a third sample mask image determined by combining the third sample image with the sample edge attention weight and the sample feathering attention weight, on the basis of making better use of the features of the third sample image, can further improve the model accuracy of the first and second attention mechanisms.

In step 205 above, the second mask image of the original image is obtained from the edge attention weight, the feathering attention weight, and the original image. That is, by fusing the edge attention weight, the feathering attention weight, and the original image, a complete, smooth, and clear second mask image is obtained. After the second mask image is obtained, the original image is segmented based on it, which accurately yields a clearer target sub-image of the original image. Moreover, because the segmentation is based on the second mask image, the jagging of the target sub-image's edges is reduced and the completeness and smoothness of the sub-image are ensured.
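The patent does not give an explicit formula for this fusion, so the following is only one plausible sketch: the original image is modulated pixel-wise by the edge and feathering attention weights and a small convolutional head predicts the refined second mask. All layer sizes and the choice of head are assumptions.

```python
import torch
import torch.nn as nn

class MaskFusion(nn.Module):
    """Assumed fusion of the original image with the two attention weight maps."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(6, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1), nn.Sigmoid(),
        )

    def forward(self, image: torch.Tensor, edge_w: torch.Tensor, feather_w: torch.Tensor) -> torch.Tensor:
        # image: (N, 3, H, W); edge_w, feather_w: (N, 1, H, W) with values in [0, 1]
        modulated = torch.cat([image * edge_w, image * feather_w], dim=1)  # (N, 6, H, W)
        return self.head(modulated)                                        # second mask, (N, 1, H, W)
```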

In view of this, the implementation of the image segmentation method in the embodiments of the present application is described in detail below.

Step 1: Capture images with an image capture device.

In the embodiments of the present application, image data may be captured by an image capture device according to the user's own needs.

Illustratively, the method of the embodiments of the present application can process single pictures or video images, and the image quality is not limited to resolutions such as 1080P or 4K. The image capture device may include a camera, a USB camera, a webcam, and the like, which is not limited in the embodiments of the present application. Of course, the image capture device may also convert and encode the captured image data into a format such as jpg or png for subsequent image processing. The captured image data may be a portrait or object in a video, or a portrait or object in a picture.

Step 2: Segment the captured image data through an image segmentation network.

In the embodiments of the present application, after the image data is captured by the image capture device, an image segmentation network (for example, semantic segmentation, instance segmentation, or panoptic segmentation) can assign a category label to every pixel of the image data, obtaining the segmented instance objects and the segmented image background (that is, the mask image of the image data). Each pixel can be classified and labeled according to the object or region it belongs to.

Illustratively, taking video portrait segmentation as an example, an image containing a portrait is input into the image segmentation network for processing. The image segmentation algorithm labels every pixel of every frame of the video with a category label and outputs the human-body contour in the image, so that the human-body contour is segmented from the image background, yielding the segmented human-body contour image and the segmented image background. For each frame there are two categories, portrait and background, and the segmented human-body contour images may cover a variety of portrait segmentation scenes, such as a single person, multiple people, and various human poses.

However, because existing image segmentation algorithms cannot accurately compute the edge between the instance object and the image background when segmenting image data, the edges of the segmented instance objects are incomplete, unsmooth, and unclear. The mask image obtained after the image data is processed by the image segmentation network is therefore coarse (its edges are incomplete, unsmooth, and unclear), which affects its use in subsequent image applications. For this reason, the embodiments of the present application further optimize the edges of this coarse mask image, which reduces the jagging of the mask edges and ensures their completeness, smoothness, and clarity.

Step 3: Perform attention optimization on the edges of the segmented instance objects.

In real application scenarios, existing image segmentation algorithms segment the edges of portraits, objects, and the like incompletely, unclearly, and unsmoothly, so the segmented portrait or object images cannot achieve the desired effect in subsequent applications. For example, in subsequent image fusion, incomplete edge segmentation of a portrait or object leads to obvious discontinuities in the fused image. Therefore, after segmenting a portrait or object, the embodiments of the present application perform attention optimization on its edges; that is, pixel-level weights are assigned to each edge pixel of the portrait or object segmentation result output by the image segmentation network, so as to suppress edge noise and obtain clearer edge data, thereby achieving more accurate (clearer) portrait or object segmentation results. This addresses the problems of incomplete, unclear, and unsmooth edge segmentation in the prior art.

It should be noted that when an existing image segmentation network with a fused attention mechanism performs attention allocation, the original image it uses is a compressed image feature map. This compressed feature map has already lost many image details, such as clothing folds and edges; in scenes where the person's clothing is very similar in color to the background, a great deal of edge noise appears and the extracted edges are incomplete. To improve the model accuracy of the attention mechanism network, the embodiments of the present application input both the segmentation result (the mask image) output by the image segmentation network and the original image into the attention mechanism network, so that the features of the original image can be better exploited when extracting the features of each edge pixel of the segmentation result. From these features an edge attention weight matrix and a feathering attention weight matrix are formed, and these matrices are then used to process each edge pixel of the original image at the pixel level, yielding a clearer segmentation result.

Referring to FIG. 3, FIG. 3 is a schematic structural diagram of attention optimization of the edges of instance objects in a segmented image segmentation result provided by an embodiment of the present application. As shown in FIG. 3, after the captured image data is segmented by the image segmentation network to obtain a coarse segmentation result (the segmented instance objects and the segmented image background), the coarse segmentation result (the mask image) and the corresponding original image are input together into a multi-layer perceptron network (MLP), yielding the fusion feature map of the coarse segmentation result and the original image. That is, the MLP performs feature extraction on the coarse segmentation result and on the original image, extracting the features of each pixel in each; the features of each pixel in the coarse segmentation result are then connected with the corresponding features of each pixel in the original image to obtain the fusion feature map. The fusion feature map is then fed simultaneously into the first attention mechanism and the second attention mechanism: the first attention mechanism produces the edge attention weight matrix and the second attention mechanism produces the feathering attention weight matrix. Finally, the edge attention weight matrix, the feathering attention weight matrix, and the original image are fused; that is, each edge pixel of the original image is processed at the pixel level based on the two weight matrices, yielding a clearer (complete, smooth, and clear) segmentation result.
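The internals of the two attention branches in FIG. 3 are not disclosed, so the sketch below only illustrates the interface: each branch maps the fusion feature map to a per-pixel weight matrix in [0, 1]. The 1x1-convolution design and channel widths are assumptions; the edge branch and the feathering branch would be two separate instances.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Pixel-wise attention branch: fusion feature map -> per-pixel weight matrix."""
    def __init__(self, in_ch: int = 64, hidden_ch: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, hidden_ch, kernel_size=1), nn.ReLU(),
            nn.Conv2d(hidden_ch, 1, kernel_size=1), nn.Sigmoid(),
        )

    def forward(self, fused_features: torch.Tensor) -> torch.Tensor:
        return self.net(fused_features)   # (N, 1, H, W) attention weight matrix

# two branches fed with the same fusion feature map, as in FIG. 3
edge_attention = SpatialAttention(in_ch=64)
feather_attention = SpatialAttention(in_ch=64)
```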

In addition, when training the first and second attention mechanisms, the embodiments of the present application may use two training schemes to obtain first and second attention mechanisms that satisfy the set conditions (that is, training continues until the first or second attention mechanism converges, or until the number of training iterations reaches a preset number of rounds). In the first scheme, the first attention mechanism and the second attention mechanism are trained separately and then combined. In the second scheme, the first and second attention mechanisms are trained together, and the trained mechanisms are then used to process the fusion feature map.

Specifically, the first training scheme is as follows. For the first attention mechanism: a first sample image is obtained and segmented by the image segmentation network to obtain its mask image. The mask image of the first sample image and the first sample image are then input together into the multi-layer perceptron network, which extracts the features of every pixel in both; the features of each pixel in the mask image are connected with the corresponding features of each pixel in the first sample image to obtain the sample fusion feature map of the first sample image and its mask image. The sample fusion feature map is then input into the initial first attention mechanism to obtain the sample edge attention weight, and the first sample image and the sample edge attention weight are fused to determine the first sample mask image. Finally, the initial first attention mechanism is updated through the loss function between the first sample mask image and the first label mask image of the first sample image until the first set condition is satisfied (until the initial first attention mechanism converges or its number of training iterations reaches the preset number of rounds), yielding the first attention mechanism.
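A hedged sketch of this first training scheme for the first attention mechanism is given below. `fusion_mlp` and `edge_attention` stand for the assumed modules sketched earlier; `samples` is assumed to yield (image, coarse mask, label mask) tensors; forming the first sample mask as the product of the edge weight and the coarse mask, and using binary cross-entropy as the loss, are illustrative choices rather than details from the patent.

```python
import torch
import torch.nn.functional as F

def train_edge_attention(edge_attention, fusion_mlp, samples, epochs: int = 10, lr: float = 1e-3):
    """Only the edge attention branch is optimized against the label mask images."""
    optimizer = torch.optim.Adam(edge_attention.parameters(), lr=lr)
    for _ in range(epochs):                                # preset number of training rounds
        for image, coarse_mask, label_mask in samples:
            fused = fusion_mlp(image, coarse_mask)         # sample fusion feature map
            edge_w = edge_attention(fused)                 # sample edge attention weight
            sample_mask = edge_w * coarse_mask             # assumed first sample mask image
            loss = F.binary_cross_entropy(sample_mask, label_mask)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```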

针对第二注意力机制的训练,即,获取第二样本图,通过图像分割网络对该第二样本图进行分割处理,得到第二样本图的掩膜图。再将第二样本图的掩膜图和第二样本图一起输入到多层感知网络,多层感知网络分别对该第二样本图的掩膜图和第二样本图进行特征提取处理,提取出该第二样本图的掩膜图中各像素点的特征,以及提取第二样本图中各像素点的特征。再将该第二样本图的掩膜图中各像素点的特征与第二样本图中各像素点的特征进行对应连接,可以得到针对该第二样本图的掩膜图和第二样本图的样本融合特征图。然后将该样本融合特征图输入到初始的第二注意力机制,得到样本羽化注意力权重,并通过第二样本图和样本羽化注意力权重进行融合处理,确定出第二样本掩膜图。最后通过第二样本掩膜图和第二样本图的第二标签掩膜图之间的损失函数更新初始的第二注意力机制,直至满足第二设定条件(直至初始的第二注意力机制收敛或者初始的第二注意力机制的训练次数达到预设迭代训练轮次为止),得到第二注意力机制。For the training of the second attention mechanism, that is, obtaining a second sample image, and performing segmentation processing on the second sample image through an image segmentation network to obtain a mask image of the second sample image. Then, the mask map of the second sample map and the second sample map are input into the multi-layer perceptual network, and the multi-layer perceptual network performs feature extraction processing on the mask map and the second sample map of the second sample map respectively, and extracts Features of each pixel in the mask image of the second sample image, and extracting features of each pixel in the second sample image. Then, the features of each pixel in the mask map of the second sample map are correspondingly connected with the features of each pixel in the second sample map, and the mask map and the second sample map for the second sample map can be obtained. Sample fusion feature map. Then, the sample fusion feature map is input into the initial second attention mechanism to obtain the sample feathering attention weight, and the second sample mask map is determined by the fusion processing of the second sample image and the sample feathering attention weight. Finally, the initial second attention mechanism is updated through the loss function between the second sample mask map and the second label mask map of the second sample map until the second set condition is satisfied (until the initial second attention mechanism Convergence or the number of training times of the initial second attention mechanism reaches the preset iterative training rounds), and the second attention mechanism is obtained.

示例性地,以包含有人像的图片为例,对第一种训练方式进行介绍。针对第一注意力机制的训练,即,获取一定数量的包含有人像的图片样本集,通过图像分割网络(实例分割算法或语义分割算法等)分别对该一定数量的包含有人像的图片样本集中各人像图片样本进行分割处理,得到人像图片样本集中各人像图片样本的掩膜图,针对人像图片样本集中每个人像图片样本,将该人像图片样本和该人像图片样本的掩膜图一起输入到多层感知网络,多层感知网络分别对该人像图片样本的掩膜图和人像图片样本进行特征提取处理,提取出该人像图片样本的掩膜图中各像素点的特征,以及提取人像图片样本中各像素点的特征。再将该人像图片样本的掩膜图中各像素点的特征与人像图片样本中各像素点的特征进行对应连接,可以得到针对该人像图片样本的掩膜图和人像图片样本的样本融合特征图。然后将该样本融合特征图输入到初始的第一注意力机制,得到样本边缘注意力权重,并通过人像图片样本和样本边缘注意力权重进行融合处理,确定出第一人像图片样本掩膜图。最后通过第一人像图片样本掩膜图和人像图片样本的第一标签掩膜图之间的损失函数更新初始的第一注意力机制,直至初始的第一注意力机制收敛或者初始的第一注意力机制的训练次数达到预设迭代训练轮次为止,得到第一注意力机制。Illustratively, the first training method is introduced by taking a picture containing a person as an example. For the training of the first attention mechanism, that is, obtain a certain number of image sample sets containing human figures, and use an image segmentation network (instance segmentation algorithm or semantic segmentation algorithm, etc.) to respectively set the certain number of image sample sets containing human figures. Each portrait image sample is segmented to obtain a mask image of each portrait image sample in the portrait image sample set. For each portrait image sample in the portrait image sample set, the portrait image sample and the mask image of the portrait image sample are input into the Multi-layer perceptual network, the multi-layer perceptual network performs feature extraction processing on the mask image of the portrait image sample and the portrait image sample respectively, extracts the features of each pixel in the mask image of the portrait image sample, and extracts the portrait image sample. features of each pixel in the . Then, the characteristics of each pixel in the mask image of the portrait image sample are correspondingly connected with the characteristics of each pixel in the portrait image sample, and the mask image of the portrait image sample and the sample fusion feature map of the portrait image sample can be obtained. . Then input the sample fusion feature map into the initial first attention mechanism to obtain the sample edge attention weight, and perform fusion processing through the portrait image sample and the sample edge attention weight to determine the first portrait image sample mask map . Finally, the initial first attention mechanism is updated through the loss function between the mask image of the first portrait image sample and the first label mask image of the portrait image sample, until the initial first attention mechanism converges or the initial first attention mechanism converges. The first attention mechanism is obtained until the number of training times of the attention mechanism reaches the preset iterative training rounds.

For the training of the second attention mechanism: a certain number of picture samples containing portraits are obtained as a sample set, and each portrait picture sample in the set is segmented by the image segmentation network (an instance segmentation algorithm, a semantic segmentation algorithm, or the like) to obtain a mask image of each portrait picture sample. For each portrait picture sample in the set, the portrait picture sample and its mask image are input together into the multi-layer perceptual network, which performs feature extraction on the mask image and on the portrait picture sample respectively, extracting the feature of each pixel in the mask image of the portrait picture sample and the feature of each pixel in the portrait picture sample. The features of each pixel in the mask image are then concatenated with the corresponding pixel features of the portrait picture sample, yielding a sample fused feature map for the portrait picture sample and its mask image. The sample fused feature map is then input into the initial second attention mechanism to obtain the sample feathering attention weight, and a second portrait-sample mask image is determined by fusing the portrait picture sample with the sample feathering attention weight. Finally, the initial second attention mechanism is updated by means of the loss function between the second portrait-sample mask image and the second label mask image of the portrait picture sample, until the initial second attention mechanism converges or its number of training iterations reaches a preset number of training rounds, yielding the second attention mechanism.
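For reference only, the following PyTorch-style sketch illustrates how a single attention branch (edge or feathering) could be trained against label mask images in the manner described above. The names AttentionBranch, train_branch, seg_net and fusion_net, the BCE loss, and the rule that multiplies the weights into the coarse mask are assumptions made for illustration; the embodiments do not fix a particular architecture, loss function, or fusion rule.

```python
import torch
import torch.nn as nn


class AttentionBranch(nn.Module):
    """Hypothetical per-pixel attention head: maps fused features to weights in [0, 1]."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim // 2, kernel_size=1),
            nn.ReLU(),
            nn.Conv2d(feat_dim // 2, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, fused_feats: torch.Tensor) -> torch.Tensor:
        # fused_feats: [B, C, H, W] -> per-pixel weights: [B, 1, H, W]
        return self.head(fused_feats)


def train_branch(branch, seg_net, fusion_net, loader, epochs=10, lr=1e-4):
    """Train one attention branch (edge or feathering) against label mask images."""
    optimizer = torch.optim.Adam(branch.parameters(), lr=lr)
    loss_fn = nn.BCELoss()  # assumed loss; the embodiments only speak of "a loss function"
    for _ in range(epochs):
        for image, label_mask in loader:  # image: [B, 3, H, W], label_mask: [B, 1, H, W] floats in [0, 1]
            with torch.no_grad():
                first_mask = seg_net(image)            # first segmentation -> first mask image
                fused = fusion_net(image, first_mask)  # sample fused feature map
            weights = branch(fused)                    # sample attention weights
            # Assumed fusion of the sample and the weights; the exact rule is not specified.
            sample_mask = (weights * first_mask).clamp(0.0, 1.0)
            optimizer.zero_grad()
            loss = loss_fn(sample_mask, label_mask)
            loss.backward()
            optimizer.step()
    return branch
```

Training the feathering branch with this sketch differs only in which label mask images are used as supervision.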

The second training manner is as follows: a third sample image is obtained, and the third sample image is segmented by the image segmentation network to obtain a mask image of the third sample image. The mask image of the third sample image and the third sample image are then input together into the multi-layer perceptual network, which performs feature extraction on the mask image of the third sample image and on the third sample image respectively, extracting the feature of each pixel in the mask image of the third sample image and the feature of each pixel in the third sample image. The features of each pixel in the mask image of the third sample image are then concatenated with the corresponding pixel features of the third sample image, yielding a sample fused feature map for the third sample image and its mask image. The sample fused feature map is then input simultaneously into the initial first attention mechanism and the initial second attention mechanism to obtain the sample edge attention weight and the sample feathering attention weight, and a third sample mask image is determined by fusing the third sample image with the sample edge attention weight and the sample feathering attention weight. Finally, the initial first attention mechanism and the initial second attention mechanism are updated by means of the loss function between the third sample mask image and the third label mask image of the third sample image until a third set condition is satisfied (that is, until the first attention mechanism and the second attention mechanism converge, or until their number of training iterations reaches a preset number of training rounds), yielding the first attention mechanism and the second attention mechanism.

Illustratively, the second training manner is again introduced by taking pictures containing a portrait as an example. A certain number of picture samples containing portraits are obtained as a sample set, and each portrait picture sample in the set is segmented by the image segmentation network (an instance segmentation algorithm, a semantic segmentation algorithm, or the like) to obtain a mask image of each portrait picture sample. For each portrait picture sample in the set, the portrait picture sample and its mask image are input together into the multi-layer perceptual network, which performs feature extraction on the mask image and on the portrait picture sample respectively, extracting the feature of each pixel in the mask image of the portrait picture sample and the feature of each pixel in the portrait picture sample. The features of each pixel in the mask image are then concatenated with the corresponding pixel features of the portrait picture sample, yielding a sample fused feature map for the portrait picture sample and its mask image. The sample fused feature map is then input simultaneously into the initial first attention mechanism and the initial second attention mechanism to obtain the sample edge attention weight and the sample feathering attention weight, and a third portrait-sample mask image is determined by fusing the portrait picture sample with the sample edge attention weight and the sample feathering attention weight. Finally, the initial first attention mechanism and the initial second attention mechanism are updated by means of the loss function between the third portrait-sample mask image and the third label mask image of the portrait picture sample, until the first attention mechanism and the second attention mechanism converge, or until their number of training iterations reaches a preset number of training rounds, yielding the first attention mechanism and the second attention mechanism.
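A sketch of this joint training manner follows. It reuses the hypothetical AttentionBranch modules and the frozen seg_net and fusion_net from the previous sketch; the L1 loss and the averaging rule that combines the two weights are illustrative stand-ins, since the embodiments only state that the third sample mask image is obtained from the sample image and both attention weights.

```python
import torch
import torch.nn as nn


def train_jointly(edge_branch, feather_branch, seg_net, fusion_net, loader,
                  epochs=10, lr=1e-4):
    """Second training manner: both branches are updated through one loss on the combined mask."""
    params = list(edge_branch.parameters()) + list(feather_branch.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.L1Loss()  # assumed loss; the embodiments only speak of "a loss function"
    for _ in range(epochs):
        for image, label_mask in loader:  # image: [B, 3, H, W], label_mask: [B, 1, H, W]
            with torch.no_grad():
                first_mask = seg_net(image)            # first segmentation -> first mask image
                fused = fusion_net(image, first_mask)  # sample fused feature map
            edge_w = edge_branch(fused)                # sample edge attention weight
            feather_w = feather_branch(fused)          # sample feathering attention weight
            # Assumed combination rule; the embodiments only state that the third sample
            # mask image is obtained from the sample image and both attention weights.
            pred_mask = ((edge_w + feather_w) * 0.5 * first_mask).clamp(0.0, 1.0)
            optimizer.zero_grad()
            loss = loss_fn(pred_mask, label_mask)
            loss.backward()
            optimizer.step()
    return edge_branch, feather_branch
```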

Step4: based on the image segmentation result, extract the target object from the image segmentation result.

In this embodiment of the present application, a complete, smooth and clear target object can be obtained by performing target object extraction on the clearer image segmentation result. For example, by extracting the portrait contour or object contour from a clearer mask image containing that contour, a complete, smooth and clear portrait contour or object contour can be obtained. In subsequent image applications, for instance when the image background associated with that mask image is to be replaced, the portrait contour or object contour is first extracted from the mask image to obtain a complete, smooth and clear contour, and that contour is then fused with the new image background to form a new image.
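The background replacement described here is ordinary alpha compositing. A minimal sketch follows, assuming the refined mask has already been normalized to [0, 1] and resized to the resolution of the original image; it is a generic illustration, not code from the embodiments.

```python
import numpy as np


def replace_background(image: np.ndarray, mask: np.ndarray,
                       new_background: np.ndarray) -> np.ndarray:
    """Alpha-composite the extracted subject onto a new background.

    image and new_background are H x W x 3 uint8 arrays of the same size;
    mask is an H x W float array in [0, 1], where feathered edge values give
    soft transitions between subject and background.
    """
    alpha = mask.astype(np.float32)[..., None]           # H x W x 1
    subject = image.astype(np.float32)
    background = new_background.astype(np.float32)
    composite = alpha * subject + (1.0 - alpha) * background
    return composite.clip(0.0, 255.0).astype(np.uint8)
```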

The above embodiments show that a first mask image of the original image is determined by performing a first segmentation process on the original image, and feature extraction is performed on the original image and the first mask image to obtain a fused feature map. The fused feature map is passed through the first attention mechanism to obtain the edge attention weight, and through the second attention mechanism to obtain the feathering attention weight. A second mask image of the original image is then obtained from the edge attention weight, the feathering attention weight and the original image. In the prior art, brute-force upsampling of the image segmentation result produces obvious jagged edges, so that the edges of the segmented result are incomplete and unsmooth. By assigning different edge attention weights and feathering attention weights to different pixels of the original image, this solution applies different feathering effects to different edge pixels of the original image, reducing the jagged edges of the generated mask image and thereby ensuring the completeness, smoothness and clarity of the mask image edges.
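Putting the pieces together, an end-to-end sketch of the inference flow might look as follows. The module names and the final rule that combines the first mask image with the two weights are assumptions, since the embodiments describe the fusion only in terms of its inputs and outputs.

```python
import torch
import torch.nn as nn


class MaskRefiner(nn.Module):
    """Illustrative end-to-end flow of the described method; module names and the
    final combination rule are assumptions rather than the patented design."""

    def __init__(self, seg_net, fusion_net, edge_branch, feather_branch):
        super().__init__()
        self.seg_net = seg_net                # first segmentation -> first mask image
        self.fusion_net = fusion_net          # per-pixel feature extraction and fusion
        self.edge_branch = edge_branch        # first attention mechanism
        self.feather_branch = feather_branch  # second attention mechanism

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        first_mask = self.seg_net(image)             # [B, 1, H, W]
        fused = self.fusion_net(image, first_mask)   # [B, C, H, W]
        edge_w = self.edge_branch(fused)             # edge attention weights
        feather_w = self.feather_branch(fused)       # feathering attention weights
        # Assumed fusion: sharpen edge pixels and soften boundary pixels of the first mask.
        second_mask = (first_mask * (edge_w + feather_w)).clamp(0.0, 1.0)
        return second_mask
```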

Based on the same technical concept, FIG. 4 exemplarily shows an image segmentation apparatus provided by an embodiment of the present application, and the apparatus can perform the flow of the image segmentation method.

As shown in FIG. 4, the apparatus includes:

a determining unit 401, configured to perform a first segmentation process on an original image and determine a first mask image of the original image; and

a processing unit 402, configured to perform feature extraction on the original image and the first mask image to obtain a fused feature map; pass the fused feature map through a first attention mechanism to obtain an edge attention weight; pass the fused feature map through a second attention mechanism to obtain a feathering attention weight; and obtain a second mask image of the original image through the edge attention weight, the feathering attention weight and the original image.

In some exemplary embodiments, the determining unit 401 is specifically configured to:

extract a feature of each pixel in the original image;

determine, according to the feature of each pixel, a category label corresponding to each pixel; and

determine the first mask image of the original image according to the category labels corresponding to the pixels.

In some exemplary embodiments, the processing unit 402 is specifically configured to:

input the original image and the first mask image into a multi-layer perceptual network to obtain the fused feature map.
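As an illustration of this embodiment, a per-pixel fusion network could be sketched as below. Treating 1 x 1 convolutions as a per-pixel multi-layer perceptron and choosing an output width of 32 channels are assumptions made for the sketch, not requirements of the embodiment.

```python
import torch
import torch.nn as nn


class PixelFusionMLP(nn.Module):
    """Hypothetical multi-layer perceptual network: extracts per-pixel features from the
    original image and from the first mask image, then concatenates them pixel by pixel."""

    def __init__(self, out_dim: int = 32):
        super().__init__()
        # 1 x 1 convolutions act as a per-pixel multi-layer perceptron over the channels.
        self.image_features = nn.Sequential(nn.Conv2d(3, out_dim, kernel_size=1), nn.ReLU())
        self.mask_features = nn.Sequential(nn.Conv2d(1, out_dim, kernel_size=1), nn.ReLU())

    def forward(self, image: torch.Tensor, first_mask: torch.Tensor) -> torch.Tensor:
        f_image = self.image_features(image)        # [B, out_dim, H, W]
        f_mask = self.mask_features(first_mask)     # [B, out_dim, H, W]
        return torch.cat([f_image, f_mask], dim=1)  # fused feature map: [B, 2 * out_dim, H, W]
```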

In some exemplary embodiments, the processing unit 402 is specifically configured to:

determine the first attention mechanism in the following manner:

inputting a sample fused feature map of a first sample image into an initial first attention mechanism to obtain a sample edge attention weight;

determining a first sample mask image through the first sample image and the sample edge attention weight; and

adjusting the initial first attention mechanism through the first sample mask image and a first label mask image of the first sample image until a first set condition is satisfied, to obtain the first attention mechanism.

In some exemplary embodiments, the processing unit 402 is specifically configured to:

determine the second attention mechanism in the following manner:

inputting a sample fused feature map of a second sample image into an initial second attention mechanism to obtain a sample feathering attention weight;

determining a second sample mask image through the second sample image and the sample feathering attention weight; and

adjusting the initial second attention mechanism through the second sample mask image and a second label mask image of the second sample image until a second set condition is satisfied, to obtain the second attention mechanism.

In some exemplary embodiments, the processing unit 402 is specifically configured to:

determine the first attention mechanism and the second attention mechanism in the following manner:

inputting a sample fused feature map of a third sample image into an initial first attention mechanism to obtain a sample edge attention weight;

inputting the sample fused feature map of the third sample image into an initial second attention mechanism to obtain a sample feathering attention weight;

determining a third sample mask image through the third sample image, the sample edge attention weight and the sample feathering attention weight; and

adjusting the initial first attention mechanism and the initial second attention mechanism through the third sample mask image and a third label mask image of the third sample image until a third set condition is satisfied, to obtain the first attention mechanism and the second attention mechanism.

In some exemplary embodiments, the processing unit 402 is further configured to:

after the second mask image of the original image is obtained, perform image segmentation on the original image through the second mask image to obtain a target sub-image of the original image.

Based on the same technical concept, an embodiment of the present application further provides a computing device. As shown in FIG. 5, the computing device includes at least one processor 501 and a memory 502 connected to the at least one processor. The specific connection medium between the processor 501 and the memory 502 is not limited in this embodiment of the present application; in FIG. 5, the processor 501 and the memory 502 are connected by a bus as an example. The bus may be divided into an address bus, a data bus, a control bus and so on.

In this embodiment of the present application, the memory 502 stores instructions executable by the at least one processor 501, and by executing the instructions stored in the memory 502, the at least one processor 501 can perform the steps included in the foregoing image segmentation method.

The processor 501 is the control center of the computing device, and may use various interfaces and lines to connect the various parts of the computing device, and implement data processing by running or executing the instructions stored in the memory 502 and calling the data stored in the memory 502. Optionally, the processor 501 may include one or more processing units, and the processor 501 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs and the like, and the modem processor mainly handles issued instructions. It can be understood that the modem processor may also not be integrated into the processor 501. In some embodiments, the processor 501 and the memory 502 may be implemented on the same chip; in some embodiments, they may also be implemented separately on independent chips.

The processor 501 may be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor, an application specific integrated circuit (ASIC), a field programmable gate array or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps and logic block diagrams disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor, any conventional processor, or the like. The steps of the methods disclosed in connection with the embodiments of the image segmentation method may be embodied directly as being performed by a hardware processor, or performed by a combination of hardware and software modules in the processor.

The memory 502, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs and modules. The memory 502 may include at least one type of storage medium, for example a flash memory, a hard disk, a multimedia card, a card-type memory, a random access memory (RAM), a static random access memory (SRAM), a programmable read-only memory (PROM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a magnetic memory, a magnetic disk, an optical disc and so on. The memory 502 is, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 502 in this embodiment of the present application may also be a circuit or any other apparatus capable of implementing a storage function, for storing program instructions and/or data.

Based on the same technical concept, an embodiment of the present application further provides a computer-readable storage medium storing a computer program executable by a computing device. When the program runs on the computing device, the computing device is caused to perform the steps of the image segmentation method described above.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or the other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, the instruction apparatus implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or the other programmable device to produce computer-implemented processing, and the instructions executed on the computer or the other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications falling within the scope of the present application.

Obviously, those skilled in the art can make various changes and variations to the present application without departing from the spirit and scope of the present application. Thus, if these modifications and variations of the present application fall within the scope of the claims of the present application and their equivalent technologies, the present application is also intended to include these changes and variations.

Claims (10)

Translated from Chinese
1. An image segmentation method, comprising:
performing a first segmentation process on an original image to determine a first mask image of the original image;
performing feature extraction on the original image and the first mask image to obtain a fused feature map;
passing the fused feature map through a first attention mechanism to obtain an edge attention weight;
passing the fused feature map through a second attention mechanism to obtain a feathering attention weight; and
obtaining a second mask image of the original image through the edge attention weight, the feathering attention weight and the original image.

2. The method according to claim 1, wherein performing the first segmentation process on the original image to determine the first mask image of the original image comprises:
extracting a feature of each pixel in the original image;
determining, according to the feature of each pixel, a category label corresponding to each pixel; and
determining the first mask image of the original image according to the category labels corresponding to the pixels.

3. The method according to claim 1, wherein performing feature extraction on the original image and the first mask image to obtain the fused feature map comprises:
inputting the original image and the first mask image into a multi-layer perceptual network to obtain the fused feature map.

4. The method according to any one of claims 1 to 3, wherein the first attention mechanism is determined by:
inputting a sample fused feature map of a first sample image into an initial first attention mechanism to obtain a sample edge attention weight;
determining a first sample mask image through the first sample image and the sample edge attention weight; and
adjusting the initial first attention mechanism through the first sample mask image and a first label mask image of the first sample image until a first set condition is satisfied, to obtain the first attention mechanism.

5. The method according to any one of claims 1 to 3, wherein the second attention mechanism is determined by:
inputting a sample fused feature map of a second sample image into an initial second attention mechanism to obtain a sample feathering attention weight;
determining a second sample mask image through the second sample image and the sample feathering attention weight; and
adjusting the initial second attention mechanism through the second sample mask image and a second label mask image of the second sample image until a second set condition is satisfied, to obtain the second attention mechanism.

6. The method according to any one of claims 1 to 3, wherein the first attention mechanism and the second attention mechanism are determined by:
inputting a sample fused feature map of a third sample image into an initial first attention mechanism to obtain a sample edge attention weight;
inputting the sample fused feature map of the third sample image into an initial second attention mechanism to obtain a sample feathering attention weight;
determining a third sample mask image through the third sample image, the sample edge attention weight and the sample feathering attention weight; and
adjusting the initial first attention mechanism and the initial second attention mechanism through the third sample mask image and a third label mask image of the third sample image until a third set condition is satisfied, to obtain the first attention mechanism and the second attention mechanism.

7. The method according to claim 1, further comprising, after the second mask image of the original image is obtained:
performing image segmentation on the original image through the second mask image to obtain a target sub-image of the original image.

8. An image segmentation apparatus, comprising:
a determining unit, configured to perform a first segmentation process on an original image and determine a first mask image of the original image; and
a processing unit, configured to perform feature extraction on the original image and the first mask image to obtain a fused feature map; pass the fused feature map through a first attention mechanism to obtain an edge attention weight; pass the fused feature map through a second attention mechanism to obtain a feathering attention weight; and obtain a second mask image of the original image through the edge attention weight, the feathering attention weight and the original image.

9. A computing device, comprising at least one processor and at least one memory, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the method according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program executable by a computing device, wherein when the program runs on the computing device, the computing device is caused to perform the method according to any one of claims 1 to 7.
CN202110280836.1A | 2021-03-16 | 2021-03-16 | Image segmentation method and device | Pending | CN115082496A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110280836.1A | 2021-03-16 | 2021-03-16 | Image segmentation method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110280836.1A | 2021-03-16 | 2021-03-16 | Image segmentation method and device

Publications (1)

Publication Number | Publication Date
CN115082496A | 2022-09-20

Family

ID=83246088

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110280836.1A | Pending | CN115082496A (en) | 2021-03-16 | 2021-03-16 | Image segmentation method and device

Country Status (1)

Country | Link
CN (1) | CN115082496A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20190370587A1 (en) * | 2018-05-29 | 2019-12-05 | Sri International | Attention-based explanations for artificial intelligence behavior
CN110188765A (en) * | 2019-06-05 | 2019-08-30 | BOE Technology Group Co., Ltd. | Image semantic segmentation model generation method, device, device and storage medium
US10482603B1 (en) * | 2019-06-25 | 2019-11-19 | Artificial Intelligence, Ltd. | Medical image segmentation using an integrated edge guidance module and object segmentation network
CN111091576A (en) * | 2020-03-19 | 2020-05-01 | Tencent Technology (Shenzhen) Co., Ltd. | Image segmentation method, device, equipment and storage medium
CN111612807A (en) * | 2020-05-15 | 2020-09-01 | Beijing University of Technology | A Small Object Image Segmentation Method Based on Scale and Edge Information
CN112070114A (en) * | 2020-08-03 | 2020-12-11 | Institute of Information Engineering, Chinese Academy of Sciences | Scene text recognition method and system based on Gaussian constrained attention mechanism network

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115830039A (en) * | 2023-02-09 | 2023-03-21 | Alibaba (China) Co., Ltd. | Image processing method and device

Similar Documents

Publication | Publication Date | Title
CN113538480B (en) | Image segmentation processing method, device, computer equipment and storage medium
WO2022161286A1 (en) | Image detection method, model training method, device, medium, and program product
CN111583097A (en) | Image processing method, image processing device, electronic equipment and computer readable storage medium
CN113763249B (en) | Text image super-resolution reconstruction method and related equipment
CN114529490B (en) | Data processing method, device, equipment and readable storage medium
Joshi | OpenCV with Python by example
CN115205150A (en) | Image deblurring method, apparatus, apparatus, medium and computer program product
CN112308866A (en) | Image processing method, image processing device, electronic equipment and storage medium
CN111626163A (en) | Human face living body detection method and device and computer equipment
CN114372931A (en) | A target object blurring method, device, storage medium and electronic device
CN117671473B (en) | Underwater target detection model and method based on attention and multi-scale feature fusion
CN113706550A (en) | Image scene recognition and model training method and device and computer equipment
CN114005066B (en) | HDR-based video frame image processing method and device, computer equipment and medium
CN112036209A (en) | Portrait photo processing method and terminal
CN113688839B (en) | Video processing method and device, electronic equipment and computer readable storage medium
CN115082496A (en) | Image segmentation method and device
CN115115691A (en) | Monocular three-dimensional plane recovery method, equipment and storage medium
CN114782853B (en) | Video data processing method, device, computer equipment and storage medium
CN114429475B (en) | Image processing method, device, equipment and storage medium
CN116434242A (en) | Text detection method, device, electronic device and storage medium
CN115797391A (en) | A method and device for identifying and locating card surface elements
CN113971671B (en) | Instance segmentation method, device, electronic device and storage medium
CN115641352A (en) | A method, device, electronic device and storage medium for blurring the background of a portrait
CN115170612A (en) | Detection tracking method and device, electronic equipment and storage medium
CN108122011A (en) | Method for tracking target and system based on the mixing of a variety of consistency

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
