CN108229504A


Info

Publication number: CN108229504A
Application number: CN201810085628.4A
Authority: CN
Inventors: 陈益民; 张伟; 林倞
Original assignee: Shenzhen Sensetime Technology Co Ltd
Current assignee: Shenzhen Sensetime Technology Co Ltd
Priority date: 2018-01-29
Filing date: 2018-01-29
Publication date: 2018-06-29
Anticipated expiration: 2038-01-29
Also published as: CN108229504B

Abstract

Translated from Chinese

本公开涉及一种图像解析方法及装置。该方法通过解析模型实现，模型包括：特征共享模块、语义分割模块、边缘检测模块，该方法包括：通过特征共享模块对待解析图像进行特征提取处理，获取共享特征，共享特征包括多个网络深度的特征信息；分别通过语义分割模块和边缘检测模块对共享特征进行语义分割处理和边缘检测处理，获取待解析图像的初步语义分割结果及初步边缘检测结果。根据本公开实施例，能够通过解析模型提取待解析图像的共享特征，并分别通过对共享特征进行语义分割处理和边缘检测处理以获取初步语义分割结果及初步边缘检测结果，从而提高了语义分割结果与边缘检测结果之间的一致性。

The present disclosure relates to an image analysis method and device. The method is implemented through an analysis model that includes a feature sharing module, a semantic segmentation module, and an edge detection module. The method includes: performing feature extraction processing on the image to be analyzed through the feature sharing module to obtain shared features, where the shared features include feature information at multiple network depths; and performing semantic segmentation processing and edge detection processing on the shared features through the semantic segmentation module and the edge detection module respectively, to obtain a preliminary semantic segmentation result and a preliminary edge detection result of the image to be analyzed. According to the embodiments of the present disclosure, the shared features of the image to be analyzed can be extracted through the analysis model, and semantic segmentation processing and edge detection processing can be performed separately on the shared features to obtain the preliminary semantic segmentation result and the preliminary edge detection result, thereby improving the consistency between the semantic segmentation result and the edge detection result.

Description


图像解析方法及装置Image analysis method and device

技术领域technical field

本公开涉及计算机技术领域，尤其涉及一种图像解析方法及装置。The present disclosure relates to the field of computer technology, in particular to an image analysis method and device.

背景技术Background

随着互联网的快速普及以及电子商务的兴起与发展，基于计算机视觉的图像分析技术得到了空前的发展。对图像中的对象(例如人体、动物、车辆等)进行解析并识别出对象的各个部分(例如人的头部、胳膊、服装等)，在视频监控、人物行为分析等领域有比较重要的意义。With the rapid spread of the Internet and the rise of e-commerce, image analysis technology based on computer vision has developed at an unprecedented pace. Parsing the objects in an image (such as human bodies, animals, or vehicles) and identifying the parts of each object (such as a person's head, arms, or clothing) is of considerable importance in fields such as video surveillance and human behavior analysis.

在相关技术中，对图像中一个或多个对象的解析通常依赖于对象的检测，在检测出图像中的对象后，再对该对象进行解析。这种方式对对象检测结果依赖较大，可能会出现检测结果与分割(解析)结果的不一致的情况，准确度和精度均较差，无法满足需求。In related technologies, the analysis of one or more objects in an image usually relies on object detection, and after the object in the image is detected, the object is analyzed. This method relies heavily on the object detection results, and there may be inconsistencies between the detection results and the segmentation (analysis) results, and the accuracy and precision are poor, which cannot meet the requirements.

发明内容Summary of the invention

有鉴于此，本公开提出了一种图像解析方法及装置，能够提高待解析图像的语义分割结果与边缘检测结果之间的一致性。In view of this, the present disclosure proposes an image analysis method and device, which can improve the consistency between the semantic segmentation result and the edge detection result of the image to be analyzed.

根据本公开的一方面，提供了一种图像解析方法，所述方法通过解析模型实现，所述解析模型包括：特征共享模块、语义分割模块、边缘检测模块，According to one aspect of the present disclosure, an image analysis method is provided, the method is implemented by an analysis model, and the analysis model includes: a feature sharing module, a semantic segmentation module, an edge detection module,

所述方法包括：The method includes:

通过所述特征共享模块对待解析图像进行特征提取处理，获取共享特征，所述共享特征包括经所述特征共享模块的多个网络层处理得到的多个网络深度的特征信息；Perform feature extraction processing on the image to be analyzed by the feature sharing module to obtain shared features, and the shared features include feature information of multiple network depths obtained by processing multiple network layers of the feature sharing module;

分别通过所述语义分割模块和所述边缘检测模块对所述共享特征进行语义分割处理和边缘检测处理，获取所述待解析图像的初步语义分割结果及初步边缘检测结果。Perform semantic segmentation processing and edge detection processing on the shared features through the semantic segmentation module and the edge detection module respectively, and obtain preliminary semantic segmentation results and preliminary edge detection results of the image to be analyzed.

在一种可能的实现方式中，所述解析模型还包括聚合优化模块，In a possible implementation, the analytical model further includes an aggregation optimization module,

其中，在获取所述待解析图像的初步语义分割结果及初步边缘检测结果的步骤之后，所述方法还包括：Wherein, after the step of obtaining the preliminary semantic segmentation result and the preliminary edge detection result of the image to be analyzed, the method further includes:

通过所述聚合优化模块将所述初步语义分割结果及所述初步边缘检测结果输入所述聚合优化模块中组合为特征层；Inputting the preliminary semantic segmentation result and the preliminary edge detection result into the aggregation optimization module, where they are combined into a feature layer;

在所述聚合优化模块中对所述特征层进行聚合，并采用所述聚合优化模块的多个卷积网络层进行优化处理，确定针对所述待解析图像的语义分割结果及边缘检测结果。In the aggregation optimization module, the feature layers are aggregated, and multiple convolutional network layers of the aggregation optimization module are used for optimization processing to determine the semantic segmentation results and edge detection results for the image to be parsed.

在一种可能的实现方式中，所述特征共享模块包括级联的第一卷积池化网络、第二残差网络、第三残差网络、第四残差网络以及第五残差网络；In a possible implementation, the feature sharing module includes a cascaded first convolutional pooling network, a second residual network, a third residual network, a fourth residual network, and a fifth residual network;

所述共享特征包括所述第三残差网络、所述第四残差网络和所述第五残差网络分别输出的特征信息。The shared features include feature information respectively output by the third residual network, the fourth residual network and the fifth residual network.

在一种可能的实现方式中，在确定针对所述待解析图像的语义分割结果及边缘检测结果的步骤之后，所述方法还包括：In a possible implementation manner, after the step of determining the semantic segmentation result and the edge detection result of the image to be parsed, the method further includes:

对所述边缘检测结果进行线分割处理，确定所述待解析图像中的多个分割区域；performing line segmentation processing on the edge detection result, and determining a plurality of segmented regions in the image to be analyzed;

根据所述语义分割结果对所述多个分割区域进行聚集处理，确定所述待解析图像中至少一个解析对象的聚集区域；performing aggregation processing on the plurality of segmented regions according to the semantic segmentation result, and determining an aggregation region of at least one analysis object in the image to be analyzed;

将所述至少一个解析对象的聚集区域与所述语义分割结果相关联，确定针对所述至少一个解析对象的解析结果。Associating the aggregation area of the at least one analysis object with the semantic segmentation result, and determining the analysis result for the at least one analysis object.

在一种可能的实现方式中，所述边缘检测结果包括边缘图，In a possible implementation manner, the edge detection result includes an edge map,

其中，对所述边缘检测结果进行线分割处理，确定所述待解析图像中的多个分割区域，包括：Wherein, performing line segmentation processing on the edge detection result to determine a plurality of segmented regions in the image to be analyzed includes:

分别在水平和垂直方向扫描所述边缘图，获取非背景区域中的多条水平线段和多条垂直线段，其中，每条水平线段和每条垂直线段的端点为所述边缘图中的边缘点，每条水平线段所在的区域属于同一分割区域，每条垂直线段所在的区域属于同一分割区域；Scanning the edge map in the horizontal and vertical directions respectively to obtain a plurality of horizontal line segments and a plurality of vertical line segments in the non-background area, wherein the endpoints of each horizontal line segment and each vertical line segment are edge points in the edge map, the area covered by each horizontal line segment belongs to a single segmented area, and the area covered by each vertical line segment belongs to a single segmented area;

对非背景区域中的多条水平线段和多条垂直线段所在的区域进行聚集处理，获得所述待解析图像中的多个分割区域。Aggregating the areas where the multiple horizontal line segments and the multiple vertical line segments are located in the non-background area is performed to obtain multiple segmented areas in the image to be analyzed.
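The line segmentation described above can be illustrated with a minimal sketch. It is not the disclosed implementation: it assumes a binary edge map given as nested lists (1 marks an edge point), treats any run of non-edge pixels that is not bounded by edge points on both sides as background, and uses a union-find structure to aggregate segments. The names `runs_between_edges` and `segment_regions` are hypothetical.

```python
def runs_between_edges(line):
    """Return (start, end) ranges of non-edge runs bounded by edge points on both sides."""
    edges = [i for i, v in enumerate(line) if v]
    return [(a + 1, b - 1) for a, b in zip(edges, edges[1:]) if b - a > 1]


def segment_regions(edge_map):
    """Group pixels covered by horizontal and vertical segments into segmented regions."""
    h, w = len(edge_map), len(edge_map[0])
    parent = {}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union_run(pixels):
        # All pixels of one line segment belong to the same segmented region.
        for p in pixels:
            parent.setdefault(p, p)
        root = find(pixels[0])
        for p in pixels[1:]:
            parent[find(p)] = root

    for y in range(h):  # horizontal scan
        for s, e in runs_between_edges(edge_map[y]):
            union_run([(y, x) for x in range(s, e + 1)])
    for x in range(w):  # vertical scan
        column = [edge_map[y][x] for y in range(h)]
        for s, e in runs_between_edges(column):
            union_run([(y, x) for y in range(s, e + 1)])

    regions = {}
    for p in parent:
        regions.setdefault(find(p), set()).add(p)
    return list(regions.values())


# Toy edge map: two closed boxes separated by a shared edge column.
edge_map = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
regions = segment_regions(edge_map)
```

Horizontal and vertical segments that share a pixel end up in the same region, which is one plausible reading of the aggregation step.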

在一种可能的实现方式中，根据所述语义分割结果对所述多个分割区域进行聚集处理，确定所述待解析图像中至少一个解析对象的聚集区域，包括以下至少一个步骤：In a possible implementation manner, performing aggregation processing on the plurality of segmented regions according to the semantic segmentation result, and determining the aggregation region of at least one analysis object in the image to be analyzed includes at least one of the following steps:

在分割区域的尺寸大于或等于第一阈值，且所述分割区域包括多个语义分割结果时，确定所述分割区域为同一解析对象的聚集区域；When the size of the segmented area is greater than or equal to the first threshold, and the segmented area includes a plurality of semantic segmentation results, determining that the segmented area is an aggregation area of the same analysis object;

在分割区域的尺寸小于第二阈值，且所述分割区域包括一个语义分割结果时，将所述分割区域并入与所述分割区域之间的距离最近的聚集区域。When the size of a segmented area is smaller than the second threshold and the segmented area includes exactly one semantic segmentation result, the segmented area is merged into the aggregation area closest to it.
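The two aggregation rules can be sketched as follows. This is only an illustration under stated assumptions: region size is taken to be the pixel count, the distance between regions is taken to be the Euclidean distance between centroids, and the function name `aggregate` and the dictionary layout are hypothetical. The disclosure does not specify how size or distance is measured.

```python
import math


def centroid(pixels):
    """Mean (y, x) position of a set of pixels."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


def aggregate(regions, t1, t2):
    """regions: list of dicts with 'pixels' (set of (y, x)) and 'labels' (set of part labels)."""
    # Rule 1: a large region containing several semantic labels is one object's aggregation region.
    objects = [
        {"pixels": set(r["pixels"]), "labels": set(r["labels"])}
        for r in regions
        if len(r["pixels"]) >= t1 and len(r["labels"]) > 1
    ]
    # Rule 2: a small region with a single semantic label joins the nearest aggregation region.
    for r in regions:
        if len(r["pixels"]) < t2 and len(r["labels"]) == 1 and objects:
            c = centroid(r["pixels"])
            nearest = min(objects, key=lambda o: math.dist(c, centroid(o["pixels"])))
            nearest["pixels"] |= r["pixels"]
            nearest["labels"] |= r["labels"]
    return objects


regions = [
    {"pixels": {(0, 0), (0, 1), (1, 0), (1, 1)}, "labels": {"head", "torso"}},
    {"pixels": {(0, 10), (0, 11), (1, 10), (1, 11)}, "labels": {"head", "arm"}},
    {"pixels": {(0, 8)}, "labels": {"arm"}},
]
objects = aggregate(regions, t1=4, t2=2)
```

Regions that satisfy neither rule are left out of the result in this sketch. How such regions are handled is not stated in the text above.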

在一种可能的实现方式中，所述方法还包括：In a possible implementation, the method further includes:

将样本图像输入初始解析模型中进行处理，获取针对所述样本图像的训练解析结果，其中，所述初始解析模型包括初始特征共享模块、初始语义分割模块、初始边缘检测模块、初始聚合优化模块以及监测模块；Input the sample image into the initial parsing model for processing, and obtain the training parsing results for the sample image, wherein the initial parsing model includes an initial feature sharing module, an initial semantic segmentation module, an initial edge detection module, an initial aggregation optimization module and monitoring module;

根据所述样本图像的期望解析结果以及所述训练解析结果，确定所述样本图像的模型损失，所述样本图像的模型损失包括所述初始特征共享模块、所述初始语义分割模块、所述初始边缘检测模块及所述初始聚合优化模块的模型损失的加权和；According to the expected parsing result of the sample image and the training parsing result, determine the model loss of the sample image, where the model loss of the sample image includes a weighted sum of the model losses of the initial feature sharing module, the initial semantic segmentation module, the initial edge detection module, and the initial aggregation optimization module;

根据所述样本图像的模型损失，调整所述初始解析模型中的参数权重，确定调整后的解析模型；Adjusting parameter weights in the initial analytical model according to the model loss of the sample image, and determining an adjusted analytical model;

在所述样本图像的模型损失满足训练条件的情况下，将调整后的解析模型确定为最终的解析模型。If the model loss of the sample image satisfies the training condition, the adjusted analytical model is determined as the final analytical model.
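The training procedure above can be sketched in a few lines. The sketch rests on assumptions the text does not fix: the training condition is taken to be "total loss below a threshold", the per-module weights are arbitrary, and `step_fn` is a hypothetical callback that runs one forward pass and parameter update and returns the per-module losses.

```python
def total_loss(module_losses, weights):
    """Weighted sum of the per-module losses."""
    return sum(weights[name] * loss for name, loss in module_losses.items())


def train(step_fn, weights, loss_threshold, max_steps):
    """Repeat training steps until the assumed training condition is met."""
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = total_loss(step_fn(), weights)
        if loss < loss_threshold:  # assumed training condition
            return step, loss
    return max_steps, loss


# Toy stand-in for a real training step: every module loss decays as 1/k.
def make_step_fn():
    state = {"k": 0}

    def step_fn():
        state["k"] += 1
        value = 1.0 / state["k"]
        return {"shared": value, "segmentation": value, "edge": value, "refine": value}

    return step_fn


weights = {"shared": 0.25, "segmentation": 0.25, "edge": 0.25, "refine": 0.25}
steps, final_loss = train(make_step_fn(), weights, loss_threshold=0.3, max_steps=100)
```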

在一种可能的实现方式中，其特征在于，In a possible implementation,

所述初步语义分割结果包括：语义特征和语义分割图；和/或The preliminary semantic segmentation result includes: semantic features and a semantic segmentation map; and/or

所述初步边缘检测结果包括：边缘特征和边缘检测图。The preliminary edge detection result includes: edge feature and edge detection map.

在一种可能的实现方式中，所述特征共享模块、所述语义分割模块、所述边缘检测模块及所述聚合优化模块分别包括全卷积神经网络。In a possible implementation manner, the feature sharing module, the semantic segmentation module, the edge detection module and the aggregation optimization module respectively include a fully convolutional neural network.

根据本公开的另一方面，提供了一种图像解析装置，所述装置通过解析模型实现，所述解析模型包括：特征共享模块、语义分割模块、边缘检测模块，所述装置包括：According to another aspect of the present disclosure, an image analysis device is provided, the device is implemented by an analysis model, and the analysis model includes: a feature sharing module, a semantic segmentation module, and an edge detection module, and the device includes:

共享特征获取单元，用于通过所述特征共享模块对待解析图像进行特征提取处理，获取共享特征，所述共享特征包括经所述特征共享模块的多个网络层处理得到的多个网络深度的特征信息；A shared feature acquisition unit, configured to perform feature extraction processing on the image to be analyzed through the feature sharing module to acquire shared features, where the shared features include feature information at multiple network depths obtained through processing by multiple network layers of the feature sharing module;

初步结果确定单元，用于分别通过所述语义分割模块和所述边缘检测模块对所述共享特征进行语义分割处理和边缘检测处理，获取所述待解析图像的初步语义分割结果及初步边缘检测结果。A preliminary result determination unit, configured to perform semantic segmentation processing and edge detection processing on the shared features through the semantic segmentation module and the edge detection module respectively, to obtain a preliminary semantic segmentation result and a preliminary edge detection result of the image to be parsed.

在一种可能的实现方式中，所述解析模型还包括聚合优化模块，所述装置还包括：In a possible implementation manner, the analytical model further includes an aggregation optimization module, and the device further includes:

特征层组合单元，用于通过所述聚合优化模块将所述初步语义分割结果及所述初步边缘检测结果输入所述聚合优化模块中组合为特征层；A feature layer combination unit, configured to input the preliminary semantic segmentation result and the preliminary edge detection result into the aggregation optimization module to form a feature layer through the aggregation optimization module;

聚合优化单元，用于在所述聚合优化模块中对所述特征层进行聚合，并采用所述聚合优化模块的多个卷积网络层进行优化处理，确定针对所述待解析图像的语义分割结果及边缘检测结果。An aggregation optimization unit, configured to aggregate the feature layers in the aggregation optimization module, and perform optimization processing using multiple convolutional network layers of the aggregation optimization module, to determine the semantic segmentation result for the image to be parsed and edge detection results.

在一种可能的实现方式中，所述装置还包括：In a possible implementation manner, the device further includes:

分割处理单元，用于对所述边缘检测结果进行线分割处理，确定所述待解析图像中的多个分割区域；a segmentation processing unit, configured to perform line segmentation processing on the edge detection result, and determine a plurality of segmented regions in the image to be analyzed;

聚集区域确定单元，用于根据所述语义分割结果对所述多个分割区域进行聚集处理，确定所述待解析图像中至少一个解析对象的聚集区域；An aggregation area determining unit, configured to perform aggregation processing on the plurality of segmented areas according to the semantic segmentation result, and determine an aggregation area of at least one analysis object in the image to be analyzed;

解析结果确定单元，用于将所述至少一个解析对象的聚集区域与所述语义分割结果相关联，确定针对所述至少一个解析对象的解析结果。The parsing result determining unit is configured to associate the aggregation area of the at least one parsing object with the semantic segmentation result, and determine the parsing result for the at least one parsing object.

在一种可能的实现方式中，所述边缘检测结果包括边缘图，其中，所述分割处理单元包括：In a possible implementation manner, the edge detection result includes an edge map, wherein the segmentation processing unit includes:

扫描子单元，用于分别在水平和垂直方向扫描所述边缘图，获取非背景区域中的多条水平线段和多条垂直线段，其中，每条水平线段和每条垂直线段的端点为所述边缘图中的边缘点，每条水平线段所在的区域属于同一分割区域，每条垂直线段所在的区域属于同一分割区域；The scanning subunit is used to scan the edge map in the horizontal and vertical directions respectively, and obtain multiple horizontal line segments and multiple vertical line segments in the non-background area, wherein the endpoint of each horizontal line segment and each vertical line segment is the For the edge points in the edge map, the area where each horizontal line segment is located belongs to the same segmented area, and the area where each vertical line segment is located belongs to the same segmented area;

分割区域确定子单元，用于对非背景区域中的多条水平线段和多条垂直线段所在的区域进行聚集处理，获得所述待解析图像中的多个分割区域。The segmented area determination subunit is configured to perform aggregation processing on areas where multiple horizontal line segments and multiple vertical line segments are located in the non-background area to obtain multiple segmented areas in the image to be analyzed.

在一种可能的实现方式中，所述聚集区域确定单元包括以下至少一个子单元：In a possible implementation manner, the aggregation area determination unit includes at least one of the following subunits:

区域确定子单元，用于在分割区域的尺寸大于或等于第一阈值，且所述分割区域包括多个语义分割结果时，确定所述分割区域为同一解析对象的聚集区域；A region determining subunit, configured to determine that the segmented region is an aggregation region of the same analysis object when the size of the segmented region is greater than or equal to a first threshold and the segmented region includes multiple semantic segmentation results;

区域合并子单元，用于在分割区域的尺寸小于第二阈值，且所述分割区域包括一个语义分割结果时，将所述分割区域并入与所述分割区域之间的距离最近的聚集区域。The region merging subunit is configured to merge the segmented region into the aggregation region with the closest distance to the segmented region when the size of the segmented region is smaller than a second threshold and the segmented region includes a semantic segmentation result.

训练解析单元，用于将样本图像输入初始解析模型中进行处理，获取针对所述样本图像的训练解析结果，其中，所述初始解析模型包括初始特征共享模块、初始语义分割模块、初始边缘检测模块、初始聚合优化模块以及监测模块；A training analysis unit, configured to input the sample image into the initial analysis model for processing and obtain a training analysis result for the sample image, wherein the initial analysis model includes an initial feature sharing module, an initial semantic segmentation module, an initial edge detection module, an initial aggregation optimization module, and a monitoring module;

损失确定单元，用于根据所述样本图像的期望解析结果以及所述训练解析结果，确定所述样本图像的模型损失，所述样本图像的模型损失包括所述初始特征共享模块、所述初始语义分割模块、所述初始边缘检测模块及所述初始聚合优化模块的模型损失的加权和；A loss determination unit, configured to determine the model loss of the sample image according to the expected parsing result of the sample image and the training parsing result, where the model loss of the sample image includes a weighted sum of the model losses of the initial feature sharing module, the initial semantic segmentation module, the initial edge detection module, and the initial aggregation optimization module;

模型调整单元，用于根据所述样本图像的模型损失，调整所述初始解析模型中的参数权重，确定调整后的解析模型；A model adjustment unit, configured to adjust parameter weights in the initial analysis model according to the model loss of the sample image, and determine an adjusted analysis model;

模型确定单元，用于在所述样本图像的模型损失满足训练条件的情况下，将调整后的解析模型确定为最终的解析模型。A model determining unit, configured to determine the adjusted analytical model as the final analytical model when the model loss of the sample image satisfies the training condition.

在一种可能的实现方式中，In one possible implementation,

根据本公开的另一方面，提供了一种图像解析装置，包括：处理器；用于存储处理器可执行指令的存储器；其中，所述处理器被配置为执行上述方法。According to another aspect of the present disclosure, an image analysis device is provided, including: a processor; a memory for storing instructions executable by the processor; wherein the processor is configured to execute the above method.

根据本公开的另一方面，提供了一种非易失性计算机可读存储介质，其上存储有计算机程序指令，其中，所述计算机程序指令被处理器执行时实现上述方法。According to another aspect of the present disclosure, there is provided a non-volatile computer-readable storage medium on which computer program instructions are stored, wherein the computer program instructions implement the above method when executed by a processor.

根据本公开的各方面的图像解析方法及装置，能够通过解析模型的特征共享模块提取待解析图像的共享特征，并分别通过语义分割模块和边缘检测模块对共享特征进行语义分割处理和边缘检测处理以获取初步语义分割结果及初步边缘检测结果，从而提高了待解析图像的语义分割结果与边缘检测结果之间的一致性。According to the image analysis method and device of various aspects of the present disclosure, the shared features of the image to be analyzed can be extracted through the feature sharing module of the analysis model, and the semantic segmentation processing and edge detection processing can be performed on the shared features through the semantic segmentation module and the edge detection module respectively To obtain preliminary semantic segmentation results and preliminary edge detection results, thereby improving the consistency between the semantic segmentation results and edge detection results of the image to be parsed.

根据下面参考附图对示例性实施例的详细说明，本公开的其它特征及方面将变得清楚。Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.

附图说明Brief description of the drawings

包含在说明书中并且构成说明书的一部分的附图与说明书一起示出了本公开的示例性实施例、特征和方面，并且用于解释本公开的原理。The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the specification, serve to explain the principles of the disclosure.

图1是根据一示例性实施例示出的一种图像解析方法的流程图。Fig. 1 is a flow chart of an image analysis method according to an exemplary embodiment.

图2是根据一示例性实施例示出的解析模型的示意图。Fig. 2 is a schematic diagram of an analytical model according to an exemplary embodiment.

图3是根据一示例性实施例示出的一种图像解析方法的流程图。Fig. 3 is a flowchart showing an image analysis method according to an exemplary embodiment.

图4是根据一示例性实施例示出的一种图像解析方法的流程图。Fig. 4 is a flow chart showing an image analysis method according to an exemplary embodiment.

图5a、图5b及图5c是根据一示例性实施例示出的样本图像的示意图。Fig. 5a, Fig. 5b and Fig. 5c are schematic diagrams showing sample images according to an exemplary embodiment.

图6是根据一示例性实施例示出的初始解析模型的示意图。Fig. 6 is a schematic diagram of an initial analytical model according to an exemplary embodiment.

图7是根据一示例性实施例示出的一种图像解析方法的流程图。Fig. 7 is a flow chart showing an image analysis method according to an exemplary embodiment.

图8是根据一示例性实施例示出的图像解析方法的示意图。Fig. 8 is a schematic diagram of an image analysis method according to an exemplary embodiment.

图9是根据一示例性实施例示出的一种图像解析装置的框图。Fig. 9 is a block diagram of an image analysis device according to an exemplary embodiment.

图10是根据一示例性实施例示出的一种图像解析装置的框图。Fig. 10 is a block diagram of an image analyzing device according to an exemplary embodiment.

具体实施方式Detailed description

以下将参考附图详细说明本公开的各种示例性实施例、特征和方面。附图中相同的附图标记表示功能相同或相似的元件。尽管在附图中示出了实施例的各种方面，但是除非特别指出，不必按比例绘制附图。Various exemplary embodiments, features, and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. The same reference numbers in the figures indicate functionally identical or similar elements. While various aspects of the embodiments are shown in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

在这里专用的词“示例性”意为“用作例子、实施例或说明性”。这里作为“示例性”所说明的任何实施例不必解释为优于或好于其它实施例。The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as superior or better than other embodiments.

另外，为了更好的说明本公开，在下文的具体实施方式中给出了众多的具体细节。本领域技术人员应当理解，没有某些具体细节，本公开同样可以实施。在一些实例中，对于本领域技术人员熟知的方法、手段、元件和电路未作详细描述，以便于凸显本公开的主旨。In addition, to better illustrate the present disclosure, numerous specific details are given in the following detailed description. Those skilled in the art will understand that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, components, and circuits that are well known to those skilled in the art are not described in detail, so as to keep the gist of the present disclosure in focus.

图1是根据一示例性实施例示出的一种图像解析方法的流程图。该方法可应用于服务器中。该方法通过解析模型实现，该解析模型包括：特征共享模块、语义分割模块、边缘检测模块。如图1所示，根据本公开实施例的图像解析方法包括：Fig. 1 is a flow chart of an image analysis method according to an exemplary embodiment. This method can be applied to the server. The method is realized through an analysis model, and the analysis model includes: a feature sharing module, a semantic segmentation module, and an edge detection module. As shown in Figure 1, the image analysis method according to the embodiment of the present disclosure includes:

在步骤S101中，通过所述特征共享模块对待解析图像进行特征提取处理，获取共享特征，所述共享特征包括经所述特征共享模块的多个网络层处理得到的多个网络深度的特征信息；In step S101, the feature sharing module performs feature extraction processing on the image to be analyzed to obtain shared features, and the shared features include feature information of multiple network depths obtained by processing multiple network layers of the feature sharing module;

在步骤S102中，分别通过所述语义分割模块和所述边缘检测模块对所述共享特征进行语义分割处理和边缘检测处理，获取所述待解析图像的初步语义分割结果及初步边缘检测结果。In step S102, perform semantic segmentation processing and edge detection processing on the shared features through the semantic segmentation module and the edge detection module respectively, and obtain preliminary semantic segmentation results and preliminary edge detection results of the image to be analyzed.

根据本公开的实施例，能够通过解析模型的特征共享模块提取待解析图像的共享特征，并分别通过语义分割模块和边缘检测模块对共享特征进行语义分割处理和边缘检测处理以获取初步语义分割结果及初步边缘检测结果，从而提高了待解析图像的语义分割结果与边缘检测结果之间的一致性。According to the embodiments of the present disclosure, the shared features of the image to be parsed can be extracted through the feature sharing module of the parsing model, and the semantic segmentation processing and edge detection processing are performed on the shared features through the semantic segmentation module and the edge detection module respectively to obtain preliminary semantic segmentation results and preliminary edge detection results, thereby improving the consistency between the semantic segmentation results of the image to be parsed and the edge detection results.

举例来说，语义分割和边缘检测有一些关键的共同点，都需要能够稠密的识别出物体以及其位置，而这些都需要通过相邻像素之间的区别这种低层信息和用于定位的高层信息来确定。因此，可以采用部分共享的卷积神经网络来提取待解析图像的共享特征，从而提高特征提取的效率。For example, semantic segmentation and edge detection have some key points in common. Both need to identify objects and their positions densely, and both rely on low-level information, such as the differences between adjacent pixels, together with high-level information used for localization. Therefore, a partially shared convolutional neural network can be used to extract the shared features of the image to be parsed, thereby improving the efficiency of feature extraction.

针对包括有一个或多个对象(例如人体、动物、车辆等)的待解析图像，可以将该待解析图像输入到预先训练的解析模型中进行处理，解析模型可以包括多个全卷积神经网络。该解析模型可以通过共享部分卷积运算提取待解析图像的共享特征；将共享特征分别输入语义分割和物体边缘检测两个模块中进行语义分割处理和边缘检测处理，分别确定待解析图像的初步语义分割结果及初步边缘检测结果。For an image to be analyzed that includes one or more objects (such as human bodies, animals, or vehicles), the image can be input into a pre-trained analysis model for processing, and the analysis model can include multiple fully convolutional neural networks. The analysis model can extract the shared features of the image by sharing part of the convolution operations. The shared features are then input into the semantic segmentation module and the object edge detection module for semantic segmentation processing and edge detection processing, which determine the preliminary semantic segmentation result and the preliminary edge detection result of the image respectively.

图2是根据一示例性实施例示出的解析模型的示意图。如图2所示，该解析模型包括：特征共享模块21、语义分割模块22、边缘检测模块23。Fig. 2 is a schematic diagram of an analytical model according to an exemplary embodiment. As shown in FIG. 2 , the analysis model includes: a feature sharing module 21 , a semantic segmentation module 22 , and an edge detection module 23 .

在一种可能的实现方式中，特征共享模块21可包括级联的第一卷积池化网络(conv1+pool1)、第二残差网络(res2)、第三残差网络(res3)、第四残差网络(res4)以及第五残差网络(res5)；共享特征包括第三残差网络(res3)、第四残差网络(res4)以及第五残差网络(res5)分别输出的特征信息。In a possible implementation, the feature sharing module 21 may include a cascaded first convolutional pooling network (conv1+pool1), a second residual network (res2), a third residual network (res3), a fourth residual network (res4), and a fifth residual network (res5); the shared features include the feature information output by the third residual network (res3), the fourth residual network (res4), and the fifth residual network (res5) respectively.

特征共享模块21(主干神经网络)可采用全卷积神经网络模型，例如采用例如101层的ResNet残差网络模型。如图2所示，特征共享模块21可以包括级联的conv1+pool1(卷积1+池化1)、res2(残差2)、res3(残差3)、res4(残差4)、res5(残差5)等多个卷积神经网络。其中，多个卷积神经网络的不同的网络深度对应不同的特征信息，多个网络深度的特征的组合有利于提高图像解析的准确度。本公开对特征共享模块21的具体神经网络模型及模型的具体网络层数不作限制。The feature sharing module 21 (backbone neural network) can adopt a fully convolutional neural network model, such as a 101-layer ResNet residual network model. As shown in Figure 2, the feature sharing module 21 may include cascaded conv1+pool1 (convolution 1+pooling 1), res2 (residual 2), res3 (residual 3), res4 (residual 4), res5 (residual 5) and other multiple convolutional neural networks. Among them, different network depths of multiple convolutional neural networks correspond to different feature information, and the combination of features of multiple network depths is conducive to improving the accuracy of image analysis. The present disclosure does not limit the specific neural network model of the feature sharing module 21 and the specific number of network layers of the model.

在一种可能的实现方式中，将待解析图像25输入到特征共享模块21中进行特征提取处理，可获取待解析图像25的多个共享特征201(例如图2中神经网络res3、res4、res5输出的三个特征信息)，该共享特征包括经特征共享模块的多个网络层处理得到的多个网络深度的特征信息，从而体现待解析图像25的不同深度信息。可以将多个共享特征201分别输入到语义分割模块22和边缘检测模块23中进行处理。应当理解，可以获取任意数量的神经网络层输出的多个共享特征，本公开对此不作限制。In a possible implementation, the image to be analyzed 25 is input into the feature sharing module 21 for feature extraction processing, and a plurality of shared features 201 of the image 25 can be obtained (for example, the three pieces of feature information output by the networks res3, res4, and res5 in FIG. 2). The shared features include feature information at multiple network depths obtained through processing by multiple network layers of the feature sharing module, and thus reflect information about the image 25 at different depths. The plurality of shared features 201 can be input into the semantic segmentation module 22 and the edge detection module 23 respectively for processing. It should be understood that shared features output by any number of neural network layers can be obtained, and the present disclosure does not limit this.
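The idea of tapping the outputs of several cascaded stages can be sketched as follows. This is a toy illustration, not the disclosed network: each stage is an arbitrary callable standing in for a real convolutional or residual block, and the function name `extract_shared_features` and the `tap_names` parameter are hypothetical.

```python
def extract_shared_features(image, stages, tap_names=("res3", "res4", "res5")):
    """Run cascaded stages in order and keep the outputs of the tapped stages."""
    x = image
    shared = {}
    for name, stage in stages:
        x = stage(x)  # each stage consumes the previous stage's output
        if name in tap_names:
            shared[name] = x  # feature information at this network depth
    return shared


# Placeholder stages: each one just increments a counter to make the depth visible.
stages = [
    ("conv1+pool1", lambda v: v + 1),
    ("res2", lambda v: v + 1),
    ("res3", lambda v: v + 1),
    ("res4", lambda v: v + 1),
    ("res5", lambda v: v + 1),
]
shared = extract_shared_features(0, stages)
```

Both downstream modules would then receive the same `shared` dictionary, so the backbone computation is performed only once per image.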

在一种可能的实现方式中，语义分割模块22(语义分割网络)可以分割出图像中对象(人体)的各个部分，例如分割出图像中人体的头部、上半身、手臂、上衣、裙子等各个部分。初步语义分割结果可包括语义特征和语义分割图。如图2所示，语义分割模块22可以包括级联的多个卷积层以及金字塔池化层(Pyramid Pooling)。可以将多个共享特征201组合为新的特征层，并输入到级联的多个卷积层中进行语义分割处理，根据多个网络深度的特征信息来得到不同网络深度下的语义特征202；然后可将语义特征202输入到金字塔池化层中进行融合处理，得到语义分割图203；并且可以将语义特征202和语义分割图203共同作为初步语义分割结果。如图2所示，语义分割图203可表示为图像26。In a possible implementation, the semantic segmentation module 22 (semantic segmentation network) can segment the parts of an object (a human body) in the image, for example the head, upper body, arms, coat, and skirt of the human body. The preliminary semantic segmentation result may include semantic features and a semantic segmentation map. As shown in FIG. 2, the semantic segmentation module 22 may include multiple cascaded convolutional layers and a pyramid pooling layer. The multiple shared features 201 can be combined into a new feature layer and input into the cascaded convolutional layers for semantic segmentation processing, so that semantic features 202 at different network depths are obtained from the feature information at multiple network depths. The semantic features 202 can then be input into the pyramid pooling layer for fusion processing to obtain the semantic segmentation map 203, and the semantic features 202 and the semantic segmentation map 203 can together serve as the preliminary semantic segmentation result. As shown in FIG. 2, the semantic segmentation map 203 may be represented as image 26.
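The disclosure does not detail the pyramid pooling layer. As a rough illustration of the general technique, the sketch below average-pools a single-channel feature map into grids of several bin sizes. It assumes the common formulation of pyramid pooling, assumes the map is at least as large as the largest bin count, and omits the upsampling and fusion that would normally follow. The function names are hypothetical.

```python
def adaptive_avg_pool(fmap, bins):
    """Average-pool an H x W map (nested lists) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        y0, y1 = i * h // bins, (i + 1) * h // bins
        row = []
        for j in range(bins):
            x0, x1 = j * w // bins, (j + 1) * w // bins
            cells = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def pyramid_pool(fmap, bin_sizes=(1, 2, 4)):
    """Pool the same map at several scales, keyed by bin size."""
    return {b: adaptive_avg_pool(fmap, b) for b in bin_sizes}


fmap = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
pooled = pyramid_pool(fmap)
```

The coarse levels summarize global context while the fine levels preserve local detail, which is why such a layer suits the fusion of features from several depths.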

在一种可能的实现方式中，边缘检测模块23(边缘检测网络)可以检测出图像中对象(人体)的边缘位置。初步边缘检测结果可包括边缘特征和边缘检测图。如图2所示，边缘检测模块23可以包括级联的多个卷积层以及金字塔池化层。可以通过级联的多个卷积层对多个共享特征201分别进行不同的扩散卷积运算，来检测不同网络深度的边缘特征信息(多个边缘特征204)；并且可以将卷积部分不同的特征层进行聚合，并采用多个网络深度的边缘特征信息来获取图像中所有的边缘信息，从而得到边缘检测图205；并且，可以将边缘特征204和边缘检测图205共同作为初步边缘检测结果。如图2所示，边缘检测图205可表示为图像27。In a possible implementation, the edge detection module 23 (edge detection network) can detect the edge positions of an object (a human body) in the image. The preliminary edge detection result may include edge features and an edge detection map. As shown in FIG. 2, the edge detection module 23 may include multiple cascaded convolutional layers and a pyramid pooling layer. The cascaded convolutional layers can apply different dilated (atrous) convolution operations to the multiple shared features 201 to detect edge feature information at different network depths (multiple edge features 204). The different feature layers of the convolutional part can then be aggregated, and the edge feature information at multiple network depths can be used to capture all edge information in the image, yielding the edge detection map 205. The edge features 204 and the edge detection map 205 can together serve as the preliminary edge detection result. As shown in FIG. 2, the edge detection map 205 may be represented as image 27.

图3是根据一示例性实施例示出的一种图像解析方法的流程图。在一种可能的实现方式中，所述解析模型还包括聚合优化模块。如图3所示，在步骤S102之后，所述方法还包括：Fig. 3 is a flowchart showing an image analysis method according to an exemplary embodiment. In a possible implementation manner, the analytical model further includes an aggregation optimization module. As shown in Figure 3, after step S102, the method further includes:

In step S103, the preliminary semantic segmentation result and the preliminary edge detection result are input into the aggregation optimization module and combined into a feature layer;

In step S104, the feature layer is aggregated in the aggregation optimization module, and multiple convolutional network layers of the aggregation optimization module are used for optimization processing to determine the semantic segmentation result and the edge detection result for the image to be analyzed.

For example, the aggregation optimization module 24 (aggregation optimization network) can be used to refine the results of semantic segmentation and edge detection. After the preliminary semantic segmentation result and the preliminary edge detection result are obtained, they can be further optimized by the aggregation optimization module of the analysis model. As shown in FIG. 2, the aggregation optimization module 24 may include multiple cascaded convolutional layers and two pyramid pooling layers, one for semantic segmentation and one for edge detection. The semantic features 202, the semantic segmentation map 203, the edge features 204, and the edge detection map 205 can be combined into a new feature layer and input into the cascaded convolutional layers of the aggregation optimization module 24 for aggregation, with the feature information of multiple network depths used for optimization. After processing by the pyramid pooling layer for semantic segmentation and the pyramid pooling layer for edge detection, the final semantic segmentation result 206 (which may be represented as image 28 in FIG. 2) and the final edge detection result 207 (which may be represented as image 29 in FIG. 2) can be determined for the image to be analyzed.

In this way, the semantic segmentation result and the edge detection result of the image to be analyzed can be obtained, and the accuracy of image analysis is improved.

In a possible implementation, before the image to be analyzed is processed, an initial analysis model can be trained to determine the weights of its parameters, so that the final analysis model obtained through training meets the accuracy requirement. The training process of the analysis model is described below.

FIG. 4 is a flowchart of an image analysis method according to an exemplary embodiment. FIG. 5a, FIG. 5b, and FIG. 5c are schematic diagrams of sample images according to an exemplary embodiment.

As shown in FIG. 4, in a possible implementation, the method may further include:

In step S105, a sample image is input into the initial analysis model for processing to obtain a training analysis result for the sample image, wherein the initial analysis model includes an initial feature sharing module, an initial semantic segmentation module, an initial edge detection module, an initial aggregation optimization module, and a monitoring module;

In step S106, the model loss of the sample image is determined according to the expected analysis result of the sample image and the training analysis result, wherein the model loss of the sample image includes a weighted sum of the model losses of the initial feature sharing module, the initial semantic segmentation module, the initial edge detection module, and the initial aggregation optimization module;

In step S107, the parameter weights in the initial analysis model are adjusted according to the model loss of the sample image to determine an adjusted analysis model;

In step S108, if the model loss of the sample image satisfies the training condition, the adjusted analysis model is determined as the final analysis model.

For example, sample images can be used to train the initial analysis model. As shown in FIG. 5a, the sample images may be e-commerce model photographs, public academic images (datasets), and the like. Pixel-level object positions can be annotated in the sample images (as shown in FIG. 5b), and each part can be assigned to the human body it belongs to (as shown in FIG. 5c, which distinguishes human bodies 1 to 11). A sample image annotated with object positions can serve as the expected analysis result.

FIG. 6 is a schematic diagram of an initial analysis model according to an exemplary embodiment. As shown in FIG. 6, the initial analysis model may include an initial feature sharing module 61, an initial semantic segmentation module 62, an initial edge detection module 63, an initial aggregation optimization module 64, and a monitoring module 65.

The monitoring module 65 may include multiple atrous spatial pyramid pooling layers (Atrous Spatial Pyramid Pooling, ASPP). The number of ASPPs may equal the number of training shared features output by the initial feature sharing module 61 (for example, 3). Multiple pieces of supervisory information can be input into the multiple ASPPs respectively to generate multiple monitoring edge features, which, together with the multiple training shared features output by the initial feature sharing module 61, serve as the input of the initial edge detection module 63. In this way, the training speed and training effect of the entire initial analysis model can be improved.

In a possible implementation, the sample image can be input into the initial feature sharing module 61 for feature extraction to determine multiple training shared features of the sample image. The training shared features are input into the initial semantic segmentation module 62 for semantic segmentation processing to determine the training semantic features and the training semantic segmentation map of the sample image (the preliminary semantic segmentation result). The monitoring features are input into the monitoring module 65 for processing to determine the monitoring edge features. The training shared features and the monitoring edge features are input into the initial edge detection module 63 for edge detection processing to determine the training edge features and the training edge detection map of the sample image (the preliminary edge detection result). The training semantic features, the training semantic segmentation map, the training edge features, and the training edge detection map are input into the initial aggregation optimization module 64 for optimization to determine the training analysis result for the sample image (that is, the semantic segmentation result and the edge detection result). The process of obtaining the training analysis result of the sample image is similar to that of obtaining the semantic segmentation result and the edge detection result of the image to be analyzed, and is not repeated here.

In a possible implementation, the model loss of the sample image can be determined according to the expected analysis result of the sample image (that is, the manually annotated object positions and semantic segmentation) and the training analysis result. The model loss of the sample image includes a weighted sum of the model losses of the initial feature sharing module 61, the initial semantic segmentation module 62, the initial edge detection module 63, the initial aggregation optimization module 64, and the monitoring module 65. The loss function of the entire analysis model can be as shown in formula (1):

In formula (1), L can represent the model loss of the entire analysis model, L_seg can represent the semantic segmentation loss function of the aggregation optimization module, L′_seg can represent the semantic segmentation loss function of the semantic segmentation module, L_edge can represent the edge detection loss function of the aggregation optimization module, L′_edge can represent the edge detection loss function of the edge detection module, and the remaining loss term can represent the monitoring loss function of the n-th ASPP of the monitoring module, where n ranges from 1 to N and N can represent the number of ASPPs in the monitoring module.

Here, α and β denote the coefficients of the semantic segmentation part and the edge detection part respectively. They can be set by a practitioner according to the actual situation in order to adjust the weights of the semantic segmentation part and the edge detection part within the entire analysis model.

The loss functions L_seg, L′_seg, L_edge, L′_edge, and the monitoring loss functions may each be a loss function known in the art; the present disclosure does not limit the specific choice of each loss function.
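The weighted-sum idea can be sketched as follows. Formula (1) itself is not available in this text, so this is only one possible grouping, not the formula of the disclosure: the sketch assumes that the two segmentation losses are scaled by α and that the two edge losses plus the N monitoring losses are scaled by β. The function name `total_loss`, the parameter names, and the numeric values are hypothetical.

```python
def total_loss(seg_final, seg_prelim, edge_final, edge_prelim,
               monitor_losses, alpha=1.0, beta=1.0):
    """Combine per-module losses into a single scalar.

    ASSUMPTION: the monitoring (ASPP) losses are grouped with the edge
    detection part; the disclosure's formula (1) may group terms differently.
    """
    seg_part = seg_final + seg_prelim                          # L_seg + L'_seg
    edge_part = edge_final + edge_prelim + sum(monitor_losses)  # edge terms + N ASPP terms
    return alpha * seg_part + beta * edge_part


# Hypothetical loss values for one sample image, with N = 3 ASPP branches.
loss = total_loss(0.5, 0.25, 0.125, 0.125, [0.5, 0.25, 0.25],
                  alpha=2.0, beta=0.5)
print(loss)
```

Raising `alpha` relative to `beta` makes the optimizer favour the segmentation outputs, and vice versa, which is the adjustment the coefficients are described as providing.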

In a possible implementation, the parameter weights in the initial analysis model can be adjusted according to the model loss to determine the adjusted analysis model. For example, a backpropagation algorithm can compute the gradient of the model loss with respect to the parameter weights, and the parameter weights in the initial analysis model can be adjusted based on that gradient. If the model loss satisfies the training condition, for example a set number of training iterations has been reached and/or a set convergence condition is met, the adjusted analysis model can be determined as the final analysis model.
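The adjust-then-check loop can be illustrated with a deliberately tiny stand-in. The sketch below runs plain gradient descent on a one-parameter quadratic rather than on a neural network; the function names `train` and `toy_loss_and_grad`, the learning rate, and the tolerance are all assumptions made for illustration.

```python
def train(weight, loss_and_grad, lr=0.1, max_iters=1000, tol=1e-8):
    """Gradient descent that stops on an iteration cap or a convergence test."""
    loss, grad = loss_and_grad(weight)
    iters = 0
    # Training condition: iteration budget reached OR loss below tolerance.
    while iters < max_iters and loss >= tol:
        weight -= lr * grad              # adjust the parameter along the negative gradient
        loss, grad = loss_and_grad(weight)
        iters += 1
    return weight, loss, iters


def toy_loss_and_grad(w):
    """Toy stand-in for the model loss: (w - 3)^2 and its derivative."""
    return (w - 3.0) ** 2, 2.0 * (w - 3.0)


final_w, final_loss, used_iters = train(0.0, toy_loss_and_grad)
print(final_w, final_loss, used_iters)
```

In a real setting the gradient would come from backpropagation through the whole model, but the stopping logic (iteration cap and/or convergence threshold) has the same shape.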

In this way, the training process of the analysis model is carried out.

FIG. 7 is a flowchart of an image analysis method according to an exemplary embodiment. FIG. 8 is a schematic diagram of an image analysis method according to an exemplary embodiment. As shown in FIG. 7, in a possible implementation, after step S104, the method may further include:

In step S109, line segmentation processing is performed on the edge detection result to determine multiple segmented regions in the image to be analyzed;

In step S110, aggregation processing is performed on the multiple segmented regions according to the semantic segmentation result to determine the aggregated region of at least one analysis object in the image to be analyzed;

In step S111, the aggregated region of the at least one analysis object is associated with the semantic segmentation result to determine the analysis result for the at least one analysis object.

For example, after the analysis model has performed semantic segmentation and edge detection on the image to be analyzed simultaneously, optimized the outputs, and determined the semantic segmentation result and the edge detection result, the analysis result of at least one analysis object in the image can be determined from those two results, so that one or more analysis objects in the image are parsed.

As shown in FIG. 8, after the image to be analyzed 81 is processed by the analysis model 82 according to the present disclosure (Detection-Free Network, DFN), the semantic segmentation result (segmentation map 821) and the edge detection result (edge map 822) of the image to be analyzed 81 can be obtained. The one or more analysis objects in the image to be analyzed 81 can then be separated from one another.

In a possible implementation, the edge detection result may include an edge map, and step S109 may include:

In a possible implementation, the edge map 822 (or the edge map 822 together with the segmentation map 821) can be scanned in the horizontal and vertical directions, and line segmentation processing can be performed on the edge map 822, as shown by the horizontal line segmentation map 831 and the vertical line segmentation map 832 in FIG. 8. Line segmentation yields horizontal and vertical line segments. Taking horizontal scanning as an example, each row of the image is scanned and background regions are skipped automatically. When the scan reaches an edge point, that point becomes the start of a horizontal line segment, and the next edge point reached becomes its end; each line segment is given a number. Vertical line segments are obtained in the same way. All line segments are treated as a connected graph, and the regions covered by connected line segments are regarded as belonging to the same person (object). The parts of each person can thus be gathered together to obtain a per-person segmentation, that is, the multiple segmented regions shown in segmentation map 833 in FIG. 8.
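The horizontal scan and the grouping of line segments can be sketched as below. This is a simplified reading, not the disclosed procedure: it assumes binary 2-D lists for the edge map and for a foreground mask derived from the segmentation map, treats each pair of consecutive edge points with only foreground between them as one segment, and connects segments on adjacent rows whose column ranges overlap (vertical segments would be handled analogously). The names `horizontal_segments` and `group_segments` are hypothetical.

```python
def horizontal_segments(edge_map, foreground):
    """Scan each row; emit (segment_id, row, start_col, end_col) tuples."""
    segments = []
    for r, row in enumerate(edge_map):
        edge_cols = [c for c, v in enumerate(row) if v]
        for start, end in zip(edge_cols, edge_cols[1:]):
            # Skip spans that cross background pixels.
            if all(foreground[r][c] for c in range(start, end + 1)):
                segments.append((len(segments), r, start, end))
    return segments


def group_segments(segments):
    """Union-find over segments; returns one group id per segment."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (_, r1, s1, e1) in enumerate(segments):
        for j, (_, r2, s2, e2) in enumerate(segments):
            # Connect segments on neighbouring rows with overlapping columns.
            if j > i and abs(r1 - r2) == 1 and s1 <= e2 and s2 <= e1:
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(segments))]


# Toy 2x8 image with two objects separated by a background column.
edge_map = [[0, 1, 0, 1, 0, 1, 0, 1],
            [0, 1, 0, 1, 0, 1, 0, 1]]
foreground = [[0, 1, 1, 1, 0, 1, 1, 1],
              [0, 1, 1, 1, 0, 1, 1, 1]]

segments = horizontal_segments(edge_map, foreground)
groups = group_segments(segments)
print(segments)
print(groups)
```

In the toy input the four segments fall into two groups, one per object, which mirrors how connected line segments are gathered into per-person regions.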

In a possible implementation, the multiple segmented regions shown in segmentation map 833 are strongly affected by false edge detections, because falsely detected edge points produce many small regions. Therefore, in step S110, the multiple segmented regions can be aggregated according to the semantic segmentation result to determine the aggregated region of at least one analysis object in the image to be analyzed.

In a possible implementation, step S110 may include at least one of the following steps:

For example, among the multiple segmented regions, if a segmented region is large (its size is greater than or equal to a first threshold) and contains multiple semantic segmentation results, it can be regarded as the aggregated region of a single analysis object. If a target segmented region is small (its size is less than a second threshold) and contains only one semantic segmentation result, it is regarded as the product of incorrect aggregation and can be merged into the aggregated region closest to it. For example, the distance between the two closest points of any two segmented regions can be computed, and the target segmented region can be merged into the segmented region at the smallest distance. After all segmented regions have been processed, the optimized final aggregation result is obtained, as shown by 841 in FIG. 8. In this way, the accuracy of the aggregation result can be improved.
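The two rules can be sketched on toy data as follows. The sketch assumes that regions are sets of `(row, col)` pixels, that the semantic segmentation result is a dictionary from pixel to part label, and that region size is the pixel count; regions matching neither rule are simply left out here, which is a choice of this sketch rather than something the text specifies. The names `aggregate` and `region_distance` and the threshold values are hypothetical.

```python
from math import hypot


def region_distance(a, b):
    """Distance between the two closest pixels of regions a and b."""
    return min(hypot(r1 - r2, c1 - c2) for r1, c1 in a for r2, c2 in b)


def aggregate(regions, labels, first_threshold, second_threshold):
    """Keep large multi-label regions; fold small single-label ones into the nearest."""
    def label_count(pixels):
        return len({labels[p] for p in pixels})

    # Rule 1: large region containing several semantic labels -> one object.
    objects = {rid: set(px) for rid, px in regions.items()
               if len(px) >= first_threshold and label_count(px) > 1}

    # Rule 2: small region with a single label -> merge into the closest object.
    for rid, px in regions.items():
        if rid in objects or not objects:
            continue
        if len(px) < second_threshold and label_count(px) == 1:
            nearest = min(objects,
                          key=lambda oid: region_distance(px, regions[oid]))
            objects[nearest] |= set(px)
    return objects


# Two person-sized regions and one stray fragment (hypothetical data).
regions = {
    "A": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "B": {(0, 8), (0, 9), (1, 8), (1, 9)},
    "s": {(1, 3)},
}
labels = {(0, 0): "head", (0, 1): "head", (1, 0): "top", (1, 1): "top",
          (0, 8): "head", (0, 9): "head", (1, 8): "top", (1, 9): "top",
          (1, 3): "top"}

merged = aggregate(regions, labels, first_threshold=4, second_threshold=2)
print(sorted(merged))
print(sorted(merged["A"]))
```

The stray pixel at `(1, 3)` lies 2 pixels from region A and 5 from region B, so it is folded into A.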

It should be understood that the specific values of the first threshold and the second threshold can be set according to the actual situation, and that any region merging method known in the art can be used to merge the target segmented region; the present disclosure imposes no limitation in this respect.

In a possible implementation, after the aggregated region of at least one analysis object of the image to be analyzed 81 is obtained, the aggregated region can be associated with the semantic segmentation result, so that the semantic segmentation result is mapped onto each person (object) and the analysis result for the at least one analysis object is determined. For example, 841 in FIG. 8 can be associated with the semantic segmentation map 821 to determine the analysis result shown in 851. In FIG. 8, 86 is the manually annotated reference analysis result for the image to be analyzed. Since 851 is close to 86, the image to be analyzed can be parsed accurately.
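Associating aggregated regions with the semantic segmentation result amounts to a per-object lookup, sketched below. The data layout (object id mapped to a set of pixels, pixel mapped to a part label) and the function name `parse_objects` are assumptions for illustration.

```python
def parse_objects(objects, labels):
    """Map each object id to {part_label: set of pixels carrying that label}."""
    result = {}
    for oid, pixels in objects.items():
        parts = {}
        for p in pixels:
            # Look up the semantic label of this pixel and file it under the object.
            parts.setdefault(labels[p], set()).add(p)
        result[oid] = parts
    return result


# Hypothetical aggregated regions for two people and a shared label map.
objects = {
    "person_1": {(0, 0), (1, 0)},
    "person_2": {(0, 5), (1, 5), (2, 5)},
}
labels = {(0, 0): "head", (1, 0): "top",
          (0, 5): "head", (1, 5): "top", (2, 5): "skirt"}

parsed = parse_objects(objects, labels)
print(sorted(parsed["person_2"]))
```

The output distinguishes, for instance, the "head" of one person from the "head" of another, which a semantic segmentation map alone cannot do.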

The image analysis method according to the embodiments of the present disclosure can use the analysis model to perform semantic segmentation and edge detection on the image to be analyzed simultaneously, followed by aggregation and optimization, to determine high-accuracy semantic segmentation and edge detection results. The analysis result of at least one analysis object in the image is then determined from those results, so that one or more analysis objects in the image are parsed accurately.

The image analysis method according to the embodiments of the present disclosure dispenses with a human body detection network and can train a deep convolutional neural network as a joint network to perform person parsing. The analysis result provides pixel-level localization of each part of the human body, and the segmented clothing regions help with subsequent recognition of clothing attributes. Moreover, the joint network trains semantic segmentation and edge detection at the same time, which helps improve the performance of both tasks and simplifies the network training process.

FIG. 9 is a block diagram of an image analysis device according to an exemplary embodiment. As shown in FIG. 9, the device is implemented through an analysis model that includes a feature sharing module, a semantic segmentation module, and an edge detection module. The device includes:

a shared feature acquisition unit 901, configured to perform feature extraction processing on the image to be analyzed through the feature sharing module to obtain shared features, the shared features including feature information of multiple network depths obtained through processing by multiple network layers of the feature sharing module;

a preliminary result determination unit 902, configured to perform semantic segmentation processing and edge detection processing on the shared features through the semantic segmentation module and the edge detection module respectively, to obtain the preliminary semantic segmentation result and the preliminary edge detection result of the image to be analyzed.

In a possible implementation, the preliminary semantic segmentation result includes semantic features and a semantic segmentation map; and/or the preliminary edge detection result includes edge features and an edge detection map.

Regarding the device in the above embodiments, the specific manner in which each unit performs its operations has been described in detail in the embodiments of the method and is not elaborated here.

FIG. 10 is a block diagram of an image analysis device 1900 according to an exemplary embodiment. For example, the device 1900 may be provided as a server. Referring to FIG. 10, the device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions to perform the above method.

The device 1900 may also include a power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, for example the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the device 1900 to carry out the above method.

The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present disclosure.

The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punched cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used here, is not to be construed as a transient signal per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (for example, light pulses through a fiber-optic cable), or electrical signals transmitted through a wire.

The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement various aspects of the present disclosure.

Aspects of the present disclosure are described here with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing device, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing device, and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The computer-readable program instructions may also be loaded onto a computer, another programmable data processing device, or other equipment, so that a series of operational steps is performed on the computer, other programmable data processing device, or other equipment to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing device, or other equipment implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.

The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used here was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed here.