

Technical Field
The present invention belongs to the field of computer and image processing technology, and in particular relates to a method and system for classifying gastric cancer pathological images based on HER2 gene detection.
Background
Gastric cancer is a common malignant tumor. The important molecular pathogenic pathways of gastric cancer known to date include PI3K/AKT/mTOR, the MAPK signaling pathways (ERK, JNK, p38), and the Hippo pathway, which are closely related to apoptosis and autophagy of gastric cancer cells as well as to tumor size, depth of invasion, and distant metastasis. However, even though targeted therapy for gastric cancer has shown considerable promise in basic research, it has proven difficult to translate into the clinic; at present only human epidermal growth factor receptor 2 (HER2) has been approved as a target for clinical targeted therapy of gastric cancer [1].
Trastuzumab is a targeted drug directed specifically against HER2. Its combination regimen extends overall survival to 16 months compared with chemotherapy alone, a clear improvement over the average survival of less than one year under traditional chemotherapy for advanced gastric cancer. Multiple studies have shown that anti-HER2 therapy (mainly monoclonal antibody drugs against the HER2 protein) has a very significant curative effect in both in vitro and in vivo models of gastric cancer.
However, the objective response rate of trastuzumab in actual clinical trials is only 47.3%, so developing a fast and accurate method to predict the efficacy of trastuzumab before treatment has important clinical value. HER2 expression in gastric cancer is highly heterogeneous, and different patients also show different degrees of heterogeneity in their pathological images; the therapeutic effect of trastuzumab is related to the heterogeneity of HER2 expression in gastric cancer. Among patients treated with trastuzumab, it has been observed that patients whose gastric cancer shows more homogeneous HER2 expression have longer progression-free survival (PFS) and overall survival (OS) than patients with more heterogeneous tumors [2][3].
Clinically, the heterogeneity of HER2 expression in gastric cancer is mainly evaluated by immunohistochemistry and in situ hybridization on pathological sections. Immunohistochemistry can partially assess HER2 expression status, while in situ hybridization is the gold standard for HER2 detection. However, even with a relatively accurate assessment of HER2, it remains difficult to classify patients' gastric cancer pathological images accurately and effectively, and thus difficult to help doctors predict the efficacy of trastuzumab treatment.
The following documents provide technical background relevant to the present invention:
[1] Zhang Ruihao, Zhang Ming. Research progress on anti-HER2 therapy for advanced gastric cancer [J]. Journal of Medical Postgraduates, 2022 (035-002).
[2] Gravalos, C., & Jimeno, A. (2008). HER2 in gastric cancer: a new prognostic factor and a novel therapeutic target. Annals of Oncology, 19(9), 1523-1529.
[3] Huemer, F., Weiss, L., Regitnig, P., Winder, T., Hartmann, B., Thaler, J., et al. (2020). Local and Central Evaluation of HER2 Positivity and Clinical Outcome in Advanced Gastric and Gastroesophageal Cancer—Results from the AGMT GASTRIC-5 Registry. Journal of Clinical Medicine, 9(4), 935.
Summary of the Invention
In order to solve the above problem in the prior art, namely that the prior art cannot accurately classify gastric cancer pathological images and therefore cannot effectively assist doctors in predicting the efficacy of trastuzumab treatment, the present invention provides a gastric cancer pathological image classification method based on HER2 gene detection. The method comprises:
Step S10: acquiring a digital pathological slice image of the stomach of a gastric cancer patient, the slice image containing information on the expression of the HER2 gene in the cancer tissue;
Step S20: performing region extraction and segmentation on the slice image to obtain multiple image patches of a set pixel size, each containing a set tissue type;
Step S30: performing data augmentation, normalization, and data partitioning on the multiple image patches of the set pixel size containing the set tissue type to obtain training image bags;
Step S40: iteratively training a constructed ResNet convolutional neural network with the training image bags to obtain a gastric cancer pathological image classification model;
Step S50: classifying digital pathological slice images of a patient's stomach acquired in real time with the gastric cancer pathological image classification model to obtain an image classification result.
In some preferred embodiments, the digital pathological slice image of the stomach of the gastric cancer patient is an image stained by hematoxylin-eosin (H&E) staining;
the expression of the HER2 gene in the cancer tissue is obtained by immunohistochemistry and includes negative and positive information.
In some preferred embodiments, step S20 comprises:
Step S21: connecting the cancer region boundary points in the annotation file to obtain the cancer region boundary, and converting the cancer region boundary into an image mask;
Step S22: downsampling the slice image and the image mask to a set level;
Step S23: extracting regions from the downsampled slice image using the downsampled image mask, and dividing the extracted sub-regions into image patches of the set pixel size;
Step S24: for each image patch, calculating the proportion of its area that overlaps the ROI region, extracting the patches whose proportion exceeds a set threshold, and thereby obtaining multiple image patches of the set pixel size containing the set tissue type.
In some preferred embodiments, the downsampling is performed as follows:
where H and W are the height and width of the slice image and image mask after downsampling, height and width are the height and width of the slice image and image mask before downsampling, and level is the downsampling scale.
In some preferred embodiments, the data augmentation and normalization are performed as follows:
each of the multiple image patches of the set pixel size containing the set tissue type is horizontally flipped, vertically flipped, and randomly rotated to obtain an augmented patch set;
the brightness and contrast of each patch in the augmented patch set are normalized to obtain an augmented and normalized patch set.
In some preferred embodiments, the data partitioning is performed as follows:
the augmented and normalized patch set is randomly divided to obtain a collection of image bags, each composed of a set number of patches;
the label of each patch is determined, and:
if an image bag contains at least one patch with a positive label, the bag is marked as a positive multi-instance bag; otherwise, it is marked as a negative multi-instance bag.
In some preferred embodiments, the label of each patch is determined as follows:
the feature map of the patch is extracted with the ResNet convolutional neural network;
max pooling is applied to the feature map, and the probabilities that the patch belongs to the positive class and the negative class are computed with a softmax normalization function.
In some preferred embodiments, the gastric cancer pathological image classification model is trained as follows:
Step B10: extracting the image texture and cell morphology of the patches in the training image bags, and excluding gastric stromal cells and glandular cells, to obtain preprocessed training image bags;
Step B20: constructing a ResNet convolutional neural network and, for each downsampling scale of the gastric digital pathological slice images, training a single-scale multi-instance learning network that predicts bag labels, based on the preprocessed training image bags;
Step B30: based on the weights of the single-scale multi-instance learning networks, performing multi-instance learning on the multi-scale preprocessed training image bags to obtain the gastric cancer pathological image classification model.
In some preferred embodiments, the ResNet convolutional neural network includes a set number of residual blocks;
each residual block includes, connected in sequence, a 3×3 convolutional layer, a batch normalization layer, a ReLU activation function, a 3×3 convolutional layer, a batch normalization layer, and a ReLU activation function.
In another aspect, the present invention proposes a gastric cancer pathological image classification system based on HER2 gene detection, the system comprising:
a data acquisition module configured to acquire a digital pathological slice image of the stomach of a gastric cancer patient, the slice image containing information on the expression of the HER2 gene in the cancer tissue;
a region extraction and segmentation module configured to perform region extraction and segmentation on the slice image to obtain multiple image patches of a set pixel size, each containing a set tissue type;
a data bagging module configured to perform data augmentation, normalization, and data partitioning on the multiple image patches of the set pixel size containing the set tissue type to obtain training image bags;
a model training module configured to iteratively train a constructed ResNet convolutional neural network with the training image bags to obtain a gastric cancer pathological image classification model;
a classification module configured to classify digital pathological slice images of a patient's stomach acquired in real time with the gastric cancer pathological image classification model to obtain an image classification result.
Beneficial effects of the present invention:
(1) The gastric cancer pathological image classification method based on HER2 gene detection of the present invention can analyze H&E-stained gastric cancer pathological sections and provide information on the patient's HER2 gene expression and predicted survival time, accurately classifying the gastric digital pathological slice images of patients at each stage. This can effectively assist doctors in judging whether HER2-targeted therapy is appropriate, improve the prognosis of patients who need targeted therapy, and avoid administering HER2-targeted therapy to patients for whom it would be futile.
(2) In the model training of the present invention, a single-scale multi-instance learning network that predicts bag labels is first trained, and multi-instance learning is then performed on top of the single-scale network. This improves the performance of the trained model and further improves the accuracy of gastric cancer pathological image classification.
(3) The present invention adds an attention mechanism to the model. The attention mechanism helps the model assign different weights to each part of the input and extract the most critical and important information, so the model makes more accurate judgments without adding significant computation or storage overhead, further improving the accuracy of gastric cancer pathological image classification.
Brief Description of the Drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is a schematic flow chart of the gastric cancer pathological image classification method based on HER2 gene detection of the present invention;
Fig. 2 is a schematic flow chart of multi-scale pathological image bag label prediction in the gastric cancer pathological image classification method based on HER2 gene detection of the present invention.
Detailed Description
The present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention and do not limit it. It should also be noted that, for convenience of description, only the parts related to the invention are shown in the drawings.
It should be noted that, where there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application will be described in detail below with reference to the accompanying drawings and embodiments.
The present invention provides a gastric cancer pathological image classification method based on HER2 gene detection. Developing an accurate and effective classification method for gastric cancer pathological images, and on this basis assisting doctors in predicting the efficacy of anti-HER2 targeted therapy accurately, stably, and conveniently, has important clinical value and is an urgent clinical problem to be solved.
The gastric cancer pathological image classification method based on HER2 gene detection of the present invention comprises:
Step S10: acquiring a digital pathological slice image of the stomach of a gastric cancer patient, the slice image containing information on the expression of the HER2 gene in the cancer tissue;
Step S20: performing region extraction and segmentation on the slice image to obtain multiple image patches of a set pixel size, each containing a set tissue type;
Step S30: performing data augmentation, normalization, and data partitioning on the multiple image patches of the set pixel size containing the set tissue type to obtain training image bags;
Step S40: iteratively training a constructed ResNet convolutional neural network with the training image bags to obtain a gastric cancer pathological image classification model;
Step S50: classifying digital pathological slice images of a patient's stomach acquired in real time with the gastric cancer pathological image classification model to obtain an image classification result.
In order to describe the gastric cancer pathological image classification method based on HER2 gene detection of the present invention more clearly, the steps of an embodiment of the present invention are described in detail below with reference to Fig. 1.
The gastric cancer pathological image classification method based on HER2 gene detection of the first embodiment of the present invention includes steps S10 to S50, each described in detail as follows:
Step S10: acquiring a digital pathological slice image of the stomach of a gastric cancer patient, the slice image containing information on the expression of the HER2 gene in the cancer tissue.
The digital pathological slice image of the patient's stomach is an image stained by hematoxylin-eosin (H&E) staining, and the expression of the HER2 gene in the cancer tissue is obtained by immunohistochemistry, including negative and positive information.
In one embodiment of the present invention, the data set consists of 163 WSIs (whole-slide digital pathology images, H&E / hematoxylin-eosin stained) collected from a cancer hospital. The data set contains four kinds of efficacy information: ORR (objective response rate), DCR (disease control rate), PFS (progression-free survival), and OS (overall survival).
Step S20: performing region extraction and segmentation on the slice image to obtain multiple image patches of a set pixel size, each containing a set tissue type.
The digital pathological slice image is first preprocessed: the whole large image is divided into many small images of 512×512 pixels, and each small image corresponds to a specific tissue type. A WSI is a rather special kind of medical image, typically tens of thousands to hundreds of thousands of pixels in both width and height. The whole image is therefore too large to be fed into a network after simple processing, as other medical images can be. Instead, the tumor regions must be extracted from the image with a mask, and the image must then be cut into small patches that can be used by a deep learning model.
In step S20, the tumor regions in the image are first extracted using the annotated image mask, and the image is then cut into small patches that can be used by a deep learning model. This specifically includes the following steps:
Step S21: connecting the cancer region boundary points in the annotation file to obtain the cancer region boundary, and converting the cancer region boundary into an image mask.
The annotation file consists of point information, which is converted into an image mask. The annotation file of an image is in XML format. It stores the boundary of each tissue category; the data consist of a series of points, each recording the position where a specialist doctor clicked on the image. Together these points form a polygon outlining the region of each tissue type.
Step S22: downsampling the slice image and the image mask to a set level.
The original WSI is too large and must be downsampled. In one embodiment of the present invention, the downsampling follows formula (1), where level represents the downsampling scale. The image and the corresponding annotation points are downsampled to the corresponding level according to this formula for further analysis:
where H and W are the height and width of the slice image and image mask after downsampling, and height and width are the height and width of the slice image and image mask before downsampling.
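Formula (1) is not reproduced in this text. A plausible reconstruction, assuming the conventional WSI pyramid in which each level halves the resolution, is shown below; the exact form used in the original disclosure may differ:

```latex
H = \frac{\text{height}}{2^{\text{level}}}, \qquad W = \frac{\text{width}}{2^{\text{level}}} \tag{1}
```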
Step S23: extracting regions from the downsampled slice image using the downsampled image mask, and dividing the extracted sub-regions into image patches of the set pixel size.
For the boundary polygon extracted in step S21, the boundary of its contour is first extracted; the maxima of the contour's upper, lower, left, and right boundaries enclose an image sub-region awaiting preprocessing (a local picture of the outlined region). The extracted picture is then cut into patch-level images of 512×512 pixels.
Step S24: for each image patch, calculating the proportion of its area that overlaps the ROI region, extracting the patches whose proportion exceeds a set threshold, and obtaining multiple image patches of the set pixel size containing the set tissue type.
For each segmented patch-level image, the area of its overlap with the polygonal ROI extracted in step S21 is computed. When the overlap reaches more than 75% of the patch area, the patch is regarded as valid data and stored among the partitioned patch-level images.
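As an illustration of steps S21–S24, the following is a minimal Python sketch of mask-based patch extraction and ROI-overlap filtering using OpenSlide, OpenCV, and NumPy. The function and parameter names (e.g. extract_patches, overlap_threshold) are illustrative assumptions, not part of the original disclosure.

```python
import numpy as np
import cv2
import openslide

def extract_patches(wsi_path, roi_points, level=2, patch_size=512, overlap_threshold=0.75):
    """Cut the annotated ROI of a WSI into patches that overlap the ROI by >= threshold."""
    roi_points = np.asarray(roi_points, dtype=np.int32)   # polygon vertices at this level
    slide = openslide.OpenSlide(wsi_path)
    w, h = slide.level_dimensions[level]
    # Rasterize the annotated polygon (already scaled to this level) into a binary mask.
    mask = np.zeros((h, w), dtype=np.uint8)
    cv2.fillPoly(mask, [roi_points], 1)
    # Bounding box of the ROI encloses the sub-region to be preprocessed.
    x0, y0 = roi_points[:, 0].min(), roi_points[:, 1].min()
    x1, y1 = roi_points[:, 0].max(), roi_points[:, 1].max()
    scale = slide.level_downsamples[level]
    patches = []
    for y in range(int(y0), int(y1), patch_size):
        for x in range(int(x0), int(x1), patch_size):
            sub_mask = mask[y:y + patch_size, x:x + patch_size]
            # Keep the patch only if >= 75% of its area lies inside the ROI polygon.
            if sub_mask.size and sub_mask.mean() >= overlap_threshold:
                region = slide.read_region((int(x * scale), int(y * scale)),
                                           level, (patch_size, patch_size))
                patches.append(np.asarray(region.convert("RGB")))
    return patches
```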
Step S30: performing data augmentation, normalization, and data partitioning on the multiple image patches of the set pixel size containing the set tissue type to obtain training image bags.
For the patches obtained in step S20, the HER2-negative cases (those labeled HER2=0 and HER2=1) and the HER2-positive cases (those labeled HER2=2 and HER2=3) are randomly split by patient into a training set and a test set at a ratio of 8:2.
The data augmentation and normalization are performed as follows:
each of the multiple image patches of the set pixel size containing the set tissue type is horizontally flipped, vertically flipped, and randomly rotated to obtain an augmented patch set;
the brightness and contrast of each patch in the augmented patch set are normalized to obtain an augmented and normalized patch set.
In one embodiment of the present invention, in order to enlarge the data set and avoid overfitting, the data are flipped horizontally or vertically during processing. In addition, the data set images are randomly rotated by 45 degrees.
For normalization, parameters such as brightness and contrast of the data set are normalized. For the three RGB color channels of the images, the means are normalized to [0.6209136, 0.39992052, 0.68346393] (the three values correspond to the R, G, and B channels respectively, likewise below) and the variances to [0.26443535, 0.30418476, 0.19353978].
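A minimal sketch of this augmentation and normalization pipeline using torchvision is given below. It assumes the channel statistics listed above are passed directly as the mean and std arguments of Normalize, which is an interpretation of the embodiment rather than a verbatim reproduction of it.

```python
from torchvision import transforms

# Augmentation and normalization pipeline for the 512x512 training patches.
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomRotation(degrees=45),          # random rotation of up to 45 degrees
    transforms.ToTensor(),                          # HWC uint8 -> CHW float in [0, 1]
    # Per-channel statistics taken from the embodiment text (assumed to be mean/std).
    transforms.Normalize(mean=[0.6209136, 0.39992052, 0.68346393],
                         std=[0.26443535, 0.30418476, 0.19353978]),
])
```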
In cancer pathology, tumor and non-tumor regions are intermixed, so a single WSI often contains pathological tissue from both HER2-negative and HER2-positive regions. The present invention therefore draws on multiple instance learning (MIL). Specifically, the different patches from one WSI are organized into one or several bags. During this organization, similar pathological images are grouped into the same bag, so that texture and staining within a bag are relatively similar. If a bag contains both positive and negative pathological images, it is labeled as a positive example; if it contains only negative images, it is labeled as a negative example. Furthermore, when pathologists diagnose patients, they examine slides at different magnifications. To mimic this, the present invention considers patches at multiple scales.
The data partitioning is performed as follows:
the augmented and normalized patch set is randomly divided to obtain a collection of image bags, each composed of a set number of patches;
the label of each patch is determined, and:
if an image bag contains at least one patch with a positive label, the bag is marked as a positive multi-instance bag; otherwise, it is marked as a negative multi-instance bag.
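The bag-labeling rule above can be summarized with the following sketch; the bag size and helper names are illustrative assumptions.

```python
import random

def make_bags(patches, patch_labels, bag_size=32, seed=0):
    """Randomly group patches into bags; a bag is positive if any of its patches is positive."""
    indices = list(range(len(patches)))
    random.Random(seed).shuffle(indices)
    bags = []
    for start in range(0, len(indices), bag_size):
        idx = indices[start:start + bag_size]
        bag = [patches[i] for i in idx]
        # Standard MIL assumption: one positive instance makes the whole bag positive.
        bag_label = int(any(patch_labels[i] == 1 for i in idx))
        bags.append((bag, bag_label))
    return bags
```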
The label of each patch is determined as follows:
the feature map of the patch is extracted with the ResNet convolutional neural network;
max pooling is applied to the feature map, and the probabilities that the patch belongs to the positive class and the negative class are computed with a softmax normalization function.
Step S40: iteratively training the constructed ResNet convolutional neural network with the training image bags to obtain the gastric cancer pathological image classification model.
The gastric cancer pathological image classification model is trained as follows:
Step B10: extracting the image texture and cell morphology of the patches in the training image bags, and excluding gastric stromal cells and glandular cells, to obtain preprocessed training image bags.
The patch images to be segmented are preprocessed to extract information such as image texture and cell morphology, and gastric stromal cells, glandular cells, and other tissues that differ substantially from tumor tissue are excluded.
Step B20: constructing a ResNet convolutional neural network and, for each downsampling scale of the gastric digital pathological slice images, training a single-scale multi-instance learning network that predicts bag labels, based on the preprocessed training image bags.
The convolutional neural network used in the present invention is ResNet18, whose main structural unit is the residual block. Each residual block first contains two 3×3 convolutional layers with the same number of output channels, each followed by a batch normalization layer and a ReLU activation function. A cross-layer shortcut then skips these two convolutions and adds the input directly before the final ReLU activation function, as shown in Table 1:
Table 1
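A minimal PyTorch sketch of the basic residual block described above is shown below. It follows the standard ResNet18 BasicBlock design and is an illustration only; the contents of Table 1 are not reproduced in this text, so the exact layer configuration may differ.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: two 3x3 conv + BN layers with a shortcut connection."""
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # 1x1 projection on the shortcut when the spatial size or channel count changes.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)       # skip connection added before the final ReLU
        return torch.relu(out)
```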
Before model training, the model is pretrained to improve its performance.
Multi-instance learning is first performed on the patch-level images of a single-scale WSI:
For each WSI downsampling scale, a single-scale multi-instance learning network is trained to predict the bag labels, where each bag contains only patches from the image at downsampling scale s; the network is optimized to obtain the bag label predictions.
As shown in Fig. 2, which illustrates the multi-scale pathological image bag label prediction flow of the gastric cancer pathological image classification method based on HER2 gene detection of the present invention, the training task of single-scale multi-instance learning is formulated as the minimization problem shown in formula (2):
where the minimized term is the loss function of bag label prediction, defined simply as the cross entropy between the true class label and the predicted class label. Here, through the attention mechanism, the final bag class label is predicted using only the instances with larger attention weights.
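Formula (2) is not reproduced in this text. Based on the description, a plausible form of the single-scale objective at scale $s$ is the cross-entropy over bags, with $\theta_s$ the network parameters, $Y_b$ the true label of bag $b$, and $\hat{Y}_b$ the label predicted from the attention-weighted instances; the notation of the original may differ:

```latex
\min_{\theta_s} \; \frac{1}{B} \sum_{b=1}^{B} \mathrm{CE}\big(Y_b, \hat{Y}_b(\theta_s)\big) \tag{2}
```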
The attention mechanism is a data processing method in machine learning that imitates how people, when observing things, can extract global characteristics from only a few important local features. The attention mechanism helps the model assign different weights to each part of the input and extract the most critical and important information, enabling the model to make more accurate judgments without adding significant computation or storage overhead. Using the attention mechanism makes deep learning more targeted when observing objects and improves the accuracy of both recognition and classification.
Step B21: the patches obtained from a gastric cancer pathological slice image are processed into 224×224×3 three-channel images and fed into the ResNet18 module; after 4 residual blocks (including batch normalization layers, convolutional layers, and activation functions), a feature map Y1 of size 7×7×512 is extracted.
Step B22: max pooling is applied to the feature map Y1, and the probabilities that the picture belongs to the positive class and the negative class are computed with a softmax normalization function.
Step B23: after all pictures in a bag have gone through steps B21 and B22, the attention mechanism produces the label output P of the bag (negative or positive) according to the different attention weights of the images in the bag.
Step B24: the cross-entropy loss between the output P and the true label of the gastric cancer pathological image bag is computed with the cross-entropy formula; the loss is back-propagated to the convolutional neural network with the gradient descent algorithm, and the parameters of the convolutional neural network and the attention network are updated.
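The following sketch illustrates steps B21–B24 with an attention-pooling MIL head on top of a ResNet18 backbone. The module names, attention dimension, and optimizer settings are illustrative assumptions rather than the exact architecture of the invention.

```python
import torch
import torch.nn as nn
from torchvision import models

class AttentionMIL(nn.Module):
    """ResNet18 instance encoder + attention pooling producing one bag-level prediction."""
    def __init__(self, num_classes=2, attn_dim=128):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])   # -> (N, 512, 1, 1)
        self.attention = nn.Sequential(nn.Linear(512, attn_dim), nn.Tanh(), nn.Linear(attn_dim, 1))
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, bag):                       # bag: (N, 3, 224, 224) patches of one bag
        feats = self.encoder(bag).flatten(1)      # (N, 512) instance features (step B21)
        attn = torch.softmax(self.attention(feats), dim=0)   # (N, 1) attention weights (B23)
        bag_feat = (attn * feats).sum(dim=0, keepdim=True)   # (1, 512) weighted bag feature
        return self.classifier(bag_feat), attn    # bag logits P and instance attention

model = AttentionMIL()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(bag_patches, bag_label):
    """One bag update (step B24): cross-entropy on the bag label, gradient-descent backprop."""
    logits, _ = model(bag_patches)                # bag_label: long tensor scalar (0 or 1)
    loss = criterion(logits, bag_label.view(1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```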
Step B30: based on the weights of the single-scale multi-instance learning networks, multi-instance learning is performed on the multi-scale preprocessed training image bags to obtain the gastric cancer pathological image classification model.
Multi-instance learning is then performed on the patch images at multiple scale levels:
Pathological patches of different scales are all placed into the bags, and the trained weights are further fine-tuned. A multi-scale network is trained to predict bag labels across scales, where each bag contains patches of different scales; the network is optimized to obtain the bag label predictions.
The multi-scale pathological image bag label prediction process is expressed as the function shown in formula (3):
where the feature extractors for scales s ∈ [S] have already been trained during the single-scale training; the parameters from the previous step are used here directly.
The training of the remaining parameter set is expressed as the minimization problem shown in formula (4):
Step S50: classifying digital pathological slice images of a patient's stomach acquired in real time with the gastric cancer pathological image classification model to obtain an image classification result.
After the classification result of the patient's gastric digital pathological slice image is obtained, the patient's ORR label (complete response (CR) or partial response (PR)) can also be predicted in combination with the classification result.
Prediction of the objective response rate (ORR):
The last fully connected layer of the gastric cancer pathological image classification model is changed to output the ORR prediction, and the optimizer uses the cross-entropy loss as the loss function, as shown in formula (5):
where y denotes the patient's true ORR label and ŷ denotes the predicted ORR label.
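Formula (5) is not reproduced in this text. A plausible form of the binary cross-entropy it describes, under the assumption that ŷ is the predicted probability of response, is:

```latex
\mathcal{L}_{\mathrm{ORR}} = -\big[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\big] \tag{5}
```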
Prediction of progression-free survival (PFS, the time from the start of observation to death or data censoring):
The last fully connected layer of the gastric cancer pathological image classification model is changed to output a single value, and the optimizer computes the loss function with the concordance index (C-index), as shown in formula (6):
where T denotes the survival times of two randomly selected patients i and j, evaluated on the true values and on the values predicted by the model; the indicator function equals 1 if patient i survives longer than patient j, and 0 otherwise.
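Formula (6) is likewise not reproduced. The C-index it refers to is conventionally written as follows, where $T_i$ is the observed survival time of patient $i$, $\hat{r}_i$ the model's predicted score, and $\mathbf{1}[\cdot]$ the indicator function over comparable patient pairs; the exact variant used in the original may differ:

```latex
C\text{-index} = \frac{\sum_{i \ne j} \mathbf{1}[T_i < T_j]\,\mathbf{1}[\hat{r}_i > \hat{r}_j]}{\sum_{i \ne j} \mathbf{1}[T_i < T_j]} \tag{6}
```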
After evaluating the model, the present invention further adjusts its loss function. The last fully connected layer of the gastric cancer pathological image classification model is again changed to output a single value, used to output the predicted progression-free survival (PFS); the optimizer and the regularization term remain unchanged, as shown in formula (7):
where N(E=1) denotes the number of patients with a recorded clinical event (in the present invention, patient death or disease progression), Ei=1 denotes each such case, an l2 regularization with parameter λ is applied to the network, the network output is the predicted value for each patient, and the risk set consists of the patients who are still alive at time t and will later die or progress.
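Formula (7) is not reproduced; its description matches the negative Cox partial log-likelihood with $\ell_2$ regularization. A plausible reconstruction, with $h_\theta(x_i)$ the network output for patient $i$ and $R(t_i)$ the risk set at time $t_i$, is:

```latex
\mathcal{L}_{\mathrm{PFS}} = -\frac{1}{N_{E=1}} \sum_{i : E_i = 1}
\Big( h_\theta(x_i) - \log \!\!\sum_{j \in R(t_i)} \! e^{\,h_\theta(x_j)} \Big)
+ \lambda \lVert \theta \rVert_2^2 \tag{7}
```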
Although the steps of the above embodiment are described in the above order, those skilled in the art will understand that, to achieve the effect of this embodiment, the different steps need not be executed in this order; they may be executed simultaneously (in parallel) or in reverse order, and such simple variations all fall within the protection scope of the present invention.
The gastric cancer pathological image classification system based on HER2 gene detection of the second embodiment of the present invention comprises:
a data acquisition module configured to acquire a digital pathological slice image of the stomach of a gastric cancer patient, the slice image containing information on the expression of the HER2 gene in the cancer tissue;
a region extraction and segmentation module configured to perform region extraction and segmentation on the slice image to obtain multiple image patches of a set pixel size, each containing a set tissue type;
a data bagging module configured to perform data augmentation, normalization, and data partitioning on the multiple image patches of the set pixel size containing the set tissue type to obtain training image bags;
a model training module configured to iteratively train a constructed ResNet convolutional neural network with the training image bags to obtain a gastric cancer pathological image classification model;
a classification module configured to classify digital pathological slice images of a patient's stomach acquired in real time with the gastric cancer pathological image classification model to obtain an image classification result.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process and related explanation of the system described above may refer to the corresponding process in the foregoing method embodiments and are not repeated here.
It should be noted that the gastric cancer pathological image classification system based on HER2 gene detection provided in the above embodiment is illustrated only by the division of the functional modules described above. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the modules or steps in the embodiments of the present invention may be further decomposed or combined. For example, the modules of the above embodiment may be merged into one module or further split into multiple sub-modules to accomplish all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only intended to distinguish the modules or steps and are not to be regarded as improper limitations of the present invention.
An electronic device of the third embodiment of the present invention comprises:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the processor, the instructions being executed by the processor to implement the above method for classifying gastric cancer pathological images based on HER2 gene detection.
A computer-readable storage medium of the fourth embodiment of the present invention stores computer instructions, the computer instructions being executed by a computer to implement the above method for classifying gastric cancer pathological images based on HER2 gene detection.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process and related explanation of the storage device and processing device described above may refer to the corresponding process in the foregoing method embodiments and are not repeated here.
Those skilled in the art should be aware that the modules and method steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. Programs corresponding to the software modules and method steps can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field. In order to clearly illustrate the interchangeability of electronic hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in electronic hardware or in software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The intelligent gastric cancer efficacy prediction device of the fifth embodiment of the present invention comprises a pathological image acquisition device, a prediction device, and a display device.
The pathological image acquisition device is used to acquire a digital pathological slice image of the stomach of a gastric cancer patient, the slice image containing information on the expression of the HER2 gene in the cancer tissue.
The pathological image acquisition device may be an ordinary camera plus a scanner, a video camera plus an image capture card, a microscopic digital camera, a microscopic scanner, or the like; these are not described in detail here.
The prediction device comprises an image processing module, a model training module, an image classification module, and a prediction module:
the image processing module is used to perform region extraction and segmentation on the slice image to obtain multiple image patches of a set pixel size, each containing a set tissue type, and to perform data augmentation, normalization, and data partitioning on those patches to obtain training image bags;
the model training module is used to iteratively train a constructed ResNet convolutional neural network with the training image bags to obtain a gastric cancer pathological image classification model;
the image classification module is used to classify digital pathological slice images of a patient's stomach acquired in real time with the gastric cancer pathological image classification model to obtain an image classification result;
the prediction module is used to predict the objective response rate (ORR) and/or progression-free survival (PFS) based on the gastric cancer pathological image classification model and the image classification result.
ORR includes complete response (CR) and partial response (PR).
The display device is used to display the prediction result output by the prediction device.
The terms "first", "second", and the like are used to distinguish similar objects, not to describe or represent a particular order or sequence.
The term "comprising" or any other similar term is intended to cover a non-exclusive inclusion, so that a process, method, article, or device/apparatus comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device/apparatus.
The technical solution of the present invention has thus been described with reference to the preferred embodiments shown in the drawings. However, those skilled in the art will readily understand that the protection scope of the present invention is obviously not limited to these specific embodiments. Without departing from the principles of the present invention, those skilled in the art may make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions will all fall within the protection scope of the present invention.