CN117976208A


Info

Publication number: CN117976208A
Application number: CN202410078409.9A
Authority: CN
Inventors: 郑俊
Original assignee: Guangzhou First Peoples Hospital
Current assignee: Guangzhou First Peoples Hospital
Priority date: 2024-01-18
Filing date: 2024-01-18
Publication date: 2024-05-03

Abstract

Translated from Chinese

The present invention provides an interpretable pancreatitis prediction system, device and storage medium. The system comprises: a data feature acquisition module for acquiring a model construction data set; a coarse modeling module for constructing an initial prediction model based on machine learning technology and the model construction data set; a feature dimension reduction module for performing interpretable feature analysis on the initial prediction model to obtain effective features; a prediction model training module for training the initial prediction model based on the effective features and the model construction data set to obtain a pancreatitis prediction model; and a prediction module for inputting sample data of a case to be predicted into the pancreatitis prediction model to obtain an interpretable prediction result. The system uses the pancreatitis prediction model to simplify the prediction process while making the model interpretable, so that the weights of the case features can be analyzed, thereby effectively improving the prediction accuracy of the system.

Description


An interpretable pancreatitis prediction system, device and storage medium

Technical Field

The present invention relates to the field of intelligent medical technology, and in particular to an interpretable pancreatitis prediction system, device and storage medium.

Background Art

Acute pancreatitis (AP) is a common digestive tract disease in clinical practice, with an incidence of about 4.9 to 73.4 per 100,000 people. About 20-30% of patients with acute pancreatitis progress to severe acute pancreatitis (SAP), whose mortality can reach 15.6%, causing a huge medical and economic burden. Identifying early, among patients with acute pancreatitis, those who tend to progress to severe acute pancreatitis is the basis for formulating reasonable medical strategies and implementing preventive treatment, and can effectively reduce patient mortality.

Several risk stratification scoring tools therefore exist, such as the Acute Physiology and Chronic Health Evaluation II (APACHE II) tool, the Bedside Index for Severity in Acute Pancreatitis (BISAP) tool, the Ranson scoring tool, the modified Marshall scoring tool and the SIRS scoring tool. All of them can be used to stratify the progression of AP and, to a certain extent, to identify patients who tend to progress to severe acute pancreatitis. However, the predictive AUROC of each scoring tool remains at a moderate level (0.6-0.8), and practical problems arise in application: the Ranson score takes too long, and the APACHE II score has too many items and is difficult to complete. The optimal cutoff value of each scoring tool also differs between studies, and interpretability is insufficient. Most models provide only a risk ratio (OR value) for each single feature, and different OR values cannot be combined by weight, which leaves clinicians at a loss in practical application.

Summary of the Invention

The present invention aims to provide an interpretable pancreatitis prediction system, device and storage medium to solve the above technical problems. It uses a pancreatitis prediction model to simplify the prediction process while making the model interpretable, so that the weights of the case features can be analyzed, thereby effectively improving the prediction accuracy of the system.

To solve the above technical problems, the present invention provides an interpretable pancreatitis prediction system, comprising:

a data feature acquisition module, used to acquire case sample data, extract case features based on the case sample data, and obtain a model construction data set from the case sample data and the case features;

a coarse modeling module, used to construct an initial prediction model based on machine learning technology and the model construction data set;

a feature dimension reduction module, used to perform interpretable feature analysis on the initial prediction model and obtain effective features;

a prediction model training module, used to train the initial prediction model based on the effective features and the model construction data set to obtain a pancreatitis prediction model;

a prediction module, used to store the pancreatitis prediction model and input sample data of a case to be predicted into the pancreatitis prediction model for prediction, so as to obtain an interpretable prediction result.

In the above scheme, the pancreatitis prediction model simplifies the prediction process, and the model is made interpretable so that the weights of the case features can be analyzed, thereby effectively improving the prediction accuracy of the system.

Furthermore, the data feature acquisition module includes a sample acquisition submodule, a feature extraction submodule and a data engineering submodule, wherein:

the sample acquisition submodule is used to acquire case sample data;

the feature extraction submodule is used to extract case features based on the case sample data;

the data engineering submodule is used to obtain the model construction data set from the case sample data and the case features.

Furthermore, the data engineering submodule includes a case labeling unit, an outlier removal unit, a semantic conversion encoding unit, a missing value filling unit and a normalization unit, wherein:

the case labeling unit is used to label the case sample data, so as to determine whether each case eventually progresses to severe acute pancreatitis;

the outlier removal unit is used to remove outliers from case features with continuous numerical data;

the semantic conversion encoding unit is used to perform semantic conversion encoding on case features with binary data;

the missing value filling unit is used to fill in missing values in the data after outlier removal and in the data after semantic conversion encoding;

the normalization unit is used to normalize the labeled data and the filled data, so that the case sample data and case features are converted into data types accepted by the initial prediction model, thereby obtaining the model construction data set.

Furthermore, the data engineering submodule also includes an oversampling unit, which is used to process the labeled data with a symmetric minority-sample oversampling method to obtain balanced data;

in this case the normalization unit is used to normalize the balanced data and the filled data, so that the case sample data and case features are converted into data types accepted by the initial prediction model, thereby obtaining the model construction data set.

Furthermore, the coarse modeling module includes a construction submodule, a training submodule and a performance evaluation submodule, wherein:

the construction submodule is used to construct several initial models based on machine learning technology;

the training submodule is used to train the several initial models on the model construction data set;

the performance evaluation submodule is used to evaluate the performance of the trained initial models and select the best one as the initial prediction model.
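As a rough illustration of the performance evaluation submodule, the sketch below ranks candidate models by AUROC and keeps the best one. It assumes AUROC is the selection metric (the text above does not name one), and the function names `auroc` and `select_best` as well as the toy scores are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best(labels, candidate_scores):
    """Return the name of the candidate model with the highest AUROC."""
    return max(candidate_scores, key=lambda name: auroc(labels, candidate_scores[name]))


# Toy validation labels and predicted probabilities for two hypothetical models.
labels = [0, 0, 1, 1]
candidates = {
    "model_a": [0.10, 0.40, 0.35, 0.80],
    "model_b": [0.10, 0.20, 0.70, 0.90],
}
best = select_best(labels, candidates)
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for a sketch; a production system would more likely rely on an established library routine.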

Furthermore, the feature dimension reduction module includes an importance plot analysis submodule, a summary plot analysis submodule, a regression analysis submodule, a principal component analysis submodule and an effective feature acquisition submodule, wherein:

the importance plot analysis submodule is used to perform interpretability analysis on the initial prediction model, so as to obtain the first contribution value of every case feature to the model prediction result and draw a feature importance plot;

the summary plot analysis submodule is used to perform interpretability analysis on the initial prediction model, so as to obtain the second contribution value of every case feature to the model prediction result and draw a feature summary plot;

the regression analysis submodule is used to perform regression analysis on the prediction results of the initial prediction model, so as to obtain correlation indicators between every case feature and the model prediction result;

the principal component analysis submodule is used to perform principal component analysis on the prediction results of the initial prediction model, so as to obtain principal component data for regression analysis and then select candidate features from all case features;

the effective feature acquisition submodule is used to obtain effective features from the case features based on the feature importance plot, the feature summary plot, the correlation indicators and the candidate features.

Furthermore, the effective features include pulmonary exudation, pleural effusion, heart rate, Glasgow score, serum creatinine, lactate dehydrogenase, triglycerides, CTSI grade, urea nitrogen, blood calcium, white blood cell count, blood glucose, hematocrit and history of cholecystitis.

Furthermore, the system also includes a visualization module, which is used to display the interpretable prediction result and provide the corresponding feature importance plot and feature summary plot.

The present invention also provides a device on which the interpretable pancreatitis prediction system is deployed and which provides an external network service interface.

The present invention also provides a computer storage medium storing computer instructions, the computer instructions being used to implement the interpretable pancreatitis prediction system.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of the architecture of an interpretable pancreatitis prediction system provided by an embodiment of the present invention;

FIG. 2 is an example diagram of the CT severity index of acute pancreatitis provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of a ROC curve provided by an embodiment of the present invention;

FIG. 4 is a SHAP feature importance ranking plot provided by an embodiment of the present invention;

FIG. 5 is a SHAP summary plot for the prediction of severe acute pancreatitis events provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram of the receiver operating characteristic curves of five models provided by an embodiment of the present invention;

FIG. 7 is a SHAP feature importance plot of the final model provided by an embodiment of the present invention;

FIG. 8 is a SHAP summary plot of the final model provided by an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.

Existing risk stratification scoring tools for severe acute pancreatitis have the following shortcomings:

1) Insufficient ease of use. For example, the APACHE II scoring tool is a point system that includes arterial blood gas measurement and a complex assessment of medical history, and the difficulty of carrying it out often makes the accuracy of its results hard to guarantee. Moreover, age carries a large weight in the APACHE II score: a patient older than 65 receives 5 points and needs only 4 more points to reach the threshold for diagnosing SAP, which calls its validity into question. The Ranson scoring tool requires two assessments, at 24 hours and 48 hours after presentation, and its predictive AUROC remains at a moderate level (0.6-0.8). Because the Ranson score consists of a first-day questionnaire and a second-day questionnaire, completing one round of diagnosis takes at least 48 hours, which delays diagnosis and intervention.

2) Few features and poor representativeness. For example, the BISAP tool, proposed in 2008, is a pancreatitis prognostic scoring tool containing only five assessment items. Although reducing the number of features improves its usability, it discards a large amount of information, so its predictive performance cannot be improved further.

3) The optimal cutoff value of each scoring tool differs between studies, which leaves clinicians at a loss in practical application. For example, the BISAP scoring tool is applied with two different cutoff values, BISAP>2 or BISAP>3. The sensitivity, specificity and accuracy with which these cutoffs predict SAP all differ, and each cutoff corresponds to a fixed morbidity or mortality (for example, BISAP>2 corresponds to a risk of death of ">1.5%"). Because different studies report different optimal cutoffs, clinicians using such tools often face the difficult choice of which study to trust, and even with one definite cutoff they obtain only a vague impression rather than a reasonably accurate diagnostic conclusion.

4) Almost all tools lack a mobile application interface. At present only some traditional scores have web-based scoring forms, and these provide little information, mainly a vague judgment of disease severity and a possible mortality rate.

5) The definition of the prediction outcome lacks uniformity. Studies conducted before 2012 were not based on the definition of SAP in the latest Revised Atlanta Classification (RAC). The RAC is one of the most widely used clinical standards for grading the severity of pancreatitis, but much of the research on existing assessment tools relies on earlier definitions. This leads to inconsistent definitions of the prediction outcome, which hinders both clinical practice and the sustainability of clinical research.

6) Low predictive ability. The area under the receiver operating characteristic curve (AUROC), which reflects predictive ability, generally remains at a moderate level (0.6-0.8) for traditional scoring tools.

7) No interpretability, that is, the tools cannot explain how each feature contributes to the prediction. Most assessment tools provide only a risk ratio (OR) for each single feature, which is difficult for clinicians to apply in practice. For example, the statement that people older than 60 have 1.6 times the risk of people younger than 60 may be of fairly precise value for research, but it gives clinicians only a vague impression of risk, because this anchoring approach requires first knowing the probability of disease in patients younger than 60. Furthermore, providing an OR value for each single feature ignores interactions between features. For example, comparing a patient older than 60 whose examinations are all normal with a patient younger than 60 who has severely elevated creatinine, severely reduced blood calcium and a high fever of 39 degrees, the latter is more seriously ill. Because OR values cannot be combined by weight, clinicians still have to correct the tool's output based on their own experience.

In view of the above shortcomings of the prior art, this embodiment provides an interpretable pancreatitis prediction system. It defines a clear prediction endpoint, so that prediction results are uniform, which benefits both clinical practice and the sustainability of clinical research. Its selected features cover multiple dimensions and reflect the patient's condition in detail, avoiding information loss: the 23 features initially included span demographic information, living habits, medical history, physical signs, laboratory tests and imaging examinations, which effectively improves the efficiency of pancreatitis prediction. The pancreatitis prediction model it builds is a machine learning model, and principal component analysis and logistic regression analysis are used to reduce the feature dimension and the number of features that must be collected, which keeps the model easy to use. It applies SHAP to make the pancreatitis prediction model interpretable, providing plots of the model's internal analysis to explain its operating logic; this helps ease clinicians' concerns about the model being a "black box" and builds confidence in its clinical application. It can also be packaged as a convenient, easy-to-use Internet application.

Moreover, to improve the overall predictive performance, the AUROC of the pancreatitis prediction model was raised as far as possible and false negative predictions were reduced as far as possible, so that the model does not underestimate the severity of the disease, which increases the safety of applying the prediction model.

Specifically, a block diagram of the architecture of the interpretable pancreatitis prediction system provided in this embodiment is shown in FIG. 1. The system includes a data feature acquisition module, which is used to acquire case sample data, extract case features based on the case sample data, and obtain the model construction data set from the case sample data and the case features;

In this embodiment, the pancreatitis prediction model simplifies the prediction process, and the model is made interpretable so that the weights of the case features can be analyzed, thereby effectively improving the prediction accuracy of the system.

To further describe the process of acquiring case sample data, this embodiment uses as case samples the handwritten and electronic medical records of patients who were initially diagnosed with acute pancreatitis in the outpatient or emergency department of a tertiary grade-A hospital between January 1, 2015 and January 24, 2022. According to the 2012 revision of the Atlanta classification and definitions of acute pancreatitis, "initially diagnosed acute pancreatitis" is defined as meeting two of the following three criteria:

1) abdominal pain consistent with AP;

2) serum lipase or amylase at least 3 times the upper limit of normal;

3) characteristic findings of AP on CT scan.

The conditions for including a case sample in the case sample data are:

1) the case meets the above definition of initially diagnosed AP;

2) the patient is younger than 80 years old.

The exclusion conditions are:

1) new-onset pancreatitis occurring in hospital;

2) pancreatitis caused by inpatient medication or clinical procedures;

3) patients who refused further medical intervention after the initial diagnosis.
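The diagnostic rule and the inclusion and exclusion conditions above can be sketched as a simple filter. This is a minimal illustration rather than the system's actual implementation: the `Presentation` record and its field names are hypothetical, and "two of the three criteria" is read here as "at least two".

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical record of one patient at first presentation."""
    typical_abdominal_pain: bool     # criterion 1
    enzyme_ratio_to_uln: float       # serum lipase or amylase / upper limit of normal
    ct_shows_ap_changes: bool        # criterion 3
    age: int
    hospital_onset: bool             # exclusion 1
    caused_by_inpatient_care: bool   # exclusion 2
    refused_further_care: bool       # exclusion 3


def meets_initial_ap_definition(p: Presentation) -> bool:
    """True when at least two of the three diagnostic criteria are met."""
    criteria = (
        p.typical_abdominal_pain,
        p.enzyme_ratio_to_uln >= 3.0,
        p.ct_shows_ap_changes,
    )
    return sum(criteria) >= 2


def is_eligible(p: Presentation) -> bool:
    """Apply the inclusion conditions, then the exclusion conditions."""
    included = meets_initial_ap_definition(p) and p.age < 80
    excluded = p.hospital_onset or p.caused_by_inpatient_care or p.refused_further_care
    return included and not excluded


# Example: typical pain plus enzymes at 3.5x the upper limit, CT unremarkable.
example = Presentation(True, 3.5, False, 62, False, False, False)
eligible = is_eligible(example)
```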

It should be noted that this embodiment conducts research only on data containing identifiable personal information for which the subjects can no longer be located, and the research project involves neither personal privacy nor commercial interests. The study was carried out after the ethics committee reviewed it and waived the requirement for signed informed consent. A total of 4110 cases were finally included as case sample data.

To further describe the case feature extraction process, this embodiment, drawing on prior research and experience, extracts 23 features and their data for each case in the case sample data within 24 hours of the initial diagnosis: sex, age, smoking history, drinking history, history of diabetes, history of hypertension, history of cholecystitis, body temperature, heart rate, systolic blood pressure, Glasgow score, Balthazar CT severity index (CTSI) for acute pancreatitis, pulmonary exudation, pleural effusion, white blood cell count, hematocrit, blood calcium, blood glucose, blood urea nitrogen, serum creatinine, aspartate aminotransferase, lactate dehydrogenase and triglycerides.

Among these, sex, age, smoking history, drinking history, history of diabetes, history of hypertension and history of cholecystitis can be obtained from the patient's medical records. Body temperature, heart rate, systolic blood pressure and Glasgow score are taken from the first measurements recorded in the medical records and manually verified. The pancreatic CT imaging changes of each case are graded 0-4 by manual review according to the Balthazar CT severity index (CTSI) for acute pancreatitis; see FIG. 2, in which the case in the upper-left panel 1 is labeled grade 1, the case in the upper-right panel 2 is labeled grade 2, the case in the lower-left panel 3 is labeled grade 3, and the case in the lower-right panel 4 is labeled grade 4. Chest X-ray results are used to determine whether pulmonary exudation and pleural effusion are present. White blood cell count, hematocrit, blood calcium, blood glucose, blood urea nitrogen, serum creatinine, aspartate aminotransferase, lactate dehydrogenase and triglycerides are taken from the first tests performed within 24 hours of presentation.

The features extracted in this embodiment reflect the condition of patients with acute pancreatitis. They are used for subsequent model training and provide the basis for further screening and dimension reduction.

To further describe the working process of the case labeling unit, this embodiment provides a process for labeling the case sample data. The data used to train the model must pair the features of each case with an outcome label, so that the trained model can learn the relationship between features and outcome. The test results of each case after 72 hours therefore need to be tracked; from them it can be determined whether the case's AP has progressed to severe acute pancreatitis. According to the diagnostic criteria of the 2012 Atlanta consensus on the classification and definitions of acute pancreatitis, severe acute pancreatitis is defined as meeting one of the following three conditions:

1) persistent organ failure lasting more than 48 hours;

2) failure of a single organ;

3) failure of multiple organs.

It should be noted that patients who died within 72 hours of admission are all counted as "progressed to severe acute pancreatitis". Setting the definition in this way standardizes the prediction and makes prediction results uniform, which benefits both clinical practice and the sustainability of clinical research.

In a specific implementation, the occurrence of MODS can be assessed with the SOFA and modified Marshall scoring tools: if any component of the Marshall score is ≥2 points, organ failure is considered to have occurred; if the daily change in SOFA is Δ≥2, the patient is considered to have an acute change in organ failure; when the two disagree, the SOFA score takes priority. The results are used to label the data for subsequent model construction and training.
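The labeling rule can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the SOFA change is treated as optional so that the Marshall score is used when SOFA is unavailable (the text does not cover that case), and the 48-hour persistence check is omitted.

```python
from typing import Optional, Sequence


def organ_failure(marshall_items: Sequence[int], sofa_daily_delta: Optional[int]) -> bool:
    """Combine the modified Marshall and SOFA rules, with SOFA taking priority."""
    marshall_flag = any(score >= 2 for score in marshall_items)
    if sofa_daily_delta is None:
        # Assumption of this sketch: fall back to Marshall when SOFA is missing.
        return marshall_flag
    sofa_flag = sofa_daily_delta >= 2
    if marshall_flag == sofa_flag:
        return marshall_flag
    # When the two tools disagree, the SOFA result is used.
    return sofa_flag


def label_sap(died_within_72h: bool, marshall_items: Sequence[int],
              sofa_daily_delta: Optional[int]) -> int:
    """Return 1 for 'progressed to SAP', else 0 (death within 72 h counts as 1)."""
    if died_within_72h:
        return 1
    return int(organ_failure(marshall_items, sofa_daily_delta))
```

One consequence of the priority rule is worth noting: whenever both scores are available, the SOFA result decides the outcome, so the Marshall score only matters in this sketch when SOFA is missing.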

To further describe the working process of the outlier removal unit, this embodiment provides a process for removing outliers from case features with continuous numerical data. Such features take values on a continuous scale, with no gaps or only very small gaps between possible values; the values may be real numbers or integers, for example heart rate, hematocrit and white blood cell count. To guard against the errors that inevitably arise during data entry, the interquartile range of each such feature can be calculated using IBM SPSS Statistics 25, and values lying more than 1.5 times the interquartile range below the 25th percentile or above the 75th percentile are treated as outliers. Cases containing outliers are removed.
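A minimal sketch of this outlier rule is given below, reading it as the usual 1.5 × IQR fence around the quartiles. The embodiment uses SPSS; this Python version is only an illustration, its quartile interpolation (`method="inclusive"`) may differ slightly from SPSS, and the function and field names are hypothetical.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_cases(cases, continuous_features, k=1.5):
    """Remove every case that has an outlier in any continuous feature."""
    bounds = {}
    for name in continuous_features:
        observed = [c[name] for c in cases if c.get(name) is not None]
        bounds[name] = iqr_bounds(observed, k)
    kept = []
    for case in cases:
        inside = True
        for name, (low, high) in bounds.items():
            value = case.get(name)
            # Missing values are left for the later filling step.
            if value is not None and not (low <= value <= high):
                inside = False
                break
        if inside:
            kept.append(case)
    return kept


# Toy heart-rate data with one implausible entry (200).
cases = [{"heart_rate": v} for v in [60, 62, 65, 70, 72, 75, 78, 80, 200]]
kept = drop_outlier_cases(cases, ["heart_rate"])
```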

To further describe the working process of the semantic conversion encoding unit, this embodiment provides a process for semantically encoding case features with binary data. Such features are those in which each data point belongs to exactly one of two categories, such as "yes" or "no", or "present" or "absent"; sex, history of cholecystitis and presence of pleural effusion are features of this kind. These features are converted to numeric codes: "no" and "absent" are uniformly coded as 0, and "yes" and "present" are uniformly coded as 1. For example, if a patient is male, the sex item for that case is coded as 1; if his chest X-ray shows pleural effusion and he has no history of cholecystitis, the pleural effusion item is coded as 1 and the cholecystitis history item is coded as 0.
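The encoding can be sketched as a small lookup. The text states only that male is coded as 1; coding female as 0 is an assumption of this sketch, and the token sets and names below are hypothetical.

```python
# Tokens mapped to 1 and 0; "female" -> 0 is assumed, not stated in the text.
POSITIVE = {"yes", "present", "male"}
NEGATIVE = {"no", "absent", "female"}


def encode_binary(value):
    """Map a binary answer to 1 or 0; keep None so it can be filled later."""
    if value is None:
        return None
    token = str(value).strip().lower()
    if token in POSITIVE:
        return 1
    if token in NEGATIVE:
        return 0
    raise ValueError(f"unrecognized binary value: {value!r}")


def encode_case(case, binary_features):
    """Return a copy of the case with the listed binary features encoded."""
    encoded = dict(case)
    for name in binary_features:
        encoded[name] = encode_binary(case.get(name))
    return encoded


# The worked example from the text: male, pleural effusion, no cholecystitis history.
patient = {"sex": "male", "pleural_effusion": "present", "cholecystitis_history": "no"}
coded = encode_case(patient, ["sex", "pleural_effusion", "cholecystitis_history"])
```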

To further describe the working process of the missing value filling unit provided in this embodiment, this embodiment provides a method for filling missing values in the data obtained after outlier removal and semantic conversion coding. Continuous numerical data in which missing values account for more than 15% of the total are first checked for normality, and features that follow a normal distribution are filled with the mean. In this embodiment, all collected continuous numerical data satisfied this premise, so each was filled using the mean. Binary data are filled with the mode.
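A minimal sketch of the two filling rules follows. Missing entries are represented by `None`, the normality check described above is omitted, and the function name `fill_missing` and the sample columns are hypothetical.

```python
from statistics import mean, mode


def fill_missing(column, kind):
    """Fill None entries with the mean (continuous) or the mode (binary)."""
    observed = [v for v in column if v is not None]
    fill = mean(observed) if kind == "continuous" else mode(observed)
    return [fill if v is None else v for v in column]


heart_rate = fill_missing([80, None, 100, 90], "continuous")
effusion = fill_missing([1, 0, None, 1], "binary")
```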

To further describe the working process of the normalization unit provided in this embodiment, this embodiment provides a normalization method. Normalization maps data measured in different units onto the common range [0,1] by means of a formula. Its purpose is to convert the case sample data and case features into a form suitable for the initial prediction model, thereby obtaining the model construction data set. In this embodiment, the Min-Max formula can be applied to normalize continuous numerical data:

X' = (X - X_min) / (X_max - X_min)

In the formula, X' is the normalized value, X is the original value, and X_min and X_max are the minimum and maximum values of the corresponding feature in the data set. For example, if the minimum of the "age" feature is 1 and the maximum is 87, normalizing "74 years old" with the Min-Max formula gives X' = (74 - 1) / (87 - 1) ≈ 0.85. Because binary data take only the values 0 or 1, which already lie within [0,1], they do not need to be normalized.
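The age example can be checked with a few lines of code. This is only a sketch of the Min-Max formula; the function name `min_max` is chosen for illustration.

```python
def min_max(x, x_min, x_max):
    """Map x from [x_min, x_max] onto [0, 1]."""
    return (x - x_min) / (x_max - x_min)


# Age 74 with an observed range of 1 to 87: 73 / 86, roughly 0.849.
age_scaled = min_max(74, 1, 87)
```

The minimum and maximum themselves map to 0 and 1, so every value of the feature lands inside [0,1].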

Because roughly 20%-30% of acute pancreatitis cases progress to severe disease, the two classes in the labeled data are necessarily imbalanced, and the small number of severe acute pancreatitis samples can lead to underfitting of the model. When some classes in a data set have far more samples than others, the model more readily learns the features of the majority class and neglects those of the minority class. The model then performs well on the training set but may perform poorly on the test set, because the minority-class samples in the test set are effectively ignored and cannot be predicted correctly. In this embodiment, the Synthetic Minority Over-sampling Technique (SMOTE) is therefore preferably used to oversample the minority class, yielding a new, relatively balanced data set that avoids this underfitting and improves prediction efficiency.
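The core idea of SMOTE is to create each synthetic minority sample by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is a simplified pure-Python illustration of that idea, not the implementation used in the embodiment; the sample points, the neighbour count `k`, and the function name `smote` are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical normalized feature vectors of severe (minority) cases.
severe = [(0.2, 0.9), (0.3, 0.8), (0.25, 0.7), (0.4, 0.85)]
new_points = smote(severe, n_new=6)
```

Every synthetic point lies on a line segment between two real minority samples, so it stays inside the region the minority class already occupies.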

Furthermore, it should be noted that the model construction data set can be split into a training data set and a test data set for model training. The specific procedure can be as follows: using Python 3.5, training and validation data sets are built from the model construction data set. If three-fold cross-validation is used, the model construction data set is randomly divided into three segments; in each round, two segments serve as the training data set and the remaining segment as the test data set. Three rounds are performed in total, the model weights are randomly re-initialized at the start of each round, and after training the model is validated on the corresponding test data set.
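The three-fold procedure can be sketched as an index split. This is an illustrative stdlib-only version, assuming cases are addressed by integer index; the function name `three_fold_splits` and the seed are hypothetical.

```python
import random


def three_fold_splits(n_cases, seed=0):
    """Yield (train, test) index lists for three-fold cross-validation."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::3] for i in range(3)]  # three disjoint random segments
    for held_out in range(3):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


splits = list(three_fold_splits(10))
```

Each case appears in exactly one test segment across the three rounds, and a fresh model would be trained on `train` and evaluated on `test` in each round.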

Specifically, the coarse modeling module includes a construction submodule, a training submodule, and a performance evaluation submodule, wherein:

To further describe the working process of the construction submodule provided in this embodiment, this embodiment provides a process for constructing initial models using the Logistic regression and support vector machine algorithms. The specific operation can be: programming with the tools provided in the sklearn package under Python 3.5, and constructing a Logistic regression initial model and a support vector machine initial model based on the 23 collected features.

To further describe the working process of the performance evaluation submodule provided in this embodiment, this embodiment uses three-fold cross-validation to internally validate the Logistic regression initial model and the support vector machine initial model, and applies each trained model to the corresponding test data set. Specifically, in Python, the roc_curve and plot_roc_curve functions of the scikit-learn library can be used to compute and plot the ROC curve, and the area under the receiver operating characteristic curve (AUROC) together with its 95% confidence interval is used to evaluate predictive performance, with attention also paid to sensitivity and specificity. As shown in Figure 3, the ROC (Receiver Operating Characteristic) curve depicts the predictive performance of a model: the horizontal axis is the false-positive rate (FPR), where FPR = 1 - specificity, and the vertical axis is the true-positive rate (TPR), which equals sensitivity. In Figure 3, the dashed line joining coordinates (0, 0) and (1, 1) is an auxiliary reference line.
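To make the relationship between FPR, TPR and AUROC concrete, the following sketch computes the ROC points and the trapezoidal area by hand rather than through scikit-learn. The labels and scores are hypothetical, and the function names are chosen for illustration only.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))  # (1 - specificity, sensitivity)
    return points


def auroc(points):
    """Trapezoidal area under the (FPR, TPR) curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels (1 = severe) and predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auroc(curve)
```

For this toy data the area equals 8/9, which is also the fraction of (severe, non-severe) pairs in which the severe case receives the higher score.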

In this embodiment, the results of the model performance evaluation are shown in Table 1:

Table 1 Performance evaluation results

As can be seen from the table, the support vector machine initial model is the better-performing model. The support vector machine initial model is therefore analyzed in the following steps, and dimensionality reduction and modeling optimization are carried out on that basis.

To further describe the working process of the importance plot analysis submodule provided in this embodiment, this embodiment provides a method for obtaining the first contribution value of every case feature to the model prediction results and drawing a feature importance plot. This embodiment uses the SHAP (SHapley Additive exPlanations) method, which explains the predictions of a machine learning model by calculating the contribution of each feature to the prediction. The specific operation is to call functions of the Python SHAP package to calculate SHAP values, including:

(1) Calculate the expected value of the model prediction: for a given data point, calculate the prediction of the model without considering the feature values of that data point, that is, the prediction when every feature takes its average value;

(2) Calculate the increment for each feature: for each feature, calculate the effect on the model prediction of changing that feature from its average value to its current value;

(3) Calculate the SHAP value: multiply the increment of each feature by the weight of that feature in the current data point, then sum the weighted increments of all features to obtain the SHAP value.

It should be noted that the SHAP value is calculated with a so-called "value-based" approach: the contribution of each feature to the model prediction is obtained by comparing the prediction made with the actual feature value against the prediction made with the average value. The SHAP value is calculated as follows:

SHAP(X) = output(X) - E[output]
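The additivity property behind SHAP (the per-feature contributions sum to the gap between the model output for a case and the expected output) can be checked with an exact Shapley computation on a small model. The sketch below is an illustration under stated assumptions, not the algorithm inside the SHAP package, which relies on a background data set and approximations: the expected output is taken to be the prediction at the average feature values, as in step (1), and `toy_model`, its coefficients, `baseline` and `case` are hypothetical.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values, averaging marginal contributions over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]  # switch feature i from its average to its actual value
            value = model(current)
            phi[i] += value - previous  # marginal contribution in this ordering
            previous = value
    return [p / len(orders) for p in phi]


def toy_model(features):
    """Hypothetical risk score with one interaction term."""
    calcium, heart_rate, effusion = features
    return 0.1 - 0.3 * calcium + 0.2 * heart_rate + 0.25 * effusion * heart_rate


baseline = [0.5, 0.5, 0.3]  # assumed average feature values
case = [0.2, 0.9, 1.0]      # assumed feature values of one case
phi = shapley_values(toy_model, case, baseline)
```

Here `sum(phi)` equals `toy_model(case) - toy_model(baseline)`, and the calcium term, which does not interact with the other features, receives exactly its own increment of -0.3 × (0.2 - 0.5) = 0.09.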

The function shap.summary_plot() of the SHAP package is called, together with the pyplot module of the matplotlib library, to draw the SHAP feature importance plot: the mean absolute SHAP value of each feature is calculated, the features are sorted by this value, and the result is displayed as a bar chart, as shown in Figure 4, from which the contribution of each feature to the prediction can be read.

It should be noted that SHAP evaluates the change in the result produced by including a feature, relative to all combinations of the other features without it, and thereby assigns each feature a consistent contribution value. The importance value of a feature is the mean of the absolute SHAP values of that feature over all cases. The higher the mean absolute SHAP value, the greater the contribution of the feature to the prediction.
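The ranking rule can be sketched in a few lines. The SHAP matrix below is hypothetical (one row per case, one column per feature), and `rank_by_mean_abs_shap` is an illustrative name; in the shap package, a bar chart of this quantity is what `shap.plots.bar` or `shap.summary_plot(..., plot_type="bar")` draws.

```python
def rank_by_mean_abs_shap(shap_matrix, feature_names):
    """Rank features by the mean of |SHAP| over all cases, largest first."""
    n_cases = len(shap_matrix)
    importance = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_cases
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


names = ["Ca", "HR", "TG"]
# Hypothetical SHAP values for three cases.
shap_matrix = [
    [-0.30, 0.10, 0.02],
    [0.25, -0.05, 0.01],
    [-0.20, 0.15, -0.03],
]
ranking = rank_by_mean_abs_shap(shap_matrix, names)
```

Taking the absolute value before averaging matters: the signed Ca values nearly cancel (mean of about -0.08), while their mean absolute value of 0.25 correctly marks Ca as the most influential feature in this toy example.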

Figure 4 shows that the top twelve features contribute most to the prediction, and that the contribution of the features ranked after the twelfth, triglycerides, drops off sharply, so those features can be considered for removal. The final selection also depends on the results of the summary plot analysis submodule, the regression analysis submodule, and the principal component analysis submodule.

To further describe the working process of the summary plot analysis submodule provided in this embodiment, this embodiment provides a method for obtaining the second contribution value of every case feature to the model prediction results and drawing a summary plot. In this embodiment, the function shap.plots.bar() of the SHAP package can be called, together with the pyplot module of the matplotlib library, to draw the SHAP summary plot and show the specific SHAP contribution of each case feature to the prediction. The SHAP summary plot is read as follows: the higher the SHAP value, the higher the probability that the model predicts the positive event "acute pancreatitis progresses to severe acute pancreatitis"; a negative SHAP value means a higher probability that the model predicts the negative event "acute pancreatitis does not progress to severe acute pancreatitis". Individual points are drawn in red, blue, or purple where the two overlap: the redder a point, the higher the feature value, and the bluer a point, the lower the feature value. The more clearly the colors separate along a feature's row, the more the feature contributes to the prediction when its value is high or low, and the more consistently the model predicts for a given range of values.

The summary plot drawn in this embodiment is shown in Figure 5. The figure shows clear color separation for calcium, pleural effusion, Glasgow score, lactate dehydrogenase, heart rate, creatinine, urea nitrogen, white blood cells, pulmonary infiltration, pancreatic morphology (that is, the CTSI score), and triglycerides; the model tends to give consistent predictions for these features, so they are preferred features.

To further describe the working process of the regression analysis submodule provided in this embodiment, this embodiment provides a method for obtaining correlation indicators between every case feature and the model prediction results. In this embodiment, SPSS 25 software can be used for logistic regression analysis; its output is given in Table 2. This method first requires judging whether the model fits well, which requires two conditions: 1) the overall accuracy is higher than the cutoff value of 0.5; 2) the significance of the Hosmer-Lemeshow goodness-of-fit test is greater than 0.05. The overall accuracy of the model is 0.89, which is greater than the cutoff value of 0.5, and the significance of the Hosmer-Lemeshow goodness-of-fit test is p = 0.61, which is greater than 0.05, so the null hypothesis "the model fits the observed values well" is retained. Because the model fits well, the correlation between the features and the prediction can be examined further (see Table 2):

Table 2 Overview of the correlation between features and prediction results

It should be noted that in Table 2 the features that correlate well with the prediction results are listed in descending order of odds ratio. The fitted slope is the model coefficient, which indicates the direction and magnitude of the effect of each independent variable on the dependent variable and can be used to explain the model's predictions. The odds ratio, which is the exponential of the fitted slope, is the factor by which the odds of the outcome change for a one-unit change in the independent variable, and can be used to compare the risk between two categories. The odds ratios and fitted slopes are provided here to cross-check the role of the features in the model. For example, pulmonary infiltration, blood calcium, and pleural effusion have the highest odds ratios and are also among the features with the highest SHAP values, which supports retaining them during dimensionality reduction.
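The link between the fitted slope and the odds ratio in a logistic regression can be verified numerically. The sketch below uses a hypothetical intercept and slope, not values from Table 2, and the function names are chosen for illustration.

```python
import math


def odds_ratio(slope):
    """Odds ratio of a logistic-regression coefficient: exp(slope)."""
    return math.exp(slope)


def predicted_probability(intercept, slopes, features):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(slope * x))))."""
    z = intercept + sum(b * x for b, x in zip(slopes, features))
    return 1 / (1 + math.exp(-z))


slope = 0.7  # hypothetical fitted slope for a binary feature
p_absent = predicted_probability(-1.0, [slope], [0])
p_present = predicted_probability(-1.0, [slope], [1])
odds_absent = p_absent / (1 - p_absent)
odds_present = p_present / (1 - p_present)
```

Switching the feature from 0 to 1 multiplies the odds by `odds_ratio(slope)`, about 2.01 here, regardless of the intercept; a slope of 0 corresponds to an odds ratio of 1, meaning no effect.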

As Table 2 shows, fifteen features have a significance of p < 0.05, so the hypothesis "correlates well with the model prediction results" is accepted for them, and these fifteen features are listed as candidate features.

To further describe the working process of the principal component analysis submodule provided in this embodiment, this embodiment provides a method for obtaining principal component data, performing regression analysis on them, and thereby obtaining candidate features from among all case features. Principal component analysis reveals the relationships between features, making it possible to determine which features make up each principal component, which principal components are statistically significant, and which feature should represent each principal component. In this embodiment, SPSS software can be used for principal component analysis, and Logistic regression analysis is performed on the standardized principal component data. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was 0.609, and a total of 8 principal components were extracted, together explaining 58.96% of the total variance. In the Logistic regression on the principal component data, the significance of the Hosmer-Lemeshow goodness-of-fit test was p = 0.20, which is greater than 0.05, so the null hypothesis "the model fits the observed values well" is retained. Except for principal components 3 and 6, the principal components correlate well with the model prediction results (significance p < 0.05), so the features in principal components 1, 2, 4, 5, 7, and 8 can be used as candidate features for dimensionality reduction. The principal components are summarized in Table 3:

Table 3 Overview of principal components

It should be noted that the symbol "∮" marks components with an eigenvalue > 1, and the symbol "※" marks principal components whose correlation with the prediction result in the Logistic regression has a significance of p < 0.05. The eigenvalue is a positive real number that indicates how much of the variance in the data a principal component explains. The larger the eigenvalue, the more of the data the principal component explains.

At this point, the effective feature acquisition submodule can obtain the effective features from the case features on the basis of the feature importance plot, the feature summary plot, the correlation indicators, and the candidate features, thereby achieving feature dimensionality reduction. Effective features are selected mainly according to the following conditions:

Condition 1: p < 0.05 in the correlation analysis of the regression analysis;

Condition 2: ranked in the top 50% in the SHAP importance analysis;

Condition 3: a constituent of a principal component identified as well correlated in the principal component analysis;

Condition 4: considered clinically to be related to the outcome.

The 14 effective features are therefore pulmonary infiltration, pleural effusion, heart rate, Glasgow score, serum creatinine, lactate dehydrogenase, triglycerides, CTSI grade, urea nitrogen, blood calcium, white blood cell count, blood glucose, hematocrit, and history of cholecystitis.

This embodiment obtains the best initial model and cross-checks it with three methods, SHAP analysis, Logistic regression correlation analysis, and principal component analysis, finally extracting 14 of the 23 features for use in the final modeling. The system thus achieves feature dimensionality reduction that is evidence-based, visual, and explainable.

Furthermore, in the prediction model training module, regression-based multiple imputation can be applied to the missing values of the model construction data set using SPSS software, in order to improve the quality of the filled-in data and the prediction efficiency of the model.

To further describe the working process of the prediction model training module provided in this embodiment, this embodiment provides a method for obtaining the pancreatitis prediction model. The method has two parts. In the first part, four models can be built in Python: logistic regression, support vector machine, multi-layer perceptron, and random forest. In the second part, XGBoost, a gradient boosting algorithm, is used to fuse the weak models built in the first part in parallel. XGBoost is a supervised learning algorithm, and the degree of parallel fusion can be controlled by setting its parallelism parameter. In practice, parameters such as max_depth (maximum depth), n_estimators (number of weak models), min_child_weight (minimum leaf node weight), subsample (subsampling ratio), and colsample_bytree (feature subsampling ratio) are tuned to obtain the final model with the best fit and satisfactory prediction efficiency, which serves as the pancreatitis prediction model.
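The tuning step can be sketched as an exhaustive search over candidate settings. The source names the parameters but not the values tried, so every value in `PARAM_GRID` is hypothetical, and `dummy_score` is a stand-in for "train an XGBoost model with these parameters and return its cross-validated AUROC" (for example via `xgboost.XGBClassifier(**params)`, which accepts all five parameter names).

```python
from itertools import product

# Hypothetical candidate values for the parameters named in the text.
PARAM_GRID = {
    "max_depth": [3, 5],
    "n_estimators": [100, 300],
    "min_child_weight": [1, 5],
    "subsample": [0.8, 1.0],
    "colsample_bytree": [0.8, 1.0],
}


def grid(param_grid):
    """Yield every combination of parameter values as a dict."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def best_setting(param_grid, score):
    """Return the parameter combination with the highest score."""
    return max(grid(param_grid), key=score)


def dummy_score(params):
    """Placeholder for a real training-and-validation run; not a real metric."""
    return -abs(params["max_depth"] - 3) - abs(params["subsample"] - 0.8)


candidates = list(grid(PARAM_GRID))
chosen = best_setting(PARAM_GRID, dummy_score)
```

With two candidate values for each of the five parameters, the grid contains 32 settings; replacing `dummy_score` with a real validation score would select the setting actually used for the final model.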

This embodiment finally builds the best-performing final XGBoost model as the pancreatitis prediction model, with an area under the receiver operating characteristic curve of 0.90, an accuracy of 0.90, a specificity of 0.94, and a sensitivity of 0.62. The prediction efficiency parameters of the five models are given in Table 4, and their receiver operating characteristic curves are shown in Figure 6.

Table 4 Prediction efficiency parameters of the five models

It should be noted that in Figure 6 the vertical axis is the true-positive rate and the horizontal axis is the false-positive rate. Abbreviations used in the figure: ROC Curve: receiver operating characteristic curve; MLP: multi-layer perceptron; Logistic Regression: logistic regression; Random Forest: random forest; SVM: support vector machine; XGBoost: XGBoost ensemble algorithm.

In this embodiment, the SHAP package under Python 3.5 can be used to analyze the final model and produce a SHAP importance plot (Figure 7) and a SHAP summary plot (Figure 8). The purpose of this module is to give clinicians a visual account of how the model reaches its predictions. Greater interpretability increases clinicians' trust in the prediction model and reduces the doubts raised by a "black box" model. In addition, by analyzing the feature weights of the prediction model, clinicians can learn the order of importance of the features that predict progression to severe acute pancreatitis and can see intuitively how high or low feature values affect the prediction.

It should be noted that the feature importance plot in Figure 7 shows the contribution of each feature to the prediction of severe pancreatitis, and the SHAP summary plot in Figure 8 shows how high or low values of a feature affect that prediction. Each dot represents the value of the feature for one case; the horizontal axis is the SHAP value, that is, the contribution of the feature to the prediction of severe pancreatitis. The higher the original value of the feature, the warmer the color of the dot; the lower the value, the cooler the color. Abbreviations used in the figures: Ca: blood calcium; HCT: hematocrit; GlassGlow: Glasgow score; LDH: lactate dehydrogenase; HR: heart rate; Glu: blood glucose; BUN: urea nitrogen; CTSI: CTSI score; Pulmonary_Infi: pulmonary infiltration, coded 1 for present and 0 for absent; WBC_C: white blood cell count; TG: triglycerides; Crea: serum creatinine; Pleyral_eff: pleural effusion, coded 1 for present and 0 for absent; Billiary_P: history of cholecystitis, coded 1 for present and 0 for absent.

This embodiment also provides a device on which the explainable pancreatitis prediction system described above is deployed and which provides an external network service interface.

In this embodiment, a shell script can be used to deploy the trained pancreatitis prediction model to a paid web server. A specific implementation can be: connect to the web server using the SSH (Secure Shell) remote login protocol, install the required dependency libraries on the server, write a deployment script in Python, load the model, and expose an API to external callers.

Using this web application, predictions can be made from the features extracted from the 100 case samples used in this embodiment: the samples are entered into the web application for prediction, and the predictions are compared with the progression of each case three days later. The prediction results are shown in Table 5:

Table 5 Confusion matrix of the pancreatitis prediction model results (100 cases)

The test in this embodiment demonstrates the generalization ability of the model: while achieving good prediction accuracy, the model maintains high sensitivity and specificity, which safeguards the safety of its application.
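Accuracy, sensitivity and specificity all follow directly from the four cells of a confusion matrix. The sketch below shows the arithmetic with hypothetical counts for 100 cases; these are not the values of Table 5, and the function name `confusion_metrics` is chosen for illustration.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Derive accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


# Hypothetical counts: 20 severe cases (15 detected), 80 non-severe (76 cleared).
metrics = confusion_metrics(tp=15, fn=5, fp=4, tn=76)
```

With these illustrative counts the accuracy is 0.91, the sensitivity 0.75 and the specificity 0.95; because severe cases are the minority, accuracy alone can look good even when sensitivity is low, which is why all three are reported.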

This embodiment also provides a computer storage medium storing computer instructions, the computer instructions being used to implement the explainable pancreatitis prediction system described above.

The above are preferred embodiments of the present invention. It should be pointed out that a person of ordinary skill in the art can make further improvements and refinements without departing from the principles of the present invention, and such improvements and refinements also fall within the scope of protection of the present invention.