Disclosure of Invention
The embodiment of the application provides a gastric cancer screening system, a gastric cancer screening method, a gastric cancer screening device, an electronic device and a storage medium, and aims to at least solve the problem of low accuracy and efficiency of gastric cancer screening in the related art.
In a first aspect, embodiments of the present application provide a gastric cancer screening system, which includes terminal equipment and main control equipment; the main control equipment comprises an input unit and a first prediction unit;
the terminal equipment is used for acquiring a first modal data set and a second modal data set and sending the first modal data set and the second modal data set to the main control equipment;
the input unit is used for inputting the first modality data set into a completely trained data conversion model and outputting a risk prediction result of the first modality data set;
the first prediction unit is used for inputting the risk prediction result and the second modality data set into a well-trained gastric cancer prediction model and outputting the gastric cancer prediction result; wherein the data conversion model and the gastric cancer prediction model are processed in different modalities.
In some of these embodiments, the main control equipment further comprises a second prediction unit;
the second prediction unit is configured to, when the first modality data set is detected to be missing, obtain a complementary value corresponding to the first modality data set, input the complementary value and the second modality data set to the gastric cancer prediction model, and output a gastric cancer prediction result.
In some embodiments, the terminal equipment is further configured to obtain a first preset data set and a second preset data set, and send the first preset data set and the second preset data set to the main control equipment; the main control equipment further comprises a model training unit;
the model training unit is used for performing first preprocessing on the first preset data set to obtain first training data, and the first training data comprises a first training set; the model training unit inputs the first training set into a preset neural network to obtain the data conversion model;
the model training unit is further configured to perform second preprocessing on the second preset data set to obtain second training data, where the second training data includes a second training set; the model training unit inputs the first preset data set into the data conversion model to obtain third training data, and the third training data comprise a third training set; and the model training unit inputs the second training set and the third training set into a preset decision tree model to obtain the gastric cancer prediction model.
In some of these embodiments, the main control equipment further comprises a sorting and screening unit;
the sorting and screening unit is used for acquiring the index characteristics of the second training set, acquiring a characteristic importance curve according to the index characteristics and acquiring a sorting and screening result according to the characteristic importance curve;
the model training unit is further configured to input the second training set and the third training set into the preset decision tree model, and train the preset decision tree model by using the sorting and screening result to obtain the gastric cancer prediction model.
In some of these embodiments, the first training data further comprises a first test set, and the second training data further comprises a second test set; the main control equipment further comprises an inspection unit;
the inspection unit is used for acquiring, according to preset accuracy, sensitivity and specificity indexes, a first detection result of the data conversion model by using the first test set, and a second detection result of the gastric cancer prediction model by using the second test set.
In some of these embodiments, the data transformation model is a DenseNet model and the gastric cancer prediction model is a Lightgbm model.
In a second aspect, embodiments of the present application provide a gastric cancer screening method, which is applied to the gastric cancer screening system described in the first aspect, and the method includes:
acquiring a first modal data set and a second modal data set of the terminal equipment;
inputting the first modality data set into a completely trained data conversion model, outputting a risk prediction result of the first modality data set, inputting the risk prediction result and the second modality data set into a completely trained gastric cancer prediction model, and outputting the gastric cancer prediction result; wherein the data conversion model and the gastric cancer prediction model are processed in different modalities.
In some of these embodiments, after acquiring the gastroscopic image and the index information, the method further comprises:
when the first modality data set is detected to be missing, acquiring a complementary value corresponding to the first modality data set, inputting the complementary value and the second modality data set into the gastric cancer prediction model, and outputting the gastric cancer prediction result.
In a third aspect, an embodiment of the present application provides a gastric cancer screening device, which is applied to the gastric cancer screening system of the first aspect, and the device includes: an acquisition module and a result module;
the acquisition module is used for acquiring a first modal data set and a second modal data set of the terminal equipment;
the result module is used for inputting the first modality data set into a completely-trained data conversion model, outputting a risk prediction result of the first modality data set, inputting the risk prediction result and the second modality data set into a completely-trained gastric cancer prediction model, and outputting a gastric cancer prediction result; wherein the data conversion model and the gastric cancer prediction model are processed in different modalities.
In a fourth aspect, embodiments of the present application provide an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the operation of the gastric cancer screening system as described in the first aspect above.
In a fifth aspect, embodiments of the present application provide a storage medium having a computer program stored thereon, which, when executed by a processor, implements the operation of the gastric cancer screening system as described in the first aspect above.
Compared with the related art, embodiments of the present application provide a gastric cancer screening system, method, device, electronic device and storage medium. The system includes terminal equipment and main control equipment; the main control equipment comprises an input unit and a first prediction unit; the terminal equipment is used for acquiring a first modality data set and a second modality data set and sending the first modality data set and the second modality data set to the main control equipment; the input unit is used for inputting the first modality data set into a completely trained data conversion model and outputting a risk prediction result of the first modality data set; the first prediction unit is used for inputting the risk prediction result and the second modality data set into a completely trained gastric cancer prediction model and outputting the gastric cancer prediction result. Because the data conversion model and the gastric cancer prediction model process different modalities, the problems of low accuracy and low efficiency of gastric cancer screening are solved.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof, as used in this application, are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. References to "connected," "coupled," and the like in this application are not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. Reference herein to "a plurality" means greater than or equal to two. "And/or" describes an association relationship of associated objects, meaning that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering of the objects.
Fig. 1 is a diagram of an application scenario of a gastric cancer screening system according to an embodiment of the present application, and the system provided in this embodiment may be applied to the application scenario shown in Fig. 1. The terminal device 102 communicates with the server device 104 over a network. The server device 104 acquires the first modality data set and the second modality data set through the terminal device 102; the server device 104 inputs the first modality data set to a data conversion model to output a risk prediction result, inputs the risk prediction result and the second modality data set to a gastric cancer prediction model, and finally outputs the gastric cancer prediction result; the data conversion model and the gastric cancer prediction model process different modalities. The terminal device 102 may be, but is not limited to, a smart phone, personal computer, notebook computer, or tablet computer, and the server device 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.
The present embodiment provides a gastric cancer screening system. Fig. 2 is a block diagram of a gastric cancer screening system according to an embodiment of the present application. As shown in Fig. 2, the system includes a terminal device 102 and a main control device 20; the main control device 20 includes an input unit 21 and a first prediction unit 22. The input unit 21 is connected to the first prediction unit 22 and the terminal device 102, respectively.
The terminal device 102 is configured to obtain a first modality data set and a second modality data set, and send the first modality data set and the second modality data set to the main control device 20. Specifically, the terminal device 102 may obtain the gastric cancer detection information of the user from information management systems such as a Picture Archiving and Communication System (PACS), a Laboratory Information Management System (LIS), and a Hospital Information System (HIS), where the gastric cancer detection information of the user includes the first modality data set and the second modality data set. The information management systems may be deployed on the terminal device 102, or may be deployed on other server devices to which the terminal device 102 is connected, so that the terminal device 102 can obtain the first modality data set and the second modality data set from the information management systems.
It should be noted that the data sources and forms of the first modality data set and the second modality data set are different. The data in the first modality data set is at least two-dimensional; for example, the first modality data set may include image data derived from the PACS. The data in the second modality data set is one-dimensional; for example, the second modality data set may include audio, text, and numerical data from the HIS and LIS. After the terminal device 102 acquires the first modality data set and the second modality data set, the terminal device 102 may directly send the first modality data set and the second modality data set to the main control device 20; alternatively, the terminal device 102 may generate a sending instruction when detecting an operation such as a click or touch by the administrator, and send the first modality data set and the second modality data set to the main control device 20 based on the sending instruction; alternatively, the main control device 20 may read the first modality data set and the second modality data set from the terminal device 102 upon receiving a sending instruction generated by the terminal device 102; alternatively, the main control device 20 may poll and read the first modality data set and the second modality data set at regular intervals, which is not described herein again. The main control device 20 may be a personal computer, a laptop, a server device, or another device for processing data models.
The input unit 21 is configured to input the first modality data set to a completely trained data conversion model, and output a risk prediction result of the first modality data set. The data conversion model is a model dedicated to converting each piece of two-dimensional data in the first modality data set into one-dimensional data, and can be a neural network model such as a GoogLeNet model, a VGG-19 model, a ResNet model or a DenseNet model; preferably, the present embodiment adopts a DenseNet model as the data conversion model, which can improve the efficiency and accuracy of the model. Before the first modality data set is input to the data conversion model, preprocessing such as edge removal, cropping and noise reduction may be performed on the original images in the first modality data set, and data augmentation may be performed on the preprocessed image data, so as to improve the processing efficiency of the data conversion model. The processed first modality data set is then input into the data conversion model, which converts each piece of image data of the first modality data set into a risk prediction result; for example, the data conversion model is a classification network model that classifies the first modality data set to obtain a classification result for each piece of image data, and the risk prediction result is then obtained by comprehensive analysis of the classification results; the risk prediction result may be represented by a value in the range of 0 to 1, wherein a risk prediction result of 0 indicates that the converted risk probability is extremely high.
The first prediction unit 22 is configured to input the risk prediction result and the second modality data set into a completely trained gastric cancer prediction model, and output a gastric cancer prediction result; wherein the data conversion model and the gastric cancer prediction model process different modalities. The gastric cancer prediction model is a model dedicated to processing one-dimensional data to predict a probability result, and can be an XGBoost model, a Lightgbm model or another decision tree algorithm model, or can be another neural network model for predicting a probability result; preferably, the present embodiment employs a Lightgbm model as the gastric cancer prediction model. Before prediction by the gastric cancer prediction model, data such as text and numerical values in the second modality data set may be subjected to preprocessing such as encoding and normalization, thereby improving the processing efficiency of the gastric cancer prediction model. Then, in the case that the terminal device 102 has acquired the first modality data set, the risk prediction result is combined, as an additional input item, with the features extracted from the processed second modality data set and input to the gastric cancer prediction model, and the gastric cancer prediction model performs data analysis processing on the risk prediction result and the second modality data set to obtain the gastric cancer prediction result.
For example, the second modality data set may be input into the gastric cancer prediction model; in combination with the HIS electronic medical record, the gastric cancer prediction model may perform binarization processing on semantic information in the second modality data set, such as sex, age, Helicobacter pylori (Hp) infection, residence in a high-incidence gastric cancer area, previous gastric cancer disease, family history, and dietary habits. For example, if the medical record describes a family history of gastric cancer, the item is represented by 1, otherwise by 0. The gastric cancer prediction model then processes the risk prediction result and the binarized data, and finally outputs the gastric cancer prediction result.
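The binarization and combination described above can be sketched as follows. This is only an illustrative sketch: the field names, the encoding of sex, and the decision to keep age as a number are assumptions made for the example and are not specified in the present application.

```python
# Hypothetical field names; the application lists the items but defines no schema.
BINARY_FIELDS = ["hp_infection", "high_incidence_area", "family_history"]

def build_features(risk_value, record):
    """Combine the image risk value with encoded one-dimensional items."""
    features = [risk_value]                                 # output of the data conversion model
    features.append(1 if record["sex"] == "male" else 0)    # assumed encoding of sex
    features.append(float(record["age"]))                   # assumed: age kept numeric
    for name in BINARY_FIELDS:
        features.append(1 if record[name] else 0)           # 1 if present, otherwise 0
    return features

record = {
    "sex": "female",
    "age": 58,
    "hp_infection": True,
    "high_incidence_area": False,
    "family_history": True,
}
row = build_features(0.73, record)
```

The resulting row is the kind of one-dimensional input that a decision tree model can consume directly.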
Through the above embodiment, the main control device 20 converts the first modality data set into a gastroscope result risk value by using the data conversion model, and inputs the gastroscope result risk value and the second modality data set into the gastric cancer prediction model to obtain the gastric cancer prediction result. Because the data conversion model and the gastric cancer prediction model process different modalities, multi-modality modeling is adopted and gastric cancer screening covering comprehensive routine blood indexes is realized; for a patient with a gastroscope image, a more accurate comprehensive prediction can be made by combining the indexes and the image information. This solves the problems of low accuracy and efficiency of gastric cancer screening, and realizes a gastric cancer screening system dedicated to comprehensively predicting gastric cancer probability based on multi-modality modeling.
In some of these embodiments, the main control device 20 further comprises a second prediction unit, which is connected to the terminal device 102. The second prediction unit is configured to, when the first modality data set is detected to be missing, obtain a complementary value corresponding to the first modality data set, input the complementary value and the second modality data set to the gastric cancer prediction model, and output a gastric cancer prediction result.
First, it may be determined whether the terminal device 102 has successfully acquired the first modality data set; for example, the terminal device 102 may determine, through human-computer interaction with the user, whether the user has undergone or needs to undergo gastroscopy. If it is determined that the user has not undergone gastroscopy, the first modality data set including image data and the like cannot be generated, or the generated first modality data set is an empty set, and the second prediction unit then detects that the first modality data set to be acquired is missing. In that case, the acquired second modality data set may be input directly to the gastric cancer prediction model, and the image-type data of the first modality data set is automatically supplemented to obtain the complementary value corresponding to the first modality data set; the complementary value NaN is input to the gastric cancer prediction model as the risk prediction result, and the gastric cancer prediction model finally outputs the gastric cancer prediction result based on the complementary value NaN and the second modality data set. It is understood that, if it is determined that the terminal device 102 has successfully acquired the first modality data set, the gastric cancer prediction result may be acquired directly by using the input unit 21 and the first prediction unit 22, which is not described herein again.
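A minimal sketch of the missing-data branch is given below. The function names and the stand-in conversion function are hypothetical; only the use of NaN as the complementary value comes from the description above. Tree-boosting libraries such as Lightgbm can generally treat NaN as a missing value, which is presumably why NaN is a workable placeholder here.

```python
import math

def risk_input(first_modality_images, convert):
    """Return the value fed to the prediction model as the image risk item."""
    if not first_modality_images:      # data set is missing (None) or an empty set
        return math.nan                # complementary value used in place of the risk result
    return convert(first_modality_images)

def fake_convert(images):
    """Stand-in for the trained data conversion model (illustration only)."""
    return 0.42

no_gastroscopy = risk_input([], fake_convert)
with_gastroscopy = risk_input(["image_001"], fake_convert)
```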
In some embodiments, the terminal device 102 is further configured to obtain a first preset data set and a second preset data set, and send the first preset data set and the second preset data set to the main control device 20; the first preset data set comprises historical image data acquired in a historical time period or template image data preset by an administrator; the second preset data set comprises data such as historical text and numerical values acquired in a historical time period, or preset template text and numerical data. The main control device 20 further includes a model training unit; the model training unit is connected to the input unit 21 described above.
The model training unit is used for performing first preprocessing on the first preset data set to obtain first training data, and the first training data comprises a first training set; the model training unit inputs the first training set into a preset neural network to obtain the data conversion model. The preset neural network may be set in advance by a user. The first preset data set is subjected to first preprocessing such as frame removal, cropping and noise reduction, the processed data is used as the first training data, and the first training data is divided into a training set and a test set according to a certain proportion, for example 8:2, so that the first training set is obtained. According to the first training set, the data conversion model can be trained by using a lightweight preset neural network; for example, a gastroscopic image in the first training set can be input to the preset neural network as the input x, and a label indicating whether gastric cancer is present, encoded as 0 or 1, can be used as the output y, so that the data conversion model is obtained.
The model training unit is further configured to perform second preprocessing on the second preset data set to obtain second training data, where the second training data includes a second training set; the model training unit inputs the first preset data set into the data conversion model to obtain third training data, and the third training data comprises a third training set; the model training unit inputs the second training set and the third training set into a preset decision tree model to obtain the gastric cancer prediction model. The preset decision tree model may be preset by a user. Second preprocessing such as encoding and normalization is performed on the second preset data set, the processed data is taken as the second training data, and the second training data is divided into a training set and a test set according to a certain proportion, for example 8:2, to obtain the second training set. Meanwhile, the image data of the first preset data set is input into the completely trained data conversion model, which outputs one-dimensional third training data, and the third training data is divided according to a certain proportion to obtain the third training set. Finally, the gastric cancer prediction model can be trained by using a lightweight preset decision tree model according to the second training set and the third training set.
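The 8:2 division mentioned above can be sketched as follows. The function name, the shuffling step and the fixed seed are assumptions for illustration; the application only states that the data is divided in a certain proportion.

```python
import random

def split_8_2(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and divide them into a training set and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)       # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_ratio)      # 8:2 when train_ratio is 0.8
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = split_8_2(range(10))
```

If the second and third training data describe the same patients, splitting both with the same seed and the same sample order would keep the two training sets aligned; this alignment is an assumption of the sketch rather than a stated requirement.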
In some of these embodiments, the main control device 20 further includes a sorting and screening unit; the sorting and screening unit is connected to the model training unit. The sorting and screening unit is used for acquiring the index characteristics of the second training set, acquiring a characteristic importance curve according to the index characteristics and acquiring a sorting and screening result according to the characteristic importance curve; the model training unit is further configured to input the second training set and the third training set into the preset decision tree model, and train the preset decision tree model using the sorting and screening result to obtain the gastric cancer prediction model. In the process of training the gastric cancer prediction model, various index features of each sample in the second training set, such as gastroscopy risk and Hp antibody status (negative or positive), are extracted. Then, the sorting and screening unit draws a characteristic importance curve according to the index characteristics; for example, the function code example is:
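from lightgbm import plot_importance  # assumes the Lightgbm model preferred in this embodiment
plot_importance(model, max_num_features=20)  # plot the 20 most important features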
Using the above example code, the characteristic importance curve of the 20 most important features can be obtained. In the present embodiment, the system can be configured to obtain the top 10 index characteristics, including gastroscope picture risk, Hp antibody status (negative or positive), pepsinogen ratio (abbreviated as PGR), carcinoembryonic antigen (CEA), carbohydrate antigens (CA19-9 and CA72-4), gastrin 17 (abbreviated as G-17), patient gender, and age. Finally, the acquired important index characteristics are taken as the sorting and screening result, and the training of the gastric cancer prediction model is completed.
Through the above embodiment, the indexes are sorted by importance, the detection indexes with high importance are retained, and unnecessary indexes are screened out from hundreds of indexes, so that a high-performance gastric cancer prediction model is trained and the accuracy and efficiency of gastric cancer screening are further improved.
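The sorting and screening step can be illustrated with a small sketch. The importance scores and feature names below are invented for the example; in practice the scores would come from the trained decision tree model.

```python
def top_k_features(importances, k=10):
    """Rank index features by importance and keep the k most important ones."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical importance scores, for illustration only.
importances = {
    "gastroscope_risk": 120,
    "hp_antibody": 95,
    "pgr": 80,
    "cea": 40,
    "unused_index": 1,
}
selected = top_k_features(importances, k=3)
```

The selected names would then be used to restrict the columns of the training sets before the decision tree model is trained again.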
In some of these embodiments, the first training data further comprises a first test set, and the second training data further comprises a second test set; the first training data may be divided into the first training set and the first test set according to a certain proportion, and the second training data may be divided into the second training set and the second test set according to a certain proportion. The main control device 20 further includes an inspection unit; the inspection unit is used for acquiring a first detection result of the data conversion model by using the first test set according to a preset accuracy index, a preset sensitivity index and a preset specificity index, and acquiring a second detection result of the gastric cancer prediction model by using the second test set. For example, the first test set may be input to the data conversion model, the data conversion model is evaluated according to the accuracy index, the sensitivity index and the specificity index, and the first detection result is output; similarly, the second test set is input to the gastric cancer prediction model, the gastric cancer prediction model is evaluated according to the accuracy index, the sensitivity index and the specificity index, and the second detection result is output.
Through the above embodiment, the missed detection rate and the false detection rate of the gastric cancer prediction model are evaluated by using the three preset indexes of accuracy, sensitivity and specificity, so that a method for evaluating the gastric cancer risk prediction effect of the model is realized, and the accuracy of the gastric cancer screening system is effectively improved.
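The three indexes can be computed from binary labels and predictions as in the following sketch, which uses the standard definitions: sensitivity is the fraction of true cancer cases detected (one minus the missed detection rate), and specificity is the fraction of non-cancer cases correctly rejected (one minus the false detection rate). The function name and the sample labels are illustrative, and the sketch assumes that both classes occur in the test set.

```python
def screening_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for labels encoded as 0 or 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # cancer correctly detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # non-cancer correctly rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false detection
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # missed detection
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
accuracy, sensitivity, specificity = screening_metrics(y_true, y_pred)
```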
The embodiment also provides a gastric cancer screening method. Fig. 3 is a flowchart of a gastric cancer screening method according to an embodiment of the present application, as shown in fig. 3, the flowchart includes the following steps:
Step S310, a first modality data set and a second modality data set of the terminal device are obtained.
Step S320, inputting the first modality data set into a completely trained data conversion model, outputting a risk prediction result of the first modality data set, inputting the risk prediction result and the second modality data set into a completely trained gastric cancer prediction model, and outputting the gastric cancer prediction result; wherein the data conversion model and the gastric cancer prediction model are different in treatment modality.
Through the steps S310 to S320, the data conversion model converts the first modality data set into a gastroscope result risk value, and the gastroscope result risk value and the second modality data set are input into the gastric cancer prediction model to obtain the gastric cancer prediction result. Because the data conversion model and the gastric cancer prediction model process different modalities, multi-modality modeling is adopted and gastric cancer screening covering comprehensive routine blood indexes is realized; for a patient with a gastroscope image, a more accurate comprehensive prediction can be made by combining the indexes and the image information. This solves the problems of low accuracy and efficiency of gastric cancer screening, and realizes a gastric cancer screening method dedicated to comprehensively predicting gastric cancer probability based on multi-modality modeling.
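The two-stage flow of steps S310 to S320 can be sketched as follows. Both models are replaced by hypothetical placeholder functions so that the example runs on its own; they do not represent the DenseNet or Lightgbm models of the embodiments.

```python
def screen(first_modality, second_modality, convert_model, predict_model):
    """Chain the two models: image data -> risk value -> final prediction."""
    # Stage 1: the data conversion model maps the image data to a risk value.
    risk = convert_model(first_modality)
    # Stage 2: the risk value joins the one-dimensional features as model input.
    return predict_model([risk] + list(second_modality))

# Placeholder models (hypothetical, for illustration only).
def convert_model(images):
    return 0.8

def predict_model(features):
    return sum(features) / len(features)

result = screen(["image_001"], [1, 0, 1], convert_model, predict_model)
```

Passing the models in as parameters keeps the flow independent of any particular network or decision tree implementation.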
In some embodiments, after acquiring the gastroscopic image and the index information, the method further comprises the steps of: and under the condition that the first modal data set is detected to be missing, acquiring a complementary value corresponding to the first modal data set, inputting the complementary value and the second modal data set into the gastric cancer prediction model, and outputting to obtain the gastric cancer prediction result.
In some embodiments, the training of the data conversion model and the gastric cancer prediction model comprises the following steps: receiving a first preset data set and a second preset data set acquired by terminal equipment; performing first preprocessing on the first preset data set to obtain first training data, wherein the first training data comprises a first training set; inputting the first training set into a preset neural network to obtain the data conversion model; performing second preprocessing on the second preset data set to obtain second training data, wherein the second training data comprises a second training set; inputting the first preset data set into the data conversion model to obtain third training data, wherein the third training data comprises a third training set; and inputting the second training set and the third training set into a preset decision tree model to obtain the gastric cancer prediction model.
In some embodiments, inputting the second training set and the third training set into the preset decision tree model to obtain the gastric cancer prediction model comprises the following steps: acquiring the index features of the second training set, acquiring a feature importance curve according to the index features, and acquiring a sorting and screening result according to the feature importance curve; and inputting the second training set and the third training set into the preset decision tree model, and training the preset decision tree model by using the sorting and screening result to obtain the gastric cancer prediction model.
In some embodiments, the first training data further comprises a first test set, and the second training data further comprises a second test set; the gastric cancer screening method further comprises the following step: according to a preset accuracy index, sensitivity index and specificity index, acquiring a first detection result of the data conversion model by using the first test set, and acquiring a second detection result of the gastric cancer prediction model by using the second test set.
In some embodiments, the data conversion model is a DenseNet model and the gastric cancer prediction model is a LightGBM model.
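The sorting and screening step can be sketched as follows. The source does not say how the feature importance curve is turned into a screening result, so the cumulative-share rule below, the `keep_share` parameter, and the function name `screen_features` are assumptions made for illustration; the importance scores would in practice come from the trained decision tree model.

```python
from typing import Dict, List


def screen_features(importances: Dict[str, float],
                    keep_share: float = 0.9) -> List[str]:
    """Rank features by importance and keep the top ones covering `keep_share` of the total."""
    # Sort from most to least important; this ordering is the "importance curve".
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    total = sum(score for _, score in ranked)
    if total <= 0:
        # No usable importance signal: keep every feature.
        return [name for name, _ in ranked]
    kept: List[str] = []
    running = 0.0
    for name, score in ranked:
        kept.append(name)
        running += score
        if running / total >= keep_share:
            break
    return kept
```

The retained feature names would then select which index columns of the second training set are passed to the preset decision tree model.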
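The three evaluation indexes named above follow directly from the confusion matrix of a binary classifier. The sketch below assumes labels coded as 1 (gastric cancer) and 0 (no gastric cancer); the function name `evaluate` is hypothetical.

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Specificity: share of actual negatives that were correctly cleared.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```

The same function can be applied to the first test set for the data conversion model and to the second test set for the gastric cancer prediction model.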
The embodiments of the present application are described in detail below with reference to a practical application scenario. Fig. 4 is a flowchart of a gastric cancer screening method according to a preferred embodiment of the present application. As shown in Fig. 4, the flowchart includes the following steps:
Step S401, starting the gastric cancer screening process and determining whether the user has a gastroscope image.
Step S402, if the determination result in step S401 is yes, acquiring the gastroscope image from the terminal device, inputting the gastroscope image into a DenseNet model to obtain a risk prediction result corresponding to the gastroscope image, and then executing step S405; if the determination result in step S401 is no, directly executing step S405.
Step S403, acquiring first index information derived from the hospital information system (HIS), the first index information including sex, age, Hp infection, residence in a high-incidence area of gastric cancer, previous precancerous gastric disease, family history of gastric cancer, and other risk factors of gastric cancer.
Step S404, acquiring second index information derived from the laboratory information system (LIS), the second index information including Hp antibody, PGR, G-17 test results, and the like.
Step S405, inputting the risk prediction result, the first index information, and the second index information into a LightGBM model; alternatively, inputting the first index information and the second index information directly into the LightGBM model. The first index information and the second index information constitute the second modality data set.
Step S406, outputting a gastric cancer prediction result and judging whether a medium or high risk exists; if yes, pathological diagnosis is carried out; if not, the gastric cancer screening process is ended. A gastric cancer risk greater than or equal to a threshold value preset by an administrator indicates a high risk.
Compared with the screening-scale-based early gastric cancer screening methods in the related art, steps S401 to S406 cover more routine blood indexes and, in combination with the HIS electronic medical record, predict gastric cancer probability from semantic information about the patient such as sex, age, Hp infection, residence in a high-incidence area of gastric cancer, previous precancerous gastric disease, family history, and dietary habits. By adopting multi-modal modeling, a more accurate comprehensive prediction can be made for a patient who has a gastroscope image by combining the indexes with the image information, so that the problem of low accuracy and efficiency of gastric cancer screening is solved.
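Steps S401 to S406 can be sketched as a single function. This is an illustrative outline only: `run_screening`, `image_model`, `tabular_model`, and `fill_value` are hypothetical names, the two models are passed in as plain callables rather than real DenseNet or LightGBM objects, and the sketch uses the complementary-value variant when no gastroscope image exists.

```python
from typing import Callable, Dict, List, Optional, Sequence


def run_screening(
    gastroscope_image: Optional[object],
    his_indices: Sequence[float],
    lis_indices: Sequence[float],
    image_model: Callable[[object], float],
    tabular_model: Callable[[List[float]], float],
    threshold: float,
    fill_value: float = -1.0,  # hypothetical complementary value
) -> Dict[str, object]:
    # S401/S402: score the gastroscope image only if one is available.
    if gastroscope_image is not None:
        risk = image_model(gastroscope_image)
    else:
        risk = fill_value
    # S403/S404: the HIS and LIS indexes form the second modality data set.
    features = [risk, *his_indices, *lis_indices]
    # S405: the tabular model combines the risk value with the indexes.
    probability = tabular_model(features)
    # S406: compare against the administrator-defined threshold.
    return {
        "features": features,
        "probability": probability,
        "refer_to_pathology": probability >= threshold,
    }
```

A caller would refer the patient for pathological diagnosis when `refer_to_pathology` is true and end the screening process otherwise.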
It should be noted that the steps illustrated in the above-described flows or in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from the one shown here. For example, step S402, step S403, and step S404 in Fig. 4 may be executed sequentially or simultaneously.
The present embodiment also provides a gastric cancer screening device, which is used to implement the above embodiments and preferred embodiments; what has already been described is not repeated. As used hereinafter, the terms "module," "unit," "subunit," and the like may refer to a combination of software and/or hardware that implements a predetermined function. Although the device described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
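As one way to run steps S402, S403, and S404 simultaneously, the three independent acquisitions can be submitted to a thread pool. The sketch below uses Python's standard `concurrent.futures` module; `gather_inputs` and its three callable parameters are hypothetical stand-ins for image scoring, the HIS query, and the LIS query.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def gather_inputs(
    score_image: Callable[[], float],
    fetch_his: Callable[[], List[float]],
    fetch_lis: Callable[[], List[float]],
) -> Tuple[float, List[float], List[float]]:
    """Run the three independent acquisition steps concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        risk_future = pool.submit(score_image)  # S402
        his_future = pool.submit(fetch_his)     # S403
        lis_future = pool.submit(fetch_lis)     # S404
        # result() blocks until each task finishes; S405 can start afterwards.
        return risk_future.result(), his_future.result(), lis_future.result()
```

Because none of the three steps depends on the output of another, the outcome is the same as in sequential execution; only the waiting time changes.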
Fig. 5 is a block diagram illustrating a gastric cancer screening device according to an embodiment of the present application. As shown in Fig. 5, the device includes an acquisition module 52 and a result module 54. The acquisition module 52 is configured to obtain a first modality data set and a second modality data set of the terminal device. The result module 54 is configured to input the first modality data set to a completely trained data conversion model, output a risk prediction result of the first modality data set, input the risk prediction result and the second modality data set to a completely trained gastric cancer prediction model, and output a gastric cancer prediction result; wherein the data conversion model and the gastric cancer prediction model process different modalities.
Through this embodiment, the result module 54 converts the first modality data set into a gastroscope result risk value by using the data conversion model, and inputs the gastroscope result risk value and the second modality data set into the gastric cancer prediction model to obtain the gastric cancer prediction result. Because the data conversion model and the gastric cancer prediction model process different modalities, this multi-modal modeling enables gastric cancer screening that covers comprehensive routine blood indexes, and it allows a more accurate comprehensive prediction for a patient who has a gastroscope image by combining the indexes with the image information. This solves the problem of low accuracy and efficiency of gastric cancer screening and provides a gastric cancer screening device, based on multi-modal modeling, dedicated to comprehensively predicting gastric cancer probability.
In some embodiments, the result module 54 is further configured to, when the first modality data set is detected to be missing, obtain a complementary value corresponding to the first modality data set, input the complementary value and the second modality data set to the gastric cancer prediction model, and output the gastric cancer prediction result.
In some embodiments, the acquisition module 52 is further configured to obtain a first preset data set and a second preset data set of the terminal device; the gastric cancer screening device further comprises a training module; the training module is used for carrying out first preprocessing on the first preset data set to obtain first training data, and the first training data comprises a first training set; the training module inputs the first training set into a preset neural network to obtain the data conversion model; the training module is further configured to perform second preprocessing on the second preset data set to obtain second training data, where the second training data includes a second training set; the training module inputs the first preset data set into the data conversion model to obtain third training data, and the third training data comprises a third training set; the training module inputs the second training set and the third training set into a preset decision tree model to obtain the gastric cancer prediction model.
In some embodiments, the training module is further configured to obtain an index feature of the second training set, obtain a feature importance curve according to the index feature, and obtain a sorting and screening result according to the feature importance curve; the training module inputs the second training set and the third training set into the preset decision tree model, and trains the preset decision tree model by using the sorting and screening result to obtain the gastric cancer prediction model.
In some embodiments, the first training data further comprises a first test set, and the second training data further comprises a second test set; the training module is further used for acquiring a first detection result of the data conversion model by using the first test set according to preset accuracy indexes, sensitivity indexes and specificity indexes, and acquiring a second detection result of the gastric cancer prediction model by using the second test set.
In some embodiments, the data conversion model is a DenseNet model and the gastric cancer prediction model is a LightGBM model.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
In some embodiments, a computer device is provided, which may be a server. Fig. 6 is a structural diagram of the inside of a computer device according to an embodiment of the present application. As shown in Fig. 6, the computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store gastric cancer prediction results. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement the gastric cancer screening method described above.
Those skilled in the art will appreciate that the architecture shown in Fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computing devices to which the disclosed aspects apply; a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
The present embodiment also provides an electronic device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring a first modality data set and a second modality data set of the terminal device.
S2, inputting the first modality data set into a completely trained data conversion model, outputting a risk prediction result of the first modality data set, inputting the risk prediction result and the second modality data set into a completely trained gastric cancer prediction model, and outputting a gastric cancer prediction result; wherein the data conversion model and the gastric cancer prediction model process different modalities.
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In addition, in combination with the gastric cancer screening method in the above embodiments, the embodiments of the present application may be implemented by providing a storage medium. The storage medium has a computer program stored thereon; the computer program, when executed by a processor, implements any of the gastric cancer screening methods of the above embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
It should be understood by those skilled in the art that various features of the above-described embodiments can be combined in any combination, and for the sake of brevity, all possible combinations of features in the above-described embodiments are not described in detail, but rather, all combinations of features which are not inconsistent with each other should be construed as being within the scope of the present disclosure.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.