Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a medical recommendation method and system based on multi-modal cardiovascular disease information. Multi-modal data are introduced into a medical recommendation system for cardiovascular disease information, so that a more accurate model of the user's physical condition and needs can be constructed, improving the recommendation performance and the personalization of the system. A recommendation algorithm from the deep learning field is combined with traditional recommendation algorithms, exploiting the advantages of the deep model in feature selection and model construction, so that the recommendation capability of the medical recommendation system is improved, accurate medical information matching individual needs is provided, users can seek medical treatment in a more targeted way, and the allocation of medical resources is optimized.
The first purpose of the invention is to provide a medical recommendation method based on multi-modal cardiovascular disease information.
The second purpose of the invention is to provide a medical recommendation system based on multi-modal cardiovascular disease information.
It is a third object of the invention to provide a computer apparatus.
It is a fourth object of the present invention to provide a storage medium.
The first purpose of the invention can be achieved by adopting the following technical scheme:
a method of medical recommendation based on multimodal cardiovascular disease information, the method comprising:
acquiring clinical text modal data of the cardiovascular disease as first modal data, acquiring electrocardiogram signal data as second modal data, and preprocessing the first modal data and the second modal data;
constructing a personalized recommendation model based on a plurality of recommendation algorithms according to the preprocessed data;
and acquiring a recommendation result set based on the personalized recommendation model, and recommending the recommendation result to the target user.
Further, the preprocessing the first modality data and the second modality data specifically includes:
identifying and extracting the acquired clinical text modal data of the cardiovascular disease, removing irrelevant information, and performing structured processing;
and performing deterministic learning dynamic modeling on the acquired electrocardiogram signal data to obtain electrocardiogram dynamics characteristic data, normalizing the electrocardiogram dynamics characteristic data, performing a feature operation to obtain quantitative index data, and structuring the quantitative index data.
Further, the constructing of the personalized recommendation model based on a plurality of recommendation algorithms according to the preprocessed data specifically includes:
performing feature engineering on the preprocessed data to obtain a structured data set;
randomly extracting data from the structured data set, and dividing the data into a training set and a validation set;
and combining a plurality of recommendation algorithms, applied to the structured data set, the training set and the validation set, to form the personalized recommendation model.
Further, the performing of feature engineering on the preprocessed data to obtain a structured data set specifically includes:
selecting data features from the preprocessed data, and handling duplicate values, abnormal values and missing values in the data features;
bucketing or truncating some of the numerical features in the processed data;
and standardizing the data after bucketing or truncation, so that the data are scaled proportionally and converted into dimensionless values, to obtain the structured data set.
Further, missing values are handled by one of: filling, treating the missing value as information in its own right, or directly ignoring the missing value, wherein the filling uses the mean, mode or median of the data feature; duplicate values and abnormal values are handled by direct deletion.
Further, the recommendation algorithms comprise a collaborative filtering algorithm, a content-based filtering algorithm and a deep factorization machine (DeepFM) algorithm;
the combining of the plurality of recommendation algorithms, applied to the structured data set, the training set and the validation set, to form the personalized recommendation model specifically includes:
using the collaborative filtering algorithm, calculating the interest similarity between users from the structured data set, and measuring each user's degree of interest in the recommended items, to obtain a user similarity matrix model;
using the content-based filtering algorithm, analyzing the content of the user's past preference information, searching the structured data set for similar information, computing a similarity measure, generating a ranked recommendation list from the items whose similarity measure exceeds a preset threshold, and returning the list to the user; and screening and filtering the contents of the ranked recommendation list according to real-time user feedback, to obtain a content filtering model;
using the deep factorization machine algorithm, inputting relevant discrete features selected from the structured data set, and training with the training set to obtain an initial deep factorization machine model;
evaluating the performance of the initial deep factorization machine model on the validation set, and stopping training once the training error meets a set threshold, to obtain a target deep factorization machine model;
and fusing the user similarity matrix model, the content filtering model and the target deep factorization machine model by linear weighted fusion, and then synchronizing the result to obtain the personalized recommendation model.
Further, the acquiring a recommendation result set based on the personalized recommendation model and recommending the recommendation result to the user specifically includes:
constructing an index for medical information data in a database;
analyzing and calculating input information of a target user according to the personalized recommendation model, and obtaining a recommendation result set after combining index analysis;
and ranking the obtained recommendation result set according to a preset weight rule to obtain the top N recommendation results, and recommending the top N recommendation results to the target user in real time, thereby completing one recommendation process.
The second purpose of the invention can be achieved by adopting the following technical scheme:
a medical recommendation system based on multimodal cardiovascular disease information, the system comprising:
the data acquisition module is used for acquiring clinical text modal data of the cardiovascular disease as first modal data, acquiring electrocardiogram signal data as second modal data, and preprocessing the first modal data and the second modal data;
the model construction module is used for constructing a personalized recommendation model based on a plurality of recommendation algorithms according to the preprocessed data;
and the recommendation engine module is used for acquiring a recommendation result set based on the personalized recommendation model and recommending the recommendation result to the target user.
The third purpose of the invention can be achieved by adopting the following technical scheme:
a computer device comprising a processor and a memory for storing a program executable by the processor, wherein the processor implements the medical recommendation method when executing the program stored in the memory.
The fourth purpose of the invention can be achieved by adopting the following technical scheme:
a storage medium stores a program that, when executed by a processor, implements the medical recommendation method described above.
Compared with the prior art, the invention has the following beneficial effects:
1. The method and the system use multi-modal cardiovascular disease information data, combining image data processing with text data processing, to construct a personalized recommendation model in the field of cardiovascular disease. Chiefly, electrocardiogram signal data are introduced into the recommendation system and combined with the original text information, so that a recommendation model that more accurately reflects the user's physical condition and needs can be constructed.
2. The method and the system combine the deep factorization machine algorithm from the deep learning field with traditional recommendation algorithms. Compared with traditional methods, this brings in the advantages of deep models in feature selection and representation learning, and improves the recommendation capability of the medical recommendation system as the volume of medical information keeps growing, which helps optimize the allocation of medical resources.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described in detail and completely with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts based on the embodiments of the present invention belong to the protection scope of the present invention.
Example 1:
As shown in fig. 1, the present embodiment provides a medical recommendation method based on multi-modal cardiovascular disease medical information, which is mainly applied to a medical recommendation system and includes the following steps:
S101, acquiring clinical text modal data of the cardiovascular disease as first modal data, acquiring electrocardiogram signal data as second modal data, and preprocessing the first modal data and the second modal data.
S1011, acquiring clinical text modal data of the cardiovascular disease as the first modal data, specifically: medical information data related to various cardiovascular diseases are acquired from medical websites on the Internet, including disease keyword information from Chinese medical dictionaries and medical academic articles, Chinese electronic medical records, and information on the relevant hospitals, departments and doctors; personal information and symptom information are acquired from users; and the users' symptom descriptions, the disease feature information and the hospital and doctor text information together form the clinical text modal data of the cardiovascular disease, which serve as the first modal data.
S1012, acquiring electrocardiogram signal data as second modality data, specifically: the method comprises the steps of obtaining electrocardiogram signal data uploaded by a user, or obtaining the electrocardiogram signal data of the user based on intelligent wearable equipment, or obtaining the electrocardiogram signal data of the user based on a portable electrocardiogram machine, wherein the electrocardiogram signal data are time sequence signal data which are image signal modal data, and the electrocardiogram signal data are used as second modal data.
S1013, preprocessing the first modality data and the second modality data, specifically including:
1) The acquired clinical text modal data of the cardiovascular disease are identified and extracted, irrelevant information is removed so that only the main information remains, and the main information is then structured and stored in a database.
2) Deterministic learning dynamic modeling is performed on the acquired electrocardiogram (ECG) signal data to obtain electrocardiogram dynamics characteristic data (CDG); the electrocardiogram dynamics characteristic data are normalized, a feature operation is performed to obtain quantitative index data, and the quantitative index data are structured and stored in the database.
Specifically, deterministic learning dynamic modeling of the electrocardiogram (ECG) signal data means converting the electrocardiogram data into electrocardiogram dynamics data by means of a clinically validated auxiliary detection method for myocardial ischemia based on deterministic learning theory. Deterministic learning theory achieves locally accurate identification of the electrocardiogram data through an RBF neural network; the constant weight values of the RBF neural network are stored, a model is built, and three-dimensional visualization is performed to obtain the electrocardiogram dynamics characteristic data (CDG).
Specifically, the feature operation extracts the spatially discrete quantitative features of the electrocardiogram dynamics characteristic data by an electrocardiogram dynamics quantitative analysis method, and then processes these features by the geometric mean method to obtain the quantitative index data.
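The normalization and geometric-mean steps can be illustrated with a minimal sketch. The text does not specify the normalization scheme or which quantitative features enter the geometric mean, so min-max scaling, the function names and the sample values below are assumptions made purely for illustration, and the two helpers are shown independently of each other.

```python
import math

def min_max_normalize(values):
    """Scale values linearly into [0, 1] (an assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical quantitative features extracted from one set of CDG data.
cdg_features = [2.0, 8.0, 4.0]
normalized = min_max_normalize(cdg_features)
quantitative_index = geometric_mean(cdg_features)
```

Computing the geometric mean in log space avoids overflow when many features are multiplied together.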
And S102, combining and constructing a personalized recommendation model based on a plurality of recommendation algorithms according to the preprocessed data.
Further, the step S102 includes three processes of feature engineering processing, data partitioning and model training, which are specifically described as follows:
S1021, feature engineering: feature engineering is performed on the preprocessed data to obtain a structured data set.
Data features are selected from the preprocessed data, and duplicate values, abnormal values and missing values in the data features are handled. Missing values are handled by one of: filling, treating the missing value as information in its own right, or directly ignoring it, where filling uses the mean, mode or median of the data feature; duplicate values and abnormal values are deleted directly. Some of the numerical features in the processed data are then bucketed or truncated, that is, the features are discretized into sparse features, which speeds up the inner-product computations of the algorithm. Finally, the bucketed or truncated data are standardized, scaling them proportionally and converting them into dimensionless values, which helps the later model training converge faster; the result is the structured data set.
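The feature engineering steps above can be sketched as follows, using only the Python standard library. The field names, the plausible heart-rate range, the bucket edges and the choice of median filling and z-score standardization are all assumptions for illustration; the text leaves these choices open.

```python
import bisect
import statistics

def deduplicate(rows):
    """Drop exact duplicate records, keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def drop_outliers(rows, column, low, high):
    """Delete records whose value lies outside [low, high]; keep missing ones."""
    return [r for r in rows if r[column] is None or low <= r[column] <= high]

def fill_missing(rows, column):
    """Replace None in `column` with the median of the observed values."""
    median = statistics.median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: median if r[column] is None else r[column]} for r in rows]

def bucketize(value, edges):
    """Index of the bucket that `value` falls into, given sorted edges."""
    return bisect.bisect_right(edges, value)

def standardize(values):
    """Z-score scaling: zero mean, unit (population) standard deviation."""
    mean, std = statistics.fmean(values), statistics.pstdev(values)
    return [0.0 if std == 0 else (v - mean) / std for v in values]

records = [
    {"age": 54, "heart_rate": 72},
    {"age": 54, "heart_rate": 72},    # exact duplicate, removed
    {"age": 61, "heart_rate": None},  # missing value, filled with the median
    {"age": 47, "heart_rate": 400},   # implausible value, removed as abnormal
    {"age": 70, "heart_rate": 88},
]
rows = deduplicate(records)
rows = drop_outliers(rows, "heart_rate", low=30, high=220)
rows = fill_missing(rows, "heart_rate")
age_buckets = [bucketize(r["age"], edges=[50, 60, 70]) for r in rows]
heart_rate_z = standardize([r["heart_rate"] for r in rows])
```

Abnormal values are removed before filling so that they do not distort the median used for the missing entries.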
S1022, data division: data is randomly drawn from the structured data set and divided into a training set and a validation set.
Data is randomly extracted from the structured data set obtained in step S1021 and divided into a training set and a validation set, the former for model training and the latter for model validation.
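A minimal sketch of such a random split is shown below. The 80/20 ratio, the fixed seed and the function name `split_dataset` are assumptions; the text does not state a split ratio.

```python
import random

def split_dataset(samples, validation_fraction=0.2, seed=0):
    """Shuffle the samples and split them into (training, validation) lists."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must lie strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

training_set, validation_set = split_dataset(range(10), validation_fraction=0.2, seed=42)
```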
S1023, model training: a plurality of recommendation algorithms, applied to the structured data set, the training set and the validation set, are combined to form the personalized recommendation model.
The personalized recommendation model is formed mainly by combining three recommendation algorithms: a collaborative filtering algorithm, a content-based filtering algorithm and a deep factorization machine algorithm (DeepFM algorithm). Model training is performed with each of the three algorithms in combination with the training set and the validation set from step S1022, and the resulting personalized recommendation model is stored.
1) Using the collaborative filtering algorithm, the interest similarity between users is calculated from the structured data set, and each user's degree of interest in the recommended items is measured, to obtain a user similarity matrix model.
Specifically, the interest similarity between users is calculated from the structured data set using the following Jaccard formula:
where A and B are two different users in the user set, N(A) and N(B) denote the sets of items in which users A and B are interested, and $W_{AB}$ is the interest similarity of user A and user B.
The user's degree of interest in a recommended item is then measured using the following formula:
where S(A, K) contains the K users whose interests are closest to those of user A, N(I) is the set of users who have acted on the recommended item I, $W_{AB}$ is the interest similarity of user A and user B, and $r_{BI}$ is user B's interest in item I. After the calculation, the user similarity matrix model is obtained and stored.
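The two formulas themselves do not appear in the text. Assuming the standard user-based collaborative filtering forms that match the stated symbols, the Jaccard similarity would be $W_{AB} = |N(A) \cap N(B)| / |N(A) \cup N(B)|$ and the interest score would be $P(A, I) = \sum_{B \in S(A,K) \cap N(I)} W_{AB}\, r_{BI}$, where the symbol $P(A, I)$ is introduced here only for illustration. A minimal Python sketch under those assumptions, with hypothetical user and item names:

```python
def jaccard_similarity(items_a, items_b):
    """W_AB = |N(A) & N(B)| / |N(A) | N(B)| for two sets of items."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0

def predict_interest(target, item, user_items, ratings, k):
    """Sum W_AB * r_BI over the K nearest neighbours B who acted on the item."""
    similarities = {
        other: jaccard_similarity(user_items[target], items)
        for other, items in user_items.items()
        if other != target
    }
    # S(A, K): the K users most similar to the target user.
    neighbours = sorted(similarities, key=similarities.get, reverse=True)[:k]
    return sum(
        similarities[b] * ratings[b][item]
        for b in neighbours
        if item in user_items[b]  # restrict to B in N(I)
    )

user_items = {
    "A": {"cardiology_dept", "dr_li"},
    "B": {"cardiology_dept", "dr_li", "dr_wang"},
    "C": {"dr_wang", "neurology_dept"},
}
# Implicit feedback: every recorded interaction counts as interest 1.0.
ratings = {user: {item: 1.0 for item in items} for user, items in user_items.items()}
score = predict_interest("A", "dr_wang", user_items, ratings, k=2)
```

Here user B shares two of three items with user A, so B's interest in `dr_wang` dominates the score, while user C shares nothing with A and contributes zero.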
2) Using the content-based filtering algorithm (whose principle and architecture are shown in fig. 2), the content of the user's past preference information is analyzed, similar information is searched for and compared in the structured data set, a similarity measure is computed, and the items whose similarity measure exceeds a preset threshold, that is, the items with high similarity, are assembled into a ranked recommendation list and returned to the user. The contents of the ranked recommendation list can be screened and filtered by the filtering module according to real-time user feedback, yielding the content filtering model. After each recommendation is completed, its relevant information is stored to provide a reference for the next recommendation.
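A minimal sketch of this thresholded, feedback-filtered ranking follows. The text does not name the similarity measure, so cosine similarity over keyword weights is an assumption, as are the function names, the threshold value and the sample items; user feedback is modelled simply as a set of rejected item names.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend_by_content(user_profile, items, threshold, rejected=frozenset()):
    """Rank items whose similarity to the profile exceeds the threshold."""
    scored = [
        (name, cosine_similarity(user_profile, features))
        for name, features in items.items()
        if name not in rejected  # filter on real-time user feedback
    ]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Profile built from the user's past preferences (hypothetical keywords).
user_profile = {"chest_pain": 1.0, "arrhythmia": 1.0}
items = {
    "cardiology_clinic": {"chest_pain": 1.0, "arrhythmia": 1.0},
    "heart_failure_article": {"chest_pain": 1.0, "edema": 1.0},
    "dermatology_clinic": {"rash": 1.0},
}
ranking = recommend_by_content(user_profile, items, threshold=0.3)
filtered = recommend_by_content(
    user_profile, items, threshold=0.3, rejected={"heart_failure_article"}
)
```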
3) Using the deep factorization machine algorithm, relevant discrete features selected from the structured data set are input, and training is performed with the training set to obtain an initial deep factorization machine model.
Denoting the output value of the deep factorization machine model by $y$, the formula is:
$y = \mathrm{sigmoid}(y_{FM} + y_{DNN})$
where sigmoid is the S-shaped function commonly used in machine learning, $y_{FM}$ is the result of the factorization machine's parameter operation, and $y_{DNN}$ is the result of the deep neural network's parameter operation.
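To make the output formula concrete, the following is a minimal forward-pass sketch in plain Python. It assumes the usual DeepFM arrangement in which the factorization machine and the deep network share the same feature embeddings. The single hidden layer, the parameter names and all numeric values are hypothetical, chosen only so the arithmetic can be followed by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fm_term(active, bias, weights, embeddings):
    """y_FM: bias + first-order weights + pairwise embedding inner products."""
    first_order = bias + sum(weights[i] for i in active)
    second_order = 0.0
    for a in range(len(active)):
        for b in range(a + 1, len(active)):
            vi, vj = embeddings[active[a]], embeddings[active[b]]
            second_order += sum(x * y for x, y in zip(vi, vj))
    return first_order + second_order

def dnn_term(active, embeddings, hidden_w, hidden_b, out_w, out_b):
    """y_DNN: one ReLU hidden layer over the concatenated embeddings."""
    x = [value for i in active for value in embeddings[i]]
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    return sum(w * h for w, h in zip(out_w, hidden)) + out_b

def deepfm_predict(active, p):
    """y = sigmoid(y_FM + y_DNN) for the indices of the active one-hot features."""
    y_fm = fm_term(active, p["bias"], p["weights"], p["embeddings"])
    y_dnn = dnn_term(active, p["embeddings"], p["hidden_w"], p["hidden_b"],
                     p["out_w"], p["out_b"])
    return sigmoid(y_fm + y_dnn)

params = {
    "bias": 0.0,
    "weights": [0.5, -0.2, 0.1],                          # first-order weights
    "embeddings": [[0.1, 0.2], [0.0, 0.3], [0.4, -0.1]],  # shared by FM and DNN
    "hidden_w": [[0.1, 0.1, 0.1, 0.1], [-0.2, 0.0, 0.2, 0.0]],
    "hidden_b": [0.0, 0.0],
    "out_w": [1.0, 1.0],
    "out_b": 0.0,
}
y = deepfm_predict([0, 2], params)  # two active discrete features
```

With these values the factorization machine term is 0.62 and the deep term is 0.12, so the output is sigmoid(0.74).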
Training is carried out on the training set by minimizing the following function, yielding the initial deep factorization machine model:
where W and b are the model parameters, $y$ is the model output value, and $y_{\text{actual}}$ is the actual value of the sample.
The performance of the initial deep factorization machine model is evaluated on the validation set, and training stops once the training error meets a set threshold, yielding the target deep factorization machine model, which outputs a list of recommended items according to the user's input information.
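The threshold-based stopping rule can be sketched as a training loop. Because the loss function is not given in the text, log loss is assumed here, and a one-feature logistic model stands in for the deep factorization machine so that the example stays short. The error monitored is the loss on the validation set, which is one reading of the description. The learning rate, threshold and data are hypothetical.

```python
import math

def predict(x, w, b):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def log_loss(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy (an assumed loss; the text does not give one)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

def train_until_threshold(train, validation, error_threshold,
                          learning_rate=0.5, max_epochs=2000):
    """Gradient descent that stops once the validation error meets the threshold."""
    w, b = 0.0, 0.0
    for epoch in range(1, max_epochs + 1):
        grad_w = grad_b = 0.0
        for x, t in train:
            err = predict(x, w, b) - t
            grad_w += err * x
            grad_b += err
        w -= learning_rate * grad_w / len(train)
        b -= learning_rate * grad_b / len(train)
        val_error = log_loss([t for _, t in validation],
                             [predict(x, w, b) for x, _ in validation])
        if val_error <= error_threshold:
            break  # error meets the set threshold: keep this as the target model
    return w, b, epoch, val_error

train = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
validation = [(-1.5, 0), (1.5, 1)]
w, b, epochs_used, val_error = train_until_threshold(train, validation, 0.2)
```

The `max_epochs` cap guards against a threshold that is never reached.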
4) The user similarity matrix model, the content filtering model and the target deep factorization machine model are fused by linear weighted fusion and then synchronized to obtain the personalized recommendation model, which can make personalized recommendation predictions for a target user in the subsequent steps.
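Linear weighted fusion can be illustrated with a short sketch in which each model contributes a per-item score and the fused score is their weighted sum. The weights, the model keys and the scores below are hypothetical, and the sketch assumes the three models' scores have already been brought onto a comparable scale.

```python
def fuse_scores(model_scores, weights):
    """fused(item) = sum over models of weight[model] * score[model][item].

    An item that a model did not score contributes 0 for that model.
    """
    fused = {}
    for model, scores in model_scores.items():
        for item, score in scores.items():
            fused[item] = fused.get(item, 0.0) + weights[model] * score
    return fused

model_scores = {
    "collaborative": {"dr_li": 0.8, "dr_wang": 0.4},
    "content": {"dr_li": 0.6, "dr_zhao": 0.9},
    "deepfm": {"dr_li": 0.7, "dr_wang": 0.9, "dr_zhao": 0.2},
}
weights = {"collaborative": 0.3, "content": 0.2, "deepfm": 0.5}
fused = fuse_scores(model_scores, weights)
best_item = max(fused, key=fused.get)
```

In practice the weights would be tuned, for example against the validation set, rather than fixed by hand.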
Traditional recommendation algorithms mainly comprise collaborative filtering and content-based filtering. Recommendation models built on them depend on large-scale data preprocessing, their feature engineering is complex and time-consuming, and their recommendation performance and precision are limited, so both precision and personalization fall short. In this embodiment, the deep factorization machine algorithm from the deep learning field is combined with the traditional recommendation algorithms, and the advantages of the deep model in feature selection and model construction are used to improve the recommendation capability of the medical recommendation system.
S103, acquiring a recommendation result set based on the personalized recommendation model, and recommending the recommendation result to the user.
In step S103, an index is built for the medical information data in the database, the personalized recommendation model is applied for recall and ranking, and a recommendation result list containing diseases, hospitals, departments and doctors is returned to the user in real time over the Internet through a browser, completing one recommendation process. Step S103 comprises three processes, namely index construction, result recall and result ranking, described as follows:
S1031, index construction: an index is built for the medical information data in the database, which facilitates the subsequent selection and retrieval of the result list and improves the performance and efficiency of the medical recommendation system.
S1032, result recall: the input information of the target user is analyzed and computed according to the personalized recommendation model and combined with index analysis to obtain the recommendation result set.
S1033, result ranking: the recalled results (namely the recommendation result set) are ranked according to a preset weight rule to obtain the top N recommendation results, which are recommended to the target user in real time, completing one recommendation process.
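The three processes of step S103 can be sketched together. An in-memory inverted index stands in for the database index, and the preset weight rule is modelled as a weighted sum of the model score and a stored rating. The record fields, the weights and the names are assumptions for illustration only.

```python
import heapq
from collections import defaultdict

def build_index(records):
    """S1031: inverted index mapping each keyword to the ids that carry it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for keyword in record["keywords"]:
            index[keyword].add(record_id)
    return index

def recall(index, query_keywords):
    """S1032: union of all records matching any query keyword."""
    candidates = set()
    for keyword in query_keywords:
        candidates |= index.get(keyword, set())
    return candidates

def rank_top_n(candidates, model_score, records, n,
               model_weight=0.7, rating_weight=0.3):
    """S1033: rank candidates by a preset weight rule and keep the top N."""
    def final_score(record_id):
        return (model_weight * model_score[record_id]
                + rating_weight * records[record_id]["rating"])
    return heapq.nlargest(n, candidates, key=final_score)

records = {
    "dr_li": {"keywords": {"arrhythmia", "cardiology"}, "rating": 0.9},
    "dr_wang": {"keywords": {"hypertension", "cardiology"}, "rating": 0.6},
    "dr_zhao": {"keywords": {"dermatology"}, "rating": 0.95},
}
# Scores assumed to come from the personalized recommendation model.
model_score = {"dr_li": 0.71, "dr_wang": 0.57, "dr_zhao": 0.28}

index = build_index(records)
candidates = recall(index, ["cardiology"])
top_results = rank_top_n(candidates, model_score, records, n=1)
```

Recalling through the index first keeps the ranking step from having to score every record in the database.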
Example 2:
As shown in fig. 3, this embodiment provides a medical recommendation system based on multi-modal cardiovascular disease information, applicable to the field of medical data analysis and recommendation. Its main value is to provide users with personalized medical information and to quantify and rank otherwise abstract information about the medical standard of hospitals and doctors, which improves users' trust in and understanding of medical information, optimizes the allocation of medical resources, reduces their waste, helps improve the doctor-patient relationship, and supports the development of the medical field. The medical recommendation system of this embodiment comprises a data acquisition module, a model construction module and a recommendation engine module, described as follows:
The data acquisition module is used for collecting medical information data related to various cardiovascular diseases and preprocessing them, specifically: acquiring clinical text modality data of the cardiovascular disease as first modality data, acquiring electrocardiogram signal data as second modality data, and preprocessing the first modality data and the second modality data.
The data acquisition module of this embodiment may include a first modality data acquisition unit, a second modality data acquisition unit, and a preprocessing unit, and the specific descriptions of the first modality data acquisition unit, the second modality data acquisition unit, and the preprocessing unit are as follows:
The first modality data acquisition unit is configured to acquire clinical text modality data of the cardiovascular disease as the first modality data, specifically: medical information data related to various cardiovascular diseases are acquired from medical websites on the Internet, such as disease keyword information from Chinese medical dictionaries and medical academic articles, Chinese electronic medical records, and information on the relevant hospitals, departments and doctors; personal information and symptom information are acquired from users; and the users' symptom descriptions, the disease feature information and the hospital and doctor text information together form the clinical text modality data of the cardiovascular disease, which serve as the first modality data.
The second modality data acquisition unit is configured to acquire electrocardiogram signal data as second modality data, and specifically includes: the method comprises the steps of obtaining electrocardiogram signal data uploaded by a user, or obtaining the electrocardiogram signal data of the user based on intelligent wearable equipment, or obtaining the electrocardiogram signal data of the user based on a portable electrocardiogram machine, wherein the electrocardiogram signal data are time sequence signal data which are image signal modal data, and the electrocardiogram signal data are used as second modal data.
The preprocessing unit is used for identifying and extracting the acquired clinical text modality data of the cardiovascular disease, removing irrelevant information, and structuring the result; and for performing deterministic learning dynamic modeling on the acquired electrocardiogram signal data to obtain electrocardiogram dynamics characteristic data, normalizing the electrocardiogram dynamics characteristic data, performing a feature operation to obtain quantitative index data, and structuring the quantitative index data.
The model construction module is used for processing the preprocessed data and constructing a personalized recommendation model based on a plurality of recommendation algorithms, specifically: a personalized recommendation model is constructed from the preprocessed data by combining three recommendation algorithms, namely a collaborative filtering algorithm, a content-based filtering algorithm and a deep factorization machine (DeepFM) algorithm. After the personalized recommendation model is constructed, it is synchronized to the recommendation engine module.
Within the model construction module, the recommendation algorithms are the focus of this work, because they can meet the increasingly complex and diverse information recommendation needs of the medical field and help optimize the allocation of medical resources. An intelligent medical recommendation system built on these algorithms can provide relevant recommendations tailored to the personalized medical needs of different users. For example, the personalized recommendation system acquires the personal information entered by a user, analyzes the data, constructs a model, and then integrates the user's other information with medical data to recommend personalized medical information to the user.
The recommendation engine module is used for acquiring a recommendation result set based on the personalized recommendation model and recommending the recommendation result to the target user, specifically: indexing, recall and ranking are performed according to the generated personalized recommendation model to generate a personalized recommendation result for the target user, which is then recommended to the target user.
In a specific implementation of this embodiment, each module of the medical recommendation system based on multi-modal cardiovascular disease information can be implemented in Python. To use the system, a user needs only a conventional browser and an Internet connection, with no additional software and no restriction on the operating system; the system can be used on platforms such as Linux, Windows and macOS.
Example 3:
The present embodiment provides a computer device, which may be a computer, as shown in fig. 4, and includes a processor 402, a memory, an input device 403, a display 404, and a network interface 405 connected by a system bus 401, where the processor is used to provide computing and control capabilities, the memory includes a nonvolatile storage medium 406 and an internal memory 407, the nonvolatile storage medium 406 stores an operating system, a computer program, and a database, the internal memory 407 provides an environment for the operating system and the computer program in the nonvolatile storage medium to run, and when the processor 402 executes the computer program stored in the memory, the medical recommendation method of embodiment 1 is implemented as follows:
acquiring clinical text modal data of the cardiovascular disease as first modal data, acquiring electrocardiogram signal data as second modal data, and preprocessing the first modal data and the second modal data;
constructing a personalized recommendation model based on a plurality of recommendation algorithms according to the preprocessed data;
and acquiring a recommendation result set based on the personalized recommendation model, and recommending the recommendation result to the target user.
Example 4:
the present embodiment provides a storage medium, which is a computer-readable storage medium, and stores a computer program, and when the computer program is executed by a processor, the medical recommendation method of embodiment 1 is implemented as follows:
acquiring clinical text modal data of the cardiovascular disease as first modal data, acquiring electrocardiogram signal data as second modal data, and preprocessing the first modal data and the second modal data;
constructing a personalized recommendation model based on a plurality of recommendation algorithms according to the preprocessed data;
and acquiring a recommendation result set based on the personalized recommendation model, and recommending the recommendation result to the target user.
It should be noted that the computer readable storage medium of the present embodiment may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In conclusion, the invention introduces multi-modal cardiovascular medical data: the first modal information is formed from the user's symptom descriptions, disease feature information and hospital and doctor text information, and the second modal information is formed from the time-series waveform signal data of the user's electrocardiogram. Combining the feature representations of the multi-modal information reflects the user's physical condition and medical needs more accurately, thereby improving modeling precision.
The above describes only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto; any equivalent substitution or change made by a person skilled in the art to the technical solution and inventive concept of the present invention, within the scope disclosed herein, falls within the protection scope of the present invention.