Intelligent auxiliary diagnosis method and system for children's respiratory diseases
Technical Field
The invention relates to artificial intelligence technology, in particular to an intelligent auxiliary diagnosis method and a diagnosis system for children's respiratory diseases.
Background
Shortages of personnel and uneven distribution of expertise remain serious problems in children's medical and health services in China. Primary-level staff lack extensive clinical experience, and traditional medical training has difficulty delivering the expected results. As a direct consequence, primary-care doctors struggle to take on their responsibilities and patients struggle to obtain high-quality medical care.
The industry has therefore turned its attention to technologies built around artificial intelligence, with the aim of replicating the capabilities of the best domestic pediatric medical resources and, through such technology, extending high-quality medical resources to the primary level in an orderly and effective way.
At present, Baidu has introduced the Baidu Medical Brain, its latest artificial intelligence result in the medical field, formally applying artificial intelligence technology to the medical and health industry. The Baidu Medical Brain is a specific application of the Baidu Brain to medical scenarios. Its application scenarios include intelligent assistance for online consultation on Baidu Doctor, assistance for hospitals, user profiling for patients, chronic disease management and the like.
However, the Baidu Medical Brain cannot diagnose and treat children's diseases, because children cannot express their condition clearly and their individual characteristics are not well understood. At present, artificial intelligence remains a blank in the diagnosis and treatment of children's respiratory diseases, particularly difficult and complicated diseases. Existing artificial intelligence image analysis is limited to radiological images of the adult lung and focuses mostly on the interpretation of lung cancer; there is very little research on difficult and complicated diseases of the pediatric respiratory department, and the clinical use of radiological imaging is strictly limited because of the particular characteristics of children.
How to realize intelligent diagnosis of difficult and complicated diseases in children has therefore become a problem in urgent need of a solution.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an intelligent auxiliary diagnosis method and a diagnosis system for children respiratory diseases.
In a first aspect, the present invention provides a method for intelligently assisting in diagnosing respiratory diseases of children, comprising:
S1, acquiring auxiliary examination information and/or doctor inquiry record information of the child to be diagnosed;
S2, screening key information according to the auxiliary examination information and/or doctor inquiry record information;
S3, processing the key information by adopting a pre-trained typical symptom diagnosis model according to a pre-established children disease knowledge base system to obtain a diagnosis result;
the diagnostic result includes: at least one possible disease, and each possible disease corresponds to a characteristic of the child to be diagnosed.
Optionally, before step S3, the method further includes:
collecting various symptom information and diagnosis information of the difficult and complicated diseases of the respiratory department of children;
establishing a term dictionary of the difficult and complicated diseases in the respiratory department and information of each case in the term dictionary according to the collected symptom information, the collected diagnosis information and the prior knowledge of experts;
constructing a children disease knowledge base system comprising a training set, a verification set and a test set through the information of each case in the term dictionary;
training the established typical symptom diagnosis model by adopting the established children disease knowledge base system to obtain the trained typical symptom diagnosis model;
or,
collecting various symptom information and diagnosis information of the respiratory department of children;
establishing a term dictionary of the respiratory department and information of each case in the term dictionary according to the collected symptom information, the collected diagnosis information and the prior knowledge of experts;
constructing a children disease knowledge base system comprising a training set, a verification set and a test set through the information of each case in the term dictionary;
and training the established typical symptom diagnosis model by adopting the established children disease knowledge base system to obtain the trained typical symptom diagnosis model.
Optionally, the diagnostic information comprises: one or more of diagnosis thought, diagnosis method, diagnosis process, disease symptom expression and disease differential diagnosis.
Optionally, the auxiliary inspection information includes: one or more of pathological examination information, ultrasonic auxiliary examination information, X-ray auxiliary examination information, CT and nuclear magnetic auxiliary examination information;
and/or,
the doctor inquiry record information comprises: auxiliary examination information of non-expert doctors, recorded family history information and allergy history information.
Optionally, the step S2 includes:
acquiring personal disease history data of the child to be diagnosed, which is stored by an automatic diagnosis system;
screening key information according to personal disease history data, auxiliary examination information and/or doctor inquiry record information of the child to be diagnosed; or, processing the personal disease history data, auxiliary examination information and/or doctor inquiry record information using one of a hidden Markov model, a conditional random field (CRF) and a deep learning method, and screening out key information on disease features for forming an input vector;
and/or the weights of all words in the key information are represented by one vector, the key information is a multidimensional vector, and each disease feature is one dimension of that vector.
Optionally, the typical symptom diagnosis model is a BP neural network model,
the number of nodes of the input layer of the BP neural network model is consistent with the dimension of the input vector,
the learning step length is 0.01-0.8; the number of hidden layer nodes is determined according to the complexity of the network structure and the error requirement by using a node deletion method and an expansion method;
the output layer of the BP neural network model is a single layer, and the number of its nodes is consistent with the dimension of the output vector.
In a second aspect, an intelligent auxiliary diagnosis system for respiratory diseases of children comprises:
the first information collection module is used for collecting doctor inquiry record information of the child to be diagnosed;
the second information collection module is used for acquiring auxiliary examination information of the child to be diagnosed when auxiliary examination items exist in the doctor inquiry record information;
the processing module is used for screening key information according to the auxiliary examination information and/or the doctor inquiry record information; processing the key information by adopting a pre-trained typical symptom diagnosis model according to a pre-established children disease knowledge base system to obtain a diagnosis result;
a display module for displaying the diagnosis result, wherein the diagnosis result comprises: at least one possible disease, and each possible disease corresponds to a characteristic of the child to be diagnosed.
Optionally, the first information collecting module includes:
the disease onset recording unit is used for recording disease onset classification information of the children to be diagnosed;
the symptom expression recording unit is used for recording the body state information of the child to be diagnosed;
the symptom examination recording unit is used for recording the preliminary examination information of the current doctor on the child to be diagnosed;
the past medical history inquiry recording unit is used for recording the response medical history information of the child to be diagnosed;
and the auxiliary inspection item confirming unit is used for recording auxiliary inspection items needing auxiliary inspection.
Optionally, the first information collection module and the presentation module are simultaneously located on an information presentation interface of the auxiliary diagnosis system.
The invention has the following beneficial effects:
in the invention, based on the acquisition of the diagnosis standard and the prior knowledge of the difficult and complicated symptoms of the children respiration department, an artificial intelligence technology is utilized to develop an artificial intelligence auxiliary diagnosis system for the difficult and complicated symptoms of the children respiration department, so that a doctor is helped to improve the cognition and diagnosis level of the difficult and complicated symptoms, misdiagnosis is reduced, a treatment scheme is optimized, and a powerful support is provided for early prevention.
In addition, the method can realize favorable support for primary medical treatment, so that the infant patient can perform preliminary examination and screening on the primary level, and then transfer the problem to an expert, thereby being capable of reasonably distributing medical resources and ensuring the reasonable use of the medical resources.
The method of the invention has been tested: its rate of correctly distinguishing difficult and complicated diseases of the respiratory department can reach 99 percent, and it can effectively assist doctors in carrying out examinations.
Drawings
Fig. 1 is a schematic flow chart of a method for intelligently assisting in diagnosing respiratory diseases of children according to an embodiment of the present invention;
Fig. 2 is a schematic view of an information display interface of an intelligent auxiliary diagnosis system for children's respiratory diseases according to an embodiment of the present invention;
Fig. 3 is a flowchart of training a BP neural network according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a trained BP neural network diagnostic process according to an embodiment of the present invention;
Fig. 5 is an architecture diagram of an intelligent auxiliary diagnosis method for respiratory diseases of children according to an embodiment of the present invention.
Detailed Description
For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.
Example 1
As shown in fig. 1, the intelligent auxiliary diagnosis method for respiratory diseases of children of the present embodiment includes the following steps:
S1, acquiring auxiliary examination information and/or doctor inquiry record information of the child to be diagnosed.
For example, the auxiliary inspection information includes: one or more of ultrasound-assisted examination information, X-ray-assisted examination information, and nuclear magnetic-assisted examination information;
the doctor inquiry record information comprises: auxiliary examination information of non-expert doctors, recorded family history information and allergy history information.
S2, screening key information according to the auxiliary examination information and/or the doctor inquiry record information.
For example, personal disease history data of the child to be diagnosed, which is stored by the automatic diagnosis system, is obtained;
and screening key information according to personal disease history data, auxiliary examination information and/or doctor inquiry record information of the children to be diagnosed.
The key information of the present embodiment may be represented by a vector, and each disease feature in the key information may be one dimension of that vector.
The auxiliary inspection information in this embodiment includes: one or more of pathological examination information, ultrasonic auxiliary examination information, X-ray auxiliary examination information, CT and nuclear magnetic auxiliary examination information;
the doctor inquiry record information comprises: auxiliary examination information of non-expert doctors, recorded family history information and allergy history information.
S3, processing the key information by adopting a pre-trained typical symptom diagnosis model according to a pre-established children disease knowledge base system to obtain a diagnosis result, as shown in Fig. 5.
The diagnostic result includes: at least one possible disease, and each possible disease corresponds to a characteristic of the child to be diagnosed.
The typical symptom diagnosis model in this embodiment may be a BP neural network model with pre-trained layers such as an input layer, a convolutional layer and a fully connected layer; the thresholds, weights and other parameters of the trained BP neural network model adapt to the different disease features.
For example, the following steps may be performed in this embodiment before step S3 to establish a child disease knowledge base system, and train the BP neural network model based on the established child disease knowledge base system.
A01, collecting various symptom information and diagnosis information of the difficult and complicated diseases of the respiratory department of children.
For example, the diagnostic information includes: one or more of diagnosis thought, diagnosis method, diagnosis process, disease symptom expression and disease differential diagnosis. The diagnostic information includes textual information and/or image data.
A02, establishing a term dictionary of the difficult and complicated diseases in the respiratory department and the information of each case in the term dictionary according to the collected symptom information and diagnosis information and the prior knowledge of experts.
In the present embodiment, the term dictionary includes a plurality of classification information, and each classification information may include a plurality of case information.
A03, constructing a children disease knowledge base system comprising a training set, a verification set and a test set through the information of each case in the term dictionary;
A04, training the established typical symptom diagnosis model by adopting the constructed children disease knowledge base system to obtain the trained typical symptom diagnosis model.
In the training process, to address the uneven distribution of diseases in the training set, this embodiment may balance the diseases by data resampling, for example oversampling or undersampling.
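The balancing step can be illustrated with a minimal sketch of random oversampling. The function name `oversample`, the toy case list and the choice to duplicate minority cases (rather than undersample the majority) are assumptions made for illustration; the embodiment does not specify them. In practice such resampling would normally be applied to the training set only.

```python
import random
from collections import Counter


def oversample(cases, seed=0):
    """Duplicate randomly chosen minority-class cases until every disease
    has as many cases as the most frequent disease."""
    rng = random.Random(seed)
    by_label = {}
    for features, label in cases:
        by_label.setdefault(label, []).append((features, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies only for classes below the target size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy training set: 6 cases of one disease, 2 of another.
cases = [([1, 0], "bronchiectasis")] * 6 + [([0, 1], "cystic fibrosis")] * 2
balanced = oversample(cases)
counts = Counter(label for _, label in balanced)
print(counts)
```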
The above description concerns difficult and complicated diseases of the pediatric respiratory department.
Of course, in practical application it is also possible to collect symptoms of the respiratory department in general, not necessarily limited to its difficult and complicated diseases. In this case, step A01 can be adapted as follows: collecting various symptom information and diagnosis information of the pediatric respiratory department.
Part of the classification information of the term dictionary is shown in the following table.
| Serial number | Department | ICD code | Name of disease |
|---|---|---|---|
| 1 | Respiratory department | B37.101+J99.8* | Pulmonary fungal infection-pulmonary candidiasis |
| 2 | Respiratory department | B44.101+J99.8* | Allergic bronchopulmonary aspergillosis |
| 3 | Respiratory department | B44.102+J99.8* | Pulmonary fungal infection-pulmonary aspergillosis |
| 4 | Respiratory department | B45.001+J99.8* | Pulmonary fungal infection-pulmonary cryptococcosis |
| 5 | Respiratory department | C85.908 | Pulmonary lymphoma |
| 6 | Respiratory department | C95.0018 | Lymphocytic interstitial pneumonia |
| 7 | Respiratory department | D71xx03 | Chronic granulomatous disease |
| 8 | Respiratory department | D76.015 | Langerhans cell histiocytosis |
| 9 | Respiratory department | D80.501 | Immunodeficiency with high IgM |
| 10 | Respiratory department | D81.901 | Combined immunodeficiency disease |
| 11 | Respiratory department | D83.901 | Common variable immunodeficiency disease |
| 12 | Respiratory department | E84.902 | Cystic fibrosis |
| 13 | Respiratory department | J47xx01 | Bronchiectasis |
| 14 | Respiratory department | J67.902 | Allergic alveolitis |
| 15 | Respiratory department | J69.001 | Aspiration pneumonia |
| 16 | Respiratory department | J84.001 | Pulmonary alveolar proteinosis |
| 17 | Respiratory department | J84.003 | Diffuse alveolar hemorrhage |
| 18 | Respiratory department | J84.902 | Interstitial pneumonia |
| 19 | Respiratory department | M30.101 | Eosinophilic granulomatous vasculitis |
| 20 | Respiratory department | G24.902 | Primary ciliary dyskinesia |
The child disease knowledge base system in this embodiment may include a training set, a validation set, and a test set.
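As a rough illustration of dividing the cases into the three sets, the sketch below shuffles the cases and splits them by percentage. The 70/15/15 proportions, the function name `split_cases` and the placeholder case identifiers are assumptions; the embodiment does not state how the sets are sized.

```python
import random


def split_cases(cases, train_pct=70, val_pct=15, seed=0):
    """Shuffle the cases and split them into training, validation and test sets.
    Percentages are integers so the set sizes are computed exactly."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # whatever remains
    return train_set, val_set, test_set


# Hypothetical placeholder cases; real entries would be vectorized case records.
cases = [f"case-{i}" for i in range(20)]
train_set, val_set, test_set = split_cases(cases)
print(len(train_set), len(val_set), len(test_set))
```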
In some embodiments of the present application, a large number of professional terms may be collected from various websites or web pages, such as Baidu Encyclopedia, medical search and inquiry sites and electronic medical record data, to construct a multi-type term dictionary. Word segmentation can be performed using a conditional random field (CRF), a hidden Markov model, a deep learning method or the like, and in this embodiment the segmentation results can be corrected by pediatric experts to obtain the corpus of the term dictionary.
Each word in the corpus is represented by a vector, so that the keywords in the corpus can be characterized in vectorized form.
Since the vector has a large dimension and few non-zero elements, it is stored sparsely. For example, each word is assigned a numerical ID: one term such as "fever" may be recorded as 3 and another term as 5 (assuming IDs start from 0). In programming, the assignment of IDs to words can be done with a hash table. Word vectors may be trained using the Skip-gram model.
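The ID assignment and sparse storage described above can be sketched with a Python dictionary acting as the hash table. The term list and the function names `build_vocabulary` and `to_sparse` are hypothetical; only the idea of numbering words from 0 and storing non-zero entries comes from the text.

```python
def build_vocabulary(terms):
    """Assign each distinct term the next free integer ID, starting from 0."""
    vocab = {}
    for term in terms:
        if term not in vocab:
            vocab[term] = len(vocab)
    return vocab


def to_sparse(tokens, vocab):
    """Store a record as {term ID: count}, keeping only non-zero entries."""
    sparse = {}
    for token in tokens:
        if token in vocab:
            idx = vocab[token]
            sparse[idx] = sparse.get(idx, 0) + 1
    return sparse


# Hypothetical term dictionary; "fever" happens to receive ID 3 here.
terms = ["cough", "chest pain", "wheezing", "fever", "hemoptysis", "dyspnea"]
vocab = build_vocabulary(terms)
sparse = to_sparse(["fever", "cough", "fever"], vocab)
print(vocab["fever"])  # 3
print(sparse)          # {3: 2, 0: 1}
```

Only two of the six dimensions are stored for this record, which is the point of the sparse representation when the dictionary is large.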
Furthermore, the text of the electronic medical record, after being extracted from a database, can be processed into the format required as model input; image data such as CT, ultrasound, MR and pathology images can be acquired from the hospital's systems and preprocessed (for example by normalization or binarization).
Since the diagnostic information includes text information and image data, the normalization of the image data may proceed as follows:
The image data of the training set, the validation set and the test set are normalized. For example, in this embodiment the mean and the standard deviation of the training set are calculated first, and the data of the validation set and the test set are then normalized with that same mean and standard deviation, so that the data set including the classification information and the diagnosis information is processed to a unified standard. In addition, data augmentation can be applied to the training set to increase the amount of data available during processing and thereby improve training accuracy.
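A minimal sketch of this normalization, assuming each image has been flattened to a list of pixel intensities: the statistics are fitted on the training data only and then reused for the other sets. The function names and the toy pixel values are illustrative assumptions.

```python
import statistics


def fit_normalizer(train_pixels):
    """Compute the mean and (population) standard deviation of the training data."""
    mean = statistics.fmean(train_pixels)
    std = statistics.pstdev(train_pixels) or 1.0  # guard against constant images
    return mean, std


def normalize(pixels, mean, std):
    """Apply the training-set statistics to any data set."""
    return [(p - mean) / std for p in pixels]


# Hypothetical flattened pixel intensities.
train_pixels = [0.0, 50.0, 100.0, 150.0, 200.0]
val_pixels = [100.0, 250.0]

mean, std = fit_normalizer(train_pixels)
train_norm = normalize(train_pixels, mean, std)
val_norm = normalize(val_pixels, mean, std)  # reuses the training statistics
print(mean, round(std, 4), [round(v, 4) for v in val_norm])
```

Reusing the training statistics keeps the validation and test data on the same scale the model saw during training and avoids leaking information from those sets.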
The children disease knowledge base system in this embodiment uses a vectorized term dictionary: each category in the term dictionary may be a multi-dimensional vector, each dimension may be understood as one disease feature of that category, and the weights of the disease features of multiple categories are likewise represented by a vector.
In addition, the model used in the present embodiment is a BP neural network model. The process of BP neural network training can be illustrated as follows:
Referring to Figs. 3 and 4, a BP network can be used in the diagnosis model to build a disease prediction model. The basic modeling process is as follows: first, collect the main influencing factors of the diseases and the disease outcomes, for example the training set, test set and validation set of the children disease knowledge base system; then feed the influencing factors and disease outcomes into the designed neural network model and train repeatedly until the network converges (that is, until the expected training error is reached), applying suitable techniques during training so that the network trains as fast as possible with the smallest error and the best model; finally, use the established model to predict diseases.
Design of input layer
A BP neural network is constructed whose number of input layer nodes depends on the dimension of the input vector, that is, the dimension of the feature vector. Therefore, when designing a diagnosis simulation system for the respiratory department, variables related to the diseases are selected as input data: age, symptoms (cough, chest pain, fever, expectoration, hemoptysis, dyspnea, cold sweat and clubbed fingers) and examinations (radiographic examination and fiberoptic bronchoscopy), each assigned its own input variable. The data are preprocessed to obtain the dimension of the feature vector, which determines the number of nodes of the network input layer.
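To make the link between the selected variables and the input layer size concrete, the sketch below encodes one patient record as a feature vector. The encoding is an assumption for illustration only: age is scaled to [0, 1] with an assumed upper bound of 18 years, and every other variable is a 0/1 indicator; the field names are hypothetical.

```python
# One entry per input variable named in the text; the order fixes the node order.
FEATURES = [
    "age", "cough", "chest_pain", "fever", "expectoration", "hemoptysis",
    "dyspnea", "cold_sweat", "clubbed_fingers",
    "radiographic_exam", "bronchoscopy",
]


def encode(record, max_age=18.0):
    """Turn a patient record (dict) into a fixed-length input vector."""
    vector = []
    for name in FEATURES:
        value = record.get(name, 0)
        if name == "age":
            # Assumed scaling: clip at max_age and map to [0, 1].
            vector.append(min(float(value), max_age) / max_age)
        else:
            # Assumed encoding: 1.0 if the finding is present, else 0.0.
            vector.append(1.0 if value else 0.0)
    return vector


record = {"age": 9, "cough": True, "fever": True, "radiographic_exam": True}
x = encode(record)
n_input = len(x)  # number of input layer nodes
print(n_input, x)
```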
Design of hidden layer
Design of the number of hidden layer nodes
In a BP network, the choice of the number of hidden nodes is very important and strongly affects the performance of the resulting neural network model. If there are too few hidden nodes, the network may fail to train at all or may perform poorly. If there are too many, the systematic error of the network can be reduced, but the training time grows, training is more likely to fall into a local minimum and miss the optimum, and the excess nodes are an inherent cause of "overfitting" during training. The number of hidden nodes is therefore determined by a node deletion method and an expansion method, weighing the complexity of the network structure against the size of the error.
In this embodiment, the number of hidden nodes is chosen to avoid "overfitting" during training as far as possible while ensuring sufficiently high network performance and generalization capability. The most basic principle is: on the premise of meeting the accuracy requirement, use as compact a structure as possible, that is, as few hidden nodes as possible. An empirical formula gives $h = \sqrt{m + n} + a$, where h is the number of hidden layer nodes, m is the number of input layer nodes, n is the number of output layer nodes, and a is a constant. Alternatively, the empirical formula is $h = m + n + a$ with the same symbols.
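The selection principle can be sketched as follows. A commonly cited form of the empirical rule is h = sqrt(m + n) + a with a between 1 and 10; treating that as the intended formula is an assumption, as are the function names and the stand-in error function, which merely imitates "validation error falls as nodes are added". The sketch generates candidate sizes and keeps the smallest one that meets the error tolerance, in the spirit of the expansion method.

```python
import math


def hidden_node_candidates(m, n, a_values=range(1, 11)):
    """Candidate hidden layer sizes from h = sqrt(m + n) + a (assumed rule)."""
    base = math.sqrt(m + n)
    return [round(base + a) for a in a_values]


def pick_smallest(candidates, error_of, tolerance):
    """Return the most compact size whose error meets the tolerance;
    fall back to the largest candidate if none does."""
    for h in sorted(candidates):
        if error_of(h) <= tolerance:
            return h
    return max(candidates)


# 11 input nodes and 10 output nodes, as in the examples of this embodiment.
candidates = hidden_node_candidates(m=11, n=10)

# Stand-in for "train a network with h hidden nodes and measure its error".
toy_error = lambda h: 1.0 / h
chosen = pick_smallest(candidates, toy_error, tolerance=0.12)
print(candidates, chosen)
```

In a real system `toy_error` would be replaced by an actual training and validation run for each candidate size.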
Learning step length
The learning step size (learning rate) affects the stability of the learning process of the BP neural network. A large learning rate may make each correction of the network weights too large, so that the weights overshoot a minimum of the error during correction, jump irregularly and fail to converge. An excessively small learning rate makes learning take too long, but it does ensure convergence to some minimum. A smaller learning rate is therefore generally chosen to ensure the convergence (stability) of the learning process, usually between 0.01 and 0.8.
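The effect of the step size can be seen on a deliberately simple stand-in for the network error, E(w) = w², whose gradient is 2w. This one-parameter function is an assumption chosen for clarity, not the network's actual error surface; each update multiplies w by (1 - 2·lr), so the iteration shrinks toward the minimum at 0 only while that factor has magnitude below 1.

```python
def descend(lr, steps=50, w=1.0):
    """Run plain gradient descent on E(w) = w**2 and return the final weight."""
    for _ in range(steps):
        w -= lr * 2.0 * w  # gradient of w**2 is 2w
    return w


slow = descend(0.01)     # converges, but is still far from 0 after 50 steps
fast = descend(0.8)      # oscillates in sign, yet converges quickly
unstable = descend(1.2)  # overshoots further on every step and diverges
print(slow, fast, unstable)
```

The stable range depends on the error surface, so the bounds seen here are specific to this toy function and do not by themselves justify the 0.01 to 0.8 range quoted above.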
Initial connection weight of network
Because of the nature of the BP algorithm, the error function generally has multiple local minima, and the initial network weights directly determine which local minimum, or the global minimum, the BP algorithm converges to.
Determination of a transfer function
The transfer functions commonly used in BP neural networks are the linear transfer function purelin, the tangent sigmoid transfer function tansig and the logarithmic sigmoid transfer function logsig.
1) The linear transfer function calculates the output of a layer directly from its net input, in the format: f(x) = purelin(x).
2) The tangent sigmoid transfer function maps the input range of a neuron, (-∞, +∞), to (-1, 1). The function tansig() is differentiable and therefore well suited to training a neural network with the BP algorithm; this embodiment selects it as the transfer function of the hidden and output layers of the network. The format is: f(x) = tansig(x).
3) The logarithmic sigmoid transfer function maps the input range of a neuron, (-∞, +∞), to (0, 1). The function logsig() is also differentiable and suitable for training neural networks with the BP algorithm, in the format: f(x) = logsig(x).
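The three transfer functions can be written out directly. The sketch below uses the standard mathematical definitions associated with these names (identity, hyperbolic tangent and logistic function) together with their derivatives, which back-propagation needs; expressing the derivatives in terms of the function output y is a common convention, not something the embodiment specifies.

```python
import math


def purelin(x):
    """Linear transfer function: output equals the net input."""
    return x


def tansig(x):
    """Tangent sigmoid: maps (-inf, +inf) to (-1, 1)."""
    return math.tanh(x)


def logsig(x):
    """Logarithmic sigmoid: maps (-inf, +inf) to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tansig_derivative(y):
    """Derivative of tansig expressed through its output y = tansig(x)."""
    return 1.0 - y * y


def logsig_derivative(y):
    """Derivative of logsig expressed through its output y = logsig(x)."""
    return y * (1.0 - y)


for x in (-5.0, 0.0, 5.0):
    print(x, purelin(x), round(tansig(x), 4), round(logsig(x), 4))
```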
Design of output layer
The BP neural network has only one output layer, and its number of nodes is determined by the dimension of the output vector. For example, if research in the respiratory department shows that patients have 10 diseases (pneumonia, asthma, nervous system combination, acute bronchitis, ...), the output vector has 10 dimensions.
Training of neural network systems
Training a BP network means continuously adjusting the network weights by error back-propagation so that the sum of squared errors between the outputs of the network model and the known outputs of the training samples reaches a minimum or falls below an expected value. A reasonable BP neural network model is established by learning (training) on the samples in the training set. The trained model is then used for simulated diagnosis on the data in the test set and validation set, and the results are compared with the diagnoses of experienced doctors; a diagnostic coincidence rate above 90% is considered qualified.
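The training loop can be sketched in plain Python as a single-hidden-layer network with tansig units in the hidden and output layers, trained by error back-propagation on the sum of squared errors. Everything concrete here is an assumption for illustration: the toy samples (three binary features, two diseases), the learning rate of 0.1 (inside the 0.01 to 0.8 range), the 500 epochs, the per-sample updates and the function names. The coincidence rate is computed on the toy training samples only to keep the sketch short; the text calls for test and validation data.

```python
import math
import random


def train_bp(samples, n_hidden, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer tanh network by back-propagation."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Each weight row carries a bias weight as its last element.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
        output = [math.tanh(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w2]
        return hidden, output

    def sse():
        return sum((o - t) ** 2 for x, target in samples
                   for o, t in zip(forward(x)[1], target))

    history = [sse()]
    for _ in range(epochs):
        for x, target in samples:
            hidden, output = forward(x)
            # Output error term; the constant factor 2 is absorbed into lr.
            delta_out = [(o - t) * (1 - o * o) for o, t in zip(output, target)]
            # Propagate the error back through the output weights.
            delta_hid = [(1 - h * h) * sum(delta_out[k] * w2[k][j] for k in range(n_out))
                         for j, h in enumerate(hidden)]
            for k in range(n_out):
                for j, v in enumerate(hidden + [1.0]):
                    w2[k][j] -= lr * delta_out[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w1[j][i] -= lr * delta_hid[j] * v
        history.append(sse())
    return forward, history


def coincidence_rate(forward, cases):
    """Share of cases where the strongest output matches the labelled disease."""
    hits = sum(1 for x, target in cases
               if forward(x)[1].index(max(forward(x)[1])) == target.index(max(target)))
    return hits / len(cases)


# Hypothetical toy data: feature vector -> one-hot disease label.
samples = [
    ([1.0, 0.0, 0.0], [1.0, 0.0]),
    ([1.0, 1.0, 0.0], [1.0, 0.0]),
    ([0.0, 0.0, 1.0], [0.0, 1.0]),
    ([0.0, 1.0, 1.0], [0.0, 1.0]),
]
forward, history = train_bp(samples, n_hidden=4)
rate = coincidence_rate(forward, samples)
print(history[0], history[-1], rate)
```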
Example 2
As shown in fig. 2, the present embodiment provides an intelligent auxiliary diagnosis system for respiratory diseases of children, including:
the first information collection module is used for collecting doctor inquiry record information of the child to be diagnosed;
the second information collection module is used for acquiring auxiliary examination information of the child to be diagnosed when auxiliary examination items exist in the doctor inquiry record information;
the processing module is used for screening key information according to the auxiliary examination information and/or the doctor inquiry record information; processing the key information by adopting a pre-trained typical symptom diagnosis model according to a pre-established children disease knowledge base system to obtain a diagnosis result;
a display module for displaying the diagnosis result, wherein the diagnosis result comprises: at least one possible disease, and each possible disease corresponds to a characteristic of the child to be diagnosed.
As shown in fig. 2, the first information collecting module includes:
the disease onset recording unit is used for recording disease onset classification information of the children to be diagnosed;
the symptom expression recording unit is used for recording the body state information of the child to be diagnosed;
the symptom examination recording unit is used for recording the preliminary examination information of the current doctor on the child to be diagnosed;
the past medical history inquiry recording unit is used for recording the response medical history information of the child to be diagnosed;
and the auxiliary inspection item confirming unit is used for recording auxiliary inspection items needing auxiliary inspection.
In practical application, the first information collection module and the display module are simultaneously located on an information display interface of the auxiliary diagnosis system.
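The division of labor between the two information collection modules can be sketched as a small data structure plus a conditional step. The field names, the `fetch_result` callback and the sample values are hypothetical; only the five recording units and the rule that auxiliary examination information is acquired when auxiliary examination items exist come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class InquiryRecord:
    """What the first information collection module records (one field per unit)."""
    onset: str = ""                                        # disease onset classification
    symptoms: list = field(default_factory=list)           # body state information
    preliminary_exam: list = field(default_factory=list)   # current doctor's findings
    past_history: list = field(default_factory=list)       # past medical history
    auxiliary_items: list = field(default_factory=list)    # auxiliary examinations ordered


def collect_auxiliary(record, fetch_result):
    """Second module: acquire results only if auxiliary items were ordered.
    `fetch_result` is a hypothetical lookup into the hospital's systems."""
    if not record.auxiliary_items:
        return {}
    return {item: fetch_result(item) for item in record.auxiliary_items}


plain = InquiryRecord(onset="acute", symptoms=["cough"])
with_exam = InquiryRecord(onset="chronic", symptoms=["cough"], auxiliary_items=["chest X-ray"])

lookup = lambda item: f"result of {item}"  # stand-in for a real data source
print(collect_auxiliary(plain, lookup))
print(collect_auxiliary(with_exam, lookup))
```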
Through machine learning, a typical symptom diagnosis model is formed for the early onset of difficult and complicated respiratory diseases and of respiratory diseases in general. First, in the clinical standardized diagnosis and treatment quality control process carried out with cooperating specialists of the children's hospital respiratory department, high-sensitivity screening reminders for key clinical work elements are produced; second, in the typing of key difficult and complicated diseases, high-specificity assisted accurate diagnosis and typing is achieved, and auxiliary judgments are made on the risk of the disease state and its subsequent outcome.
The product is simple to operate, has a wide range of applications and is suitable for various medical and health systems. The left side lists disease symptoms, including patient age, onset condition, main symptoms, special examinations, family history and auxiliary examinations, as organized by respiratory experts of Beijing Children's Hospital; the right panel shows the disease diagnosis. When the doctor clicks the patient's symptoms on the left, the right column displays the diagnosis result in real time.
The diagnosis result shown on the right side has the format disease name + diagnosis weight, and the results are sorted by weight in descending order. Typical symptoms of the disease are listed below each diagnosis: symptoms highlighted in blue are present in the current patient, and symptoms shown in grey are absent. If the diagnosis result is a difficult and complicated disease, a red prompt appears in the upper right corner to warn that the patient may have such a disease, helping the doctor identify and diagnose it.
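The display logic described above can be sketched as a function that sorts the diagnoses by weight and marks each typical symptom as present or absent. The weights, symptom lists and function name are made-up illustrative values with no clinical meaning; the disease names are taken from the term dictionary table, and the color names simply stand for the blue and grey highlighting.

```python
def rank_diagnoses(weights, typical_symptoms, present, difficult):
    """Build the right-hand panel: diagnoses by descending weight, each with
    its typical symptoms marked blue (present) or grey (absent)."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    report = []
    for disease, weight in ranked:
        symptoms = [(s, "blue" if s in present else "grey")
                    for s in typical_symptoms.get(disease, [])]
        report.append({
            "disease": disease,
            "weight": weight,
            "symptoms": symptoms,
            "red_prompt": disease in difficult,  # difficult and complicated disease
        })
    return report


# Illustrative values only, not model output.
weights = {"Bronchiectasis": 0.21, "Cystic fibrosis": 0.64, "Aspiration pneumonia": 0.15}
typical_symptoms = {
    "Cystic fibrosis": ["cough", "clubbed fingers"],
    "Bronchiectasis": ["cough", "hemoptysis"],
    "Aspiration pneumonia": ["fever", "cough"],
}
present = {"cough", "hemoptysis"}
difficult = {"Cystic fibrosis"}

report = rank_diagnoses(weights, typical_symptoms, present, difficult)
for entry in report:
    print(entry["disease"], entry["weight"], entry["symptoms"], entry["red_prompt"])
```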
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
The above embodiments may be cross-referenced with one another, and this description does not limit the embodiments.
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.