CN120429411A - Auxiliary diagnosis and treatment question-answering method and system based on knowledge graph, equipment and medium

Info

Publication number
CN120429411A
Authority
CN
China
Prior art keywords
question
processing
entity
medical
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202510928288.7A
Other languages
Chinese (zh)
Other versions
CN120429411B (en)
Inventor
阚红星
梁振
张宝亮
谢宛青
谷宗运
邓勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Comprehensive National Science Center Big Health Research Institute
Anhui University of Traditional Chinese Medicine AHUTCM
Original Assignee
Hefei Comprehensive National Science Center Big Health Research Institute
Anhui University of Traditional Chinese Medicine AHUTCM
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Comprehensive National Science Center Big Health Research Institute and Anhui University of Traditional Chinese Medicine AHUTCM
Priority to CN202510928288.7A
Publication of CN120429411A
Application granted
Publication of CN120429411B
Legal status: Active
Anticipated expiration

Abstract

The invention provides an auxiliary diagnosis and treatment question-answering method, system, equipment and medium based on a knowledge graph. The method comprises: obtaining an original question and a picture; processing both with a CLIP model and fusing them to obtain fusion features; processing the fusion features to obtain a segmentation mask; processing the segmentation mask to obtain keywords; processing the picture and the keywords and splicing the results to obtain a prompt vector; processing the prompt vector to obtain a description text; extracting structural information from the description text; generating a medical question from the original question and the structural information; and obtaining answer content according to the medical question. Because the medical question is obtained by fusing the original question with the picture, and the answer content is retrieved from the knowledge graph based on that medical question, the user is no longer limited to text input and may take a picture and enter a few prompt words to ask an auxiliary diagnosis and treatment question.

Description

Auxiliary diagnosis and treatment question-answering method and system based on knowledge graph, equipment and medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an auxiliary diagnosis and treatment question-answering method, system, equipment and medium based on a knowledge graph.
Background
Along with the development of artificial intelligence and big data technology, the medical industry is accelerating the intelligent transformation. Various medical intelligent question-answering systems have been developed and aim to assist doctors in diagnosis and treatment and patient consultation through natural language processing technology. Such systems retrieve relevant medical knowledge from a knowledge base or knowledge graph and return answers by analyzing the medical natural language of the user's question, converting it into machine-understandable information. However, since the medical question contains a large number of terms and the expression modes are various, the system has high requirements on word segmentation, entity recognition and semantic understanding of the language. In practical application, medical data has the characteristics of complex sources and various formats, and how to construct and maintain a high-quality medical knowledge base or knowledge graph becomes a key subject.
In the prior art, knowledge graph technology is generally used to represent medical knowledge in a structured form, so as to support the efficient operation of a medical intelligent question-answering system. However, the prior-art schemes have several technical defects in practical application. First, medical data sources are heterogeneous and of poor quality, and entity recognition and relation extraction are insufficiently accurate during knowledge extraction. Second, the technical means for entity alignment and fusion are limited, which tends to increase the number of redundant nodes in the knowledge graph and degrades query efficiency. Third, the user's natural-language question is difficult to parse, and existing systems struggle to match complex questions accurately to graph entities, which lowers question-answering accuracy. Fourth, graph retrieval performance is limited by redundancy in the graph structure, the response speed is low, and an efficient mechanism for updating and expanding knowledge is lacking, so it is difficult to keep pace with the rapid evolution of medical knowledge.
Although the prior art improves the performance of the question-answering system to some extent, the prior art still has the defects in practical application. Specifically, in the aspect of Chinese natural language processing, due to the lack of standardized medical vocabulary and labeling data, the existing system has low word segmentation and semantic understanding accuracy, and influences entity matching and answer retrieval effects. In the aspects of entity recognition and intention recognition, the accuracy is insufficient, and the complex medical questioning intention of the user cannot be fully understood. In the aspect of multi-source knowledge fusion, the lack of effective entity alignment and redundancy removal mechanisms leads to increased redundant information in a knowledge base and reduced retrieval efficiency and answer quality. In addition, the system architecture is closed, the expansibility and portability are poor, and the requirements of medical knowledge rapid update and multi-scene application are difficult to adapt. Therefore, there is a need for improvements in the prior art to improve the overall accuracy, query efficiency, and flexibility of use of the medical question-answering system.
Disclosure of Invention
In view of the defects in the prior art, the invention provides an auxiliary diagnosis and treatment question-answering method, system, equipment and medium based on a knowledge graph, so as to solve the technical problems in the prior art that intention recognition is inaccurate and that entities in the user's expression do not match knowledge graph entities.
The invention provides an auxiliary diagnosis and treatment question-answering method based on a knowledge graph, comprising: obtaining an original question and a picture input by a user; processing the original question and the picture with a CLIP model and fusing them to obtain fusion features; processing the fusion features with a decoder to obtain a segmentation mask, so as to find the region related to the original question; processing the segmentation mask with a multi-head attention mechanism according to a preset keyword library, and screening out the keywords most relevant to the region related to the original question; processing the picture and the keywords with the CLIP model and splicing the results to obtain a prompt vector; processing the prompt vector with an LLM to obtain a description text, and extracting structural information from the description text; generating a medical question fused with picture information according to the original question and the structural information; and obtaining answer content from the knowledge graph according to the medical question.
In an embodiment of the invention, the original question sentence and the picture are processed by using a CLIP model and then fused to obtain fusion characteristics, and the method comprises the steps of respectively processing the picture and the original question sentence by using the CLIP model to obtain image characteristics and text characteristics, adding the image characteristics and the text characteristics element by element, then carrying out layer normalization processing, inputting the image characteristics and the text characteristics into a multi-layer perceptron for nonlinear transformation after the image characteristics and the text characteristics are subjected to splicing processing, and adding the characteristics subjected to the layer normalization processing and the characteristics output by the multi-layer perceptron to obtain the fusion characteristics.
In an embodiment of the invention, the method for processing the fusion features by using a decoder to obtain a segmentation mask comprises the steps of processing the fusion features by using a plurality of convolution layers to obtain a plurality of different scale features, performing up-sampling processing on the different scale features to restore the different scale features to the size of the fusion features, adding the plurality of up-sampled features element by element, processing by using one convolution layer to obtain a single-channel mask feature, and processing the single-channel mask feature by using a Sigmoid function to obtain the segmentation mask.
In an embodiment of the present invention, processing the segmentation mask with a multi-head attention mechanism according to a preset keyword library, and screening out the keywords most relevant to the region related to the original question, includes: calculating, with a plurality of attention heads respectively, the semantic similarity between the segmentation mask and each keyword in the keyword library; splicing the outputs of the plurality of attention heads and integrating them through a linear transformation to obtain a matching degree matrix; and screening out the keywords according to the maximum value in the matching degree matrix.
In one embodiment of the invention, processing the prompt vector with the LLM to obtain a description text and extracting structural information from the description text comprises: converting the prompt vector into a continuous vector representation with the embedding layer of the LLM; processing the continuous vector representation with the multi-layer Transformer structure and the autoregressive mechanism inside the decoder of the LLM to obtain the description text; and decomposing the description text into the structural information with natural language processing techniques, wherein the structural information comprises lesion localization, morphological features and clinical prompts.
In an embodiment of the invention, obtaining answer content from the knowledge graph according to the medical question comprises: processing the medical question with a pre-trained first language model and a first neural network to obtain the user intention; processing the medical question to obtain a first entity, and linking the first entity to a node in the knowledge graph to obtain a standard entity name; selecting the corresponding query template from predefined sentence pattern templates according to the user intention, and filling the standard entity name into the query template to obtain a query statement; and querying the knowledge graph with the query statement to return entities and their attributes, and generating the answer content in combination with the user intention.
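The template-selection and filling step described above can be illustrated with a minimal sketch. The intent labels, the Cypher-style template wording and the function name `build_query` are hypothetical; the source does not specify the query language or the contents of the predefined templates.

```python
# Hypothetical intent labels and Cypher-style templates; the source does not
# specify the query language or the template wording.
QUERY_TEMPLATES = {
    "disease_symptom": (
        "MATCH (d:Disease {{name: '{entity}'}})-[:HAS_SYMPTOM]->(s) RETURN s.name"
    ),
    "drug_side_effect": (
        "MATCH (m:Drug {{name: '{entity}'}})-[:HAS_SIDE_EFFECT]->(e) RETURN e.name"
    ),
}


def build_query(intent: str, standard_entity: str) -> str:
    """Select the template for the intent and fill in the standard entity name."""
    try:
        template = QUERY_TEMPLATES[intent]
    except KeyError:
        raise ValueError(f"no query template for intent {intent!r}") from None
    # Escape backslashes and single quotes so the entity name cannot break
    # out of the quoted string literal in the query.
    safe_entity = standard_entity.replace("\\", "\\\\").replace("'", "\\'")
    return template.format(entity=safe_entity)


print(build_query("drug_side_effect", "aspirin"))
```

In a deployed system, a parameterized query supported by the graph database driver would usually be preferable to string filling; the string form is used here only because it mirrors the template filling described in the text.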
In an embodiment of the invention, the first language model is a first BERT language model and the first neural network is a TextCNN neural network. Processing the medical question with the pre-trained first language model and the first neural network to obtain the user intention comprises: semantically encoding the medical question with the pre-trained first BERT language model to obtain a high-dimensional semantic vector of the medical question; inputting the high-dimensional semantic vector into the pre-trained TextCNN neural network, extracting local features in the vector through convolution operations, and applying nonlinear activation and pooling to obtain a probability distribution over user intentions; and taking the intention with the maximum probability in that distribution as the user intention.
In an embodiment of the invention, the first entity is linked to the nodes in the knowledge graph to obtain the standard entity name, which comprises the steps of inquiring whether the node which is the same as the name or the alias of the first entity exists according to the entity name or the description corresponding to the node in the knowledge graph, if so, taking the name of the corresponding node of the knowledge graph as the standard entity name, and if not, calculating the semantic similarity of the name or the description between the first entity and the corresponding entity of the node in the knowledge graph by using a second BERT language model, and selecting the name of the node with the highest semantic similarity as the standard entity name.
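The two-stage linking logic above (exact name or alias match first, similarity fallback second) can be sketched as follows. This is an illustration under stated assumptions: the node data is invented, and a character-level similarity from Python's `difflib` is swapped in for the second BERT language model, which the source uses for semantic similarity.

```python
from difflib import SequenceMatcher

# Toy knowledge-graph nodes: standard name plus known aliases (hypothetical data).
KG_NODES = {
    "aspirin": ["acetylsalicylic acid", "ASA"],
    "nitroglycerin": ["glyceryl trinitrate"],
    "angina pectoris": ["angina"],
}


def similarity(a: str, b: str) -> float:
    """Stand-in for the second BERT model: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_entity(mention: str) -> str:
    """Return the standard entity name for a mention extracted from the question."""
    # 1) Exact match against node names and aliases.
    for name, aliases in KG_NODES.items():
        candidates = [name.lower(), *(alias.lower() for alias in aliases)]
        if mention.lower() in candidates:
            return name

    # 2) Otherwise fall back to the node whose name or alias is most similar.
    def best_score(item):
        name, aliases = item
        return max(similarity(mention, cand) for cand in [name, *aliases])

    return max(KG_NODES.items(), key=best_score)[0]


print(link_entity("ASA"))             # exact alias match
print(link_entity("nitroglycerine"))  # resolved by the similarity fallback
```

A practical system would likely also apply a minimum similarity threshold so that unrelated mentions are not forced onto a node; the source does not state one, so none is assumed here.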
In an embodiment of the invention, the knowledge graph is constructed by acquiring a medical text, processing the medical text by using a named entity recognition module to obtain an optimal entity tag sequence corresponding to the medical text, splitting the optimal entity tag sequence according to attribute mapping and a syntax template based on a domain rule to obtain an entity and corresponding attributes thereof, extracting semantic relations among all the entities by using a relation extraction module to construct relations among the entities, and constructing the knowledge graph according to the entities and the corresponding attributes thereof and the relations among the entities.
In an embodiment of the invention, the medical text data is processed by utilizing an entity recognition model to obtain an optimal entity tag sequence corresponding to the medical text, and the method comprises the steps of carrying out word segmentation and Token coding on the medical text by utilizing a RoBERTa pre-training language model to obtain a context-sensitive word vector sequence, capturing the front-back semantic dependency relationship of the word vector sequence by utilizing a bidirectional long-short-term memory network to obtain a bidirectional sequence representation fused with local context, carrying out cross Token semantic association modeling on the bidirectional sequence representation by utilizing a multi-head self-attention mechanism, aggregating multi-head output to obtain a semantic enhanced vector representation, and carrying out tag prediction on each position in the semantic enhanced vector representation by utilizing a conditional random field to obtain the optimal entity tag sequence corresponding to the medical text.
In an embodiment of the invention, label prediction is performed on each position in the semantically enhanced vector representation by using a conditional random field to obtain an optimal entity label sequence corresponding to the medical text, wherein the method comprises the steps of mapping the semantically enhanced vector representation to a label space through linear transformation to obtain label emission scores of each position in the sequence, dynamically planning and calculating final scores of all possible label sequence paths through a Viterbi algorithm according to the emission scores and a transition matrix in the conditional random field, selecting a label sequence path with the highest final score as an optimal path, and taking a label sequence corresponding to the optimal path as the optimal entity label sequence.
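The Viterbi decoding step can be sketched in plain Python. The emission and transition scores below are made-up numbers for a three-label scheme (O, B, I); in the described method the emission scores would come from the linear transformation of the semantically enhanced representation and the transition matrix from the trained conditional random field.

```python
def viterbi(emissions, transitions):
    """Return (best_path, best_score) for a linear-chain CRF.

    emissions[t][j]   : emission score of label j at position t
    transitions[i][j] : score of moving from label i to label j
    """
    num_labels = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each label
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_labels):
            # Best previous label for arriving at label j at position t.
            best_prev = max(
                range(num_labels), key=lambda i: score[i] + transitions[i][j]
            )
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] + transitions[best_prev][j] + emissions[t][j]
            )
        score = new_score
        backpointers.append(pointers)

    # Trace the highest-scoring path back from the final position.
    last = max(range(num_labels), key=lambda j: score[j])
    best_score = score[last]
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best_score


LABELS = ["O", "B", "I"]
emissions = [[1.0, 2.0, 0.0], [0.5, 0.0, 1.5], [2.0, 0.0, 0.5]]
# A large negative score effectively forbids the transition O -> I.
transitions = [[0.0, 0.0, -1e4], [0.0, 0.0, 1.0], [0.0, 0.0, 0.5]]
path, best = viterbi(emissions, transitions)
print([LABELS[i] for i in path], best)  # ['B', 'I', 'O'] 6.5
```

Dynamic programming keeps only the best score per label at each position, so the cost grows linearly with sequence length rather than with the number of possible label paths.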
In an embodiment of the invention, the step of splitting the optimal entity tag sequence according to the attribute mapping and the syntax template based on the domain rule to obtain the entity and the corresponding attribute thereof further comprises the steps of calculating semantic similarity between different entity names and descriptions by using a pre-trained third BERT language model, and carrying out entity fusion according to a similarity calculation result, wherein the knowledge graph is constructed based on the fused entity.
The invention further provides an auxiliary diagnosis and treatment question-answering system based on the knowledge graph, comprising a request acquisition module, an image and text embedding fusion module, a fine-granularity decoding segmentation module, a region and keyword matching module, a prompt vector calculation module, a fine-granularity description generation module, a medical question generation module and a question and answer generation module. The request acquisition module is used for acquiring an original question and a picture input by a user. The image and text embedding fusion module is used for processing the original question and the picture with a CLIP model and fusing them to obtain fusion features. The fine-granularity decoding segmentation module is used for processing the fusion features with a decoder to obtain a segmentation mask, so as to find the region related to the original question. The region and keyword matching module is used for processing the segmentation mask with a multi-head attention mechanism according to a preset keyword library, and screening out the keywords most relevant to the region related to the original question. The prompt vector calculation module is used for processing the picture and the keywords with the CLIP model and splicing the results to obtain a prompt vector. The fine-granularity description generation module is used for processing the prompt vector with an LLM to obtain a description text, and extracting structural information from the description text. The medical question generation module is used for generating a medical question fused with picture information according to the original question and the structural information. The question and answer generation module is used for obtaining answer content from the knowledge graph according to the medical question.
In one embodiment of the invention, the question and answer generation module comprises a user intention recognition module, an entity link module, a query analysis module and an answer generation module, wherein the user intention recognition module is used for processing the medical question by utilizing a pre-trained first language model and a first neural network to obtain a user intention, the entity link module is used for processing the medical question to obtain a first entity, linking the first entity to a node in a knowledge graph to obtain a standard entity name, the query analysis module is used for selecting a corresponding query template from a predefined sentence pattern template according to the user intention and filling the standard entity name into the query template to obtain a query sentence, and the answer generation module is used for querying a returned entity and attributes thereof from the knowledge graph according to the query sentence and generating answer content by combining the user intention.
The system comprises a data acquisition module, a named entity identification module, a label sequence splitting module, an entity relation extraction module and a knowledge graph construction module, wherein the data acquisition module is used for acquiring medical texts, the named entity identification module is used for processing the medical texts to obtain optimal entity label sequences corresponding to the medical texts, the label sequence splitting module is used for splitting the optimal entity label sequences according to attribute mapping and a syntax template based on field rules to obtain entities and corresponding attributes thereof, the entity relation extraction module is used for extracting semantic relations among all the entities to construct relations among the entities, and the knowledge graph construction module is used for constructing the knowledge graph according to the entities and the corresponding attributes thereof and the relations among the entities.
In an embodiment of the invention, the system further comprises an entity alignment and fusion module, wherein the entity alignment and fusion module is used for calculating semantic similarity between different entity names and descriptions by utilizing a pre-trained third BERT language model and carrying out entity fusion according to a similarity calculation result.
To achieve the above and other related objects, the present invention also provides an electronic device including a processor, a memory, and a communication bus for connecting the processor and the memory, the processor being configured to execute a computer program stored in the memory to implement a method as provided in any one of the embodiments above.
To achieve the above and other related objects, the present invention also provides a computer-readable storage medium having stored thereon a computer program for causing a computer to perform the method provided in any one of the above embodiments.
The auxiliary diagnosis and treatment question-answering method, system, equipment and medium based on the knowledge graph have the following advantages. A medical question fused with picture information is obtained by fusing the original question with the picture, and the answer content is obtained from the knowledge graph based on that medical question, so the user is no longer limited to text input. In scenarios where the user cannot describe the actual condition well or accurately, the user may take a picture and enter a few prompt words to ask an auxiliary diagnosis and treatment question.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. It is evident that the drawings in the following description are only some embodiments of the present application and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a flow chart of an assisted diagnosis and treatment question answering method according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of step S20 according to an embodiment of the present invention;
FIG. 3 is a detailed flowchart of step S30 according to an embodiment of the present invention;
FIG. 4 is a detailed flowchart of step S40 according to an embodiment of the present invention;
FIG. 5 is a detailed flowchart of step S60 according to an embodiment of the present invention;
FIG. 6 is a detailed flowchart of step S80 according to an embodiment of the present invention;
FIG. 7 is a flow chart of user intent recognition provided in an embodiment of the present invention;
FIG. 8 is a flow chart of knowledge graph construction according to an embodiment of the present invention;
FIG. 9 is a flowchart illustrating a process of a named entity recognition module according to an embodiment of the invention;
FIG. 10 is a schematic diagram of an assisted diagnosis and treatment system according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Description of reference numerals: 101, processor; 102, memory.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which is to be read in light of the following specific examples. It should be noted that the following embodiments and features in the embodiments may be combined with each other without conflict. In addition to the specific methods, devices, materials used in the embodiments, any methods, devices, and materials of the prior art similar or equivalent to those in the embodiments of the present invention may be used to practice the present invention according to the knowledge of one skilled in the art and the description of the present invention.
It is to be understood that the terminology used in the examples of the invention is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
In the following description, numerous details are set forth to provide a more thorough explanation of the embodiments of the present invention. It will be apparent to one skilled in the art, however, that the embodiments may be practiced without these specific details. In some embodiments, well-known structures and devices are shown in block diagram form rather than in detail, in order to avoid obscuring the embodiments of the present invention.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Referring to fig. 1, fig. 1 is a schematic diagram of an auxiliary diagnosis and treatment question answering method based on a knowledge graph according to an embodiment of the invention, which includes steps S10 to S80.
Step S10, acquiring an original question and a picture input by a user. In this step, the user may be a patient or a doctor; anyone can consult, and identity is not restricted. The question does not need a special format or professional terminology, because user intention understanding and entity linking are performed in the subsequent processing, which makes input more convenient for the user. The original question input by the user may be, for example, "What should I do about stomach pain after taking aspirin?", "Can nitroglycerin be taken at the onset of angina?" or "How should the wound in the picture be treated?". The picture may be, for example, an image of an injured area of the hand that the user has photographed and uploaded.
Step S20, processing the original question and the picture with the CLIP model, and fusing the results to obtain fusion features.
Referring to fig. 2, in an embodiment of the invention, step S20 includes steps S21 to S24.
Step S21, processing the picture and the original question separately with the CLIP model to obtain image features and text features. This can be expressed as:
$$\phi(I) = \mathrm{CLIP}_{\mathrm{image}}(I);$$
$$\phi(T) = \mathrm{CLIP}_{\mathrm{text}}(T);$$
where I denotes the picture, T denotes the original question, and φ(I) and φ(T) are the image features and the text features, respectively.
Step S22, adding the image features and the text features element by element, and then performing layer normalization (for example, with a LayerNorm layer) to stabilize the training process.
Step S23, splicing the image features and the text features and inputting the result into a multi-layer perceptron for nonlinear transformation, so as to capture more complex feature interactions.
Step S24, adding the layer-normalized features and the features output by the multi-layer perceptron to obtain the fusion features. This design retains the information of the original features while enhancing their expressive power through the nonlinear transformation, so that the image and the text are fully fused in the same semantic space.
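Steps S22 to S24 can be formulated as:
$$F = \mathrm{LayerNorm}\big(\phi(I) + \phi(T)\big) + \mathrm{MLP}\big([\phi(I);\, \phi(T)]\big);$$
where F is the fusion feature and [ · ; · ] denotes splicing (concatenation).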
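The fusion of steps S22 to S24 can be sketched in plain Python. The sketch assumes the image and text features are vectors of equal length d, that the multi-layer perceptron has a single ReLU hidden layer and maps the spliced 2d-dimensional input back to d dimensions, and that the weights are random placeholders; none of these details are given in the source.

```python
import math
import random


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(x, weight, bias):
    """Dense layer: one output per row of `weight`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with a ReLU nonlinearity (assumed architecture)."""
    hidden = [max(0.0, h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


def fuse(image_feat, text_feat, params):
    """F = LayerNorm(image + text) + MLP(concat(image, text))."""
    summed = [a + b for a, b in zip(image_feat, text_feat)]   # step S22: add
    normed = layer_norm(summed)                               # step S22: normalize
    spliced = list(image_feat) + list(text_feat)              # step S23: concatenate
    mlp_out = mlp(spliced, *params)                           # step S23: MLP
    return [n + m for n, m in zip(normed, mlp_out)]           # step S24: residual add


def random_params(dim, hidden, seed=0):
    """Placeholder weights; a trained model would supply these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return mat(hidden, 2 * dim), [0.0] * hidden, mat(dim, hidden), [0.0] * dim


image = [1.0, 2.0, 3.0, 4.0]   # stand-in for phi(I)
text = [0.5, 0.0, -0.5, 1.0]   # stand-in for phi(T)
print(fuse(image, text, random_params(dim=4, hidden=6)))
```

The residual form keeps the normalized sum as a direct path, so the perceptron only has to learn a correction on top of it.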
Step S30, processing the fusion features with a decoder to obtain a segmentation mask, so as to find the region related to the original question.
Referring to fig. 3, in a specific embodiment of the present invention, step S30 includes steps S31 to S34: step S31, processing the fusion features with a plurality of convolution layers to obtain features at a plurality of different scales; step S32, up-sampling the features at different scales to restore them to the size of the fusion features; step S33, adding the up-sampled features element by element and then processing the sum with one convolution layer to obtain a single-channel mask feature; and step S34, processing the single-channel mask feature with a Sigmoid function, which compresses the output values into the interval [0, 1], to obtain the segmentation mask. The segmentation mask represents the probability that each pixel belongs to the region related to the original question.
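The decoder of step S30 can be illustrated with a simplified plain-Python sketch. It assumes the fusion feature is arranged as a single-channel 2-D map, uses fixed 3x3 kernels with different strides to stand in for the multi-scale convolution layers, uses nearest-neighbour up-sampling, and reduces the final one-channel convolution to a scale and shift; the source does not specify any of these choices.

```python
import math


def conv2d(x, kernel, stride=1):
    """3x3 convolution with zero padding 1 on a single-channel 2-D map."""
    h, w = len(x), len(x[0])
    out = []
    for i in range(0, h, stride):
        row = []
        for j in range(0, w, stride):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, xx = i + di, j + dj
                    if 0 <= y < h and 0 <= xx < w:
                        acc += kernel[di + 1][dj + 1] * x[y][xx]
            row.append(acc)
        out.append(row)
    return out


def upsample_nearest(x, height, width):
    """Nearest-neighbour up-sampling back to (height, width)."""
    src_h, src_w = len(x), len(x[0])
    return [
        [x[i * src_h // height][j * src_w // width] for j in range(width)]
        for i in range(height)
    ]


def decode_mask(fused, kernels_and_strides, out_weight=1.0, out_bias=0.0):
    """Multi-scale conv -> up-sample -> element-wise sum -> 1x1 conv -> Sigmoid."""
    h, w = len(fused), len(fused[0])
    total = [[0.0] * w for _ in range(h)]
    for kernel, stride in kernels_and_strides:
        scaled = upsample_nearest(conv2d(fused, kernel, stride), h, w)
        for i in range(h):
            for j in range(w):
                total[i][j] += scaled[i][j]
    # On a single channel, a 1x1 convolution reduces to scale-and-shift.
    return [
        [1.0 / (1.0 + math.exp(-(out_weight * v + out_bias))) for v in row]
        for row in total
    ]


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
box_blur = [[1 / 9] * 3 for _ in range(3)]
fused_map = [[(i * 4 + j) / 10 for j in range(4)] for i in range(4)]
mask = decode_mask(fused_map, [(identity, 1), (box_blur, 2)])
print([[round(v, 3) for v in row] for row in mask])
```

Each mask value lies strictly between 0 and 1 and can be read as the probability that the corresponding position belongs to the region related to the question.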
Step S40, processing the segmentation mask with a multi-head attention mechanism according to a preset keyword library, and screening out the keywords most relevant to the region related to the original question.
Referring to fig. 4, in an embodiment of the invention, step S40 includes steps S41 to S43.
Step S41, calculating, with a plurality of attention heads, the semantic similarity between the segmentation mask and each keyword in the keyword library. This step can be formulated as:
$$\mathrm{head}_i = \mathrm{Attention}_i(M, K);$$
where head_i denotes the output feature of the i-th attention head, M is the segmentation mask obtained in step S30, and K is the keyword library. Each attention head of the multi-head attention mechanism separately calculates the semantic similarity between each keyword in the keyword library and the segmentation mask.
Step S42: concatenate the outputs of the attention heads and integrate them through a linear transformation to obtain the matching degree matrix. This step can be formulated as:
$$\beta = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^O$$
where β is the final matching degree matrix, $W^O$ is the linear transformation matrix, and h is the number of attention heads.
Step S43: select the keyword according to the maximum value in the matching degree matrix, which provides key semantic information for the subsequent description generation. This step can be formulated as:
$$k = \arg\max_{k_j \in K} \beta_j$$
where k is the selected keyword and $\beta_j$ is the matching degree of the j-th keyword $k_j$ in the keyword library K.
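Steps S41 to S43 can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the embodiment: the region found by the segmentation mask is assumed to be summarized as a single feature vector, each head works on its own slice of the feature dimensions with identity projections, and the output transformation is reduced to one weight per head. The function `match_keyword`, the keyword names and all numbers are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def match_keyword(region_vec, keyword_vecs, num_heads, w_o):
    """Return the best-matching keyword and the matching degree of every keyword."""
    d_head = len(region_vec) // num_heads
    names = list(keyword_vecs)
    per_head = []
    for i in range(num_heads):
        lo, hi = i * d_head, (i + 1) * d_head
        q = region_vec[lo:hi]
        # S41: scaled dot-product similarity between the region and each keyword
        scores = [sum(a * b for a, b in zip(q, keyword_vecs[n][lo:hi])) / math.sqrt(d_head)
                  for n in names]
        per_head.append(softmax(scores))
    # S42: combine the head outputs with a linear transformation (one weight per head)
    beta = [sum(w_o[i] * per_head[i][j] for i in range(num_heads))
            for j in range(len(names))]
    # S43: keep the keyword with the maximum matching degree
    best = max(range(len(names)), key=beta.__getitem__)
    return names[best], dict(zip(names, beta))

# Hypothetical embeddings for a region and a three-entry keyword library
region = [1.0, 0.0, 0.5, 0.5]
library = {
    "red spots": [0.9, 0.1, 0.4, 0.6],
    "tooth marks": [0.0, 1.0, 0.5, 0.5],
    "greasy coating": [0.2, 0.2, 0.1, 0.1],
}
keyword, beta = match_keyword(region, library, num_heads=2, w_o=[0.5, 0.5])
```

With these toy vectors the first head clearly favours "red spots" while the second head is undecided between two candidates, so the combined matching degree selects "red spots".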
Step S50: process the picture and the keyword with the CLIP model, and then concatenate the results to obtain the prompt vector. This step is formulated as:
;
where ψ(k) is the text feature obtained by processing the keyword with the CLIP model.
Step S60: process the prompt vector with the LLM to obtain a descriptive text, and extract structured information from the descriptive text.
Referring to fig. 5, in an embodiment of the present invention, step S60 includes: step S61, converting the prompt vector into a continuous vector representation through the embedding layer of the LLM; step S62, processing the continuous vector representation with the multi-layer Transformer structure and autoregressive mechanism inside the decoder of the LLM to obtain the descriptive text; and step S63, decomposing the descriptive text, by natural language processing techniques, into structured information comprising lesion localization, morphological features and clinical prompts.
The steps S61 to S63 can be expressed as follows:
;
;
where Prompt is the prompt vector, Embedding is the embedding layer, Decoder is the decoder of the LLM, and, among the decomposed features, Location is the lesion localization information, Morphology is the morphological feature information and ClinicalTip is the clinical prompt information. For example, if the user uploads a tongue picture together with the original question "please judge my health status", processing the original question and the picture may yield structured information such as "red spots in the middle of the tongue; thick, greasy, pale-white tongue coating; tooth marks on the edges; elliptical shape".
Step S70: generate a medical question fused with the picture information according to the original question and the structured information. Taking the original question and the structured information in the previous paragraph as an example, the medical question finally generated may be "red spots in the middle of the tongue; thick, greasy, pale-white tongue coating; tooth marks on the edges; elliptical shape; please judge my health status".
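One simple way to realize step S70 is to prepend the structured findings to the original question. The sketch below assumes the structured information is held in three text fields matching the Location, Morphology and ClinicalTip components described for step S63; the class `StructuredInfo`, the function `build_medical_question` and the way the example findings are split between fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class StructuredInfo:
    location: str           # lesion localization
    morphology: str         # morphological features
    clinical_tip: str = ""  # clinical prompt, may be empty

def build_medical_question(original_question, info):
    """Fuse the picture-derived findings with the user's original question."""
    findings = [part.strip() for part in (info.location, info.morphology, info.clinical_tip)
                if part.strip()]
    return "; ".join(findings + [original_question.strip()])

info = StructuredInfo(
    location="red spots in the middle of the tongue",
    morphology="thick, greasy, pale-white tongue coating; tooth marks on the edges; elliptical shape",
)
medical_question = build_medical_question("please judge my health status", info)
```

Keeping the findings in separate fields makes it easy to drop an empty component or to reorder the findings without touching the original question.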
Step S80, obtaining answer content from the knowledge graph according to the medical question.
Referring to fig. 6, in an embodiment of the invention, step S80 includes steps S200 to S500.
Step S200: process the medical question with the pre-trained first language model and first neural network to obtain the user intent. To answer a medical question accurately, the user's intent must first be understood.
Referring to fig. 7, in an embodiment of the present invention, the first language model is a first BERT language model and the first neural network is a TextCNN neural network. Step S200 specifically includes: step S201, semantically encoding the medical question with the pre-trained first BERT language model to obtain a high-dimensional semantic vector of the medical question; step S202, inputting the high-dimensional semantic vector into the pre-trained TextCNN neural network, extracting local features from the vector through convolution, and applying nonlinear activation and pooling to obtain a probability distribution over user intents; and step S203, taking the intent with the maximum probability in that distribution as the user intent.
Through steps S201 to S203, the method can accurately identify which intent type a user question belongs to, such as disease cause, symptom analysis or medication advice, and can conveniently map the natural-language question to a question-answering template or query type preset by the system, which significantly improves the accuracy of user intent recognition. The intent recognition technique of this embodiment addresses the recognition difficulties caused by the diversity, complexity and ambiguity of natural-language questions, and strengthens the question-answering system's understanding of user needs.
In this embodiment, the pre-trained first BERT language model produces a contextual semantic representation of the question, and its output is then fed into the TextCNN neural network for convolution. The TextCNN neural network extracts local key information, such as keyword and phrase features, through several convolution kernels of different widths, complementing the long-range semantics learned by the first BERT language model. The convolution layers are followed by pooling and fully connected layers, which finally output the probability distribution over intent categories. By combining the global understanding capability of BERT with the sensitivity of TextCNN to local features, the type of question can be judged more accurately in the medical question-answering scenario.
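The TextCNN head of steps S202 and S203 can be sketched in a few lines of pure Python. The sketch assumes that the BERT encoder has already produced one vector per token; the tiny two-dimensional vectors, the kernel values, the weights and the intent labels are all hypothetical, and a trained model would learn them from data.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def conv_max_pool(seq, kernel):
    """1-D convolution over token positions, ReLU, then max-over-time pooling."""
    width = len(kernel)
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(seq) - width + 1):
        act = sum(w * x
                  for k in range(width)
                  for w, x in zip(kernel[k], seq[start + k]))
        best = max(best, act)
    return best

def classify_intent(seq, kernels, fc_weights, fc_bias, labels):
    pooled = [conv_max_pool(seq, k) for k in kernels]      # one feature per kernel
    logits = [sum(w * p for w, p in zip(row, pooled)) + b  # fully connected layer
              for row, b in zip(fc_weights, fc_bias)]
    probs = softmax(logits)                                # distribution over intents
    best = max(range(len(labels)), key=probs.__getitem__)  # S203: maximum probability
    return labels[best], probs

# Hypothetical token vectors standing in for the BERT output of a three-token question
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [
    [[1.0, 0.0]],              # width-1 kernel
    [[0.0, 1.0], [1.0, 0.0]],  # width-2 kernel
]
labels = ["disease_cause", "symptom_analysis", "medication_advice"]
fc_weights = [[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]]
fc_bias = [0.0, 0.0, 0.0]
intent, probs = classify_intent(tokens, kernels, fc_weights, fc_bias, labels)
```

Kernels of different widths respond to patterns spanning one or several adjacent tokens, which is the local sensitivity the embodiment relies on.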
It should be noted that the types of user intent are predefined, and the TextCNN neural network is trained on these predefined intents so that it can output a probability for each of them. This step also supports extending the intent categories: incremental training can be completed simply by adding the new category labels to the training data and fine-tuning on the samples. Common user intents may include, for example, medication consultation, emergency guidance and side-effect handling.
Step S300: process the medical question to obtain a first entity, and link the first entity to a node in the knowledge graph to obtain a standard entity name. Entities are extracted from the medical question in the same way as during construction of the knowledge graph; that process is described in detail in the steps below. Linking the first entity before querying the knowledge graph standardizes its name. This resolves the entity-matching difficulty caused by the diversity of user expressions and ensures that entities are located accurately during the query, thereby improving the hit rate and accuracy of question answering.
In a specific embodiment of the invention, linking the first entity to a node in the knowledge graph to obtain the standard entity name includes: querying, according to the entity names or descriptions of the nodes in the knowledge graph, whether a node exists whose name or alias is identical to the first entity; if so, taking the name of that node as the standard entity name; if not, calculating, with a second BERT language model, the semantic similarity between the name or description of the first entity and those of the entities corresponding to the nodes in the knowledge graph, and selecting the name of the node with the highest semantic similarity as the standard entity name. In this step, the name of the first entity is mapped to the correct entity node in the knowledge graph through literal matching, an alias dictionary or semantic similarity calculation. For example, for an abbreviation or synonym entered by the user, this step finds the corresponding standard entity, and the query is then performed using its index.
In this step, the entity name and aliases are first checked for an exact match. An exact match indicates that the user has expressed the entity precisely, so the name of the matching node can be used directly as the standard entity name; otherwise, semantic similarity is calculated with the second BERT language model. If the semantic similarity is low or the result is ambiguous, the candidates can be screened using the context, the user intent and the query results, and, if necessary, returned to the user for confirmation. Entity linking maps the concepts in the question accurately to entity nodes of the knowledge graph, so that subsequent queries can locate the target information precisely. In addition, the alias dictionary and entity alias rules are configurable and support adding and quickly updating new entities online, which makes the method easier to maintain.
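The two-stage linking logic (exact name or alias match first, similarity fallback second) can be sketched as below. The character-level `difflib.SequenceMatcher` ratio is only a stand-in for the semantic similarity that the embodiment computes with the second BERT language model, and the threshold of 0.6, the function names and the sample nodes are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Stand-in for BERT-based semantic similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def link_entity(mention, nodes, threshold=0.6):
    """nodes maps each standard entity name to a list of aliases.
    Returns (standard_name, score); standard_name is None when no candidate is
    similar enough, in which case the caller may ask the user to confirm."""
    key = mention.lower()
    # Stage 1: literal match against names and aliases
    for name, aliases in nodes.items():
        if key == name.lower() or key in (a.lower() for a in aliases):
            return name, 1.0
    # Stage 2: pick the node whose name or alias is most similar to the mention
    scored = ((name, max(similarity(mention, c) for c in [name, *aliases]))
              for name, aliases in nodes.items())
    best_name, best_score = max(scored, key=lambda pair: pair[1])
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Hypothetical knowledge graph nodes with their aliases
nodes = {
    "hypertension": ["high blood pressure", "HTN"],
    "diabetes mellitus": ["diabetes", "DM"],
}
exact = link_entity("HTN", nodes)        # resolved by the alias dictionary
fuzzy = link_entity("diabetis", nodes)   # misspelling, resolved by similarity
unknown = link_entity("xyzzy", nodes)    # no plausible candidate
```

Returning the score alongside the name lets the caller decide when to fall back to user confirmation, as described for low-similarity or ambiguous cases.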
Step S400: select the corresponding query template from the predefined sentence pattern templates according to the user intent, and fill the standard entity name into the query template to obtain a query statement. This step converts the medical question into a knowledge graph query statement based on the results of intent recognition and entity linking, so as to obtain an accurate diagnosis and treatment answer. In the predefined sentence pattern templates, each user intent corresponds to one query template; substituting the linked entity into the template produces a complete knowledge graph query statement (for example, a Cypher query for Neo4j). The query statement makes full use of the indexes and relationship structure of the graph database, so that the system can efficiently locate the medical entities and attributes relevant to the user's needs and return answers quickly. For example, only the relationship edges associated with the matched entity need to be accessed during the query, which avoids the extra overhead of a full-graph search. In this way, efficient knowledge graph retrieval is achieved, the query efficiency and response speed of the system are further improved, and the query burden after multi-source data fusion is effectively relieved.
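A minimal sketch of the intent-to-template mapping is shown below. The node labels (`Disease`, `Symptom`, `Drug`), relationship types (`HAS_SYMPTOM`, `TREATED_BY`), intent names and the function `build_query` are hypothetical, since the source does not specify the graph schema. The sketch passes the entity name as a Cypher query parameter (`$name`) instead of splicing it into the query text; this is one possible way of "filling" the template and is a design choice of the sketch, not a statement about the embodiment.

```python
# One Cypher template per user intent (schema names are assumptions)
QUERY_TEMPLATES = {
    "symptom_analysis": (
        "MATCH (d:Disease {name: $name})-[:HAS_SYMPTOM]->(s:Symptom) "
        "RETURN s.name AS value"
    ),
    "medication_advice": (
        "MATCH (d:Disease {name: $name})-[:TREATED_BY]->(m:Drug) "
        "RETURN m.name AS value"
    ),
}

def build_query(intent, standard_entity_name):
    """Return the Cypher text for the intent and the parameters to bind to it."""
    template = QUERY_TEMPLATES.get(intent)
    if template is None:
        raise KeyError(f"no query template for intent {intent!r}")
    return template, {"name": standard_entity_name}

query, params = build_query("symptom_analysis", "hypertension")
```

Because the query starts from a single named node and follows one relationship type, only the edges attached to the matched entity are traversed, which matches the efficiency argument made in the step description.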
Step S500: query the knowledge graph with the query statement to obtain the returned entities and their attributes, and generate the answer content in combination with the user intent. In this step, the structured data of the returned entities and attributes can be organized into a natural-language answer according to predefined answer templates or rules, in combination with the user intent. For example, information such as disease names, symptom descriptions and recommended treatment schemes is concatenated into a complete diagnosis and treatment suggestion. This step also supports answer generation in multi-round dialogue contexts, such as supplementing or revising earlier answers in consecutive questions and answers. Converting the information retrieved from the knowledge graph into diagnosis and treatment suggestions that users can easily understand improves the user experience and practicality of the question-answering system.
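Template-based answer generation can be sketched as follows. The answer templates, intent names and the function `generate_answer` are hypothetical, and the returned values in the example are illustrative placeholders standing in for a query result, not medical content taken from the source.

```python
# One answer template per user intent (wording is an assumption)
ANSWER_TEMPLATES = {
    "symptom_analysis": "Symptoms recorded for {entity}: {values}.",
    "medication_advice": "Drugs recorded for {entity}: {values}.",
}

def generate_answer(intent, entity, values):
    """Turn the values returned by the knowledge graph query into a sentence."""
    if not values:
        return f"No information about {entity} was found in the knowledge graph."
    template = ANSWER_TEMPLATES.get(intent, "{entity}: {values}.")
    return template.format(entity=entity, values=", ".join(values))

# Placeholder result rows, as they might come back from the graph database
answer = generate_answer("symptom_analysis", "hypertension", ["headache", "dizziness"])
empty = generate_answer("symptom_analysis", "hypertension", [])
```

Handling the empty result explicitly avoids producing a grammatical but contentless answer when the query matches nothing.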
In the above steps, the knowledge graph is constructed in advance; an existing knowledge graph from the prior art may be used, or one may be built independently.
Referring to fig. 8, in an embodiment of the present invention, a knowledge graph is constructed through steps S510 to S550.
Step S510, acquiring medical texts. Specifically, diagnosis and treatment related data can be collected from multi-source medical resources (such as electronic medical records, medical documents and public medical databases), and operations such as preprocessing, cleaning, duplicate removal and the like can be performed on the related data, so that high-quality medical texts are finally obtained.
Step S520: process the medical text with the named entity recognition module to obtain the optimal entity tag sequence corresponding to the medical text.
Referring to fig. 9, in an embodiment of the invention, step S520 includes steps S521 to S524.
Step S521: perform word segmentation and token encoding on the medical text with the RoBERTa pre-trained language model to obtain a context-sensitive word vector sequence. The RoBERTa pre-trained language model converts the input medical text into a vector representation that the model can process, and thus serves as the embedding layer. The input representation of the BERT model is composed of token embeddings, position embeddings and segment embeddings, of which the token and position embeddings are the key components of the embedding layer. In this step, RoBERTa, a robustly optimized version of the BERT model, is used as the embedding layer; its training procedure is optimized in part by training on longer sequences.
The RoBERTa pre-trained language model (RoBERTa for short) makes the following specific improvements over the BERT model (BERT for short):
(1) Dynamic masking: RoBERTa replaces the static masking of BERT with dynamic masking. In the original BERT, the mask positions of a text sequence may be the same across different batches during training, which means that although the training data set is replicated multiple times, each sequence is in practice masked in only a limited number of different ways. Moreover, since BERT is trained for multiple epochs, each sequence is passed to the model several times with the same mask. Dynamic masking solves this problem by generating a new mask pattern each time a sequence is fed into the model, which improves the model's adaptability to different mask patterns during training, helps it learn richer sequence representations, and improves its generalization ability.
(2) Removal of the NSP task: RoBERTa removes the next sentence prediction (NSP) task of BERT. The purpose of NSP is to determine whether two sentences are consecutive, which is unnecessary for downstream tasks and may even introduce noise. Removing the NSP task does not harm the model's performance on downstream tasks and even improves it slightly, so cancelling NSP lets the model focus more on text encoding and representation learning.
(3) Larger batch size and more data: at the data level, RoBERTa adds large-scale corpora for pre-training. Compared with the 16 GB of training data used by BERT, RoBERTa uses 160 GB, so that bidirectional relations in the text are captured more accurately and the overall performance of the model is improved.
(4) Byte-level text encoding: the original BERT uses a character-level BPE (Byte-Pair Encoding) vocabulary of 30K, whereas RoBERTa trains the model with a larger byte-level BPE vocabulary (BBPE, containing 50K subword units) that uses bytes as the smallest units for building subwords, which strengthens the understanding of rare and complex text.
Step S522: capture the forward and backward semantic dependencies of the word vector sequence with a bidirectional long short-term memory network to obtain a bidirectional sequence representation that incorporates local context. The bidirectional long short-term memory network learns context-dependent vector representations. Traditional methods such as CNN, RNN or unidirectional LSTM have limitations in entity extraction tasks; for example, they do not consider that different terms may differ in importance for named entity recognition, so some elements receive insufficient attention. To address these problems, the bidirectional long short-term memory network (BiLSTM) and the multi-head self-attention mechanism (MHA) together form the encoding layer of the model, which further processes the vectors output by the RoBERTa layer. In this step, the LSTM is extended with a reverse pass, and the bidirectional LSTM automatically learns the contextual information of the medical data.
Step S523: perform cross-token semantic association modeling on the bidirectional sequence representation with the multi-head self-attention mechanism, and aggregate the multi-head outputs to obtain a semantically enhanced vector representation. As mentioned above, the encoding layer also includes the multi-head self-attention mechanism (MHA), which is introduced to improve the model's ability to capture interaction features between elements at arbitrary distances and thereby further improve entity extraction performance.
The output matrix H of the bidirectional long short-term memory network (BiLSTM) serves as the input of the multi-head self-attention mechanism (MHA) and is processed as follows. First, the features of the input sequence are divided into several groups, one per head, and an independent Q, K, V linear transformation is applied in each head, where Q, K and V are the query, key and value respectively, X is the input matrix and W denotes the corresponding weight matrix. Second, attention scores are computed from Q and K by dot product and normalized to produce attention weights, and the values V are summed with these weights to obtain that head's representation of the input sequence. Third, the outputs of the h heads are concatenated, and the final output matrix is obtained through a linear transformation with the output weight matrix Wo.
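For reference, the three stages described above correspond to the standard multi-head self-attention formulation sketched below. The per-head projection matrices $W_i^Q$, $W_i^K$, $W_i^V$, the key dimension $d_k$, the softmax normalization and the $1/\sqrt{d_k}$ scaling follow the usual Transformer convention and are assumptions here; the source only states that scores are computed by dot product and then normalized.

```latex
% Stage 1: per-head linear projections of the input X (here X = H, the BiLSTM output)
Q_i = X W_i^{Q}, \qquad K_i = X W_i^{K}, \qquad V_i = X W_i^{V}

% Stage 2: dot-product scores, normalization, weighted sum of the values
\mathrm{head}_i = \mathrm{softmax}\!\left(\frac{Q_i K_i^{\top}}{\sqrt{d_k}}\right) V_i

% Stage 3: concatenate the h heads and apply the output projection
\mathrm{MHA}(X) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O}
```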
Step S524: perform label prediction at each position of the semantically enhanced vector representation with a conditional random field to obtain the optimal entity tag sequence corresponding to the medical text. The conditional random field (CRF) serves as the decoding layer, which assigns each word or phrase an entity type such as symptom, disease or drug.
In a specific embodiment of the present invention, step S524 includes: (1) mapping the semantically enhanced vector representation to the tag space through a linear transformation to obtain the tag emission scores at each position of the sequence; (2) computing, by dynamic programming with the Viterbi algorithm, the final scores of the possible tag sequence paths from the emission scores and the transition matrix of the conditional random field; and (3) selecting the tag sequence path with the highest final score as the optimal path and taking its tag sequence as the optimal entity tag sequence.
The purpose of step S524 is to predict a label for each position in the sequence based on the output of the preceding BiLSTM-MHA layer. Labels usually depend on one another. Without a conditional random field, the encoding layer ignores these dependencies and the model predicts the label at each position independently, which can produce inconsistent label sequences. Therefore, in the NER (named entity recognition) task of the present invention, a linear-chain conditional random field receives the output of the upper layer. To reflect the dependencies between labels, the conditional random field (CRF) includes a transition matrix M that describes the transition probabilities between different labels; it considers all possible sequence paths and computes a score for each path.
The final score S in step (2) is calculated as follows:
$$S(X, y) = \sum_{i=1}^{n} E_{i, y_i} + \sum_{i=1}^{n-1} M_{y_i, y_{i+1}}$$
where $M_{i,j}$ is the transition matrix score from tag i to tag j, $E_{i, y_i}$ is the emission score of tag $y_i$ at position i, and n is the length of the sequence. The path with the highest score is taken as the predicted label path. Specifically, the Viterbi algorithm may be used to find the highest-scoring, and therefore most probable, entity label sequence; the optimal label sequence is calculated as follows:
$$y^{*} = \arg\max_{y} S(X, y)$$
where $y^{*}$ is the predicted optimal tag sequence, y is a tag sequence (i.e., the entity tag corresponding to each word), and X is the input sequence (i.e., the words or characters of the original text).
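Viterbi decoding over emission and transition scores can be sketched as follows. The three-tag set and all score values are hypothetical; in the embodiment the emission scores would come from the BiLSTM-MHA layer and the transition matrix from CRF training.

```python
def viterbi(emissions, transitions):
    """emissions[t][j]: score of tag j at position t.
    transitions[i][j]: score of moving from tag i to tag j.
    Returns the highest-scoring tag index path and its score."""
    n_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    backptr = []                # best predecessor tag for each position and tag
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final tag
    last = max(range(n_tags), key=score.__getitem__)
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]

tags = ["O", "B-Disease", "I-Disease"]
# Hypothetical scores: position 1 slightly prefers I-Disease when viewed in isolation
emissions = [[2.0, 0.0, 0.0],
             [0.0, 1.0, 1.5],
             [0.0, 0.0, 2.0]]
# An I- tag directly after O is heavily penalized; B- followed by I- is rewarded
transitions = [[0.0, 0.0, -10.0],
               [0.0, -1.0, 1.0],
               [0.0, 0.0, 0.5]]
path, best_score = viterbi(emissions, transitions)
decoded = [tags[i] for i in path]
```

Choosing the best tag at each position independently would give O, I-Disease, I-Disease, which is not a valid BIO sequence; the transition scores steer the decoder to O, B-Disease, I-Disease instead.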
For the conditional random field (CRF), the transition matrix M is obtained by training. The loss function is the negative log-likelihood of the true tag sequence, which is minimized to optimize the parameters during training. The loss function is calculated as follows:
$$\mathrm{Loss} = -\log P(y_{\mathrm{true}} \mid X)$$
$$P(y_{\mathrm{true}} \mid X) = \frac{e^{S_{\mathrm{true}}}}{\sum_{i=1}^{N} e^{S_i}}$$
where $y_{\mathrm{true}}$ denotes the correct path with final score $S_{\mathrm{true}}$, $y_n$ denotes the n-th of the N possible paths, $P(y_{\mathrm{true}} \mid X)$ denotes the probability of the correct path, and $S_i$ denotes the final score of the i-th path.
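The negative log-likelihood can be checked numerically on a toy example by enumerating every possible path. Brute-force enumeration is used here only for clarity; it grows exponentially with sequence length, and practical CRF implementations compute the normalizer with the forward algorithm instead. All score values and function names are hypothetical.

```python
import math
from itertools import product

def path_score(emissions, transitions, path):
    """Final score of one tag path: emission scores plus transition scores."""
    score = sum(emissions[t][tag] for t, tag in enumerate(path))
    score += sum(transitions[a][b] for a, b in zip(path, path[1:]))
    return score

def crf_nll(emissions, transitions, true_path):
    """-log P(y_true | X), with the normalizer summed over all possible paths."""
    n_tags = len(transitions)
    all_scores = [path_score(emissions, transitions, p)
                  for p in product(range(n_tags), repeat=len(emissions))]
    m = max(all_scores)  # log-sum-exp shift for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in all_scores))
    return log_z - path_score(emissions, transitions, true_path)

emissions = [[2.0, 0.0, 0.0],
             [0.0, 1.0, 1.5],
             [0.0, 0.0, 2.0]]
transitions = [[0.0, 0.0, -10.0],
               [0.0, -1.0, 1.0],
               [0.0, 0.0, 0.5]]
loss_good = crf_nll(emissions, transitions, [0, 1, 2])  # O, B, I
loss_bad = crf_nll(emissions, transitions, [0, 2, 2])   # O, I, I
```

The loss is always positive and is smaller for a path that the current scores already favour, so minimizing it pushes the score of the true path up relative to all others.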
The named entity recognition module constructed based on RoBERTa-BiLSTM-MHA-CRF can more accurately recognize various entity boundaries and entity types in Chinese medical questions, so that the accuracy of named entity recognition is remarkably improved, and the technical problem of low entity recognition accuracy in the medical field in the traditional method is effectively solved.
Taking the input medical text "patient has history of hypertension and diabetes" as an example, the optimal entity tag sequence obtained through steps S521 to S524 is, for example: ["O", "O", "O", "B-Disease", "I-Disease", "I-Disease", "O", "B-Disease", "I-Disease", "I-Disease", "O"].
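A BIO tag sequence such as the one above can be turned into entity spans with a short routine. The sketch assumes character-level tokens, and the Chinese sentence used below is an assumed back-translation of the example (the source gives only its English rendering); it was chosen because its eleven characters line up with the eleven tags. The function name `bio_to_entities` is hypothetical.

```python
def bio_to_entities(tokens, tags):
    """Group tokens into (text, type) spans following the B-/I-/O tagging scheme."""
    entities, current, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:  # close the span that was open
                entities.append(("".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)  # continue the open span
        else:
            if current:  # an O tag or a stray I- tag closes the open span
                entities.append(("".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append(("".join(current), current_type))
    return entities

# Assumed original sentence: "patient has history of hypertension and diabetes"
tokens = list("患者有高血压和糖尿病史")
tags = ["O", "O", "O", "B-Disease", "I-Disease", "I-Disease",
        "O", "B-Disease", "I-Disease", "I-Disease", "O"]
entities = bio_to_entities(tokens, tags)
```

Under this assumption the routine yields two Disease entities, corresponding to "hypertension" and "diabetes" in the English rendering.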
Step S530: split the optimal entity tag sequence according to domain-rule-based attribute mapping and syntactic templates to obtain the entities and their corresponding attributes. The optimal entity tag sequence obtained in step S520 is only a sequence of tags; this step derives the entities and their attributes from it. The resulting entities are standardized and their attributes are formatted, which facilitates construction of the knowledge graph.
Step S540, extracting semantic relations among all entities by using a relation extraction module to construct relations among the entities. And carrying out relation prediction on the entity pairs output by the named entity recognition module by analyzing semantic and syntactic information so as to construct relation edges between the entities. Specifically, for example, a rule method or a deep learning method may be employed to extract the types of entity relationships common in the medical field, such as "disease-symptoms", "disease-examination", "drug-therapy", and the like. The extracted relationships are used to form triples in the knowledge-graph, which are stored in a graph database along with the entity information. Through the relation extraction module, rich inter-entity connection is established in the knowledge graph, so that the question-answering system can inquire based on a relation network between entities, and the accuracy and efficiency of question-answering are indirectly improved.
Step S550: construct the knowledge graph according to the entities, their corresponding attributes and the relations between the entities. During construction, a graph database (such as Neo4j) can be used for storage and management. Each entity is defined as a node in the knowledge graph, the semantic relations between entities are represented by edges, and attribute information is added to the nodes and edges. The graph database provides efficient indexing and querying and can quickly locate target nodes by entity label and attribute. Graph database storage ensures efficient management of large-scale medical knowledge and provides technical support for addressing the drop in query efficiency caused by multi-source data fusion.
In a specific embodiment of the present invention, the method further includes, after step S530, step S560: calculating the semantic similarity between different entity names and descriptions with the pre-trained third BERT language model, and performing entity fusion according to the similarity results. In this case, the knowledge graph is constructed from the fused entities.
Aiming at the redundancy problem existing after the multi-source heterogeneous medical data fusion, after the candidate entity and the attribute thereof are extracted, the semantic similarity between different entity names or descriptions is calculated by utilizing a pre-trained third BERT language model, and the entities with high similarity are matched and combined. For the entities judged to be the same, the invention unifies the entities into one entity node and integrates the attribute information thereof, thereby eliminating redundant entities caused by multi-source data. And storing the fused entities and the relations thereof in a graph database in a form of triples, and constructing a complete and consistent medical knowledge graph. Through step S560, the multi-source data redundancy is effectively eliminated, and the efficiency and reliability of knowledge-graph query are improved, so that the problem of serious multi-source data fusion redundancy is solved.
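Entity fusion by pairwise similarity can be sketched with a small union-find structure. As in any such sketch, the similarity function matters most: the character-level `difflib.SequenceMatcher` ratio below is only a stand-in for the semantic similarity computed with the third BERT language model, and the threshold of 0.85, the function names and the sample records are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Stand-in for BERT-based semantic similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuse_entities(entities, threshold=0.85):
    """entities: list of {'name': str, 'attrs': dict}. Entities whose names are
    similar enough are merged into one node whose attributes are combined."""
    parent = list(range(len(entities)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of sufficiently similar entities into the same group
    for i in range(len(entities)):
        for j in range(i + 1, len(entities)):
            if similarity(entities[i]["name"], entities[j]["name"]) >= threshold:
                parent[find(j)] = find(i)

    # Build one merged node per group, keeping other names as aliases
    merged = {}
    for idx, ent in enumerate(entities):
        root = find(idx)
        node = merged.setdefault(
            root, {"name": entities[root]["name"], "aliases": set(), "attrs": {}})
        if ent["name"] != node["name"]:
            node["aliases"].add(ent["name"])
        node["attrs"].update(ent["attrs"])
    return list(merged.values())

# Hypothetical records extracted from two different sources
records = [
    {"name": "Type 2 diabetes", "attrs": {"department": "endocrinology"}},
    {"name": "type-2 diabetes", "attrs": {"source": "literature"}},
    {"name": "hypertension", "attrs": {"department": "cardiology"}},
]
fused = fuse_entities(records)
```

Keeping the absorbed names as aliases means the merged node can still be found by either spelling during entity linking.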
It should be noted that the division of the above methods into steps is only for clarity of description. In implementation, steps may be combined into one or split into several; as long as the same logical relationship is preserved, they fall within the scope of protection of this application. Adding insignificant modifications to, or introducing insignificant designs into, the algorithm or process without changing its core design also falls within the scope of protection of this patent.
The embodiment of the invention also provides an auxiliary diagnosis and treatment question-answering system based on the knowledge graph, which comprises a request acquisition module, an image and text embedding fusion module, a fine-granularity decoding segmentation module, a region and keyword matching module, a prompt vector calculation module, a fine-granularity description generation module, a medical question generation module and a question-answering generation module. The request acquisition module is used for acquiring the original question and the picture input by the user; the image and text embedding fusion module is used for processing the original question and the picture with the CLIP model and then fusing them to obtain the fusion feature; the fine-granularity decoding segmentation module is used for processing the fusion feature with a decoder to obtain the segmentation mask, so as to find the region related to the original question; the region and keyword matching module is used for processing the segmentation mask with a multi-head attention mechanism according to a preset keyword library and selecting the keyword most relevant to the region related to the original question; the prompt vector calculation module is used for processing the original question and the keyword with the CLIP model and then concatenating the results to obtain the prompt vector; the fine-granularity description generation module is used for processing the prompt vector to obtain a descriptive text and extracting structured information from the descriptive text; the medical question generation module is used for generating a medical question fused with the picture information according to the original question and the structured information; and the question-answering generation module is used for obtaining answer content from the knowledge graph according to the medical question.
It should be noted that the auxiliary diagnosis and treatment question-answering system of this embodiment corresponds to the auxiliary diagnosis and treatment question-answering method described above, and the functional modules of the system correspond respectively to the steps of the method. The system of this embodiment can be implemented in cooperation with the method; that is, where no conflict arises, the technical details described for the method also apply to the system.
In addition to the above modules, the knowledge graph-based assisted diagnosis and treatment question and answer system may further include the following modules.
The data acquisition module is used for acquiring the medical text. It is responsible for collecting diagnosis and treatment data from multi-source medical resources such as electronic medical records, medical literature and public medical databases.
The data preprocessing module is used for preprocessing the acquired data. Preprocessing operations include data format conversion, word segmentation, entity candidate extraction and attribute standardization; they remove noise and unify data formats so as to provide high-quality input for subsequent processing. This module ensures that medical data from different sources are consistent in semantics and format, lays the foundation for subsequent entity recognition and knowledge fusion, and reduces the redundancy and ambiguity caused by heterogeneous data.
The named entity recognition module is used for processing the medical text to obtain the optimal entity tag sequence corresponding to the medical text. The module performs entity recognition on the preprocessed medical text with a joint model that combines the RoBERTa pre-trained language model, a bidirectional long short-term memory network (BiLSTM), a multi-head self-attention mechanism and a conditional random field (CRF), and corresponds to steps S521 to S524. It accurately identifies entities such as disease names, examination items and medication schemes in medical questions and determines their boundaries and types, which significantly improves the accuracy of entity recognition in Chinese medical text and alleviates the low accuracy of traditional methods in medical entity recognition.
The label sequence splitting module is used for splitting the optimal entity tag sequence according to domain-rule-based attribute mapping and syntactic templates to obtain the entities and their corresponding attributes. This module corresponds to step S530 described above and converts the optimal entity tag sequence into entities and their corresponding attributes.
The entity relation extraction module is used for extracting the semantic relations between the entities to construct the relations between entities. The module first uses the named entity recognition module to extract medical entities such as diseases, symptoms, drugs and treatment schemes from the text, and then extracts the semantic relations between the entities (such as disease-symptom, disease-treatment and drug-side effect) through rule matching, templates, syntactic dependency analysis or similar methods.
The knowledge graph construction module is used for constructing the knowledge graph according to the entities, their corresponding attributes and the relations between the entities. Taking storage in the Neo4j graph database as an example, the constructed entities and their relations are stored in Neo4j in the form of a knowledge graph. As a native graph database, Neo4j stores medical knowledge intuitively as nodes (entities) and edges (relations). Labels and attribute indexes are created in Neo4j for the different categories of entities (for example, labels for diseases, symptoms, treatments and examinations), and relationship types are given fixed identifiers.
The entity alignment and fusion module is configured to compute the semantic similarity between different entity names and descriptions using a pre-trained third BERT language model, and to fuse entities according to the similarity results. The module aligns and fuses synonymous and duplicate entities in multi-source heterogeneous data. Entities judged to be the same are unified into a single node in the knowledge graph and their attribute information is merged, which eliminates data redundancy. The fused entities and their relations are then written back to the graph database as triples to keep the knowledge graph consistent and complete. The entity alignment and fusion module thus addresses the problem of redundant entities in multi-source data fusion and improves the quality and query efficiency of the knowledge graph.
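The fusion step can be sketched as thresholded cosine similarity followed by a union-find merge, so that every group of similar names collapses to one canonical node. The toy three-dimensional vectors stand in for BERT sentence embeddings, and the 0.9 threshold is an assumed value. Neither comes from the patent.

```python
import math
from typing import Dict, List

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def align_entities(embeddings: Dict[str, List[float]],
                   threshold: float = 0.9) -> Dict[str, str]:
    """Map each entity name to the canonical name of its merged group."""
    names = list(embeddings)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Walk to the group root, shortening the path along the way
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                root_a, root_b = find(a), find(b)
                if root_a != root_b:
                    parent[root_b] = root_a  # merge b's group into a's group
    return {name: find(name) for name in names}

# Toy vectors standing in for sentence embeddings of entity names
vectors = {
    "hypertension": [1.0, 0.1, 0.0],
    "high blood pressure": [0.98, 0.15, 0.0],
    "diabetes": [0.0, 0.2, 1.0],
}
canonical = align_entities(vectors)
```

The pairwise comparison is quadratic in the number of entities. A large graph would need blocking or an approximate nearest-neighbor index, which this sketch leaves out.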
The system login and user management module is responsible for user identity authentication and permission management. A user can use the question-answering function after login verification, and different user roles have different access rights. Through this module the system administrator can create and manage user accounts and set access control policies, ensuring that medical data and knowledge are used securely. The module also records user operation logs, which provide a basis for subsequent auditing and analysis. These user management functions keep the system secure and stable in a multi-user environment.
The knowledge update and maintenance module is configured to dynamically update the medical knowledge graph as new knowledge continues to emerge. The system can periodically crawl information from authoritative medical data sources (such as newly issued guidelines and clinical trial data), automatically run entity recognition, relation extraction, entity alignment, and related processes, and merge the newly added knowledge into the existing knowledge base as triples. In addition, the system provides a knowledge review function that allows experts to verify or modify the automatically extracted knowledge. Through this continuous update mechanism, the module keeps the medical knowledge integrated in the system timely and complete, which improves the extensibility of the system and addresses the difficulty of adapting to new medical knowledge.
The system management module is used for operation monitoring and configuration management of the whole question-answering system. Through this module the system administrator can view the running state, performance indicators, and log information of each functional module, and set or adjust system parameters. The module supports dynamic load balancing and cluster scaling, and can automatically allocate computing resources according to the volume of requests so that the system remains stable under high concurrency. The system management module is also responsible for operation and maintenance tasks such as version updates and backup management, ensuring stable iteration of the system and the safety of its data. Through centralized management and monitoring, the module improves the reliability and maintainability of the system and its applicability in complex environments.
The knowledge graph visualization module provides a visual interface that helps users browse and analyze the entities and relations in the medical knowledge graph. Through an interactive graphical interface, the user can view the knowledge network related to a specific disease or medical concept, and zoom in, zoom out, and filter nodes and edges. The visualization module also supports focusing on a local region of the graph on demand, highlighting key entities and their associations to help the user understand the information structure of the knowledge graph. In addition, the module can export the visualization as a report or an image, which makes knowledge review and sharing by experts more convenient.
The modular system architecture enhances the extensibility of the system. The functional modules communicate through well-defined interfaces, are independent of one another, and are loosely coupled. When new functions are needed or the latest medical knowledge is introduced, only the relevant modules need to be configured and extended, and the system as a whole does not need to change. For example, a new disease classification or treatment regimen can be introduced by updating the knowledge graph construction module or the knowledge update module, without modifying the other modules. The system also exposes standardized data interfaces and service interfaces, which simplifies integration with third-party data sources, medical equipment, or algorithm models and allows new medical resources and functional components to be added quickly. This modular, interface-based design lets the system expand flexibly as medical knowledge evolves and user needs change, and addresses the problem of insufficient system extensibility.
Referring to fig. 10, in an embodiment of the present invention, the system may also be designed in a layered mode in addition to the modular architecture described above. In fig. 10, the system includes a data layer, a model layer, a logic layer, and a business layer. Note that the names of the sub-modules in the layered design of fig. 10 may differ from the module names used above, because the two sets of modules do not correspond one to one.
The data layer is responsible for acquiring, cleaning, structuring, and storing knowledge data in the medical field, and provides high-quality basic information resources for the model layer and the logic layer. The data layer comprises a data acquisition and preprocessing module, an entity relation construction module, and a Neo4j graph database storage module. The data layer is designed as a modular, configurable pipeline, so that it can be adapted and integrated quickly when a data source or preprocessing flow changes, and it has good maintainability and extensibility.
The model layer stores all the models used in the system. Examples include the RoBERTa-BiLSTM-MHA-CRF named entity recognition model in the named entity recognition module, the BERT-TextCNN intent recognition model in the user intent recognition module (RoBERTa may be used in place of BERT), and the BERT sentence-similarity entity alignment model in the entity alignment and fusion module.
The logic layer handles user queries: it parses and understands the input natural language question and calls the resources provided by the model layer and the data layer to generate the final answer. It comprises a core function request acquisition module, a user intent recognition module (excluding the model itself), an entity linking module, a query parsing module, and an answer generation module.
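The flow through the logic layer (intent recognition, entity linking, query parsing) can be sketched as a small pipeline that turns a question into a parameterized graph query. Keyword matching stands in for the BERT-TextCNN intent classifier, and the intent names, Cypher templates, and entity list are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical Cypher templates, one per intent
QUERY_TEMPLATES = {
    "ask_symptom": "MATCH (d:Disease {name: $name})-[:HAS_SYMPTOM]->(s:Symptom) RETURN s.name",
    "ask_treatment": "MATCH (d:Disease {name: $name})-[:TREATED_WITH]->(t:Treatment) RETURN t.name",
}
# Keyword lookup used here as a stand-in for a trained intent classifier
INTENT_KEYWORDS = {"ask_symptom": ("symptom",), "ask_treatment": ("treat", "cure")}
# Stand-in for the entity names stored in the knowledge graph
KNOWN_ENTITIES = {"hypertension", "diabetes"}

def recognize_intent(question: str) -> Optional[str]:
    lowered = question.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return None

def link_entity(question: str) -> Optional[str]:
    lowered = question.lower()
    # Prefer the longest matching name so that longer entities win
    for name in sorted(KNOWN_ENTITIES, key=len, reverse=True):
        if name in lowered:
            return name
    return None

def build_query(question: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (cypher, parameters), or None when intent or entity is missing."""
    intent = recognize_intent(question)
    entity = link_entity(question)
    if intent is None or entity is None:
        return None
    return QUERY_TEMPLATES[intent], {"name": entity}

result = build_query("How is hypertension treated?")
```

Returning `None` for an unrecognized question gives the answer generation step a clear signal to fall back to a default reply instead of running a malformed query.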
The business layer is responsible for the user interaction interface, back-end service deployment and maintenance, high-concurrency handling, and related tasks, so that the whole system runs stably and serves end users effectively. It covers, for example, front-end interaction design, back-end service deployment, assisted intelligent consultation, knowledge graph visualization, and problem management (e.g., database management).
Front-end interaction design: the front-end module provides the user with an intuitive, easy-to-use human-computer interface. Both a web page and a mobile application are provided. The interface style is concise and clear and includes elements such as a user input box, example questions, a search button, and a result display area. The front end uses modern front-end technologies (such as HTML5, CSS3, JavaScript, and frameworks such as Vue and React) to implement a responsive layout that is compatible with different terminal devices. After the user enters a question, the front end sends the request to the back end through asynchronous communication such as AJAX or WebSocket, and the user input can be checked preliminarily at the interface (for example by sensitive-word filtering). When the back end returns an answer, the front end renders and displays it to the user, and can present it as rich text or with visual effects (such as highlighting entities from the knowledge graph or displaying a fragment of the relation graph) to improve readability and user trust. The front end also has a logging module that tracks user behavior and feedback and provides data for subsequent model optimization. The front end uses a component-based architecture, so new functions (such as multi-turn question answering and voice input) are easy to add and the system remains maintainable.
Back-end service deployment: the back end hosts the logic layer functional modules and provides computing services for the front end and the model layer. The back end typically deploys functional modules such as the intent recognition service, the question parsing service, and the entity linking service as microservices or modular services on a distributed architecture. Each service communicates with the other modules through RESTful APIs and can be upgraded independently. The back end is implemented with a mature web framework (such as Django, Flask, or Spring Boot) for rapid development and reliable operation. The system is deployed on cloud servers or in a private data center and is managed with containerization technology (such as Docker) and a container orchestration tool (such as Kubernetes) to support automatic scaling and load balancing. The load balancer distributes user requests across multiple back-end nodes so that the system still responds smoothly when traffic surges. A monitoring and logging system (e.g., Prometheus, ELK) is also configured on the back end to monitor the health and performance metrics of each service, and to raise an alarm automatically or trigger an emergency plan when an anomaly is detected. The back-end deployment design takes security into account and protects user data through access control and an encrypted protocol (HTTPS). The modular services give the back end good scalability and maintainability, and services or resources can be added or removed dynamically as business requirements change.
Database management: the business layer maintains the underlying storage system, which comprises the knowledge graph database and auxiliary databases. For the Neo4j graph database, backup, compaction, and migration tests are carried out regularly to ensure the safety and consistency of the data. In addition, cluster deployment or read-write separation is configured for Neo4j to improve availability and concurrent query performance. A relational database (e.g., MySQL, PostgreSQL) or a NoSQL database (e.g., Redis) may be used to store user information, question-answering logs, session data, and the like. The system administrator monitors the data tables or key-value stores through the database management module, including connection pool configuration, index optimization, and scheduled maintenance, to ensure stable and efficient data access. For hot data and repeated query results, the system can also keep the most recent results in an in-memory cache, which reduces the load on the database and speeds up responses. The database management layer is designed for scalability: as the data volume or request volume grows, storage capacity can be expanded seamlessly through horizontal table sharding or partitioning, database cluster expansion, and similar means.
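The in-memory cache for hot data and repeated query results can be sketched as a small least-recently-used cache with a time-to-live. The class name `QueryCache`, the default sizes, and the injectable clock are illustrative assumptions. A deployment might use Redis or another shared cache instead.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional

class QueryCache:
    """LRU cache whose entries also expire after a fixed time-to-live."""

    def __init__(self, max_items: int = 1024, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._items: "OrderedDict[Hashable, tuple]" = OrderedDict()
        self._max_items = max_items
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that expiry can be tested

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # stale result: drop it and report a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = (value, self._clock() + self._ttl)
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used

# Usage with a fake clock so the behavior is deterministic
now = [0.0]
cache = QueryCache(max_items=2, ttl_seconds=10.0, clock=lambda: now[0])
cache.put("symptoms:hypertension", ["headache"])
cache.put("symptoms:diabetes", ["thirst"])
first_hit = cache.get("symptoms:hypertension")
cache.put("treatment:diabetes", ["metformin"])  # evicts the diabetes symptoms entry
```

The time-to-live matters because the knowledge graph is updated over time, so cached answers should not outlive the data they were computed from.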
Referring to fig. 11, fig. 11 shows an electronic device according to an embodiment of the present invention, which includes a processor 101, a memory 102, and a communication bus, wherein the communication bus is used to connect the processor 101 and the memory 102, and the processor 101 is used to execute a computer program stored in the memory 102 to implement the auxiliary diagnosis and treatment question-answering method.
An embodiment of the present invention also provides a computer-readable storage medium having stored thereon a computer program for causing a computer to execute the above-described auxiliary diagnosis and treatment question answering method.
In summary, the invention combines advanced entity recognition and intent classification techniques with multi-source data fusion and redundancy elimination, and thereby addresses key problems in the Chinese medical field: low entity recognition accuracy, difficulty in recognizing the intent of users' natural language questions, multi-source data redundancy that degrades query efficiency, and insufficient system extensibility. The joint named entity recognition module, based on a RoBERTa pre-trained language model, a bidirectional long short-term memory network, a multi-head self-attention mechanism, and a conditional random field, significantly improves the accuracy of Chinese medical text entity recognition. The user intent recognition module, which fuses BERT and TextCNN, parses the semantics of the medical questions entered by the user accurately. The entity alignment and fusion module performs entity alignment and redundancy elimination, merging semantically similar entities from different sources into the same node, which removes data redundancy from the knowledge graph and improves query efficiency. Meanwhile, the modular, interface-based design supports dynamic updating of the medical knowledge base and the functional modules, and enhances the extensibility of the system when new knowledge and requirements appear.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Those skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.

Claims (10)

CN202510928288.7A | 2025-07-07 | 2025-07-07 | Auxiliary diagnosis and treatment question-answering method and system based on knowledge graph, equipment and medium | Active | CN120429411B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202510928288.7A, CN120429411B (en) | 2025-07-07 | 2025-07-07 | Auxiliary diagnosis and treatment question-answering method and system based on knowledge graph, equipment and medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202510928288.7A, CN120429411B (en) | 2025-07-07 | 2025-07-07 | Auxiliary diagnosis and treatment question-answering method and system based on knowledge graph, equipment and medium

Publications (2)

Publication Number | Publication Date
CN120429411A (en) | 2025-08-05
CN120429411B (en) | 2025-09-09

Family

ID=96560134

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202510928288.7A, Active, CN120429411B (en) | Auxiliary diagnosis and treatment question-answering method and system based on knowledge graph, equipment and medium | 2025-07-07 | 2025-07-07

Country Status (1)

Country | Link
CN (1) | CN120429411B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20200167428A1 (en) * | 2018-11-26 | 2020-05-28 | International Business Machines Corporation | Utilizing external knowledge and memory networks in a question-answering system
US20220414482A1 (en) * | 2021-06-29 | 2022-12-29 | Sap Se | Visual question answering with knowledge graphs
CN117851571A (en) * | 2024-01-15 | 2024-04-09 | 湖州师范学院 | Knowledge graph and multi-mode dialogue model integrated traditional Chinese medicine knowledge question-answering method
US20240386015A1 (en) * | 2015-10-28 | 2024-11-21 | Qomplx Llc | Composite symbolic and non-symbolic artificial intelligence system for advanced reasoning and semantic search
CN119782554A (en) * | 2025-03-10 | 2025-04-08 | 成都理工大学 | A medical visual question answering method and system based on knowledge graph

Also Published As

Publication number | Publication date
CN120429411B (en) | 2025-09-09

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
