Disclosure of Invention
One purpose of the invention is to provide a medical cloud communication system based on AI and trusted innovation. The system provides a safe and reliable communication mode built on a trusted innovation environment, so that a doctor can accurately and reliably acquire a patient's historical diagnosis data; by associating the inquiry audio/video with the patient's medical record, the doctor obtains the inquiry audio/video together with the medical record and can thus comprehensively grasp the patient's condition.
The embodiment of the invention provides a medical cloud communication system based on AI and trusted innovation, which comprises:
the video acquisition module is used for acquiring a first audio/video of the consultation room, shot by audio/video acquisition equipment arranged in the consultation room;
the video analysis module is used for analyzing and editing the first audio/video based on artificial intelligence to obtain a plurality of inquiry audios/videos;
the association module is used for associating the inquiry audio/videos with the patient medical records one by one and storing them to form patient treatment information;
the request receiving module is used for receiving a medical record acquisition request;
the verification module is used for verifying the medical record acquisition request;
the information acquisition module is used for acquiring the patient treatment information corresponding to the medical record acquisition request when the verification is passed;
and the transmission module is used for transmitting the acquired patient treatment information to a requester of the medical record acquisition request.
Preferably, the video analysis module analyzes and clips the first audio/video based on artificial intelligence to obtain a plurality of inquiry audios/videos, and executes the following operations:
carrying out human body contour recognition on each frame of image in the first audio/video based on a preset first neural network model, and determining the number of human body contours existing in each frame of image;
based on the number of human body contours in each frame of image, clipping the first audio and video to obtain a plurality of second audio and video;
identifying the audio in each second audio/video, and determining first audio information of an inquiry doctor and second audio information interacted with the inquiry doctor in each second audio/video;
based on the second audio information, clipping the second audio and video to obtain a plurality of third audio and video;
extracting a human body contour map at a preset first position in the images of the third audio/video;
extracting the characteristics of the audio corresponding to the second audio information interacted with the inquiry doctor in the third audio and video to obtain a first audio characteristic set;
grouping and aggregating the plurality of third audios and videos based on the first audio feature set and the human body contour map to obtain a plurality of fourth audios and videos;
and taking the fourth audio and video as an inquiry audio and video.
Preferably, the video analysis module clips the first audio and video based on the number of human body contours existing in each frame of image, acquires a plurality of second audio and video, and executes the following operations:
deleting the frame of image when only one human body contour or no human body contour exists in the image;
and integrating the remaining continuous frames into a second audio/video.
Preferably, the video analysis module identifies the audio in each second audio/video, determines the first audio information of the inquiry doctor and the second audio information interacted with the inquiry doctor in each second audio/video, and executes the following operations:
identifying the audio in the second audio/video to acquire a plurality of pieces of audio information to be processed;
respectively extracting keywords from each audio information to be processed based on a preset keyword extraction template to obtain a keyword extraction result;
dividing a plurality of audio information to be processed into query information and answer information based on the keyword extraction result;
taking the inquiry information as first audio information and the answer information as second audio information;
and/or,
respectively extracting the characteristics of the audio corresponding to each audio information to be processed to obtain a plurality of characteristic values;
inputting a plurality of characteristic values into a preset second neural network model to obtain an identification factor;
querying a preset comparison table of identification factors and personnel based on the identification factor, and determining whether the audio information to be processed corresponds to the inquiry doctor in the comparison table or to other personnel in the comparison table;
and taking the audio information to be processed corresponding to the inquiry doctor as first audio information, and taking the audio information to be processed corresponding to other personnel as second audio information.
Preferably, the video analysis module clips the second audio and video based on the second audio information to obtain a plurality of third audio and video, and executes the following operations:
extracting the characteristics of the audio corresponding to each second audio information, and constructing a second audio characteristic set based on the extracted characteristic values;
calculating the similarity between the second audio feature sets;
classifying and grouping the second audio information based on the similarity to obtain a plurality of audio groups;
determining the acquisition time corresponding to each second audio information in each audio group;
sorting the second audio information based on the acquisition time;
acquiring a time difference value between two adjacent second audio information;
when the time difference value is less than or equal to a preset first time threshold value, the audio group is not split; otherwise, splitting the audio group between two pieces of second audio information of which the time difference value is greater than the first time threshold;
determining a first time range corresponding to the audio group based on the acquisition time corresponding to the first second audio information and the acquisition time corresponding to the last second audio information of the audio group;
determining whether the first time ranges corresponding to the audio groups overlap;
when the first time range corresponding to the audio group is not overlapped with the first time ranges corresponding to other audio groups, taking the first audio information corresponding to the first second audio information of the audio group as an initial position and taking the last second audio information of the audio group as a final position, clipping the second audio and video, and obtaining a third audio and video;
when the first time range corresponding to the audio group is overlapped with the first time ranges corresponding to other audio groups, determining the maximum time difference value and the overlapping time of the overlapping part based on the time ranges corresponding to the overlapped audio groups;
when the ratio of the overlapping time to the maximum time difference is smaller than or equal to a preset ratio, taking the first audio information corresponding to the first second audio information of the audio group as a starting position and the last second audio information of the audio group as an ending position, clipping the second audio/video, and obtaining a third audio/video;
and when the ratio of the overlapping time to the maximum time difference is greater than a preset ratio, combining the overlapped audio groups, taking the first audio information corresponding to the first second audio information of the combined audio group as a starting position and the last second audio information of the combined audio group as an ending position, clipping the second audio and video, and acquiring a third audio and video.
Preferably, the video analysis module performs grouping and aggregation on the plurality of third audios and videos based on the first audio feature set and the human body contour map to obtain a plurality of fourth audios and videos, and executes the following operations:
matching each human body contour map of the third audio-video with each human body contour map of other third audio-video one by one;
matching each first audio feature set of the third audio and video with each first audio feature set of other third audio and video;
and when the human body contour map or the first audio feature set is matched, integrating two third audios and videos which are matched with each other.
Preferably, the association module associates the inquiry audio/videos with the patient medical records one by one, stores them to form patient treatment information, and executes the following operations:
acquiring operation information of a computer in the consultation room;
analyzing the operation information, and determining the calling time of the registration information of each patient;
determining a second time range for the inquiry of each patient based on the call time;
acquiring a time range set of an inquiry audio and video; the set of time ranges includes at least one third time range;
determining a correspondence between the inquiry audio/videos and the patient medical records based on the second time range and the plurality of third time ranges;
and associating the inquiry audios and videos with the patient medical records based on the corresponding relation.
Preferably, the request receiving module receives a medical record obtaining request and executes the following operations:
acquiring the actual position of a requester of the medical record acquisition request;
acquiring a preset permission area for permitting to receive a request;
when the actual position belongs to the permission area, receiving the medical record acquisition request; otherwise, the request is not received.
Preferably, the verification module verifies the medical record acquisition request and executes the following operations:
when a medical record acquisition request is received, acquiring reserved information of a patient;
sending a verification code based on the contact information in the reservation information;
and when the verification code is received from the requester of the medical record acquisition request within a preset second time threshold, the verification is passed.
Preferably, the AI and trusted innovation-based medical cloud communication system further comprises:
the data acquisition modules, each in communication connection with a corresponding detection instrument in the hospital and used for acquiring detected original data;
and the association module is also used for associating the original data with the patient medical record and storing the original data and the patient medical record to form the detection data of the patient.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it should be understood that they are presented herein only to illustrate and explain the present invention and not to limit the present invention.
An embodiment of the present invention provides a medical cloud communication system based on AI and trusted innovation, as shown in fig. 1, including:
the system comprises avideo acquisition module 1, a video processing module and a video processing module, wherein thevideo acquisition module 1 is used for acquiring a first audio and video in an interrogation room shot by audio and video acquisition equipment arranged in the interrogation room; shooting a first audio and video from the beginning of a doctor in an inquiry room after work to the end of the doctor after work;
thevideo analysis module 2 is used for analyzing and editing the first audio/video based on artificial intelligence to obtain a plurality of inquiry audios/videos; automatically editing according to different inquiry personnel, wherein one inquiry audio and video corresponds to one patient;
thecorrelation module 3 is used for correlating the inquiry audios and videos with the patient medical records one by one and storing the inquiry audios and videos and the patient medical records to form patient treatment information;
the request receiving module 4 is used for receiving a medical record acquisition request;
the verification module 5 is used for verifying the medical record acquisition request;
the information acquisition module 6 is used for acquiring the patient treatment information corresponding to the medical record acquisition request when the verification is passed;
and the transmission module 7 is used for transmitting the acquired patient treatment information to a requester of the medical record acquisition request.
The working principle and the beneficial effects of the technical scheme are as follows:
the method has the advantages that the first audio and video collected by the audio and video collecting equipment arranged in the inquiry room are analyzed and edited, and the corresponding patient of the edited inquiry audio and video is determined, so that the automatic inquiry audio and video is associated with the medical record of the patient, the complexity of manually editing the video is avoided, and the intellectualization of the system is improved; and when the patient information of seeing a doctor is called, the medical record acquisition request is verified, and the patient information can be called only when the medical record acquisition request passes verification, so that the safety of data is ensured.
In one embodiment, thevideo analysis module 2 analyzes and clips the first audio/video based on artificial intelligence, obtains a plurality of inquiry audios/videos, and performs the following operations:
carrying out human body contour recognition on each frame of image in the first audio/video based on a preset first neural network model, and determining the number of human body contours existing in each frame of image;
based on the number of human body contours in each frame of image, clipping the first audio and video to obtain a plurality of second audio and video;
identifying the audio in each second audio/video, and determining first audio information of an inquiry doctor and second audio information interacted with the inquiry doctor in each second audio/video;
based on the second audio information, clipping the second audio and video to obtain a plurality of third audio and video;
extracting a human body contour map at a preset first position (for example, the patient seat arranged beside the doctor) in the images of the third audio/video;
extracting the characteristics of the audio corresponding to the second audio information interacted with the inquiry doctor in the third audio and video to obtain a first audio characteristic set;
grouping and aggregating the plurality of third audios and videos based on the first audio feature set and the human body contour map to obtain a plurality of fourth audios and videos;
and taking the fourth audio and video as an inquiry audio and video.
The working principle and the beneficial effects of the technical scheme are as follows:
When a first audio/video is processed, the first step is to clip it into a plurality of second audio/videos according to the number of people in the consultation room; the number of people determines whether a patient is present in the consultation room for an inquiry, which reduces the amount of audio/video data to be processed subsequently and realizes a primary screening of the data. The second step is to perform audio recognition on each second audio/video so as to separate the inquiry audio/videos of two adjacent patients. The third step is audio/video integration, which mainly addresses the case where the doctor, after the inquiry, lists examination items to be carried out on the patient; when the report of the examination items is available, the patient enters the consultation room again for a further inquiry. In that case the inquiry audio/video comprises third audio/videos from several time periods, so the third audio/videos of the several time periods are integrated by means of the first audio feature set and the human body contour map to form one complete inquiry audio/video.
In one embodiment, thevideo analysis module 2 clips the first audio and video based on the number of human body contours existing in each frame of image, obtains a plurality of second audio and video, and executes the following operations:
deleting the frame of image when only one human body contour or no human body contour exists in the image;
and integrating the continuous frame images into a second audio-video.
The working principle and the beneficial effects of the technical scheme are as follows:
When only one human body contour or no human body contour exists in the image, there is either nobody or only the doctor in the consultation room, so the frame is deleted; this rule applies only to the case where there is a single doctor in one consultation room. Of course, different clipping conditions can be set according to the number of doctors in the consultation room, so that the first audio/video can still be clipped based on the number of human body contours.
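As a purely illustrative aid (not part of the claimed embodiment), the following Python sketch shows one possible way to segment a recording into second audio/videos from per-frame person counts; the function name, the list-of-counts input, and the two-person threshold for a single-doctor consultation room are assumptions introduced here.

```python
def segment_by_person_count(frame_counts, min_persons=2):
    """Group consecutive frame indices whose person count reaches the
    threshold; each group corresponds to one candidate second audio/video.

    frame_counts: list of ints, number of human contours detected per frame
    (assumed to come from the first neural network model).
    """
    segments = []
    start = None
    for idx, count in enumerate(frame_counts):
        if count >= min_persons:
            if start is None:
                start = idx          # a new continuous run begins
        else:
            if start is not None:
                segments.append((start, idx - 1))
                start = None         # frames with fewer than 2 persons are dropped
    if start is not None:
        segments.append((start, len(frame_counts) - 1))
    return segments

# Example: frames 2-4 and 7-8 contain at least the doctor plus one other person.
print(segment_by_person_count([1, 1, 2, 3, 2, 1, 0, 2, 2, 1]))
# -> [(2, 4), (7, 8)]
```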
In one embodiment, thevideo analysis module 2 identifies the audio in each second audio/video, determines the first audio information of the inquiry doctor and the second audio information interacted with the inquiry doctor in each second audio/video, and performs the following operations:
identifying the audio in the second audio/video to acquire a plurality of pieces of audio information to be processed;
respectively extracting keywords from each audio information to be processed based on a preset keyword extraction template to obtain a keyword extraction result;
dividing a plurality of audio information to be processed into inquiry information and answer information based on the keyword extraction result;
taking the inquiry information as first audio information and the answer information as second audio information;
and/or,
respectively extracting the characteristics of the audio corresponding to each audio information to be processed to obtain a plurality of characteristic values;
inputting a plurality of characteristic values into a preset second neural network model to obtain an identification factor;
querying a preset comparison table of identification factors and personnel based on the identification factor, and determining whether the audio information to be processed corresponds to the inquiry doctor in the comparison table or to other personnel in the comparison table;
and taking the audio information to be processed corresponding to the inquiry doctor as first audio information, and taking the audio information to be processed corresponding to other personnel as second audio information.
The working principle and the beneficial effects of the technical scheme are as follows:
In the first mode, the audio is recognized as corresponding audio information to be processed by an audio recognition technology (for example, speech is recognized as text). Keywords are then extracted from the text, and queries and answers are distinguished based on the identified keywords; for example, "what is the situation", "how do you feel", etc., can be determined as queries, while "abdominal pain", "chest pain", etc., can be determined as answers. To distinguish a query from an answer, the extracted keywords are matched against the query keywords in a preset query determination library to decide whether the information is a query, and against the answer keywords in a preset answer determination library to decide whether it is an answer. In the second mode, features are extracted from the audio and the speaker is identified through the second neural network model. The second neural network model is built by collecting the audio of each doctor in the hospital in advance, extracting features to obtain feature values, and associating each doctor with an identification factor; the model is trained to convergence with the feature values as input and the identification factors as output. Identification factors for other personnel are also assigned in the comparison table, so that after the feature values are processed by the second neural network model, persons other than the hospital's doctors are output as the identification factor for other personnel, and such other personnel are determined to be patients.
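A minimal sketch of the first mode's keyword-based division into queries and answers is given below; the keyword lists are hypothetical stand-ins for the preset keyword extraction template and the query/answer determination libraries.

```python
# Hypothetical keyword libraries; a real deployment would use the hospital's
# preset keyword extraction template rather than these hard-coded examples.
QUERY_KEYWORDS = {"what is the situation", "how do you feel", "how long", "where does it hurt"}
ANSWER_KEYWORDS = {"abdominal pain", "chest pain", "dizzy", "fever"}

def classify_utterance(text):
    """Roughly divide one piece of recognized audio text into query / answer /
    unknown by matching it against the preset keyword libraries."""
    lowered = text.lower()
    if any(k in lowered for k in QUERY_KEYWORDS):
        return "query"    # treated as first audio information (the inquiry doctor)
    if any(k in lowered for k in ANSWER_KEYWORDS):
        return "answer"   # treated as second audio information (the patient side)
    return "unknown"

print(classify_utterance("How do you feel today?"))                  # -> query
print(classify_utterance("I have had abdominal pain for two days."))  # -> answer
```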
In one embodiment, thevideo analysis module 2 clips the second audio and video based on the second audio information to obtain a plurality of third audio and video, and performs the following operations:
extracting the characteristics of the audio corresponding to each second audio information, and constructing a second audio characteristic set based on the extracted characteristic values; the extracted feature values include: short-time energy values, sound frequencies, amplitudes, etc.;
calculating the similarity between each pair of second audio feature sets; the similarity is computed from the corresponding data values of the two second audio feature sets being compared and the total number of data values in a feature set, and the result is the similarity value (a stand-in similarity measure is sketched after the working principle of this embodiment);
classifying and grouping the second audio information based on the similarity to obtain a plurality of audio groups; dividing the audio signals with the similarity larger than a judgment threshold (for example: 0.90) into an audio group;
determining the acquisition time corresponding to each second audio information in each audio group;
sorting the second audio information based on the acquisition time;
acquiring a time difference value between two adjacent second audio information;
when the time difference value is less than or equal to a preset first time threshold (for example: 2 minutes), the audio group is not split; otherwise, splitting the audio group between the two pieces of second audio information whose time difference value is greater than the first time threshold; the splitting ensures that each audio group is an independent, continuous piece of audio, so that the audio of two successive patients can be separated;
determining a first time range corresponding to the audio group based on the acquisition time corresponding to the first second audio information of the audio group and the acquisition time corresponding to the last second audio information;
determining whether the first time ranges corresponding to the audio groups overlap;
when the first time range corresponding to the audio group is not overlapped with the first time ranges corresponding to other audio groups, taking the first audio information corresponding to the first second audio information of the audio group as a starting position and the last second audio information of the audio group as an ending position, cutting the second audio and video, and acquiring a third audio and video;
when the first time range corresponding to the audio group is overlapped with the first time ranges corresponding to other audio groups, determining the maximum time difference value and the overlapping time of the overlapping part based on the time ranges corresponding to the overlapped audio groups;
when the ratio of the overlapping time to the maximum time difference is smaller than or equal to a preset ratio (for example: 0.1), taking the first audio information corresponding to the first second audio information of the audio group as a starting position and the last second audio information of the audio group as an ending position, clipping the second audio and video, and acquiring a third audio and video;
and when the ratio of the overlapping time to the maximum time difference is greater than a preset ratio, combining the overlapped audio groups, taking the first audio information corresponding to the first second audio information of the combined audio group as a starting position and the last second audio information of the combined audio group as an ending position, clipping the second audio and video, and acquiring a third audio and video.
The working principle and the beneficial effects of the technical scheme are as follows:
By extracting features from the audio of the second audio information, the audio is divided into audio groups, so that the audio of the same person falls into one group; normally one person's audio represents one patient. However, a patient may not visit alone; in pediatrics, for example, the child is often accompanied by the father and mother. Whether the audio groups overlap, and the ratio of the overlap, therefore indicate whether several persons accompany one patient. This realizes accurate determination of the start and end positions of the third audio/video and improves the accuracy of the audio clipping.
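The following sketch illustrates, under stated assumptions, the grouping and splitting described above: cosine similarity is substituted for the original similarity formula, while the 0.90 grouping threshold and the 2-minute (120 s) first time threshold follow the examples given in this embodiment.

```python
import math

def cosine_similarity(a, b):
    """Stand-in similarity between two second audio feature sets
    (the patent's own formula is not reproduced here)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def split_group_by_gaps(timestamps, first_time_threshold=120):
    """Split one audio group wherever two adjacent pieces of second audio
    information are more than the threshold (seconds) apart."""
    timestamps = sorted(timestamps)
    groups, current = [], [timestamps[0]]
    for t in timestamps[1:]:
        if t - current[-1] <= first_time_threshold:
            current.append(t)
        else:
            groups.append(current)
            current = [t]
    groups.append(current)
    return groups

features_a = [0.42, 1.3, 220.0]   # e.g. short-time energy, amplitude, sound frequency
features_b = [0.40, 1.1, 215.0]
if cosine_similarity(features_a, features_b) > 0.90:
    print("same speaker, same audio group")
print(split_group_by_gaps([0, 30, 70, 400, 430]))  # -> [[0, 30, 70], [400, 430]]
```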
In one embodiment, thevideo analysis module 2 performs grouping and aggregation on the plurality of third audios and videos based on the first audio feature set and the human body contour map to obtain a plurality of fourth audios and videos, and performs the following operations:
matching each human body contour map of the third audio-video with each human body contour map of other third audio-video one by one;
matching each first audio feature set of the third audio and video with each first audio feature set of other third audio and video;
when the human body contour map or the first audio feature set is matched, the two mutually matched third audio/videos are integrated.
The working principle and the beneficial effects of the technical scheme are as follows:
The human body contour map and the audio feature set of the third audio/videos are analyzed jointly, so that the third audio/videos recorded when the patient enters the consultation room for the first time and for subsequent times are integrated, which ensures the completeness of the inquiry audio/video.
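An illustrative sketch of the matching rule is shown below; the dictionary layout of a clip and the 0.90 matching threshold are assumptions, and the same stand-in similarity measure as above is used.

```python
import math

def _cos(a, b):
    """Simple stand-in match score between two numeric feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def should_merge(clip_a, clip_b, threshold=0.90):
    """Decide whether two third audio/videos belong to the same patient.
    Each clip is assumed to carry 'audio_features' (from the first audio
    feature set) and 'contour' (the human body contour map at the preset
    first position), both encoded as numeric vectors."""
    audio_match = _cos(clip_a["audio_features"], clip_b["audio_features"]) >= threshold
    contour_match = _cos(clip_a["contour"], clip_b["contour"]) >= threshold
    # Per this embodiment, matching either cue is enough to integrate the clips.
    return audio_match or contour_match

clip_1 = {"audio_features": [0.42, 1.3, 220.0], "contour": [1.70, 0.45, 0.30]}
clip_2 = {"audio_features": [0.40, 1.1, 215.0], "contour": [0.20, 1.50, 0.90]}
print(should_merge(clip_1, clip_2))   # True: the audio feature sets match even though the contours differ
```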
In one embodiment, the association module 3 associates the inquiry audio/videos with the patient medical records one by one, stores them to form the patient treatment information, and performs the following operations:
acquiring operation information of a computer in the consultation room;
analyzing the operation information, and determining the call time of the registration information of each patient; the time at which the doctor clicks the patient's registration information and enters the diagnosis page is determined as the call time;
determining a second time range for the inquiry of each patient based on the call time; the interval from the call time of a patient's registration information to the call time of the next patient's registration information can be used as the second time range of that patient's inquiry;
acquiring a time range set of the inquiry audio/video; the set of time ranges includes at least one third time range; when the patient enters the consultation room several times, several third time ranges exist in the time range set;
determining a correspondence between the inquiry audio/videos and the patient medical records based on the second time range and the plurality of third time ranges; when the first third time range and the second time range overlap by at least a preset proportion (for example 50%), they can be determined to correspond; for example, the second time range falls within the first third time range.
And associating the inquiry audios and videos with the patient medical records based on the corresponding relation.
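The time-range matching described in this embodiment can be illustrated as follows; the 50% overlap proportion follows the example above, while the exact overlap measure and the variable names are assumptions.

```python
def overlap_ratio(range_a, range_b):
    """Fraction of range_b covered by range_a; ranges are (start, end) in seconds.
    The embodiment leaves the exact measure open, so the ratio is taken
    relative to the clip's duration here."""
    start = max(range_a[0], range_b[0])
    end = min(range_a[1], range_b[1])
    overlap = max(0, end - start)
    length = range_b[1] - range_b[0]
    return overlap / length if length else 0.0

def match_clips_to_patient(second_range, third_ranges, min_ratio=0.5):
    """Return the indices of the third time ranges that correspond to the patient
    whose inquiry occupies second_range (derived from the call times)."""
    return [i for i, r in enumerate(third_ranges)
            if overlap_ratio(second_range, r) >= min_ratio]

second_range = (900, 1800)                  # patient called at t=900 s, next patient at 1800 s
third_ranges = [(950, 1300), (2500, 2900)]  # time ranges of the inquiry audio/video clips
print(match_clips_to_patient(second_range, third_ranges))  # -> [0]
```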
In one embodiment, the request receiving module 4 receives a medical record acquisition request, as shown in fig. 2, and performs the following operations:
step S1: acquiring the actual position of a requester of the medical record acquisition request;
step S2: acquiring a preset permission area for permitting receipt of a request; the permission area may, for example, be defined as the interior of the hospital;
step S3: when the actual position belongs to the permission area, receiving the medical record acquisition request; otherwise, the request is not received.
The legality of the request is judged according to the requester's actual position, which ensures the safety of the data recorded in the patient's medical record.
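A minimal sketch of the position-based admission check is given below; the bounding-box representation of the permission area and the coordinates are hypothetical, since the embodiment does not prescribe how the area is described.

```python
def inside_permission_area(position, area):
    """position: (latitude, longitude) of the requester; area: bounding box
    (min_lat, min_lon, max_lat, max_lon) standing in for the hospital grounds.
    A production system could use a polygon or indoor positioning instead."""
    lat, lon = position
    min_lat, min_lon, max_lat, max_lon = area
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

HOSPITAL_AREA = (31.2300, 121.4700, 31.2330, 121.4740)   # hypothetical coordinates

def receive_request(request, position):
    """Accept the medical record acquisition request only when the requester
    is physically inside the permission area."""
    if inside_permission_area(position, HOSPITAL_AREA):
        return request          # passed on to the verification module
    return None                 # otherwise the request is not received

print(receive_request({"patient_id": "P001"}, (31.2315, 121.4720)))  # accepted
print(receive_request({"patient_id": "P001"}, (31.0000, 121.0000)))  # None
```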
In one embodiment, theverification module 5 verifies the medical record acquisition request and performs the following operations:
when a medical record acquisition request is received, acquiring reserved information of a patient;
based on the contact information (such as the mobile phone number) in the reserved information, sending a verification code;
and when the verification code is received from the requester of the medical record acquisition request within a preset second time threshold (for example, 1 minute), the verification is passed.
The working principle and the beneficial effects of the technical scheme are as follows:
By sending a verification code to the reserved mobile phone for verification, it is ensured that the medical record data is acquired under the patient's authorization, which effectively protects the patient's privacy.
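An illustrative sketch of the verification-code check follows; the in-memory store and the code format are assumptions, and the 60-second limit corresponds to the 1-minute example of the second time threshold.

```python
import secrets
import time

PENDING = {}   # request id -> (code, time the code was sent)

def send_verification_code(request_id, reserved_phone):
    """Generate a code for the contact number in the patient's reserved
    information; the actual SMS delivery is out of scope of this sketch."""
    code = f"{secrets.randbelow(10**6):06d}"
    PENDING[request_id] = (code, time.time())
    print(f"(sketch) sending code {code} to {reserved_phone}")
    return code

def verify(request_id, submitted_code, second_time_threshold=60):
    """Verification passes only if the correct code comes back from the
    requester within the second time threshold."""
    if request_id not in PENDING:
        return False
    code, sent_at = PENDING.pop(request_id)
    return submitted_code == code and time.time() - sent_at <= second_time_threshold

code = send_verification_code("REQ-1", "reserved phone number on file")
print(verify("REQ-1", code))   # True when returned within 60 s
```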
In one embodiment, the AI and trusted innovation-based medical cloud communication system, as shown in fig. 3, further comprises:
the data acquisition modules 8, each in communication connection with a corresponding detection instrument in the hospital and used for acquiring detected original data;
and the association module 3 is further used for associating the original data with the patient medical record and storing them to form the detection data of the patient.
The working principle and the beneficial effects of the technical scheme are as follows:
By associating the detected original data with the patient medical record, a hospital that receives the patient (for example, after a transfer) can acquire the original data, which further avoids misdiagnosis accidents.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.