Disclosure of Invention
In order to make the patient better informed about the surgical procedure and the surgical result, the present application provides a short video surgery report generation method, an apparatus, a computer device, and a storage medium.
In a first aspect, the present application provides a short video surgery report generation method, which adopts the following technical scheme:
a short video surgical report generation method, comprising:
acquiring surgical data; processing the surgical data to obtain a plurality of audio/video frame sequences;
and matching the audio/video frame sequence with a preset video template to generate a short video surgery report.
By adopting this technical scheme, the surgical data is converted into a short video surgery report that is presented to the patient in a vivid and intuitive form. The patient thereby gains a better understanding of the surgical procedure and its result, which reduces suspicion about the procedure and, in turn, reduces medical conflicts and disputes. The short video surgery report also makes the surgical process more transparent, which has a supervisory effect on medical staff. In addition, no short video surgery report currently exists on the market, so the short video generated by this method is novel; and since short videos are generally preferred over traditional media, presenting the surgery report as a short video makes it more appealing to patients.
Preferably, the surgical data comprises surgical video, and the sequence of audio-video frames comprises a sequence of video frames; the processing the operation data to obtain a plurality of audio/video frame sequences comprises:
splitting the operation video to obtain a plurality of video frame fragments;
and marking the video frame fragments to obtain a plurality of video frame sequences.
Preferably, the splitting the operation video to obtain a plurality of video frame segments includes:
identifying whether medical personnel send an event command during the operation;
if so, acquiring a time point of the event command, and adding a timestamp in the operation video based on the time point;
based on the time stamp, a plurality of video frame segments are intercepted in the surgery video.
Preferably, the marking the plurality of video frame segments to obtain a plurality of video frame sequences includes:
marking the video frame segment based on an event command corresponding to the video frame segment to obtain the video frame sequence;
and repeating the steps to finish marking all the video frame segments.
Preferably, the surgical data further includes at least one piece of surgical text data, and the audio-video frame sequence further includes an audio frame sequence; the processing the surgical data to obtain a plurality of audio/video frame sequences further comprises:
performing a text-to-speech conversion on the at least one piece of surgical text data to obtain at least one audio frame segment;
and marking the at least one audio frame segment according to the type of the at least one piece of surgical text data to obtain at least one audio frame sequence.
Preferably, the matching the audio/video frame sequence with a preset video template to generate a short video surgery report includes:
matching the marks of the audio/video frame sequence with a plurality of preset marks of the video template, and fusing the successfully matched audio/video frame sequence according to the sequence of the preset marks to generate a short video surgery report.
Preferably, after the mark of the audio/video frame sequence is successfully matched with the preset marks of the video template, the method further includes:
judging whether the audio/video frame sequence is an audio frame sequence;
if yes, judging whether the audio frame sequence contains medical terms or not;
if so, screening the medical term;
screening an animation frame sequence corresponding to the medical term in a basic information base;
fusing the animation frame sequence and the audio frame sequence according to a preset rule;
and repeating the steps to complete the fusion of all the audio frame sequences and the animation frame sequences.
In a second aspect, the present application provides a short video surgery report generation apparatus, which adopts the following technical solution:
a short video surgery report generation apparatus, comprising:
a processing module, configured to acquire surgical data and process the surgical data to obtain a plurality of audio/video frame sequences; and
a matching module, configured to match the audio/video frame sequences with a preset video template to generate a short video surgery report.
In a third aspect, the present application provides a computer device, which adopts the following technical solution:
a computer device comprising a memory and a processor, the memory having stored thereon a computer program that can be loaded by the processor and that executes the short video surgical report generation method of any of the first aspects.
In a fourth aspect, the present application provides a computer-readable storage medium, which adopts the following technical solutions:
a computer readable storage medium storing a computer program that can be loaded by a processor and executed to perform the short video surgical report generation method of any of the first aspects.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
The present embodiment provides a short video surgery report generation method, as shown in fig. 1, the main flow of the method is described as follows (steps S101 to S102):
step S101: acquiring surgical data; and processing the surgical data to obtain a plurality of audio/video frame sequences.
The operation data comprises an operation video, the operation video is a video for recording an operation process in an operating room, and the audio-video frame sequence comprises a video frame sequence.
In this embodiment, data is exchanged with the digital surgery platform, each digital operating room is monitored, the surgical site is recorded as a surgical video, and whether medical staff issue an event command during the operation is recognized in real time.
There are two ways to recognize the event command:
First, the speech produced during the operation is recognized, and a first keyword in the speech is compared with a plurality of preset second keywords; if the first keyword is the same as one of the preset second keywords, the first keyword is determined to be an event command. Different triggered first keywords correspond to different event types; for example, the event types at least include a conventional event class, a control command class, a surgical record class, and a special event class. Conventional events include commands such as operation start, patient transfer in, anesthesia start, patient transfer out, and operation end; a control command refers to a control command for equipment such as routes, lamps, beds, and towers.
Second, the operation of the equipment, such as keyboard and mouse input, is monitored in real time to obtain a first operation command. The first operation command is compared with a plurality of preset second operation commands; if the first operation command is the same as one of the preset second operation commands, the first operation command is determined to be an event command whose event type is a control command.
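By way of illustration only, the two recognition paths above could be sketched as follows; the keyword lists, event types, and function names here are assumptions made for the example and are not specified in the application.

```python
# Illustrative sketch only: keyword lists, event types and names are assumed,
# not specified by the application.
PRESET_KEYWORDS = {
    "operation start": "conventional event",
    "patient transfer in": "conventional event",
    "anesthesia start": "conventional event",
    "patient transfer out": "conventional event",
    "operation end": "conventional event",
    "lamp on": "control command",
}

PRESET_OPERATION_COMMANDS = {"bed up", "bed down", "tower rotate"}

def recognize_voice_event(first_keyword: str):
    """Way 1: compare a keyword recognized from speech with the preset second keywords."""
    event_type = PRESET_KEYWORDS.get(first_keyword)
    if event_type is not None:
        return {"command": first_keyword, "type": event_type}
    return None

def recognize_operation_event(first_operation_command: str):
    """Way 2: compare a monitored device operation with the preset second operation commands."""
    if first_operation_command in PRESET_OPERATION_COMMANDS:
        return {"command": first_operation_command, "type": "control command"}
    return None

print(recognize_voice_event("operation start"))
print(recognize_operation_event("bed up"))
```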
When an event command is recognized, the current time point is acquired as the time point at which the event command was recognized, and a timestamp is added to the recorded surgical video; note that the position at which the timestamp is added in the surgical video coincides with this time point.
The event commands are paired: the event command indicating the start of an event serves as a first command, and the event command indicating the end of the event serves as a second command. For example, the event command "patient transfer in" is paired with the event command "patient transfer out", with "patient transfer in" as the first command and "patient transfer out" as the second command.
Optionally, a numerical term may be added to the event command, such as "first patient transfer in," "second patient transfer in," "first patient transfer out," and "second patient transfer out," wherein the event command "first patient transfer in" is paired with the event command "first patient transfer out," and the event command "second patient transfer in" is paired with the event command "second patient transfer out."
When the first command and the second command are acquired, the number of occurrences of each is also recorded.
When the timestamp is added, the event command corresponding to the timestamp is marked at the same time, and the event type may also be marked.
Next, the surgical video is split to obtain a plurality of video frame segments. Specifically, after the surgical video is recorded, the timestamp corresponding to the Nth occurrence of a marked first command is taken as the start segmentation point, the timestamp corresponding to the Nth occurrence of the marked second command is taken as the end segmentation point, the surgical video between the start and end segmentation points is intercepted, and the intercepted portion is taken as a video frame segment; here N is a positive integer.
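A minimal sketch of the pairing and splitting logic, assuming timestamps are expressed in seconds from the start of the recording and that the first/second command pairing is given as a simple mapping; the names and values are illustrative.

```python
# Sketch under assumptions: timestamps are seconds from the start of the surgical
# video, and the first/second command pairing is given as a simple mapping.
COMMAND_PAIRS = {"patient transfer in": "patient transfer out",
                 "operation start": "operation end"}

def split_video(marked_timestamps):
    """marked_timestamps: list of (timestamp_seconds, event_command) in recording order.
    Returns start/end pairs, matching the Nth occurrence of a first command with
    the Nth occurrence of its paired second command."""
    segments = []
    for first_cmd, second_cmd in COMMAND_PAIRS.items():
        starts = [t for t, c in marked_timestamps if c == first_cmd]
        ends = [t for t, c in marked_timestamps if c == second_cmd]
        for n, (start, end) in enumerate(zip(starts, ends), start=1):
            segments.append({"start": start, "end": end,
                             "mark": f"{first_cmd} #{n}"})
    return segments

# Example: two paired "patient transfer" events produce two video frame segments.
print(split_video([(10, "patient transfer in"), (300, "patient transfer out"),
                   (900, "patient transfer in"), (1200, "patient transfer out")]))
```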
Each video frame segment is marked based on the event command corresponding to that video frame segment, so as to obtain a video frame sequence; the event commands correspond one-to-one to the marks of the video frame sequences.
The surgical data also includes at least one piece of surgical text data, which is retrieved from systems such as the HIS, PACS, and LIS; the audio/video frame sequences also include audio frame sequences.
A text-to-speech conversion is performed on the obtained surgical text data to obtain at least one audio frame segment.
Each audio frame segment is marked according to the type of the corresponding surgical text data to obtain an audio frame sequence; the types of surgical text data correspond one-to-one to the marks of the audio frame sequences. The types of surgical text data include a patient basic information class, an examination class, a doctor postoperative comment class, and the like. The patient basic information class includes the patient's name, sex, age, and so on; the examination class indicates that the surgical text data is examination data obtained after the patient has undergone various examinations; the doctor postoperative comment class indicates that the surgical text data is the doctor's comparative analysis during and after the operation. Text of this last type can be stored autonomously by the machine as history, so that for patients in the same or similar condition the doctor can quickly select previously stored text and either use it directly or revise it, reducing the doctor's workload.
Optionally, an audio frame segment may also be obtained directly and marked according to its own type to obtain an audio frame sequence. For example, an audio frame segment of the medical order class may be obtained, which refers to an electronic medical order entered by speech by the physician during surgery.
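The application does not name a particular text-to-speech engine; the sketch below uses the third-party pyttsx3 package purely as one offline possibility, and the text type labels are examples only.

```python
# Sketch only: the application does not specify a TTS engine; pyttsx3 is used here
# merely as one widely available offline option. Text types are examples.
import pyttsx3

def text_to_audio_frame_segment(surgical_text: str, text_type: str, out_path: str):
    """Convert one piece of surgical text data into an audio file and mark it
    with the type of the text data (e.g. 'patient basic information',
    'examination', 'doctor postoperative comment')."""
    engine = pyttsx3.init()
    engine.save_to_file(surgical_text, out_path)
    engine.runAndWait()
    return {"audio_file": out_path, "mark": text_type}

segment = text_to_audio_frame_segment(
    "Patient Zhang, male, 45 years old.", "patient basic information", "basic_info.wav")
print(segment)
```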
Step S102: and matching the audio/video frame sequence with a preset video template to generate a short video surgery report.
The video template comprises a task bar and a label bar, the task bar comprises a plurality of task modules, each task module corresponds to one label bar, and each label bar comprises at least one preset mark.
Referring to fig. 2, when the surgical procedure video module on the left side of the video template is clicked, the label bar corresponding to the surgical procedure video module is displayed on the right side of the video template.
A first order is preset among the plurality of task modules; if a label bar includes two or more preset marks, a second order is preset among those preset marks. Based on the first order and the second order, a third order, i.e., the order among all the preset marks of the video template, is obtained.
For example, suppose there are two task modules, A and B, with the first order being A → B. The label bar corresponding to A includes only one preset mark, a, while the label bar corresponding to B includes two preset marks, b and c, with the second order being b → c; the third order is therefore a → b → c.
The marks of the audio/video frame sequences are matched with the plurality of preset marks of the video template, and the successfully matched audio/video frame sequences are fused according to the third order to generate the short video surgery report.
If the marks of a plurality of audio/video frame sequences are successfully matched with the same preset mark, only the first successfully matched audio/video frame sequence is selected for subsequent fusion; alternatively, any one of the successfully matched audio/video frame sequences may be selected for subsequent fusion.
If a preset mark is not successfully matched with the mark of any audio/video frame sequence, that preset mark is ignored in the third order. For example, if a and c are each successfully matched with the mark of an audio/video frame sequence while b is not, the third order becomes a → c, and the two successfully matched audio/video frame sequences are fused with the sequence corresponding to a placed before the sequence corresponding to c, so as to generate the short video surgery report.
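A minimal sketch of the matching and ordering described above, assuming the template's third order and the sequence marks are plain strings; the module and mark names follow the a/b/c example and are otherwise illustrative.

```python
# Illustrative sketch: preset marks a, b, c in the third order, and audio/video
# frame sequences identified only by their marks. Names are assumptions.
THIRD_ORDER = ["a", "b", "c"]          # derived from the first and second orders

def assemble_report(av_sequences):
    """av_sequences: list of dicts like {"mark": "a", "data": ...} in arbitrary order.
    Returns the successfully matched sequences in the third order, keeping only the
    first sequence for each preset mark and ignoring unmatched preset marks."""
    report = []
    for preset_mark in THIRD_ORDER:
        matched = [s for s in av_sequences if s["mark"] == preset_mark]
        if matched:                     # unmatched preset marks are skipped
            report.append(matched[0])   # keep only the first successful match
    return report

print(assemble_report([{"mark": "c", "data": "result video"},
                       {"mark": "a", "data": "basic info audio"}]))
# -> the two matched sequences fused in the order a -> c
```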
It is noted that the short video surgery report may be formed by fusing only a plurality of video frame sequences, or may be formed by fusing a video frame sequence and an audio frame sequence.
Referring to fig. 2, the task bar includes a patient basic information module, a previous examination data module, a surgical procedure video module, a surgical result video module, and a doctor postoperative comment module. The short video surgery report generated through these task modules and their corresponding preset marks first plays the audio related to the patient's basic information, then the audio related to the patient's examination data, then the video of the patient's surgical procedure, then the video recorded after the patient's operation has ended, and finally the audio related to the doctor's postoperative comments. Playing the video of the surgical procedure before the video recorded after the operation allows the patient to compare the surgical procedure with the surgical result, and playing the doctor's postoperative comments last lets the patient understand the doctor's comparative analysis during and after the operation, so that the patient gains a clearer understanding of the process and result of his or her own operation.
Furthermore, each label bar also includes at least one preset duration, and the preset durations correspond one-to-one to the preset marks. The audio/video frame sequences are refined according to the preset durations, and the refined audio/video frame sequences are fused to generate the short video surgery report.
Specifically, after the mark of an audio/video frame sequence is successfully matched with a preset mark, it is judged whether the audio/video frame sequence is a video frame sequence; if so, a first refinement process is performed on the video frame sequence according to the preset duration corresponding to the preset mark.
The first refinement process is specifically as follows:
The video segment duration of the video frame sequence is acquired; the video segment duration is the total playing duration of the video frame sequence. It is judged whether the video segment duration is greater than the preset duration corresponding to the preset mark; if so, a preset part of the video frame sequence is retained and the rest is deleted, where the preset part is set as required and its duration equals the preset duration. For example, if the part at the start of the video frame sequence is to be retained, the video segment duration is 20 seconds, and the preset duration is 10 seconds, the first 10 seconds of the video frame sequence are retained and the last 10 seconds are deleted.
It is also judged whether the video segment duration is equal to the preset duration corresponding to the preset mark; if so, the entire video frame sequence is retained.
It is likewise judged whether the video segment duration is less than the preset duration corresponding to the preset mark; if so, the whole video frame sequence may be retained. Alternatively, the video frame sequence may be repeated until the repeated duration reaches the preset duration (for example, if the video frame sequence is repeated once, i.e., copied once, the total duration of the two identical video frame sequences, i.e., the repeated duration, is 20 seconds); or the last frame of the video frame sequence may be freeze-played until the total playing duration of the video frame sequence reaches the preset duration.
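A sketch of the first refinement process under the assumption that a video frame sequence is represented by a list of frames at a fixed frame rate; the function name and padding choices are illustrative.

```python
# Sketch of the first refinement process; the video frame sequence is modeled as a
# list of frames at a fixed frame rate. Names are assumptions.
def refine_video(frames, fps, preset_duration, pad_mode="freeze"):
    """Trim, keep, or pad a video frame sequence so its playing duration
    matches the preset duration corresponding to the matched preset mark."""
    target = int(round(preset_duration * fps))
    if len(frames) > target:                 # longer: keep the preset part (here: the head)
        return frames[:target]
    if len(frames) == target:                # equal: keep everything
        return frames
    if pad_mode == "repeat":                 # shorter: repeat the whole sequence
        out = list(frames)
        while len(out) < target:
            out.extend(frames)
        return out[:target]
    # shorter: freeze-play the last frame until the preset duration is reached
    return frames + [frames[-1]] * (target - len(frames))

clip = refine_video(frames=list(range(200)), fps=25, preset_duration=10)
print(len(clip))   # -> 250 frames (an 8-second clip padded to 10 seconds by freezing)
```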
Similarly, after the mark of an audio/video frame sequence is successfully matched with a preset mark, it is judged whether the audio/video frame sequence is an audio frame sequence; if so, a second refinement process is performed on the audio frame sequence according to the preset duration corresponding to the preset mark.
The second refinement process is specifically as follows:
The audio segment duration of the audio frame sequence is acquired; the audio segment duration is the total playing duration of the audio frame sequence. It is judged whether the audio segment duration is greater than the preset duration corresponding to the preset mark; if so, the audio frame sequence is clipped.
There are various ways of clipping processing, wherein the first way is as follows:
The marks of the audio frame sequence are matched with the table marks of a plurality of preset first segment tables, where the marks of the audio frame sequences correspond one-to-one to the table marks of the first segment tables. If the matching is successful, each keyword segment in the audio frame sequence is identified, and the identified keyword segments are compared in turn with the plurality of first keyword segments in the successfully matched first segment table; if a keyword segment is the same as a first keyword segment, that keyword segment is retained. This operation is repeated until all keyword segments have been compared, and the retained keyword segments are then fused in order into a new audio frame sequence. A keyword segment may be the audio segment of a word or of a sentence.
The second mode of the clipping processing is specifically as follows:
The marks of the audio frame sequence are matched with the table marks of a plurality of preset second segment tables, where the marks of the audio frame sequences correspond one-to-one to the table marks of the second segment tables. If the matching is successful, each keyword segment in the audio frame sequence is identified, and the identified keyword segments are compared in turn with the plurality of second keyword segments in the successfully matched second segment table; if a keyword segment is the same as a second keyword segment, it is deleted from the audio frame sequence. This operation is repeated until all keyword segments have been compared and all keyword segments identical to second keyword segments have been deleted; the audio frame sequence after deletion is taken as the new audio frame sequence.
The third mode of the clipping processing is specifically as follows:
The marks of the audio frame sequence are matched with the table marks of a plurality of preset third segment tables, where the marks of the audio frame sequences correspond one-to-one to the table marks of the third segment tables. If the matching is successful, each keyword segment in the audio frame sequence is identified, and the identified keyword segments are compared in turn with the third keyword segments in the successfully matched third segment table; if a keyword segment is the same as a third keyword segment, it is marked in the audio frame sequence. The identified keyword segments are then compared in turn with the fourth keyword segments in the successfully matched third segment table; if a keyword segment is the same as a fourth keyword segment, it is also marked in the audio frame sequence. The segment between the two marked keyword segments in the audio frame sequence is intercepted and taken as the new audio frame sequence.
And if the duration of the audio segment is not greater than the preset duration corresponding to the preset mark, reserving the whole audio frame sequence.
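The three clipping modes could be sketched as follows, treating an audio frame sequence as an ordered list of keyword segments (each a word or sentence of audio, represented here by text labels); the table contents are illustrative assumptions.

```python
# Sketch of the three clipping modes; segments and table contents are assumptions.
def clip_keep(segments, first_table):
    """Mode 1: keep only keyword segments that appear in the matched first segment table."""
    return [s for s in segments if s in first_table]

def clip_delete(segments, second_table):
    """Mode 2: delete keyword segments that appear in the matched second segment table."""
    return [s for s in segments if s not in second_table]

def clip_between(segments, third_kw, fourth_kw):
    """Mode 3: intercept the part between a marked third keyword segment and a
    marked fourth keyword segment."""
    start = segments.index(third_kw)
    end = segments.index(fourth_kw)
    return segments[start:end + 1]

segments = ["name", "greeting", "diagnosis", "smalltalk", "conclusion"]
print(clip_keep(segments, {"name", "diagnosis", "conclusion"}))
print(clip_delete(segments, {"greeting", "smalltalk"}))
print(clip_between(segments, "diagnosis", "conclusion"))
```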
Furthermore, a voice broadcast function and a subtitle display function are added for the audio frame sequences in the short video surgery report: medical staff can pre-select the reading style of an audio frame sequence through the voice broadcast function, and pre-select its subtitle display style through the subtitle display function.
Some medical terms may appear when an audio frame sequence is played; since the patient is usually a non-professional, these terms can be difficult to understand. To this end, the present application provides the following solution:
After the mark of an audio/video frame sequence is successfully matched with the preset marks of the video template, it is judged whether the audio/video frame sequence is an audio frame sequence; if so, it is judged whether the audio frame sequence contains medical terms; if so, the medical terms in the audio frame sequence are screened out in turn, an animation frame sequence corresponding to each medical term is screened out from a basic information base, and the animation frame sequences are fused with the audio frame sequence according to a preset rule. It should be noted that the number of screened-out animation frame sequences may be one or more.
There are two types of preset rules. In the first, the animation frame sequences are silent animations: the total duration of the at least one screened-out animation frame sequence is adjusted to equal the audio segment duration of the audio frame sequence, and the animation frame sequences are then fused with the audio frame sequence in the order in which they were screened out. The effect after fusion is that the animation frame sequences play while the audio frame sequence plays.
In the second, the animation frame sequences may be voiced or silent animations. It is judged whether the number of screened-out animation frame sequences is one; if so, the animation frame sequence is directly fused with the audio frame sequence in the following manner; if not, the plurality of animation frame sequences are first fused in the order in which they were screened out to generate a new animation frame sequence.
Next, the animation frame sequence is inserted before the first frame of the audio frame sequence, i.e., the last frame of the animation frame sequence immediately precedes the audio frame sequence; or the animation frame sequence is inserted after the last frame of the audio frame sequence, i.e., the first frame of the animation frame sequence immediately follows the audio frame sequence.
The above steps are repeated until all audio frame sequences and animation frame sequences have been fused, and the video frame sequences, audio frame sequences, and animation frame sequences are then fused according to the third order to generate the short video surgery report.
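An illustrative sketch of the medical-term screening and fusion step, with the basic information base modeled as a dictionary from medical term to an animation frame sequence identifier; the terms, names, and duration handling are assumptions.

```python
# Sketch only: the basic information base is modeled as a dict from medical term
# to an animation frame sequence identifier; terms and names are assumptions.
BASIC_INFO_BASE = {"laparoscopy": "anim_laparoscopy", "anastomosis": "anim_anastomosis"}

def fuse_terms(audio_sequence_text, audio_duration, rule="concurrent"):
    """Screen medical terms out of an audio frame sequence, look up their
    animation frame sequences, and fuse them according to a preset rule."""
    animations = [BASIC_INFO_BASE[t] for t in BASIC_INFO_BASE if t in audio_sequence_text]
    if not animations:
        return None
    if rule == "concurrent":
        # Rule 1: silent animations played alongside the audio; here the audio
        # segment duration is simply shared equally among the animations.
        share = audio_duration / len(animations)
        return [{"animation": a, "duration": share} for a in animations]
    # Rule 2: concatenate the animations and insert them before (or after) the audio.
    return {"insert": "before_audio", "animations": animations}

print(fuse_terms("the laparoscopy showed no bleeding", audio_duration=12.0))
```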
Optionally, the animation frame sequences fused with the audio frame sequence are not limited to animations related to medical terms; other animations, such as anthropomorphic animations made by the hospital itself and surgical simulation animations, may also be used.
If the second preset rule is selected, the audio frame sequence would otherwise play over a blank screen. Therefore, the audio frame sequence in the surgery report is matched against a preset picture database, and if the matching is successful, the audio frame sequence is fused with the one or more successfully matched pictures.
Specifically, the picture database includes a plurality of pictures, which may be patient portraits, lesion images, charts produced during the patient's examinations, and the like; the lesion images may be obtained by intercepting one or more frames from the surgical video through image recognition technology. The marks of the pictures are matched with the mark of the audio frame sequence, or with the marks of keyword segments in the audio frame sequence, so as to obtain one or more successfully matched pictures.
It is then judged whether the number of successfully matched pictures is one. If so, the picture is fused with the audio frame sequence, with the following effect: while the audio frame sequence is played, the picture is freeze-played for the audio segment duration of the audio frame sequence. If not, one picture may be selected at random from the successfully matched pictures to fuse with the audio frame sequence, or the first or last of the successfully matched pictures may be selected as required; alternatively, all of the successfully matched pictures may be fused with the audio frame sequence. For example, if the audio segment duration is 10 seconds and three pictures are matched, then while the audio frame sequence is played, the first picture is freeze-played for 3 seconds, the second for 3 seconds, and the third for 4 seconds.
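A small sketch reproducing the duration split in the example above; the whole-second split with the remainder given to the last picture is an assumption about how the freeze durations are assigned.

```python
# Sketch: split an audio segment's duration across the matched pictures,
# reproducing the 10-second / three-picture example above.
def picture_freeze_durations(audio_duration, num_pictures):
    """Divide the audio segment duration into whole-second freeze durations,
    giving any remainder to the last picture."""
    base = int(audio_duration // num_pictures)
    durations = [base] * num_pictures
    durations[-1] += int(audio_duration - base * num_pictures)
    return durations

print(picture_freeze_durations(10, 3))   # -> [3, 3, 4]
```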
Further, a progress bar is added to the generated short video surgery report. Time nodes are created in the progress bar based on the duration of each frame sequence in the short video surgery report; each time node lies between two adjacent frame sequences, that is, between the last frame of the preceding frame sequence and the first frame of the succeeding frame sequence. The frame sequences include the video frame sequences, the audio frame sequences, and the animation frame sequences.
After the short video surgery report is generated, clicking a time node jumps the playing progress of the report to that time node. Each time node is also marked with its corresponding event command; specifically, the event command corresponding to a time node is the event command corresponding to the frame sequence following that time node, and the event command is displayed after a trigger operation by the patient on the display is acquired.
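A sketch of how the time nodes could be computed from the frame sequence durations; representing each frame sequence as a duration plus its event command is an assumption made for the example.

```python
# Sketch: time nodes are the cumulative durations between adjacent frame sequences,
# each node labeled with the event command of the frame sequence that follows it.
def build_time_nodes(frame_sequences):
    """frame_sequences: list of dicts like {"duration": seconds, "event": command}."""
    nodes, elapsed = [], 0.0
    for prev, nxt in zip(frame_sequences, frame_sequences[1:]):
        elapsed += prev["duration"]
        nodes.append({"time": elapsed, "event": nxt["event"]})
    return nodes

report = [{"duration": 8, "event": "basic information"},
          {"duration": 25, "event": "operation start"},
          {"duration": 12, "event": "operation end"}]
print(build_time_nodes(report))   # nodes at 8 s and 33 s
```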
Referring to fig. 2, the video template further includes a script bar; clicking the script bar on the left side of the video template displays the related script program on the right side of the video template. Through the script bar, the script of the video template can be modified, for example to modify the preset marks or to select the clipping processing mode.
The video template further includes a rendering output bar, through which the output mode of the short video surgery report can be selected. For example, a platform link or a two-dimensional code may be output, so that the patient can obtain the short video surgery report by clicking the link or scanning the code; an html webpage may also be output, or the short video surgery report may be recorded onto an optical disc.
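The application does not specify how the two-dimensional code is produced; as one illustration, a link to the report could be encoded with the third-party qrcode package, using a hypothetical URL.

```python
# Illustration only: the application does not name a QR-code library; the
# third-party "qrcode" package is assumed here, and the URL is hypothetical.
import qrcode

report_url = "https://example-hospital-platform/report/12345"   # hypothetical link
img = qrcode.make(report_url)
img.save("surgery_report_qr.png")    # the patient scans this code to open the report
```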
In order to better implement the above method, the embodiment of the present application further provides a short video surgery report generating apparatus, which may be specifically integrated in a computer device, such as a terminal or a server, where the terminal may include, but is not limited to, a mobile phone, a tablet computer, or a desktop computer.
Fig. 3 is a block diagram of a short video surgery report generation apparatus according to an embodiment of the present application, and as shown in fig. 3, the apparatus mainly includes:
a processing module 201, configured to acquire surgical data and process the surgical data to obtain a plurality of audio/video frame sequences; and
a matching module 202, configured to match the audio/video frame sequences with a preset video template to generate a short video surgery report.
Various changes and specific examples in the method provided by the above embodiment are also applicable to the short video surgery report generation apparatus of the present embodiment, and through the foregoing detailed description of the short video surgery report generation method, those skilled in the art can clearly know the implementation method of the short video surgery report generation apparatus in the present embodiment, and for the brevity of the description, detailed descriptions are omitted here.
In order to better execute the program of the above method, the embodiment of the present application further provides a computer device. As shown in fig. 4, the computer device 300 includes a memory 301 and a processor 302.
Thecomputer device 300 may be implemented in various forms including devices such as a cell phone, a tablet computer, a palm top computer, a laptop computer, and a desktop computer.
The memory 301 may be used to store instructions, programs, code sets, or instruction sets. The memory 301 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as identifying whether medical staff issue an event command during the surgical procedure and adding a timestamp to the surgical video), and instructions for implementing the short video surgery report generation method provided by the above embodiments; the data storage area may store the data involved in the short video surgery report generation method provided by the above embodiments.
The processor 302 may include one or more processing cores. The processor 302 invokes the data stored in the memory 301 by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 301, so as to perform the various functions of the present application and to process data. The processor 302 may be at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a central processing unit (CPU), a controller, a microcontroller, and a microprocessor. It is understood that the electronic device implementing the functions of the processor 302 may also be another device, and the embodiments of the present application are not limited in this regard.
An embodiment of the present application provides a computer-readable storage medium, which includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. The computer-readable storage medium stores a computer program that can be loaded by a processor to execute the short video surgery report generation method of the above embodiments.
The specific embodiments are merely illustrative and not restrictive. After reading this specification, those skilled in the art may make modifications to the embodiments as required without inventive contribution, and such modifications remain protected by patent law within the scope of the claims of this application.