Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or in parallel. Moreover, method embodiments may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting; those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Fig. 1 is a flowchart of a display method in an embodiment of the present disclosure. The embodiment is applicable to a client; the method may be executed by a display device, which may be implemented in software and/or hardware and configured in an electronic device such as a terminal, specifically including but not limited to a smart phone, a handheld computer, a tablet computer, a wearable device with a display screen, a desktop computer, a notebook computer, an all-in-one computer, a smart home device, and the like.
As shown in fig. 1, the method may specifically include the following steps:
Step 101, receiving a preset trigger operation.
The preset trigger operation may be an operation of a user starting a certain application program or an operation of starting a certain control in a certain application program, and the application program may be a client installed in the terminal. For example, the user may start a dance application, start a yoga application, or start a control of a fitness item in a fitness application.
Step 102, in response to the preset trigger operation, displaying a target page, where the target page comprises a first area, a second area and a third area; a real-time picture of a user is displayed in the first area, a picture of a teaching video is displayed in the second area, and at least one action demonstration graph matched with the teaching video is displayed in the third area.
In one embodiment, reference may be made to the schematic diagram of a target page shown in FIG. 2, which includes a first region 210, a second region 220, and a third region 230. A real-time picture of the user, specifically a real-time picture shot by the terminal camera, is displayed in the first area 210. A picture of a teaching video, such as a yoga teaching video, a shoulder-and-neck relaxation teaching video, or a dance teaching video, is displayed in the second area 220. At least one action demonstration graph 231 matched with the teaching video is displayed in the third area 230; for example, if the teaching video is a yoga teaching video, the at least one action demonstration graph 231 displayed in the third area 230 is a yoga action demonstration graph included in the teaching video, and if the teaching video is a dance-related teaching video, the at least one action demonstration graph 231 displayed in the third area 230 is a dance action demonstration graph included in the teaching video. In fig. 2, taking a yoga teaching video as an example, a real-time picture of the user imitating the coach's yoga action is displayed in the first area 210, the teaching video, specifically a video of the coach explaining and demonstrating the action, is displayed in the second area 220, and at least one action demonstration graph 231 included in the teaching video is displayed in the third area 230, where each action demonstration graph 231 is a static image so that the user can easily view and imitate the action. When following along with a teaching video, new users in particular often cannot keep up: they feel the video plays too fast and need to repeatedly play back the demonstration of a specific action. By displaying the static action demonstration graphs 231 in the third area 230, the user can view and imitate the actions more easily, which improves the user experience.
The at least one action demonstration graph matched with the teaching video and displayed in the third area 230 may be extracted online in real time from the teaching video; alternatively, it may be extracted offline and stored in association with the teaching video, in which case the stored action demonstration graphs are displayed synchronously with the teaching video when the user triggers a playing instruction for it.
According to the display method provided by the embodiment of the disclosure, when a preset trigger operation is received, a target page is displayed. The target page comprises three display areas: a first area in which a real-time picture of the user is displayed, a second area in which a picture of the teaching video is displayed, and a third area in which at least one action demonstration graph matched with the teaching video is displayed. This highlights the action demonstration graphs of the teaching video, guides the user to complete the target actions in a standard manner, and helps improve the user experience.
Based on the above embodiments, fig. 3 is a flowchart illustrating a display method according to an embodiment. The present embodiment gives an alternative way of acquiring the above-described action demonstration graphs. Specifically, before the target page is displayed, at least one group of target video frames is extracted from the teaching video based on preset logic, and the at least one action demonstration graph is obtained according to the at least one group of target video frames. As shown in fig. 3, the display method includes the following steps:
Step 310, extracting at least one group of target video frames from the teaching video based on preset logic.
In one embodiment, extracting at least one set of target video frames from the teaching video based on preset logic comprises: identifying an action starting mark and an action ending mark which are marked in the teaching video in advance; determining a first timestamp corresponding to the action start mark and a second timestamp corresponding to the action end mark; extracting a plurality of video frames according to a set frequency from the teaching video between the first time stamp and the second time stamp; determining the plurality of video frames as a set of target video frames.
The action start marker and the action end marker are the start and end markers for the same demonstration action; that is, every video frame between the video frame bearing the action start marker and the video frame bearing the action end marker contains the same demonstration action. It will be appreciated that if multiple demonstration actions are included in the teaching video, there are multiple pre-labeled action start markers and action end markers in the teaching video. For example, in playing order, the start marker of the first demonstration action may be defined as 11 and its end marker as 12, the start marker of the second demonstration action as 21 and its end marker as 22, the start marker of the third demonstration action as 31 and its end marker as 32, and so on. Correspondingly, a plurality of video frames are extracted at a set frequency from the teaching video between the start marker 11 and the end marker 12, determined as one group of target video frames, and used to determine the action demonstration graph of the first demonstration action; a plurality of video frames are extracted at the set frequency from the teaching video between the start marker 21 and the end marker 22, determined as another group of target video frames, and used to determine the action demonstration graph of the second demonstration action; and a plurality of video frames are extracted at the set frequency from the teaching video between the start marker 31 and the end marker 32, determined as a further group of target video frames, and used to determine the action demonstration graph of the third demonstration action.
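As a rough, non-limiting sketch of how step 310 might be implemented, the Python fragment below samples frames between each marker pair at a set frequency using OpenCV. The marker representation (a list of (start_seconds, end_seconds) pairs) and all function names are illustrative assumptions, not part of the disclosure.

```python
# Hypothetical sketch of step 310: sample frames between each pre-labeled
# action start/end marker pair at a set frequency.
import cv2

def extract_target_frame_groups(video_path, markers, sample_hz=2.0):
    """markers: list of (start_seconds, end_seconds) pairs, one per action.
    Returns one group (list) of target video frames per demonstration action."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if FPS is unreadable
    step = max(int(fps / sample_hz), 1)       # frame interval for the set frequency
    groups = []
    for start_s, end_s in markers:
        frames = []
        for idx in range(int(start_s * fps), int(end_s * fps) + 1, step):
            cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
            ok, frame = cap.read()
            if ok:
                frames.append(frame)
        groups.append(frames)   # one group of target video frames per action
    cap.release()
    return groups
```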
Step 320, obtaining the at least one action demonstration graph according to the at least one group of target video frames.
Further, in one embodiment, a group of target video frames includes a plurality of video frames, and obtaining an action demonstration graph from a group of target video frames includes: performing target detection on the Nth video frame in the group of target video frames through a detection model to obtain a detection result; performing target segmentation through a segmentation model based on the detection result to obtain target detection results corresponding to the Nth video frame; fusing the target detection results to obtain an Nth target segmentation result corresponding to the Nth video frame; inputting the Nth target segmentation result and the (N+1)th video frame in the group of target video frames into a video segmentation model to obtain an (N+1)th target segmentation result corresponding to the (N+1)th video frame, where the Nth video frame is the previous video frame adjacent to the (N+1)th video frame, N is a natural number greater than or equal to 1, and N+1 is less than or equal to the number of video frames included in the group of target video frames; determining a final target segmentation result corresponding to the (N+1)th video frame according to the Nth target segmentation result and the (N+1)th target segmentation result; and determining the final target segmentation result as the action demonstration graph corresponding to the group of target video frames. In one embodiment, determining the final target segmentation result corresponding to the (N+1)th video frame according to the Nth target segmentation result and the (N+1)th target segmentation result includes: determining an intersection ratio between the (N+1)th target segmentation result and the Nth target segmentation result; if the intersection ratio is greater than or equal to a preset threshold, determining the (N+1)th target segmentation result as the final target segmentation result; if the intersection ratio is smaller than the preset threshold, taking the (N+1)th video frame as the new Nth video frame and returning to the operation of performing target detection on the Nth video frame in the group of target video frames through the detection model to obtain a detection result, until every video frame in the group of target video frames has been traversed.
For example, to illustrate the above process, assume the detection target is a human body. For the 1st video frame in a group of target video frames, human body detection is performed through the detection model to obtain a detection result; human body segmentation is performed through the segmentation model based on the detection result to obtain human body detection results corresponding to the 1st video frame (the human body detection results may include the head, the arms, the trunk and the like of the human body); the human body detection results are fused to obtain a 1st human body segmentation result corresponding to the 1st video frame; the 1st human body segmentation result and the 2nd video frame in the same group of target video frames are input into the video segmentation model to obtain a 2nd human body segmentation result corresponding to the 2nd video frame; a final human body segmentation result corresponding to the 2nd video frame is determined according to the 1st human body segmentation result and the 2nd human body segmentation result; and the final human body segmentation result is determined as the action demonstration graph corresponding to the group of target video frames. Further, determining the final human body segmentation result corresponding to the 2nd video frame according to the 1st human body segmentation result and the 2nd human body segmentation result includes: determining the intersection ratio between the 2nd human body segmentation result and the 1st human body segmentation result; if the intersection ratio is greater than or equal to the preset threshold, determining the 2nd human body segmentation result as the final human body segmentation result; if the intersection ratio is smaller than the preset threshold, taking the 2nd video frame as the new starting video frame, that is, repeating the operation from the 2nd video frame until every video frame in the group of target video frames has been traversed. Specifically, human body detection is performed on the 2nd video frame through the detection model to obtain a detection result; human body segmentation is performed through the segmentation model based on the detection result to obtain human body detection results corresponding to the 2nd video frame (which, again, may include the head, the arms, the trunk and the like); the human body detection results are fused to obtain a 2nd human body segmentation result corresponding to the 2nd video frame; the 2nd human body segmentation result and the 3rd video frame in the same group of target video frames are input into the video segmentation model to obtain a 3rd human body segmentation result corresponding to the 3rd video frame; a final human body segmentation result corresponding to the 3rd video frame is determined according to the 2nd human body segmentation result and the 3rd human body segmentation result; and the final human body segmentation result is determined as the action demonstration graph corresponding to the group of target video frames.
Further, determining the final human body segmentation result corresponding to the 3rd video frame according to the 2nd human body segmentation result and the 3rd human body segmentation result includes: determining the intersection ratio between the 3rd human body segmentation result and the 2nd human body segmentation result; if the intersection ratio is greater than or equal to the preset threshold, determining the 3rd human body segmentation result as the final human body segmentation result; if the intersection ratio is smaller than the preset threshold, taking the 3rd video frame as the new starting video frame, that is, repeating the operation from the 3rd video frame, and so on.
The intersection ratio between the (N+1)th target segmentation result and the Nth target segmentation result is the ratio of the area of their intersection to the area of their union. Reference may be made to the schematic diagram of target segmentation results shown in fig. 4, where the (N+1)th target segmentation result is denoted as G and the Nth target segmentation result is denoted as C; the intersection ratio between them is IoU = area(C ∩ G) / area(C ∪ G), where area(C) denotes the area of the Nth target segmentation result and area(G) denotes the area of the (N+1)th target segmentation result.
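Expressed as code, the intersection ratio of two binary segmentation masks can be computed as below; this is a generic IoU over pixel masks, offered as an illustration rather than the disclosure's exact computation.

```python
import numpy as np

def mask_iou(mask_c: np.ndarray, mask_g: np.ndarray) -> float:
    """IoU = area(C ∩ G) / area(C ∪ G) for two boolean masks of equal shape."""
    intersection = np.logical_and(mask_c, mask_g).sum()
    union = np.logical_or(mask_c, mask_g).sum()
    return float(intersection) / float(union) if union > 0 else 0.0
```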
The detection model may be FCOS (Fully Convolutional One-Stage object detection), which directly predicts, for each pixel, the distances from the detection target to the top, bottom, left and right sides of the prediction box, so that the detection result can be obtained very intuitively. The segmentation model may be DeepLab v3+ or an FCN (Fully Convolutional Network). The video segmentation model may be an STM (Space-Time Memory) network, which obtains the (N+1)th target segmentation result corresponding to the (N+1)th video frame based on the Nth target segmentation result corresponding to the Nth video frame and the (N+1)th video frame; that is, it performs feature matching between the information of the current frame (the (N+1)th video frame) and the information of the previous frame in the spatial and temporal dimensions to obtain the segmentation result corresponding to the current frame, which can improve segmentation accuracy.
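The frame-by-frame propagation described above can be sketched as follows. This is a schematic, hedged rendering only: detect, segment and video_segment are hypothetical callables standing in for the FCOS, DeepLab v3+/FCN and STM models respectively, the fallback branch is an assumption, and mask_iou is the function sketched above.

```python
import numpy as np

def fuse(part_masks):
    """Fuse per-part masks (e.g. head, arms, trunk) into one target mask."""
    return np.logical_or.reduce(list(part_masks))

def propagate_segmentation(frames, detect, segment, video_segment,
                           iou_threshold=0.7):
    """Return the final target segmentation result for one group of frames."""
    n = 0
    while n + 1 < len(frames):
        boxes = detect(frames[n])                    # target detection on frame N
        result_n = fuse(segment(frames[n], boxes))   # Nth target segmentation result
        # Propagate the Nth result to frame N+1 via the video segmentation model.
        result_n1 = video_segment(result_n, frames[n + 1])
        if mask_iou(result_n1, result_n) >= iou_threshold:
            return result_n1      # accepted as the final target segmentation result
        n += 1                    # otherwise treat frame N+1 as the new frame N
    # Assumed fallback once every frame has been traversed without convergence.
    return fuse(segment(frames[-1], detect(frames[-1])))
```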
Step 330, receiving a preset trigger operation.
Step 340, in response to the preset trigger operation, displaying a target page, where the target page comprises a first area, a second area and a third area; a real-time picture of a user is displayed in the first area, a picture of the teaching video is displayed in the second area, and at least one action demonstration graph matched with the teaching video is displayed in the third area.
In the technical solution of this embodiment, an optional implementation of the manner of obtaining the action demonstration graphs is given on the basis of the above embodiments. Specifically, before the target page is displayed, at least one group of target video frames is extracted from the teaching video based on preset logic, and the at least one action demonstration graph is obtained according to the at least one group of target video frames.
Based on the above embodiments, fig. 5 is a flowchart illustrating a display method according to an embodiment. In this embodiment, the following step is added: "when it is monitored that the playing progress of the teaching video reaches the start timestamp corresponding to the action demonstration graph, controlling the first identifier displayed in association with the action demonstration graph and the action demonstration graph itself to be highlighted, so as to prompt the user that the time to perform the target action shown by the action demonstration graph has arrived". The advantage of this optimization is that the user can be reminded in advance that the moment to perform the action is approaching, and displaying the action demonstration graph of the action to be performed allows the user to prepare mentally in advance. This can improve the probability of the user performing a standard action, or in other words the standard degree of the action performed by the user, and thereby improve the user experience.
As shown in fig. 5, the display method includes the steps of:
Step 510, receiving a preset trigger operation.
Step 520, in response to the preset trigger operation, displaying a target page, where the target page comprises a first area, a second area and a third area; a real-time picture of a user is displayed in the first area, a picture of a teaching video is displayed in the second area, and at least one action demonstration graph matched with the teaching video is displayed in the third area.
Step 530, when it is monitored that the playing progress of the teaching video reaches the start timestamp corresponding to the action demonstration graph, controlling the first identifier displayed in association with the action demonstration graph and the action demonstration graph itself to be highlighted, so as to prompt the user that the time to perform the target action shown by the action demonstration graph has arrived.
Illustratively, referring to the schematic diagram of a target page shown in fig. 6, the target page includes a first area 610, a second area 620 and a third area 630; a real-time picture of the user is displayed in the first area 610, a picture of the teaching video is displayed in the second area 620, at least one action demonstration graph 631 matched with the teaching video is displayed in the third area 630, and a first identifier 632 is displayed in association with the action demonstration graph 631. A playing progress bar 633 of the teaching video is further displayed in the third area 630 and slides from left to right as the teaching video plays. When it is monitored that the playing progress of the teaching video reaches the start timestamp corresponding to the action demonstration graph 631b, the first identifier 632 displayed in association with the action demonstration graph 631b and the action demonstration graph 631b itself are controlled to be highlighted, so as to prompt the user that the time to perform the target action shown in the action demonstration graph 631b has arrived. In this way, the user can be reminded in advance that the moment to perform the action is approaching, and displaying the action demonstration graph of the action to be performed allows the user to prepare mentally, which can improve the probability of the user performing a standard action, or in other words the standard degree of the action performed, and improve the user experience.
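A minimal sketch of the progress monitoring underlying step 530 is given below; the list of (start, end) timestamp windows and the function name are assumptions, not part of the disclosure. The client would call this from its playback callback and highlight the returned graph together with its associated first identifier.

```python
from typing import Optional, Sequence, Tuple

def active_demo_index(position_s: float,
                      windows: Sequence[Tuple[float, float]]) -> Optional[int]:
    """Return the index of the action demonstration graph whose
    [start, end] timestamp window contains the playback position."""
    for i, (start_s, end_s) in enumerate(windows):
        if start_s <= position_s <= end_s:
            return i
    return None   # no demonstration action is active at this position
```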
In one embodiment, when there are multiple action demonstration graphs and they cannot all be displayed at once in the third area of the target page, the action demonstration graphs slide from right to left according to the playing progress of the teaching video, so that the action demonstration graphs already played in the teaching video move off the display screen and the action demonstration graphs about to be played move onto it. As shown in fig. 6, the action demonstration graph 631a shows an action that has already been played in the teaching video; the action demonstration graph 631b shows the action currently being played, with the first identifier 632 displayed in association with the action demonstration graph 631b and the action demonstration graph 631b itself highlighted; and the action demonstration graph 631c shows an action that has not yet been played in the teaching video.
In one embodiment, the display method further comprises:
displaying a preset special effect when it is determined that the target action performed by the user meets a preset condition. The preset special effect may, for example, automatically show encouraging text; the text may have a striking color or special effect, for example it may be shown as a small animation or in the form of a small bubble. Reference may be made to the schematic diagram of a target page shown in fig. 7, which includes a first area 710, a second area 720 and a third area 730, where a real-time picture of the user is displayed in the first area 710, a picture of the teaching video is displayed in the second area 720, and at least one action demonstration graph 731 matched with the teaching video is displayed in the third area 730; when it is determined that the target action performed by the user meets the preset condition, a preset special effect 732 "PERFECT" is displayed, which enriches the form of interaction and improves the user experience. The preset condition includes at least one of the following:
the time at which the target action is performed falls between the start timestamp and the end timestamp corresponding to the action demonstration graph; the matching degree between the target action performed by the user and the target action shown in the action demonstration graph reaches a matching degree threshold. The closer the time at which the user performs the target action is to the time at which the coach performs the action in the teaching video, the better the target action meets the preset condition and the higher the resulting score; likewise, the higher the degree of conformity between the user's target action and the coach's action, the better the preset condition is met and the higher the resulting score.
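A hedged sketch of this condition check follows; the threshold value and the or-combination (the disclosure requires at least one of the two conditions) are illustrative assumptions.

```python
def meets_preset_condition(action_time_s: float, start_s: float, end_s: float,
                           matching_degree: float,
                           matching_threshold: float = 0.8) -> bool:
    """True when the action falls inside the demo's time window and/or the
    matching degree reaches the threshold (at least one condition suffices)."""
    in_time_window = start_s <= action_time_s <= end_s
    matched = matching_degree >= matching_threshold
    return in_time_window or matched
```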
In one embodiment, the display method further comprises:
scoring each associated action part separately according to the target action performed by the user to obtain a scoring result corresponding to each action part; and prompting the user about action parts whose scoring result is below a result threshold. The action parts associated with the target action include, for example, the arms, the legs, and the hips. By independently scoring each action part and prompting the user about parts that score below the result threshold, the key points of the action are highlighted, so that the user knows which part's action is not standard and how to correct it, which can improve the user experience.
In a specific embodiment, the scoring the associated action part for the target action performed by the user to obtain a scoring result corresponding to the action part includes:
performing a similarity calculation between the image of the action part and an image containing the standard action; and determining the scoring result corresponding to the action part according to the calculated similarity. In general, the higher the similarity between the image of the action part when the user performs the target action and the image containing the standard action, the higher the scoring result.
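One simple stand-in for this similarity-based scoring is sketched below: cosine similarity between same-sized grayscale crops of each action part, mapped to a 0 to 100 score. The similarity measure, score mapping, and threshold are assumptions rather than the disclosure's specific method, and the crops are assumed to have been resized to a common shape.

```python
import numpy as np

def score_part(user_crop: np.ndarray, standard_crop: np.ndarray) -> float:
    """Cosine similarity between two same-shaped crops, mapped to [0, 100]."""
    u = user_crop.astype(np.float64).ravel()
    s = standard_crop.astype(np.float64).ravel()
    denom = np.linalg.norm(u) * np.linalg.norm(s)
    similarity = float(u @ s) / denom if denom > 0 else 0.0
    return (similarity + 1.0) / 2.0 * 100.0   # map [-1, 1] to [0, 100]

def parts_below_threshold(scores: dict, result_threshold: float = 60.0):
    """Return the action parts (e.g. 'arm', 'leg', 'hip') to prompt about."""
    return [part for part, score in scores.items() if score < result_threshold]
```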
In one embodiment, the display method further comprises:
displaying a target action wireframe flow in the first area so that the user can perform the target action with reference to the target action wireframe flow. Reference may be made to the schematic diagram of a target page shown in fig. 8, which includes a first area 810, a second area 820 and a third area 830; a real-time picture of the user is displayed in the first area 810, a picture of the teaching video is displayed in the second area 820, and at least one action demonstration graph 831 matched with the teaching video is displayed in the third area 830. A target action wireframe flow 811 is also displayed in the first area 810; the target action indicated by the target action wireframe flow 811 is consistent with the action performed by the coach in the teaching video displayed in the second area 820 at the current moment, and when the coach in the teaching video switches actions, the target action wireframe flow 811 changes along with the coach's action. By displaying the target action wireframe flow 811 in the first area 810, the user can be helped to imitate the coach's action and perform more standard actions, improving the user experience.
Fig. 9 is a schematic structural diagram of a display device in an embodiment of the disclosure. The display device provided by the embodiment of the disclosure can be configured in the client. As shown in fig. 9, the display device specifically includes: a receiving module 910 and a first display module 920. The receiving module 910 is configured to receive a preset trigger operation; the first display module 920 is configured to display a target page in response to the preset trigger operation, where the target page includes a first area, a second area, and a third area, a real-time picture of a user is displayed in the first area, a picture of a teaching video is displayed in the second area, and an action demonstration graph matched with the teaching video is displayed in the third area.
Optionally, the display device provided in the present disclosure further includes: an extraction module, configured to extract at least one group of target video frames from the teaching video based on preset logic before the target page is displayed; and an acquisition module, configured to obtain the at least one action demonstration graph according to the at least one group of target video frames.
Optionally, the extraction module includes: the identification unit is used for identifying an action starting mark and an action ending mark which are marked in the teaching video in advance; a first determining unit, configured to determine a first timestamp corresponding to the action start marker and a second timestamp corresponding to the action end marker; the extracting unit is used for extracting a plurality of video frames according to a set frequency from the teaching video between the first time stamp and the second time stamp; a second determining unit for determining the plurality of video frames as a set of target video frames.
Optionally, the set of target video frames includes a plurality of video frames, and the obtaining module includes: a detection unit, configured to perform target detection on the Nth video frame in the group of target video frames through a detection model to obtain a detection result; a first segmentation unit, configured to perform target segmentation through a segmentation model based on the detection result to obtain target detection results corresponding to the Nth video frame; a fusion unit, configured to fuse the target detection results to obtain an Nth target segmentation result corresponding to the Nth video frame; a second segmentation unit, configured to input the Nth target segmentation result and the (N+1)th video frame in the set of target video frames into a video segmentation model to obtain an (N+1)th target segmentation result corresponding to the (N+1)th video frame, where the Nth video frame is the previous video frame adjacent to the (N+1)th video frame, N is a natural number greater than or equal to 1, and N+1 is less than or equal to the number of video frames included in the group of target video frames; a first determining unit, configured to determine a final target segmentation result corresponding to the (N+1)th video frame according to the Nth target segmentation result and the (N+1)th target segmentation result; and a second determining unit, configured to determine the final target segmentation result as the action demonstration graph corresponding to the set of target video frames.
Optionally, the first determining unit includes: a first determining subunit, configured to determine an intersection ratio between the (N+1)th target segmentation result and the Nth target segmentation result; a second determining subunit, configured to determine the (N+1)th target segmentation result as the final target segmentation result if the intersection ratio is greater than or equal to a preset threshold; and an updating subunit, configured to, if the intersection ratio is smaller than the preset threshold, take the (N+1)th video frame as the new Nth video frame and return to the operation of performing target detection on the Nth video frame in the group of target video frames through the detection model to obtain a detection result, until every video frame in the group of target video frames has been traversed.
Optionally, the display device provided in the present disclosure further includes: a control module, configured to, when it is monitored that the playing progress of the teaching video reaches the start timestamp corresponding to the action demonstration graph, control the first identifier displayed in association with the action demonstration graph and the action demonstration graph itself to be highlighted, so as to prompt the user that the time to perform the target action shown by the action demonstration graph has arrived.
Optionally, the display device provided in the present disclosure further includes: a display module, configured to display a preset special effect when it is determined that the target action performed by the user meets a preset condition.
Optionally, the preset condition includes at least one of the following: the time for carrying out the target action is between the starting time stamp and the ending time stamp corresponding to the action demonstration graph; the matching degree between the target action and the target action shown in the action demonstration graph reaches a matching degree threshold value.
Optionally, according to one or more embodiments of the present disclosure, in the display device provided by the present disclosure, the display device further includes: the scoring module is used for scoring the associated action part according to the target action performed by the user to obtain a scoring result corresponding to the action part; and the prompting module is used for prompting the action part of which the scoring result is lower than the result threshold value.
Optionally, according to one or more embodiments of the present disclosure, in a display device provided by the present disclosure, the scoring module includes: a calculation unit for performing similarity calculation between the image of the action part and an image including a standard action; and the determining unit is used for determining the scoring result corresponding to the action part according to the calculated similarity.
Optionally, according to one or more embodiments of the present disclosure, in the display device provided by the present disclosure, the display device further includes: and the second display module is used for displaying a target action wire frame flow in the first area so that the user can carry out the target action by referring to the target action wire frame flow.
The display device provided in the embodiment of the disclosure may perform the steps in the display method provided in the embodiment of the disclosure, and the steps and the beneficial effects are not repeated herein.
Fig. 10 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure. Referring now specifically to fig. 10, a schematic diagram of an electronic device 500 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device 500 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet), a PMP (portable multimedia player), a vehicle-mounted terminal (e.g., a car navigation terminal), a wearable electronic device, and the like, and fixed terminals such as a digital TV, a desktop computer, a smart home device, and the like. The electronic device shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 10, an electronic device 500 may include a processing means (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes to implement the … method of the embodiments described in this disclosure, according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic device 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
Generally, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 10 illustrates an electronic device 500 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flowchart, thereby implementing the method as described above. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 509, or installed from the storage device 508, or installed from the ROM 502. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 501.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
receiving a preset trigger operation; and responding to the preset trigger operation, displaying a target page, wherein the target page comprises a first area, a second area and a third area, a real-time picture of a user is displayed in the first area, a picture of a teaching video is displayed in the second area, and at least one action demonstration graph matched with the teaching video is displayed in the third area.
Optionally, when the one or more programs are executed by the electronic device, the electronic device may further perform other steps described in the above embodiments.
Computer program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including but not limited to object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a display method including: receiving a preset trigger operation; and responding to the preset trigger operation, displaying a target page, wherein the target page comprises a first area, a second area and a third area, a real-time picture of a user is displayed in the first area, a picture of a teaching video is displayed in the second area, and at least one action demonstration graph matched with the teaching video is displayed in the third area.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the method further includes: extracting at least one group of target video frames from the teaching video based on preset logic; and acquiring the at least one action demonstration graph according to the at least one group of target video frames.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the extracting at least one group of target video frames from the teaching video based on preset logic includes: identifying an action starting mark and an action ending mark which are marked in the teaching video in advance; determining a first timestamp corresponding to the action start mark and a second timestamp corresponding to the action end mark; extracting a plurality of video frames according to a set frequency from the teaching video between the first time stamp and the second time stamp; determining the plurality of video frames as a set of target video frames.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the set of target video frames includes a plurality of video frames, and obtaining one action demonstration graph from a group of target video frames includes: performing target detection on the Nth video frame in the group of target video frames through a detection model to obtain a detection result; performing target segmentation through a segmentation model based on the detection result to obtain target detection results corresponding to the Nth video frame; fusing the target detection results to obtain an Nth target segmentation result corresponding to the Nth video frame; inputting the Nth target segmentation result and the (N+1)th video frame in the group of target video frames into a video segmentation model to obtain an (N+1)th target segmentation result corresponding to the (N+1)th video frame, where the Nth video frame is the previous video frame adjacent to the (N+1)th video frame, N is a natural number greater than or equal to 1, and N+1 is less than or equal to the number of video frames included in the group of target video frames; determining a final target segmentation result corresponding to the (N+1)th video frame according to the Nth target segmentation result and the (N+1)th target segmentation result; and determining the final target segmentation result as the action demonstration graph corresponding to the group of target video frames.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, determining the final target segmentation result corresponding to the (N+1)th video frame according to the Nth target segmentation result and the (N+1)th target segmentation result includes: determining an intersection ratio between the (N+1)th target segmentation result and the Nth target segmentation result; if the intersection ratio is greater than or equal to a preset threshold, determining the (N+1)th target segmentation result as the final target segmentation result; if the intersection ratio is smaller than the preset threshold, taking the (N+1)th video frame as the new Nth video frame and returning to the operation of performing target detection on the Nth video frame in the group of target video frames through the detection model to obtain a detection result, until every video frame in the group of target video frames has been traversed.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the method further includes: when it is monitored that the playing progress of the teaching video reaches the start timestamp corresponding to the action demonstration graph, controlling the first identifier displayed in association with the action demonstration graph and the action demonstration graph itself to be highlighted, so as to prompt the user that the time to perform the target action shown by the action demonstration graph has arrived.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the method further includes: and displaying the preset special effect when the target action performed by the user is determined to meet the preset condition.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the preset condition includes at least one of the following: the time at which the target action is performed falls between the start timestamp and the end timestamp corresponding to the action demonstration graph; the matching degree between the target action performed by the user and the target action shown in the action demonstration graph reaches a matching degree threshold.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the method further includes: scoring the associated action part according to the target action performed by the user to obtain a scoring result corresponding to the action part; and prompting the action part with the scoring result lower than the result threshold value.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the scoring the associated action part for the target action performed by the user to obtain a scoring result corresponding to the action part includes: carrying out similarity calculation on the image of the action part and the image containing the standard action; and determining a scoring result corresponding to the action part according to the calculated similarity.
According to one or more embodiments of the present disclosure, in the display method provided by the present disclosure, optionally, the method further includes: displaying a target action wireframe flow in the first region to enable a user to perform the target action with reference to the target action wireframe flow.
According to one or more embodiments of the present disclosure, there is provided a display device including: the receiving module is used for receiving preset trigger operation; and the first display module is used for responding to the preset trigger operation and displaying a target page, wherein the target page comprises a first area, a second area and a third area, a real-time picture of a user is displayed in the first area, a picture of a teaching video is displayed in the second area, and an action demonstration graph matched with the teaching video is displayed in the third area.
According to one or more embodiments of the present disclosure, in the display device provided by the present disclosure, further comprising: the extraction module is used for extracting at least one group of target video frames from the teaching video based on preset logic before the target page is displayed; and the acquisition module is used for acquiring the at least one action demonstration graph according to the at least one group of target video frames.
According to one or more embodiments of the present disclosure, in a display device provided by the present disclosure, the extraction module includes: the identification unit is used for identifying an action starting mark and an action ending mark which are marked in the teaching video in advance; a first determining unit, configured to determine a first timestamp corresponding to the action start marker and a second timestamp corresponding to the action end marker; the extracting unit is used for extracting a plurality of video frames according to a set frequency from the teaching video between the first time stamp and the second time stamp; a second determining unit for determining the plurality of video frames as a set of target video frames.
In accordance with one or more embodiments of the present disclosure, in a display apparatus provided by the present disclosure, the set of target video frames includes a plurality of video frames, and the obtaining module includes: a detection unit, configured to perform target detection on the Nth video frame in the group of target video frames through a detection model to obtain a detection result; a first segmentation unit, configured to perform target segmentation through a segmentation model based on the detection result to obtain target detection results corresponding to the Nth video frame; a fusion unit, configured to fuse the target detection results to obtain an Nth target segmentation result corresponding to the Nth video frame; a second segmentation unit, configured to input the Nth target segmentation result and the (N+1)th video frame in the set of target video frames into a video segmentation model to obtain an (N+1)th target segmentation result corresponding to the (N+1)th video frame, where the Nth video frame is the previous video frame adjacent to the (N+1)th video frame, N is a natural number greater than or equal to 1, and N+1 is less than or equal to the number of video frames included in the group of target video frames; a first determining unit, configured to determine a final target segmentation result corresponding to the (N+1)th video frame according to the Nth target segmentation result and the (N+1)th target segmentation result; and a second determining unit, configured to determine the final target segmentation result as the action demonstration graph corresponding to the set of target video frames.
According to one or more embodiments of the present disclosure, in a display device provided by the present disclosure, the first determining unit includes: a first determining subunit, configured to determine an intersection ratio between the (N+1)th target segmentation result and the Nth target segmentation result; a second determining subunit, configured to determine the (N+1)th target segmentation result as the final target segmentation result if the intersection ratio is greater than or equal to a preset threshold; and an updating subunit, configured to, if the intersection ratio is smaller than the preset threshold, take the (N+1)th video frame as the new Nth video frame and return to the operation of performing target detection on the Nth video frame in the group of target video frames through the detection model to obtain a detection result, until every video frame in the group of target video frames has been traversed.
According to one or more embodiments of the present disclosure, the display device provided by the present disclosure further includes: a control module, configured to, when it is monitored that the playing progress of the teaching video reaches the start timestamp corresponding to the action demonstration graph, control the first identifier displayed in association with the action demonstration graph and the action demonstration graph itself to be highlighted, so as to prompt the user that the time to perform the target action shown by the action demonstration graph has arrived.
According to one or more embodiments of the present disclosure, in the display device provided by the present disclosure, further comprising: and the display module is used for displaying the preset special effect when the target action performed by the user is determined to meet the preset condition.
According to one or more embodiments of the present disclosure, in a display device provided by the present disclosure, the preset condition includes at least one of: the time for carrying out the target action is between the starting time stamp and the ending time stamp corresponding to the action demonstration graph; the matching degree between the target action and the target action shown in the action demonstration graph reaches a matching degree threshold value.
According to one or more embodiments of the present disclosure, in the display device provided by the present disclosure, further comprising: the scoring module is used for scoring the associated action part according to the target action performed by the user to obtain a scoring result corresponding to the action part; and the prompting module is used for prompting the action part of which the scoring result is lower than the result threshold value.
According to one or more embodiments of the present disclosure, in a display device provided by the present disclosure, the scoring module includes: a calculation unit for performing similarity calculation between the image of the action part and an image including a standard action; and the determining unit is used for determining the scoring result corresponding to the action part according to the calculated similarity.
According to one or more embodiments of the present disclosure, in the display device provided by the present disclosure, further comprising: and the second display module is used for displaying a target action wire frame flow in the first area so that the user can carry out the target action by referring to the target action wire frame flow.
In accordance with one or more embodiments of the present disclosure, there is provided an electronic device including:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the display methods provided by the present disclosure.
According to one or more embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a display method as any one of those provided by the present disclosure.
Embodiments of the present disclosure also provide a computer program product comprising a computer program or instructions which, when executed by a processor, implement the display method as described above.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other technical solutions formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.