Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be embodied in many different forms and should not be construed as limited to the examples set forth herein, but rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the exemplary embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the application may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the application.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
It should be noted that the term "plurality" as used herein means two or more. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A alone, both A and B, and B alone. The character "/" generally indicates an "or" relationship between the associated objects.
It can be understood that, before and during the collection of relevant data of the user (such as user data for user features), a prompt interface or a pop-up window may be displayed to inform the user that relevant data is currently being collected. The application only starts to execute the step of acquiring the relevant data of the user after acquiring the user's confirmation operation on the prompt interface or the pop-up window; otherwise (i.e., when the user's confirmation operation on the prompt interface or the pop-up window is not acquired), the step of acquiring the relevant data of the user is ended, i.e., the relevant data of the user is not acquired. In other words, all user data collected by the present application is collected with the consent and authorization of the user, and the collection, use, and processing of relevant user data comply with the relevant laws, regulations, and standards of the relevant countries and regions.
Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solution of an embodiment of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include a terminal device 101, a game server 102, a live server 103, a terminal device 104, and a network 105. The terminal device 101 and the terminal device 104 are used by users and may be electronic devices with a live video function or a live video watching function. Specifically, the terminal device 101 and the terminal device 104 may be electronic devices such as a cellular phone, a tablet computer, a wearable device, a PC (Personal Computer), and the like.
Optionally, a live client may be installed and run on the terminal device 101 and the terminal device 104, where the live client refers to a client used by a user to host live video or watch live video, and the live client may have functions of collecting, publishing, downloading, and playing live video. In the exemplary embodiment, the terminal device 101 and the terminal device 104 are provided with display screens for displaying live video and/or audio collection devices for collecting audio data of the anchor.
In a possible implementation manner, assuming that the terminal device 101 is a client for streaming a live game and the terminal device 104 is a client for watching the live game, a game client may be installed on the terminal device 101 in addition to the live client. The game client may be, for example, a shooting game, a first-person shooting game (FPS), a third-person shooting game, a multiplayer battle survival game, a multiplayer online tactical competition game (Multiplayer Online Battle Arena, MOBA), a role-playing game, a real-time strategy game, a racing game, a music game, and the like. In addition, if the game server 102 is capable of providing other services such as a virtual reality service, a three-dimensional map service, a simulation service, etc., the terminal device 101 may also install a corresponding application program for live video broadcasting.
The game server 102 is a server that provides game services, and the game server 102 may include one server or a plurality of servers. Alternatively, the game server 102 may be a server that provides a cloud game.
The live broadcast server 103 is a server for providing background services for live broadcast clients, and the live broadcast server 103 may include one server or may include multiple servers, and optionally, the live broadcast server 103 may be a cloud server.
As shown in fig. 1, the terminal device 101, the game server 102, the live broadcast server 103, and the terminal device 104 can communicate with each other via a network 105. For example, the terminal device 101 and the game server 102 may communicate with each other via a wired network or a wireless network, the terminal device 101 and the live broadcast server 103 may communicate with each other via a wired network or a wireless network, the terminal device 104 and the live broadcast server 103 may communicate with each other via a wired network or a wireless network, and the game server 102 and the live broadcast server 103 may communicate with each other via a wired network or a wireless network.
In one embodiment of the present application, the terminal device 101 is a client for streaming a live game, and the terminal device 104 is a client for viewing the live game. The terminal device 101 can log in to the game server 102 with a first account to play the game, and can log in to the live broadcast server 103 with a second account to live-stream the game in progress, and the terminal device 104 can acquire the live game video of the terminal device 101 from the live broadcast server 103 and provide it for the user to watch.
Specifically, the terminal device 101 may create a live room through the live server 103 and stream the live game based on the live room. The terminal device 104 can select and enter the live room corresponding to the terminal device 101 through the live room list provided by the live server 103 and watch the corresponding game live video. Since the live room corresponding to the terminal device 101 may already have been broadcasting for a period of time when the terminal device 104 joins, the user of the terminal device 104 may need to know the content that has already been broadcast. One live broadcast scheme is to wait until the current live broadcast is finished and the anchor releases a playback video, and then select the released playback video to view the live playback. Another live broadcast scheme is, when entering the online live broadcast, to drag the progress bar to watch the already-broadcast content and then switch back to the current live progress through an operation.
In order to prevent the user of the terminal device 104 from falling behind the current live progress, in one embodiment of the present application, after detecting a request sent by the terminal device 104 to join the live room, the live server 103 may determine the live content missed by the user of the terminal device 104 according to the broadcast duration of the live room and the identification information of the user corresponding to the terminal device 104 (the identification information may be the user identification information of the live client installed on the terminal device 104). The live server 103 then selects at least one video segment from the live content missed by the user of the terminal device 104 (for example, a video segment matching the user features of the user may be selected), generates a video highlight clip from the selected video segments, and sends the video highlight clip to the terminal device 104 for playing. In this way, the user of the terminal device 104 can efficiently learn about the already-broadcast content of the live room by viewing the video highlight clip, and can thus quickly catch up with the live progress, which improves the acceptability of the live content and also improves the user stickiness of the live platform.
For example, when the terminal device 104 finishes playing the video highlight reel or the user selects to skip playing the video highlight reel, the live content of the live room may be played.
The implementation details of the technical scheme of the embodiment of the application are described in detail below:
Fig. 2 illustrates a flowchart of a method of processing live video according to an embodiment of the present application. The method may be performed by a live server, which may be the live server 103 illustrated in fig. 1. Referring to fig. 2, the processing method of the live video includes at least steps S210 to S240, which are described in detail as follows:
in step S210, a join request for the target live broadcast room is received, where the join request includes identification information of a user requesting to join the target live broadcast room.
In one embodiment of the application, when a user needs to enter a live room to watch a live broadcast, the user may log in to the live client and select a target live room through the live room list displayed by the live client. The live client may then send a joining request to the live server based on the target live room selected by the user, and add the identification information of the user (such as account information of the user in the live application program) and the information of the target live room to the joining request.
Optionally, the target live room may be a live room in which a live broadcast is in progress, and the broadcast content may be game play, video commentary, exhibition activities, and the like. The joining request may be triggered when the user first enters the target live room, or may be triggered when the user exits midway and then enters the live room again.
In step S220, the live broadcast content missed by the user is determined according to the broadcast duration of the target live broadcast room and the identification information of the user.
In one embodiment of the application, because a record of the joining time point is associated with the user's identification information when the user joins the target live room, the time point at which the user joined the target live room can be obtained according to the identification information of the user. The live time period missed by the user is then determined according to the broadcast duration of the target live room and the time point at which the user joined, and the live content missed by the user can further be determined based on the missed live time period.
For example, a live broadcast room is started from 9:00, and a user joins the live broadcast room at 10:15, and then the user misses a live broadcast period between 9:00 and 10:15, i.e. misses live broadcast content between 9:00 and 10:15. Assuming that the user exits the live room at 10:30 after joining the live room at 10:15 and joins the live room at 10:55, the live time period that the user missed this time is 10:30-10:55, i.e. the live content between 10:30-10:55 was missed.
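As an illustrative, non-authoritative sketch of step S220, the missed live time periods could be computed from the room's start time, the current time, and the user's previously watched intervals looked up by the user's identification information. The function name and data layout below are assumptions made only for this example and are not mandated by the embodiment.

```python
from datetime import datetime

def missed_periods(stream_start, now, watch_intervals):
    """Return the time periods of the target live room that the user has missed.

    stream_start    -- datetime when the live room started broadcasting
    now             -- current datetime (end of the broadcast duration so far)
    watch_intervals -- list of (join_time, exit_time) datetimes already watched,
                       looked up by the user's identification information
    """
    missed = []
    cursor = stream_start
    for join_time, exit_time in sorted(watch_intervals):
        if join_time > cursor:
            missed.append((cursor, join_time))   # gap before this viewing interval
        cursor = max(cursor, exit_time)
    if cursor < now:
        missed.append((cursor, now))             # gap up to the present moment
    return missed

# The example from the text: the room starts at 9:00, the user joins at 10:15,
# watches until 10:30, and rejoins at 10:55.
fmt = "%Y-%m-%d %H:%M"
day = "2024-01-01 "
periods = missed_periods(
    datetime.strptime(day + "09:00", fmt),
    datetime.strptime(day + "10:55", fmt),
    [(datetime.strptime(day + "10:15", fmt), datetime.strptime(day + "10:30", fmt))],
)
# periods -> [(09:00, 10:15), (10:30, 10:55)]
```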
In step S230, at least one video clip is selected from live content missed by the user.
In one embodiment of the present application, at least one video clip selected from live broadcast content missed by a user may be selected randomly, or may be selected separately for each live broadcast period, or may be selected according to a user feature, or may be selected according to a hotness value of the video clip.
In particular, if the selection is made according to a user feature, the video clip may be marked with an attribute tag representing the type of the video clip; for a game video, for example, a battle tag, a tag of a game hero, a tag of a hero type (such as a mage hero, a support hero, etc.), and so on. A matching degree, which represents the user's preference, may then be calculated from the attribute tag and the user feature. For example, if the user is interested in a certain game hero or hero type, a video clip having that game hero tag or that hero type tag is a video clip matching the user feature of the user. The technical scheme of this embodiment can improve the accuracy of the selected video clips and helps realize personalized selection.
If selected according to the hotness value of the video clips, at least one video clip may be selected in order of high-to-low hotness value.
Optionally, semantic analysis and/or content identification processing can be performed on the video segments of the broadcast content of the target live room to obtain attribute tags of the video segments. The semantic analysis and the content identification are carried out by analyzing and processing the video segments by means of artificial intelligence technology, in particular computer vision technology.
Specifically, Artificial Intelligence (AI) technology is a theory, method, technique, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a way similar to human intelligence. Artificial intelligence research concerns the design principles and implementation methods of various intelligent machines, so that the machines have the functions of sensing, reasoning, and decision-making. Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision technology, speech processing technology, natural language processing technology, machine learning/deep learning, and other directions.
Computer Vision (CV) technology is a science that studies how to make a machine "see"; more specifically, it replaces human eyes with cameras and computers to perform machine vision tasks such as recognition and measurement on a target, and further performs graphic processing so that the computer produces an image more suitable for human eyes to observe or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR (Optical Character Recognition), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, simultaneous localization and mapping, etc., as well as common biometric recognition techniques such as face recognition, fingerprint recognition, and the like.
In one embodiment of the present application, after determining the attribute tag of a video clip, the attribute tag may be further adjusted according to the interaction data corresponding to the video clip. For example, if the viewing amount of a certain video clip is relatively large, the attribute tag of the video clip can be adjusted, for example, by marking it as a hot video.
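A minimal sketch of such an adjustment is given below; the field names and threshold values are assumptions introduced only for illustration and are not fixed by the embodiment.

```python
HOT_VIEW_THRESHOLD = 10_000       # assumed threshold; the embodiment does not fix a value
HIGH_COMMENT_THRESHOLD = 500      # assumed threshold

def adjust_tags(attribute_tags, interaction_data):
    """Return attribute tags adjusted according to the clip's interaction data."""
    tags = set(attribute_tags)
    if interaction_data.get("view_count", 0) >= HOT_VIEW_THRESHOLD:
        tags.add("hot_video")            # mark clips with a large viewing amount
    if interaction_data.get("comment_count", 0) >= HIGH_COMMENT_THRESHOLD:
        tags.add("high_interaction")
    return tags

print(adjust_tags({"game_hero_A"}, {"view_count": 25_000, "comment_count": 120}))
# -> {'game_hero_A', 'hot_video'}
```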
In one embodiment of the application, when selecting the video clips matching the user features from the live content missed by the user according to the user features, the similarity between the user features and the attribute tags of the video clips can be calculated; if the similarity is greater than a set threshold, the user features match the attribute tags of the video clip. Alternatively, the similarity may be determined by calculating a distance between the user features and the attribute tags, such as the Euclidean distance, Minkowski distance, Manhattan distance, etc.; the smaller the distance, the greater the similarity.
With continued reference to fig. 2, in step S240, a video highlight clip is generated from the selected video segments and presented to the user.
In one embodiment of the present application, the selected video segments may be synthesized, and the synthesized video segments may be used as generated video highlight segments. Alternatively, the process of presenting the video highlight clips to the user may be to send the generated video highlight clips to the viewer's live client and then to be played by the live client.
For example, in the process of presenting the video highlight clips to the user, if a skip play instruction of the video highlight clips is detected, a target video clip presented when the skip play instruction is received may be identified, and then the user characteristics of the user may be adjusted according to the attribute tag of the target video clip. Specifically, since the video highlight clips are generated by the selected video clips, when a skip play instruction of the video highlight clips is detected, the user may not be interested in the currently played video clip, so that the user characteristics of the user can be adjusted based on the skip play instruction, and the accuracy of the follow-up recommendation can be optimized.
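As a hedged illustration of this adjustment, the sketch below decays the weight of the skipped clip's attribute tags in the user's interest profile; the dictionary representation and the decay factor are assumptions for this example only.

```python
def on_skip(user_features, target_clip_tags, decay=0.8):
    """Reduce the weight of the skipped clip's attribute tags in the user features.

    user_features    -- dict mapping interest tag -> weight
    target_clip_tags -- tags of the clip being presented when skip was requested
    decay            -- multiplicative decay factor (assumed value)
    """
    adjusted = dict(user_features)
    for tag in target_clip_tags:
        if tag in adjusted:
            adjusted[tag] *= decay   # lower interest, so fewer such clips are recommended
    return adjusted

print(on_skip({"a": 0.9, "d": 0.4}, ["a"]))   # -> {'a': 0.72, 'd': 0.4}
```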
Based on the technical solution of the embodiment shown in fig. 2, in an embodiment of the present application, as shown in fig. 3, step S310 may further be included after step S220: determining the live time length missed by the user according to the live time period missed by the user, and determining whether the missed live time length is greater than a first time length.
If the live time length missed by the user is greater than the first time length, step S230 and step S240 are executed, i.e., the process of selecting video segments and generating a video highlight clip is performed. If the live time length missed by the user is less than or equal to the first time length, the real-time live content of the target live room can be presented to the user directly. According to this technical scheme, when the live time missed by the user is short, the real-time live content can be presented to the user directly. Optionally, the first time length may be, for example, half an hour, 1 hour, etc.
Based on the solutions of the embodiments shown in fig. 2 and 3, in one embodiment of the present application, the video segments may be generated by segmenting the broadcast content of the target live room, so that video segments can be selected from the generated video segments. Specifically, with a second time length as a segmentation reference, the broadcast content of the target live room is segmented to obtain a plurality of video periods, and then the video content in each video period is split to obtain a plurality of video segments corresponding to the broadcast content. For example, with 10 minutes as the segmentation reference, the broadcast content of the target live room is divided into a plurality of 10-minute video periods, and then each video period is split to obtain a plurality of video segments corresponding to the broadcast content. Optionally, the second time length may be set according to actual requirements, and may be, for example, 5 minutes, 15 minutes, 20 minutes, or the like, in addition to 10 minutes.
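A minimal sketch of this segmentation step, assuming the broadcast content is addressed by second offsets, might look as follows; the function name and the return format are illustrative only.

```python
def segment_broadcast(total_seconds, second_duration=600):
    """Cut the broadcast content into consecutive video periods.

    total_seconds   -- length of the broadcast content so far, in seconds
    second_duration -- segmentation reference (e.g. 10 minutes = 600 s)
    Returns a list of (start_second, end_second) video periods.
    """
    periods = []
    start = 0
    while start < total_seconds:
        end = min(start + second_duration, total_seconds)
        periods.append((start, end))
        start = end
    return periods

print(segment_broadcast(75 * 60))  # 75 minutes -> eight periods, the last one 5 minutes long
```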
In one embodiment of the application, when the video content in a video period is split, semantic analysis may be performed on the video content and the split made based on the semantic analysis result, or content identification processing may be performed and the split made based on the content identification result, or heat analysis may be performed and the split made based on the heat analysis result. Of course, the video content in the video period may also be split by combining two or more of semantic analysis, content identification processing, and heat analysis. For example, the content may be split into video segments of each game hero, into video segments each containing a heat peak, or into video segments of battles involving a certain number of game heroes.
After segmenting the broadcast content of the target live room, in one embodiment of the present application, the process of selecting video segments matching the user features from the live content missed by the user may be to first determine, based on the plurality of video segments corresponding to the broadcast content, at least one video period to which the live content missed by the user belongs, and then select at least one video segment from each of those video periods. That is, in order to ensure the content continuity of the generated video highlight clip, at least one video segment may be selected from each video period missed by the user for the synthesis process.
Optionally, when selecting video segments from each video period missed by the user, the video segments matching the user features may be selected from each video period depending only on the user features, the video segments with higher heat values may be selected from each video period depending only on the heat values, or the selection may combine the user features and the heat values. Specifically, if no video segment matching the user features is selected from a target video period missed by the user according to the user features, at least one video segment may be selected, according to the heat values of the video segments, from the video segments split from the target video period; i.e., one or more video segments can be selected in descending order of heat value. A rough sketch of this per-period selection is given below.
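The sketch assumes helper callables `matcher` and `hotness` supplied by the caller; these names, and the dictionary layout of periods to clips, are illustrative assumptions rather than the embodiment's required interface.

```python
def select_clips(periods_to_clips, matcher, hotness):
    """Select at least one clip from every missed video period.

    periods_to_clips -- dict: video period -> list of clips split from it
    matcher(clip)    -- returns True if the clip matches the user features
    hotness(clip)    -- returns the clip's heat value
    """
    selected = []
    for period, clips in periods_to_clips.items():
        matched = [c for c in clips if matcher(c)]
        if matched:
            selected.extend(matched)                  # personalised picks
        else:
            selected.append(max(clips, key=hotness))  # fall back to the hottest clip
    return selected
```

This keeps at least one clip per period, which is what guarantees the continuity of the resulting highlight.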
In an embodiment of the present application, a scheme for calculating a video clip heat value is also provided, specifically referring to fig. 4, including the following steps:
In step S410, the interaction data corresponding to each video clip is obtained.
In one embodiment of the application, the interaction data corresponding to a video clip may include viewing count data, comment count data, bullet-comment (barrage) data, mic-linking (co-hosting) interaction data, and the like.
Step S420, according to the interaction data corresponding to each video clip, determining the heat factor of each video clip.
In one embodiment of the application, the heat factor may be, for example, the number of viewers, the number of comments, the number of bullet comments, the duration of mic-linking interaction, the amount of gifts received, etc. If a video clip has multiple heat factors, the multiple heat factors may be aggregated (e.g., averaged or weighted-averaged) to obtain the final heat factor of the video clip.
Step S430, calculating interaction change trend corresponding to each video clip according to the heat factor of each video clip.
In one embodiment of the present application, the interaction variation trend is used to reflect the interaction changes of the live content; for example, it can be measured by the variation of the heat factor over time. Alternatively, the interaction variation trend may be represented by the natural exponential of the derivative of the heat factor.
Step S440, calculating the heat value of each video clip based on the heat factor of each video clip and the interaction change trend corresponding to each video clip.
In one embodiment of the present application, the popularity factor of the video clip and the interactive trend corresponding to the video clip may be integrated to obtain the popularity value of the video clip.
Specifically, the heat value of the video clip can be calculated by the following formula:
H = Base(P) + e^(f'(P))
Wherein H represents the heat value of the video clip, P represents the heat factor, Base(P) represents the base value of the heat factor (such as the number of viewers), and f'(P) represents the derivative of the dynamic change of the heat factor, which reflects the interaction variation trend.
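A minimal numeric sketch of this formula is shown below, assuming the heat factor is sampled per unit time and the derivative is approximated by the latest discrete difference; these sampling assumptions are illustrative only.

```python
import math

def heat_value(samples):
    """Compute H = Base(P) + e^(f'(P)) for one video clip.

    samples -- time-ordered values of the heat factor P (e.g. viewer count per
               unit time); multiple factors are assumed to have been combined
               into one series by a weighted average beforehand.
    """
    base = samples[-1]                                                    # Base(P)
    derivative = samples[-1] - samples[-2] if len(samples) > 1 else 0.0   # discrete f'(P)
    return base + math.exp(derivative)

# A clip whose viewer count is rising quickly gets a markedly higher score.
print(heat_value([100, 100, 103]))  # 103 + e^3 ~= 123.1
print(heat_value([100, 100, 100]))  # 100 + e^0 =  101.0
```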
In one embodiment of the present application, the process of generating the video highlight clips according to the selected video clips in step S240 shown in fig. 2 and 3 may be that the selected video clips are synthesized according to the time sequence of the video to obtain synthesized video clips, and then the video highlight clips with the set duration range are generated based on the synthesized video clips.
Specifically, if the duration of the synthesized video clip is within the set duration range, the synthesized video clip can be directly used as a video highlight clip.
If the duration of the synthesized video segment exceeds the set duration range and at least two video segments were selected from the same video period, video segments can be removed from those at least two video segments according to the heat value or the matching degree with the user features, until the duration of the synthesized video segment is within the set duration range; the synthesized video segment within the set duration range is then used as the video highlight clip. Alternatively, video segments with low heat values or video segments with low matching degrees with the user features may be removed.
Of course, if the duration of the synthesized video segment exceeds the set duration range, the video segment conforming to the set duration range can be directly intercepted from the synthesized video segment to serve as the video highlight segment. For example, the second half part of the synthesized video segment can be intercepted, namely, from back to front, so as to ensure that the generated video highlight segment has higher relevance with the real-time live broadcast content. Or a part of the synthesized video segment can be respectively cut from the first half part, the middle part and the second half part of the synthesized video segment to generate the video highlight segment.
If the duration of the synthesized video segment does not reach the set duration range, video segments are selected from the other video segments in the at least one video period according to the heat value until the duration of the synthesized video segment is within the set duration range, and the synthesized video segment within the set duration range is then used as the video highlight clip.
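A simplified sketch of this assembly step follows; the duration bounds and helper callables are assumptions for the example, and a fuller version would also preserve at least one clip per video period when trimming, as described above.

```python
def build_highlight(selected, duration, heat, lo=55, hi=65, pool=()):
    """Assemble clips into a highlight whose total length lies in [lo, hi] seconds.

    selected -- clips chosen per video period, in time order
    duration -- function: clip -> length in seconds
    heat     -- function: clip -> heat value
    pool     -- remaining (unselected) clips that may be appended if too short
    """
    clips = list(selected)
    total = lambda cs: sum(duration(c) for c in cs)
    # Too long: drop the coolest clips first.
    while total(clips) > hi and len(clips) > 1:
        clips.remove(min(clips, key=heat))
    # Too short: append the hottest remaining clips.
    extras = sorted(pool, key=heat, reverse=True)
    while total(clips) < lo and extras:
        clips.append(extras.pop(0))
    return clips
```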
Fig. 2 to fig. 4 are illustrations from the viewpoint of a live broadcast server, and the following details of the technical solution of the embodiment of the present application from the viewpoint of a live broadcast client are described with reference to fig. 5:
Fig. 5 shows a flow chart of a method of processing live video, which may be performed by a live client, which may be running in the terminal device 104 shown in fig. 1, according to one embodiment of the application. Referring to fig. 5, the processing method of the live video at least includes steps S510 to S540, and is described in detail as follows:
in step S510, live room information being live is displayed.
In one embodiment of the present application, the live client (specifically, the client used to watch the live broadcast) may obtain the information of the live rooms that are currently broadcasting from the live server, and then display this information (e.g., in a list form, a thumbnail form, etc.) on an interface of the live client. The live room information may include the name of the live room, profile information, live content, live time period, etc.
In step S520, if it is detected that the user enters the target live broadcast room, a join request for the target live broadcast room is sent to the server, where the join request includes identification information of the user.
In one embodiment of the application, when a user needs to enter a live room to watch a live broadcast, the user may log in to the live client and select a target live room through the live room information displayed by the live client. The live client may then send a joining request to the live server based on the target live room selected by the user, and add the identification information of the user (such as account information of the user in the live application program) and the information of the target live room to the joining request.
In step S530, a video highlight clip transmitted by the server is received, where the video highlight clip is generated according to a video clip selected from live broadcast content missed by the user.
The process of generating the video highlight clips by the server may refer to the technical solution of the foregoing embodiment, and will not be described herein.
In step S540, the video highlight clips are played.
Optionally, when the user clicks into the target live room, and after the video highlight clips are acquired, the video highlight clips can be played directly on the interface of the live client. Or the playing control of the video highlight clip can be displayed on the interface of the live client, and the video highlight clip is played after the triggering instruction of the playing control is detected.
In one embodiment of the application, in the process of playing the video highlight clip, if a skip play instruction of the video highlight clip is received, or if the video highlight clip is played, the real-time live content of the target live broadcasting room can be played. Alternatively, in this embodiment, the playing window of the video highlight reel and the playing window of the live content in real time in the live room may be the same.
In addition, in one embodiment of the application, a playing window of the video highlight clip and a playing window of the real-time live content can be displayed on an interface of the live client, and then whether to play the video highlight clip or the real-time live content is selected according to a trigger action of a user.
The technical solutions of the embodiments of the present application are described above from the perspective of the live broadcast server and the live broadcast client, respectively, and implementation details of the technical solutions of the embodiments of the present application are described in detail below with reference to fig. 6 to 14:
as shown in fig. 6, a method for processing live video according to an embodiment of the present application includes the steps of:
Step S601, after a user enters a live broadcasting room, a live broadcasting client transmits a time node and a user ID to a live broadcasting server.
Specifically, when a user enters a live broadcasting room to start watching online live broadcasting, the live broadcasting client sends a request to the live broadcasting server, wherein the request can contain time node information and a user ID. Optionally, the time node information may be a time point when the user enters the live broadcast room, so that the live broadcast server may determine a live broadcast period missed by the user according to the time point, or the time node information may also be a live broadcast period missed by the user and determined by the live broadcast client.
Step S602, the live broadcast server matches interest tags and video clip hotness under the user ID, and generates video highlights to be sent to the client.
In one embodiment of the present application, after the anchor of the live room starts broadcasting, the live server may begin to obtain the broadcast content of the live room in real time, split it into large segments (i.e., the video periods in the above embodiments) according to time, and store them in sequence as Video1, Video2, Video3, and so on. As shown in fig. 7, the time lengths of the split large segments may be the same or different, and the time length of each large segment may be set according to actual requirements.
For the video content in each large segment, an AI algorithm of semantic analysis and picture content identification can be combined, or the segmentation can be performed according to the heat distribution condition (such as a wave band peak value) of the video content, so that small segments are obtained, and the small segments are named correspondingly in sequence. As shown in fig. 8, the Video1 may be split into small segments v1.1, v1.2, v1.3, v1.4, the Video2 may be split into small segments v2.1, v2.2, v2.3, v2.4, and the Video3 may be split into small segments v3.1, v3.2, etc.
Assuming that Video1 is split according to the heat distribution, as shown in fig. 9, the heat curve of Video1 includes peaks and troughs of the heat value, and the large segment can be split with each heat-value peak as the center, so as to obtain the small segments v1.1, v1.2, v1.3, and v1.4. When the split uses an AI algorithm combining semantic analysis and picture content recognition, the large segment can be divided into a plurality of content-related small segments, such as small segments of each game hero.
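As an illustrative sketch of the heat-based split, assuming the heat curve is available as one value per second, small segments could be cut as windows centered on local peaks; the minimum peak spacing and window size are assumed parameters, not values prescribed by the embodiment.

```python
def split_at_peaks(heat_curve, min_gap=30):
    """Split a large segment into small segments centred on heat-value peaks.

    heat_curve -- per-second heat values of the large segment
    min_gap    -- minimum spacing between two peaks, in seconds (assumed)
    Returns a list of (start, end) second offsets, one per retained peak.
    """
    peaks = [
        i for i in range(1, len(heat_curve) - 1)
        if heat_curve[i] > heat_curve[i - 1] and heat_curve[i] >= heat_curve[i + 1]
    ]
    # Keep peaks that are far enough apart, then cut a window around each one.
    kept, last = [], -min_gap
    for p in peaks:
        if p - last >= min_gap:
            kept.append(p)
            last = p
    half = min_gap // 2
    return [(max(0, p - half), min(len(heat_curve), p + half)) for p in kept]
```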
After the large segments are split to obtain the small segments, semantic analysis and picture content identification can be carried out on each small segment so as to mark corresponding attribute labels on each small segment, and the small segments and the attribute labels are stored together at the live broadcast server. Specifically, as shown in fig. 10, for example, the small segment v1.1 is marked with an attribute tag a, the small segment v1.2 is marked with an attribute tag b, the small segment v1.3 is marked with an attribute tag a, the small segment v1.4 is marked with an attribute tag c, the small segment v2.1 is marked with an attribute tag d, the small segment v2.2 is marked with an attribute tag f, the small segment v2.3 is marked with an attribute tag g, the small segment v2.4 is marked with an attribute tag a, the small segment v3.1 is marked with an attribute tag b, and the small segment v3.2 is marked with an attribute tag f.
Optionally, the attribute tags may be, for example, game hero tags indicating that the small segment corresponds to a certain game hero, hero type tags indicating that the small segment corresponds to a certain hero type, or battle type tags indicating that the small segment is a team-fight video segment or a solo (1v1) video segment, etc.
In one embodiment of the present application, the heat value score of each small segment may also be obtained according to the amount of interaction per unit time of each small segment. The interaction amount can include factors such as the number of viewers, the number of comments, the number of bullet comments, and so on. Meanwhile, because live broadcast is correlated with real-time data, a variable describing the trend of change over time can also be considered when calculating the heat value score of each small segment. Alternatively, the natural exponential of the derivative of the heat factor P may be taken to represent the trend of change.
For example, in one embodiment, the heat value score for each small segment may be calculated by the formula H = Base(P) + e^(f'(P)).
Wherein P represents a heat factor, which can be the number of people watching the live broadcast, the amount of gifts sent, the number of live comments, the number of bullet comments, the duration of mic-linking interaction, and the like. When a plurality of heat factors exist, a weighted average of the heat factors can be taken, so as to mitigate the effect of operations such as screen flooding. Base(P) represents the base value of the heat factor, e.g., the number of people watching the live broadcast in real time, while f'(P) represents the derivative of the dynamic change of that number, indicating the trend of the live interaction changes. Alternatively, the above expression can be understood as giving a higher heat value when many people are watching the live broadcast, there are many interactions, and the number of viewers or interactions per unit time has been increasing rapidly in the recent past.
In one embodiment of the application, the interest tags of the user are user features, the attribute tags of a video clip are features of the video clip, and the user features and the features of the video clip are two related concepts. The initial features of the video clips can be divided manually; for example, the anchor can select corresponding classifications when starting the broadcast, or the platform can label them manually according to the content, and so on. The features of a video clip are not constant; they can be determined by weighting the viewing and interaction behavior of all users watching the video. For example, the more a user with a given interest tag interacts with a certain video clip and the longer that user watches it, the greater the influence of that interest tag on the attribute tag of the video clip, so that the attribute tag of the video clip can be adjusted accordingly.
The user features are feature vectors formed according to the preferred categories actively selected by the user and the behavior data of the user (such as following behavior, liking behavior, and the like in a live room).
In associating the user features with the features of a video clip, the user features may be taken as a feature vector x and the features of the video clip as a feature vector y, for example x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn). These two feature vectors can be regarded as two points in Euclidean space, and the Euclidean distance d(x, y) (equal to d(y, x)) between the points x and y may be represented as:

d(x, y) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2)
The Euclidean distance between two feature vectors may represent the magnitude of the correlation between the feature vectors: the smaller the value, the higher the matching degree between the two feature vectors. One or more small segments with a higher matching degree can therefore be selected from each large segment to synthesize the video highlight and present it to the user, achieving personalized, per-user highlights.
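The distance computation above can be sketched directly; the match threshold below is an assumed value for illustration, since the embodiment leaves the threshold to be set in practice.

```python
import math

def euclidean_distance(x, y):
    """d(x, y) = sqrt(sum_i (x_i - y_i)^2) between two feature vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def matches(user_vector, clip_vector, threshold=0.5):
    """A clip matches the user when the two vectors are close enough (assumed threshold)."""
    return euclidean_distance(user_vector, clip_vector) <= threshold

print(euclidean_distance((1.0, 0.0, 0.7), (0.9, 0.1, 0.6)))  # ~= 0.17, a close match
```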
Specifically, as shown in fig. 11, assuming that the interest tag of the user is a, d, e, g, by matching with the attribute tag of the video clip, the obtained matching video clip may be v1.1, v1.3, v2.1, v2.3, v2.4.
Since the large segment Video3 has no small segment matched with the interest tag of the user, and in order to ensure the continuity of the Video highlight content, at least one small segment needs to be selected from each large segment, therefore, one small segment can be selected from the Video3 according to the heat value, and the small segments which are finally selected are v1.1, v1.3, v2.1, v2.3, v2.4 and v3.2.
After the small segments are screened out, they can be assembled and synthesized into a video highlight at the live server in time order. If the duration of the synthesized video highlight is longer than the preset 1 minute (±5 seconds), the small segments with low heat values can be removed in order of heat value, so that the duration of the video highlight meets the 1 minute (±5 seconds) requirement. When removing small segments, it must still be satisfied that at least one small segment is selected within each large segment to ensure the continuity of the content. Of course, in one embodiment of the present application, if the duration of the synthesized video highlight is greater than the preset 1 minute (±5 seconds), a portion of 1 minute (±5 seconds) may also be cut from it as the final video highlight.
If the duration of the synthesized video highlight is less than the preset 1 minute (±5 seconds), other small video segments can be added in order of heat value until the synthesized video highlight meets the duration requirement. The duration requirement in this embodiment is merely an example and can be set flexibly as needed in actual application.
After the live server generates the video highlight, it may be sent to the live client.
With continued reference to fig. 6, the method further comprises the steps of:
Step S603, the live client starts playing the video highlight. Such as playing the video highlight on a live room interface of a live client.
Step S604, the live broadcast client judges whether to skip the video highlight, if not, the live broadcast progress of the current live broadcast room is played after the video highlight is played, and if so, the live broadcast progress of the current live broadcast room is directly played.
In step S605, if the live client skips the video highlight, the live server identifies the attribute tag a of the video clip played when skipped, and then reduces the weight of tag a in the interest tags under the current user ID to reduce the recommendation of such video clips.
Based on the technical solution of the embodiment shown in fig. 6, in an application scenario of the present application, as shown in fig. 12, a plurality of live rooms being live are displayed in a live list interface of a live client, and a user may enter a live room through a click operation, for example, after clicking an area 1201 in the interface, enter a live room named "national service field, i come back to be flushed". After the user enters the live broadcasting room, the live broadcasting server side automatically generates a video highlight (optionally, the video highlight is generated when the un-watched broadcast content exceeds a certain threshold value, such as 1 hour) for 1 minute (only an example) by extracting a plurality of fragments from the broadcast content according to the interest tag of the current user, and plays the video highlight through the live broadcasting client side. As shown in fig. 13, after entering the live broadcast room, the video highlight is played in a window 1301 of the live broadcast interface, and a playing progress bar 1303 may be displayed in a lower area in the window 1301. A guide button 1302 may also be displayed in the window 1301 to prompt the user to skip the video highlight by activating the guide button 1302.
After the live client plays the video highlight, as shown in fig. 14, the current live progress may be automatically played in window 1401. Of course, the user may click the guide button 1302 to skip the video highlight and directly enter the interface shown in fig. 14 to view the current live progress.
If the user exits the live room while watching the live broadcast and then enters the live room again, and the un-watched interval exceeds the threshold for generating video highlights, such as 1 hour (by way of example only), a video highlight for the newly un-watched period is also generated in real time.
According to the technical scheme of the embodiment of the application, when a user enters the live broadcasting room, the user can efficiently know the broadcasted content of the live broadcasting room according to the video highlight clips, so that the progress of video live broadcasting can be quickly followed, the acceptability of the live broadcasting content is improved, the live broadcasting watching experience of the user is improved, and meanwhile, the user viscosity of a live broadcasting platform is also improved.
The following describes an embodiment of the apparatus of the present application, which may be used to execute the method for processing live video in the foregoing embodiment of the present application. For details not disclosed in the embodiment of the apparatus of the present application, please refer to the embodiment of the method for processing live video described above.
Fig. 15 illustrates a block diagram of a live video processing apparatus according to an embodiment of the present application, which may be disposed within a live server, which may be the live server 103 illustrated in fig. 1.
Referring to fig. 15, a processing apparatus 1500 for live video according to an embodiment of the present application includes a first receiving unit 1502, a determining unit 1504, a selecting unit 1506, and a first processing unit 1508.
The first receiving unit 1502 is configured to receive a joining request for a target live room, wherein the joining request includes identification information of a user requesting to join the target live room; the determining unit 1504 is configured to determine the live content missed by the user according to the broadcast duration of the target live room and the identification information of the user; the selecting unit 1506 is configured to select at least one video segment from the live content missed by the user; and the first processing unit 1508 is configured to generate a video highlight clip from the selected video segments and present it to the user.
In some embodiments of the present application, based on the foregoing solution, the determining unit 1504 is configured to obtain, according to the identification information of the user, a time point when the user joins the target live room, and determine, according to a time period that has been broadcast by the target live room and a time point when the user joins the target live room, a live time period that has been missed by the user, so as to determine, based on the live time period that has been missed by the user, live content that has been missed by the user.
In some embodiments of the present application, based on the foregoing, the first processing unit 1508 is further configured to determine, before selecting at least one video clip from live content missed by the user, a live time length missed by the user according to a live time period missed by the user, if the live time length missed by the user is less than or equal to a first time length, present real-time live content in the target live room to the user, and if the live time length missed by the user is greater than the first time length, execute a process of selecting at least one video clip from live content missed by the user.
In some embodiments of the present application, based on the foregoing solutions, the processing apparatus 1500 for live video further includes a second processing unit configured to segment the broadcasted content of the target live room with a second duration as a segmentation reference to obtain a plurality of video time periods, and split the video content in the video time periods to obtain a plurality of video segments corresponding to the broadcasted content.
In some embodiments of the application, based on the foregoing, the second processing unit is configured to split the video content in the video period by at least one of:
Carrying out semantic analysis on the video content in the video period to split the video content in the video period based on a semantic analysis result;
Performing content identification processing on the video content in the video period to split the video content in the video period based on a content identification result;
And performing heat analysis on the video content in the video period to split the video content in the video period based on the heat analysis result.
In some embodiments of the present application, based on the foregoing solution, the selecting unit 1506 is configured to determine at least one video period to which the live content missed by the user belongs based on a plurality of video segments corresponding to the played content, and select at least one video segment from the at least one video period respectively.
In some embodiments of the present application, based on the foregoing solution, the selecting unit 1506 is configured to select, according to the user characteristics of the user, video segments matching the user characteristics from the at least one video period, respectively, and select, if, according to the user characteristics, no video segment matching the user characteristics from the target video period in the at least one video period, at least one video segment from the video segments split from the target video period according to the hotness value of the video segment.
In some embodiments of the present application, based on the foregoing solutions, the processing apparatus 1500 for live video further includes a third processing unit configured to obtain interaction data corresponding to each video segment, determine a heat factor of each video segment according to the interaction data corresponding to each video segment, calculate an interaction variation trend corresponding to each video segment according to the heat factor of each video segment, and calculate a heat value of each video segment based on the heat factor of each video segment and the interaction variation trend corresponding to each video segment.
In some embodiments of the present application, based on the foregoing scheme, the first processing unit 1508 is configured to perform a synthesis process on the selected video segments according to a time sequence of the video to obtain synthesized video segments, and generate, based on the synthesized video segments, video highlight clips having a set duration range.
In some embodiments of the present application, based on the foregoing solution, the first processing unit 1508 is configured to, if the duration of the synthesized video clip exceeds the set duration range and at least two video clips are selected from the same video period, reject video clips from the at least two video clips according to a heat value or a matching degree with the user feature until the duration of the synthesized video clip is within the set duration range, and use the synthesized video clip within the set duration range as the video highlight clip.
In some embodiments of the present application, based on the foregoing, the first processing unit 1508 is configured to intercept, from the synthesized video segments, video segments that conform to the set duration range as the video highlight segments if the duration of the synthesized video segments exceeds the set duration range.
In some embodiments of the present application, based on the foregoing solution, the first processing unit 1508 is configured to select, if the duration of the synthesized video segment does not reach the set duration range, a video segment from other video segments in the at least one video period according to a heat value until the duration of the synthesized video segment is within the set duration range, and use the synthesized video segment within the set duration range as the video highlight segment.
In some embodiments of the present application, based on the foregoing, the first processing unit 1508 is further configured to identify, during presentation of the video highlight reel to the user, a target video reel presented when the skip play instruction is received if a skip play instruction for the video highlight reel is detected, and adjust a user characteristic of the user according to an attribute tag of the target video reel.
In some embodiments of the present application, based on the foregoing, the first processing unit 1508 is further configured to perform semantic analysis and/or content identification processing on the video segment of the target live broadcast content to obtain an attribute tag of the video segment, where the attribute tag is used to calculate a degree of matching with the user feature.
In some embodiments of the present application, based on the foregoing, the first processing unit 1508 is further configured to adjust an attribute tag of the video clip according to the interaction data corresponding to the video clip.
Fig. 16 shows a block diagram of a live video processing apparatus according to an embodiment of the application, which may be provided within a live client, which may be the terminal device 104 shown in fig. 1.
Referring to fig. 16, a processing apparatus 1600 for live video according to an embodiment of the present application includes a display unit 1602, a transmission unit 1604, a second reception unit 1606, and a play unit 1608.
The display unit 1602 is configured to display information of live rooms that are currently broadcasting; the sending unit 1604 is configured to send a joining request for a target live room to the server side if it is detected that a user enters the target live room, the joining request including identification information of the user; the second receiving unit 1606 is configured to receive a video highlight clip transmitted by the server side, the video highlight clip being generated from video segments selected from live content missed by the user; and the playing unit 1608 is configured to play the video highlight clip.
In some embodiments of the present application, based on the foregoing, the playing unit 1608 is further configured to play the real-time live content of the target live room if a skip play command for the video highlight reel is received, or if playing of the video highlight reel is completed.
Fig. 17 shows a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.
It should be noted that, the computer system 1700 of the electronic device shown in fig. 17 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
As shown in fig. 17, the computer system 1700 includes a central processing unit (Central Processing Unit, CPU) 1701, which can perform various appropriate actions and processes, such as performing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 1702 or a program loaded from a storage portion 1708 into a random access Memory (Random Access Memory, RAM) 1703. In the RAM 1703, various programs and data required for system operation are also stored. The CPU 1701, ROM 1702, and RAM 1703 are connected to each other through a bus 1704. An Input/Output (I/O) interface 1705 is also connected to the bus 1704.
Connected to the I/O interface 1705 are: an input portion 1706 including a keyboard, a mouse, and the like; an output portion 1707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage portion 1708 including a hard disk and the like; and a communication portion 1709 including a network interface card such as a LAN (Local Area Network) card, a modem, and the like. The communication portion 1709 performs communication processing via a network such as the Internet. A drive 1710 is also connected to the I/O interface 1705 as needed. A removable medium 1711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1710 as needed, so that a computer program read therefrom is installed into the storage portion 1708 as needed.
In particular, according to embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network via the communication portion 1709, and/or installed from the removable medium 1711. When the computer program is executed by the Central Processing Unit (CPU) 1701, it performs the various functions defined in the system of the present application.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of a computer readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying a computer readable program therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. A computer program embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to a wireless medium, a wired medium, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. Each block in the flowcharts or block diagrams may represent a module, a segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams or flowchart illustrations, and combinations of blocks in the block diagrams or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software or by hardware, and the described units may also be provided in a processor. The names of the units do not, in some cases, constitute a limitation of the units themselves.
As another aspect, the present application also provides a computer-readable medium that may be included in the electronic device described in the above embodiment, or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the above embodiments.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to cause a computing device (which may be a personal computer, a server, a touch terminal, a network device, etc.) to perform the methods according to the embodiments of the present application.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.