Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
In the present disclosure, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
On the internet, the rise of a network hot topic attracts a great deal of attention and brings with it a large amount of derivative content, which in turn can boost the hot topic a second time. Many video content operators therefore rely on network hot topics to promote related videos: topics related to a hot spot attract users to click and browse, which directs traffic to the video website.
In the traditional approach, an operator must first be familiar with the content of all or most of the media asset videos, then manually check hot spots on network platforms to understand their general content, and finally select related media asset videos according to personal understanding and subjective impression. This manual approach is time-consuming and labor-intensive, requires the personnel involved to know most of the media asset content, depends heavily on manual experience, and yields low accuracy in associating the selected videos with hot spots.
To solve these problems, the invention provides a method and a device for pushing media asset videos, a storage medium, and electronic equipment, which improve the efficiency and accuracy of pushing the media asset videos to be pushed.
The invention is operational with numerous general-purpose or special-purpose computing environments or configurations, for example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor devices, distributed computing environments that include any of the above devices, and the like.
The method provided by the embodiment of the invention may be executed by a computer terminal or a server. Referring to fig. 1, a flowchart of a media asset video pushing method provided by the embodiment of the invention is shown, and the method is described as follows:
S101: Determine a network hot topic, and extract hotspot key information of the network hot topic from the network hot topic.
In the method provided by the embodiment of the invention, the hot spot key information comprises hot spot visual characteristics, hot spot text information and hot spot key word labels.
When determining network hot topics, web crawler technology can be used to crawl real-time network hot topics from the major network platforms, and the crawled topics are screened to obtain the required network hot topics. One or more network hot topics may be determined; each network hot topic is associated with media asset videos, and the process of pushing the associated media asset videos is the same for each topic.
Referring to fig. 2, a flowchart of a method for extracting hotspot key information of a network hot topic according to an embodiment of the present invention is shown, and the method is described as follows:
S201: Acquire each hotspot picture of the network hot topic, and extract the hotspot visual features of the network hot topic from the hotspot pictures.
The media data of a network hot topic comprise pictures and videos related to the topic. The pictures are used directly as hotspot pictures. Each video is subjected to frame extraction to obtain a plurality of picture frames, and each picture frame is also used as a hotspot picture. Feature extraction is then performed on each hotspot picture to obtain feature vectors, and these feature vectors are determined as the hotspot visual features of the network hot topic.
It should be noted that feature extraction may be performed with a feature model constructed in advance using an image feature extraction algorithm, where the image feature extraction algorithm includes, but is not limited to, a convolutional neural network and a feature pyramid network (Feature Pyramid Networks, FPN).
S202: Acquire the text content of the network hot topic, and determine the text content as the hotspot text information of the network hot topic.
Text content includes, but is not limited to, titles and specific content of web hot topics.
S203: Perform word segmentation on the text content to obtain word segments, select keywords from the word segments, and construct the hotspot keyword labels of the network hot topic based on the selected keywords.
Word segmentation can be performed on the text content using natural language processing technology to obtain the word segments, and the part of speech of each word segment is determined, where the parts of speech include, but are not limited to, IP, star, character name and other content. Keywords are then selected from the word segments according to the part of speech of each word segment and a keyword library; one or more keywords may be selected.
After the keywords are selected, the hotspot keyword labels of the network hot topic can be constructed.
S204: Obtain the hotspot key information of the network hot topic based on the hotspot visual features, the hotspot text information and the hotspot keyword labels.
The hotspot visual features, the hotspot text information and the hotspot keyword labels can be combined into the hotspot key information of the network hotspot topic.
Processing the network hot topic yields its hotspot visual features, hotspot text information and hotspot keyword labels. These are the topic feature data of the network hot topic, and using them as the basis for selecting and associating media asset videos improves the accuracy of the selected media asset videos.
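The keyword selection in S203 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Token` type, the `KEYWORD_POS` set and the rule that a word segment must both belong to a relevant category and appear in the keyword library are assumptions, since the description does not state how the part of speech and the keyword library are combined. Word segmentation itself is assumed to have been done already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    pos: str  # hypothetical category label, e.g. "IP", "star", "character", "other"


# Assumption: only these categories are treated as candidate keywords.
KEYWORD_POS = {"IP", "star", "character"}


def build_hotspot_keyword_labels(tokens, keyword_library):
    """Keep word segments with a relevant category that also occur in the library."""
    labels = []
    seen = set()
    for token in tokens:
        relevant = token.pos in KEYWORD_POS and token.text in keyword_library
        if relevant and token.text not in seen:
            seen.add(token.text)          # avoid duplicate labels
            labels.append(token.text)     # preserve first-seen order
    return labels


tokens = [
    Token("Example Drama", "IP"),
    Token("Actor A", "star"),
    Token("trending", "other"),
    Token("Actor A", "star"),
    Token("Actor B", "star"),
]
keyword_library = {"Example Drama", "Actor A"}
hot_labels = build_hotspot_keyword_labels(tokens, keyword_library)
```

Here "Actor B" is dropped because it is not in the hypothetical keyword library, and "trending" is dropped because its category is not a candidate category.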
S102: Determine each media asset video to be pushed, and extract the media asset key information of each media asset video.
In the method provided by the embodiment of the invention, the media asset key information comprises media asset visual features, media asset text information and media asset keyword labels.
The extraction process is the same for each media asset video. Referring to fig. 3, a flowchart of a method for extracting the media asset key information of a media asset video provided by an embodiment of the present invention is shown, and the method is described as follows:
S301: Perform frame extraction on the media asset video to obtain the video frames of the media asset video.
S302: Perform feature extraction on each video frame to obtain the feature vectors of the media asset video, and determine these feature vectors as the media asset visual features of the media asset video.
When feature extraction processing is performed on each video frame, each video frame may be input into the feature model in S201 to be processed, so as to obtain each feature vector of the media video, and each feature vector is determined to be a media visual feature of the media video.
S303: Call a preset computer vision model to process each video frame to obtain attribute labels.
The computer vision model may be constructed using computer vision techniques such as face recognition, optical character recognition (Optical Character Recognition, OCR) and action recognition.
Each video frame is input into the computer vision model, so that the computer vision model processes each video frame and outputs each attribute tag, wherein the attribute tags comprise, but are not limited to, star tags, action tags, emotion tags, text tags and the like.
S304: Determine the cataloging information of the media asset video, extract cataloging labels from the cataloging information, and determine the attribute labels and the cataloging labels as the media asset keyword labels of the media asset video.
The cataloging information contains basic information about the media asset video, such as its name, actors and director, and is used to supplement the labels of the media asset video.
The cataloging labels extracted from the cataloging information can be understood as supplemental labels of the media asset video, including but not limited to star labels, director labels, and the like. If a media asset video has no cataloging information, the attribute labels alone are determined as the media asset keyword labels of the media asset video.
S305: Determine the cataloging information and the text labels among the attribute labels as the media asset text information of the media asset video.
In the method provided by the embodiment of the invention, when the media asset video has no cataloging information, the text labels among the attribute labels are directly determined as the media asset text information of the media asset video.
S306: Obtain the media asset key information of the media asset video based on the media asset visual features, the media asset keyword labels and the media asset text information of the media asset video.
In the method provided by the embodiment of the invention, the media asset key information of a media asset video comprises the media asset visual features, the media asset keyword labels and the media asset text information. The media asset key information is used to screen the media asset videos, so that the videos most closely related to the network hot topic can be selected from among them.
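The fallback behaviour of S304 and S305, with and without cataloging information, can be sketched as follows. The `AttributeLabel` type, the function names and the choice to join text with spaces are illustrative assumptions; the attribute labels are assumed to have been produced already by the computer vision model of S303.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeLabel:
    kind: str   # e.g. "star", "action", "emotion", "text"
    value: str


def build_keyword_labels(attribute_labels, catalog_labels=None):
    """S304: attribute labels plus cataloging labels, or attribute labels alone."""
    labels = [label.value for label in attribute_labels]
    if catalog_labels:
        labels.extend(catalog_labels)
    return list(dict.fromkeys(labels))  # drop duplicates, keep order


def build_text_info(attribute_labels, catalog_text=None):
    """S305: cataloging information plus text labels, or text labels alone."""
    parts = [label.value for label in attribute_labels if label.kind == "text"]
    if catalog_text:
        parts.insert(0, catalog_text)
    return " ".join(parts)


attrs = [
    AttributeLabel("star", "Actor A"),
    AttributeLabel("action", "running"),
    AttributeLabel("text", "Episode 1"),
]
with_catalog = build_keyword_labels(attrs, ["Actor A", "Director B"])
without_catalog = build_keyword_labels(attrs)
text_with_catalog = build_text_info(attrs, "Example Drama, starring Actor A")
text_without_catalog = build_text_info(attrs)
```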
S103: Process the media asset key information of each media asset video and the hotspot key information of the network hot topic to obtain a matching score set of each media asset video with respect to the network hot topic.
In the method provided by the embodiment of the invention, the matching score set comprises a feature matching score, a text similarity score and a label similarity score.
The process of determining the matching score set is the same for each media asset video, and is as follows:
Determine the cosine distance between the media asset visual features of the media asset video and the hotspot visual features of the network hot topic, and obtain the feature matching score of the media asset video based on the cosine distance;
Match the media asset keyword labels of the media asset video against the hotspot keyword labels of the network hot topic to obtain a matching similarity, and obtain the label similarity score of the media asset video based on the matching similarity;
Process the media asset text information of the media asset video and the hotspot text information of the network hot topic using a preset text similarity algorithm to obtain the text similarity score of the media asset video;
Obtain the matching score set of the media asset video based on its feature matching score, label similarity score and text similarity score.
In the method provided by the embodiment of the invention, determining the cosine distance between the media asset visual features and the hotspot visual features requires processing each feature vector in the media asset visual features together with each feature vector in the hotspot visual features. Preferably, a first cosine distance is determined between each feature vector in the media asset visual features and each feature vector in the hotspot visual features, and the average of the first cosine distances is taken as the cosine distance. Different cosine distances correspond to different feature matching scores.
Preferably, the number of labels that appear in both the media asset keyword labels and the hotspot keyword labels can be determined, and the matching similarity is determined based on this number; different matching similarities correspond to different label similarity scores.
Preferably, the text similarity algorithm includes, but is not limited to, a fuzzy matching algorithm of text, a doc2vec algorithm, and the like.
Further, the matching score set of the media asset video is a set of scores representing the relevance between the media asset video and the network hot topic.
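The three scores of S103 can be sketched as follows. Several points are assumptions made for illustration only: the cosine value is computed as cosine similarity and used directly as the feature matching score, the count of shared labels is used directly as the label similarity score, and `difflib.SequenceMatcher` from the Python standard library stands in for the fuzzy text matching algorithm. The description leaves the mapping from distance or similarity to score open.

```python
import math
from difflib import SequenceMatcher


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def feature_matching_score(asset_vectors, hotspot_vectors):
    """Average of the pairwise cosine values between all asset and hotspot vectors."""
    pairs = [cosine(a, h) for a in asset_vectors for h in hotspot_vectors]
    return sum(pairs) / len(pairs) if pairs else 0.0


def label_similarity_score(asset_labels, hotspot_labels):
    """Number of labels shared by the media asset video and the hot topic."""
    return len(set(asset_labels) & set(hotspot_labels))


def text_similarity_score(asset_text, hotspot_text):
    """Stand-in fuzzy match: ratio in [0, 1] from difflib."""
    return SequenceMatcher(None, asset_text, hotspot_text).ratio()


def matching_score_set(asset, hotspot):
    return {
        "feature": feature_matching_score(asset["features"], hotspot["features"]),
        "label": label_similarity_score(asset["labels"], hotspot["labels"]),
        "text": text_similarity_score(asset["text"], hotspot["text"]),
    }


asset = {
    "features": [[1.0, 0.0], [0.0, 1.0]],
    "labels": ["Actor A", "running"],
    "text": "Actor A running scene",
}
hotspot = {
    "features": [[1.0, 0.0]],
    "labels": ["Actor A", "Example Drama"],
    "text": "Actor A in Example Drama",
}
score_set = matching_score_set(asset, hotspot)
```

In this toy input one asset vector is parallel to the hotspot vector and the other is orthogonal to it, so the averaged feature score is 0.5, and exactly one label is shared.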
S104: Screen the media asset videos based on the matching score set of each media asset video to obtain screening videos.
In the method provided by the embodiment of the invention, each media resource video can be screened according to the matching score set, so that the number of media resource videos to be processed subsequently is reduced, the workload is reduced, and the working efficiency is improved.
By way of example, the process of screening the media asset videos provided by the embodiment of the present invention is as follows:
Sort the media asset videos in descending order of feature matching score to obtain a first video sequence, select media asset videos from the first video sequence until the number of selected videos equals the preset first video number, and determine the selected videos as first media asset videos;
Sort the media asset videos in descending order of text similarity score to obtain a second video sequence, select media asset videos from the second video sequence until the number of selected videos equals the preset second video number, and determine the selected videos as second media asset videos;
Sort the media asset videos in descending order of label similarity score to obtain a third video sequence, select media asset videos from the third video sequence until the number of selected videos equals the preset third video number, and determine the selected videos as third media asset videos;
Merge the first media asset videos, the second media asset videos and the third media asset videos to obtain the screening videos.
It should be noted that the first, second and third media asset videos may overlap; a video that appears more than once is kept only once, so that the resulting screening videos are all distinct. Preferably, the first video number, the second video number and the third video number can be set according to actual requirements, and can be the same or different.
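The screening procedure above amounts to taking the top entries of three rankings and merging them without duplicates. A minimal sketch, assuming each matching score set is a dictionary with the hypothetical keys `feature`, `text` and `label`:

```python
def screen_videos(score_sets, n_feature, n_text, n_label):
    """Merge the top-N videos of each ranking into one duplicate-free list.

    score_sets maps a video id to {"feature": ..., "text": ..., "label": ...}.
    """
    def top(key, count):
        ranked = sorted(score_sets, key=lambda vid: score_sets[vid][key], reverse=True)
        return ranked[:count]

    merged = []
    for vid in top("feature", n_feature) + top("text", n_text) + top("label", n_label):
        if vid not in merged:  # a video selected by several rankings is kept once
            merged.append(vid)
    return merged


score_sets = {
    "v1": {"feature": 0.9, "text": 0.2, "label": 0},
    "v2": {"feature": 0.5, "text": 0.8, "label": 2},
    "v3": {"feature": 0.1, "text": 0.1, "label": 3},
    "v4": {"feature": 0.3, "text": 0.3, "label": 1},
}
screened = screen_videos(score_sets, n_feature=2, n_text=1, n_label=1)
```

With these example scores, "v2" is selected by both the feature ranking and the text ranking but appears only once in the result.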
S105: Process the feature matching score, the text similarity score and the label similarity score of each screening video to obtain the association score of each screening video.
The process of determining the association score is the same for each screening video. By way of example, it is as follows:
Determine the feature similarity based on the cosine distance of the screening video;
Determine the text similarity between the media asset text information of the screening video and the hotspot text information of the network hot topic;
Determine the feature weight, label weight and text weight of the screening video based on its feature similarity, matching similarity and text similarity;
Perform a calculation on the feature matching score and the feature weight of the screening video to obtain a first score; perform a calculation on the label similarity score and the label weight to obtain a second score; perform a calculation on the text similarity score and the text weight to obtain a third score;
Sum the first score, the second score and the third score to obtain the association score of the screening video.
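One possible reading of S105 is a weighted sum. The sketch below rests on two assumptions that the description does not confirm: each weight is the corresponding similarity divided by the sum of the three similarities, and each partial score is the product of a score and its weight.

```python
def association_score(scores, similarities):
    """Weighted sum of the three scores; weights are normalized similarities.

    Both arguments are dicts with the hypothetical keys "feature", "label", "text".
    """
    total = sum(similarities.values())
    if total == 0:
        # Assumption: fall back to equal weights when every similarity is zero.
        weights = {key: 1.0 / len(similarities) for key in similarities}
    else:
        weights = {key: value / total for key, value in similarities.items()}
    first = scores["feature"] * weights["feature"]
    second = scores["label"] * weights["label"]
    third = scores["text"] * weights["text"]
    return first + second + third


scores = {"feature": 0.8, "label": 0.5, "text": 0.2}
similarities = {"feature": 0.5, "label": 0.25, "text": 0.25}
example_score = association_score(scores, similarities)
```

For the example values the partial scores are 0.4, 0.125 and 0.05, which sum to 0.575.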
S106: based on the preset number of push videos and the association score of each screening video, determining the push videos from each screening video, and associating each push video with a network hot topic so as to push each push video.
The push videos may be determined as follows:
Determine every screening video whose association score is greater than or equal to the preset association score as a fourth media asset video;
Determine the total number of fourth media asset videos, and determine whether the total number is greater than the preset fourth video number;
If the total number is greater than the fourth video number, select videos from the fourth media asset videos in descending order of association score until the number of selected videos equals the fourth video number, and determine each selected video as a push video;
If the total number is not greater than the fourth video number, determine each fourth media asset video as a push video.
In the method provided by the embodiment of the invention, the preset association score can be set according to the actual demand, and the fourth video number can be set according to the actual demand.
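The threshold-then-cap selection of S106 can be sketched as follows. The function and parameter names are hypothetical; `min_score` plays the role of the preset association score and `max_count` the role of the fourth video number.

```python
def select_push_videos(association_scores, min_score, max_count):
    """Keep videos scoring at least min_score, then cap the result at max_count.

    association_scores maps a video id to its association score.
    """
    qualified = [
        (vid, score) for vid, score in association_scores.items() if score >= min_score
    ]
    # Highest association score first; slicing is a no-op when few videos qualify.
    qualified.sort(key=lambda item: item[1], reverse=True)
    return [vid for vid, _ in qualified[:max_count]]


association_scores = {"v1": 0.9, "v2": 0.4, "v3": 0.7, "v4": 0.6}
capped = select_push_videos(association_scores, min_score=0.5, max_count=2)
uncapped = select_push_videos(association_scores, min_score=0.5, max_count=5)
```

When fewer videos qualify than the cap allows, all qualifying videos are returned, which matches the second branch of the procedure.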
In summary, in the method provided by the embodiment of the invention: a network hot topic is determined, and hotspot key information comprising hotspot visual features, hotspot text information and hotspot keyword labels is extracted from it; each media asset video to be pushed is determined, and media asset key information comprising media asset visual features, media asset text information and media asset keyword labels is extracted from each; the media asset key information of each media asset video and the hotspot key information of the network hot topic are processed to obtain a matching score set for each media asset video, comprising a feature matching score, a text similarity score and a label similarity score; the media asset videos are screened based on their matching score sets to obtain screening videos; the feature matching score, text similarity score and label similarity score of each screening video are processed to obtain its association score; and, based on the preset number of push videos and the association scores, push videos are determined from the screening videos and associated with the network hot topic so that they can be pushed.
The whole process reduces the involvement of operators and effectively improves the efficiency of pushing media asset videos by means of network hot topics. The method also selects media asset videos whose content is closer to the network hot topic, which improves the viewing rate and click rate of users on the pushed media asset videos.
The embodiment of the invention also provides another example of a method for pushing the media resource video, which is specifically as follows:
S1, crawl real-time network hot topics from the major network platforms at fixed time intervals using web crawler technology;
S2, evaluate the operation value of each network hot topic according to its IP, star labels and heat, and according to the media asset videos to be operated or pushed;
S3, select the network hot topics whose operation value is greater than a preset operation value;
S4, analyze each selected network hot topic, extract its title and content and the videos and pictures related to it, determine the title and content as the hotspot text information of the network hot topic, and process the related videos and pictures to obtain the hotspot visual features of the network hot topic;
S5, segment the content and the title of the network hot topic using natural language processing technology to obtain keywords;
S6, perform part-of-speech analysis on the keywords of the network hot topic using natural language processing technology to obtain the hotspot keyword labels of the network hot topic; the parts of speech can be divided into IP, star, character name and other content;
S7, perform frame extraction and feature extraction on each media asset video to obtain the media asset visual features of each media asset video, and construct a feature retrieval library from these media asset visual features;
S8, extract frames from each media asset video, process the video frames using computer vision technologies such as star recognition, OCR and action recognition to obtain the media asset keyword labels of each media asset video, and construct a structured label library from these media asset keyword labels;
S9, process the titles and contents of the media asset videos to construct a text retrieval library;
S10, obtain the media asset videos that match the network hot topic through visual feature comparison, text content comparison and label content comparison, and determine N target videos among them, where N is an integer that can be set according to actual requirements;
S11, if the duration of a target video exceeds a preset duration, clip it according to the content of the network hot topic to obtain a push video; if the duration does not exceed the preset duration, directly determine the target video as a push video;
S12, output M push videos, and associate each push video with the network hot topic so as to push each push video, where M is an integer that can be set according to actual requirements.
The method provided by the invention realizes automatic association and matching of network hot topics and media asset videos: it automatically analyzes the real-time network hot topics and traffic on network platforms, associates media asset videos with the network hot topics, releases the associated media asset videos on the network, pushes and promotes them by utilizing the traffic of the network hot topics, and gives the network hot topics a second round of publicity.
To illustrate how a network hot topic is processed to obtain its hotspot text information, hotspot visual features and hotspot keyword labels, refer to fig. 4, a flowchart of a method for obtaining the hotspot text information, hotspot visual features and hotspot keyword labels of a network hot topic provided by an embodiment of the present invention, which is described as follows:
Step one: Crawl a network hot topic using crawler technology;
Step two: Determine each picture and each video of the network hot topic, and perform frame extraction on the videos to obtain video frames;
Step three: Input each picture and each video frame of the network hot topic into a neural network for feature extraction to obtain feature vectors, and determine these feature vectors as the hotspot visual features of the network hot topic; the neural network here may be understood as the feature model described above;
Step four: Determine the text of the network hot topic, where the text comprises a title and content, and the content is a description of the network hot topic; segment the content and the title to obtain word segments, determine the part of speech of each word segment, obtain a keyword list from a database, compare the word segments and their parts of speech against the keyword list to select keywords from the word segments, and construct the hotspot keyword labels of the network hot topic from the selected keywords;
Step five: Store the text of the network hot topic directly as its hotspot text information;
Step six: Associate the network hot topic with media asset videos using its hotspot visual features, hotspot text information and hotspot keyword labels.
In summary, information about the hot spot is first crawled, and computer vision technology and natural language processing technology are then used to obtain the hotspot visual features, hotspot text information and hotspot keyword labels of the network hot topic for the subsequent association and matching of media asset videos.
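The frame extraction in step two is not specified further. One common choice, shown here purely as an assumption, is to sample one frame per fixed time interval; the function below only computes which frame indices to keep, and the decoding of those frames with a video library is left out.

```python
def sample_frame_indices(total_frames, fps, interval_seconds=1.0):
    """Indices of frames taken once every interval_seconds, starting at frame 0.

    Fixed-interval sampling is an illustrative assumption, not the stated method.
    """
    if total_frames <= 0 or fps <= 0 or interval_seconds <= 0:
        return []
    step = max(1, round(fps * interval_seconds))  # frames between two samples
    return list(range(0, total_frames, step))


# A 4-second clip at 25 frames per second, sampled once per second.
indices = sample_frame_indices(total_frames=100, fps=25, interval_seconds=1.0)
```

The same sampling could be applied to media asset videos, so that hotspot frames and media asset frames pass through the same feature model.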
Referring to fig. 5, a flowchart for constructing a text retrieval library, a structured label library and a feature retrieval library according to an embodiment of the present invention is shown. Each media asset video is processed to obtain its media asset text information, media asset visual features and media asset keyword labels, and these are used to construct the text retrieval library, the feature retrieval library and the structured label library, as described below:
Step one: Extract frames from each media asset video to obtain the pictures of each media asset video;
Step two: Input the pictures of each media asset video into a neural network to obtain the feature vectors of each media asset video, and obtain the media asset visual features of each media asset video based on its feature vectors; the neural network here is the same as that described in step three of fig. 4;
Step three: Integrate the media asset visual features of the media asset videos to form the feature retrieval library;
Step four: Input the pictures of each media asset video into a computer vision model to obtain the keywords of each media asset video, and obtain the attribute labels of each media asset video based on its keywords; the computer vision model is constructed using technologies such as star recognition, action recognition, emotion recognition and OCR, where star recognition may be face recognition; the attribute labels include, but are not limited to, star labels, action labels, emotion labels and text labels;
Step five: For a media asset video that has cataloging information, analyze the cataloging information to obtain the cataloging labels of the media asset video, and obtain its media asset keyword labels based on the cataloging labels and the attribute labels; for a media asset video without cataloging information, obtain its media asset keyword labels based on its attribute labels alone; the cataloging information includes, but is not limited to, the name, actors and director of the media asset video, and can be used to supplement the labels of the media asset video; construct the structured label library based on the media asset keyword labels of the media asset videos;
Step six: Determine the media asset text information of each media asset video, and construct the text retrieval library from it; when a media asset video has cataloging information, its media asset text information is generated based on the cataloging information and the text labels, and when it has no cataloging information, its media asset text information is generated based on the text labels.
In summary, the invention extracts the media asset visual features, media asset text information and media asset keyword labels of the media asset videos using computer vision technology and natural language processing technology, and constructs three matching retrieval libraries from them: the feature retrieval library, the text retrieval library and the structured label library.
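The three libraries can be pictured as simple in-memory structures. The sketch below is an assumption about their layout, since the description does not fix one: the feature and text libraries are keyed by video id, and the structured label library is stored as an inverted index from label to video ids, which makes counting overlapping labels straightforward.

```python
from collections import Counter, defaultdict


def build_libraries(assets):
    """Build the feature, text and structured label libraries from asset records."""
    feature_library = {}             # video id -> list of feature vectors
    text_library = {}                # video id -> media asset text information
    label_library = defaultdict(set)  # label -> ids of videos carrying that label
    for asset in assets:
        vid = asset["id"]
        feature_library[vid] = asset["features"]
        text_library[vid] = asset["text"]
        for label in asset["labels"]:
            label_library[label].add(vid)
    return feature_library, text_library, label_library


def top_k_by_label_overlap(label_library, hotspot_labels, k):
    """Rank videos by how many hotspot labels they share and keep the first k."""
    overlap = Counter()
    for label in set(hotspot_labels):
        for vid in label_library.get(label, ()):
            overlap[vid] += 1
    return [vid for vid, _ in overlap.most_common(k)]


assets = [
    {"id": "v1", "features": [[1.0, 0.0]], "text": "Example Drama episode",
     "labels": ["Actor A", "Example Drama"]},
    {"id": "v2", "features": [[0.0, 1.0]], "text": "Interview", "labels": ["Actor A"]},
    {"id": "v3", "features": [[0.5, 0.5]], "text": "Trailer", "labels": ["Actor C"]},
]
feature_library, text_library, label_library = build_libraries(assets)
first_videos = top_k_by_label_overlap(label_library, ["Actor A", "Example Drama"], k=2)
```

Here "v1" shares two labels with the hot topic and "v2" shares one, so they are returned in that order, while "v3" shares none and is never considered.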
Referring to fig. 6, a flowchart of a method for determining push videos according to the present invention is shown. With reference to fig. 4 and fig. 5, the method is described as follows:
Step one: Match the hotspot keyword labels of the network hot topic obtained in fig. 4 against the media asset keyword labels in the structured label library in fig. 5, sort the media asset keyword labels according to their matching results to obtain a first sequence, determine the first K1 media asset keyword labels in the first sequence as target media asset keyword labels, and determine the media asset video corresponding to each target media asset keyword label as a first video; the retrieval result is the first videos. The matching result may be the number of labels shared by the media asset keyword labels and the hotspot keyword labels, and the media asset keyword labels are sorted in descending order of this number;
Step two: Match the hotspot text information of the network hot topic obtained in fig. 4 against the media asset text information in the text retrieval library in fig. 5, sort the media asset text information according to the matching results to obtain a sequence, determine the first K2 items of media asset text information in the sequence as target media asset text information, and determine the media asset video corresponding to each item of target media asset text information as a second video; the retrieval result is the second videos. The matching result may be the similarity score between the media asset text information and the hotspot text information, and the media asset text information is sorted in descending order of similarity score;
Step three: Match the hotspot visual features of the network hot topic obtained in fig. 4 against the media asset visual features in the feature retrieval library in fig. 5, sort the media asset visual features according to the matching results to obtain a sequence, determine the first K3 media asset visual features in the sequence as target media asset visual features, and determine the media asset video corresponding to each target media asset visual feature as a third video; the retrieval result is the third videos. The matching result may be the cosine distance between the media asset visual features and the hotspot visual features, and the media asset visual features are sorted in descending order of cosine distance;
Step four: merging the first videos, the second videos and the third videos to obtain the screening videos;
Step five: calculating the comprehensive score of each screening video with respect to the network hot topic;
Step six: deleting the screening videos whose comprehensive score is smaller than a preset score, and determining the remaining screening videos as push videos when the number of remaining screening videos is smaller than or equal to a preset number; when the number of remaining screening videos is greater than the preset number, selecting push videos from the remaining screening videos in descending order of comprehensive score until the number of selected push videos equals the preset number;
Step seven: associating each push video with the network hot topic, so that each push video is pushed by means of the network hot topic.
K1, K2 and K3 are integers and are set according to actual requirements.
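The label-overlap retrieval of step one can be illustrated with a minimal sketch. The function name `top_k_by_label_overlap`, the dictionary layout of the structured label library and the sample labels are assumptions made for illustration only; the description does not prescribe a data structure or a tie-breaking rule.

```python
from typing import Dict, List, Set


def top_k_by_label_overlap(
    hot_labels: Set[str], library: Dict[str, Set[str]], k: int
) -> List[str]:
    """Rank videos by the number of labels shared with the hot topic; keep the first k."""
    # Matching result: number of overlapping labels per video.
    overlap = {vid: len(labels & hot_labels) for vid, labels in library.items()}
    # Descending by overlap count; ties broken by video id (an assumption).
    ranked = sorted(overlap, key=lambda vid: (-overlap[vid], vid))
    return ranked[:k]


# Hypothetical hot keyword labels and label library.
hot = {"football", "final", "goal"}
library = {
    "v1": {"football", "goal", "stadium"},
    "v2": {"cooking", "pasta"},
    "v3": {"football", "final", "goal"},
}
first_videos = top_k_by_label_overlap(hot, library, k=2)
print(first_videos)  # ['v3', 'v1']
```

Here K1 is passed as `k`; a larger K1 widens recall at the cost of admitting weaker matches, which the later score threshold in step six can still remove.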
In summary, the videos associated with a hot spot are obtained by converting the media resources and the hot spot into visual features, text information and tag information, and then performing three forms of matching analysis followed by a comprehensive judgment.
The method automatically associates and matches network hot spots with media resource videos: an algorithm automatically analyzes the real-time hot spots and traffic on a network platform, automatically associates the media resource videos related to the hot topics, and puts the related videos online, so that the videos are promoted using the traffic of the hot topics and the hot topics receive a second wave of publicity; the whole process is automated and runs around the clock (7 x 24 hours).
Corresponding to the method shown in fig. 1, an embodiment of the present invention further provides a device for pushing media resource videos. The device supports the application of the method shown in fig. 1 in practice and may be disposed in a computer terminal or other computer device. A schematic structural diagram of the device is shown in fig. 7 and is described below:
A determining unit 701, configured to determine a network hot topic, and extract hot key information of the network hot topic from the network hot topic, where the hot key information includes hot visual features, hot text information, and a hot keyword tag;
An extracting unit 702, configured to determine each media resource video to be pushed, and extract media resource key information of each media resource video, where the media resource key information includes media resource visual features, media resource text information and media resource keyword labels;
A first obtaining unit 703, configured to process the asset key information of each of the asset videos and the hotspot key information of the network hotspot topic to obtain a matching score set of each of the asset videos and the network hotspot topic, where the matching score set includes a feature matching score, a text similarity score, and a tag similarity score;
A screening unit 704, configured to screen each of the media videos based on the matching score set of each of the media videos, so as to obtain each of the screened videos;
A second obtaining unit 705, configured to process the feature matching score, the text similarity score, and the tag similarity score of each of the screening videos to obtain an association score of each of the screening videos;
A pushing unit 706, configured to determine push videos from the screening videos based on a preset number of push videos and the association score of each screening video, and to associate each push video with the network hot topic, so as to push each push video.
In the device provided by the embodiment of the invention, a network hot topic is determined, hot key information of the network hot topic is extracted from the network hot topic, and the hot key information comprises hot visual characteristics, hot text information and hot key word labels; determining each media resource video to be pushed, and extracting media resource key information of each media resource video, wherein the media resource key information comprises media resource visual characteristics, media resource text information and media resource keyword labels; processing the media asset key information of each media asset video and the hot spot key information of the network hot topic to obtain a matching score set of each media asset video and the network hot topic, wherein the matching score set comprises feature matching scores, text similarity scores and label similarity scores; screening each media resource video based on the matching score set of each media resource video to obtain each screened video; processing the feature matching score, the text similarity score and the label similarity score of each screening video to obtain the association score of each screening video; based on the preset number of push videos and the association score of each screening video, determining push videos from each screening video, and associating each push video with the network hot topic so as to push each push video. 
According to the method, the network hot topic is processed to obtain its hot key information, and each media resource video to be pushed is processed to obtain its media resource key information. Based on the hot key information and each piece of media resource key information, a matching score set is obtained for each media resource video; the screening videos are selected according to the matching score sets; an association score is obtained for each screening video; push videos are determined from the screening videos according to the association scores; and the push videos are associated with the network hot topic so as to be pushed. The whole process reduces the involvement of operators and effectively improves the efficiency of pushing media resource videos by means of network hot topics. Media resource videos closer to the content of the network hot topic can also be selected, which improves users' viewing rate and click rate for the pushed media resource videos.
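The description does not state how the feature matching score, the text similarity score and the tag similarity score are combined into an association score. One plausible reading is a weighted average, sketched below; the class `MatchScores`, the function `association_score` and the weights 0.4/0.3/0.3 are hypothetical and would be tuned in practice.

```python
from dataclasses import dataclass


@dataclass
class MatchScores:
    """Matching score set of one screening video (each score assumed to lie in [0, 1])."""
    feature: float  # feature matching score
    text: float     # text similarity score
    tag: float      # tag similarity score


def association_score(
    s: MatchScores, w_feature: float = 0.4, w_text: float = 0.3, w_tag: float = 0.3
) -> float:
    """Weighted average of the three scores; the weights are illustrative assumptions."""
    total = w_feature + w_text + w_tag
    return (w_feature * s.feature + w_text * s.text + w_tag * s.tag) / total


example = association_score(MatchScores(feature=0.5, text=0.0, tag=0.0))
print(round(example, 3))  # 0.2
```

Dividing by the weight total keeps the result in the same range as the inputs even if the weights do not sum to one.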
In the apparatus provided by the embodiment of the present invention, the determining unit 701 includes:
The first acquisition subunit is used for acquiring each hot picture of the network hot topic and extracting the hot visual features of the network hot topic from the hot pictures;
The second acquisition subunit is used for acquiring the text content of the network hot topic and determining the text content as the hot text information of the network hot topic;
The selecting subunit is used for performing word segmentation on the text content to obtain word segments, selecting keywords from the word segments, and constructing the hot keyword label of the network hot topic based on the selected keywords;
The first obtaining subunit is configured to obtain the hot key information of the network hot topic based on the hot visual features, the hot text information and the hot keyword label.
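The work of the selecting subunit can be sketched as follows. The description does not name a word segmentation method or a keyword selection rule, so this example makes two loud assumptions: a regular-expression tokenizer stands in for word segmentation (a dedicated segmenter would be needed for languages written without spaces), and keywords are chosen by frequency after removing a small, illustrative stop-word list.

```python
import re
from collections import Counter
from typing import List

# Illustrative stop-word list only; not taken from the description.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "for"}


def build_hot_keyword_tags(text: str, max_tags: int = 5) -> List[str]:
    """Segment the text, drop stop words, and keep the most frequent words as tags."""
    # Stand-in for word segmentation: lowercase alphanumeric runs.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    candidates = [t for t in tokens if t not in STOPWORDS and len(t) > 1]
    counts = Counter(candidates)
    return [word for word, _ in counts.most_common(max_tags)]


tags = build_hot_keyword_tags("Cup final goal: the final goal decides the final", max_tags=2)
print(tags)  # ['final', 'goal']
```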
In the apparatus provided by the embodiment of the present invention, the extracting unit 702 includes:
The frame extraction processing subunit is used for carrying out frame extraction processing on each media resource video to obtain each video frame of each media resource video;
The feature extraction subunit is used for carrying out feature extraction processing on each video frame of each media resource video to obtain each feature vector of each media resource video, and determining each feature vector of each media resource video as the media resource visual feature of each media resource video;
The second obtaining subunit is used for calling, for each media resource video, a preset computer vision model to process each video frame of the media resource video so as to obtain attribute labels;
A first determining subunit, configured to determine, for each of the media videos, cataloging information of the media video, extract cataloging tags from the cataloging information, determine each of the attribute tags and the cataloging information as media keyword tags of the media video, and determine the cataloging information and the text tags among the attribute tags as media text information of the media video;
The third obtaining subunit is used for obtaining the media asset key information of each media asset video based on the media asset visual characteristics, the media asset keyword labels and the media asset text information of each media asset video.
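The frame extraction processing can be illustrated by computing which frame indices to keep. The description does not specify a sampling policy; the sketch below assumes fixed-interval sampling, and the function name `sample_frame_indices` and its parameters are hypothetical. Decoding the frames and running the feature extractor are outside its scope.

```python
from typing import List


def sample_frame_indices(
    total_frames: int, fps: float, interval_seconds: float = 1.0
) -> List[int]:
    """Return the indices of the frames kept when sampling one frame per interval."""
    if total_frames <= 0 or fps <= 0 or interval_seconds <= 0:
        return []
    # Number of source frames between two kept frames, at least 1.
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))


indices = sample_frame_indices(total_frames=100, fps=25.0, interval_seconds=1.0)
print(indices)  # [0, 25, 50, 75]
```

Each kept frame would then be passed to the feature extraction subunit, yielding one feature vector per sampled frame.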
In the apparatus provided by the embodiment of the present invention, the first obtaining unit 703 includes:
The second determining subunit is configured to determine, for each of the media videos, a cosine distance between a media visual feature of the media video and a hotspot visual feature of the network hotspot topic, and obtain a feature matching score of the media video based on the cosine distance;
The matching subunit is used for matching the media asset keyword labels of the media asset videos with the hot keyword labels of the network hot topics to obtain matching similarity, and obtaining label similarity scores of the media asset videos based on the matching similarity;
A fourth obtaining subunit, configured to process, for each of the media videos, media text information of the media video and hot text information of the network hot topic by using a preset text similarity algorithm, to obtain a text similarity score of the media video;
A fifth obtaining subunit, configured to obtain a matching score set of each of the media videos based on the feature matching score, the tag similarity score, and the text similarity score of each of the media videos.
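The three scores in the matching score set can be sketched with standard-library Python. Several choices here are assumptions rather than the described method: the feature matching score is taken as the best cosine similarity over all frame/picture vector pairs, the tag similarity score is the Jaccard index of the two tag sets, and `difflib.SequenceMatcher` stands in for the unspecified preset text similarity algorithm.

```python
import math
from difflib import SequenceMatcher
from typing import Sequence, Set


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def feature_matching_score(
    media_vectors: Sequence[Sequence[float]], hot_vectors: Sequence[Sequence[float]]
) -> float:
    """Best cosine similarity over all (frame, hot picture) pairs (an assumption)."""
    return max(
        (cosine_similarity(m, h) for m in media_vectors for h in hot_vectors),
        default=0.0,
    )


def tag_similarity_score(media_tags: Set[str], hot_tags: Set[str]) -> float:
    """Jaccard index: shared tags divided by all distinct tags."""
    union = media_tags | hot_tags
    return len(media_tags & hot_tags) / len(union) if union else 0.0


def text_similarity_score(media_text: str, hot_text: str) -> float:
    """Character-level similarity ratio in [0, 1] from difflib."""
    return SequenceMatcher(None, media_text, hot_text).ratio()
```

A usage example: `tag_similarity_score({"a", "b"}, {"b", "c"})` shares one of three distinct tags and therefore yields one third.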
In the apparatus provided by the embodiment of the present invention, the screening unit 704 includes:
The first sequencing subunit is used for sorting the media resource videos in descending order of feature matching score to obtain a first video sequence, selecting media resource videos from the first video sequence in order until the number of selected media resource videos equals a preset first video number, and determining each selected media resource video as a first media resource video;
The second sequencing subunit is used for sorting the media resource videos in descending order of text similarity score to obtain a second video sequence, selecting media resource videos from the second video sequence in order until the number of selected media resource videos equals a preset second video number, and determining each selected media resource video as a second media resource video;
The third sequencing subunit is used for sorting the media resource videos in descending order of label similarity score to obtain a third video sequence, selecting media resource videos from the third video sequence in order until the number of selected media resource videos equals a preset third video number, and determining each selected media resource video as a third media resource video;
The merging subunit is used for merging the first media resource videos, the second media resource videos and the third media resource videos to obtain the screening videos.
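The three sequencing subunits and the merging subunit can be sketched together. The function names `top_n` and `screen_videos`, the score dictionaries keyed by video id, and the removal of duplicates during merging are illustrative assumptions; the description only states that the three selections are merged.

```python
from typing import Dict, List


def top_n(scores: Dict[str, float], n: int) -> List[str]:
    """Video ids sorted by score, high to low, truncated to the first n."""
    return sorted(scores, key=lambda vid: (-scores[vid], vid))[:n]


def screen_videos(
    feature: Dict[str, float],
    text: Dict[str, float],
    tag: Dict[str, float],
    n1: int,
    n2: int,
    n3: int,
) -> List[str]:
    """Merge the three top-n selections, keeping each video once (an assumption)."""
    merged: List[str] = []
    seen = set()
    for vid in top_n(feature, n1) + top_n(text, n2) + top_n(tag, n3):
        if vid not in seen:
            seen.add(vid)
            merged.append(vid)
    return merged


# Hypothetical scores for three videos.
feature = {"a": 0.9, "b": 0.2, "c": 0.5}
text = {"a": 0.1, "b": 0.8, "c": 0.3}
tag = {"a": 0.7, "b": 0.1, "c": 0.6}
screened = screen_videos(feature, text, tag, n1=1, n2=1, n3=1)
print(screened)  # ['a', 'b']
```

Because a video only needs to rank highly on one of the three scores to survive this stage, the merge favors recall; precision is restored later by the association score threshold.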
In the apparatus provided in the embodiment of the present invention, the pushing unit 706 includes:
The third determining subunit is used for determining the screening videos whose association score is greater than or equal to a preset association score as fourth media resource videos;
A fourth determining subunit, configured to determine the total number of the fourth media resource videos, and determine whether the total number is greater than a preset fourth video number;
A selecting subunit, configured to, if the total number is greater than the fourth video number, select videos from the fourth media resource videos in descending order of association score until the number of selected videos equals the fourth video number, and determine each selected video as a push video;
A fifth determining subunit, configured to determine each fourth media resource video as a push video if the total number is not greater than the fourth video number.
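The threshold-then-cap logic of the pushing unit can be sketched as follows. The function name `select_push_videos`, its parameter names and the sample scores are hypothetical; `min_score` plays the role of the preset association score and `max_count` the role of the preset fourth video number.

```python
from typing import Dict, List


def select_push_videos(
    association: Dict[str, float], min_score: float, max_count: int
) -> List[str]:
    """Keep videos at or above the threshold, capped at max_count by descending score."""
    # Fourth media resource videos: association score >= preset association score.
    qualified = [vid for vid, score in association.items() if score >= min_score]
    if len(qualified) <= max_count:
        # Not more than the preset number: every qualified video is pushed.
        return qualified
    # Otherwise take the highest-scoring videos until the preset number is reached.
    ranked = sorted(qualified, key=lambda vid: (-association[vid], vid))
    return ranked[:max_count]


scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.8}
push = select_push_videos(scores, min_score=0.5, max_count=2)
print(push)  # ['a', 'd']
```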
The embodiment of the invention also provides a storage medium, which comprises stored instructions, wherein when the instructions run, the equipment where the storage medium is located is controlled to execute the media video pushing method.
The embodiment of the invention also provides an electronic device, the structure of which is shown in fig. 8, specifically including a memory 801 and one or more instructions 802, where the one or more instructions 802 are stored in the memory 801 and are configured to be executed by one or more processors 803 to perform the method for pushing the media video.
The specific implementation process and derivative manner of the above embodiments are all within the protection scope of the present invention.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, a device or system embodiment is described relatively briefly because it is substantially similar to the method embodiment; for relevant details, refer to the corresponding part of the description of the method embodiment. The device and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.