CN108170845B


Info

Publication number: CN108170845B
Application number: CN201810044934.3A
Authority: CN
Inventors: 张龙
Original assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Current assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date: 2018-01-17
Filing date: 2018-01-17
Publication date: 2020-10-13
Anticipated expiration: 2038-01-17
Also published as: CN108170845A

Abstract

Description

Multimedia data processing method, device and storage medium

Technical Field

The present invention relates to the field of multimedia technologies, and in particular, to a method and an apparatus for processing multimedia data, and a storage medium.

Background

With the continuous popularization of terminals, users can acquire more and more information through the terminals. For example, music is an important consumption element in human life, and particularly, popularization of a terminal enables a user to conveniently obtain music information through the terminal.

To let users quickly obtain the music data they want, more and more music applications provide music data push services. For example, Fig. 1 shows a music playing interface in the prior art, in which the music application service provider provides a song list push button a. By clicking the song list push button a, a user can push a song list that the user has edited to the music application service provider.

After receiving a song list, the music application service provider processes it in one of two ways. One is manual review, in which background staff review the song list and then push it to other users. The other is machine review, which checks single variables such as the song list title or the song list pictures.

Both processing modes have drawbacks. With manual review, song lists easily pile up in a backlog, and some high-quality song lists are mistakenly rated as ordinary ones. With machine review, only shallow detection is performed on a single variable, so the accuracy of song list processing is too low.

Disclosure of Invention

The embodiment of the invention provides a multimedia data processing method, a multimedia data processing device and a storage medium, which can improve the accuracy of multimedia data processing.

The embodiment of the invention provides a multimedia data processing method, which comprises the following steps:

receiving multimedia data to be processed, and acquiring attribute information of the multimedia data to be processed, wherein the attribute information comprises a plurality of tag lists, a number of multimedia items, a multimedia playing amount and a multimedia identifier, and each tag list is used for labeling one type of the multimedia data to be processed;

judging whether the plurality of tag lists are consistent, and if so, generating the popularity of the multimedia data to be processed according to the number of multimedia items and the multimedia playing amount, and generating the novelty of the multimedia data to be processed according to the multimedia identifier;

analyzing the popularity and the novelty, and if the popularity is smaller than a preset popularity threshold and the novelty is larger than a preset novelty threshold, acquiring user information of the multimedia data to be processed; and

processing the multimedia data to be processed according to historical multimedia data information corresponding to the user information, the plurality of tag lists, the popularity and the novelty of the multimedia data to be processed.

An embodiment of the present invention further provides a multimedia data processing apparatus, including:

a receiving module, configured to receive multimedia data to be processed and acquire attribute information of the multimedia data to be processed, wherein the attribute information comprises a plurality of tag lists, a number of multimedia items, a multimedia playing amount and a multimedia identifier, and each tag list is used for labeling one type of the multimedia data to be processed;

a judging module, configured to judge whether the plurality of tag lists are consistent, and if so, to generate the popularity of the multimedia data to be processed according to the number of multimedia items and the multimedia playing amount, and to generate the novelty of the multimedia data to be processed according to the multimedia identifier;

an analysis module, configured to analyze the popularity and the novelty, and to acquire user information of the multimedia data to be processed if the popularity is smaller than a preset popularity threshold and the novelty is larger than a preset novelty threshold; and

a processing module, configured to process the multimedia data to be processed according to historical multimedia data information corresponding to the user information, the plurality of tag lists, the popularity and the novelty of the multimedia data to be processed.

The embodiment of the invention also provides a storage medium, wherein the storage medium stores processor-executable instructions, and the processor performs the above multimedia data processing method by executing the instructions.

The multimedia data processing method, device and storage medium of the embodiment of the invention acquire the attribute information of the multimedia data to be processed, wherein the attribute information comprises a plurality of tag lists, a number of multimedia items, a multimedia playing amount and a multimedia identifier. The consistency of the tag lists is analyzed, and when the tag lists are consistent, the popularity is generated according to the number of multimedia items and the multimedia playing amount, and the novelty is generated according to the multimedia identifier. When the popularity is smaller than a preset popularity threshold and the novelty is larger than a preset novelty threshold, historical multimedia data information corresponding to the user information of the multimedia data to be processed is acquired. Finally, the multimedia data to be processed is processed according to the historical multimedia data information, the tag lists, the popularity and the novelty. The scheme not only analyzes the consistency of a plurality of detection variables of the multimedia data to be processed, but also analyzes the multimedia data to be processed at a deeper level, thereby effectively improving the accuracy of processing the multimedia data to be processed.

Drawings

The technical solution and other advantages of the present invention will become apparent from the following detailed description of specific embodiments of the present invention, which is to be read in connection with the accompanying drawings.

Fig. 1 is a schematic diagram of an existing music playing interface according to an embodiment of the present invention.

Fig. 2 is a schematic view of a scene of a multimedia data processing method according to an embodiment of the present invention.

Fig. 3 is a flowchart illustrating a multimedia data processing method according to an embodiment of the invention.

Fig. 4 is a schematic diagram of forming a tag list according to an embodiment of the present invention.

Fig. 5 is a schematic diagram of a mapping relationship between a tag and a main word according to an embodiment of the present invention.

Fig. 6 is another schematic diagram of forming a tag list according to an embodiment of the present invention.

Fig. 7 is another flowchart illustrating a multimedia data processing method according to an embodiment of the invention.

Fig. 8 is a scene schematic diagram of processing a to-be-processed picture according to an embodiment of the present invention.

Fig. 9 is yet another schematic diagram of forming a tag list according to an embodiment of the present invention.

Fig. 10 is a schematic structural diagram of a multimedia data processing apparatus according to an embodiment of the present invention.

Fig. 11 is a schematic structural diagram of a receiving module according to an embodiment of the present invention.

Fig. 12 is a schematic structural diagram of a determining module according to an embodiment of the present invention.

Fig. 13 is a schematic structural diagram of a processing module according to an embodiment of the present invention.

Fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to Fig. 2, which is a schematic view of a scene of a multimedia data processing method according to an embodiment of the present invention. In this scene, the multimedia data processing apparatus may be implemented as a standalone entity, or may be integrated in an electronic device such as a terminal or a server, where the electronic device may include a smart phone, a tablet computer, a personal computer, and the like.

Embodiments of the present invention provide a multimedia data processing method, an apparatus and a storage medium, which will be described in detail below.

In the embodiments of the present invention, description will be made from the perspective of a multimedia data processing apparatus, which can be specifically integrated in an electronic device.

Referring to fig. 3, fig. 3 is a flowchart of a multimedia data processing method according to an embodiment of the present invention, where the method includes:

Step S101, receiving multimedia data to be processed, and obtaining attribute information of the multimedia data to be processed, wherein the attribute information comprises a plurality of tag lists, a number of multimedia items, a multimedia playing amount and a multimedia identifier.

From the viewpoint of file type, multimedia data can be divided into different kinds such as audio, video, pictures, and text. As shown in Fig. 4, the multimedia data is presented in the form of a song list that includes several songs. The song list comprises not only audio, video and pictures, but also text such as the song list title and the song list description. The attribute information of the multimedia data uses tags to label the essential features or characteristics of the various kinds of multimedia data, such as the style of the audio and the size of the audio. In the preferred embodiment, the acquired attribute information of the multimedia data to be processed includes a plurality of tag lists, a number of multimedia items, a multimedia playing amount, a multimedia identifier, and the like.

In the embodiment of the present invention, each type of multimedia data to be processed corresponds to one tag list; that is, a tag list is used for labeling one type of multimedia data to be processed. Each tag list includes one or more tags. For example, for the song list in Fig. 4, the tags "Cantonese" and "classic" can be extracted from the song list title "severe Cantonese classic good song", and the tags "melancholy", "vicissitudes" and "classic" can be extracted from the song list description. These tags form a tag list { Cantonese, classic, melancholy, vicissitudes, classic }, which is used for labeling the text in the song list. Similarly, the tags "Cantonese" and "popular" are obtained by analyzing the audio "friend", and the tags "Cantonese" and "popular" are obtained by analyzing the audio "semiminor evening primrose". All the audio items in the song list are analyzed in turn to obtain all the tags, and a tag list { Cantonese, popular } is then formed for labeling the audio in the song list.

Step S102, judging whether the plurality of label lists are consistent, if so, generating the popularity of the multimedia data to be processed according to the number of the multimedia and the multimedia playing amount, and generating the novelty of the multimedia data to be processed according to the multimedia identification.

In a specific implementation, judging whether the plurality of tag lists are consistent comprises the following steps:

mapping, based on a preset mapping relationship, the tags in each tag list to the corresponding main words in a preset main word library, so that each tag list forms a corresponding main word list;

determining whether every two of the main word lists share a main word; and

if every two main word lists share a main word, determining that the plurality of tag lists are consistent.

Assume that the preset main word library is { cure, ACG (animation, comics and games), excited }. The mapping relationship between the tags and the main words in the preset main word library is shown in Fig. 5, in which the different tags surround their corresponding main word; for example, the main word "cure" has a mapping relationship with the tags "warm", "pure tone", "alone", and the like. As shown in Fig. 6, the multimedia data to be processed includes multimedia data of four file types, namely a text to be processed, a picture to be processed, an audio to be processed, and a video to be processed. If the tag list a1 corresponding to the text to be processed is { fresh, love song, steve }, the corresponding main word list b1 is { cure, excited }. If the tag list a2 corresponding to the picture to be processed is { animation, warm }, the corresponding main word list b2 is { ACG, cure }. If the tag list a3 corresponding to the audio to be processed is { pure tone, lyric, passion }, the corresponding main word list b3 is { cure, excited }. If the tag list a4 corresponding to the video to be processed is { pure tone, love song }, the corresponding main word list b4 is { cure }.

The main word lists b1, b2, b3 and b4 are compared two by two. If every two main word lists share a main word, the tag lists a1, a2, a3 and a4 are determined to be consistent; if some two main word lists do not share any main word, the tag lists a1, a2, a3 and a4 are determined to be inconsistent. Since the main word lists b1, b2, b3 and b4 all contain the main word "cure", the tag lists a1, a2, a3 and a4 can be determined to be consistent, which also indicates that the styles of the various types of multimedia data in the multimedia data to be processed are uniform.
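By way of illustration only, the pairwise comparison of main word lists can be sketched in Python as follows. The dictionary TAG_TO_MAIN_WORD and the function names are hypothetical; in particular, the assignment of each individual tag to a main word is an assumption, since the text only gives the resulting main word lists.

```python
from itertools import combinations

# Hypothetical tag -> main word mapping, loosely following the example above.
TAG_TO_MAIN_WORD = {
    "animation": "ACG",
    "warm": "cure",
    "pure tone": "cure",
    "love song": "cure",
    "lyric": "cure",
    "passion": "excited",
}


def to_main_words(tag_list):
    """Map each tag to its main word; tags without a mapping are ignored."""
    return {TAG_TO_MAIN_WORD[t] for t in tag_list if t in TAG_TO_MAIN_WORD}


def tag_lists_consistent(tag_lists):
    """Return True when every pair of main word lists shares a main word."""
    main_word_lists = [to_main_words(tags) for tags in tag_lists]
    return all(a & b for a, b in combinations(main_word_lists, 2))


a2 = ["animation", "warm"]                # picture tags
a3 = ["pure tone", "lyric", "passion"]    # audio tags
a4 = ["pure tone", "love song"]           # video tags
print(tag_lists_consistent([a2, a3, a4]))  # True: every pair shares "cure"
```

Using sets for the main word lists makes the shared-word test a plain intersection, and a single tag list is trivially consistent because there is no pair to compare.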

If the tag lists are inconsistent, the styles of the multimedia data of different file types in the multimedia data to be processed are not uniform, so the multimedia data to be processed can be transferred to another storage device for storage, or can be deleted directly. If the tag lists are consistent, the popularity and novelty of the multimedia data to be processed can be further detected.

The popularity of the multimedia data to be processed may be generated as follows: generating the average multimedia playing amount according to the number of multimedia items and the multimedia playing amount; and

determining the popularity of the multimedia data to be processed according to the average multimedia playing amount.

Specifically, the average multimedia playing amount h can be calculated according to the following formula:

$$h = \frac{1}{N}\sum_{i=1}^{N} P_i$$

where P_i denotes the multimedia playing amount of the i-th multimedia item, N denotes the number of multimedia items, and N is a positive integer. The popularity may be determined according to the playing amount level in which the average multimedia playing amount falls. For example, if the multimedia data contains a plurality of songs, multimedia data whose average song playing amount exceeds 1 million times per month can be assigned a first playing amount level, corresponding to a first popularity level; and multimedia data whose average song playing amount is between 500,000 and 1 million times per month can be assigned a second playing amount level, corresponding to a second popularity level.

Assume that the multimedia data to be processed includes three songs whose playing amounts are 1 million times/month, 500,000 times/month and 300,000 times/month, respectively. According to the formula, the average song playing amount of the multimedia data to be processed is 600,000 times/month, which falls in the second playing amount level; that is, the multimedia data to be processed has the second popularity level.
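The worked example above can be reproduced with a short Python sketch. The function names are hypothetical, and returning None below the second level is an assumption, since the text defines no level for that range.

```python
def average_play_count(play_counts):
    """Average monthly playing amount h over the N items of a song list."""
    if not play_counts:
        raise ValueError("play_counts must contain at least one item")
    return sum(play_counts) / len(play_counts)


def popularity_level(avg_plays):
    """Map the average playing amount to the levels named in the text.

    Returns 1 above 1,000,000 times/month, 2 for 500,000 to 1,000,000,
    and None below that range (assumption: no level is defined there).
    """
    if avg_plays > 1_000_000:
        return 1
    if avg_plays >= 500_000:
        return 2
    return None


h = average_play_count([1_000_000, 500_000, 300_000])
print(h, popularity_level(h))  # 600000.0 2
```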

In some embodiments, the popularity hot_score may also be calculated directly according to the following formula:

where, based on statistics of the average multimedia playing amount of existing multimedia data, when P_i is expressed in million times/month, the popularity hot_score can be limited to the range [0, 10].

Statistical analysis of a large amount of multimedia data shows that, when the popularity hot_score of the multimedia data to be processed is less than 0.69, the corresponding average multimedia playing amount P is less than 5 times/month. Since the multimedia data processing apparatus lacks sufficient prior knowledge to make a determination about such multimedia data to be processed, it may be retained for further detection. For multimedia data to be processed whose popularity hot_score is greater than 6, the average multimedia playing amount P is more than millions of times per month. Statistics show that such multimedia data tends to simply aggregate popular videos and popular audio, with unclear topics and non-uniform styles, so it can be deleted.

In summary, the upper popularity value 6 may be set as the preset popularity threshold; it should be noted that the preset popularity threshold is not specifically limited herein. When the popularity is not less than the preset popularity threshold, the multimedia data to be processed can be deleted or transferred to another storage device.

In a specific implementation, the novelty of the multimedia data to be processed can be generated according to the multimedia identifier, with the following specific steps:

acquiring the multimedia identifiers of the retained multimedia data; and

determining the novelty of the multimedia data to be processed according to the multimedia identifiers of the retained multimedia data and the multimedia identifiers of the multimedia data to be processed.

Specifically, the novelty N may be calculated according to the following formula:

where S is the multimedia data to be processed and S_i is the multimedia identifier of the i-th multimedia item in the multimedia data to be processed, with n ≥ i > 0, i a positive integer, and n the number of multimedia items in the multimedia data to be processed. K is the retained multimedia data and K_j is the multimedia identifier of the j-th multimedia item in the retained multimedia data, with m ≥ j > 0, j a positive integer, and m the number of multimedia items in the retained multimedia data.

Preferably, a preset novelty threshold may be set. When the novelty exceeds the preset novelty threshold, the similarity between the multimedia data to be processed and the retained multimedia data is low. When the novelty is not greater than the preset novelty threshold, the similarity between the multimedia data to be processed and the retained multimedia data is high, and in order to improve the richness of the multimedia data, the multimedia data to be processed can be deleted or transferred to another storage device.
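By way of illustration only, one possible identifier-based novelty measure is sketched below in Python. The measure used here (the fraction of identifiers in the multimedia data to be processed that do not occur in the retained multimedia data) is an assumption chosen for the sketch and is not asserted to be the formula of this embodiment; the identifiers and the threshold value are likewise hypothetical.

```python
def novelty(pending_ids, retained_ids):
    """Fraction of pending identifiers that do not occur in the retained data.

    ASSUMPTION: this overlap-based measure is illustrative only.
    """
    pending = set(pending_ids)
    if not pending:
        raise ValueError("pending_ids must not be empty")
    shared = pending & set(retained_ids)
    return 1 - len(shared) / len(pending)


S = ["song-1", "song-2", "song-3", "song-4"]  # identifiers S_i (hypothetical)
K = ["song-3", "song-4", "song-9"]            # identifiers K_j (hypothetical)
NOVELTY_THRESHOLD = 0.4                       # hypothetical preset threshold

value = novelty(S, K)
print(value, value > NOVELTY_THRESHOLD)  # 0.5 True
```

Under this assumed measure, a song list that shares every identifier with the retained data scores 0, and one that shares none scores 1.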

Step S103, analyzing the popularity and the novelty, and if the popularity is smaller than a preset popularity threshold and the novelty is greater than a preset novelty threshold, obtaining the user information of the multimedia data to be processed.

If the popularity is smaller than the preset popularity threshold and the novelty is larger than the preset novelty threshold, the multimedia data to be processed is not overly popular data and has low similarity to the retained multimedia data, so the user information of the multimedia data to be processed can be further acquired for analysis. The user information comprises the author of the multimedia data to be processed. For example, the author of the song list shown in Fig. 4 is "goldfish", which is the user information corresponding to that song list.

Step S104, processing the multimedia data to be processed according to the historical multimedia data information corresponding to the user information, the plurality of label lists, the popularity and the novelty of the multimedia data to be processed.

The historical multimedia data information comprises information such as the historical multimedia data score and the historical multimedia data quantity. Specifically, the retained multimedia data may be used as a training set, and the historical multimedia data information, the plurality of tag lists, the popularity, and the novelty of the multimedia data to be processed may be input as feature values into a logistic regression model for training, so as to determine whether the multimedia data to be processed meets the requirements.
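By way of illustration only, the following Python sketch trains a small logistic regression model by batch gradient descent. The three feature columns, the toy training rows and all names are hypothetical; encoding the tag lists as features is omitted for brevity, and a practical system would more likely rely on an established machine learning library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    """Return True when the model rates the data as meeting the requirements."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= 0.5


# Made-up rows: [author history score, popularity, novelty], scaled to [0, 1].
# Label 1 = retained, label 0 = not retained.
rows = [[0.9, 0.3, 0.8], [0.8, 0.4, 0.9], [0.7, 0.2, 0.7],
        [0.2, 0.5, 0.3], [0.1, 0.4, 0.2], [0.3, 0.3, 0.1]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(rows, labels)
print(predict(weights, bias, [0.85, 0.3, 0.75]))
print(predict(weights, bias, [0.15, 0.4, 0.2]))
```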

As can be seen from the above, the multimedia data processing method according to the embodiment of the present invention acquires the attribute information of the multimedia data to be processed, where the attribute information includes a plurality of tag lists, a number of multimedia items, a multimedia playing amount, and a multimedia identifier. The consistency of the tag lists is analyzed, and when the tag lists are consistent, the popularity is generated according to the number of multimedia items and the multimedia playing amount, and the novelty is generated according to the multimedia identifier. When the popularity is smaller than a preset popularity threshold and the novelty is larger than a preset novelty threshold, historical multimedia data information corresponding to the user information of the multimedia data to be processed is acquired. Finally, the multimedia data to be processed is processed according to the historical multimedia data information, the tag lists, the popularity and the novelty. The scheme not only analyzes the consistency of a plurality of detection variables of the multimedia data to be processed, but also analyzes the multimedia data to be processed at a deeper level, thereby effectively improving the accuracy of processing the multimedia data to be processed.

The multimedia data processing method described according to the above embodiment will be further explained by way of example. In the embodiments of the present invention, description will be made from the perspective of a multimedia data processing apparatus, which can be specifically integrated in an electronic device.

Referring to fig. 7, fig. 7 is another flowchart of a multimedia data processing method according to an embodiment of the present invention, where the method includes:

Step S201, receiving multimedia data to be processed, and obtaining attribute information of the multimedia data to be processed, where the attribute information includes a plurality of tag lists, a number of multimedia items, a multimedia playing amount, and a multimedia identifier.

As shown in Fig. 8, when the multimedia data to be processed includes a picture to be processed, acquiring the attribute information of the multimedia data to be processed in step S201 includes: extracting the noise value, the blur degree and the exposure of the picture to be processed, and scoring the picture to be processed according to a preset formula to generate a scoring result; judging whether the scoring result is smaller than a preset score threshold; and if the scoring result is not smaller than the preset score threshold, extracting the tags of the picture to be processed to form the tag list.

Specifically, the following preset formula can be adopted for calculation:

score = 1 - 0.8*blur - 0.1*noise - 0.1*abs(exposure)

where score is the scoring result, blur is the blur degree, noise is the noise value, exposure is the exposure, and abs() is the absolute value function. If the scoring result score is smaller than the preset score threshold, the quality of the picture to be processed does not reach the standard, so the multimedia data to be processed can be deleted or transferred to another storage device. If the scoring result score is not smaller than the preset score threshold, the picture to be processed can be further analyzed to obtain its tags and form a tag list.
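The preset formula above translates directly into code. In the Python sketch below, the threshold value 0.6 and the function names are hypothetical, and the inputs are assumed to be normalized so that the score stays in a comparable range.

```python
SCORE_THRESHOLD = 0.6  # hypothetical preset score threshold


def picture_quality_score(blur, noise, exposure):
    """score = 1 - 0.8*blur - 0.1*noise - 0.1*abs(exposure)."""
    return 1 - 0.8 * blur - 0.1 * noise - 0.1 * abs(exposure)


def picture_passes(blur, noise, exposure, threshold=SCORE_THRESHOLD):
    """True when the scoring result is not smaller than the threshold."""
    return picture_quality_score(blur, noise, exposure) >= threshold


print(picture_passes(0.1, 0.2, -0.3))  # sharp picture, slightly underexposed
print(picture_passes(0.9, 0.5, 0.5))   # heavily blurred picture
```

Because blur carries a weight of 0.8, a blurred picture fails the check almost regardless of its noise value and exposure.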

Specifically, network models such as a CNN (Convolutional Neural Network), a DBN (Deep Belief Network), an RNN (Recurrent Neural Network), or a recursive neural tensor network may be used to perform image recognition and face detection so as to extract the tags of the picture to be processed. For example, if analysis of the picture in Fig. 8 yields emotion parameter values of 0.012 for "negative" and 0.988 for "positive" for the person in the picture, the picture may be labeled with the tag "happy".

In some embodiments, multimedia data to be processed that contains objectionable information may be filtered out by analyzing the tags extracted from the picture to be processed. For example, a preset tag list including objectionable tags such as "violence" and "bloody" is set; when the tag list of the picture to be processed and the preset tag list share a tag, the multimedia data to be processed is regarded as containing objectionable information and can be deleted. It should be noted that, in this embodiment, the tags of the picture to be processed may alternatively be extracted and analyzed first, and the noise value, the blur degree and the exposure extracted afterwards to detect the picture quality.
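By way of illustration only, the comparison against the preset tag list amounts to a set intersection test. The set BAD_TAGS and the function name below are hypothetical.

```python
BAD_TAGS = {"violence", "bloody"}  # hypothetical preset tag list


def contains_bad_tag(picture_tags, bad_tags=BAD_TAGS):
    """True when the picture's tag list shares a tag with the preset list."""
    return not bad_tags.isdisjoint(picture_tags)


print(contains_bad_tag(["happy", "animation"]))  # False
print(contains_bad_tag(["violence", "night"]))   # True
```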

As shown in Fig. 9, when the multimedia data to be processed includes audio to be processed, acquiring the attribute information of the multimedia data to be processed in step S201 includes: acquiring a plurality of preset tags corresponding to the audio to be processed; and clustering the plurality of preset tags to obtain cluster tags, so as to form the tag list.

Specifically, the plurality of preset tags can be clustered using the K-means method to obtain K cluster tags, which form the tag list, where K is a positive integer. As shown in Fig. 9, in the multimedia data to be processed "my song list", each audio item to be processed has one or more preset tags. The preset tags are acquired first and then clustered to obtain 2 cluster tags, "animation" and "cure", and these 2 cluster tags form the tag list { animation, cure } corresponding to the audio to be processed.
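By way of illustration only, the following Python sketch clusters preset tags with a plain K-means loop and names each cluster after the member tag nearest to its centroid. The text does not state how tags are represented numerically, so the two-dimensional vectors in TAG_VECTORS, the choice of the first K points as initial centroids, and the rule for naming a cluster are all assumptions.

```python
import math

# Hypothetical 2-D embeddings of the preset tags attached to the audio items.
TAG_VECTORS = {
    "animation": (0.9, 0.1),
    "cure": (0.1, 0.9),
    "anime": (1.0, 0.2),
    "cartoon": (0.8, 0.0),
    "warm": (0.2, 1.0),
    "quiet": (0.0, 0.8),
}


def kmeans(points, k, iterations=20):
    """Plain K-means; the first k points serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


def cluster_tags(tag_vectors, k):
    """Return one representative tag per cluster (nearest to the centroid)."""
    tags = list(tag_vectors)
    centroids, assignment = kmeans([tag_vectors[t] for t in tags], k)
    result = []
    for c in range(k):
        members = [t for t, a in zip(tags, assignment) if a == c]
        if members:
            result.append(
                min(members, key=lambda t: math.dist(tag_vectors[t], centroids[c]))
            )
    return result


print(cluster_tags(TAG_VECTORS, 2))  # ['animation', 'cure']
```

Seeding the centroids with the first K points keeps the sketch deterministic; a production system would normally use a more robust initialization.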

When the multimedia data to be processed includes a text to be processed, acquiring the attribute information of the multimedia data to be processed in step S201 includes: extracting, based on a preset phrase template, noun phrases as tags from text to be processed that is within a preset word count range, to form the tag list; and/or extracting, based on the TextRank algorithm, tags from text to be processed that exceeds the preset word count range, to form the tag list.

The TextRank algorithm is used to generate keywords and summaries for a text. Specifically, assume that the preset word count range is 0-10. As shown in Fig. 4, since the song list title "severe Cantonese classic good song" is within the preset word count range, the keyword "Cantonese" can be extracted from the song list title based on the preset phrase template and used as a tag. The description text of the song list exceeds the preset word count range, so the keyword "classic" can be extracted from it based on the TextRank algorithm. These tags then form the tag list { Cantonese, classic }.
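By way of illustration only, the length-based choice between the two extraction paths can be sketched in Python as follows. The TextRank part is a simplified keyword variant (PageRank over an undirected word co-occurrence graph); the vocabulary lookup KNOWN_TAGS merely stands in for the preset phrase template, whose form the text does not specify, and the sample description text is made up.

```python
MAX_SHORT_WORDS = 10                   # preset word count range 0-10
KNOWN_TAGS = {"cantonese", "classic"}  # hypothetical stand-in for the template


def textrank_keywords(words, top_k=1, window=2, damping=0.85, iterations=50):
    """Rank words with PageRank over an undirected co-occurrence graph."""
    neighbors = {w: set() for w in words}
    for i, w in enumerate(words):
        for other in words[i + 1:i + window]:
            if other != w:
                neighbors[w].add(other)
                neighbors[other].add(w)
    rank = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        rank = {
            w: (1 - damping)
            + damping * sum(rank[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }
    return sorted(rank, key=rank.get, reverse=True)[:top_k]


def extract_tags(text, top_k=1):
    """Short text: template lookup. Long text: TextRank keywords."""
    words = text.lower().split()
    if len(words) <= MAX_SHORT_WORDS:
        return [w for w in words if w in KNOWN_TAGS]
    return textrank_keywords(words, top_k=top_k)


title = "severe Cantonese classic good song"
description = ("classic melancholy classic vicissitudes classic ballads "
               "classic evenings classic memories classic")  # made-up text
print(extract_tags(title))
print(extract_tags(description))
```

In the made-up description every other word co-occurs only with "classic", so "classic" is the hub of the graph and receives the highest rank.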

Step S202, based on the preset mapping relationship, the tags in the tag list are mapped to the corresponding main words in the preset main word library, so that each tag list forms a corresponding main word list.

Step S203, determine whether any two of the main word lists have the same main word.

The main word lists b1, b2, b3 and b4 are compared two by two. If every two main word lists share a main word, the tag lists a1, a2, a3 and a4 are determined to be consistent, and the process proceeds to step S204; if some two main word lists do not share any main word, the tag lists a1, a2, a3 and a4 are determined to be inconsistent, and the process proceeds to step S211.

Step S204, if any two main word lists have the same main word, the popularity of the multimedia data to be processed is generated according to the multimedia number and the multimedia playing amount, and the novelty of the multimedia data to be processed is generated according to the multimedia identification.

Since the main word lists b1, b2, b3 and b4 all contain the main word "cure", the tag lists a1, a2, a3 and a4 can be determined to be consistent, which also indicates that the styles of the various types of multimedia data in the multimedia data to be processed are uniform, so the popularity and novelty of the multimedia data to be processed are further detected.

In a specific implementation, the popularity of the multimedia data to be processed can be generated according to the number of multimedia items and the multimedia playing amount. The average multimedia playing amount h can be calculated according to the following formula:

$$h = \frac{1}{N}\sum_{i=1}^{N} P_i$$

where P_i denotes the multimedia playing amount of the i-th multimedia item, N denotes the number of multimedia items, and N is a positive integer. The popularity can be determined according to the playing amount level in which the average multimedia playing amount falls. For example, if the multimedia data contains a plurality of songs, multimedia data whose average song playing amount exceeds 1 million times per month can be assigned a first playing amount level, corresponding to a first popularity level; and multimedia data whose average song playing amount is between 500,000 and 1 million times per month can be assigned a second playing amount level, corresponding to a second popularity level.

To generate the novelty of the multimedia data to be processed, the multimedia identifiers of the retained multimedia data are acquired.

Step S205, judging whether the popularity is smaller than a preset popularity threshold and the novelty is greater than a preset novelty threshold.

As described above, the upper popularity value 6 may be set as the preset popularity threshold; it should be noted that the preset popularity threshold is not specifically limited in this embodiment. When the popularity is greater than or equal to the preset popularity threshold, the process may proceed to step S211 to delete the multimedia data to be processed or transfer it to another storage device.

A preset novelty threshold may also be set. When the novelty is greater than the preset novelty threshold, the similarity between the multimedia data to be processed and the retained multimedia data is low. When the novelty is less than or equal to the preset novelty threshold, the similarity between the multimedia data to be processed and the retained multimedia data is high, and in order to increase the richness of the multimedia data, the process may proceed to step S211 to delete the multimedia data to be processed.

If the popularity is smaller than the predetermined popularity threshold and the novelty is greater than the predetermined novelty threshold, it indicates that the multimedia data to be processed is not overheated data and has low similarity to the retained multimedia data, so step S206 may be performed to obtain the user information of the multimedia data to be processed for analysis.
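The branch taken after step S205 can be sketched as follows. The novelty threshold of 0.5 and the function name are hypothetical; the text only suggests 6 as a possible popularity threshold and does not fix either value.

```python
# The popularity threshold of 6 is the value suggested in the text;
# the novelty threshold is an assumed placeholder, not given in the text.
PRESET_POPULARITY_THRESHOLD = 6.0
PRESET_NOVELTY_THRESHOLD = 0.5


def next_step(popularity: float, novelty: float,
              popularity_threshold: float = PRESET_POPULARITY_THRESHOLD,
              novelty_threshold: float = PRESET_NOVELTY_THRESHOLD) -> str:
    """Return the step that follows the check made in step S205."""
    if popularity < popularity_threshold and novelty > novelty_threshold:
        return "S206"  # obtain the user information for further analysis
    return "S211"      # delete or transfer the multimedia data to be processed
```

Both conditions must hold for processing to continue; failing either one routes the data to step S211.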

In step S206, if the popularity is smaller than the predetermined popularity threshold and the novelty is greater than the predetermined novelty threshold, the user information of the multimedia data to be processed is obtained.

Wherein the user information comprises an author of the multimedia data to be processed. The author of the song list shown in fig. 4 is goldfish, i.e., user information corresponding to the song list.

Step S207, obtaining the historical multimedia data information corresponding to the user information.

The historical multimedia data information corresponding to the user information can be analyzed statistically to predict the quality of the multimedia data to be processed. Specifically, the historical multimedia data score and the historical multimedia data quantity corresponding to the user information can be obtained for analysis. If the historical multimedia data score is smaller than a preset multimedia data score and the historical multimedia data quantity is larger than a preset multimedia data quantity, the quality of the multimedia data to be processed is likely to be poor, and the multimedia data to be processed can therefore be deleted or transferred to another storage device.

Taking a user-edited song list as an example, if the user's average monthly number of submissions is twice that of other users but none of the submitted song lists has been retained, the song list to be processed from that user can be predicted to be of poor quality, so the song list submitted by the user can be deleted or transferred to another storage device.
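A minimal sketch of this prediction is given below. The class name, field names, and the preset values 3.0 and 20 are hypothetical placeholders; the text states only that a low historical score combined with a high historical quantity suggests poor quality.

```python
from dataclasses import dataclass


@dataclass
class UserHistory:
    """Historical multimedia data information for one author (assumed shape)."""
    historical_score: float   # score of the author's earlier submissions
    historical_quantity: int  # number of earlier submissions


def likely_poor_quality(history: UserHistory,
                        preset_score: float = 3.0,     # assumed preset value
                        preset_quantity: int = 20) -> bool:  # assumed preset value
    """True when the score is below and the quantity is above the presets."""
    return (history.historical_score < preset_score
            and history.historical_quantity > preset_quantity)
```

Under these assumed presets, an author with many low-scoring submissions would be flagged, while a prolific author with good scores would not.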

Step S208, using the retained multimedia data as a training set, the historical multimedia data information, the plurality of tag lists, the popularity, and the novelty are input as feature values into a logistic regression model for training to obtain a training result.

Some of the data examined in steps S201 to S207 cannot be measured accurately. For example, in step S204, when the popularity hot_score of the multimedia data to be processed is less than 0.69, the multimedia data processing apparatus lacks sufficient prior knowledge about such multimedia data to make a determination. Therefore, for such uncertain detection data, a logistic regression model can additionally be used for learning and training, so as to improve the accuracy with which the multimedia data to be processed is handled.

Specifically, the retained multimedia data may be used as a training set, and the historical multimedia data information, together with the plurality of tag lists, the popularity, and the novelty of the multimedia data to be processed, may be input as feature values into a logistic regression model for training, so as to determine whether the multimedia data to be processed satisfies a preset condition.
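The training step can be illustrated with a small, dependency-free logistic regression fitted by batch gradient descent. Several points are assumptions rather than part of the embodiment: the features are taken to be already scaled to the range 0 to 1, the encoding of the tag lists is omitted because the text does not specify one, rejected samples labeled 0 are assumed to be available alongside the retained samples labeled 1, and the preset condition is taken to be a predicted probability of at least 0.5.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X: Sequence[Sequence[float]], y: Sequence[int],
          lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Predicted probability that a sample should be retained."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy training set (invented for illustration). Assumed feature order:
# [historical score, popularity, novelty], each scaled to 0..1.
X_TRAIN = [[0.9, 0.4, 0.8], [0.8, 0.5, 0.9], [0.7, 0.3, 0.7],
           [0.2, 0.9, 0.1], [0.3, 0.8, 0.2], [0.1, 0.95, 0.15]]
Y_TRAIN = [1, 1, 1, 0, 0, 0]  # 1 = retained, 0 = rejected (assumed labels)

weights, bias = train(X_TRAIN, Y_TRAIN)
meets_condition = predict_proba(weights, bias, [0.85, 0.4, 0.85]) >= 0.5
```

In practice an established library implementation would normally be preferred; the hand-written version is used here only to keep the sketch self-contained.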

In step S209, it is determined whether the training result satisfies a predetermined condition.

If the training result meets the preset condition, the step S210 is executed; if the training result does not satisfy the preset condition, the process proceeds to step S211.

Step S210, if the training result satisfies a predetermined condition, the multimedia data to be processed is retained.

If the training result meets the preset condition, the multimedia data to be processed is proved to meet the preset quality requirement, so that the multimedia data to be processed can be reserved and sent to a professional for further processing.

Step S211, if there are two main word lists without the same main word, or the popularity is greater than or equal to a preset popularity threshold, or the novelty is less than or equal to a preset novelty threshold, or if the training result does not satisfy a preset condition, deleting the multimedia data to be processed.

If the plurality of tag lists are determined to be inconsistent in step S203, the styles of the multimedia data of different file types in the multimedia data to be processed are not uniform, so the multimedia data to be processed can be deleted or transferred to other storage devices.

Similarly, if the popularity is greater than or equal to the preset popularity threshold, the multimedia data to be processed is overheated multimedia data; if the novelty is less than or equal to the preset novelty threshold, its similarity to other multimedia data is too high; and if the training result does not satisfy the preset condition, it does not meet the preset quality requirement. In each of these cases, the multimedia data to be processed can be deleted or transferred to other storage devices.

As can be seen from the above, the multimedia data processing method according to the embodiment of the present invention adopts the corresponding tag extraction method for the multimedia data to be processed of different file types, so that a more accurate tag list can be obtained. Furthermore, the quality of the multimedia data to be processed is predicted according to the historical multimedia data information corresponding to the user information, so that the accuracy of processing the multimedia data to be processed is further improved.

The present embodiment is further described below from the perspective of a multimedia data processing apparatus, which may be integrated in an electronic device, in accordance with the method described in the above embodiment.

Referring to fig. 10, fig. 10 is a block diagram of a multimedia data processing apparatus according to an embodiment of the present invention. The apparatus may include a receiving module 301, a determining module 302, an analysis module 303, and a processing module 304.

(1) Receiving module 301

The receiving module 301 is configured to receive multimedia data to be processed and obtain attribute information of the multimedia data to be processed, where the attribute information includes a plurality of tag lists, a multimedia number, a multimedia playing amount, and a multimedia identifier, and each tag list is used to label one type of multimedia data to be processed.

From the viewpoint of file type, multimedia data can be divided into different kinds, such as audio, video, pictures, and text related to music. As shown in fig. 4, the multimedia data is presented in the form of a song list that includes several songs. The song list includes not only the audio, video, and pictures corresponding to the songs, but also text such as the song list title and the song list description. The attribute information of the multimedia data uses tags to mark the essential features or characteristics of the various kinds of multimedia data, such as the style or the size of the audio. In the preferred embodiment, the attribute information of the multimedia data to be processed acquired by the receiving module 301 includes a plurality of tag lists, a number of multimedia, a multimedia playing amount, a multimedia identifier, and the like.

In the embodiment of the present invention, each type of multimedia data to be processed corresponds to one tag list; that is, one tag list is used for labeling one type of multimedia data to be processed. Each tag list includes one or more tags. For example, for the song list in fig. 4, the receiving module 301 may extract the tags "cantonese" and "classic" from the song list title, "rewarming the classic good song in cantonese", and the tags "melancholy", "vicissitudes", and "classic" from the song list description. These tags constitute a tag list { cantonese, classic, melancholy, vicissitudes, classic }, which is used for labeling the text in the song list. Similarly, the receiving module 301 analyzes the audio "friend" to obtain the tags "cantonese" and "popular", analyzes the audio "semiminor evening primrose" to obtain the tags "cantonese" and "popular", and analyzes all the audios in the song list in sequence to obtain all the tags, which form a tag list { cantonese, popular } used for labeling the audios in the song list.

Specifically, as shown in fig. 11, the receiving module 301 includes a scoring sub-module 3011, a first determining sub-module 3012, and an extracting sub-module 3013.

When the multimedia data to be processed includes a picture to be processed, the scoring sub-module 3011 is configured to extract a noise value, a blur degree, and an exposure degree of the picture to be processed, and to score the picture to be processed according to a preset formula to generate a scoring result. The first determining sub-module 3012 is configured to determine whether the scoring result is smaller than a preset score threshold. The extracting sub-module 3013 is configured to extract a tag of the picture to be processed to form the tag list when the scoring result is not smaller than the preset score threshold.

Specifically, the scoring sub-module 3011 may calculate the scoring result by using the following preset formula:

score = 1 - 0.8 * blur - 0.1 * noise - 0.1 * abs(exposure)

where score is the scoring result, blur is the blur degree, noise is the noise value, exposure is the exposure degree, and abs() is the absolute value function. If the first determining sub-module 3012 determines that the scoring result score is smaller than the preset score threshold, the quality of the picture to be processed does not meet the standard, and the extracting sub-module 3013 may therefore delete the multimedia data to be processed or transfer it to another storage device. If the first determining sub-module 3012 determines that the scoring result score is not smaller than the preset score threshold, the extracting sub-module 3013 may further analyze the picture to be processed to obtain the tags corresponding to the picture and form a tag list.
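The scoring formula and the threshold check can be written out directly. The threshold value of 0.6 is a hypothetical placeholder, since the text does not give the preset score threshold, and the function names are chosen for illustration.

```python
def picture_score(blur: float, noise: float, exposure: float) -> float:
    """score = 1 - 0.8 * blur - 0.1 * noise - 0.1 * abs(exposure)"""
    return 1 - 0.8 * blur - 0.1 * noise - 0.1 * abs(exposure)


def should_extract_tags(score: float, preset_score_threshold: float = 0.6) -> bool:
    """Tags are extracted only when the score is not below the threshold.

    The default threshold of 0.6 is an assumed value.
    """
    return score >= preset_score_threshold
```

Because the blur degree carries a weight of 0.8, it dominates the result: a picture with a blur degree of 0.5 already loses 0.4 of the score, whereas the same noise value would cost only 0.05.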

Specifically, the extracting sub-module 3013 may use network models such as a CNN (Convolutional Neural Network), a DBN (Deep Belief Network), an RNN (Recurrent Neural Network), or a recursive neural tensor network to perform image recognition and face detection, so as to extract the tags of the picture to be processed. For example, if the extracting sub-module 3013 analyzes the picture in fig. 8 and obtains emotion parameter values for the person in the picture of 0.012 for negative and 0.988 for positive, the picture can be given the tag "joy".

In some embodiments, the extracting sub-module 3013 may further filter out multimedia data to be processed that contains bad information by analyzing the extracted tags of the picture to be processed. For example, a preset tag list including bad tags such as "violence" and "bloody" is set, and when the tag list of the picture to be processed shares a tag with the preset tag list, the multimedia data to be processed is regarded as containing bad information and can be deleted. It should be noted that, in this embodiment, the extracting sub-module 3013 may alternatively extract the tags of the picture to be processed for analysis first, and then extract the noise value, the blur degree, and the exposure degree to check the picture quality.
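The comparison against the preset list of bad tags amounts to testing whether two tag collections share an element. A minimal sketch follows; the function name and the exact tag strings are illustrative assumptions.

```python
from typing import Iterable, Set

# Example bad tags; a real preset tag list would be configured by the operator.
BAD_TAGS: Set[str] = {"violence", "bloody"}


def contains_bad_information(picture_tags: Iterable[str],
                             bad_tags: Set[str] = BAD_TAGS) -> bool:
    """True when the picture's tag list shares any tag with the preset list."""
    return not bad_tags.isdisjoint(picture_tags)
```

A single shared tag is enough for the multimedia data to be treated as containing bad information.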

As shown in fig. 11, the receiving module 301 further includes an obtaining sub-module 3014 and a clustering sub-module 3015. When the multimedia data to be processed includes the audio to be processed, the obtaining sub-module 3014 is configured to obtain a plurality of preset tags corresponding to the audio to be processed. The clustering sub-module 3015 is configured to cluster the plurality of preset tags to obtain clustered tags, so as to form the tag list.

Specifically, the clustering sub-module 3015 may cluster the plurality of preset tags by using the K-means method to obtain K cluster tags, which form a tag list, where K is a positive integer. As shown in fig. 9, in the multimedia data to be processed "my song list", each audio to be processed has one or more preset tags. The preset tags are obtained first and then clustered to obtain 2 cluster tags, "animation" and "cure", which form the tag list { animation, cure } corresponding to the audio to be processed.

As shown in fig. 11, the receiving module 301 further includes a first extraction sub-module 3016 and a second extraction sub-module 3017. When the multimedia data to be processed includes a text to be processed, the first extraction sub-module 3016 is configured to extract, based on a preset phrase template, noun phrases as tags from text to be processed that falls within a preset word count range, to form the tag list; and/or the second extraction sub-module 3017 is configured to extract, based on the TextRank algorithm, tags from text to be processed that exceeds the preset word count range, to form the tag list.

Assuming that the preset word count range is 0-10, in the song list shown in fig. 4, since the song list title, "rewarming the classic good song in cantonese", is within the preset word count range, the first extraction sub-module 3016 may extract "cantonese" from the song list title as a tag based on the preset phrase template. In the same song list, the description text exceeds the preset word count range, so the second extraction sub-module 3017 may extract "classic" from it based on the TextRank algorithm. These tags then form a tag list { cantonese, classic }.

(2) Determining module 302

The determining module 302 is configured to determine whether the tag lists are consistent and, if so, to generate the popularity of the multimedia data to be processed according to the number of multimedia and the multimedia playing amount, and to generate the novelty of the multimedia data to be processed according to the multimedia identifier.

As shown in fig. 12, the determining module 302 may specifically include a mapping sub-module 3021, a judging sub-module 3022, and a first determining sub-module 3023.

The mapping sub-module 3021 is configured to map, based on a preset mapping relationship, the tags in the tag list to corresponding main words in a preset main word library, so that each tag list forms a corresponding main word list. The judging sub-module 3022 is configured to judge whether any two main word lists have the same main word. The first determining sub-module 3023 is configured to determine that the plurality of tag lists are consistent when any two main word lists have the same main word.

Assume that the preset main word library is { cure, ACG (animation, comics, and games), excited }. The mapping relationship between the tags and the main words in the preset main word library is shown in fig. 5, in which different tags surround their corresponding main word; for example, the main word "cure" has a mapping relationship with the tags "warm", "pure tone", "alone", and the like. As shown in fig. 6, the multimedia data to be processed includes multimedia data of four file types: a text to be processed, a picture to be processed, an audio to be processed, and a video to be processed. If the tag list a1 corresponding to the text to be processed is { fresh, love song, steve }, the mapping sub-module 3021 maps the tags in the tag list a1 to the preset main word library to obtain the corresponding main word list b1 { cure, excited }. If the tag list a2 corresponding to the picture to be processed is { animation, warm }, the mapping sub-module 3021 maps the tags in the tag list a2 to the preset main word library to obtain the corresponding main word list b2 { ACG, cure }. If the tag list a3 corresponding to the audio to be processed is { pure tone, lyric, passion }, the mapping sub-module 3021 maps the tags in the tag list a3 to the preset main word library to obtain the corresponding main word list b3 { cure, excited }. If the tag list a4 corresponding to the video to be processed is { pure tone, love }, the mapping sub-module 3021 maps the tags in the tag list a4 to the preset main word library to obtain the corresponding main word list b4 { cure }.

The judging sub-module 3022 compares the main word lists b1, b2, b3, and b4 pairwise. If any two main word lists have the same main word, the first determining sub-module 3023 determines that the tag lists a1, a2, a3, and a4 are consistent; if two main word lists exist that do not have the same main word, it determines that the tag lists a1, a2, a3, and a4 are not consistent. Since the main word lists b1, b2, b3, and b4 all contain the main word "cure", the tag lists a1, a2, a3, and a4 can be determined to be consistent, which also indicates that the styles of the various types of multimedia data in the multimedia data to be processed are uniform.
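The mapping and pairwise comparison can be sketched as below. The mapping table is illustrative: the entries for "warm", "pure tone", "alone", and "animation" follow the examples in the text, while the entries for "lyric", "passion", and "love" are assumptions chosen so that the example lists a2, a3, and a4 reproduce the main word lists b2, b3, and b4. Tags without a mapping are simply skipped, which is also an assumption.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set

# Illustrative tag -> main word mapping (partly assumed, see the lead-in).
TAG_TO_MAIN_WORD: Dict[str, str] = {
    "warm": "cure", "pure tone": "cure", "alone": "cure",
    "animation": "ACG",
    "lyric": "cure", "passion": "excited", "love": "cure",  # assumed entries
}


def to_main_word_list(tags: Iterable[str],
                      mapping: Dict[str, str] = TAG_TO_MAIN_WORD) -> Set[str]:
    """Map each tag to its main word; unmapped tags are skipped."""
    return {mapping[tag] for tag in tags if tag in mapping}


def tag_lists_consistent(tag_lists: List[List[str]],
                         mapping: Dict[str, str] = TAG_TO_MAIN_WORD) -> bool:
    """True when every pair of main word lists shares at least one main word."""
    main_word_lists = [to_main_word_list(tags, mapping) for tags in tag_lists]
    return all(a & b for a, b in combinations(main_word_lists, 2))


a2 = ["animation", "warm"]             # picture tags -> {ACG, cure}
a3 = ["pure tone", "lyric", "passion"] # audio tags   -> {cure, excited}
a4 = ["pure tone", "love"]             # video tags   -> {cure}
consistent = tag_lists_consistent([a2, a3, a4])
```

With these lists every pair shares the main word "cure", so the tag lists are judged consistent; adding a list that maps only to "ACG" would break the pairwise condition.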

If the tag lists are inconsistent, the styles of the multimedia data of different file types in the multimedia data to be processed are not uniform, so the multimedia data to be processed can be deleted or transferred to other storage devices. If the tag lists are consistent, the determining module 302 can further detect the popularity and novelty of the multimedia data to be processed.

The determining module 302 can generate the popularity of the multimedia data to be processed according to the number of multimedia and the multimedia playing amount, and the specific steps are as follows:

Specifically, the determining module 302 may calculate the average multimedia playing amount h according to the following formula:

In some embodiments, the determining module 302 may also directly calculate the popularity hot_score according to the following formula:

Statistical analysis of a large amount of multimedia data shows that, when the popularity hot_score of the multimedia data to be processed is less than 0.69, the corresponding average multimedia playing amount P is less than 5 plays per month. Since the multimedia data processing apparatus lacks sufficient prior knowledge about such multimedia data to make a determination, such multimedia data to be processed may be retained for further detection. For multimedia data to be processed whose popularity hot_score is greater than 6, the average multimedia playing amount P exceeds one million plays per month, and statistics show that such multimedia data tends to be a simple collection of popular videos and popular audios with an unclear theme and non-uniform style, so it can be deleted or transferred to other storage devices.
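The formula for hot_score does not appear in this text. One form that is consistent with the two figures quoted above (a hot_score of about 0.69 at roughly 5 plays per month, and a hot_score of 6 at roughly one million plays per month) is the base-10 logarithm of the average playing amount. The sketch below uses that form purely as an assumption; it should not be read as the formula of the embodiment, and the function names and return labels are likewise hypothetical.

```python
import math
from typing import Sequence

LOW_HOT_SCORE = 0.69   # below this, prior knowledge is insufficient
HIGH_HOT_SCORE = 6.0   # preset popularity threshold suggested in the text


def hot_score(playing_amounts: Sequence[float]) -> float:
    """ASSUMED form: log10 of the average playing amount per month."""
    if not playing_amounts:
        raise ValueError("at least one multimedia item is required")
    avg = sum(playing_amounts) / len(playing_amounts)
    return math.log10(avg) if avg > 0 else float("-inf")


def classify_by_hot_score(playing_amounts: Sequence[float]) -> str:
    """Route the multimedia data according to the two hot_score bounds."""
    score = hot_score(playing_amounts)
    if score >= HIGH_HOT_SCORE:
        return "delete_or_transfer"            # overheated data
    if score < LOW_HOT_SCORE:
        return "retain_for_further_detection"  # too little prior knowledge
    return "continue"
```

Under this assumed form, a song list averaging 2,000 plays per month scores about 3.3 and proceeds to the novelty check.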

In summary, the upper popularity limit of 6 may be set as the preset popularity threshold. When the popularity is greater than or equal to the preset popularity threshold, the multimedia data to be processed can be deleted or transferred to other storage devices.

As shown in fig. 12, the determining module 302 further includes an identifier obtaining sub-module 3024 and a second determining sub-module 3025. The identifier obtaining sub-module 3024 is configured to obtain a multimedia identifier of the retained multimedia data. The second determining sub-module 3025 is configured to determine the novelty of the multimedia data to be processed according to the multimedia identifier of the retained multimedia data and the multimedia identifier of the multimedia data to be processed.

A preset novelty threshold may be set. When the novelty is greater than the preset novelty threshold, the similarity between the multimedia data to be processed and the retained multimedia data is low. When the novelty is not greater than the preset novelty threshold, the similarity between the multimedia data to be processed and the retained multimedia data is high, and the multimedia data to be processed can be deleted in order to improve the richness of the multimedia data.
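The passage above states only that novelty is derived from the multimedia identifiers of the retained data and of the data to be processed. As one hypothetical way to realize this, the sketch below defines novelty as one minus the largest Jaccard overlap between the identifier set of the data to be processed and the identifier set of any retained item. The Jaccard measure, the function names, and the identifier strings are all assumptions.

```python
from typing import Iterable, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of identifiers common to both sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def novelty(pending_ids: Iterable[str],
            retained_id_sets: Sequence[Iterable[str]]) -> float:
    """ASSUMED definition: 1 minus the highest overlap with any retained item."""
    pending = set(pending_ids)
    if not retained_id_sets:
        return 1.0  # nothing retained yet, so the data is entirely new
    return 1.0 - max(jaccard(pending, set(ids)) for ids in retained_id_sets)
```

With this definition, a song list identical to a retained one has novelty 0, and a song list sharing no songs with any retained list has novelty 1, which matches the inverse relation between novelty and similarity described above.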

(3) Analysis module 303

The analysis module 303 is configured to analyze the popularity and the novelty and, if the popularity is smaller than a preset popularity threshold and the novelty is greater than a preset novelty threshold, to obtain user information of the multimedia data to be processed.

(4) Processing module 304

The processing module 304 is configured to process the multimedia data to be processed according to the historical multimedia data information corresponding to the user information, and the tag lists, the popularity, and the novelty of the multimedia data to be processed.

Preferably, the processing module 304 may statistically analyze the historical multimedia data information corresponding to the user information to predict the quality of the multimedia data to be processed. Specifically, the historical multimedia data score and the historical multimedia data quantity corresponding to the user information can be obtained for analysis. If the historical multimedia data score is smaller than a preset multimedia data score and the historical multimedia data quantity is larger than a preset multimedia data quantity, the quality of the multimedia data to be processed is likely to be poor, so the processing module 304 can delete the multimedia data to be processed or transfer it to another storage device.

Taking a user-edited song list as an example, if the user's average monthly number of submissions is twice that of other users but none of the submitted song lists has been retained, the song list to be processed from that user can be predicted to be of poor quality, so the processing module 304 can delete the song list submitted by the user or transfer it to other storage devices.

As shown in fig. 13, the processing module 304 includes an information obtaining sub-module 3041, a training sub-module 3042, a second judging sub-module 3043, and a reserving sub-module 3044.

The information obtaining sub-module 3041 is configured to obtain historical multimedia data information corresponding to the user information. The training sub-module 3042 is configured to use the retained multimedia data as a training set and input the historical multimedia data information, the plurality of tag lists, the popularity, and the novelty as feature values into a logistic regression model for training, so as to obtain a training result. The second judging sub-module 3043 is configured to judge whether the training result meets a preset condition. The reserving sub-module 3044 is configured to retain the multimedia data to be processed when the training result meets the preset condition.

Specifically, the training sub-module 3042 may use the retained multimedia data as a training set and input the historical multimedia data score, the historical multimedia data quantity, and the plurality of tag lists, the popularity, and the novelty of the multimedia data to be processed as feature values into a logistic regression model for training. The second judging sub-module 3043 then judges whether the multimedia data to be processed satisfies a preset condition, and when the preset condition is satisfied, the reserving sub-module 3044 retains the multimedia data to be processed.

The multimedia data processing apparatus of the embodiment of the invention acquires the attribute information of the multimedia data to be processed, where the attribute information includes a plurality of tag lists, a number of multimedia, a multimedia playing amount, and a multimedia identifier. The consistency of the tag lists is analyzed, and when the tag lists are consistent, the popularity is generated according to the number of multimedia and the multimedia playing amount, and the novelty is generated according to the multimedia identifier. When the popularity is smaller than a preset popularity threshold and the novelty is larger than a preset novelty threshold, the historical multimedia data information corresponding to the user information of the multimedia data to be processed is acquired. Finally, the multimedia data to be processed is processed according to the historical multimedia data information, the tag lists, the popularity, and the novelty. This scheme not only analyzes the consistency of a plurality of detection variables of the multimedia data to be processed, but also analyzes the multimedia data to be processed in depth, thereby effectively improving the accuracy of processing the multimedia data to be processed.

Accordingly, an embodiment of the present invention further provides an electronic device. Fig. 14 shows a schematic structural diagram of the electronic device according to the embodiment of the present invention. Specifically:

The electronic device may include components such as a processor 401 having one or more processing cores, a memory 402 comprising one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 14 does not constitute a limitation of the electronic device, which may include more or fewer components than those shown, may combine some components, or may use a different arrangement of components. Wherein:

The processor 401 is the control center of the electronic device. It connects the various parts of the whole electronic device by using various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby monitoring the electronic device as a whole. Optionally, the processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor may alternatively not be integrated into the processor 401.

The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to the use of the electronic device, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.

The electronic device further includes a power supply 403 for supplying power to the various components. Preferably, the power supply 403 is logically connected to the processor 401 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 403 may also include any components such as one or more DC or AC power sources, recharging systems, power failure detection circuits, power converters or inverters, and power status indicators.

The electronic device may further include an input unit 404, which may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.

Although not shown, the electronic device may further include a display unit and the like, which are not described in detail here. Specifically, in this embodiment, the processor 401 in the electronic device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application program stored in the memory 402, thereby implementing the following functions:

judging whether the tag lists are consistent, if so, generating the popularity of the multimedia data to be processed according to the number of the multimedia and the multimedia playing amount, and generating the novelty of the multimedia data to be processed according to the multimedia identifier;

The electronic device can achieve the beneficial effects achievable by any multimedia data processing apparatus provided in the embodiments of the present invention; these are detailed in the foregoing embodiments and are not described here again.

The electronic device of the embodiment of the invention acquires the attribute information of the multimedia data to be processed, where the attribute information includes a plurality of tag lists, a number of multimedia, a multimedia playing amount, and a multimedia identifier. The consistency of the tag lists is analyzed, and when the tag lists are consistent, the popularity is generated according to the number of multimedia and the multimedia playing amount, and the novelty is generated according to the multimedia identifier. When the popularity is smaller than a preset popularity threshold and the novelty is larger than a preset novelty threshold, the historical multimedia data information corresponding to the user information of the multimedia data to be processed is acquired. Finally, the multimedia data to be processed is processed according to the historical multimedia data information, the tag lists, the popularity, and the novelty. This scheme not only analyzes the consistency of a plurality of detection variables of the multimedia data to be processed, but also analyzes the multimedia data to be processed in depth, thereby effectively improving the accuracy of processing the multimedia data to be processed.

Various operations of embodiments are provided herein. In one embodiment, the one or more operations may constitute computer readable instructions stored on one or more computer readable media, which, when executed by an electronic device, will cause the electronic device to perform the operations. The order in which some or all of the operations are described should not be construed as to imply that these operations are necessarily order dependent. Those skilled in the art will appreciate alternative orderings having the benefit of this description. Moreover, it should be understood that not all operations are necessarily present in each embodiment provided herein.

Moreover, although the disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art based upon a reading and understanding of this specification and the annexed drawings. The present disclosure includes all such modifications and alterations, and is limited only by the scope of the appended claims. In particular regard to the various functions performed by the above described components (e.g., elements, resources, etc.), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (e.g., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the herein illustrated exemplary implementations of the disclosure. In addition, while a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for a given or particular application. Furthermore, to the extent that the terms "includes," "has," "contains," or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising."

Each functional unit in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium. The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Each apparatus or system described above may perform the method in the corresponding method embodiment.

In summary, although the present invention has been disclosed in the foregoing embodiments, the serial numbers preceding the embodiments are used for convenience of description only and do not limit the order of the embodiments. Furthermore, the above embodiments are not intended to limit the present invention. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the present invention; therefore, the scope of the present invention shall be defined by the appended claims.

Claims

1. A method for processing multimedia data, comprising:

2. The method of claim 1, wherein the step of determining whether the tag lists are consistent comprises:

mapping the labels in the label list to corresponding main words in a preset main word library based on a preset mapping relation so that each label list forms a corresponding main word list;

judging whether any two main word lists have the same main word or not;

and if any two main word lists have the same main word, determining that the plurality of label lists are consistent.
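By way of non-limiting illustration only, the consistency check recited above may be sketched as follows. The mapping table `TAG_TO_MAIN_WORD` and the function names are hypothetical, and the sketch assumes that "any two" means every pair of main word lists must share at least one main word; other readings are possible.

```python
from itertools import combinations

# Hypothetical preset mapping from raw tags to main words.
# The claim only states that the mapping relation is "preset".
TAG_TO_MAIN_WORD = {
    "rock n roll": "rock",
    "hard rock": "rock",
    "rock": "rock",
    "piano": "instrumental",
    "guitar solo": "instrumental",
}


def to_main_words(tags, mapping=TAG_TO_MAIN_WORD):
    """Map each tag to its main word; tags absent from the mapping are dropped."""
    return {mapping[tag] for tag in tags if tag in mapping}


def tag_lists_consistent(tag_lists, mapping=TAG_TO_MAIN_WORD):
    """Return True when every pair of main word lists shares a main word.

    Assumption: "any two" is read as "every pair".
    """
    main_word_lists = [to_main_words(tags, mapping) for tags in tag_lists]
    return all(a & b for a, b in combinations(main_word_lists, 2))
```

For example, a title tag list `["hard rock"]`, a picture tag list `["rock n roll", "piano"]` and an audio tag list `["rock"]` all map to the main word "rock" and would be treated as consistent under this reading.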

3. The method according to claim 1, wherein the multimedia data to be processed comprises text to be processed, and the step of obtaining attribute information of the multimedia data to be processed comprises:

extracting nominal phrases from the text to be processed within a preset word number range as tags based on a preset phrase template to form a tag list; and/or

and extracting tags from the text to be processed which exceeds a preset word number range based on a TextRank algorithm to form the tag list.
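By way of non-limiting illustration only, the TextRank branch for longer text may be sketched as a PageRank-style iteration over a word co-occurrence graph. The tokenization (lowercase ASCII words), the stop word set, the window size and the damping factor are all assumptions made for this sketch and are not taken from the claims; the phrase-template branch for shorter text is not shown.

```python
import re
from collections import defaultdict

# Hypothetical stop word set; a real system would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with"}


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=30):
    """Rank words by a TextRank-style score and return the top_k as tags."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

    # Build an undirected co-occurrence graph: each word is linked to the
    # next `window` words that follow it.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Iterate the PageRank update on the unweighted graph.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return ranked[:top_k]
```

In this sketch, a word that co-occurs with many distinct words accumulates the highest score and is selected as a tag first.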

4. The method according to claim 1, wherein the multimedia data to be processed comprises a picture to be processed, and the step of obtaining the attribute information of the multimedia data to be processed comprises:

extracting the noise value, the fuzziness and the exposure of the picture to be processed, calculating the picture to be processed according to a preset formula, and generating a scoring result;

judging whether the scoring result is smaller than a preset score threshold value or not;

and if the scoring result is not smaller than the preset score threshold value, extracting the label of the picture to be processed to form the label list.
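By way of non-limiting illustration only, the picture scoring gate may be sketched as follows. The claim refers only to a "preset formula" and a "preset score threshold"; the weighted-penalty formula, the weights, the threshold of 60 and the assumption that all three measurements are normalized to [0, 1] (with 0.5 taken as ideal exposure) are hypothetical choices made for this sketch.

```python
# Hypothetical weights and threshold; the claim does not specify them.
WEIGHTS = {"noise": 0.4, "blur": 0.4, "exposure": 0.2}
SCORE_THRESHOLD = 60.0


def picture_score(noise, blur, exposure):
    """Return a score in [0, 100] from normalized measurements.

    Assumptions: noise and blur lie in [0, 1] with 0 being best, and
    exposure lies in [0, 1] with 0.5 being ideal.
    """
    exposure_error = abs(exposure - 0.5) * 2.0
    penalty = (
        WEIGHTS["noise"] * noise
        + WEIGHTS["blur"] * blur
        + WEIGHTS["exposure"] * exposure_error
    )
    return 100.0 * (1.0 - penalty)


def tags_if_acceptable(noise, blur, exposure, extract_tags):
    """Call extract_tags only when the score is not smaller than the threshold."""
    if picture_score(noise, blur, exposure) < SCORE_THRESHOLD:
        return None  # picture quality too low; no tag list is formed
    return extract_tags()
```

The callable `extract_tags` stands in for whatever picture tagging step is used; it is passed in so that the tagging work is skipped for pictures that fail the gate.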

5. The method according to claim 1, wherein the multimedia data to be processed comprises audio to be processed, and the step of obtaining attribute information of the multimedia data to be processed comprises:

acquiring a plurality of preset labels corresponding to the audio to be processed;

and clustering the plurality of preset labels to obtain clustered labels so as to form the label list.

6. The method as claimed in any one of claims 1 to 5, wherein the step of processing the multimedia data to be processed according to the historical multimedia data information corresponding to the user information, the plurality of tag lists, the popularity and the novelty of the multimedia data to be processed comprises:

acquiring the historical multimedia data information corresponding to the user information;

taking the reserved multimedia data as a training set, and inputting the historical multimedia data information, the plurality of label lists, the popularity and the novelty as characteristic values into a logistic regression model for training to obtain a training result;

judging whether the training result meets a preset condition or not;

and if the training result meets a preset condition, reserving the multimedia data to be processed.
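By way of non-limiting illustration only, the logistic regression step may be sketched with plain batch gradient descent. The numeric encoding of the four feature values (historical information, tag list consistency, popularity, novelty), the presence of negative examples alongside the reserved data, the learning rate and the 0.5 decision threshold are assumptions made for this sketch.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def should_retain(model, features, threshold=0.5):
    """Retain the item when the predicted probability meets the threshold."""
    weights, bias = model
    probability = sigmoid(bias + sum(w * xi for w, xi in zip(weights, features)))
    return probability >= threshold


# Hypothetical rows: [history score, tag consistency, popularity, novelty].
TRAINING_ROWS = [
    [0.9, 1.0, 0.8, 0.7], [0.8, 1.0, 0.9, 0.6], [0.7, 1.0, 0.7, 0.9],
    [0.1, 0.0, 0.2, 0.1], [0.2, 0.0, 0.1, 0.3], [0.3, 0.0, 0.2, 0.2],
]
TRAINING_LABELS = [1, 1, 1, 0, 0, 0]  # 1 = previously retained, 0 = not retained
```

A model trained with `train_logistic_regression(TRAINING_ROWS, TRAINING_LABELS)` can then be passed to `should_retain` together with the feature values of the multimedia data to be processed; the "preset condition" is modeled here as the probability threshold.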

7. The method according to any of claims 1-5, wherein the step of generating the novelty of the multimedia data to be processed according to the multimedia identifier comprises:

acquiring a multimedia identifier of the reserved multimedia data;

8. A multimedia data processing apparatus, comprising:

9. The apparatus according to claim 8, wherein the determining module comprises:

the mapping submodule is used for mapping the labels in the label list to corresponding main words in a preset main word library based on a preset mapping relation so that each label list forms a corresponding main word list;

the judging submodule is used for judging whether any two main word lists have the same main word or not;

a first determining sub-module, configured to determine that the plurality of tag lists are consistent when any two of the main word lists have the same main word.

10. The apparatus of claim 8, wherein the multimedia data to be processed comprises text to be processed, and the receiving module comprises:

the first extraction submodule is used for extracting nominal phrases from the text to be processed within a preset word number range as tags based on a preset phrase template so as to form a tag list; and/or

and the second extraction submodule is used for extracting the tags from the texts to be processed which exceed the preset word number range based on the TextRank algorithm so as to form the tag list.

11. The apparatus of claim 8, wherein the multimedia data to be processed comprises a picture to be processed, and the receiving module comprises:

the scoring submodule is used for extracting the noise value, the fuzziness and the exposure of the picture to be processed, calculating the picture to be processed according to a preset formula and generating a scoring result;

the first judgment submodule is used for judging whether the scoring result is smaller than a preset score threshold value or not;

and the extracting submodule is used for extracting the labels of the picture to be processed to form the label list when the scoring result is not smaller than the preset score threshold value.

12. The apparatus of claim 8, wherein the multimedia data to be processed comprises audio to be processed, and the receiving module comprises:

the acquisition submodule is used for acquiring a plurality of preset labels corresponding to the audio to be processed;

and the clustering submodule is used for clustering the plurality of preset labels to obtain clustered labels so as to form the label list.

13. The multimedia data processing apparatus according to any of claims 8-12, wherein the processing module comprises:

the information acquisition submodule is used for acquiring the historical multimedia data information corresponding to the user information;

the training submodule is used for inputting the historical multimedia data information, the label lists, the popularity and the novelty as characteristic values into a logistic regression model for training by taking the reserved multimedia data as a training set to obtain a training result;

the second judgment submodule is used for judging whether the training result meets a preset condition or not;

and the reserving submodule is used for reserving the multimedia data to be processed when the training result meets the preset condition.

14. The apparatus according to any of claims 8-12, wherein the determining means comprises:

the identification acquisition submodule is used for acquiring a multimedia identification of the reserved multimedia data;

and the second determining submodule is used for determining the novelty of the multimedia data to be processed according to the multimedia identifier of the reserved multimedia data and the multimedia identifier of the multimedia data to be processed.
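By way of non-limiting illustration only, determining novelty from multimedia identifiers may be sketched as the share of identifiers in the data to be processed that do not appear in the reserved data. This particular ratio, the function name and the treatment of an empty input are assumptions made for this sketch; the claims state only that novelty is determined from the two sets of identifiers.

```python
def novelty(candidate_ids, reserved_ids):
    """Return the fraction of candidate identifiers not seen in reserved data.

    Assumption: novelty is a value in [0, 1], where 1.0 means no overlap
    with previously reserved multimedia data.
    """
    candidate = set(candidate_ids)
    if not candidate:
        return 0.0  # nothing to compare; treated as not novel in this sketch
    unseen = candidate - set(reserved_ids)
    return len(unseen) / len(candidate)
```

For instance, a submitted song list with identifiers `s1` to `s4`, of which `s1` and `s2` already occur in reserved song lists, would receive a novelty of 0.5 under this definition.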

15. A storage medium having stored therein processor-executable instructions, the processor providing the multimedia data processing method of any one of claims 1-7 by executing the instructions.