Disclosure of Invention
To address the shortcomings of the prior art, the technical problem to be solved by the invention is to provide a user personalized recommendation method based on multimodal data. The method learns user preferences and forms a user portrait from the historical data that the user generates in various fields (such as news, video, and music), so as to recommend multimodal data and broaden the diversity of the recommendation results.
To solve the above technical problem, the invention adopts the following technical solution:
A user personalized recommendation method based on multimodal data comprises the following steps:
Step 1, acquire a total object set I and an object set L0 to be recommended, and specify the multimodal information types to be handled.
Step 2, for each modality to be handled, apply a corresponding feature extraction algorithm, extract features from each object in the total object set, and map the features into a single shared multidimensional space S.
Step 3, acquire and accumulate the user's historical behavior record and impression log. The historical behavior record is the user's click history before the impressions, namely a list of clicked objects. The impression log is the list of objects displayed to the user together with the user's click behavior on those objects, where 1 represents a click and 0 represents no click.
Step 4, initialize the recommendation system, input the multidimensional space S obtained in step 2 and the user historical behavior record and impression log obtained in step 3 into the recommendation system, and set the agent of the recommendation system and the reinforcement learning environment parameters.
Step 5, train the recommendation system agent.
Step 6, process the object set L0 to be recommended with the trained recommendation system agent, so that the recommendation system simulates the user's interaction behavior when facing L0 and generates a corresponding impression log D.
Step 7, extract the set of objects that the user interacted with in the impression log D, that is, the personalized interaction object set predicted by the recommendation system agent for that user, as the total recommendation list L.
Step 8, process the total recommendation list L according to different requirements to generate a multimodal recommendation list.
Step 9, recommend the generated multimodal recommendation list and obtain the actual user interaction results, thereby generating a new impression log. Each time the log accumulates to a given size n, the agent is further trained and updated.
The user personalized recommendation method based on multimodal data provided by the invention has the following advantages. After the historical behavior records and impression logs that the user has permitted to be collected are obtained, features are extracted from all the objects involved and from each object to be recommended and mapped into the same multidimensional space, which integrates the multimodal data. A reinforcement learning model serves as the agent of the recommendation system, the agent is trained on the collected user records, and the trained agent is then used for recommendation, achieving user personalized recommendation of multimodal data. Compared with the prior art, the user historical behavior records, impression logs, and object sets to be recommended in this method can contain objects from multiple fields (such as text, pictures, and video). Integrating the extracted features blurs the distinction between modalities, which overcomes the limitation that a traditional recommendation system can only be applied to recommendation within a single field.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
As shown in Fig. 1, the user personalized recommendation method based on multimodal data in this embodiment is as follows.
Step 1, acquire a total object set I and an object set L0 to be recommended, and specify the multimodal information types to be handled.
Step 2, for each modality to be handled, apply a corresponding feature extraction algorithm, extract features from each object in the total object set, and map the features into a single shared multidimensional space S. For example, image features can be extracted with a convolutional neural network (CNN) or a recurrent neural network (RNN); text features with the tf-idf algorithm or a CNN; video features with a Pseudo-3D Residual Network (P3D); and audio features with the Discrete Wavelet Transform (DWT) or the Perceptual Linear Prediction (PLP) algorithm.
Step 3, acquire and accumulate the user's historical behavior record and impression log. The historical behavior record is the user's click history before the impressions, namely a list of clicked objects. The impression log is the list of objects displayed to the user together with the user's click behavior on those objects, where 1 represents a click and 0 represents no click.
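The two kinds of records described in step 3 can be pictured with a small data structure. The sketch below is illustrative only: the class names `Impression` and `UserRecord`, their fields, and the sample object ids are hypothetical and do not come from the method itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Impression:
    """One log entry: the objects shown to the user and the 0/1 click labels."""
    shown: List[str]    # ids of the objects displayed to the user
    clicks: List[int]   # 1 = clicked, 0 = not clicked, aligned with `shown`

    def clicked_objects(self) -> List[str]:
        # Keep only the displayed objects whose label is 1.
        return [obj for obj, c in zip(self.shown, self.clicks) if c == 1]


@dataclass
class UserRecord:
    """Hypothetical container for one user's accumulated data."""
    user_id: str
    history: List[str] = field(default_factory=list)        # objects clicked earlier
    impressions: List[Impression] = field(default_factory=list)


# Made-up example: the history and the displayed objects span several fields.
record = UserRecord("u1", history=["news_12", "video_7"])
record.impressions.append(
    Impression(shown=["news_40", "music_3", "video_9"], clicks=[1, 0, 1])
)
clicked = record.impressions[0].clicked_objects()
```

Keeping the click labels aligned with the displayed list mirrors the description above, where each displayed object carries a 1 or a 0.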
Step 4, initialize the recommendation system, input the multidimensional space S obtained in step 2 and the user historical behavior record and impression log obtained in step 3 into the recommendation system, and set the agent of the recommendation system and the reinforcement learning environment parameters.
Step 5, train the recommendation system agent. As shown in Fig. 2, the process is as follows:
Step 5.1, the agent requests a user's records from the user historical behavior records and impression logs obtained in step 3.
Step 5.2, the memory returns the records, and each object in them is replaced with its feature representation from step 2.
Step 5.3, the agent forms a user portrait from the user's historical behavior record and then processes the impression log entries in order according to that portrait. If the agent's action is consistent with the real log, the agent receives a reward; if it is inconsistent, the agent receives no reward or a penalty.
Step 5.4, after all entries in the user's impression log have been processed, one reinforcement learning episode is complete. If the training termination condition in the reinforcement learning environment parameters set in step 4 is met, training of the agent ends and the method proceeds to step 6; otherwise, it returns to step 5.1.
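The loop in steps 5.1 to 5.4 can be sketched as follows. This is a minimal illustration under loud assumptions: `ToyAgent`, `build_portrait`, the "category_number" object ids, and the accuracy-based stopping rule are all hypothetical stand-ins, and the toy agent is a simple score table, not a real reinforcement learning model.

```python
def build_portrait(history):
    """Hypothetical user portrait: how often each category appears in the history."""
    portrait = {}
    for obj in history:
        category = obj.split("_")[0]
        portrait[category] = portrait.get(category, 0) + 1
    return portrait


class ToyAgent:
    """Stand-in agent that keeps one preference score per category."""

    def __init__(self):
        self.weights = {}

    def act(self, portrait, obj):
        category = obj.split("_")[0]
        score = portrait.get(category, 0) + self.weights.get(category, 0.0)
        return 1 if score > 0 else 0          # 1 = predict click, 0 = predict no click

    def learn(self, obj, clicked):
        category = obj.split("_")[0]
        step = 1.0 if clicked == 1 else -1.0  # nudge towards the logged behaviour
        self.weights[category] = self.weights.get(category, 0.0) + step


def run_episode(agent, record):
    portrait = build_portrait(record["history"])      # step 5.3: form the portrait
    total_reward = 0.0
    for obj, clicked in record["log"]:                # replay the log in order
        if agent.act(portrait, obj) == clicked:
            total_reward += 1.0                       # consistent with the real log
        else:
            agent.learn(obj, clicked)                 # no reward; adjust the agent
    return total_reward


def train(agent, records, max_rounds=50, target=1.0):
    for round_no in range(1, max_rounds + 1):
        reward = sum(run_episode(agent, r) for r in records)   # steps 5.1 to 5.3
        entries = sum(len(r["log"]) for r in records)
        if entries and reward / entries >= target:             # step 5.4: stop check
            return round_no
    return max_rounds


records = [{"history": ["news_1", "news_2"],
            "log": [("news_3", 1), ("video_1", 0), ("music_1", 1)]}]
agent = ToyAgent()
rounds_needed = train(agent, records)
```

In this made-up example the agent misses the music click in the first round, adjusts, and matches every log entry in the second round, at which point the hypothetical termination condition is met.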
Step 6, process the object set L0 to be recommended with the trained recommendation system agent, so that the recommendation system simulates the user's interaction behavior when facing L0 and generates a corresponding impression log D.
Step 7, extract the set of objects that the user interacted with in the impression log D, that is, the personalized interaction object set predicted by the recommendation system agent for that user, as the total recommendation list L.
Step 8, process the total recommendation list L according to different requirements to generate a multimodal recommendation list:
When the user's broad interests across multiple fields need to be mined, or the modalities of the recommendation results need to be diversified, the total recommendation list L can simply be sorted and output according to the requirements of the specific scenario, generating a multimodal recommendation list L1.
When recommendations in other fields need to be made to the user based on a given object (for example, recommending text hyperlinks for news and movies), the objects of other modalities closest to the current object can be selected from the multidimensional space S by the nearest-neighbor principle, generating a multimodal recommendation list L2.
When the user needs more accurate and effective recommendations that combine knowledge representations at different levels, the features of the different modalities are fused with a multimodal fusion algorithm and the agent is trained on the fused features, generating a recommendation list L3 that carries multimodal representations of the same object, such as a news headline.
Step 9, recommend the generated multimodal recommendation list and obtain the actual user interaction results, thereby generating a new impression log. Each time the log accumulates to a given size n, the agent is further trained and updated.
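The update rule in step 9 (retrain whenever the accumulated log reaches n) amounts to a simple buffer with a trigger. The sketch below assumes a hypothetical `retrain` callback and a hypothetical class name `OnlineUpdater`; how the agent is actually retrained is outside its scope.

```python
class OnlineUpdater:
    """Collect new log entries and trigger retraining every n entries."""

    def __init__(self, retrain, n):
        self.retrain = retrain   # hypothetical callback that further trains the agent
        self.n = n               # number of entries that triggers an update
        self.buffer = []

    def record(self, entry):
        """Store one new log entry; return True if retraining was triggered."""
        self.buffer.append(entry)
        if len(self.buffer) >= self.n:
            batch, self.buffer = self.buffer, []   # hand over the batch, start afresh
            self.retrain(batch)
            return True
        return False


batches = []                                       # stands in for real retraining
updater = OnlineUpdater(retrain=batches.append, n=3)
triggered = [updater.record(i) for i in range(7)]
```

With n = 3 and seven incoming entries, retraining fires after the third and sixth entries and the seventh waits in the buffer.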
The method of the invention is executed step by step on the Microsoft News Dataset (MIND):
Step 1, because the MIND dataset contains only news text data and does not meet the requirements of a multimodal task, the corresponding pictures in each news article are crawled with the spiderFlow crawler tool (https://github.com/chenyuansgit/spiderFlow), following the news links provided in the MIND dataset, to form a picture dataset corresponding to the news text. The total object set I is therefore all news texts and news pictures; the object set L0 to be recommended is the news texts in the impression logs of the test set together with their corresponding news pictures; and the multimodal information types to be handled are text and image.
Step 2, for the two modalities handled, text and image, the corresponding feature extraction algorithms are applied. For news text, basic text features are extracted with tf-idf term weighting, and feature selection with the chi-square test then reduces the dimensionality of the basic text features, which improves the model's generalization ability, reduces overfitting, and makes the relationship between features and feature values easier to interpret. News picture features are extracted with the pre-trained VGG16 neural network in the Keras library. Finally, the features extracted for each object in the total object set are mapped into the same multidimensional space S with the UMAP algorithm (https://arxiv.org/abs/1802.03426).
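To make the tf-idf term weighting in step 2 concrete, the following sketch computes the textbook, unsmoothed form tf(t, d) · log(N / df(t)) with the standard library only. It is an illustration under assumptions: the function name `tf_idf`, the whitespace tokenizer, and the sample headlines are hypothetical, library implementations such as scikit-learn use a smoothed variant by default, and the chi-square selection and UMAP mapping are not shown.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (vocabulary, one tf-idf vector per document)."""
    tokenized = [doc.lower().split() for doc in docs]   # naive whitespace tokenizer
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(df)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append([
            (counts[term] / total) * math.log(n_docs / df[term]) if total else 0.0
            for term in vocab
        ])
    return vocab, vectors


# Made-up headlines, used only to show the weighting.
headlines = ["stock market falls", "stock market rises", "team wins final"]
vocab, vectors = tf_idf(headlines)
```

In the first headline, "falls" appears in one document while "stock" appears in two, so "falls" receives the larger weight; this is the property that makes rare, distinctive terms stand out in the text features.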
Step 3, read the accumulated user historical behavior records and impression logs from the dataset.
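Reading the records in step 3 might look like the sketch below. It assumes the five-column, tab-separated layout of the public MIND `behaviors.tsv` file (impression id, user id, time, click history, and displayed news ids with a `-1` or `-0` click suffix); that layout and the function name `parse_behavior_line` are assumptions of this example, and the sample line is made up.

```python
def parse_behavior_line(line):
    """Split one behaviors.tsv line into history and labelled impressions."""
    # Assumed columns: impression id, user id, time, history, impressions.
    impression_id, user_id, time, history, impressions = line.rstrip("\n").split("\t")

    shown, clicks = [], []
    for item in impressions.split():
        news_id, label = item.rsplit("-", 1)   # e.g. "N30-1" -> ("N30", "1")
        shown.append(news_id)
        clicks.append(int(label))              # 1 = clicked, 0 = not clicked

    return {
        "user_id": user_id,
        "history": history.split(),            # empty history yields an empty list
        "shown": shown,
        "clicks": clicks,
    }


# Made-up line in the assumed layout.
sample = "1\tU1\t11/11/2019 9:05:58 AM\tN10 N20\tN30-1 N40-0\n"
parsed = parse_behavior_line(sample)
```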
Step 4, initialize the recommendation system, input the multidimensional space S obtained in step 2 and the impression logs obtained in step 3 into the recommendation system, and set the agent of the recommendation system and the reinforcement learning environment parameters (taking the Rainbow model as an example). The action space is the integers 0 to 10, and the action divided by 10 is the predicted probability that the object will be clicked. The state space is a multidimensional matrix reflecting the features of the user portrait and of the object currently being evaluated. The reward is determined by the AUC computed from the prediction results against the impression log.
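The action encoding and the AUC-based reward described in step 4 can be illustrated with a few lines of standard-library code. This is a sketch only: the function names and the sample actions are hypothetical, and AUC is computed here by direct pairwise comparison, which is fine for small logs but not how a production system would necessarily do it.

```python
def action_to_probability(action):
    """Map an integer action in 0..10 to a predicted click probability."""
    if not 0 <= action <= 10:
        raise ValueError("action must be an integer between 0 and 10")
    return action / 10


def auc(labels, scores):
    """Probability that a clicked object is scored above a non-clicked one.

    Ties count as half. Returns None when the log has only one class,
    because AUC is undefined in that case.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        return None
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up actions for four displayed objects and their logged click labels.
actions = [9, 2, 6, 6]
labels = [1, 0, 1, 0]
probabilities = [action_to_probability(a) for a in actions]
reward = auc(labels, probabilities)
```

Here three of the four clicked/non-clicked pairs are ranked correctly and one is tied, giving an AUC of 3.5 / 4 = 0.875 as the reward signal.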
Step 5, train the recommendation system agent. As shown in Fig. 2, the process is as follows:
Step 5.1, the agent requests a user's records from the collection obtained in step 3;
Step 5.2, the memory returns the records, and each object in them is replaced with its feature representation from step 2;
Step 5.3, the agent uses the Mean-Shift algorithm to cluster the user's history and separate out the user's preference features, forming a user portrait. Based on this portrait, it predicts the user's click probability for each object in the impression log in turn, and the agent is rewarded according to the AUC computed from the predictions and the real records;
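The clustering in step 5.3 can be illustrated with a small flat-kernel mean shift. This is a sketch under assumptions: the function name, the bandwidth value, the merge rule, and the two-dimensional sample vectors are hypothetical, and a library implementation (for example `sklearn.cluster.MeanShift`) would normally be used instead. The resulting cluster centres stand in for the user's separate preference features.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift; returns one centre per discovered cluster."""

    def shift(p):
        # Move p to the mean of all points within `bandwidth` of it.
        neighbours = [q for q in points if math.dist(p, q) <= bandwidth]
        return tuple(sum(axis) / len(neighbours) for axis in zip(*neighbours))

    centres = []
    for p in points:
        current = tuple(p)
        for _ in range(max_iter):
            nxt = shift(current)
            if math.dist(current, nxt) < tol:   # converged to a density peak
                break
            current = nxt
        # Merge peaks that landed (almost) on an already known centre.
        if not any(math.dist(current, c) < bandwidth / 2 for c in centres):
            centres.append(current)
    return centres


# Made-up feature vectors of objects a user clicked: two interest groups.
clicked_vectors = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.0)]
portrait = mean_shift(clicked_vectors, bandwidth=1.0)
```

Unlike k-means, mean shift does not need the number of clusters in advance, which suits a user whose number of distinct interests is unknown.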
Step 5.4, after all entries in the user's impression log have been processed, one reinforcement learning episode is complete. If the training termination condition in the reinforcement learning environment parameters set in step 4 is met, training of the agent ends; otherwise, the process returns to step 5.1.
Step 6, process the impression logs in the test set with the trained recommendation system agent, predict the user's click probability for each object based on the user historical behavior records in the test set, compute the AUC, and record the prediction results D.
Step 7, take the objects with high click probability in the prediction results D as the total recommendation list L.
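Step 7 can be sketched as a filter and a sort over the predicted probabilities. The cut-off of 0.5, the function name `build_total_list`, and the sample predictions are assumptions of this example; the text above does not state what counts as a high click probability.

```python
def build_total_list(predictions, threshold=0.5):
    """Keep objects whose predicted click probability reaches the threshold,
    ordered from most to least likely to be clicked."""
    kept = [(prob, obj) for obj, prob in predictions.items() if prob >= threshold]
    kept.sort(reverse=True)                 # highest probability first
    return [obj for _, obj in kept]


# Made-up prediction results D: object id -> predicted click probability.
prediction_d = {"N1": 0.9, "N2": 0.3, "IMG1": 0.7, "N3": 0.5}
total_list = build_total_list(prediction_d)
```

Because the list is already ordered by probability, it can be output directly when a simply sorted list is all that the scenario requires.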
Step 8, process the total recommendation list L according to different requirements to generate a multimodal recommendation list. The processing for each requirement is as follows:
When the user's broad interests across multiple fields need to be mined, or the modalities of the recommendation results need to be diversified, the total recommendation list L can simply be sorted and output according to the requirements of the specific scenario, generating a multimodal recommendation list L1.
When recommendations in other fields need to be made to the user based on a given object (for example, matching pictures to news, or recommending news hyperlinks for pictures), the objects of other modalities closest to the current object can be selected from the multidimensional space S by the nearest-neighbor principle, generating a multimodal recommendation list L2.
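The nearest-neighbor selection across modalities can be sketched as follows. The dictionary layout of the space, the function name, the Euclidean distance, and the two-dimensional sample coordinates are all assumptions made for illustration; any distance that is meaningful in the shared space S could be substituted.

```python
import math


def nearest_other_modality(query_id, space, k=1):
    """Return the k objects closest to `query_id` that have a different modality.

    `space` maps object id -> (modality, feature vector in the shared space).
    """
    query_modality, query_vector = space[query_id]
    candidates = [
        (math.dist(query_vector, vector), obj_id)
        for obj_id, (modality, vector) in space.items()
        if modality != query_modality          # skip objects of the same modality
    ]
    return [obj_id for _, obj_id in sorted(candidates)[:k]]


# Made-up shared space holding two news texts and two pictures.
space_s = {
    "news_1": ("text", (0.10, 0.20)),
    "news_2": ("text", (0.10, 0.21)),
    "img_1": ("image", (0.00, 0.25)),
    "img_2": ("image", (0.90, 0.80)),
}
matched_picture = nearest_other_modality("news_1", space_s)
```

Although `news_2` is the closest object overall, it is skipped because it shares the query's modality, so the nearest picture is returned instead.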
Step 9, recommend the generated multimodal recommendation list and obtain the actual user interaction results, thereby generating a new impression log. Each time the log accumulates to a given size n, the agent may be further trained and updated.
Through the above steps, the experimental results obtained are shown in Table 1 (with different recommendation models used in step 5).
Table 1. Experimental results of the various recommendation models
With the reinforcement learning model of the invention used as the recommendation system agent, a test was carried out on the Microsoft News Dataset (MIND). The results show that the reinforcement learning agent achieves performance metrics similar to those of traditional recommendation models, that is, it is adequate for the recommendation task. The MIND dataset was then extended with crawled picture data into a mixed text-picture dataset and tested further. The test results show that the metrics of the invention meet the requirements of user personalized recommendation for multimodal data and of a recommendation system.
It should be noted that the above embodiments illustrate rather than limit the invention. Those skilled in the art will understand that they can modify the technical solutions described in the above embodiments, or make equivalent substitutions for some or all of the technical features, without causing the essence of the corresponding technical solutions to depart from the scope of the invention defined by the claims.