Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In the description of the present invention, it should be understood that the terms "longitudinal," "transverse," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on those shown in the drawings, merely to facilitate and simplify the description of the present invention. They do not indicate or imply that the devices or elements referred to must have a specific orientation or be configured and operated in a specific orientation, and therefore should not be construed as limiting the present invention.
In the description of the present invention, unless otherwise specified and defined, the terms "mounted," "connected," and "coupled" are to be construed broadly: the connection may be, for example, mechanical or electrical, may be internal communication between two elements, and may be direct or indirect through an intermediary. Those skilled in the art can understand the specific meaning of these terms according to the specific circumstances.
Through extensive experimental analysis, the inventors of the present application have found the following:
For consumable goods such as paper towels, shampoo, and body wash, once a user has interacted with (purchased) a commodity, the user's desire to buy it again stays lower than at the time of purchase for as long as the commodity has not been used up, because a user who is not short of the commodity has less demand than one who urgently needs it. Once the commodity has been used up, the user's purchase demand peaks again. In other words, the user's preference for a commodity class peaks around the time of an interaction (an interaction being a purchase and a rating), decays toward a minimum after the interaction is completed, and then rises back to the peak as the commodity is consumed. The user's preference for the commodity therefore changes periodically.
For entertainment items such as music, novels, and movies, an individual's preference also changes periodically. For example, people tend to listen to instrumental music or other songs that improve learning efficiency while studying, and may prefer melancholy songs when feeling lonely late at night. Moreover, people's lives follow cycles: the working week runs from Monday to Friday and daily working and off-duty hours are relatively fixed, so an individual's routine is a repeating cycle and behavioral preferences are cyclical to a certain extent as well. Besides reflecting this life cycle, such items also behave to some extent like consumable goods. For example, user A may watch movies frequently for a certain period and then grow tired of them, so that the demand for watching movies decays until it next arises and peaks again. This type of commodity is therefore periodic as well.
It can be seen that a user's demands and preferences for goods change more or less periodically: as a social individual, a person's behavior is shaped by fixed life arrangements and by an individual biological clock, and is itself a periodically changing process. The inventors have found that taking the periodic variation of a user's preferences for goods into account helps improve the timeliness and rationality of commodity recommendations. Based on this finding, the invention provides a recommendation method, a recommendation device, and a recommendation system based on an adaptive user interest change period.
In a preferred embodiment of the present invention, a recommendation method based on an adaptive user interest change period is disclosed. As shown in FIG. 1, the method includes:
Step S1: historical interaction data between users and commodities is obtained.
The commodities preferably include, but are not limited to, physical goods, virtual goods, and service goods. Physical goods preferably include, but are not limited to, daily necessities (such as toothpaste, soap, shampoo, and paper towels), appliances (such as air conditioners, washing machines, and refrigerators), foods (such as rice, biscuits, and various snacks), and fruits. Virtual goods preferably include, but are not limited to, music, electronic novels, and movies. Service goods preferably include, but are not limited to, home appliance maintenance and housekeeping services. The historical interaction data preferably includes, but is not limited to, the purchase records, scoring times, purchase times, and commodity labels of multiple users for different commodities.
The commodity labels are commodity category labels. In general, commodities are classified into categories such as washing and care products, seasonings, grains, and vegetables, while music is classified into light music, rock music, and so on. Each commodity carries the label of the category to which it belongs, so commodities with the same label are grouped into one class. The set of commodity classes is denoted $C$ and the class index is denoted $c$. The scoring record matrix $R$ of the target user is then split out and computed from a scoring database, and the highest score $r_{uc}^{\max}$ and the lowest score $r_{uc}^{\min}$ given by the target user $u$ to commodities of class $c$ are counted, giving $[r_{uc}^{\min}, r_{uc}^{\max}]$ as the user's scoring domain for commodity class $c$. That is, the scoring domain for a class is obtained from the lowest and highest scores the target user has given to commodities of that class in the historical data. If the target user has no interaction record (that is, no score) for class $c$, class $c$ uses the default scoring range. The scoring domain is set in order to make full use of the historical interaction information and to keep the target user's scores for certain commodities within a reasonable range.
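The derivation of the per-category scoring domain can be sketched in Python as follows. This is only an illustrative sketch: the names `scoring_domains`, `domain_for`, and `DEFAULT_DOMAIN` are hypothetical, and the default range of 1 to 5 is an assumption, since the description does not specify the default scoring range.

```python
# Assumed default scoring range; the description does not give a concrete value.
DEFAULT_DOMAIN = (1.0, 5.0)


def scoring_domains(records):
    """Build {category: (lowest score, highest score)} for one target user.

    records: iterable of (category, score) pairs taken from the user's history.
    """
    domains = {}
    for category, score in records:
        lo, hi = domains.get(category, (score, score))
        domains[category] = (min(lo, score), max(hi, score))
    return domains


def domain_for(domains, category, default=DEFAULT_DOMAIN):
    """Return the user's scoring domain for a category, or the default range
    when the user has no score record for that category."""
    return domains.get(category, default)
```

For example, a user who scored shampoo 3.0 and 4.5 has the scoring domain (3.0, 4.5) for that category, while a category the user never scored falls back to the default range.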
Step S2: the base score of the target user for each commodity is determined based on the historical interaction data. Preferably, step S2 includes:
Step S21: based on the historical interaction data, the similarity between the target user and each of the other users contained in the historical interaction data is calculated, and a neighbor user set of the target user is constructed based on the similarity.
Specifically, users who have scored commodities in common with the target user are found in the historical interaction data, and the similarity between the target user and each of these users is calculated. The similarity measure is preferably, but not limited to, cosine similarity or the Pearson correlation coefficient. For example, the similarity $\mathrm{sim}(u, v)$ of users $u$ and $v$ computed with the Pearson correlation coefficient is:
$$\mathrm{sim}(u, v) = \frac{\sum_{j' \in P_{uv}} (r_{uj'} - \bar{r}_u)(r_{vj'} - \bar{r}_v)}{\sqrt{\sum_{j' \in P_{uv}} (r_{uj'} - \bar{r}_u)^2}\,\sqrt{\sum_{j' \in P_{uv}} (r_{vj'} - \bar{r}_v)^2}}$$
where $P_{uv}$ is the set of commodities scored by both the target user $u$ and user $v$, $\bar{r}_u$ and $\bar{r}_v$ are the average scores of users $u$ and $v$, respectively, $r_{uj'}$ is the score of the target user $u$ for the $j'$-th commodity, and $r_{vj'}$ is the score of user $v$ for the $j'$-th commodity.
Since users with higher similarity values are likely to have more similar preferences, the $k$ most similar users are selected as the neighbor user set $NBS_u$ of the target user $u$.
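A minimal Python sketch of step S21 using the Pearson correlation coefficient is given below. The function names `pearson_sim` and `neighbor_set` and the dictionary layout of the scores are hypothetical. The sketch also assumes that each user's average score is taken over all commodities that user has scored, and that users with no commonly scored commodities, or with zero variance on them, get similarity 0.

```python
import math


def pearson_sim(ratings_u, ratings_v):
    """Pearson similarity of two users; ratings_* map commodity id -> score."""
    common = set(ratings_u) & set(ratings_v)  # commonly scored commodities
    if not common:
        return 0.0
    # Assumption: the average is taken over all of each user's scores.
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[j] - mean_u) * (ratings_v[j] - mean_v) for j in common)
    den_u = math.sqrt(sum((ratings_u[j] - mean_u) ** 2 for j in common))
    den_v = math.sqrt(sum((ratings_v[j] - mean_v) ** 2 for j in common))
    if den_u == 0.0 or den_v == 0.0:
        return 0.0
    return num / (den_u * den_v)


def neighbor_set(target, all_ratings, k):
    """Return the ids of the k users most similar to the target user."""
    sims = {
        v: pearson_sim(all_ratings[target], ratings)
        for v, ratings in all_ratings.items()
        if v != target
    }
    return sorted(sims, key=sims.get, reverse=True)[:k]
```

Cosine similarity could be substituted for `pearson_sim` without changing `neighbor_set`.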
Step S22: the base score of the target user for each commodity is obtained from the scores given to that commodity by the neighbor users in the neighbor user set. Specifically, the base score $P_{uj}$ of user $u$ for the $j$-th commodity is computed from the neighbor users' scores, where $r_{vj}$ represents the score of neighbor user $v$ for the $j$-th commodity.
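One common way to aggregate neighbor scores in user-based collaborative filtering is a similarity-weighted, mean-centered average. The Python sketch below uses that form purely as an assumption; the exact aggregation used for the base score in this embodiment may differ. The function name `base_score` and the tuple layout of `neighbors` are hypothetical.

```python
def base_score(mean_u, neighbors, j):
    """Assumed aggregation: mean-centered, similarity-weighted average.

    mean_u:    average score of the target user
    neighbors: list of (similarity, neighbor average score, neighbor scores dict)
    j:         commodity id
    """
    num = 0.0
    den = 0.0
    for sim, mean_v, ratings_v in neighbors:
        if j in ratings_v:  # only neighbors who scored commodity j contribute
            num += sim * (ratings_v[j] - mean_v)
            den += abs(sim)
    # Fall back to the target user's own average when no neighbor scored j.
    return mean_u if den == 0.0 else mean_u + num / den
```

A plain similarity-weighted average of the neighbor scores, without mean-centering, would be an equally plausible reading.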
Step S3, determining the predictive scores of the target user on all commodities, wherein the process of determining the predictive score of the target user on the j-th commodity is as follows:
Step S31: a scoring sequence of the target user over a preset time length for the category of the j-th commodity is extracted from the historical interaction data. The preset time length can be set according to users' needs for different types of commodities, for example one month or several months. The scoring sequence has at least two dimensions: the score and the time at which the score was given.
Step S32, extracting periodic features of the scoring sequence, including:
A Fourier transform is applied to the scoring sequence to obtain frequency domain data, and periodic features are extracted from the frequency domain data. The periodic features comprise the amplitude values of the N frequency points with the largest amplitudes in the frequency domain data and the periodic basis functions of those N frequency points, where N is a positive integer greater than or equal to 2 and preferably N is 3. The periodic basis function is a sine function or a cosine function. The Fourier transform is preferably a fast Fourier transform. The frequency points are sorted by amplitude from high to low, and the amplitude values and periodic basis functions of the first N frequency points are selected.
Based on the historical interaction data, a scoring record matrix $R$ of the target user is obtained; $R$ is a three-dimensional matrix comprising commodity category, score, and scoring time. A scoring sequence of the target user $u$ for commodity category $c$ over the preset time length (denoted $T$) is extracted from $R$. The scoring sequence is converted into the frequency domain by fast Fourier transform (FFT) to obtain the frequency domain data, on which time series analysis is performed.
Here $\mathrm{FFT}(\cdot)$ denotes the FFT operation, $\mathrm{Amp}(\cdot)$ denotes the amplitude calculation, $A_i$ denotes the amplitude value of the periodic basis function of the $i$-th frequency point, $p_i$ denotes the corresponding period length, and $f_i$ denotes the frequency of the $i$-th frequency point. Considering the sparsity of the frequency domain, and to avoid the influence of meaningless high-frequency noise, the present application preferably selects the amplitude values and periodic basis functions of the first N frequency points, sorted by amplitude from high to low, as the periodic features of the scoring sequence. Preferably, when N equals 3, the meaningful frequencies are $\{f_1, f_2, f_3\}$ with corresponding periods $\{p_1, p_2, p_3\}$, which can be labeled the long period, the medium period, and the short period, respectively, and are used in the subsequent optimization of the predicted score.
A user's interest in a commodity follows a certain periodic pattern, but the interest period is difficult to obtain directly from the user scoring matrix. The Fourier transform, which is widely used in the signal field, decomposes the original function into a number of sinusoidal functions, so a few dominant components can be selected to approximate the periodic characteristics of the user's interest changes. Accordingly, after the fast Fourier transform, the first N components with the highest amplitudes are selected as the user's periodic basis functions. Preferably, the periodic basis functions of the 3 frequency points with the highest amplitudes, together with their amplitudes, are selected. On the one hand, this removes meaningless high-frequency noise that may be present after the Fourier transform; on the other hand, it accounts for periodic influences of different period lengths. The period lengths are referred to as the long period, the medium period, and the short period. Roughly speaking, the long period represents periodic characteristics on the scale of a quarter or longer, while the medium and short periods represent shorter scales such as months, weeks, or days. In practice, however, the three strongest periods do not necessarily correspond to such clean quarterly, monthly, weekly, or daily spans.
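The period extraction of step S32 can be illustrated with the following Python sketch. It is a minimal sketch under stated assumptions, not a definitive implementation: the function name `periodic_features` is hypothetical, the scoring sequence is assumed to be evenly sampled, a plain discrete Fourier transform from the standard library stands in for the FFT (it produces the same spectrum, only more slowly), the constant component (bin 0) is skipped, and the period of frequency bin f is assumed to be T / f samples.

```python
import cmath
import math


def periodic_features(seq, n_top=3):
    """Return the n_top strongest frequency points of a scoring sequence.

    Each entry is (amplitude, bin index f, period length T / f in samples),
    sorted by amplitude from high to low.
    """
    T = len(seq)
    mean = sum(seq) / T  # remove the constant offset before transforming
    feats = []
    for f in range(1, T // 2 + 1):  # positive frequencies only, bin 0 skipped
        coeff = sum(
            (x - mean) * cmath.exp(-2j * math.pi * f * n / T)
            for n, x in enumerate(seq)
        )
        feats.append((abs(coeff), f, T / f))
    feats.sort(key=lambda item: item[0], reverse=True)
    return feats[:n_top]
```

For a sequence of 32 samples that repeats every 8 samples, the strongest entry is bin 4 with period 8.0. In practice a library FFT routine such as `numpy.fft.rfft` would normally replace the explicit loop.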
Step S33: the time interval between the current time and the target user's last score for the category of the j-th commodity is obtained. Let the time of the current score prediction be $t$ and the time of the target user's last score for the category of the j-th commodity be $t_n$; the time interval is then $t - t_n$.
Step S34: the time adjustment factor of the j-th commodity is determined using the time interval and the periodic features. Preferably, this comprises:
Step S341: the time interval is input into the periodic basis functions of the N frequency points to obtain N function values, and the N function values are regularized to obtain the regularized values of the N frequency points;
Step S342: the regularized value of each frequency point is multiplied by the amplitude value of that frequency point to obtain the product value of each frequency point;
Step S343: the ratio of the sum of the product values of the N frequency points to the sum of the amplitude values of the N frequency points is calculated and taken as the time adjustment factor of the j-th commodity.
Specifically, when N is equal to 3, the time adjustment factor of the j-th commodity is calculated according to the following formula:
$$w_{uj}^{t} = \frac{\sum_{i=1}^{3} A_i \cdot \mathrm{Normal}\big(F_i(t - t_n)\big)}{\sum_{i=1}^{3} A_i}$$
where $w_{uj}^{t}$ denotes the time adjustment factor of the target user $u$ for the j-th commodity at the time $t$ of the current score prediction, with value range $[0, 1]$; $i$ is the frequency point index, with $i \le N$; $A_i$ is the amplitude value of the $i$-th frequency point; $F_i(\cdot)$ is the periodic basis function of the $i$-th frequency point; and $\mathrm{Normal}(\cdot)$ is a regularization function that maps $F_i(t - t_n)$ into the interval $[0, 1]$. The time adjustment factor equals 1 when time $t$ is a periodic peak of all three periodic basis functions, and equals 0 when time $t$ is a periodic trough of all three periodic basis functions.
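Steps S341 to S343 can be sketched in Python as follows. This is an illustrative sketch under assumptions that the description leaves open: the function name `time_adjustment_factor` is hypothetical, each periodic basis function is assumed to be a cosine of the form cos(2π·x / p_i) with no phase offset, the regularization is assumed to be the linear rescaling (value + 1) / 2 from [-1, 1] to [0, 1], and a neutral value of 0.5 is returned when all amplitudes are zero.

```python
import math


def time_adjustment_factor(features, interval):
    """Amplitude-weighted average of regularized basis-function values.

    features: list of (amplitude A_i, period p_i) for the selected frequency points
    interval: time since the last score, in the same unit as the periods
    """
    total_amp = sum(amp for amp, _ in features)
    if total_amp == 0.0:
        return 0.5  # assumed neutral value when no periodicity was found
    weighted = 0.0
    for amp, period in features:
        value = math.cos(2.0 * math.pi * interval / period)  # assumed basis function
        regularized = (value + 1.0) / 2.0                    # assumed mapping to [0, 1]
        weighted += amp * regularized
    return weighted / total_amp
```

With these assumptions the factor is 1 when the interval is zero or a common multiple of all periods (every basis function at its peak) and 0 when every basis function is at its trough.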
Step S35: the base score of the j-th commodity is corrected using the time adjustment factor of the j-th commodity to obtain the predicted score of the j-th commodity. The correction is preferably, but not limited to, multiplying the base score of the j-th commodity by its time adjustment factor or adding the time adjustment factor to it.
Step S4: a scoring sequence is obtained by sorting the predicted scores of all commodities from high to low, and the first K commodities in the scoring sequence are recommended to the target user, where K is a positive integer and i is a commodity index.
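The ranking of step S4 amounts to a descending sort followed by truncation, as in this short Python sketch. The function name `recommend_top_k` and the dictionary input are hypothetical choices made for illustration.

```python
def recommend_top_k(predicted, k):
    """Return the ids of the k commodities with the highest predicted scores.

    predicted: dict mapping commodity id -> predicted score
    """
    ranked = sorted(predicted.items(), key=lambda pair: pair[1], reverse=True)
    return [commodity for commodity, _ in ranked[:k]]
```

If fewer than k commodities have a predicted score, all of them are returned in ranked order.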
In another preferred embodiment of the present invention, in the step of determining the predicted scores of the target user for all commodities (i.e., step S3), if the historical interaction data contains no score record of the target user for the category of the j-th commodity, the predicted score of the j-th commodity is its base score. If the historical interaction data contains a score record of the target user for the category of the j-th commodity, the predicted score of the j-th commodity is obtained according to steps S31 to S35. This accounts for the fact that, in practical data sets, the target user may have no interaction record for some commodity categories, and thereby improves the generality of the recommendation method.
In a preferred embodiment of the present invention, the step S35 of correcting the base score of the jth commodity by using the time adjustment factor of the jth commodity to obtain the predicted score of the jth commodity includes:
s351, mapping the time adjustment factor of the jth commodity by using a preset mapping function to obtain a time adjustment mapping value of the jth commodity. The mapping function is expressed as Map (), and a linear mapping function or a nonlinear mapping function, such as an exponential function, can be selected and used forMapping a time adjustment factor to a set superparameter kmin ,kmax Between, i.e. the time adjustment factor ranges from [0,1]Mapping to [ k ]min ,kmax ]Expressed asIn this way, errors introduced in the value interval of the time adjustment factor are eliminated, for example, when the basic score is corrected by multiplying the time adjustment factor by the basic score, if the time adjustment factor is 0, the predicted score is 0, and in fact, the basic score is not 0, which is unreasonable. [ k ]min ,kmax ]Enhancement and suppression amplitudes for setting a time adjustment factor, if the current time adjustment factor value is 1, the mapped result is kmax ,kmax And kmin So as to be capable of carrying out super parameter adjustment according to actual conditions. Preferably, [ k ]min ,kmax ]The interval is set to be centered on 1, satisfying kmax -kmin The value interval of less than or equal to 0.2, e.g. [0.9,1.1 ]]、[0.95,1.05]。
Step S352: the base score of the j-th commodity is corrected using the time adjustment mapping value of the j-th commodity to obtain the predicted score of the j-th commodity. Preferably, the predicted score of the j-th commodity is obtained by multiplying the base score of the j-th commodity by its time adjustment mapping value. Multiplying by the mapping value scales the base score to a certain extent: if the current time falls at a peak of the periodic basis functions of the frequency points, the base score is enhanced, and if it falls at a trough, the base score is suppressed to a certain extent.
In this embodiment, it is further preferable that, to prevent the scaled score (i.e., the predicted score obtained after correction by the time adjustment mapping value) from having an unreasonable distribution that does not match the target user's scoring habits, the predicted score of the j-th commodity obtained in step S352 is subjected to domain restriction processing using the target user's scoring domain for the category of the j-th commodity, yielding the final predicted score of the j-th commodity.
The domain restriction processing is as follows. Let the target user's scoring domain for the commodity category $c$ of the j-th commodity be $[r_{uc}^{\min}, r_{uc}^{\max}]$. If the predicted score obtained in step S352 is less than or equal to $r_{uc}^{\min}$, the final predicted score is set to $r_{uc}^{\min}$; if it is greater than $r_{uc}^{\min}$ and less than $r_{uc}^{\max}$, the final predicted score is the obtained predicted score itself; if it is greater than or equal to $r_{uc}^{\max}$, the final predicted score is set to $r_{uc}^{\max}$. Finally, a scoring sequence is obtained by sorting the final predicted scores of all commodities from high to low, and the first K commodities in the scoring sequence are recommended to the target user. In this way, the historical interaction information is fully used and the target user's score for a given class of commodities is kept within a reasonable range.
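In this embodiment, the process of correcting the base score of the j-th commodity with the time adjustment factor of the j-th commodity to obtain the predicted score can be expressed by the following formulas, in which $w_{uj}^{t}$ denotes the time adjustment factor, $m_{uj}^{t}$ the time adjustment mapping value, $P_{uj}$ the base score, $\hat{r}_{uj}$ the predicted score, and $\hat{r}_{uj}^{\mathrm{final}}$ the final predicted score of the target user $u$ for the j-th commodity.
The time adjustment factor of the j-th commodity is mapped to the time adjustment mapping value:
$$m_{uj}^{t} = \mathrm{Map}\big(w_{uj}^{t}\big), \quad m_{uj}^{t} \in [k_{\min}, k_{\max}]$$
The base score of the j-th commodity is corrected using the time adjustment mapping value of the j-th commodity to obtain the predicted score of the user for the j-th commodity:
$$\hat{r}_{uj} = m_{uj}^{t} \cdot P_{uj}$$
Domain restriction processing is performed using the scoring domain to obtain the final predicted score of the target user $u$ for the j-th commodity:
$$\hat{r}_{uj}^{\mathrm{final}} = \mathrm{Limit}\big(\hat{r}_{uj}\big)$$
where $\mathrm{Limit}(\cdot)$ denotes the domain restriction processing operation, which restricts its argument to the scoring domain $[r_{uc}^{\min}, r_{uc}^{\max}]$.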
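The correction chain of this embodiment (mapping, multiplication, domain restriction) can be put together in one short Python sketch. The function name `final_predicted_score` is hypothetical, a linear mapping is assumed, and passing `None` as the factor is an assumed convention for the case in which the target user has no score record for the category, so that the base score is returned unchanged.

```python
def final_predicted_score(base, factor, domain, k_min=0.9, k_max=1.1):
    """Map the time adjustment factor, scale the base score, then restrict it.

    base:   base score of the commodity
    factor: time adjustment factor in [0, 1], or None when the user has no
            score record for the commodity's category (assumed convention)
    domain: (lowest, highest) scoring domain of the commodity's category
    """
    if factor is None:
        return base  # no periodic information: keep the base score
    mapped = k_min + (k_max - k_min) * factor  # assumed linear mapping
    predicted = mapped * base                  # scale the base score
    lo, hi = domain
    return max(lo, min(hi, predicted))         # domain restriction
```

For example, a base score of 4.0 with factor 1.0 is scaled to about 4.4 and then restricted to 4.2 when the user's scoring domain for the category is (2.0, 4.2).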
This embodiment takes into account that, in the real world, people's work, daily life, consumption, and entertainment are periodic, and that because society itself operates in cycles, statistical behavior is more or less periodic whether for society as a whole or for an individual. The Fourier transform is widely used in the signal field: it converts a signal sampled in the time domain into the frequency domain for processing, and a notable property is that it can extract the periods of the sampled sequence. The recommendation field can therefore also use the fast Fourier transform to obtain the period of a user's interaction preference for an item; the more complete the sampling of the original interaction data and the richer the data, the better the extracted periods. By incorporating the periods extracted by the fast Fourier transform, the recommendation algorithm takes time factors fully into account and achieves better results by exploiting the periodic effects of society and of individuals. Because meaningless high-frequency noise may be present in the frequency domain, the algorithm selects the three strongest components, corresponding to the relatively long, medium, and short periods, to compute the time adjustment factor, which makes more efficient use of the periodic features.
The invention also discloses a recommendation device based on an adaptive user interest change period, comprising: a data acquisition module for acquiring historical interaction data between users and commodities; a base score acquisition module for determining the base score of the target user for each commodity based on the historical interaction data; a predicted score acquisition module for determining the predicted scores of the target user for all commodities, wherein the predicted score of the target user for the j-th commodity is determined as follows: extracting from the historical interaction data a scoring sequence of the target user over a preset time length for the category of the j-th commodity; extracting periodic features of the scoring sequence; obtaining the time interval between the current time and the target user's last score for the category of the j-th commodity; determining a time adjustment factor of the j-th commodity using the time interval and the periodic features; and correcting the base score of the j-th commodity using the time adjustment factor of the j-th commodity to obtain the predicted score of the j-th commodity; and a ranking output module for obtaining a scoring sequence by sorting the predicted scores of all commodities from high to low and recommending the first K commodities in the scoring sequence to the target user, where K is a positive integer and i is a commodity index.
In this embodiment, each module corresponds to the steps of the recommendation method based on the adaptive user interest change period disclosed above, and will not be described herein.
The invention also discloses a recommendation system for implementing the recommendation method based on the adaptive user interest change period disclosed herein. The specific flow is shown in FIG. 2, and the system comprises:
A data processing module, used for acquiring historical interaction data between users and commodities and generating a user scoring matrix and a scoring domain for each commodity category based on the historical interaction data. The historical interaction data is assumed to be stored in a scoring database. Commodities with the same label are grouped into one commodity class. A user scoring matrix of the target user, i.e., the scoring record matrix, is generated.
A model optimization module, used for calculating the similarity between the target user and other users and screening out a neighbor user set with higher similarity; determining the base score of the target user for each commodity based on the historical interaction data; and determining the predicted scores of the target user for all commodities, wherein the predicted score of the target user for the j-th commodity is determined as follows: extracting from the historical interaction data a scoring sequence of the target user over a preset time length for the category of the j-th commodity; extracting periodic features of the scoring sequence; obtaining the time interval between the current time and the target user's last score for the category of the j-th commodity; determining a time adjustment factor of the j-th commodity using the time interval and the periodic features; and correcting the base score of the j-th commodity using the time adjustment factor of the j-th commodity to obtain the predicted score of the j-th commodity.
A recommendation module, used for obtaining a scoring sequence by sorting the predicted scores of all commodities from high to low and recommending the first K commodities in the scoring sequence to the target user, where K is a positive integer and i is a commodity index.
In this embodiment, each module corresponds to the steps of the recommendation method based on the adaptive user interest change period disclosed above, and will not be described herein.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.