Summary of the invention
To solve the above problem, the present invention provides a commodity recommendation method in e-commerce and a system thereof. The technical solution is as follows:
A commodity recommendation method in e-commerce comprises the following steps:
Step S1: collect the historical behavior data of a user on an e-commerce website;
Step S2: compute commodity features from the user's historical behavior data, and output a feature vector for predicting the user's commodity purchase probability;
Step S3: substitute the output feature vectors, by feature and by class, into a statistical function to compute the user's purchase probability prediction model, thereby obtaining the purchase probability prediction model;
Step S4: recommend commodities according to the user's purchase probability prediction model and the user login information.
The method further comprises:
Step S5: use the purchase probability prediction model to compute the purchase probability of related or similar commodities;
Step S6: recommend the related or similar commodities to the user according to the computed purchase probability.
The historical behavior data comprise: user login information, the web pages browsed by the user, the time spent browsing each page, the commodities searched for by the user, the commodities the user added to favorites, the commodities the user added to the shopping cart, the commodities for which the user submitted orders, and the purchase-to-browse ratio.
Constructing the user's commodity purchase probability prediction feature vector from the user's historical behavior data specifically comprises the following steps:
1) based on the user's historical data, compute the weights of the different page types on the e-commerce website, the time decay function of each commodity category, and the related or similar commodity set of each commodity; wherein the page weight computation collects the conversion rate of each source page type, i.e., each page type's contribution to conversion, and derives the source page weight from that conversion rate;
2) construct the commodity purchase prediction feature vector from the results computed in 1).
Wherein the computation of the related or similar commodity sets in step 1) obtains the related or similar commodity set of each commodity and computes the related or similar commodities of each commodity category, and specifically comprises:
1.1) classify commodities by their similarity, each class being a commodity group;
1.2) compute the associated commodity groups of each commodity using collaborative filtering or association rules;
1.3) for each commodity group, take all commodities under the top N most correlated commodity groups as the related or similar commodities of the commodities in that group.
Wherein constructing the commodity purchase prediction feature vector in step 2) comprises computing the feature values of, respectively: the related or similar commodities of the commodities added to the shopping cart; the related or similar commodities of the commodities added to favorites; the number of times and the duration the commodity was browsed; the number of times and the duration similar or related commodities were browsed; and the record of purchases of related commodities.
The method further comprises training the purchase prediction model.
Wherein step S4 specifically comprises:
Step S401: based on the user's historical purchase behavior data, use association rules or collaborative filtering to compute the associated commodities of a commodity, and take the top n commodities with the highest degree of association as the related or similar commodity set of that commodity;
Step S402: use the purchase probability prediction model and the user login information to compute the purchase probability of the related or similar commodities.
A system employing the commodity recommendation method in e-commerce comprises:
a user behavior data acquisition module, for collecting the historical behavior of a user on the e-commerce website, or for collecting user login information and obtaining the historical information of the logged-in user;
a user purchase prediction feature vector computation module, for constructing the user feature vector from the user's historical behavior or historical information, or from the historical behavior data of the logged-in user;
a purchase probability prediction model module, for training the purchase probability prediction model on the user feature vectors and thereby computing the purchase probability of commodities, or for computing the user's purchase probability for a commodity from the feature vector of the logged-in user and the trained purchase probability prediction model;
a user commodity recommendation module, for recommending commodities to the user according to their purchase probabilities.
Wherein the user purchase prediction feature vector computation module further comprises:
a similarity computation module, which classifies commodities by their similarity, each class being a commodity group;
an associated commodity group computation module, which uses collaborative filtering or association rules to compute the associated commodity groups of each commodity;
an associated commodity acquisition module, which, for each commodity group, takes all commodities under the top N most correlated commodity groups as the related or similar commodities of the commodities in that group.
The system employs the commodity recommendation method described above.
The invention provides a commodity recommendation method in e-commerce and a system thereof, which recommend commodities that match customer needs, achieve personalized recommendation, improve customer satisfaction, and enhance the user experience.
Embodiment
Commodity purchase probability prediction provides the basic prediction data that an e-commerce website uses for commodity marketing, personalized recommendation, and the like. The present invention proposes a commodity recommendation method in e-commerce based on commodity purchase probability, and a system thereof. Based on the historical behavior of a user on the e-commerce website, the method uses linear regression and statistical classification to predict the user's purchase probability for commodities on the website.
Fig. 1a and Fig. 1b show the flow of a specific embodiment of the invention, which can be decomposed into the steps shown in Fig. 2, as follows:
1) collect the historical behavior of the user on the e-commerce website;
2) construct the user feature vector from the user's historical behavior;
3) train a statistical classification model on the user feature vectors, and thereby compute the purchase probability of commodities;
4) output the purchase probability.
Or,
1) collect the user login information;
2) obtain the historical information of the logged-in user, and construct the user feature vector from the historical behavior data of the logged-in user;
3) compute the user's purchase probability for a commodity from the feature vector of the logged-in user and the trained purchase probability prediction model;
4) output the purchase probability.
Wherein, as shown in Fig. 3, the purchase probability prediction module further comprises:
1. User historical behavior data acquisition module
This module is mainly used to collect the historical behavior data of the user, including the web pages browsed by the user, the time spent browsing each page, the commodities searched for, the commodities added to favorites, the commodities added to the shopping cart, and the commodities for which orders were submitted;
2. User feature vector computation module
This module constructs the user's commodity purchase probability prediction feature vector from the user's historical data.
The module has two main parts: 1) from the complete set of user historical data, compute the contribution weights of the different page types on the e-commerce website to conversion, the time decay function of each commodity category, and the related and similar commodity set of each commodity; 2) construct the feature vector from the historical records of the commodity.
As shown in Fig. 4a, constructing the user's commodity purchase probability prediction feature vector from the user's historical behavior data specifically comprises the following steps:
b.1) based on the user's historical data, compute the weights of the different page types on the e-commerce website, the time decay function of each commodity category, and the related and similar commodity set of each commodity; wherein the page weight computation collects the conversion rate of each source page type, i.e., each page type's contribution to conversion, and derives the source page weight from that conversion rate;
b.2) construct the commodity purchase prediction feature vector from the results computed in b.1).
As shown in Fig. 4b, after the user's historical behavior data are obtained, the following computations are performed:
2.1 Page weight computation
Collect the conversion rate of each source page type and derive the source page weights from the conversion rates. The page weights are computed as follows:
1) compute the browse-and-purchase ratio of each source page type;
2) take the browse-and-purchase ratio of the catalog page as the base; the ratio of each other page type's value to this base is the weight of that page type;
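The two steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `page_weights`, the page-type labels, and the sample counts are hypothetical, and the browse-and-purchase ratio is assumed here to mean purchases divided by page views.

```python
def page_weights(stats, base_page="catalog"):
    """Return the weight of each source page type relative to the base page.

    stats maps a page type to a (views, purchases) pair.
    Assumption: the ratio is purchases / views (a conversion rate).
    """
    # Step 1: ratio per source page type; page types with no views are skipped.
    ratios = {
        page: purchases / views
        for page, (views, purchases) in stats.items()
        if views > 0
    }
    # Step 2: the catalog page is the base, so its own weight is 1.0.
    # A base ratio of zero would make the weights undefined (ZeroDivisionError).
    base = ratios[base_page]
    return {page: ratio / base for page, ratio in ratios.items()}


# Hypothetical sample counts, for illustration only.
sample_stats = {
    "catalog": (1000, 20),
    "search": (500, 20),
    "promotion": (2000, 20),
}
weights = page_weights(sample_stats)
```

Under these sample counts a page type that converts twice as well as the catalog page receives twice its weight.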
2.2 Time decay function computation
Learn the time decay function of each commodity category according to the time effect, and compute the time-effect factor of the different commodity categories. Specifically, a time decay function of the following form is used:
where K is the half-life of the decay function. The half-life for each commodity category is estimated from the proportions of browsing and purchase times of that category's commodities in different time periods.
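The formula itself is not reproduced in this text, so the following is only a sketch under a stated assumption: it uses the common exponential half-life form, in which the value halves every K days. The function name `time_decay` and the parameter names are hypothetical.

```python
def time_decay(days_ago, half_life):
    """Decay factor for an event that happened `days_ago` days in the past.

    Assumption: exponential half-life decay, 0.5 ** (t / K). The source text
    only states that K is the half-life, not the exact functional form.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (days_ago / half_life)


# An event from today keeps its full value; one from K days ago keeps half.
today_factor = time_decay(0, 7)
one_half_life_factor = time_decay(7, 7)
two_half_lives_factor = time_decay(14, 7)
```

A category whose purchases follow browsing quickly would use a small K, so that older behavior fades faster.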
2.3 Related and similar commodities
Obtain the related and similar commodity set of each commodity, and compute the related and similar commodities of each commodity category. The computation flow for related and similar commodities, shown in Fig. 5, is as follows:
1) classify commodities by their similarity, each class being a commodity group;
2) compute the associated commodity groups of each commodity using collaborative filtering or association rules;
3) for each commodity group, take all commodities under the top 3 most correlated commodity groups as the related or similar commodities of the commodities in that group.
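The flow above can be sketched as follows. As a stand-in for collaborative filtering or association rules, this illustration simply counts how often two commodity groups appear in the same order; that substitution, the function name `related_commodities`, and the sample data are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def related_commodities(group_of, orders, top_n=3):
    """Map each commodity group to the commodities of its top-N related groups.

    group_of maps a commodity to its commodity group (step 1 is assumed done).
    orders is a list of orders, each a list of commodities bought together.
    """
    # Collect the members of each commodity group.
    members = defaultdict(list)
    for commodity, group in group_of.items():
        members[group].append(commodity)

    # Step 2 (simplified): co-occurrence counts between groups in one order.
    co_occurrence = defaultdict(Counter)
    for order in orders:
        groups_in_order = {group_of[c] for c in order}
        for g, h in permutations(groups_in_order, 2):
            co_occurrence[g][h] += 1

    # Step 3: all commodities under the top-N most correlated groups.
    related = {}
    for group in members:
        top_groups = [h for h, _ in co_occurrence[group].most_common(top_n)]
        related[group] = sorted(c for h in top_groups for c in members[h])
    return related


# Hypothetical sample data.
sample_groups = {
    "phone_a": "phones", "phone_b": "phones",
    "case_a": "cases", "charger_a": "chargers", "tv_a": "tvs",
}
sample_orders = [
    ["phone_a", "case_a"],
    ["phone_b", "case_a", "charger_a"],
    ["phone_a", "case_a"],
    ["tv_a"],
]
related_top1 = related_commodities(sample_groups, sample_orders, top_n=1)
related_top2 = related_commodities(sample_groups, sample_orders, top_n=2)
```

Working at the group level rather than per commodity keeps the correlation table small, which appears to be the motivation for grouping commodities first.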
As shown in Fig. 6, computing the purchase probability of a commodity involves the following 8 cases:
1) the number of related and similar commodities recently added to the shopping cart;
2) the number of related and similar commodities recently added to favorites;
3) the recent browsing time of this commodity, further weighted by source page using the page weights;
4) the recent number of views of this commodity, further weighted by source page using the page weights;
5) the recent number of views of related and similar commodities, further weighted by source page using the page weights;
6) the recent browsing time of related and similar commodities, further weighted by source page using the page weights;
7) the recent number of purchases of related and similar commodities;
8) the recent purchase-to-browse ratio of this commodity.
For each of the above cases, the value for each recent day is further obtained using a membership function, and the recent daily values are aggregated with time decay;
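The per-day scoring and decayed aggregation described above can be sketched as follows. Neither the membership function nor the decay formula is reproduced in this text, so both forms below are assumptions chosen for illustration: a linear membership function that saturates at 1 once the raw value reaches the parameter a, and an exponential half-life decay. The names `membership` and `decayed_feature` are hypothetical.

```python
def membership(value, a):
    """Hypothetical membership function: linear in value, capped at 1.0 at a."""
    return min(value / a, 1.0)


def decayed_feature(daily_values, a, half_life, membership_fn=membership):
    """Aggregate per-day raw values into one feature value.

    daily_values[0] is today, daily_values[1] is yesterday, and so on.
    Each day's value is scored by the membership function, decayed by its
    age (assumed form: 0.5 ** (days_ago / half_life)), and summed.
    """
    total = 0.0
    for days_ago, value in enumerate(daily_values):
        total += membership_fn(value, a) * 0.5 ** (days_ago / half_life)
    return total


# Seven days of hypothetical counts, most recent first.
feature_today_only = decayed_feature([16, 0, 0, 0, 0, 0, 0], a=16, half_life=7)
feature_old_only = decayed_feature([0, 0, 0, 0, 0, 0, 8], a=16, half_life=6)
```

The membership function keeps a single very active day from dominating the feature, while the decay makes recent days count more than older ones.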
Finally, the feature vector is output. Specifically:
2.4 Commodity feature computation
1) Related and similar commodities added to the shopping cart
Take the similar and related commodity set of this commodity, and construct a membership function of the form
a=16
Compute the score for each of the most recent 7 days, decay each score with the time decay function, and sum the decayed scores as the final feature value;
2) Commodities added to favorites
Take the similar and related commodity set of this commodity, and construct a membership function of the form
a=10
Compute the score for each of the most recent 7 days, decay each score with the time decay function, and sum the decayed scores as the final feature value;
3) Number of views of this SKU
Count the views of the fourth-level page belonging to this SKU, determine the source of each fourth-level page view, multiply by the coefficient w given by the source page weight, and obtain the day's score with the membership function, where a=5.
Compute the score for each of the most recent 7 days, decay each score with the time decay function, and sum the decayed scores as the final feature value;
4) Duration of browsing this SKU
Determine the source of each fourth-level page view, multiply by the corresponding source page weight coefficient w, and compute the user's score with the membership function, where the coefficient a=800;
Compute the score for each of the most recent 7 days, decay each score with the time decay function, and sum the decayed scores as the final feature value;
5) Number of views of similar and related SKUs
Take the similar and related commodity set of this commodity, determine the source of each fourth-level page view, and multiply by the weight corresponding to the source of the commodity page. Convert the result to a feature value with the membership function, where a=45.
Compute the score for each of the most recent 7 days, decay each score with the time decay function, and sum the decayed scores as the final feature value;
6) Cumulative browsing duration of related and similar SKU pages
Take the similar and related commodity set of this commodity, take the user's browsing time on the detail pages of the similar and related commodities, multiply by the weight of the corresponding source page, and compute the feature value with the membership function, where a=2000 s.
Compute the score for each of the most recent 7 days, decay each score with the time decay function, and sum the decayed scores as the final feature value;
7) Record of purchases of related products
Take the similar and related commodity set of this commodity. Using the user ID, retrieve the orders the user submitted in the past three months, count the number of SKU commodity types that belong to the related commodity set, and compute the feature value with the membership function, where a=16;
8) Sales/views
The ratio of the number of orders for the SKU to the number of views of the SKU's fourth-level page;
Take the cumulative value over 7 days, without decay.
As shown in Fig. 7, the construction of the purchase prediction model described in step S3 comprises:
Step S301, training set construction: take the feature vectors of the commodities in the shopping cart that were converted into orders as the positive samples of the training set, and the feature vectors of the commodities in the shopping cart that were not converted into orders as the negative samples of the training set;
Step S302, model training: use a neural network, a Bayes classifier, and a support vector machine to obtain the purchase probability;
Step S303, purchase probability score computation: take the ratio of the probability that an order for the commodity is submitted to the probability that it is not submitted as the purchase probability of the commodity, and smooth the resulting probability with a sigmoid function so that it is distributed in the range (0-100). The formula is as follows:
$$\mathrm{Score} = \frac{1}{1 + e^{-r \cdot x}}$$
where x = p1/p2; p1 is the probability that an order for the commodity is submitted; p2 is the probability that an order for the commodity is not submitted; and r is the smoothing factor.
The model training of the present invention uses statistical classification models, including a neural network, a Bayes classifier, and a support vector machine, to compute the purchase probability of each commodity. It comprises:
3.1 Training set construction
Take the feature vectors of the commodities in the shopping cart that were converted into orders as the positive samples of the training set;
Take the feature vectors of the commodities in the shopping cart that were not converted into orders as the negative samples of the training set;
3.2 Model training
1) Neural network
A three-layer feedforward neural network structure can be used. The input of the network is the 8-dimensional feature vector described above, and the output is the probability values that the commodity is purchased and not purchased.
The ratio of the purchase output to the non-purchase output of the network is taken as the purchase probability of the commodity.
2) Bayes classifier
A Gaussian mixture model is used as the feature description model of the commodities. The parameters of a Gaussian mixture model are trained with the expectation-maximization algorithm on the positive samples and on the negative samples separately, giving one Gaussian mixture model for purchase and one for non-purchase. The ratio of the posterior probabilities that the commodity's features obtain under the purchase model and the non-purchase model is taken as the final purchase probability of the commodity;
3) Support vector machine
Train a support vector machine model on the training set constructed above. For an input commodity feature vector, the model yields the distances in feature space from the commodity to the positive-sample decision surface and to the negative-sample decision surface; the ratio of these two distances is used as the final purchase probability value of the commodity.
3.3 Purchase probability score computation
Take the ratio of the probability that an order for the commodity is submitted to the probability that it is not submitted as the purchase probability of the commodity, and smooth the resulting probability with a sigmoid function so that it is distributed in the range (0-100). The formula is as follows:
$$\mathrm{Score} = \frac{1}{1 + e^{-r \cdot x}}$$
where x = p1/p2;
p1: the probability that an order for the commodity is submitted;
p2: the probability that an order for the commodity is not submitted;
r: the smoothing factor.
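The score formula can be sketched as follows. The function name `purchase_score` and the `scale` parameter are hypothetical. The formula as written yields values between 0 and 1, so multiplying by 100 to reach the stated 0-100 range is an assumption of this sketch, not something the text specifies.

```python
import math


def purchase_score(p1, p2, r=1.0, scale=100.0):
    """Sigmoid-smoothed purchase score.

    p1: probability that an order for the commodity is submitted.
    p2: probability that an order is not submitted (must be non-zero).
    r:  smoothing factor.
    scale: assumed factor mapping the sigmoid output onto a 0-100 range.
    """
    x = p1 / p2
    return scale / (1.0 + math.exp(-r * x))


score_even_odds = purchase_score(0.5, 0.5)   # x = 1
score_no_purchase = purchase_score(0.0, 1.0)  # x = 0
score_likely = purchase_score(0.9, 0.1)       # x = 9
```

Because x = p1/p2 is never negative, the sigmoid term never drops below 0.5, so under this sketch the scores fall between 50 and 100 rather than spreading over the whole range.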
Fig. 8a and Fig. 8b show the computation of the purchase probability of similar and related commodities according to the invention, which comprises:
4.1 Similar and related commodity set of a commodity
Based on the user's purchase history data, association rules or collaborative filtering can be used to compute the associated commodities of a commodity; the top n commodities with the highest degree of association are taken as the similar and related commodity set of that commodity.
4.2 Computing the purchase probability of similar and related commodities
Based on the previously computed degree of association between the main commodity and its related commodities, compute the purchase probability of the related and similar commodities with the following formula:
Score_i=Master_SPU_Pos*SKU_Score_i/SUM(SKU_Score_i)
where Master_SPU_Pos is the previously computed purchase probability of the main commodity, SUM(SKU_Score_i) is the sum of the association scores of all associated commodities of the main commodity, and SKU_Score_i is the degree of association between SKU_i and the main commodity.
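The formula distributes the main commodity's purchase probability over its associated commodities in proportion to their association scores. A minimal sketch follows; the function name `related_purchase_scores`, the zero-total fallback, and the sample values are hypothetical.

```python
def related_purchase_scores(master_score, association):
    """Split the main commodity's score across its associated SKUs.

    master_score corresponds to Master_SPU_Pos.
    association maps each SKU_i to SKU_Score_i, its degree of association
    with the main commodity.
    """
    total = sum(association.values())  # SUM(SKU_Score_i)
    if total == 0:
        # Assumption: with no association mass, every related score is zero.
        return {sku: 0.0 for sku in association}
    return {
        sku: master_score * score / total
        for sku, score in association.items()
    }


# Hypothetical values: main commodity score 80, two associated SKUs.
related_scores = related_purchase_scores(80.0, {"sku_1": 3.0, "sku_2": 1.0})
```

The related scores always sum to the main commodity's score, so a strongly associated SKU inherits most of it.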
4.3 Filtering purchased commodities
Take the commodities for which the user submitted orders within the last 7 days, and use the commodity groups to which those commodities belong as the user's filter commodity groups; remove from the user's recommended commodity set the commodities that belong to a filter commodity group;
4.4 Sorting
Sort the commodities of the filtered recommended commodity set by their purchase probability, and use the sorted commodities as the user's recommendation list.
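The filtering and sorting steps can be sketched together as follows. The function name `recommend` and the sample data are hypothetical, and the list of recently ordered commodities is assumed to have already been restricted to the last 7 days.

```python
def recommend(candidates, group_of, recent_orders):
    """Return the recommendation list, highest purchase probability first.

    candidates maps a commodity to its purchase probability.
    group_of maps a commodity to its commodity group.
    recent_orders lists commodities ordered by the user in the last 7 days.
    """
    # 4.3: groups of recently ordered commodities are filtered out entirely.
    filter_groups = {group_of[c] for c in recent_orders}
    kept = [
        (commodity, probability)
        for commodity, probability in candidates.items()
        if group_of[commodity] not in filter_groups
    ]
    # 4.4: sort the remaining commodities by purchase probability, descending.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [commodity for commodity, _ in kept]


# Hypothetical sample data.
sample_group_of = {
    "phone_a": "phones", "phone_b": "phones",
    "case_a": "cases", "charger_a": "chargers",
}
sample_candidates = {"phone_b": 90.0, "case_a": 60.0, "charger_a": 75.0}
recommendation_list = recommend(sample_candidates, sample_group_of, ["phone_a"])
```

Filtering by group rather than by single commodity also removes near-duplicates of what the user has just bought.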
Fig. 9 is a structural diagram of the system of the present invention. The system comprises: a user behavior data acquisition module 201, for collecting the historical behavior of a user on the e-commerce website, or for collecting user login information and obtaining the historical information of the logged-in user; a user purchase prediction feature vector computation module 202, for constructing the user feature vector from the user's historical behavior or historical information, or from the historical behavior data of the logged-in user, which further comprises: a similarity computation module 2021, which classifies commodities by their similarity, each class being a commodity group; an associated commodity group computation module 2022, which uses collaborative filtering or association rules to compute the associated commodity groups of each commodity; and an associated commodity acquisition module 2023, which, for each commodity group, takes all commodities under the top N most correlated commodity groups as the related or similar commodities of the commodities in that group; a purchase probability prediction model module 203, for training the purchase probability prediction model on the user feature vectors and thereby computing the purchase probability of commodities, or for computing the user's purchase probability for a commodity from the feature vector of the logged-in user and the trained purchase probability prediction model; and a user commodity recommendation module 204, for recommending commodities to the user according to their purchase probabilities. In addition, the system employs the commodity recommendation method described above.
Those skilled in the art should understand that the above embodiments are examples given for the purpose of illustration only and are not limiting. Any modification or equivalent replacement made under the teaching of the present application and its claims shall fall within the scope of protection claimed by this application.