A commodity recommendation method and device based on big data
Technical field
The present invention relates to the field of big data processing technology, and in particular to a commodity recommendation method and device based on big data.
Background art
In the home decoration field, especially for articles such as flooring and ceramic tile, the repeat purchase rate of users is generally low, so it is difficult to predict purchasing behavior from a user's historical data. In the sales process, accurately grasping the user's purchasing needs and then recommending the commodities the user needs is a very important step. Traditional recommendation mostly relies on the industry experience of sales staff, and when facing a new user there are too many uncertain factors, so the accuracy of the recommended commodities fluctuates widely.
With big data, a relatively stable shopping model and user portraits can be formed from previously accumulated customer purchasing behavior and commodity information, and a more stable way of recommending commodities can be provided by obtaining the information data of a new user. How to use big data to provide new users with highly accurate commodity recommendations is therefore a problem worth solving.
Summary of the invention
The present invention provides a commodity recommendation method and device based on big data, which can provide new users with commodity recommendations of relatively high accuracy.
A commodity recommendation method based on big data provided by the invention comprises the following steps:
Step A: collect the commodity data and user data of completed transactions, preprocess the user data, and form feature samples comprising the commodity data and the user data;
Step B: construct a model from the feature samples;
Step C: obtain the behavior record data of a new user;
Step D: send the sorted recommended commodity data to the new user.
Further, in step A:
The commodity data comprise: the name, model, specification, and price of the commodity;
The user data comprise: the personal basic information of the user, geographical location information, browsing record information, consumption record information, and time information corresponding to the browsing record information and the consumption record information;
The preprocessing comprises operations including data cleaning, filling, and normalization.
Further, step B specifically comprises:
Step B1: set the constant value that minimizes the loss function, and calculate the value of the negative gradient of the loss function at the current model as the residual value;
Step B2: set the leaf node regions of a regression tree so as to fit the residual values;
Step B3: estimate the value of each leaf node region by line search so as to minimize the loss function;
Step B4: update the regression tree to obtain the output model.
Further, the behavior record data of the new user in step C specifically comprise: the personal basic information of the user, geographical location information, browsing record information, and time information corresponding to the browsing record information.
Further, step D specifically comprises:
Step D1: perform association degree matching between the behavior record data of the new user and the model, and sort by degree of association; wherein different weights are assigned, in descending order, to the geographical location information, the browsing record information, the time information corresponding to the browsing record information, and the personal basic information of the user;
Step D2: obtain the commodity information to be recommended according to the matching result;
Step D3: sort the commodities to be recommended in descending order of the probability that each commodity is purchased, generating a recommended commodity set;
Step D4: send the commodity data of the recommended commodity set to the new user.
Further, the probability that a commodity is purchased in step D3 is calculated by the following formula:
where D denotes the probability that the user purchases a commodity, u denotes the new user, in(i) denotes the set of commodities that point to commodity i, out(j) denotes the set of commodities that commodity j points to, PR(j) denotes the probability that commodity j is purchased, and PR(i) denotes the probability that commodity i is purchased.
Further, the commodity set is formed from the commodities whose support is greater than a support threshold, where the support denotes the probability that commodities are purchased together and is calculated by the following formula:
Support(A ∩ B) = Freq(A ∩ B) / N
where Freq(A ∩ B) denotes the number of times commodity A and commodity B are purchased together, N denotes the total number of sales transactions, and Support(A ∩ B) is the probability that commodity A and commodity B are purchased together, that is, the support of commodity A and commodity B.
A commodity recommendation device based on big data provided by the invention comprises a computer-readable medium storing computer-readable instructions, the computer-readable instructions being executable by a processor to implement any of the methods described above.
The beneficial effects of the present invention are as follows: the present invention discloses a commodity recommendation method and device based on big data. Feature samples are formed by collecting the commodity data and user data of completed transactions, and a model is then constructed from them; by obtaining the behavior record data of a new user, personalized commodities are recommended to the new user and sorted. The present invention can thus provide new users with commodity recommendations of high accuracy.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawings and examples.
Fig. 1 is a flowchart of a commodity recommendation method based on big data according to an embodiment of the present invention;
Fig. 2 is a flowchart of step B of the commodity recommendation method based on big data according to an embodiment of the present invention;
Fig. 3 is a flowchart of step D of the commodity recommendation method based on big data according to an embodiment of the present invention.
Detailed description of the embodiments
With reference to Figs. 1 to 3, a commodity recommendation method based on big data provided by an embodiment of the present invention comprises the following steps:
Step A: collect the commodity data and user data of completed transactions, preprocess the user data, and form feature samples comprising the commodity data and the user data;
Step B: construct a model from the feature samples;
Step C: obtain the behavior record data of a new user;
Step D: send the sorted recommended commodity data to the new user.
Further, in step A:
The commodity data comprise: the name, model, specification, and price of the commodity;
The user data comprise: the personal basic information of the user, geographical location information, browsing record information, consumption record information, and time information corresponding to the browsing record information and the consumption record information; the personal basic information includes name, gender, age, and phone number.
The preprocessing comprises operations including data cleaning, filling, and normalization; the data cleaning includes removing abnormal data and useless data.
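The preprocessing of step A can be sketched as follows. This is only a minimal illustration under stated assumptions: the text does not define what counts as abnormal or useless data, how gaps are filled, or which normalization is used, so the sketch treats negative values as abnormal, exact duplicates as useless, fills gaps with the field mean, and applies min-max normalization. The function name `preprocess` and the field names are hypothetical.

```python
from statistics import mean


def preprocess(rows, fields):
    """Clean, fill, and min-max normalize the numeric fields of user records."""
    # Cleaning (assumed rules): drop exact duplicates and rows with negative values.
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(row.get(f) for f in fields)
        if key in seen or any(v is not None and v < 0 for v in key):
            continue
        seen.add(key)
        cleaned.append(dict(row))
    if not cleaned:
        return []
    # Filling (assumed rule): replace missing values with the mean of the known values.
    for f in fields:
        known = [r[f] for r in cleaned if r.get(f) is not None]
        fill = mean(known) if known else 0.0
        for r in cleaned:
            if r.get(f) is None:
                r[f] = fill
    # Normalization (assumed rule): rescale each field to the range [0, 1].
    for f in fields:
        lo = min(r[f] for r in cleaned)
        span = max(r[f] for r in cleaned) - lo
        for r in cleaned:
            r[f] = (r[f] - lo) / span if span else 0.0
    return cleaned


# Hypothetical user records: one missing age, one negative spend, one duplicate.
records = [
    {"age": 30, "spend": 100.0},
    {"age": None, "spend": 300.0},
    {"age": 50, "spend": -5.0},
    {"age": 30, "spend": 100.0},
    {"age": 40, "spend": 200.0},
]
samples = preprocess(records, ["age", "spend"])
```

Categorical fields such as gender or location would need a separate encoding step, which the text does not describe and the sketch therefore leaves out.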
Further, step B specifically comprises:
Step B1: set the constant value that minimizes the loss function, and calculate the value of the negative gradient of the loss function at the current model as the residual value;
Step B2: set the leaf node regions of a regression tree so as to fit the residual values;
Step B3: estimate the value of each leaf node region by line search so as to minimize the loss function;
Step B4: update the regression tree to obtain the output model.
The loss function is made to decrease along the gradient direction: the value of the negative gradient of the loss function at the current model is used as an approximation of the residual in the boosting tree algorithm for regression problems, and a regression tree is fitted to it. In every round of iteration, the negative gradient of the loss function at the current model is fitted.
Over multiple rounds of iteration, each round produces a weak classifier, and each classifier is trained on the residual of the classifier from the previous round. The weak classifiers are generally required to be simple enough, with low variance and high bias, because the training process continuously improves the precision of the final classifier by reducing the bias.
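Steps B1 to B4 can be illustrated with a small gradient boosting sketch. It rests on several assumptions not stated in the text: a squared-error loss (for which the negative gradient equals the ordinary residual and the line search in each leaf reduces to the mean residual), a single numeric feature, depth-1 regression trees, and a learning rate `lr`. The names `fit_stump` and `gradient_boost` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree (two leaf regions) to the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate split thresholds
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        # For squared error, the loss-minimizing leaf value is the mean residual.
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # all feature values equal: a single region
        m = sum(residuals) / len(residuals)
        return xs[0], m, m
    return best[1:]


def gradient_boost(xs, ys, rounds=50, lr=0.5):
    """Return a prediction function built from boosted regression stumps."""
    f0 = sum(ys) / len(ys)  # B1: constant that minimizes squared error
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(rounds):
        # B1: negative gradient of 0.5 * (y - F(x))^2 is the residual y - F(x).
        residuals = [y - p for y, p in zip(ys, preds)]
        # B2 and B3: choose leaf regions and their values.
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        # B4: update the model with the new tree.
        preds = [p + lr * (lv if x <= t else rv) for x, p in zip(xs, preds)]

    def predict(x):
        return f0 + sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)

    return predict


predict = gradient_boost([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])
```

Each stump on its own is a weak, high-bias learner; the accumulated sum is what approaches the targets, which matches the bias-reduction argument above.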
Further, the behavior record data of the new user in step C specifically comprise: the personal basic information of the user, geographical location information, browsing record information, and time information corresponding to the browsing record information.
Further, step D specifically comprises:
Step D1: perform association degree matching between the behavior record data of the new user and the model, and sort by degree of association; wherein different weights are assigned, in descending order, to the geographical location information, the browsing record information, the time information corresponding to the browsing record information, and the personal basic information of the user;
Step D2: obtain the commodity information to be recommended according to the matching result;
Step D3: sort the commodities to be recommended in descending order of the probability that each commodity is purchased, generating a recommended commodity set;
Step D4: send the commodity data of the recommended commodity set to the new user.
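Steps D1 to D3 can be sketched as a weighted match followed by a sort. The text does not specify how the degree of association is computed or what the weight values are, so everything concrete here is an assumption: the weights 0.4, 0.3, 0.2, and 0.1 merely respect the stated descending order, association is taken as the sum of weights of fields that match exactly, and the model is represented as a list of stored user profiles. All names and data are hypothetical.

```python
# Assumed weights, descending: location > browsing > time > personal information.
WEIGHTS = {"location": 0.4, "browsing": 0.3, "time": 0.2, "personal": 0.1}


def association(new_user, profile, weights=WEIGHTS):
    """Sum the weights of the fields on which the new user matches a profile."""
    return sum(
        w
        for field, w in weights.items()
        if new_user.get(field) is not None and new_user.get(field) == profile.get(field)
    )


def rank_profiles(new_user, profiles):
    """D1: sort stored profiles by degree of association, highest first."""
    return sorted(profiles, key=lambda p: association(new_user, p), reverse=True)


def recommend(new_user, profiles, purchase_prob, top_profiles=2):
    """D2 and D3: collect candidates from the best matches, sort by probability."""
    matched = rank_profiles(new_user, profiles)[:top_profiles]
    candidates = {c for p in matched for c in p.get("purchased", [])}
    return sorted(candidates, key=lambda c: purchase_prob.get(c, 0.0), reverse=True)


new_user = {"location": "Foshan", "browsing": "tile", "time": "evening", "personal": "30s"}
profiles = [
    {"id": 1, "location": "Foshan", "browsing": "floor", "time": "morning",
     "personal": "30s", "purchased": ["floor X"]},
    {"id": 2, "location": "Guangzhou", "browsing": "tile", "time": "evening",
     "personal": "30s", "purchased": ["tile B"]},
    {"id": 3, "location": "Foshan", "browsing": "tile", "time": "morning",
     "personal": "40s", "purchased": ["tile A", "grout"]},
]
purchase_prob = {"tile A": 0.5, "tile B": 0.3, "grout": 0.2, "floor X": 0.9}
recommended = recommend(new_user, profiles, purchase_prob)
```

Step D4 would then transmit `recommended` to the new user; the transport is outside the scope of this sketch.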
Further, the probability that a commodity is purchased in step D3 is calculated by the following formula:
where D denotes the probability that the user purchases a commodity, u denotes the new user, in(i) denotes the set of commodities that point to commodity i, out(j) denotes the set of commodities that commodity j points to, PR(j) denotes the probability that commodity j is purchased, and PR(i) denotes the probability that commodity i is purchased.
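The formula itself is not reproduced in this text. The symbols in(i), out(j), and PR(j) resemble those of the standard PageRank recurrence, so the sketch below assumes that form, PR(i) = (1 - d) / n + d * sum over j in in(i) of PR(j) / |out(j)|, with n the number of commodities. This is an assumption, not the formula of the invention: in particular, the role of the new user u is not modeled, and the parameter `d` is treated as an ordinary damping value. The function name `purchase_rank` is hypothetical.

```python
def purchase_rank(out_links, d=0.85, iters=100):
    """Iterate an assumed PageRank-style recurrence over a commodity graph.

    out_links maps each commodity j to the list of commodities it points to.
    """
    items = set(out_links) | {i for targets in out_links.values() for i in targets}
    n = len(items)
    # Build in(i): the commodities that point to commodity i.
    in_links = {i: [] for i in items}
    for j, targets in out_links.items():
        for i in targets:
            in_links[i].append(j)
    pr = {i: 1.0 / n for i in items}  # uniform starting values
    for _ in range(iters):
        pr = {
            i: (1 - d) / n + d * sum(pr[j] / len(out_links[j]) for j in in_links[i])
            for i in items
        }
    return pr


# Hypothetical graph: flooring and skirting both point to adhesive.
pr = purchase_rank({"floor": ["adhesive"], "skirting": ["adhesive"], "adhesive": ["floor"]})
ranking = sorted(pr, key=pr.get, reverse=True)
```

Commodities with no outgoing links are not given special treatment here, so on such graphs the values would no longer sum to one.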
Further, the commodity set is formed from the commodities whose support is greater than a support threshold, where the support denotes the probability that commodities are purchased together and is calculated by the following formula:
Support(A ∩ B) = Freq(A ∩ B) / N
where Freq(A ∩ B) denotes the number of times commodity A and commodity B are purchased together, N denotes the total number of sales transactions, and Support(A ∩ B) is the probability that commodity A and commodity B are purchased together, that is, the support of commodity A and commodity B.
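The support calculation can be sketched directly from these definitions: count how many transactions contain both commodities and divide by the total number of transactions. The function names, the sample transactions, and the threshold value 0.3 are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Return Support(A ∩ B) = Freq(A ∩ B) / N for every co-purchased pair."""
    n = len(transactions)
    freq = Counter()
    for basket in transactions:
        # Count each unordered pair once per transaction.
        for pair in combinations(sorted(set(basket)), 2):
            freq[pair] += 1
    return {pair: count / n for pair, count in freq.items()}


def frequent_pairs(transactions, threshold):
    """Keep only the pairs whose support is greater than the threshold."""
    return {p: s for p, s in pair_support(transactions).items() if s > threshold}


# Hypothetical sales transactions.
sales = [
    ["floor", "tile"],
    ["floor", "tile", "paint"],
    ["paint"],
    ["floor"],
]
support = pair_support(sales)
kept = frequent_pairs(sales, threshold=0.3)
```

Pairs are stored as alphabetically sorted tuples, so ("floor", "tile") and ("tile", "floor") map to the same key.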
A commodity recommendation device based on big data provided by an embodiment of the present invention comprises a computer-readable medium storing computer-readable instructions, the computer-readable instructions being executable by a processor to implement any of the methods described above.
The above are only preferred embodiments of the present invention, and the invention is not limited to the above embodiments; any implementation that achieves the technical effect of the present invention by the same means shall fall within the protection scope of the present invention.