CN103839167A - Commodity candidate set recommendation method - Google Patents

Commodity candidate set recommendation method

Info

Publication number
CN103839167A
Authority
CN
China
Prior art keywords
centerdot
commodity
user
recommendation
recommendations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210475495.4A
Other languages
Chinese (zh)
Inventor
梅昱婷
刘博�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Lingdong Technology Development Co ltd
Original Assignee
Dalian Lingdong Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Lingdong Technology Development Co ltd
Priority to CN201210475495.4A
Publication of CN103839167A
Legal status: Pending


Abstract

The invention discloses a method for recommending a candidate set of commodities, comprising the following steps. First, a frequency matrix is calculated; this involves computing an association matrix and then the frequency matrix, in which each element is the ratio of the jumps from commodity i to commodity j to all jumps from commodity i to other commodities. Second, a recommendation strategy for the commodity candidate set is chosen: after analyzing the advantages and disadvantages of a traditional threshold-based method and a top-K method, a combined recommendation strategy is proposed. This strategy uses the session sequence browsed by the user as the condition of the recommendation rule, so it takes into account both the commodities the user browsed previously and the commodity currently being browsed, and combines the advantages of the first two strategies.

Description

A Commodity Candidate Set Recommendation Method

Technical Field

The invention relates to personalized commodity recommendation technology, and in particular to a commodity candidate set recommendation method.

Background Art

The rapid spread of the Internet and the information revolution it brought have carried society into the age of information and the network economy, with a profound effect on both business development and personal life. At the same time, the commodities on e-commerce websites are growing exponentially, in both quantity and variety, to a degree that is hard to imagine, and it is difficult to find the commodities one needs accurately and quickly in this ocean. Most sites have a large and complex structure, so users often lose track of their goal while searching, or get vague results. As a result, many users spend a great deal of time and effort browsing pages unrelated to what they want to buy, lose confidence in purchasing from the site, and the site loses them as customers. E-commerce commodity recommendation systems emerged to increase sales, user satisfaction, and competitiveness, and as a subject of theoretical research. Their main research questions are how to efficiently extract useful knowledge from massive amounts of commodities and information, how to dynamically analyze customers' individual needs, how to proactively offer customers commodities that match their preferences in real time, and how to effectively improve the quality of recommendation.

Commodity recommendation involves two classes of techniques: selecting a commodity candidate set, and ranking the commodities in the candidate set and presenting them to the user. The main recommendation techniques are content-based recommendation, collaborative filtering, association-rule-based recommendation, utility-based recommendation, knowledge-based recommendation, and recommendation based on user statistics. These methods have a number of shortcomings. Content-based recommendation lacks personalization: it can find items the user is already interested in, but not new products the user will become interested in later. It can also only analyze the content described by the declared attributes, and attributes often fail to reflect implicit characteristics; it lacks user feedback as well. Recommendation based on user statistics is useful on sites whose main sales model is membership, but it does not suit ordinary e-commerce. Knowledge-based and utility-based recommendation share one trait with content-based recommendation: the features of the items, that is, of the recommended products, must be described before anything can be recommended. For utility-based recommendation, determining the user's utility function is also difficult. These two methods are therefore not very applicable either.

Recommendation based on association rules does not have the limitations above. It can rely on the records the website already keeps to provide recommendations, and these recommendations not only satisfy the user's personal preferences but also predict the user's purchasing behavior to some extent. However, association rules do not consider the order of the items in a rule, whereas users visit a website in a strict order, so recommendation based on association rules has its own deficiencies.

Summary of the Invention

To solve the above problems in the prior art, the invention provides a commodity candidate set recommendation method.

To achieve this, the technical solution of the invention is a commodity candidate set recommendation method comprising the following steps:

A. Calculation of the frequency matrix

The task of the recommendation model is to discover associations between sets of commodities. More precisely, it describes with a quantified number how strongly the appearance of a subset B of the full commodity set P influences the appearance of a subset R. Here P = {p1, p2, ..., pn}, B = {b1, b2, ..., bp}, and R = {r1, r2, ..., rq} are sets of commodities: P contains all commodities, B and R are two subsets of P, and n, p, and q are the numbers of commodities in P, B, and R, respectively. B is the input of the system and R is its output. A recommendation rule can be expressed as (formula image not reproduced), where (formula image BDA00002444674500022) and (formula image BDA00002444674500023).

In general, recommendation results are more accurate when the transaction is taken as the smallest unit of rule analysis. The reason is that a user's interest is stable within one continuous visit, so each transaction reflects what the user was interested in at that moment, and analyzing transactions amounts to analyzing interests. The same user may visit many times and so produce several different transactions, and these transactions reflect that user's different interests at different times and in different contexts. Analyzing visit transactions reveals the user's current interest, whereas analysis that takes the user as the basic unit tends to return products the user was interested in before, and so cannot give the user a better recommendation service.

Scanning all transactions in the transaction set yields the commodity-browsing "association matrix" A. Because A is generated from all transactions, it contains the browsing patterns and interest information of all users. Each entry of the association matrix represents the association between commodity i and commodity j.

\[
A = \begin{pmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,N} \\
A_{2,1} & A_{2,2} & \cdots & A_{2,N} \\
\vdots & \vdots & \ddots & \vdots \\
A_{N,1} & A_{N,2} & \cdots & A_{N,N}
\end{pmatrix}
\]

The subscripts i and j of the association matrix A are simplified representations of commodities. During preprocessing, the commodities on all pages contained in the log are identified and represented by commodity codes, which simplifies the association-rule analysis in later steps. For example, cell A(1,2) = 80 means that, across all users, browsing jumped from commodity 1 to commodity 2 a total of 80 times. The frequency matrix F between commodities is then computed from the association matrix A.
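The counting that produces A can be sketched in a few lines of Python. This is a minimal illustration, assuming each transaction has already been reduced to an ordered list of commodity codes; the function name `build_association_counts` and the sample sessions are hypothetical and are not part of the patent.

```python
from collections import defaultdict


def build_association_counts(sessions):
    """Count jumps i -> j between consecutive commodities in every session.

    `sessions` is an iterable of lists of commodity codes, one list per
    transaction (an assumed input format for this sketch).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Each pair of consecutive views is one jump from src to dst.
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    return counts


# Hypothetical transactions, each an ordered list of commodity codes.
sessions = [
    [1, 2, 3],
    [1, 2],
    [2, 3, 1],
]
A = build_association_counts(sessions)
print(A[1][2])  # 2: commodity 1 was followed by commodity 2 in two sessions
```

A nested dictionary is used here rather than a dense N×N array because most commodity pairs never occur in practice; that storage choice is an assumption of this sketch, not something the text prescribes.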

\[
F = \begin{pmatrix}
F_{1,1} & F_{1,2} & \cdots & F_{1,N} \\
F_{2,1} & F_{2,2} & \cdots & F_{2,N} \\
\vdots & \vdots & \ddots & \vdots \\
F_{N,1} & F_{N,2} & \cdots & F_{N,N}
\end{pmatrix}
\]

The element F(i,j) of the frequency matrix F at index (i,j) is computed by the following formula:

\[
F(i,j) = \frac{A(i,j)}{A_i}, \qquad 1 \le i, j \le n
\]

where

\[
A_i = \sum_{j=1}^{n} A(i,j)
\]

F(i,j) is the proportion of jumps from commodity i to commodity j among all jumps from commodity i to other commodities.
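The row normalization defined by this formula can be sketched as follows. The function name `frequency_matrix` and the sample counts are illustrative assumptions; indices are 0-based in the code, so A(1,2) = 80 in the text corresponds to `A[0][1] = 80`. How to treat a commodity with no outgoing jumps is not specified in the text, so the all-zero row below is also an assumption.

```python
def frequency_matrix(A):
    """Return F with F[i][j] = A[i][j] / A_i, where A_i is the sum of row i."""
    F = []
    for row in A:
        total = sum(row)  # A_i: all jumps that start at commodity i
        # Assumption: a commodity with no outgoing jumps gets an all-zero row.
        F.append([a / total if total else 0.0 for a in row])
    return F


# Hypothetical jump counts for three commodities.
A = [
    [0, 80, 20],
    [30, 0, 10],
    [0, 0, 0],
]
F = frequency_matrix(A)
print(F[0])  # [0.0, 0.8, 0.2]
print(F[1])  # [0.75, 0.0, 0.25]
```

Each non-empty row of F sums to 1, which is what allows F(i,j) to be read as the share of jumps from i that go to j.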

B. Recommendation strategy for the commodity candidate set

F(i,j) gives the probability that a user who has visited commodity i visits commodity j next. Based on the page the user is currently visiting, commodities can therefore be recommended to the user using the frequency matrix F. Three recommendation strategies are available:

B1. The traditional method fixes a recommendation threshold β: when the commodity the current user is visiting is captured as P1, the commodities with F(P1,j) ≥ β form the recommendation set. This method has a serious drawback, because β is hard to choose. For a single commodity, if β is too small the method returns a large number of recommended commodities and the user must still search through them, which distracts the user and leaves the burden of choice heavy; if β is too large, few commodities are recommended, the user may have nothing to choose from, and the user's needs are not met. For many commodities, a good recommendation requires a suitable β for each commodity, which becomes increasingly difficult as the number of commodities grows and requires intervention by administrators. Furthermore, the values in the frequency matrix F change continually as users browse, so setting β manually is not feasible. The threshold-based selection method is therefore not advisable.

The method used here is instead as follows. If the commodity the current user is visiting is captured as P1, select from the matrix F the k commodities with the largest values F(P1,j) and denote them R1 = {r1, r2, r3, ..., ri, ..., rk}. R1 is taken as the recommendation candidate set, which contains k recommended commodities. The advantage of this method is that the number of commodities recommended to the user is fixed in every case: once a suitable k is chosen, k commodities can be recommended to the user, so the method is stable at least in quantity.
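The contrast between the threshold rule and the top-k rule can be illustrated with a short sketch. The function names and the sample matrix are hypothetical; skipping the current commodity itself and entries with zero frequency is an assumption of this sketch that the text does not state.

```python
def recommend_by_threshold(F, p1, beta):
    """Threshold rule: every commodity j with F[p1][j] >= beta."""
    return [j for j, f in enumerate(F[p1]) if j != p1 and f >= beta]


def recommend_top_k(F, p1, k):
    """Top-k rule: the k commodities with the largest F[p1][j]."""
    # Assumption: skip the current commodity and zero-frequency entries.
    candidates = [j for j, f in enumerate(F[p1]) if j != p1 and f > 0]
    candidates.sort(key=lambda j: F[p1][j], reverse=True)
    return candidates[:k]


# Hypothetical frequency matrix for four commodities (0-based indices).
F = [
    [0.0, 0.5, 0.3, 0.2],
    [0.6, 0.0, 0.4, 0.0],
    [0.1, 0.1, 0.0, 0.8],
    [1.0, 0.0, 0.0, 0.0],
]
print(recommend_by_threshold(F, 0, 0.25))  # [1, 2]
print(recommend_by_threshold(F, 0, 0.6))   # []: beta too large, nothing left
print(recommend_top_k(F, 0, 2))            # [1, 2]
```

With the same matrix, the size of the threshold result swings with β, while the top-k result keeps its size as long as at least k commodities have a nonzero frequency.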

B2. Suppose the commodity the current user is visiting is captured as P1. Select from the frequency matrix F the x commodities with the largest values F(P1,j) and denote them R1 = {r1, r2, r3, ..., rx}. Then take R1 as input: treat each element ri of R1 (1 ≤ i ≤ x) as P2 and select the y commodity codes ranked highest by F(P2,j), denoted R2i. The union of the sets R2l (1 ≤ l ≤ x) is denoted R2. The recommendation candidate set is then the union of R1 and R2, that is, R1 ∪ R2, which contains at most x + x×y recommended commodities. The values of x and y can be set according to actual requirements to obtain different numbers of recommended commodities.
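The two-level expansion of B2 can be sketched as below. The helper `top_n`, the function `recommend_two_level`, and the sample matrix are hypothetical; dropping duplicates and the current commodity P1 from the result is an assumption of this sketch, and it is one reason the result can be smaller than the bound x + x×y.

```python
def top_n(F, i, n):
    """Indices of the n largest nonzero entries in row i, excluding i itself."""
    candidates = [j for j, f in enumerate(F[i]) if j != i and f > 0]
    candidates.sort(key=lambda j: F[i][j], reverse=True)
    return candidates[:n]


def recommend_two_level(F, p1, x, y):
    """Return R1 followed by the new members of R2, without duplicates."""
    r1 = top_n(F, p1, x)
    r2 = []
    for r in r1:  # each element of R1 plays the role of P2
        for j in top_n(F, r, y):
            # Assumption: skip P1 and anything already recommended.
            if j != p1 and j not in r1 and j not in r2:
                r2.append(j)
    return r1 + r2


# Hypothetical frequency matrix for five commodities (0-based indices).
F = [
    [0.0, 0.6, 0.3, 0.1, 0.0],
    [0.1, 0.0, 0.2, 0.0, 0.7],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]
print(recommend_two_level(F, 0, x=2, y=1))  # [1, 2, 4]
```

Here R1 = [1, 2]; expanding commodity 1 adds commodity 4, while expanding commodity 2 yields commodity 1 again, so the candidate set has three members instead of the maximum of four.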

B3. Both schemes above recommend solely on the basis of the commodity the user is currently browsing. Their weakness is that the input B contains only one commodity: although the two methods differ, the recommendation result is fixed, so all users browsing the same commodity see the same recommended commodities. The system completes the recommendation task, but it cannot recommend different commodities according to each user's browsing state. This weakness also brings an advantage: because the result for every condition is fixed, all recommendation results can be computed offline, which improves the real-time performance of commodity recommendation.

To address the weakness of the two schemes above, a third scheme is proposed. Capture the commodity number P1 that the user is currently browsing and the previous commodity number P0, forming the session sequence (P0, P1). Select from the frequency matrix the x commodities with the largest values F(P1,j), denoted R1; suppose R1 = {r1, r2, r3, ..., rx}. Take R1 as input: treating each of r1, r2, r3, ..., rx in turn as i, select the y commodities ranked highest by F(i,j), and denote them R2. In addition, select from the matrix F the z commodities with the largest values F(P0,j), denoted R3. The recommendation candidate set is the union of R1, R2, and R3, that is, R1 ∪ R2 ∪ R3, which contains at most x + x×y + z recommended commodities. R3 is included because its commodities are strongly associated with P0: a high proportion of users with similar interests who visited P0 went on to visit R3, so since the current user visited P0, it is assumed that this user may also be interested in R3. The number of recommended commodities can be tuned by adjusting x, y, and z.
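A minimal sketch of the session-based strategy follows, under the same hedges as before: `top_n`, `recommend_session`, and the sample matrix are hypothetical names and data, and removing P0 and P1 from the output (on the grounds that the user has just seen them) is an assumption the text does not make explicit.

```python
def top_n(F, i, n):
    """Indices of the n largest nonzero entries in row i, excluding i itself."""
    candidates = [j for j, f in enumerate(F[i]) if j != i and f > 0]
    candidates.sort(key=lambda j: F[i][j], reverse=True)
    return candidates[:n]


def recommend_session(F, p0, p1, x, y, z):
    """Candidate set R1 | R2 | R3 for the session sequence (p0, p1)."""
    r1 = top_n(F, p1, x)                          # from the current commodity
    r2 = [j for r in r1 for j in top_n(F, r, y)]  # one further hop from R1
    r3 = top_n(F, p0, z)                          # from the previous commodity
    result = []
    for j in r1 + r2 + r3:
        # Assumption: drop commodities the user has just seen, and duplicates.
        if j not in (p0, p1) and j not in result:
            result.append(j)
    return result


# Hypothetical frequency matrix for six commodities (0-based indices).
F = [
    [0.0, 0.6, 0.3, 0.1, 0.0, 0.0],
    [0.1, 0.0, 0.2, 0.0, 0.7, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.4, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
]
print(recommend_session(F, p0=3, p1=0, x=2, y=1, z=1))  # [1, 2, 4, 5]
print(recommend_session(F, p0=4, p1=0, x=2, y=1, z=1))  # [1, 2, 3]
```

Both calls share the current commodity 0, yet the candidate sets differ because the previous commodity differs, which is the personalization effect this strategy is after.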

This recommendation strategy uses the session sequence browsed by the user as the condition of the recommendation rule, taking into account both the commodities the user browsed previously and the commodity currently being browsed. Different browsing sequences produce different recommendations, which largely meets the requirement of personalized recommendation. The strategy uses only commodity browsing sequences of length two. Longer sequences are not used for two reasons. P0 represents the user's browsing history and is used to infer the commodity set R3; if more historical commodities were included, the result would be dominated by the history and would be insensitive to shifts in the user's interest. P1 is the commodity the user is currently browsing and is used to infer the commodity sets R1 and R2; these sets are already inferred results, and using them in turn to recommend further commodities would produce a result set with a larger error, reducing the accuracy of recommendation and indirectly harming the user experience.

Compared with the prior art, the invention has the following beneficial effects:

1. The personalized commodity recommendation method based on the frequency matrix and text similarity used in the invention achieves personalized recommendation, and avoids the shortcomings of content-based recommendation algorithms, namely their lack of personalization and their ability to find only items the user is already interested in.

2. The personalized commodity recommendation method based on the frequency matrix and text similarity used in the invention avoids the shortcomings of recommendation techniques based on user statistics. Such techniques need to collect a large amount of user information, which is a weakness in practical applications. The method based on the frequency matrix and text similarity achieves the same goal by using association rules instead.

Description of Drawings

The invention has one drawing:

Fig. 1 is a flowchart of commodity candidate set processing according to the invention.

Detailed Description

The two schemes proposed above recommend solely on the basis of the commodity the user is currently browsing. Their weakness is that the input B contains only one commodity: although the two methods differ, the recommendation result is fixed, so all users browsing the same commodity see the same recommended commodities. The system completes the recommendation task, but it cannot recommend different commodities according to each user's browsing state. This weakness also brings an advantage: because the result for every condition is fixed, all recommendation results can be computed offline, which improves the real-time performance of commodity recommendation.

To address the weakness of the two schemes, a third scheme is proposed. Capture the commodity number P1 that the user is currently browsing and the previous commodity number P0, forming the session sequence (P0, P1). Select from the frequency matrix the x commodities with the largest values F(P1,j), denoted R1; suppose R1 = {r1, r2, r3, ..., rx}. Take R1 as input: treating each of r1, r2, r3, ..., rx in turn as i, select the y commodities ranked highest by F(i,j), and denote them R2. In addition, select from the matrix F the z commodities with the largest values F(P0,j), denoted R3. The recommendation candidate set is the union of R1, R2, and R3, that is, R1 ∪ R2 ∪ R3, which contains at most x + x×y + z recommended commodities. R3 is included because its commodities are strongly associated with P0: a high proportion of users with similar interests who visited P0 went on to visit R3, so since the current user visited P0, it is assumed that this user may also be interested in R3. The number of recommended commodities can be tuned by adjusting x, y, and z.

This recommendation strategy uses the session sequence browsed by the user as the condition of the recommendation rule, taking into account both the commodities the user browsed previously and the commodity currently being browsed. Different browsing sequences produce different recommendations, which largely meets the requirement of personalized recommendation. The strategy uses only commodity browsing sequences of length two. Longer sequences are not used for two reasons. P0 represents the user's browsing history and is used to infer the commodity set R3; if more historical commodities were included, the result would be dominated by the history and would be insensitive to shifts in the user's interest. P1 is the commodity the user is currently browsing and is used to infer the commodity sets R1 and R2; these sets are already inferred results, and using them in turn to recommend further commodities would produce a result set with a larger error, reducing the accuracy of recommendation and indirectly harming the user experience.

Claims (1)

1. a commodity Candidate Set recommend method, is characterized in that comprising the following steps:
The calculating of A, frequency matrix
The task that recommended models will complete is exactly to find the association between commodity collection in commodity; More precisely, the appearance of describing all commodity collection P subset B by the numeral quantizing is exactly on the great impact of having of subset R; Wherein P={p1, p2... .pn, B={b1, b2... .bn, R={r1, r2... .rnthe set of commodity, and wherein P comprises all commodity, and B and R are two subsets of P, and n, p, q are respectively the quantity of commodity in P, B, tri-set of R; B is the input data of system, and P is the output data of system; A recommendation rules can be expressed as
Figure FDA00002444674400011
here
Figure FDA00002444674400012
and
Generally speaking,, if the least unit using affairs as rule analysis, the recommendation results obtaining so should be just more accurate; Reason is: in the access process of one-time continuous, user's interest is all stablized constant, and each affairs has embodied user interest place at that time, is equivalent to analyze for interest for affairs analysis; Same user may have repeatedly the experience of access, can have multiple different affairs, but these multiple affairs have reflected that this user is not in the same time just, different interest and hobby under varying environment; Analyze the interest place that just can find that user is current for accessing work, and the result obtaining take user as base unit analysis is all often interested product before this user, also just cannot provide better recommendation service for user;
Whole affairs of concentrating by scanning affairs can be constructed " incidence matrix " A of goods browse, because matrix A generates based on whole affairs, so its browse mode that has comprised all users and interest information; Each in incidence matrix all represents the relevance between commodity i and commodity j;
A1,1A1,2&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;A1,NA2,1A2,2&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;A2,N&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;AN,1AN,2&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;AN,N
Subscript i in incidence matrix A, j is the reduced representation of commodity, identifies the commodity in all pages that log packet contains, and adopt commercial product code reduced representation in pre-service link, facilitates the Association Rule Analysis of subsequent step; Cell A (1,2)=80 represents that the number of times that all users jump to commodity 2 from commodity 1 in navigation process is 80; Calculate the frequency matrix F between each commodity according to incidence matrix A;
F1,1F1,2&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;F1,NF2,1F2,2&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;F2,N&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;FN,1FN,2&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;&CenterDot;FN,N
The element F (i, j) that is designated as (i, j) under in frequency matrix F is calculated by following formula:
F(i,j)=A(i,j)Ai,1&le;i,j&le;n
WhereinAi=&Sigma;j=1nA(i,j)
F (i, j) represents that jumping to commodity j by commodity i accounts for all ratios that jump to other commodity from commodity i;
The recommendation strategy of B, commercial product recommending Candidate Set
Can obtain user according to F (i, j) and access after commodity i the then probability of access products j, thus just can be according to the page of the user's current accessed capturing, take frequency matrix F as according to carrying out commercial product recommending to user; Can use three kinds to recommend strategy:
B1, traditional method are that one of regulation is recommended threshold value beta, and the goods number that grabs active user's access is P1, F (P1, the commodity of j)>=β are as recommending collection; But this method exists very large drawback, β is difficult to determine; For same commodity, β determines too smallly will obtain a large amount of Recommendations, and user also will concentrate and search the commodity of oneself liking from a large amount of Recommendations, the notice of not only having disperseed user, and also user's selection burden is still very heavy; β determines excessive, and the Recommendations quantity calculating is few, likely there will be user's situation about selecting of having no way of, thereby cannot meet user's demand; For multiple commodity, in order to reach good recommendation effect, will determine suitable β for each commodity, but along with the quantity of commodity increases, the difficulty of this scheme, also in continuous increasing, also to there is managerial personnel's intervention; Further, the numerical value in frequency matrix F is constantly to change along with browsing of user, if be also impossible by the words of manually carrying out to determine β, this recommendations for selection method based on threshold value is worthless thus; Method used herein is that the goods number that hypothesis grabs active user access is P1, selects so F (P from matrix F1, j) front k goods number of numerical value maximum, note is R1, R1={r1, r2, r3... ri... rk, using R1 as recommended candidate collection,
Figure FDA00002444674400024
comprise k Recommendations; The advantage of this method is: under any circumstance, all fix to user's Recommendations quantity, as long as determine that a suitable k value can, easily for user recommends k commodity, be just at least quantitatively stable;
B2, the commodity of supposing to grab active user's access are P1, select so F (P from frequency matrix F1, j) front x commodity of numerical value maximum, note is R1, supposes R1={r1, r2, r3..., rx, then using R1 as one input data, the arbitrary element r in R1i(0<i<x) regard P2 as, select respectively F (P2, j) y forward commercial product code R2 of rank, asks R2l(0<l<x) union note is R2, and recommended candidate collection is exactly the union of R1 and R2 like this:
Figure FDA00002444674400031
comprise at most x+x × y Recommendations; Just can determine according to actual requirement the value of x and y, thereby obtain the Recommendations of varying number;
B3. The two schemes presented above both recommend using only the commodity the user is currently browsing as the condition. Their weakness is that the input data B consists of a single commodity, so although the recommendation methods differ, the recommendation results are fixed: all users browsing the same commodity see the same recommended commodities. The system completes the recommendation task, but it cannot recommend different commodities according to the user's browsing state. This shortcoming also brings an advantage: because the recommendation result for every condition is fixed, all recommendation results can be computed offline, which improves the real-time performance of commodity recommendation. To address the deficiencies of the two schemes above, a third scheme is proposed. Capture the number P1 of the commodity the user is currently browsing and the number P0 of the previously browsed commodity, forming the session sequence (P0, P1). From the frequency matrix F, select the x commodities with the largest values of F(P1, j), denoted R1 = {r1, r2, r3, ..., rx}. Using R1 as input data, treat each of r1, r2, r3, ..., rx as i and select the y top-ranked commodities by F(i, j); the result is denoted R2. In addition, select from F the z commodities with the largest values of F(P0, j), denoted R3. The recommendation candidate set is then the union of R1, R2 and R3:
R = R1 ∪ R2 ∪ R3,
which contains at most x + x × y + z recommended commodities. R3 is included because the commodities it contains are strongly associated with P0: a high proportion of users with similar interests went on to access R3 after accessing P0, so since the current user has accessed P0, it is assumed that he or she may also be interested in R3. Adjusting the values of x, y and z adjusts the number of recommended commodities;
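The session-based scheme B3 can be sketched in the same way. As before, this is a minimal illustration rather than the applicant's implementation: the nested-dictionary layout of F, the function names top_k and session_candidates, and the sample data are assumptions made for the example.

```python
from typing import Dict, List, Set

# Assumed representation: F[i][j] is the share of jumps from commodity i
# that went to commodity j.
FrequencyMatrix = Dict[str, Dict[str, float]]


def top_k(F: FrequencyMatrix, item: str, k: int) -> List[str]:
    """Return up to k commodities j with the largest F[item][j]."""
    row = F.get(item, {})
    ranked = sorted(row.items(), key=lambda kv: (-kv[1], kv[0]))
    return [j for j, _ in ranked[:k]]


def session_candidates(F: FrequencyMatrix, p0: str, p1: str,
                       x: int, y: int, z: int) -> Set[str]:
    """Scheme B3: R = R1 ∪ R2 ∪ R3, with at most x + x*y + z commodities."""
    r1 = top_k(F, p1, x)            # R1: top x successors of the current item P1
    r2: Set[str] = set()
    for i in r1:                    # R2: top y successors of each element of R1
        r2.update(top_k(F, i, y))
    r3 = set(top_k(F, p0, z))       # R3: top z successors of the previous item P0
    return set(r1) | r2 | r3


# Hypothetical sample data: G and K are two different previous commodities.
F = {
    "A": {"B": 0.6, "C": 0.3, "D": 0.1},
    "B": {"C": 0.7, "E": 0.3},
    "C": {"F": 0.6, "A": 0.4},
    "G": {"H": 0.8, "A": 0.2},
    "K": {"L": 0.9, "A": 0.1},
}
via_g = session_candidates(F, p0="G", p1="A", x=2, y=1, z=1)
via_k = session_candidates(F, p0="K", p1="A", x=2, y=1, z=1)
print(sorted(via_g))
print(sorted(via_k))
```

Both sessions end on commodity A, but the candidate sets differ in the element contributed by R3, which is the behaviour that distinguishes this scheme from the two fixed-result schemes.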
This recommendation strategy uses the session sequence browsed by the user as the condition of the recommendation rule, taking into account both the commodity in the user's browsing history and the commodity currently being browsed. Different browsing sequences yield different recommended commodities, which basically meets the requirement of personalized recommendation. The strategy uses only browsing sequences of length two; longer sequences are not used for the following reasons. P0 represents the historical commodity browsed by the user, from which the commodity set R3 is inferred; if the number of historical commodities were increased, the recommendation result would be influenced more strongly by historical commodities and would be insensitive to shifts in the user's interest. P1 is the commodity the user is currently browsing, from which the commodity sets R1 and R2 are inferred; these sets are already the result of inference, and if they were used in turn to recommend further commodities, the error of the resulting set would grow, reducing the accuracy of the recommendation and indirectly degrading the user's experience.
CN201210475495.4A | 2012-11-21 | 2012-11-21 | Commodity candidate set recommendation method | Pending | CN103839167A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201210475495.4A (CN103839167A (en)) | 2012-11-21 | 2012-11-21 | Commodity candidate set recommendation method

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201210475495.4A (CN103839167A (en)) | 2012-11-21 | 2012-11-21 | Commodity candidate set recommendation method

Publications (1)

Publication Number | Publication Date
CN103839167A (en) | 2014-06-04

Family

ID=50802641

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201210475495.4A (Pending; CN103839167A (en)) | Commodity candidate set recommendation method | 2012-11-21 | 2012-11-21

Country Status (1)

Country | Link
CN (1) | CN103839167A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN102650991A (en) * | 2011-02-25 | 2012-08-29 | 苏州工业园区辰烁软件科技有限公司 | Commodity recommending method and system both based on user preference
CN102411754A (en) * | 2011-11-29 | 2012-04-11 | 南京大学 | Personalized recommendation method based on commodity property entropy

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
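Xu Qing (徐清): "Research on commodity recommendation models in B2C e-commerce" (B2C电子商务中商品推荐模型研究), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库(信息科技辑)》) *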

Cited By (14)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN104317945A (en)*2014-10-312015-01-28亚信科技(南京)有限公司E-commerce website commodity recommending method on basis of search behaviors
WO2017107802A1 (en)*2015-12-242017-06-29阿里巴巴集团控股有限公司Method and device for associating network item and calculating association information
CN107730336A (en)*2016-08-122018-02-23苏宁云商集团股份有限公司Commodity method for pushing and device in a kind of online transaction
CN107609037A (en)*2017-08-112018-01-19中电科新型智慧城市研究院有限公司A kind of intelligent sharing method and system based on block number evidence
CN107609037B (en)*2017-08-112020-12-29中电科新型智慧城市研究院有限公司 A block data-based intelligent sharing method and system
CN109727047A (en)*2017-10-302019-05-07北京京东尚科信息技术有限公司A kind of method and apparatus, data recommendation method and the device of determining data correlation degree
CN110378714A (en)*2018-04-132019-10-25北京京东尚科信息技术有限公司A kind of method and apparatus of processing access data
CN110378714B (en)*2018-04-132024-02-13北京京东尚科信息技术有限公司Method and device for processing access data
CN109271590A (en)*2018-09-292019-01-25四川灵灵器机器人有限责任公司A kind of recommended method based on timing decision model
CN109271590B (en)*2018-09-292021-08-31四川灵灵器机器人有限责任公司Recommendation method based on time sequence decision model
CN110413870A (en)*2018-12-182019-11-05北京沃东天骏信息技术有限公司Method of Commodity Recommendation, device and server
CN110413870B (en)*2018-12-182021-12-31北京沃东天骏信息技术有限公司Commodity recommendation method and device and server
CN117611245A (en)*2023-12-142024-02-27浙江博观瑞思科技有限公司Data analysis management system and method for planning E-business operation activities
CN117611245B (en)*2023-12-142024-05-31浙江博观瑞思科技有限公司Data analysis management system and method for planning E-business operation activities

Legal Events

Date | Code | Title | Description
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication
RJ01 | Rejection of invention patent application after publication

Application publication date: 20140604

