CN103793504B - A clustering initial point selection method based on user preference and item attributes - Google Patents

A clustering initial point selection method based on user preference and item attributes

Info

Publication number
CN103793504B
Authority
CN
China
Prior art keywords
item
point
cluster
similarity
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410035844.XA
Other languages
Chinese (zh)
Other versions
CN103793504A (en)
Inventor
宿红毅
王彩群
闫波
郑宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology (BIT)
Priority to CN201410035844.XA
Publication of CN103793504A
Application granted
Publication of CN103793504B
Status: Expired - Fee Related
Anticipated expiration


Abstract

The present invention relates to a clustering initial point selection method based on user preference and item attributes, and belongs to the field of machine learning. An item-based similarity matrix and a co-occurrence matrix based on user preferences are determined first, and the final similarity matrix is obtained from these two matrices; edge points are then removed and the initial cluster center points are selected, completing the selection of the initial center points. The present invention can effectively improve the clustering effect.

Description

Translated from Chinese
A clustering initial point selection method based on user preference and item attributes

Technical Field

The present invention relates to a clustering initial point selection method based on user preference and item attributes, and belongs to the field of machine learning.

Background Art

Clustering is an unsupervised learning method that partitions data objects into several classes or clusters according to a defined similarity, so that objects within the same cluster are highly similar to one another while objects in different clusters differ considerably. Cluster analysis is now applied very widely, including in statistics, machine learning, image segmentation, and data mining. The main clustering algorithms are divided into partitioning methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods, and partitioning algorithms are the workhorse of cluster analysis in practice. A partitioning algorithm requires the number of clusters or the cluster centers to be specified in advance; through repeated iterations it gradually reduces the error of the objective function, and the final clustering result is obtained when the objective function value converges. Partitioning algorithms are simple, fast, and able to handle large data sets effectively, but they are computationally demanding, sensitive to the input order of the data, and require the number of clusters or the cluster centers to be specified beforehand. The initial cluster centers strongly affect the clustering result: if they are chosen poorly, the result may get stuck in a local optimum and a good clustering cannot be obtained. There are many methods for selecting the initial cluster centers of a partitioning algorithm, the main ones being the following:

Random selection: randomly choose k data points as the initial cluster centers;

Empirical selection: based on experience and the properties of the individual points, choose k representative points as the initial cluster centers;

Recursive selection: first compute the mean of all data samples and take this point as the first cluster center; then take the point farthest from the first center as the second cluster center, and so on, until the data sample farthest from the (k-1)-th cluster center becomes the last cluster center (a sketch of this family of methods follows the list);

Density-estimation selection: compute the density of every data sample within a given radius; the point with the highest density becomes the first cluster center, and the remaining initial centers are chosen in turn: the point with the next-highest density becomes the second initial cluster center if its distance from the first center exceeds a given value, and k centers are selected in this way;

Distance-optimized selection: centers are chosen according to the maximum-minimum distance criterion;

Genetic-algorithm selection: a genetic algorithm is used to compute the initial cluster centers; and so on.
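As a minimal sketch of the distance-based initialization described above (an illustration, not the patent's method), the variant below measures each candidate's distance to the nearest already-chosen center, which is the common maximin formulation:

```python
import numpy as np

def maximin_init(X: np.ndarray, k: int) -> np.ndarray:
    """Maximin-distance initialization: the first center is the mean of all
    samples; each further center is the point farthest from the centers
    chosen so far (distance measured to the nearest existing center)."""
    centers = [X.mean(axis=0)]
    for _ in range(k - 1):
        # distance of every point to its nearest already-chosen center
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    return np.vstack(centers)

if __name__ == "__main__":
    X = np.random.rand(200, 3)
    print(maximin_init(X, k=5).shape)  # -> (5, 3)
```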

Because the initial cluster centers strongly affect the clustering result, a poor choice may cause the result to get stuck in a local optimum, and a good clustering cannot be obtained. In order to obtain appropriate initial cluster centers and keep the clustering result from falling into a local optimum, this patent proposes a new method for selecting the initial cluster centers.

Summary of the Invention

The purpose of the present invention is to solve the problem of selecting the initial center points of partition-based algorithms by using the users' preference information and the item attributes to construct a similarity matrix, from which the initial center points are obtained.

The technical solution of the present invention is realized as follows:

Step 1. Determine the item-based similarity matrix.

Define the feature vector of an item: itemi = (p1, p2, ..., pm), where m is the number of attributes of the item and pi (1 ≤ i ≤ m) is the value of the item's i-th feature. Each item can then be represented by a vector itemi = (w1, w2, ..., wm), whose dimension m equals the number of attribute features of the item. The similarity between itemi and itemj is then expressed by computing the distance Aij between the vectors representing the items, which yields the similarity matrix A = (Aij) of size n×n, where n is the number of items.

The similarity between item u and item v can be obtained from a distance computed with any of the following: Pearson-correlation distance, Euclidean distance, cosine distance, Spearman distance, or Tanimoto-correlation distance.
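A minimal sketch of Step 1, assuming Euclidean distance and the common conversion similarity = 1/(1 + distance) so that larger values mean more similar (the conversion is an assumption, not stated in the patent):

```python
import numpy as np

def item_similarity_matrix(F: np.ndarray) -> np.ndarray:
    """F is an n x m matrix whose i-th row is item_i = (w1, ..., wm).
    Returns the n x n matrix A with A[i, j] = similarity of item_i and item_j,
    here derived from the Euclidean distance as 1 / (1 + distance)."""
    sq = np.sum(F * F, axis=1)
    # squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x.y
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * F @ F.T, 0.0)
    return 1.0 / (1.0 + np.sqrt(d2))

if __name__ == "__main__":
    F = np.random.rand(100, 3)      # 100 items, 3 attribute features each
    A = item_similarity_matrix(F)
    print(A.shape, A[0, 0])         # (100, 100), 1.0 on the diagonal
```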

Step 2. Determine the co-occurrence matrix based on user preferences.

Define a user's preference record for an item: prefs = (user_id, item_id, pref), where pref is the user's rating of the item; the ratings of all users form the rating list prefs. The co-occurrence matrix B = (Bij) of size n×n is formed by counting the number of times Bij that itemi and itemj appear together in the same user's preference list.
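A sketch of Step 2, assuming item ids are 0-based integer indices (a simplification; real ids would first be mapped to indices):

```python
from collections import defaultdict
import numpy as np

def cooccurrence_matrix(prefs, n_items: int) -> np.ndarray:
    """prefs is an iterable of (user_id, item_id, pref) triples.
    Returns the n x n matrix B where B[i, j] counts how many users have both
    item i and item j in their preference list."""
    items_of_user = defaultdict(set)
    for user_id, item_id, _pref in prefs:
        items_of_user[user_id].add(item_id)
    B = np.zeros((n_items, n_items))
    for items in items_of_user.values():
        idx = list(items)
        for a in idx:
            for b in idx:
                B[a, b] += 1.0
    return B

if __name__ == "__main__":
    prefs = [(1, 0, 5.0), (1, 2, 3.0), (2, 0, 4.0), (2, 2, 2.0), (2, 3, 1.0)]
    print(cooccurrence_matrix(prefs, n_items=4)[0, 2])  # items 0 and 2 co-occur for two users -> 2.0
```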

Step 3. Determine the final similarity matrix.

The final similarity matrix is defined as TS = αA + βB, where α and β are user-defined weights.
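Step 3 is then a weighted sum of the two matrices. The sketch below adds an optional rescaling of B, which the patent does not mention but which keeps the two terms on comparable scales:

```python
import numpy as np

def final_similarity_matrix(A: np.ndarray, B: np.ndarray,
                            alpha: float = 0.5, beta: float = 0.5,
                            rescale_b: bool = False) -> np.ndarray:
    """TS = alpha * A + beta * B, with alpha and beta the user-defined weights.
    rescale_b is a hypothetical extra option (not in the patent) that maps the
    raw co-occurrence counts into [0, 1] before combining."""
    if rescale_b and B.max() > 0:
        B = B / B.max()
    return alpha * A + beta * B
```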

Step 4. Remove edge points.

In each row of TS, count the number of items whose similarity is greater than a given threshold θ and denote it αi. If αi is smaller than a given threshold μ, the point is an edge point, and the row and column representing this item are deleted from the similarity matrix, thereby removing the edge point. After all rows have been traversed and all edge points removed, the similarity matrix is obtained again.
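A sketch of the edge-point removal of Step 4; theta may be a scalar or a per-row array of thresholds (the embodiment below uses 0.2 times each row's maximum), and mu is the minimum count a row must reach to survive:

```python
import numpy as np

def remove_edge_points(TS: np.ndarray, theta, mu: float):
    """Count, per row of TS, the similarities above theta; rows (and matching
    columns) whose count alpha_i falls below mu are edge points and are
    dropped. Returns the reduced matrix and the indices of the kept items."""
    theta = np.asarray(theta, dtype=float)
    if theta.ndim == 1:
        theta = theta[:, None]          # per-row thresholds broadcast over columns
    counts = np.sum(TS > theta, axis=1)
    keep = np.where(counts >= mu)[0]
    return TS[np.ix_(keep, keep)], keep
```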

Step 5. Select the initial cluster center points:

(1) In the similarity matrix obtained in Step 4, find the maximum similarity, take the center point of the two points having this maximum similarity as a cluster center, and record it in Cluster[]; compute the distance from each of the two points to their center point, find the point with the larger distance, and delete the row and column of that point from the similarity matrix to obtain a new similarity matrix;

(2) Find the maximum similarity in the resulting similarity matrix and compute, in turn, the distances from the two points having this maximum similarity to all initial cluster centers in Cluster[]. If some distance is smaller than a given threshold ω, merge the point into the cluster with the smallest distance and recompute that cluster's center; otherwise, if no distance is smaller than ω, the point becomes a new cluster center and is added to Cluster[] as another initial center. Then delete the rows and columns represented by the two points of maximum similarity to obtain a new similarity matrix. Iterate until the number of cluster centers equals k.

The distance from an item to a cluster center can be computed with any of the following: Pearson-correlation distance, Euclidean distance, cosine distance, Spearman distance, or Tanimoto-correlation distance.

The above operations complete the selection of the initial center points.
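The center-selection loop of Step 5 can be sketched as below. The patent leaves some details open, so this is an assumption-laden sketch: the "center point of two points" is taken as the midpoint of their feature vectors, merged centers are updated by averaging with the new member rather than recomputing over all members, and Euclidean distance is used throughout.

```python
import numpy as np

def select_initial_centers(TS: np.ndarray, F: np.ndarray, k: int, omega: float) -> np.ndarray:
    """TS: similarity matrix after edge-point removal; F: matching item-feature
    matrix; k: number of centers wanted; omega: merge threshold."""
    TS, F = TS.copy().astype(float), F.copy()
    np.fill_diagonal(TS, -np.inf)                 # ignore self-similarity
    centers = []

    def most_similar_pair():
        return np.unravel_index(int(np.argmax(TS)), TS.shape)

    def drop(rows):
        nonlocal TS, F
        keep = np.setdiff1d(np.arange(TS.shape[0]), rows)
        TS, F = TS[np.ix_(keep, keep)], F[keep]

    # (1) first center: midpoint of the most similar pair; the point lying
    #     farther from that midpoint is removed from the matrix.
    i, j = most_similar_pair()
    center = (F[i] + F[j]) / 2.0
    centers.append(center)
    farther = i if np.linalg.norm(F[i] - center) >= np.linalg.norm(F[j] - center) else j
    drop([farther])

    # (2) repeatedly take the most similar remaining pair: each of its points
    #     is merged into the nearest existing center if closer than omega,
    #     otherwise it becomes a new center; then both points are dropped.
    while len(centers) < k and TS.shape[0] >= 2:
        i, j = most_similar_pair()
        for p in (i, j):
            d = np.array([np.linalg.norm(F[p] - c) for c in centers])
            if d.min() < omega:
                nearest = int(d.argmin())
                centers[nearest] = (centers[nearest] + F[p]) / 2.0  # simplified re-centering
            else:
                centers.append(F[p].copy())
        drop([i, j])
    return np.vstack(centers[:k])
```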

Beneficial Effects

The present invention improves the clustering effect by proposing an initial point selection method based on user preference information and item attributes.

Brief Description of the Drawings

Figure 1 is a schematic flowchart of an implementation of the present invention.

Detailed Description

The specific embodiments of the present invention are described in further detail below through an example.

A certain site has 1000 users and 5000 movies, and each movie has three attributes: title, release year, and category. The clustering algorithm based on the improved similarity matrix is used to group the items of this site into 20 clusters. The specific flow of the clustering initial point selection method based on user preferences and item attributes is shown in Figure 1:

According to Step 1: determine the item-based similarity matrix.

Define the feature vector of a movie: itemi = (p1, p2, p3), where pi (1 ≤ i ≤ 3) is the value of the item's i-th feature. Each movie is first represented by a 3-dimensional vector itemi = (w1, w2, w3), where wi (1 ≤ i ≤ 3) is the value of the item's i-th feature. The similarity between itemi and itemj is then expressed by computing the distance Aij between the vectors representing the items, which yields the similarity matrix A.

The similarity between item u and item v is obtained by computing the Euclidean distance.

According to Step 2: determine the co-occurrence matrix based on user preferences.

Define a user's preference record for an item: prefs = (user_id, item_id, pref), where pref is the user's rating of the item; the ratings of all users form the rating list prefs. The co-occurrence matrix B is formed by counting, for every pair of items, the number of times Bij that itemi and itemj appear together in the same user's preference list.

According to Step 3: determine the final similarity matrix.

The final similarity matrix is defined as TS = αA + βB, where α and β are both 0.5.

According to Step 4: remove edge points.

In each row of TS, count the number of items whose similarity is greater than a given threshold θ (θ is defined as 0.2 times the maximum similarity in that row) and denote it αi. If αi is smaller than a given threshold μ (μ is defined as 0.001·N, where N is the total number of points to be clustered, i.e. 5000), the point is an edge point, and the row and column representing this item are deleted from the similarity matrix, thereby removing the edge point. After all rows have been traversed and all edge points removed, the similarity matrix is obtained again.

According to Step 5: select the initial center points.

(1): In the similarity matrix obtained in Step 4, find the maximum similarity, i.e. the largest value in the whole matrix, take the center point of the two points having this maximum similarity as a cluster center, and record it in Cluster[]. Compute the distance from each of the two points to their center point and find the point with the larger distance. Then find the minimum similarity, i.e. the smallest value in the whole matrix, and compute the distance between the two points having this minimum similarity; this value is denoted distance. Delete the row and column of the point with the larger distance from the similarity matrix to obtain a new similarity matrix;

(2): Find the maximum similarity in the resulting similarity matrix and compute, in turn, the distances from the two points having this maximum similarity to all initial cluster centers in Cluster[]. If some distance is smaller than the given threshold ω (ω is distance/20*2, where distance is the value obtained in step (1)), merge the point into the cluster with the smallest distance and recompute that cluster's center; otherwise, if no distance is smaller than ω, the point becomes a new cluster center and is added to Cluster[] as another initial center. Then delete the rows and columns represented by the two points of maximum similarity to obtain a new similarity matrix. Iterate these steps until the number of cluster centers equals 20.

The distance from an item to a cluster center is computed as the Euclidean distance.
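Putting the embodiment together, a hypothetical end-to-end run might look like the following. It reuses the helper functions sketched under Steps 1-5 above (item_similarity_matrix, cooccurrence_matrix, final_similarity_matrix, remove_edge_points, select_initial_centers — all assumed names defined in the same module), scales the data down from the 5000 movies of the example to keep the demo light, and uses a fixed omega instead of the distance-derived value of step (1):

```python
import numpy as np

n_items, n_users, k = 500, 1000, 20                  # the embodiment uses 5000 movies; 500 keeps this demo small
rng = np.random.default_rng(0)
F = rng.random((n_items, 3))                         # stand-in for (title, year, category) features
prefs = [(u, int(rng.integers(n_items)), 5.0)        # stand-in ratings: 30 per user
         for u in range(n_users) for _ in range(30)]

A = item_similarity_matrix(F)                                        # Step 1 (Euclidean-distance based)
B = cooccurrence_matrix(prefs, n_items)                              # Step 2
TS = final_similarity_matrix(A, B, alpha=0.5, beta=0.5,
                             rescale_b=True)                         # Step 3
theta = 0.2 * TS.max(axis=1, keepdims=True)                          # Step 4: 0.2 x each row's maximum
mu = 0.001 * n_items                                                 # mu = 0.001 * N
TS_reduced, kept = remove_edge_points(TS, theta, mu)
centers = select_initial_centers(TS_reduced, F[kept], k, omega=0.1)  # Step 5
print(centers.shape)                                                 # at most (k, 3), i.e. up to (20, 3)
```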

Claims (1)

Translated from Chinese
1. A clustering initial point selection method based on user preference and item attributes, characterized in that:
Step 1. Determine the item-based similarity matrix. Define the feature vector of an item: itemi = (p1, p2, ..., pm), where m is the number of attributes of the item and pr (1 ≤ r ≤ m) is the value of the item's r-th feature; each item can then be represented by a vector itemi = (w1, w2, ..., wm), whose dimension m equals the number of attribute features of the item, and wm is the value of the m-th attribute feature; the similarity between itemi and itemj is then expressed by computing the distance Aij between the vectors representing the items, which yields the similarity matrix A = (Aij) of size n×n, where itemj is the j-th item and n is the number of items;
Step 2. Determine the co-occurrence matrix based on user preferences. Define a user's preference record for an item: prefs = (user_id, item_id, pref), where pref is the user's rating of the item, and the ratings of all users form the rating list prefs; the co-occurrence matrix B = (Bij) of size n×n is formed by counting the number of times Bij that itemi and itemj appear together in the same user's preference list;
Step 3. Determine the final similarity matrix: TS = αA + βB, where α and β are user-defined weights;
Step 4. Remove edge points. In each row of TS, count the number of items whose similarity is greater than a given threshold θ and denote it αq; if αq is smaller than a given threshold μ, the point is an edge point, and the row and column representing this item are deleted from the similarity matrix, thereby removing the edge point; after all rows have been traversed and all edge points removed, the similarity matrix is obtained again;
Step 5. Select the initial cluster center points, which specifically includes:
(1) In the obtained similarity matrix, find the maximum similarity, take the center point of the two points having this maximum similarity as a cluster center, and record it in Cluster[]; compute the distance from each of the two points to their center point, find the point with the larger distance, and delete the row and column of that point from the similarity matrix to obtain a new similarity matrix;
(2) Find the maximum similarity in the resulting similarity matrix and compute, in turn, the distances from the two points having this maximum similarity to all initial cluster centers in Cluster[]; if some distance is smaller than a given threshold ω, merge the point into the cluster with the smallest distance and recompute that cluster's center; otherwise, if no distance is smaller than ω, the point becomes a new cluster center and is added to Cluster[] as another initial center; then delete the rows and columns represented by the two points of maximum similarity to obtain a new similarity matrix; iterate until the number of cluster centers equals k.
CN201410035844.XA | Priority 2014-01-24 | Filed 2014-01-24 | A clustering initial point selection method based on user preference and item attributes | Expired - Fee Related | CN103793504B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201410035844.XA (CN103793504B, en) | 2014-01-24 | 2014-01-24 | A clustering initial point selection method based on user preference and item attributes

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201410035844.XA (CN103793504B, en) | 2014-01-24 | 2014-01-24 | A clustering initial point selection method based on user preference and item attributes

Publications (2)

Publication Number | Publication Date
CN103793504A (en) | 2014-05-14
CN103793504B (en) | 2018-02-27

Family

Family ID: 50669170

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201410035844.XA (CN103793504B, Expired - Fee Related) | A clustering initial point selection method based on user preference and item attributes | 2014-01-24 | 2014-01-24

Country Status (1)

Country | Link
CN (1) | CN103793504B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN108268876A (en)* | 2016-12-30 | 2018-07-10 | 广东精点数据科技股份有限公司 | A detection method and device for approximately duplicate records based on clustering
CN110413854A (en)* | 2019-06-14 | 2019-11-05 | 平安科技(深圳)有限公司 | Method for selecting clustering initial points based on user behavior characteristics and related equipment
CN110838123B (en)* | 2019-11-06 | 2022-02-11 | 南京止善智能科技研究院有限公司 | A segmentation method for lighting highlight areas of interior design effect images
CN114201999A (en)* | 2020-08-31 | 2022-03-18 | 中国移动通信集团浙江有限公司 | Identification method, system, computing device and storage medium of abnormal account

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101149759A (en)* | 2007-11-09 | 2008-03-26 | 山西大学 | A K-means initial clustering center selection method based on a neighborhood model
CN102937985A (en)* | 2012-10-25 | 2013-02-20 | 南京理工大学 | Method for classifying, optimizing and analyzing websites based on a user mental model
CN103440275A (en)* | 2013-08-08 | 2013-12-11 | 南京邮电大学 | Prim-based K-means clustering method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US8229729B2 (en)* | 2008-03-25 | 2012-07-24 | International Business Machines Corporation | Machine translation in continuous space


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Xiao Qiang et al., "Design and Implementation of a Distributed Collaborative Filtering Algorithm in the Hadoop Environment" (Hadoop环境下的分布式协同过滤算法设计与实现), 《现代图书情报技术》, No. 1, 2013-01-31, pp. 83-89.*
Lei Zhen, "Research on Personalized Recommendation Algorithms Based on Clustering" (基于聚类的个性化推荐算法研究), 《中国优秀硕士学位论文全文数据库 信息科技辑》, No. 01, 2014-01-15, I138-1600.*
Tang Hanqing et al., "Application of an Improved K-means Algorithm in Network Public Opinion Analysis" (改进的K-means算法在网络舆情分析中的应用), 《计算机系统应用》, Vol. 20, No. 3, 2011-03, pp. 165-168, 196.*

Also Published As

Publication number | Publication date
CN103793504A (en) | 2014-05-14

Similar Documents

Publication | Publication Date | Title
CN108733798B (en)Knowledge graph-based personalized recommendation method
CN107330451A (en)Clothes attribute retrieval method based on depth convolutional neural networks
US20160283533A1 (en)Multi-distance clustering
WO2019015246A1 (en)Image feature acquisition
CN109635140B (en) An Image Retrieval Method Based on Deep Learning and Density Peak Clustering
CN103106279A (en)Clustering method simultaneously based on node attribute and structural relationship similarity
WO2016066042A1 (en)Segmentation method for commodity picture and device thereof
CN111754345A (en) A Bitcoin Address Classification Method Based on Improved Random Forest
CN105373597A (en)Collaborative filtering recommendation method for user based on k-medoids project clustering and local interest fusion
CN109471982B (en)Web service recommendation method based on QoS (quality of service) perception of user and service clustering
CN107180093A (en)Information search method and device and ageing inquiry word recognition method and device
WO2018166273A1 (en)Method and apparatus for matching high-dimensional image feature
CN106649877A (en)Density peak-based big data mining method and apparatus
CN103793504B (en)A kind of cluster initial point system of selection based on user preference and item attribute
CN109034953B (en)Movie recommendation method
CN115546538A (en)Three-dimensional model classification method based on point cloud and local shape features
CN102722578B (en)Unsupervised cluster characteristic selection method based on Laplace regularization
EP3452916A1 (en)Large scale social graph segmentation
CN106845462A (en) A Face Recognition Method Based on Simultaneous Selection of Features and Clustering Induced by Triplets
CN104463864B (en)Multistage parallel key frame cloud extracting method and system
WO2020147259A1 (en)User portait method and apparatus, readable storage medium, and terminal device
CN103955524A (en)Event-related socialized image searching algorithm based on hypergraph model
CN107391594B (en)Image retrieval method based on iterative visual sorting
CN110309424A (en) A social recommendation method based on rough clustering
CN110968793A (en)User cold start recommendation algorithm based on collaborative filtering mixed filling

Legal Events

Code | Title
C06, PB01 | Publication
C10, SE01 | Entry into substantive examination / Entry into force of request for substantive examination
GR01 | Patent grant (granted publication date: 2018-02-27)
CF01 | Termination of patent right due to non-payment of annual fee (termination date: 2020-01-24)
