Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (4): 1054-1059. DOI: 10.11772/j.issn.1001-9081.2016.04.1054

• Artificial Intelligence •

Distributed collaborative filtering recommendation algorithm based on gray association analysis

QIU Gui, YAN Renwu

  1. School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang Jiangsu 212003, China
  • Received: 2015-09-26 Revised: 2015-12-04 Online: 2016-04-10 Published: 2016-04-08
  • Corresponding author: QIU Gui
  • About the authors: QIU Gui (born 1990), male, from Suzhou, Anhui; M.S. candidate; research interests: data mining and big data. YAN Renwu (born 1962), male, from Liangshan, Shandong; associate professor, M.S.; research interests: data mining and information fusion.

Abstract: Traditional user-based and item-based Collaborative Filtering Recommendation (CFR) algorithms mostly rely on "hard classification" clustering and suffer from data sparsity and poor scalability. To address these problems, a distributed collaborative filtering recommendation algorithm based on gray association analysis was proposed. Running on the Hadoop distributed computing platform, the algorithm first calculates the grey relational coefficient of each item in the rating matrix, then calculates the Grey Relational Grade (GRG) of each item. Finally, a neighbor set is built for each item according to its GRG, and the rating of an item to be predicted for a given user is predicted from the corresponding neighbor set. Experiments on the MovieLens dataset showed that the Mean Absolute Error (MAE) of the proposed algorithm was 1.07% and 0.06% lower than that of the user-based and item-based CFR algorithms, respectively, and that as the dataset grew, running efficiency improved correspondingly when nodes were added to the Hadoop cluster. The experimental results illustrate that the proposed algorithm can make effective recommendations for large-scale datasets and can solve the problem of data scalability.

Key words: grey system, Grey Relational Grade (GRG), Collaborative Filtering Recommendation (CFR) algorithm, distributed system, Hadoop
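
The three steps named in the abstract (grey relational coefficient, GRG, neighbor-based prediction) can be illustrated with a small single-machine sketch. The abstract does not give the formulas, so this assumes the commonly used grey relational coefficient with distinguishing coefficient rho = 0.5 and a GRG-weighted average for prediction. It does not reproduce the paper's Hadoop implementation. The function names, the toy matrix, and the convention that 0 marks a missing rating are all hypothetical.

```python
import numpy as np

def grey_relational_grades(ratings, ref, rho=0.5):
    """Return the GRG of every item column against reference item `ref`.

    Assumptions (not from the paper): 0 marks a missing rating, only users
    who rated the reference item are compared, and any other missing entry
    is naively treated as a rating of 0.
    """
    rows = ratings[ratings[:, ref] > 0]           # users who rated `ref`
    delta = np.abs(rows - rows[:, [ref]])         # |x_ref(k) - x_i(k)|
    others = np.delete(delta, ref, axis=1)        # comparison items only
    d_min, d_max = others.min(), others.max()
    if d_max == 0:                                # all sequences identical
        return np.ones(ratings.shape[1])
    # Grey relational coefficient for every (user, item) pair
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=0)                     # GRG: mean over users

def predict(ratings, user, item, k=2, rho=0.5):
    """Predict ratings[user, item] from the k items with the highest GRG."""
    grades = grey_relational_grades(ratings, item, rho)
    candidates = np.flatnonzero(ratings[user] > 0)    # items the user rated
    candidates = candidates[candidates != item]
    top = candidates[np.argsort(grades[candidates])[::-1][:k]]
    weights = grades[top]
    return float(np.dot(weights, ratings[user, top]) / weights.sum())

# Toy rating matrix (rows: users, columns: items); user 3 has not rated item 1.
R = np.array([
    [5, 4, 1, 5],
    [4, 5, 2, 4],
    [1, 2, 5, 1],
    [5, 0, 1, 4],
], dtype=float)

print(predict(R, user=3, item=1))
```

In this toy example, items 0 and 3 track item 1 much more closely than item 2 does, so they form the neighbor set and the prediction is the GRG-weighted average of user 3's ratings on them. In a distributed setting, the per-item GRG computation is the part that would be split across cluster nodes.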
