Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (2): 419-423. DOI: 10.11772/j.issn.1001-9081.2016.02.0419

• The 3rd CCF Conference on Big Data (CCF BigData 2015) •

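Personalized recommendation algorithm based on location bitcode tree

LIANG Junjie, GAN Wenting, YU Dunhui

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan Hubei 430062, China
  • Received: 2015-08-29 Revised: 2015-09-15 Online: 2016-02-10 Published: 2016-02-03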
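  • Corresponding author: GAN Wenting (born 1989), female, from Fengcheng, Jiangxi; M.S. candidate; research interests: Web information mining and personalized recommendation.
  • About the authors: LIANG Junjie (born 1974), female, from Wuhan, Hubei; associate professor, Ph.D., CCF member; research interests: multimedia databases and high-dimensional indexing. YU Dunhui (born 1974), male, from Wuhan, Hubei; associate professor, Ph.D., CCF member; research interests: personalized recommendation and big data.
  • Supported by:
    Key Program of the Natural Science Foundation of Hubei Province (2015CFA067); Key Project of the Scientific Research Program of the Hubei Provincial Department of Education (D20151001); Wuhan Science and Technology Key Research Program (2013012401010851).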

Abstract: To address the inefficiency of collaborative filtering recommendation algorithms in big data environments, a personalized recommendation algorithm based on a location bitcode tree, called LB-Tree, was developed. Drawing on the characteristics of the MapReduce framework, the approach applies an index structure to personalized recommendation processing. To improve the parallelism of MapReduce tasks, a storage strategy based on the differences between clusters was presented. According to its data distribution, each cluster was partitioned into several layers by concentric circles centered on the cluster centroid, and each layer was expressed by binary bitcodes of a different length. An index tree was then constructed from the bitcodes of all the layers, which shortens the search path for frequently recommended data and allows the search space to be determined quickly. Compared with the item-based Top-N recommendation algorithm and the Similarity-Based Neighborhood Method (SBNM), LB-Tree achieves the highest accuracy and the slowest growth in time cost, which verifies its effectiveness and efficiency.

Key words: big data, MapReduce, personalized recommendation, index tree, location bitcode
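
The layering and encoding idea in the abstract can be illustrated with a small sketch. The abstract does not give the actual bitcode scheme or tree layout, so everything below is an assumption made for illustration: equal-width rings, Euclidean distance, a unary-style prefix code in which layers nearer the centroid get shorter codes, a nested-dict binary trie, and all function names. It is not the authors' LB-Tree implementation and it omits the MapReduce part entirely.

```python
import math


def layer_index(point, centroid, ring_width):
    """0-based concentric layer of `point` around `centroid`.

    Assumption: rings have equal width; the paper may size them differently.
    """
    return int(math.dist(point, centroid) // ring_width)


def layer_bitcode(layer):
    """Hypothetical variable-length code: layer 0 -> '0', 1 -> '10', 2 -> '110'.

    Inner layers get shorter codes, so they sit closer to the root of the tree.
    """
    return "1" * layer + "0"


def build_index(points, centroid, ring_width):
    """Build a binary trie (nested dicts) keyed by the layer bitcodes.

    The node reached by a full bitcode stores that layer's points under "items".
    """
    root = {}
    for p in points:
        node = root
        for bit in layer_bitcode(layer_index(p, centroid, ring_width)):
            node = node.setdefault(bit, {})
        node.setdefault("items", []).append(p)
    return root


def lookup(root, bitcode):
    """Walk the trie along `bitcode` and return the points stored there."""
    node = root
    for bit in bitcode:
        node = node.get(bit)
        if node is None:
            return []
    return node.get("items", [])


if __name__ == "__main__":
    pts = [(0.5, 0.0), (0.0, 1.5), (2.0, 2.0), (0.1, 0.2)]
    index = build_index(pts, centroid=(0.0, 0.0), ring_width=1.0)
    print(lookup(index, "0"))    # points in the innermost layer
    print(lookup(index, "110"))  # points in the third layer
```

Under these assumptions, reaching the innermost layer takes a single step from the root, while outer layers take one extra step each, which mirrors the stated goal of shortening the search path for frequently recommended data.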

CLC number: