Journal of Computer Applications, 2017, Vol. 37, Issue (12): 3472-3476. DOI: 10.11772/j.issn.1001-9081.2017.12.3472

• Artificial intelligence •

Collaborative filtering algorithm based on bounded matrix low rank approximation and nearest neighbor model

WEN Zhankao, YI Xiushuang, TIAN Shenshen, LI Jie, WANG Xingwei

  1. School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110819, China
  • Received:2017-05-04 Revised:2017-07-10 Online:2017-12-10 Published:2017-12-18
  • Corresponding author: YI Xiushuang
  • About the authors: WEN Zhankao (1980-), male, born in Ganzhou, Jiangxi, engineer, M.S.; research interests: next-generation Internet, network security, big data analysis. YI Xiushuang (1969-), male, born in Chifeng, Inner Mongolia, professor, Ph.D.; research interests: next-generation Internet, network security, big data analysis. TIAN Shenshen (1992-), male, born in Shenyang, Liaoning, M.S. candidate; research interests: next-generation Internet, big data analysis. LI Jie (1981-), female, born in Shenyang, Liaoning, associate professor, Ph.D.; research interests: next-generation Internet, intelligent routing. WANG Xingwei (1968-), male, born in Baotou, Inner Mongolia, professor, Ph.D.; research interests: next-generation Internet, intelligent routing, software-defined networking, cyberspace security, big data analysis.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61572123), the National Science Foundation for Distinguished Young Scholars in China (61225012, 71325002), the Liaoning Bai Qian Wan Talents Program (2013921068), and the CERNET Innovation Project (NGⅡ20160616).

Abstract: To address the limitations and limited accuracy of matrix factorization when applied to Collaborative Filtering (CF), a CF algorithm based on Bounded Matrix low rank Approximation (BMA) and a Nearest neighbor model (BMAN-CF) was proposed to improve the accuracy of item rating prediction. Firstly, BMA matrix factorization was introduced to extract the latent feature information of sub-matrices and thereby improve the accuracy of the neighbor set search. Then, the target user's rating of the target item was predicted separately by the traditional user-based and item-based CF algorithms, and a balance factor and a control factor were used to dynamically balance the two predictions and obtain the final rating. Finally, the data was partitioned into blocks and the algorithm was parallelized in a Hadoop environment by exploiting the characteristics of the MapReduce computing framework. The experimental results show that BMAN-CF achieves higher rating prediction accuracy than other matrix factorization algorithms, and the speedup experiment shows that the parallelized algorithm has good scalability.

Key words: collaborative filtering, matrix factorization, bounded matrix, nearest neighbor model, Hadoop
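
The prediction step summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas for the balance factor or the control factor, so the sketch assumes a single fixed weight `lam` and omits the control factor entirely. The latent factor matrices `P` (users) and `Q` (items) are hand-made placeholders for the output of a BMA factorization, clipping to `[r_min, r_max]` is only a stand-in for the rating bounds, and all function and parameter names are hypothetical.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity of two latent vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom > 0 else 0.0

def neighbor_predict(R, F, target, col, k):
    """Similarity-weighted average of column `col` of R over the k rows
    closest to `target` in latent space F, among rows with a rating (> 0)."""
    rated = [r for r in range(R.shape[0]) if r != target and R[r, col] > 0]
    top = sorted(((cosine_sim(F[target], F[r]), r) for r in rated),
                 reverse=True)[:k]
    top = [(s, r) for s, r in top if s > 0]   # ignore non-positive similarity
    if not top:
        return None
    return sum(s * R[r, col] for s, r in top) / sum(s for s, _ in top)

def blended_predict(R, P, Q, user, item, k=2, lam=0.5, r_min=1.0, r_max=5.0):
    """Blend user-based and item-based predictions with a fixed weight `lam`
    (assumption: the paper balances them dynamically; that rule is not shown)."""
    user_pred = neighbor_predict(R, P, user, item, k)    # neighbors are users
    item_pred = neighbor_predict(R.T, Q, item, user, k)  # neighbors are items
    if user_pred is None and item_pred is None:
        pred = R[R > 0].mean()                # fall back to the global mean
    elif user_pred is None:
        pred = item_pred
    elif item_pred is None:
        pred = user_pred
    else:
        pred = lam * user_pred + (1 - lam) * item_pred
    return float(np.clip(pred, r_min, r_max))  # keep the rating in range

# Toy data: 3 users x 3 items, 0 marks a missing rating.
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 5.0, 3.0],
              [1.0, 2.0, 5.0]])
P = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])  # placeholder user factors
Q = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]])  # placeholder item factors

print(blended_predict(R, P, Q, user=0, item=2))
```

In this toy setup the neighbor search runs on the latent vectors rather than on the raw rating rows, which mirrors the abstract's idea of using the factorization to improve neighbor selection. Setting `lam=1.0` or `lam=0.0` reduces the sketch to the pure user-based or pure item-based prediction.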

CLC number: