Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (2): 497-502. DOI: 10.11772/j.issn.1001-9081.2017082493

• Data Science and Technology •

Simplified Slope One algorithm for online rating prediction

SUN Limei, LI Yue, Ejike Ifeanyi Michael, CAO Keyan

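  1. Information and Control Engineering Faculty, Shenyang Jianzhu University, Shenyang, Liaoning 110168, China
  • Received: 2017-08-28 Revised: 2017-10-20 Online: 2018-02-10 Published: 2018-02-10
  • Corresponding author: SUN Limei
  • About the authors: SUN Limei (1974-), female, born in Shenyang, Liaoning, associate professor, Ph.D., CCF member; main research interests: data mining and recommendation systems. LI Yue (1985-), male, born in Shenyang, Liaoning, master's student; main research interest: recommendation systems. Ejike Ifeanyi Michael (1988-), male, from Nigeria, master's degree; main research interest: recommendation systems. CAO Keyan (1981-), female, born in Shenyang, Liaoning, associate professor, Ph.D., CCF member; main research interest: big data analysis.
  • Supported by:
    National Natural Science Foundation of China (61602323); China Postdoctoral Science Foundation (2016M591455); Science and Technology Project of the Archives Bureau of Liaoning Province (L-2017-x-11); Doctoral Research Start-up Foundation of Liaoning Province (201601209).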

Abstract: In the era of big data, personalized recommendation systems are an effective means of information filtering, and data sparsity is one of the main factors that limit their prediction accuracy. The Slope One rating prediction algorithm addresses data sparsity with a simple linear regression model; it is easy to implement and predicts ratings quickly, but generating the rating differences between items in the training stage consumes so much time and space that training has to be done offline. To solve these problems, a simplified Slope One algorithm, Simplified Slope One, is proposed. It replaces the rating difference between two items with the difference between their historical average ratings, which simplifies the most time-consuming step of the training stage, the generation of item rating differences, and reduces the time and space complexity of the algorithm. This also makes more effective use of the rating data and adapts better to sparse data. In the experiments, the rating records in the MovieLens data set were ordered by timestamp and then divided into a training set and a test set. The results show that the prediction accuracy of Simplified Slope One is close to that of the original Slope One algorithm, while its time and space complexity are both lower, which makes it better suited to large-scale recommendation systems whose data grow rapidly.
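
The contrast described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy ratings, and the unweighted averaging in both predictors are assumptions made for illustration, since the abstract does not give the exact prediction formula. The classic variant stores one average difference per co-rated item pair, while the simplified variant stores only one mean per item and approximates the pair difference as `means[target] - means[i]`.

```python
from collections import defaultdict

def item_means(ratings):
    """Return {item: historical mean rating} from (user, item, rating) triples."""
    total, count = defaultdict(float), defaultdict(int)
    for _, item, r in ratings:
        total[item] += r
        count[item] += 1
    return {item: total[item] / count[item] for item in total}

def slope_one_dev(ratings):
    """Classic training step: mean rating difference for every co-rated item pair."""
    by_user = defaultdict(dict)
    for user, item, r in ratings:
        by_user[user][item] = r
    diff_sum, pair_count = defaultdict(float), defaultdict(int)
    for items in by_user.values():
        for j, rj in items.items():
            for i, ri in items.items():
                if i != j:
                    diff_sum[(j, i)] += rj - ri
                    pair_count[(j, i)] += 1
    return {pair: diff_sum[pair] / pair_count[pair] for pair in diff_sum}

def predict_slope_one(user_ratings, target, dev):
    """Unweighted Slope One: average of dev(target, i) + r_i over usable items i."""
    known = [(i, r) for i, r in user_ratings.items() if (target, i) in dev]
    if not known:
        return None
    return sum(dev[(target, i)] + r for i, r in known) / len(known)

def predict_simplified(user_ratings, target, means):
    """Assumed simplified form: dev(target, i) is replaced by means[target] - means[i]."""
    known = [(i, r) for i, r in user_ratings.items() if i != target and i in means]
    if target not in means or not known:
        return None
    return sum(means[target] - means[i] + r for i, r in known) / len(known)

# Hypothetical toy data: user u3 has rated only item A; predict u3's rating of B.
ratings = [("u1", "A", 5), ("u1", "B", 3), ("u2", "A", 4), ("u2", "B", 2), ("u3", "A", 4)]
means = item_means(ratings)
dev = slope_one_dev(ratings)
classic = predict_slope_one({"A": 4}, "B", dev)
simplified = predict_simplified({"A": 4}, "B", means)
print(classic, simplified)
```

On this toy data the two predictions are close but not identical. The item means need a single pass over the ratings and one stored value per item, and they can be updated incrementally as new ratings arrive, whereas the pairwise table grows with the number of co-rated item pairs.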

Key words: personalized recommendation, Slope One algorithm, online, rating prediction, recommendation system

CLC number: