Journal of Computer Applications

• Artificial Intelligence and Simulation •

BIGDATA+170 A simplified Slope One algorithm for online rating prediction

SUN Limei, LI Yue, Ejike Ifeanyi Michael, CAO Keyan

  1. Shenyang Jianzhu University
  • Received: 2017-10-20 Online: 2017-10-20 Published: 2017-10-25
  • Corresponding author: SUN Limei

Abstract: In the era of big data, personalized recommendation systems are an effective means of information filtering. One of the main factors affecting their prediction accuracy is data sparsity. The Slope One rating prediction algorithm addresses data sparsity with a simple linear regression model. It is easy to implement and predicts ratings quickly. Its disadvantage is that generating the rating differences between items in the training stage is costly in both time and space, so training has to be done offline, and the training time increases as the data in the recommendation system grow. A simplified Slope One algorithm is proposed that replaces the rating difference between two items with the difference between their historical average ratings. This simplifies the most time-consuming step of the original algorithm, the generation of inter-item rating differences, and reduces both its time complexity and its space complexity. It also makes more effective use of the rating data and adapts better to sparse data. In the experiments, the rating records in the MovieLens data set were sorted by timestamp and then divided into a training set and a test set. The results show that the prediction accuracy of the proposed Simplified Slope One algorithm is close to that of the original Slope One algorithm, while its time and space complexity are both lower. This indicates that the simplified algorithm is better suited to large-scale recommendation systems whose data grow rapidly.
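
The core idea in the abstract, using the difference between two items' historical average ratings in place of the pairwise rating difference, can be illustrated with a minimal Python sketch. The abstract does not give the exact prediction formula, so the rule used here is an assumption: each item i already rated by the user contributes the estimate r_ui + (avg_j - avg_i) for the target item j, and the estimates are averaged with equal weight. The function names and the toy ratings are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def item_averages(ratings):
    """Return {item: mean rating} from an iterable of (user, item, rating)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def predict_simplified(user_ratings, target, averages):
    """Predict one user's rating of `target` (assumed rule, see lead-in).

    user_ratings: {item: rating} for the items this user has rated.
    averages:     {item: historical mean rating} over all users.
    Returns None when no estimate can be formed.
    """
    if target not in averages:
        return None
    estimates = [
        rating + (averages[target] - averages[item])
        for item, rating in user_ratings.items()
        if item != target and item in averages
    ]
    if not estimates:
        return None
    return sum(estimates) / len(estimates)


# Hypothetical toy data: (user, item, rating)
ratings = [
    ("u1", "A", 5.0), ("u1", "B", 3.0),
    ("u2", "A", 4.0), ("u2", "B", 2.0), ("u2", "C", 3.0),
    ("u3", "B", 4.0), ("u3", "C", 5.0),
]

averages = item_averages(ratings)  # A: 4.5, B: 3.0, C: 4.0
prediction = predict_simplified({"A": 5.0, "B": 3.0}, "C", averages)
print(prediction)  # 4.25
```

In this sketch the training state is one running sum and one count per item, whereas the original Slope One keeps a rating difference for every pair of co-rated items. That is the kind of saving in time and space the abstract describes, although the paper's actual implementation may differ.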