Journal of Computer Applications, 2017, Vol. 37, Issue (5): 1387-1391. DOI: 10.11772/j.issn.1001-9081.2017.05.1387


Item collaborative filtering recommendation algorithm based on improved similarity measure

YU Jinming1, MENG Jun2, WU Qiufeng2   

  1. College of Engineering, Northeast Agricultural University, Harbin Heilongjiang 150030, China;
    2. College of Science, Northeast Agricultural University, Harbin Heilongjiang 150030, China
  • Received: 2016-10-08 Revised: 2016-11-28 Online: 2017-05-10 Published: 2017-05-16
  • Supported by:
    This work is partially supported by the Public Welfare Industry (Agriculture) Scientific Research Special Projects Level-2 (201503116-04-06), the Postdoctoral Foundation of Heilongjiang Province (LBH-Z15020), the National Science and Technology Support Plan Thematic Mandate (2014BAD12B01-1-3), the Key Laboratory Open Fund of Agricultural Water Resources Efficient Utilization in Ministry of Agriculture (2015004), the Philosophical and Social Science Research Plan Annual Project of Heilongjiang Province (16YB17).

基于改进相似性度量的项目协同过滤推荐算法

于金明1, 孟军2, 吴秋峰2   

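  1. College of Engineering, Northeast Agricultural University, Harbin 150030;
    2. College of Science, Northeast Agricultural University, Harbin 150030
  • Corresponding author: MENG Jun
  • About the authors: YU Jinming (born 1992), female, from Mudanjiang, Heilongjiang, M.S. candidate; research interests: data mining and machine learning. MENG Jun (born 1965), male, from Harbin, Heilongjiang, professor, Ph.D.; research interests: data mining and machine learning. WU Qiufeng (born 1979), male, from Shuangyashan, Heilongjiang, associate professor, Ph.D., CCF member; research interests: data mining and machine learning.
  • Funding:
    Public Welfare Industry (Agriculture) Scientific Research Special Project, Level-2 Task (201503116-04-06); Postdoctoral Foundation of Heilongjiang Province (LBH-Z15020); National Science and Technology Support Plan, Thematic Task (2014BAD12B01-1-3); Open Fund of the Key Laboratory of Agricultural Water Resources Efficient Utilization, Ministry of Agriculture (2015004); Philosophical and Social Science Research Plan Annual Project of Heilongjiang Province (16YB17).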

Abstract: Traditional collaborative filtering algorithms perform poorly under cold-start conditions. To address this problem, an item collaborative filtering algorithm based on IPSS (Inverse Item Frequency-based Proximity-Significance-Singularity), called ICF_IPSS, was proposed. Its core is a novel item similarity measure composed of a rating similarity and a structure similarity. The rating similarity takes into account the difference between the ratings of two items, the difference between each item rating and the median rating value, and the difference between each rating and the average rating of the other items. The structure similarity defines an Inverse Item Frequency (IIF) coefficient that reflects the common-rating ratio and penalizes active users. Experiments were carried out on the MovieLens and Jester data sets to verify the accuracy of ICF_IPSS. On the MovieLens data set, with 10 nearest neighbors, the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) of ICF_IPSS were 3.06% and 1.20% lower, respectively, than those of ICF_JMSD (Jaccard-based Mean Square Difference-based Item Collaborative Filtering). With 10 recommended items, the precision and recall were 67.79% and 67.86% higher, respectively, than those of ICF_JMSD. The experimental results show that ICF_IPSS outperforms item collaborative filtering algorithms based on traditional similarity measures, such as ICF_JMSD.
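The abstract describes the similarity measure only in words, so the following is a minimal sketch of how such a measure could be assembled, not the authors' formulas. The sigmoid form of each factor, the Jaccard-style common-rating ratio, the weight `log(n_items / items rated by the user)` used as the IIF coefficient, the default median of 3.0 (a 1-5 rating scale), and all function names are assumptions made for illustration.

```python
import math


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rating_similarity(r_i, r_j, median, mean_other):
    """Three-factor rating similarity for one user's ratings of items i and j.

    Assumed forms (not taken from the paper):
      proximity    shrinks as the two ratings move apart,
      significance grows as both ratings move away from the median,
      singularity  shrinks as the pair approaches the mean of other ratings.
    """
    proximity = 1.0 - _sigmoid(abs(r_i - r_j))
    significance = _sigmoid(abs(r_i - median) * abs(r_j - median))
    singularity = 1.0 - _sigmoid(abs((r_i + r_j) / 2.0 - mean_other))
    return proximity * significance * singularity


def item_similarity(ratings, item_i, item_j, n_items, median=3.0):
    """Illustrative item-item similarity.

    ratings: dict mapping user -> dict mapping item -> rating.
    n_items: total number of items in the catalogue.
    """
    users_i = {u for u, r in ratings.items() if item_i in r}
    users_j = {u for u, r in ratings.items() if item_j in r}
    common = users_i & users_j
    if not common:
        return 0.0

    # Assumed structure term: share of users who rated both items.
    common_ratio = len(common) / len(users_i | users_j)

    total = 0.0
    for u in common:
        user_ratings = ratings[u]
        others = [v for k, v in user_ratings.items() if k not in (item_i, item_j)]
        # Fall back to the median when the user rated nothing else.
        mean_other = sum(others) / len(others) if others else median
        # Assumed IIF weight: users who rated many items contribute less.
        iif = math.log(n_items / len(user_ratings))
        total += iif * rating_similarity(
            user_ratings[item_i], user_ratings[item_j], median, mean_other
        )
    return common_ratio * total
```

Under these assumptions, two items that a user rates identically and far from the median score higher than two items the same user rates at opposite extremes, and a user who has rated every item contributes nothing because the weight becomes `log(1) = 0`.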

Key words: Collaborative Filtering (CF), recommendation algorithm, similarity measure, rating similarity, structure similarity, cold start

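Abstract: To address the poor performance of traditional collaborative filtering recommendation algorithms under cold-start conditions, an item collaborative filtering recommendation algorithm (ICF_IPSS) based on an item similarity measure (IPSS) is proposed. Its core is a new item similarity measure composed of two parts, rating similarity and structure similarity. The rating similarity part considers the difference between the ratings of two items, the difference between an item rating and the median rating, and the difference between an item rating and the average of the other ratings. The structure similarity part defines an Inverse Item Frequency (IIF) coefficient that captures the proportion of commonly rated items among all items and penalizes active users. The accuracy of the algorithm was tested on the MovieLens and Jester data sets. On the MovieLens data set, with 10 nearest neighbors, the Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) of ICF_IPSS are 3.06% and 1.20% lower, respectively, than those of the item collaborative filtering algorithm based on the Jaccard mean square difference (ICF_JMSD); with 10 recommended items, the precision and recall of ICF_IPSS are 67.79% and 67.86% higher, respectively, than those of ICF_JMSD. The experimental results show that the IPSS-based item collaborative filtering algorithm outperforms item collaborative filtering algorithms based on traditional similarity measures, such as ICF_JMSD, in both prediction accuracy and classification accuracy.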
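For readers unfamiliar with the four evaluation metrics reported in the abstract, the sketch below uses their standard textbook definitions. How the paper decides which items count as relevant is not stated in the abstract, so the `relevant` set and the function names here are illustrative assumptions.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Square Error between predicted and true ratings."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


def precision_recall_at_n(recommended, relevant, n=10):
    """Precision and recall of the top-n recommended items.

    recommended: ranked list of item ids (best first).
    relevant: set of item ids the user actually liked (assumed given).
    """
    top_n = recommended[:n]
    hits = len(set(top_n) & set(relevant))
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Lower MAE and RMSE mean better rating prediction, while higher precision and recall mean better top-N recommendation, which is why the abstract reports the first pair as reductions and the second pair as increases.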

Key words: collaborative filtering, recommendation algorithm, similarity measure, rating similarity, structure similarity, cold start
