Journal of Computer Applications, 2016, Vol. 36, Issue (2): 316-323. DOI: 10.11772/j.issn.1001-9081.2016.02.0316

• The 3rd CCF Big Data Conference (CCF BigData 2015)

Parallelized recommendation algorithm for location-based social networks

ZENG Xuelin, WU Bin

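  1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2015-08-29 Revised: 2015-09-12 Online: 2016-02-10 Published: 2016-02-03
  • Corresponding author: ZENG Xuelin (1991-), female, born in Hanzhong, Shaanxi, master's student; main research interests: cloud computing, recommendation algorithms.
  • About the author: WU Bin (1969-), male, born in Changsha, Hunan, professor, Ph.D., CCF member; main research interests: data mining, complex networks, big data, cloud computing, business intelligence.
  • Supported by:
    National High Technology Research and Development Program of China (863 Program) (2015AA050204); Joint Construction Project of the Beijing Municipal Commission of Education.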

Parallelized recommendation algorithm in location-based social network

ZENG Xuelin, WU Bin   

  1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2015-08-29 Revised: 2015-09-12 Online: 2016-02-10 Published: 2016-02-03

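Abstract: Traditional collaborative filtering algorithms lose accuracy when recommending Points of Interest (POIs) from check-in records, because they cannot make full use of the preference, location and social network information implied in check-ins. Traditional single-machine serial algorithms are also weak at processing big data. To address both problems, a Location-Friendship Based Collaborative Filtering (LFBCF) algorithm is proposed. It takes users' historical preferences as its basis, performs collaborative filtering that takes users' social networks into account, and uses each user's activity range as a constraint when recommending POIs. To support experiments on large volumes of data, the algorithm was parallelized on the Spark distributed computing platform. Two location-based social network datasets, Gowalla and Brightkite, were used in the study. Factors that may affect the recommendation results, such as the number of check-ins, the distance between check-in locations and social relationships, were analyzed to support the proposed algorithm. In the experiments, comparison with classic algorithms such as traditional collaborative filtering on precision and F-measure verified the superiority of the algorithm's recommendation quality, and comparison of the speed-up ratio between the parallel algorithm and the single-machine serial algorithm at different data scales verified the value of parallelization and its performance advantage.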

Keywords: location-based social network, recommender system, collaborative filtering, point of interest, parallelization, Spark
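
The abstract names three ingredients of LFBCF: historical check-in preferences, the social network, and the user's activity range as a constraint. The sketch below shows one way these could fit together. It is not the authors' formulation: the cosine similarity over check-in counts, the check-in-weighted centroid used as the activity center, the `radius_km` threshold, and all function names are assumptions made for illustration.

```python
import math
from collections import defaultdict


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cosine(u, v):
    """Cosine similarity between two sparse check-in count vectors (dicts)."""
    dot = sum(c * v[p] for p, c in u.items() if p in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def recommend(user, checkins, friends, poi_coords, radius_km=50.0, top_n=3):
    """Rank unvisited POIs for `user`, using friends as the CF neighborhood.

    checkins:   {user: {poi: check-in count}}
    friends:    {user: set of friend ids}
    poi_coords: {poi: (lat, lon)}
    radius_km:  hypothetical activity-range threshold (an assumption)
    """
    visited = checkins.get(user, {})
    if not visited:
        return []
    # Assumption: the activity center is the check-in-weighted centroid of
    # visited POIs (a flat average of coordinates, adequate for small areas).
    total = sum(visited.values())
    center = (sum(poi_coords[p][0] * c for p, c in visited.items()) / total,
              sum(poi_coords[p][1] * c for p, c in visited.items()) / total)
    scores = defaultdict(float)
    for friend in friends.get(user, ()):
        sim = cosine(visited, checkins.get(friend, {}))
        if sim <= 0:
            continue  # friend shares no POI with the user, so no evidence
        for poi, count in checkins[friend].items():
            if poi in visited:
                continue  # only recommend POIs the user has not visited
            if haversine_km(center, poi_coords[poi]) > radius_km:
                continue  # outside the assumed activity range
            scores[poi] += sim * count
    # Highest score first; ties are broken by POI id for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]
```

Restricting the neighborhood to friends and filtering by distance keeps the per-user work independent of other users, which is the property that would make a per-user map step on a platform such as Spark straightforward.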

Abstract: Traditional collaborative filtering cannot make full use of the information implied in users' check-ins, namely preference, location and social relationships, when recommending Points of Interest (POIs). To address this, a recommendation algorithm named Location-Friendship Based Collaborative Filtering (LFBCF) was proposed, which exploits users' past behavior, check-in information and social relations to improve the precision of POI recommendation. The algorithm was implemented on the distributed computing platform Spark to support large-scale datasets in experiments. Two real Location-based Social Network (LBSN) datasets, Gowalla and Brightkite, were used in the experiments. The number of check-ins, the distance between check-in locations and the social relationships were analyzed to support the proposed algorithm. Comparison of precision and F-measure with traditional algorithms confirms the effectiveness of the proposed algorithm, and comparison of the speed-up ratio between the parallelized and serial algorithms demonstrates the value of parallelization and its performance advantage.

Key words: Location-based Social Network (LBSN), recommender system, collaborative filtering, Point of Interest (POI), parallelization, Spark
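
The abstract evaluates the algorithm with precision, F-measure and speed-up ratio. The sketch below uses the textbook definitions of these measures. The exact evaluation protocol of the paper (for example, how the test set is held out or how many POIs are recommended per user) is not stated here, and the function names are illustrative.

```python
def precision_recall_f(recommended, relevant):
    """Return (precision, recall, F-measure) for one user's recommendation list.

    recommended: POIs returned by the recommender
    relevant:    POIs the user actually visited in the held-out data
    """
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    # F-measure is the harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def speedup(serial_seconds, parallel_seconds):
    """Speed-up ratio: serial running time divided by parallel running time."""
    if parallel_seconds <= 0:
        raise ValueError("parallel_seconds must be positive")
    return serial_seconds / parallel_seconds
```

For instance, recommending four POIs of which two appear among three held-out visits gives a precision of 0.5 and a recall of 2/3, and a job that takes 120 s serially and 30 s in parallel has a speed-up ratio of 4.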

CLC number: