Personalized book recommendation algorithm based on topic model

doi:10.11772/j.issn.1001-9081.2015.09.2569

Journal of Computer Applications, 2015, Vol. 35, Issue (9): 2569-2573.

ZHENG Xiangyun, CHEN Zhigang, HUANG Rui, LI Bo

School of Software, Central South University, Changsha Hunan 410075, China

Received: 2015-04-23 Revised: 2015-06-16 Online: 2015-09-10 Published: 2015-09-17
Corresponding author: CHEN Zhigang (born 1964), male, from Yiyang, Hunan; professor, doctoral supervisor, Ph.D., CCF member. Main research interests: wireless networks, distributed computing. czg@csu.edu.cn
About the authors: ZHENG Xiangyun (born 1992), male, from Yongzhou, Hunan; M.S. candidate; main research interest: data mining. HUANG Rui (born 1989), male, from Anqing, Anhui; Ph.D. candidate; main research interest: social networks. LI Bo (born 1988), male, from Hengshui, Hebei; M.S. candidate; main research interest: data mining.
Supported by: National Natural Science Foundation of China (61379057, 61309001, 61272149, 61103202); Fundamental Research Funds for the Central Universities of Central South University (2015zzts228).

Abstract

Abstract: To address the low precision of traditional recommendation algorithms, a new data mining model for Book Recommendation (BR) in library management systems, named the Book Recommendation_Latent Dirichlet Allocation (BR_LDA) model, is proposed on the basis of the Latent Dirichlet Allocation (LDA) topic model. First, content similarity analysis between the target borrower's historical borrowing data and other book data yields the other books whose content is most similar to the books the target borrower has borrowed. Second, similarity analysis between the target borrower's historical borrowing data and that of other borrowers yields the historical borrowing data of the nearest-neighbor borrowers. Finally, the probability of each book being recommended is computed to obtain the books the target borrower is potentially interested in. In particular, when the number of recommended books is 4000, the precision of the BR_LDA model is 6.2% higher than that of the multi-feature method and 4.5% higher than that of the association rule method; when the number of recommended books is 500, its precision is 2.1% higher than that of nearest-neighbor collaborative filtering and 0.5% higher than that of matrix-factorization collaborative filtering. The experimental results show that the model can efficiently mine book data and more accurately recommend to target borrowers both new books in categories they have historically been interested in and books in new categories of potential interest.

Key words: Book Recommendation (BR), library management system, data mining, recommendation algorithm
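The three steps summarized in the abstract (content similarity, nearest-neighbor borrowers, recommendation probability) and the precision measure used in the comparison can be illustrated with a minimal sketch. This is not the BR_LDA model itself: the abstract does not give its formulas, so the sketch assumes that per-book topic distributions are already available (for example from an LDA implementation), and it substitutes cosine similarity for content similarity, Jaccard overlap for borrower similarity, and a simple weighted blend for the recommendation probability. All names (`recommend`, `precision_at_n`, `alpha`, `k`) and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Overlap between two borrowers' sets of borrowed books."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, history, book_topics, n, k=2, alpha=0.5):
    """Rank unborrowed books for `target`; k and alpha are illustrative choices."""
    borrowed = history[target]
    candidates = [b for b in book_topics if b not in borrowed]

    # Step 1: content similarity of each candidate to the borrowed books.
    content = {
        c: max((cosine(book_topics[c], book_topics[b]) for b in borrowed), default=0.0)
        for c in candidates
    }

    # Step 2: the k borrowers whose histories overlap most with the target's.
    neighbours = sorted(
        (u for u in history if u != target),
        key=lambda u: jaccard(borrowed, history[u]),
        reverse=True,
    )[:k]
    neighbour_score = {
        c: sum(jaccard(borrowed, history[u]) for u in neighbours if c in history[u])
        for c in candidates
    }

    # Step 3: blend both signals and normalise so the scores sum to 1.
    raw = {c: alpha * content[c] + (1 - alpha) * neighbour_score[c] for c in candidates}
    total = sum(raw.values())
    prob = {c: (s / total if total else 0.0) for c, s in raw.items()}
    return sorted(prob, key=prob.get, reverse=True)[:n]


def precision_at_n(recommended, relevant):
    """Fraction of the recommended books that are in the relevant set."""
    if not recommended:
        return 0.0
    return sum(1 for b in recommended if b in relevant) / len(recommended)


# Toy data: two-topic distributions per book and borrowing histories.
book_topics = {"b1": [0.9, 0.1], "b2": [0.8, 0.2], "b3": [0.1, 0.9], "b4": [0.2, 0.8]}
history = {"target": {"b1"}, "u1": {"b1", "b3"}, "u2": {"b4"}}

top = recommend("target", history, book_topics, n=2)
print(top, precision_at_n(top, {"b2"}))
```

In this toy run, `b2` ranks first because its topic vector is close to the borrowed book `b1`, and `b3` ranks second because the most similar borrower has borrowed it; this mirrors the abstract's distinction between new books in familiar categories and books in new categories of potential interest.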

CLC Number: TP301.6

ZHENG Xiangyun, CHEN Zhigang, HUANG Rui, LI Bo. Personalized book recommendation algorithm based on topic model[J]. Journal of Computer Applications, 2015, 35(9): 2569-2573.


