Collaborative filtering algorithm based on collaborative training and Boosting

doi:10.11772/j.issn.1001-9081.2022101489

Journal of Computer Applications, 2023, Vol. 43, Issue (10): 3136-3141. DOI: 10.11772/j.issn.1001-9081.2022101489

Special Issue: Data science and technology

Xiaohan YANG, Guosheng HAO, Xiehua ZHANG, Zihao YANG

College of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China

Received: 2022-10-11  Revised: 2023-01-13  Accepted: 2023-01-16  Online: 2023-04-12  Published: 2023-10-10
Contact: Xiehua ZHANG
About the authors: YANG Xiaohan, born in 1995, M.S. candidate. Her research interests include machine learning and recommender systems.
HAO Guosheng, born in 1972, Ph.D., professor. His research interests include machine learning, evolutionary computation, and personalized learning.
ZHANG Xiehua, born in 1977, Ph.D., associate professor. Her research interests include machine learning and moving target detection and tracking.
YANG Zihao, born in 1998, M.S. candidate. His research interests include machine learning and computer vision.
Supported by: National Natural Science Foundation of China (62277030); Postgraduate Scientific Research and Practical Innovation Program of Jiangsu Normal University (2022XKT1536)

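Chinese title: 基于协同训练与Boosting的协同过滤算法

Authors (Chinese names): 杨晓菡, 郝国生, 张谢华, 杨子豪

Corresponding author: ZHANG Xiehua (张谢华), 6019980030@jsnu.edu.cn
Author details: YANG Xiaohan (born 1995), female, from Xuzhou, Jiangsu; M.S. candidate; research interests: machine learning, recommender systems.
HAO Guosheng (born 1972), male, from Wanquan, Hebei; professor, Ph.D.; research interests: machine learning, evolutionary computation, personalized learning.
ZHANG Xiehua (born 1977), female, from Susong, Anhui; associate professor, Ph.D.; research interests: machine learning, moving target detection and tracking.
YANG Zihao (born 1998), male, from Xianyang, Shaanxi; M.S. candidate; research interests: machine learning, computer vision.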

Abstract

Collaborative Filtering (CF) algorithms make personalized recommendations based on the similarity between items or between users, but data sparsity remains one of their main challenges. To improve prediction accuracy when user-item ratings are sparse, a CF algorithm based on Collaborative Training and Boosting (CFCTB) was proposed. First, two CF models were integrated into one framework through collaborative training: each model added its high-confidence pseudo-labeled samples to the other's training set, and Boosting-weighted training data were used to assist the collaborative training. Then, weighted integration was used to predict the final user ratings, which effectively avoided the accumulation of noise introduced by pseudo-labeled samples and further improved recommendation performance. Experimental results on four public datasets show that the proposed algorithm is more accurate than the single models. On the CiaoDVD dataset, which has the highest sparsity, the proposed algorithm reduces the Mean Absolute Error (MAE) by 4.737% compared with Global and Local Kernels for recommender systems (GLocal-K), and reduces the Root Mean Squared Error (RMSE) by 7.421% compared with the Ensemble of Co-trained Recommenders (ECoRec) algorithm. These results verify the effectiveness of the proposed algorithm.
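The co-training loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `BiasModel` is a toy stand-in for a real CF model, the confidence proxy (number of training ratings behind a prediction), the reweighting rule in `fit_boosted`, and the inverse-MAE combination weights are all hypothetical choices.

```python
import numpy as np

class BiasModel:
    """Toy stand-in for a CF model: predicts the weighted mean rating of a user or an item."""
    def __init__(self, key):                 # key=0 groups by user, key=1 groups by item
        self.key = key

    def fit(self, X, y, w):
        self.mu = np.average(y, weights=w)   # weighted global mean
        self.bias, self.count = {}, {}
        for k in np.unique(X[:, self.key]):
            m = X[:, self.key] == k
            self.bias[k] = np.average(y[m], weights=w[m]) - self.mu
            self.count[k] = int(m.sum())
        return self

    def predict(self, X):
        return np.array([self.mu + self.bias.get(k, 0.0) for k in X[:, self.key]])

    def confidence(self, X):
        # Assumed confidence proxy: number of training ratings behind each prediction.
        return np.array([self.count.get(k, 0) for k in X[:, self.key]])


def fit_boosted(model, X, y):
    """One boosting-style pass (assumed rule): refit with larger weights on poorly predicted samples."""
    w = np.full(len(y), 1.0 / len(y))
    err = np.abs(model.fit(X, y, w).predict(X) - y)
    w = w * (1.0 + err / (err.max() + 1e-12))
    return model.fit(X, y, w / w.sum())


def co_train(X_lab, y_lab, X_unl, rounds=3, top_k=2):
    models = [BiasModel(0), BiasModel(1)]
    train = [(X_lab, y_lab), (X_lab, y_lab)]
    pool = X_unl
    for _ in range(rounds):
        if len(pool) == 0:
            break
        for m, (X, y) in zip(models, train):
            fit_boosted(m, X, y)
        picked = set()
        for i, m in enumerate(models):
            idx = np.argsort(-m.confidence(pool))[:top_k]   # most confident unlabeled pairs
            Xo, yo = train[1 - i]                            # the other model's training set
            train[1 - i] = (np.vstack([Xo, pool[idx]]),
                            np.concatenate([yo, m.predict(pool[idx])]))
            picked.update(idx.tolist())
        pool = np.delete(pool, sorted(picked), axis=0)       # pseudo-labeled pairs leave the pool
    for m, (X, y) in zip(models, train):
        fit_boosted(m, X, y)                                 # final fit on the enlarged sets
    # Weighted integration: weight each model by the inverse of its MAE on the labeled data.
    mae = np.array([np.mean(np.abs(m.predict(X_lab) - y_lab)) for m in models])
    alpha = 1.0 / (mae + 1e-12)
    return models, alpha / alpha.sum()


def predict(models, alpha, X):
    return sum(a * m.predict(X) for a, m in zip(alpha, models))
```

Here `X_lab` and `X_unl` are integer arrays of `(user, item)` pairs and `y_lab` holds the known ratings. In a realistic setting the two toy models would be replaced by genuinely different CF models, since co-training relies on the two learners making different kinds of errors.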

Key words: recommendation algorithm, Collaborative Filtering (CF), data sparsity, collaborative training, Boosting

CLC Number: TP181

Xiaohan YANG, Guosheng HAO, Xiehua ZHANG, Zihao YANG. Collaborative filtering algorithm based on collaborative training and Boosting[J]. Journal of Computer Applications, 2023, 43(10): 3136-3141.

Tables

Dataset	Users	Items	Ratings	Sparsity/%
ML-100K	943	1,682	100,000	93.695
ml-latest-small	9,742	610	100,837	98.303
Filmtrust	1,508	2,071	35,497	98.863
CiaoDVD	17,615	16,121	72,665	99.974
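The sparsity column is consistent with the usual definition, the share of user-item pairs that have no rating. The short check below assumes that definition (the page itself does not state the formula) and recomputes the column from the three count columns; the function name is illustrative.

```python
def sparsity_percent(n_users, n_items, n_ratings):
    """Percentage of user-item pairs without a rating (assumed definition)."""
    return 100.0 * (1.0 - n_ratings / (n_users * n_items))

# (users, items, ratings) copied from the dataset table.
datasets = {
    "ML-100K": (943, 1682, 100000),
    "ml-latest-small": (9742, 610, 100837),
    "Filmtrust": (1508, 2071, 35497),
    "CiaoDVD": (17615, 16121, 72665),
}

for name, (users, items, ratings) in datasets.items():
    print(f"{name}: {sparsity_percent(users, items, ratings):.3f}%")
```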

Algorithm	ML-100K				ml-latest-small				Filmtrust				CiaoDVD
	RMSE		MAE		RMSE		MAE		RMSE		MAE		RMSE		MAE
	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std
SVD	0.9276	0.0039	0.7317	0.0038	0.8673	0.0029	0.6645	0.0012	0.7868	0.0077	0.6083	0.0070	0.9395	0.0049	0.7253	0.0058
SVD++	0.9105	0.0060	0.7145	0.0054	0.8572	0.0030	0.6560	0.0019	0.7838	0.0061	0.6025	0.0063	0.9346	0.0028	0.7178	0.0044
CFCTB	0.8985	0.0048	0.7024	0.0050	0.8452	0.0040	0.6432	0.0026	0.7729	0.0089	0.5885	0.0084	0.9194	0.0044	0.6999	0.0052

Algorithm	ML-100K				ml-latest-small				Filmtrust				CiaoDVD
	RMSE		MAE		RMSE		MAE		RMSE		MAE		RMSE		MAE
	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std	Mean	Std
SVD	0.9327	0.0026	0.7352	0.0030	0.8669	0.0116	0.6660	0.0080	0.7939	0.0075	0.6124	0.0060	0.9306	0.0078	0.7185	0.0071
KNNBaseline	0.9315	0.0018	0.7328	0.0023	0.8697	0.0105	0.6654	0.0075	0.8161	0.0071	0.6341	0.0026	0.9634	0.0073	0.7290	0.0059
CFCTB	0.9150	0.0027	0.7199	0.0040	0.8517	0.0066	0.6508	0.0085	0.7848	0.0071	0.5991	0.0039	0.9188	0.0069	0.7015	0.0055
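For reading the tables, the sketch below gives the standard definitions of RMSE and MAE and a helper that turns two mean errors into a relative reduction. It assumes that the percentage reductions quoted in the abstract are relative reductions of this form, which the page does not state; the helper names are illustrative.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` lowers an error metric relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Mean CiaoDVD errors from the SVD / KNNBaseline / CFCTB table (SVD row vs. CFCTB row).
svd_rmse, cfctb_rmse = 0.9306, 0.9188
svd_mae, cfctb_mae = 0.7185, 0.7015
print(f"RMSE reduction vs. SVD: {relative_reduction(svd_rmse, cfctb_rmse):.3f}%")
print(f"MAE reduction vs. SVD:  {relative_reduction(svd_mae, cfctb_mae):.3f}%")
```

Because the reported standard deviations are of the same order as some of the differences between rows, a relative reduction computed from the means alone should be read alongside the Std columns.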
