Journal of Computer Applications, 2025, Vol. 45, Issue 6: 1858-1868. DOI: 10.11772/j.issn.1001-9081.2024060824
• Data Science and Technology •
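Received: 2024-06-20
Revised: 2024-09-18
Accepted: 2024-09-19
Online: 2024-10-11
Published: 2025-06-10
Corresponding author: Guanyu LI
About the author: WU Zonghang (born 2002), male, from Gongzhuling, Jilin; M.S. candidate and CCF member. His research interests include recommender systems and intelligent information processing.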
Zonghang WU, Dong ZHANG, Guanyu LI
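Abstract:
Multimodal recommendation algorithms suffer from data sparsity, and existing self-supervised learning (SSL) algorithms tend to apply SSL to a single feature of the dataset while overlooking the possibility of learning multiple features jointly. To address both problems, a multimodal fusion recommendation algorithm based on joint SSL, named SFELMMR (SelF supErvised Learning for MultiModal Recommendation), is proposed. First, existing SSL strategies are integrated and optimized so that the data features of different modalities are learned jointly, which strengthens the data representations and thereby alleviates data sparsity. Second, a method for constructing a multimodal latent semantic graph is designed by fusing deep item relationships from a global perspective with direct interactions from a local perspective, so that the algorithm captures the complex relationships among items more accurately. Finally, experiments are conducted on three datasets. Compared with the best-performing existing multimodal recommendation algorithm, the proposed algorithm improves Recall@10 by 5.49%, 2.56%, and 2.99%, NDCG@10 by 1.17%, 1.98%, and 3.52%, Precision@10 by 4.69%, 2.74%, and 1.22%, and MAP@10 by 0.81%, 1.59%, and 3.11% on the TikTok, Baby, and Sports datasets, respectively. Ablation experiments further verify the effectiveness of the proposed algorithm.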
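The abstract reports Recall@10, NDCG@10, Precision@10 and MAP@10 without defining them. As a minimal sketch, assuming the common binary-relevance definitions (the paper's exact evaluation protocol is not shown on this page), the per-user values could be computed as follows. The function name `metrics_at_k` and the normalisation of average precision by min(|relevant|, k) are illustrative choices, not the authors' code.

```python
import math


def metrics_at_k(ranked_items, relevant_items, k=10):
    """Return Recall@k, Precision@k, NDCG@k and AP@k for a single user.

    Assumes binary relevance: an item is either in the user's held-out
    set (relevant) or it is not.
    """
    relevant = set(relevant_items)
    top_k = ranked_items[:k]
    hits = [1 if item in relevant else 0 for item in top_k]
    num_hits = sum(hits)

    recall = num_hits / len(relevant) if relevant else 0.0
    precision = num_hits / k

    # DCG with binary gains; enumerate() is 0-based, so the discount for
    # the item at 1-based rank r = i + 1 is log2(r + 1) = log2(i + 2).
    dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    # Average precision: precision at each hit position, summed and
    # normalised by min(|relevant|, k) (an assumed convention).
    running_hits = 0
    precision_sum = 0.0
    for i, h in enumerate(hits):
        if h:
            running_hits += 1
            precision_sum += running_hits / (i + 1)
    ap = precision_sum / ideal_hits if ideal_hits > 0 else 0.0

    return {"recall": recall, "precision": precision, "ndcg": ndcg, "map": ap}


# Toy example: 4 recommended items, 3 held-out items, 2 of them retrieved.
example = metrics_at_k(["a", "b", "c", "d"], {"a", "c", "z"}, k=4)
print(example)
```

Averaging these per-user values over all test users would give dataset-level figures of the kind listed in Tab. 2.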
Zonghang WU, Dong ZHANG, Guanyu LI. Multimodal fusion recommendation algorithm based on joint self-supervised learning[J]. Journal of Computer Applications, 2025, 45(6): 1858-1868.
Dataset | Users | Items | Interactions |
---|---|---|---|
TikTok | 9 308 | 6 710 | 68 722 |
Baby | 19 445 | 7 050 | 160 792 |
Sports | 35 598 | 18 357 | 296 337 |
Tab. 1 Experimental dataset statistics
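The data sparsity that the abstract refers to can be read directly off Tab. 1. The short sketch below assumes the usual definition, density = interactions / (users × items) and sparsity = 1 − density; the helper name `density` is illustrative.

```python
# (users, items, interactions) copied from Tab. 1.
datasets = {
    "TikTok": (9308, 6710, 68722),
    "Baby": (19445, 7050, 160792),
    "Sports": (35598, 18357, 296337),
}


def density(num_users, num_items, num_interactions):
    """Fraction of the user-item matrix that holds an observed interaction."""
    return num_interactions / (num_users * num_items)


for name, (users, items, interactions) in datasets.items():
    d = density(users, items, interactions)
    print(f"{name}: density = {d:.4%}, sparsity = {1 - d:.4%}")
```

Under this definition, well under 1% of the possible user-item pairs are observed in each of the three datasets.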
Algorithm | TikTok R@10 | TikTok N@10 | TikTok P@10 | TikTok M@10 | Baby R@10 | Baby N@10 | Baby P@10 | Baby M@10 | Sports R@10 | Sports N@10 | Sports P@10 | Sports M@10 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Improvement of best over second best/% | 5.49 | 1.17 | 4.69 | 0.81 | 2.56 | 1.98 | 2.74 | 1.59 | 2.99 | 3.52 | 1.22 | 3.11 |
SelfCF | 0.058 6 | 0.029 2 | 0.005 9 | 0.020 3 | 0.052 1 | 0.027 9 | 0.005 8 | 0.019 9 | 0.063 0 | 0.034 4 | 0.007 0 | 0.024 8 |
LayerGCN | 0.059 4 | 0.033 9 | 0.005 9 | 0.026 3 | 0.051 8 | 0.027 7 | 0.005 8 | 0.019 6 | 0.061 6 | 0.033 6 | 0.006 9 | 0.024 1 |
MMGCN | 0.055 2 | 0.029 7 | 0.005 5 | 0.022 0 | 0.042 0 | 0.021 8 | 0.004 7 | 0.015 1 | 0.038 8 | 0.020 6 | 0.004 4 | 0.014 4 |
GRCN | 0.048 8 | 0.023 1 | 0.004 9 | 0.015 4 | 0.052 8 | 0.028 2 | 0.005 9 | 0.020 0 | 0.057 3 | 0.030 9 | 0.006 4 | 0.022 0 |
MGCN | 0.061 9 | 0.032 5 | 0.006 2 | 0.023 6 | 0.061 3 | 0.032 9 | 0.006 8 | 0.023 5 | 0.073 3 | 0.040 2 | 0.008 1 | 0.029 2 |
LATTICE | 0.057 8 | 0.030 8 | 0.005 8 | 0.022 6 | 0.054 9 | 0.029 1 | 0.006 1 | 0.020 5 | 0.062 2 | 0.034 1 | 0.006 9 | 0.024 7 |
FREEDOM | 0.053 7 | 0.031 6 | 0.006 4 | 0.024 5 | 0.062 8 | 0.032 9 | 0.006 8 | 0.022 7 | 0.071 3 | 0.038 2 | 0.007 9 | 0.027 2 |
DRAGON | 0.062 0 | 0.032 8 | 0.006 2 | 0.023 9 | 0.065 6 | 0.034 6 | 0.007 2 | 0.024 4 | 0.072 6 | 0.039 6 | ||
BM3 | 0.061 7 | 0.032 2 | 0.006 2 | 0.023 4 | 0.055 1 | 0.029 0 | 0.006 2 | 0.022 4 | 0.063 5 | 0.034 3 | 0.007 1 | 0.024 5 |
SLMRec | 0.046 0 | 0.023 2 | 0.004 6 | 0.016 5 | 0.055 1 | 0.029 5 | 0.006 1 | 0.021 0 | 0.067 6 | 0.037 4 | 0.007 5 | 0.027 2 |
MENTOR | 0.008 1 | 0.028 6 | ||||||||||
SFELMMR | 0.067 3 | 0.034 6 | 0.006 7 | 0.024 8 | 0.068 2 | 0.036 1 | 0.007 5 | 0.025 6 | 0.075 7 | 0.041 2 | 0.008 3 | 0.029 8 |
Tab. 2 Performance comparison of the SFELMMR algorithm and the comparison methods
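The improvement row in Tab. 2 is labelled as the gain of the best result over the second-best one. A minimal sketch of that arithmetic, assuming the conventional formula (best − second best) / second best × 100 (the page does not state the formula), is shown below. Both function names are illustrative, and the runner-up score of 0.0638 is only a value back-solved from the reported 5.49%, not a figure read from the table.

```python
def relative_improvement(best, second_best):
    """Percentage gain of `best` over `second_best`."""
    return (best - second_best) / second_best * 100.0


def implied_second_best(best, improvement_pct):
    """Invert the formula: the runner-up score implied by a reported gain."""
    return best / (1.0 + improvement_pct / 100.0)


# SFELMMR's TikTok Recall@10 is 0.0673 and the reported gain is 5.49%.
runner_up = implied_second_best(0.0673, 5.49)
print(f"implied runner-up Recall@10: {runner_up:.4f}")
print(f"gain over 0.0638: {relative_improvement(0.0673, 0.0638):.2f}%")
```

Because the scores are rounded to four decimals, a percentage recomputed this way can differ slightly in the last digit from the value the authors obtained from unrounded results.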
Algorithm | TikTok R@10 | TikTok N@10 | TikTok P@10 | TikTok M@10 | Baby R@10 | Baby N@10 | Baby P@10 | Baby M@10 | Sports R@10 | Sports N@10 | Sports P@10 | Sports M@10 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SFELMMRK | 0.059 4 | 0.030 6 | 0.005 9 | 0.022 0 | 0.065 8 | 0.034 7 | 0.007 2 | 0.024 5 | 0.073 9 | 0.039 9 | 0.008 1 | 0.028 6 |
SFELMMRT | 0.064 8 | 0.032 4 | 0.006 5 | 0.022 8 | 0.065 3 | 0.035 3 | 0.007 2 | 0.025 3 | 0.074 6 | 0.040 7 | ||
SFELMMRCMA | 0.066 1 | 0.032 9 | 0.005 2 | 0.023 0 | 0.067 0 | 0.035 5 | 0.025 2 | 0.074 5 | 0.040 1 | 0.028 8 | ||
SFELMMRFE | 0.066 6 | 0.034 4 | 0.0067 | 0.024 7 | 0.066 8 | 0.035 6 | 0.025 2 | 0.075 2 | 0.040 7 | 0.029 3 | ||
SFELMMRFE-C | 0.066 8 | 0.0354 | 0.0249 | 0.067 8 | 0.0075 | 0.0256 | 0.075 4 | 0.0083 | 0.029 3 | |||
SFELMMRGP | 0.066 0 | 0.066 1 | 0.035 6 | 0.007 3 | 0.075 0 | 0.040 7 | ||||||
SFELMMRGP-ProJH | 0.034 4 | 0.0067 | 0.024 7 | 0.0361 | 0.040 6 | 0.0083 | 0.029 1 | |||||
SFELMMR | 0.0673 | 0.034 6 | 0.0067 | 0.0682 | 0.0361 | 0.0075 | 0.0256 | 0.0757 | 0.0412 | 0.0083 | 0.0298 |
Tab. 3 Ablation study results