Journal of Computer Applications
赵海华1,胡怡君1,唐瑞2,莫先1
Abstract: Multimodal recommendation aims to enhance user and item feature representations by integrating multimodal information, thereby improving recommendation performance. Existing methods still face challenges such as insufficient cross-modal semantic fusion, redundant multimodal features, and noise interference. To address these issues, a multimodal recommendation method based on semantic fusion and contrastive enhancement was proposed. First, a cross-modal semantic consistency enhancement framework was designed, in which a global correlation graph was constructed through a multimodal semantic feature filtering mechanism to dynamically aggregate common multimodal features while suppressing noise propagation; meanwhile, a multi-granularity attribute disentanglement module was introduced to separate coarse-grained common features from user behavior-driven fine-grained features within modal representations, thereby mitigating feature redundancy. Second, a multi-level contrastive learning paradigm was proposed that jointly performs four tasks: cross-modal consistency alignment, user behavior similarity modeling, item semantic relevance constraint, and explicit-latent feature mutual information maximization, so that representation discriminability was strengthened through contrastive learning. Furthermore, a graph perturbation enhancement strategy was incorporated, which employs noise injection and dual contrastive regularization to improve the model's robustness to sparse data and noise interference. Experimental results on the Amazon-Baby, Amazon-Sports, and Amazon-Clothing datasets demonstrate that the proposed method outperforms all baseline models on both R@20 and N@20, with particularly notable gains in sparse scenarios. Ablation studies further validate the effectiveness of the proposed method.
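The abstract gives no formulas or code; as a rough illustration of two of its ingredients, the sketch below shows a generic InfoNCE-style contrastive alignment loss and a SimGCL-style noise-injection perturbation in PyTorch. This is a minimal sketch assuming the paper's contrastive tasks and graph perturbation follow these standard formulations; all tensor names, shapes, and hyperparameters (temperature, eps) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn.functional as F


def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.2) -> torch.Tensor:
    """InfoNCE loss: row i of `anchor` is pulled toward row i of `positive`
    and pushed away from every other row (in-batch negatives)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature          # (B, B) cosine-similarity logits
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)


def perturb(emb: torch.Tensor, eps: float = 0.1) -> torch.Tensor:
    """Noise-injection augmentation: add a small random offset aligned with the
    sign of each embedding (SimGCL-style perturbation of graph representations)."""
    noise = F.normalize(torch.rand_like(emb), dim=-1) * torch.sign(emb)
    return emb + eps * noise


# Toy usage with hypothetical visual/textual item embeddings and user embeddings.
B, d = 256, 64
visual = torch.randn(B, d, requires_grad=True)    # e.g. image-modality item features
textual = torch.randn(B, d, requires_grad=True)   # e.g. text-modality item features
user_emb = torch.randn(B, d, requires_grad=True)  # user embeddings from the interaction graph

loss = (
    info_nce(visual, textual)                         # cross-modal consistency alignment
    + info_nce(perturb(user_emb), perturb(user_emb))  # contrast between two perturbed views
)
loss.backward()
```

Under these assumptions, the remaining contrastive tasks named in the abstract (user behavior similarity modeling, item semantic relevance constraint, and explicit-latent mutual information maximization) would reuse the same InfoNCE form, only applied to different pairs of views.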
Key words: recommender system, multimodal, contrastive learning, semantic fusion, feature disentanglement
CLC Number: TP183
赵海华, 胡怡君, 唐瑞, 莫先. Multimodal recommendation method based on semantic fusion and contrastive enhancement [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025050528.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025050528