Multimodal recommendation method based on semantic fusion and contrast enhancement
Haihua ZHAO, Yijun HU, Rui TANG, Xian MO
Journal of Computer Applications    2026, 46 (4): 1058-1068.   DOI: 10.11772/j.issn.1001-9081.2025050528

Multimodal recommendation aims to enhance user and item feature representations by integrating multimodal information, thereby improving recommendation performance. However, existing methods still face challenges including insufficient cross-modal semantic fusion, redundant multimodal features, and noise interference. To address these issues, a multimodal Recommendation method based on Semantic Fusion and Contrast Enhancement (SFCERec) was proposed. Firstly, a cross-modal semantic consistency enhancement framework was designed, in which a global correlation graph was constructed through a multimodal semantic feature filtering mechanism, so as to dynamically aggregate common multimodal features while suppressing noise propagation. Concurrently, a multi-granularity attribute disentanglement module was introduced to separate the modal features into coarse-grained common features and user behavior-driven fine-grained features, so as to mitigate feature redundancy. Secondly, a multi-level contrastive learning paradigm was proposed that combines four tasks: cross-modal consistency alignment, user behavior similarity modeling, item semantic relevance constraint, and explicit-latent feature mutual information maximization, thereby enhancing representation discriminability. Finally, a graph perturbation enhancement strategy was incorporated, which employs noise injection and dual contrastive regularization to improve model robustness against sparse data and noise interference. Experimental results on the Amazon-Baby, Amazon-Sports, and Amazon-Clothing datasets demonstrate that the proposed method outperforms all baseline models on both Recall@20 and NDCG@20, particularly in sparse scenarios. Ablation studies further validate the effectiveness of the proposed method.
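
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Recall@K and NDCG@K are commonly computed for a single user under binary relevance. It is not taken from the paper: the function names `recall_at_k` and `ndcg_at_k`, the toy item IDs, and the assumption that the ranked list contains no duplicates are all illustrative, and the authors' exact evaluation protocol may differ.

```python
import math

def recall_at_k(ranked_items, relevant_items, k=20):
    """Fraction of a user's held-out relevant items found in the top-k list."""
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked_items, relevant_items, k=20):
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant)
    # The ideal ranking places all relevant items (at most k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg

# Hypothetical toy example: one user, four ranked items, three held-out items.
ranked = ["a", "b", "c", "d"]
held_out = {"b", "d", "z"}
recall = recall_at_k(ranked, held_out, k=3)
ndcg = ndcg_at_k(ranked, held_out, k=3)
```

In this toy case only item `b` appears in the top 3, so one of three held-out items is recovered. Reported scores are typically the average of these per-user values over all test users.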
