Multimodal fact verification with cross-modal semantic association
Huanxian LIU, Hongtao WANG, Xian’ao WANG, Hongmei WANG, Weifeng XU
Journal of Computer Applications    2026, 46 (4): 1069-1076.   DOI: 10.11772/j.issn.1001-9081.2025050526

To address semantic differences between multimodal pieces of evidence, and between claims and evidence, during feature fusion, a Multimodal Fact Verification (MFV) method based on Cross-Modal Semantic Association (CMSA) was proposed to realize cross-level semantic alignment and adaptive feature interaction, thereby bridging semantic gaps across multi-source information and enhancing classification performance on complex claim verification. During evidence retrieval, relevant textual evidence was first retrieved using the claim text, and semantically related image evidence was then filtered using that textual evidence, ensuring high cross-modal relevance. During claim verification, semantic alignment between text and multimodal evidence was achieved with the CLIP (Contrastive Language-Image Pre-training) model, and a Linked Claim and Evidence Attention (LCEA) module was designed to reinforce the semantic associations among the claim text, textual evidence, and image evidence. Experimental results show that CMSA improves the F1 score by at least 7.27% and 6.65% over the MOCHEG model on a public dataset and the self-constructed CEAD (Cross-modal Evidence Augmented Dataset), respectively, demonstrating its effectiveness on MFV tasks.
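The evidence-filtering stage described above (keeping only image evidence semantically related to the retrieved textual evidence) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the embeddings, dimensions, and function names below are purely illustrative, standing in for CLIP-style encoder outputs, and the filter is a simple top-k cosine-similarity selection.

```python
import numpy as np

def cosine_sim(query, candidates):
    # Cosine similarity between one embedding and each row of a matrix.
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return c @ q

def filter_evidence(claim_emb, evidence_embs, top_k=2):
    # Keep the top_k evidence items most similar to the claim,
    # mimicking the cross-modal relevance filtering described above.
    sims = cosine_sim(claim_emb, evidence_embs)
    order = np.argsort(-sims)[:top_k]
    return order, sims[order]

# Toy 4-dimensional "CLIP-style" embeddings (illustrative only).
claim = np.array([1.0, 0.0, 1.0, 0.0])
image_evidence = np.array([
    [1.0, 0.1, 0.9, 0.0],   # closely related to the claim
    [0.0, 1.0, 0.0, 1.0],   # unrelated
    [0.8, 0.0, 1.1, 0.2],   # related
])

kept, scores = filter_evidence(claim, image_evidence, top_k=2)
print(kept)  # indices of the two most claim-relevant images
```

In the actual method, both modalities would first be projected into a shared space by CLIP's text and image encoders before this kind of similarity-based selection is applied.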
