Multimodal fact verification with cross-modal semantic association
Huanxian LIU, Hongtao WANG, Xian’ao WANG, Hongmei WANG, Weifeng XU
Journal of Computer Applications    2026, 46 (4): 1069-1076.   DOI: 10.11772/j.issn.1001-9081.2025050526

To address semantic differences between multimodal pieces of evidence, and between claims and evidence, during feature fusion, a Multimodal Fact Verification (MFV) method based on Cross-Modal Semantic Association (CMSA) was proposed to realize cross-level semantic alignment and adaptive feature interaction, thereby bridging semantic gaps across multi-source information and enhancing classification performance on complex claim verification. During evidence retrieval, relevant textual evidence was first retrieved using the claim text, and semantically related image evidence was then filtered using that textual evidence, ensuring high cross-modal relevance. During claim verification, semantic alignment between text and multimodal evidence was achieved with the CLIP (Contrastive Language-Image Pre-training) model, and a Linked Claim and Evidence Attention (LCEA) module was designed to reinforce the semantic associations among the claim text, textual evidence, and image evidence. Experimental results show that CMSA improves the F1 score by at least 7.27% and 6.65% over the MOCHEG model on a public dataset and the self-constructed CEAD (Cross-modal Evidence Augmented Dataset), respectively, demonstrating its effectiveness on MFV tasks.
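The evidence-filtering stage described above (keeping only image evidence semantically related to the retrieved textual evidence) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the embeddings, dimensions, and function names below are purely illustrative, standing in for CLIP-style encoder outputs, and the filter is a simple top-k cosine-similarity selection.

```python
import numpy as np

def cosine_sim(query, candidates):
    # Cosine similarity between one embedding and each row of a matrix.
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return c @ q

def filter_evidence(claim_emb, evidence_embs, top_k=2):
    # Keep the top_k evidence items most similar to the claim,
    # mimicking the cross-modal relevance filtering described above.
    sims = cosine_sim(claim_emb, evidence_embs)
    order = np.argsort(-sims)[:top_k]
    return order, sims[order]

# Toy 4-dimensional "CLIP-style" embeddings (illustrative only).
claim = np.array([1.0, 0.0, 1.0, 0.0])
image_evidence = np.array([
    [1.0, 0.1, 0.9, 0.0],   # closely related to the claim
    [0.0, 1.0, 0.0, 1.0],   # unrelated
    [0.8, 0.0, 1.1, 0.2],   # related
])

kept, scores = filter_evidence(claim, image_evidence, top_k=2)
print(kept)  # indices of the two most claim-relevant images
```

In the actual method, both modalities would first be projected into a shared space by CLIP's text and image encoders before this kind of similarity-based selection is applied.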
