To improve the alignment accuracy and fusion efficiency between features of different modalities in multimodal event extraction, and to enhance the model's understanding of the semantic relationships between images and text, a multimodal event extraction model based on a dual-channel "text-image" gated feature fusion mechanism, named MEE-DF (Multimodal Event Extraction based on Dual-channel Fusion), was proposed. Firstly, a channel for generating text descriptions from images was added, so that event arguments implicitly contained in the images were mined and the information representation for event extraction was enriched. Secondly, a Locality Constrained Cross Attention (LCCA) mechanism was constructed, in which geometric alignment graphs were generated to embed image information and highly discriminative image features were extracted. Thirdly, an adversarial gating mechanism based on interactive attention maps was built to achieve fine-grained alignment between text entities and image objects. Finally, a dual-channel feature fusion strategy was used to select important patch features, remove redundant information, and improve feature integration efficiency. Experimental results on the public MEED and M2E2 datasets show that MEE-DF achieves F1 scores of 90.9% and 88.8%, respectively, on the event type detection task, and F1 scores of 73.3% and 68.1%, respectively, on the Event Argument Extraction (EAE) task, indicating that MEE-DF outperforms existing event extraction models. Ablation experiments further demonstrate that each module of the proposed model contributes significantly to the improvement of event extraction performance.
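To make the gated fusion idea concrete, the following is a minimal PyTorch sketch of a dual-channel "text-image" gated fusion layer in the spirit of the abstract: a text-conditioned attention over image patches stands in for LCCA (the geometric alignment graphs are omitted), and a sigmoid gate mixes the visual channel with the generated-caption channel. All names (`GatedDualChannelFusion`, `d_model`, the toy shapes) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GatedDualChannelFusion(nn.Module):
    """Fuse text features with two image-derived channels via a learned gate.

    Channel 1: visual patch features (e.g., from a ViT encoder).
    Channel 2: features of a text description generated from the image.
    A sigmoid gate decides, per dimension, how much of each channel to keep.
    """
    def __init__(self, d_model: int = 768):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)  # gate from both channels
        self.proj = nn.Linear(d_model, d_model)      # post-fusion projection

    def forward(self, text_feat, patch_feat, caption_feat):
        # Text-conditioned attention over image patches: a simplified
        # stand-in for the paper's LCCA, which additionally uses
        # geometric alignment graphs to constrain the attention.
        attn = torch.softmax(text_feat @ patch_feat.transpose(-2, -1), dim=-1)
        visual_ctx = attn @ patch_feat  # text-aligned visual context

        # Gate between the visual channel and the generated-caption channel.
        g = torch.sigmoid(self.gate(torch.cat([visual_ctx, caption_feat], dim=-1)))
        fused = g * visual_ctx + (1.0 - g) * caption_feat
        return self.proj(fused + text_feat)  # residual with text features

# Toy usage: a batch of 2 sentences (10 tokens each) and 49 image patches.
fusion = GatedDualChannelFusion(d_model=768)
text = torch.randn(2, 10, 768)
patches = torch.randn(2, 49, 768)
caption = torch.randn(2, 10, 768)
out = fusion(text, patches, caption)
print(out.shape)  # torch.Size([2, 10, 768])
```

The per-dimension gate lets the model suppress redundant patch information when the generated caption already covers it, which matches the stated goal of filtering important patch features before fusion.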