Journal of Computer Applications


Label reconstruction-based financial event extraction model with large language models

  

  • Received:2025-12-11 Revised:2026-03-05 Online:2026-03-24 Published:2026-03-24


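Yang Wei, Cai Zhijie

  1. Qinghai Normal University
  • Corresponding author: Cai Zhijie
  • Funding:
    National Natural Science Foundation of China; Qinghai Provincial Key Laboratory of Tibetan Information Processing and Machine Translation; Key Laboratory of Tibetan Information Processing, Ministry of Education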

Abstract: Event extraction aims to automatically identify event triggers, event types, arguments, and argument roles in unstructured or semi-structured text and to convert them into structured representations. Financial event texts feature complex terminology, information scattered across sentences, and many implicit expressions. As a result, traditional classification-based event extraction methods rely heavily on manual annotation, recognize complex events poorly, and are difficult to train and tune because their subtasks are chained in a pipeline. To address these issues, a Label Reconstruction-based Financial Event Extraction model (LREE) built upon a Large Language Model (LLM) was proposed. First, financial event labels were reconstructed into event record sequences that better align with the generative objective of LLMs, thereby transforming the classification-based event extraction task into a generative task. Then, a Transformer-based LLM was adopted as the backbone and fine-tuned with instruction tuning and Low-Rank Adaptation (LoRA). Finally, during generative decoding, event types, triggers, arguments, and roles were identified jointly within a single generated sequence. Experimental results on ChFinAnn show that LREE built upon Qwen-3-4B achieves an F1 score of 89.9% on the test set, which is 60.0 percentage points higher than the original Qwen-3-4B without fine-tuning and 9.6 percentage points higher than DEEM-PT (Document Level Financial Event Extraction Method Guided by Prompt Template). These results verify the effectiveness of the label reconstruction strategy in improving structured event generation for financial texts.
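
The label reconstruction step can be illustrated with a minimal sketch. The abstract does not specify the actual record format used by LREE, so everything below is an assumption made for illustration: the `Event` schema, the `key=value` fields joined by `; `, the ` | ` separator between records, and the sample event. The sketch shows only the general idea of turning structured labels into one target string for generative training and parsing a generated string back into structured events.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One structured event annotation (hypothetical schema, not from the paper)."""
    event_type: str
    trigger: str
    arguments: dict[str, str] = field(default_factory=dict)  # role -> argument


def linearize(events: list[Event]) -> str:
    """Turn structured labels into a single target string (assumed format)."""
    records = []
    for ev in events:
        parts = [f"type={ev.event_type}", f"trigger={ev.trigger}"]
        parts += [f"{role}={arg}" for role, arg in ev.arguments.items()]
        records.append("; ".join(parts))
    return " | ".join(records)


def parse(sequence: str) -> list[Event]:
    """Recover events from a generated string, skipping malformed pieces."""
    events = []
    for record in filter(None, (r.strip() for r in sequence.split(" | "))):
        fields = {}
        for item in record.split("; "):
            key, sep, value = item.partition("=")
            if sep:  # ignore fragments that are not key=value
                fields[key.strip()] = value.strip()
        if "type" not in fields:
            continue  # a record without an event type is unusable
        events.append(Event(
            event_type=fields.pop("type"),
            trigger=fields.pop("trigger", ""),
            arguments=fields,  # whatever remains is role -> argument
        ))
    return events


# Illustrative sample; the event type and roles are placeholders.
sample = [Event("EquityPledge", "pledged",
                {"Pledger": "Company A", "PledgedShares": "1,000,000"})]
target = linearize(sample)
assert parse(target) == sample  # the round trip preserves the labels
```

Because type, trigger, and role-argument pairs all sit in one string, a single decoding pass yields all of them together, which is the sense in which a generative formulation avoids chaining separate classifiers. A real format would also need to escape separator characters that occur inside argument text, which this sketch does not handle.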

Key words: Natural Language Processing (NLP), financial event extraction, label reconstruction, Large Language Models (LLMs), generative event extraction

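Abstract (translated from the Chinese): Event extraction aims to automatically identify event triggers, event types, arguments, and argument roles in unstructured or semi-structured text and to convert them into structured representations. Financial event texts contain complex terminology, information distributed across sentences, and much implicit information, which leaves traditional classification-based event extraction heavily dependent on annotation, weak at recognizing complex events, and hard to train and tune as a chain of subtasks. To address these problems, a label reconstruction-based financial event extraction model (LREE) built on a large language model (LLM) is proposed. First, financial event labels are reconstructed into event record sequences that better fit the generative objective of LLMs, converting classification-based event extraction into a generative task. Second, a Transformer-based LLM serves as the backbone and is fine-tuned by combining instruction tuning with LoRA (Low-Rank Adaptation). Finally, during generative decoding, the model jointly identifies event types, triggers, arguments, and roles through sequence generation. Experimental results on ChFinAnn, a general-purpose dataset for financial event extraction, show that LREE built on Qwen-3-4B reaches an F1 score of 89.9% on the test set, 60 percentage points higher than Qwen-3-4B without fine-tuning and 9.6 percentage points higher than the traditional small-parameter model DEEM-PT (Document Level Financial Event Extraction Method Guided by Prompt Template). The model markedly improves the accuracy of financial event extraction, overcomes the recognition limits of traditional methods on complex financial texts, and offers a high-performance, easy-to-deploy solution for event extraction in the financial domain.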
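
The reported gains imply two baseline scores that the abstract leaves implicit: 89.9 - 60.0 = 29.9 for Qwen-3-4B without fine-tuning and 89.9 - 9.6 = 80.3 for DEEM-PT. The abstract does not say how F1 is computed. The sketch below assumes a micro-averaged F1 over (event type, role, argument) tuples per document, which is one common choice for this kind of task; the function name and the sample data are hypothetical.

```python
def micro_f1(gold: list[set[tuple]], pred: list[set[tuple]]) -> float:
    """Micro-averaged F1 over per-document sets of extracted tuples (assumed metric)."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0  # also covers empty predictions or empty gold labels
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# One document, two gold tuples, one of which is predicted correctly.
gold = [{("EquityPledge", "Pledger", "Company A"),
         ("EquityPledge", "PledgedShares", "1,000,000")}]
pred = [{("EquityPledge", "Pledger", "Company A"),
         ("EquityPledge", "PledgedShares", "2,000,000")}]
score = micro_f1(gold, pred)  # precision = recall = 0.5

# Baselines implied by the reported improvements (percentage points).
lree_f1 = 89.9
qwen_base_f1 = round(lree_f1 - 60.0, 1)
deem_pt_f1 = round(lree_f1 - 9.6, 1)
```

Under this scoring, a tuple counts as correct only when its type, role, and argument text all match exactly, so a single wrong digit in an amount costs both precision and recall.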

