Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (9): 2783-2789.DOI: 10.11772/j.issn.1001-9081.2024091393

• Artificial intelligence •

Judgment document summarization method combining large language model and dynamic prompts

Binbin ZHANG1,2,3, Yongbin QIN1,2,3, Ruizhang HUANG1,2,3, Yanping CHEN1,2,3

  1. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
    2. State Key Laboratory of Public Big Data (Guizhou University), Guiyang, Guizhou 550025, China
    3. Engineering Research Center of Text Computing and Cognitive Intelligence, Ministry of Education (Guizhou University), Guiyang, Guizhou 550025, China
  • Received:2024-10-07 Revised:2025-01-08 Accepted:2025-01-16 Online:2025-03-21 Published:2025-09-10
  • Contact: Yongbin QIN
  • About author: ZHANG Binbin, born in 1999, M.S. candidate. His research interests include natural language processing and judicial summarization.
    HUANG Ruizhang, born in 1979, Ph.D., professor. Her research interests include big data, data mining, and information extraction.
    CHEN Yanping, born in 1980, Ph.D., professor. His research interests include artificial intelligence and natural language processing.
  • Supported by:
    National Key Research and Development Program of China (2023YFC3304500); Key Project of Science and Technology Foundation of Guizhou Province ([2024]003)


Abstract:

Judgment documents have complex case structures, redundant case facts, and widely distributed case details, which make it difficult for existing Large Language Models (LLMs) to focus on structural information effectively and can lead them to generate factual errors, resulting in missing structural information and factual inconsistency. To address this, a judgment document summarization method combining LLMs and dynamic prompts, named DPCM (Dynamic Prompt Correction Method), was proposed. Firstly, an LLM performed one-shot learning to generate a judgment document summary. Secondly, the high-dimensional similarity between the original text and the summary was computed to detect possible missing-structure or factual-inconsistency problems in the summary. If a problem was found, the faulty summary was concatenated with the original text, corrective prompt words were added, one-shot learning was performed again to generate a corrected summary, and the similarity check was repeated; if the problem persisted, this generate-and-check process was repeated. Finally, through this iteration, the prompt words were adjusted dynamically to optimize the generated summary gradually. Experimental results on the CAIL2020 public judicial summarization dataset show that, compared with methods such as Least-To-Most Prompting, Zero-Shot Reasoners, and Self-Consistency CoT, the proposed method achieves improvements on the ROUGE-1, ROUGE-2, ROUGE-L, BERTScore, and FactCC (Factual Consistency) metrics.
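The generate-check-correct loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `call_llm` is a hypothetical placeholder for a real LLM API, and the character-bigram `embed` function merely stands in for the high-dimensional sentence encoder the paper assumes; the threshold and round limit are invented for the example.

```python
import math
from collections import Counter

def embed(text):
    # Placeholder embedding: character-bigram counts stand in for a real
    # high-dimensional sentence encoder.
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[k] * v[k] for k in u if k in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def dpcm_summarize(document, call_llm, threshold=0.5, max_rounds=3):
    """Generate a summary, check source-summary similarity, and, when the
    check fails, re-prompt with the faulty summary concatenated to the
    source plus a corrective instruction (the 'dynamic prompt')."""
    summary = call_llm(f"Summarize the following judgment document:\n{document}")
    for _ in range(max_rounds):
        if cosine(embed(document), embed(summary)) >= threshold:
            break  # summary passes the similarity check
        # Dynamic prompt: splice the faulty summary with the original text
        # and add corrective prompt words, then regenerate.
        prompt = ("The summary below misses structural information or is "
                  "factually inconsistent with the document. Correct it.\n"
                  f"Document:\n{document}\nFaulty summary:\n{summary}")
        summary = call_llm(prompt)
    return summary
```

In the paper the acceptance test is a learned similarity over embeddings rather than this toy cosine over bigrams, but the control flow — generate, detect, splice, regenerate until the check passes or a round budget is exhausted — is the same.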

Key words: Large Language Model (LLM), dynamic prompt, judgment document summary, missing structure, factual inconsistency

