Legal case retrieval method integrating temporal behavior chain and event type

doi:10.11772/j.issn.1001-9081.2024070917

Journal of Computer Applications, 2025, Vol. 45, Issue 6: 1741-1747.

CCF BigData 2024

Lilin ZHAN^1,2,3, Yongbin QIN^1,2,3, Ruizhang HUANG^1,2,3, Hua WANG^1,2,3, Yanping CHEN^1,2,3

^1. Text Computing and Cognitive Intelligence Engineering Research Center of National Education Ministry (Guizhou University), Guiyang Guizhou 550025, China
^2. State Key Laboratory of Public Big Data (Guizhou University), Guiyang Guizhou 550025, China
^3. College of Computer Science and Technology, Guizhou University, Guiyang Guizhou 550025, China

Received: 2024-06-29 Revised: 2024-07-25 Accepted: 2024-08-02 Online: 2024-08-22 Published: 2025-06-10
Contact: Yongbin QIN
About the authors:
ZHAN Lilin (詹力林), born in 2002 in Panzhou, Guizhou, M.S. candidate, CCF member. His research interests include natural language processing and information retrieval.
QIN Yongbin (秦永彬), born in 1980 in Yantai, Shandong, Ph.D., professor, CCF senior member. His research interests include big data management and application, and multi-source data fusion. Email: ybqin@gzu.edu.cn
HUANG Ruizhang (黄瑞章), born in 1979 in Tianjin, Ph.D., professor, CCF member. Her research interests include big data, data mining, and information extraction.
WANG Hua (王华), born in 1981 in Duyun, Guizhou, Ph.D. candidate, CCF member. His research interests include information retrieval and data mining.
CHEN Yanping (陈艳平), born in 1980 in Changshun, Guizhou, Ph.D., professor, CCF member. His research interests include artificial intelligence and natural language processing.
Supported by:
National Natural Science Foundation of China (62066008); Key Project of Science and Technology Foundation of Guizhou Province ([2024]003)

Chinese title: 融合时序行为链与事件类型的类案检索方法

Abstract

To address the problem that existing Legal Case Retrieval (LCR) methods make little effective use of case elements and are easily misled by similarity in the semantic structure of case content, an LCR method integrating temporal behavior chain and event type was proposed. Firstly, sequence labeling was used to identify the legal event types in the case description, and a temporal behavior chain was constructed from the behavioral elements in the case text. This highlights the key elements of a case, so that the model focuses on its core content rather than on surface semantic structure. Secondly, a similarity vector representation matrix of the temporal behavior chain was constructed by segmented coding to enhance the semantic interaction of behavioral elements between cases. Finally, an aggregation scorer measured case relevance from three perspectives (temporal behavior chain, legal event type, and crime type) to make the case matching score more reasonable. Experimental results on LeCaRD (Legal Case Retrieval Dataset) show that, compared with the SAILER (Structure-Aware pre-traIned language model for LEgal case Retrieval) method, the proposed method improves P@5 by 4 percentage points, P@10 by 3 percentage points, MAP by 4 percentage points, and NDCG@30 by 0.8 percentage points. The method therefore uses case elements effectively to avoid interference from similar semantic structure in case content, and can provide a reliable basis for LCR.

Key words: case element, behavioral element, event type, temporal behavior chain, aggregation scorer

CLC Number: TP391.1

Lilin ZHAN, Yongbin QIN, Ruizhang HUANG, Hua WANG, Yanping CHEN. Legal case retrieval method integrating temporal behavior chain and event type[J]. Journal of Computer Applications, 2025, 45(6): 1741-1747.

Figures/Tables 9

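Tab. 1 Examples of temporal behavior chain and event type

Case description (excerpt)	Temporal behavior chain	Event types
The defendant urgently needed money to buy a house, so he threatened the plaintiff with a knife into handing over his wallet ... (918 characters) Afterwards, he beat the plaintiff, causing injury.	hold → threaten → hand over → beat → injured	{carrying a weapon/gun, threat/coercion, bodily harm, injury}
The defendant urgently needed money to buy a house, so he sneaked into the plaintiff's home to steal his wallet ... (908 characters) Afterwards, he was discovered by the plaintiff.	sneak in → steal → discover	{home intrusion, theft of property}
The defendant abducted the plaintiff in an alley, then stabbed him with a knife, injuring the plaintiff, who is currently hospitalized ... (1 608 characters)	abduct → hold → stab → injured	{kidnapping, carrying a weapon/gun, bodily harm, injury}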
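The chains in Tab. 1 list behavior words in the order they occur in the case text. A minimal sketch of that ordering step follows, assuming a tagger has already marked behavior tokens. The `(token, tag)` input format, the tag name `BEHAVIOR`, and both function names are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical tagger output: (token, tag) pairs in textual order.
# The tag name "BEHAVIOR" is an assumption made for this sketch.
Tagged = List[Tuple[str, str]]


def build_behavior_chain(tagged: Tagged, behavior_tag: str = "BEHAVIOR") -> List[str]:
    """Keep only behavior tokens, preserving the order in which they appear."""
    return [token for token, tag in tagged if tag == behavior_tag]


def format_chain(chain: List[str]) -> str:
    """Render a chain with arrows, in the style of Tab. 1."""
    return "→".join(chain)


# Tokens loosely based on the first example of Tab. 1.
tagged_case: Tagged = [
    ("被告", "O"), ("持", "BEHAVIOR"), ("刀", "O"), ("威胁", "BEHAVIOR"),
    ("原告", "O"), ("交出", "BEHAVIOR"), ("钱包", "O"),
    ("殴打", "BEHAVIOR"), ("受伤", "BEHAVIOR"),
]

chain = build_behavior_chain(tagged_case)
print(format_chain(chain))
```

Because the chain keeps textual order, two cases with the same verbs in a different sequence would yield different chains.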

Fig. 1 Overall framework of proposed method

Fig. 2 Construction of temporal behavior chain

Tab. 2 Experimental parameters setting

Parameter	Value	Parameter	Value
Batch size	1	weight_decay	0.01
Learning rate	3×10^-5	epoch	500
Maximum input length	510	Behavior chain segment length	254
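Tab. 2 sets the behavior chain segment length to 254 and the maximum input length to 510, so two segments (2 × 254 = 508) fit within one input. The sketch below illustrates one way such segmentation could work, assuming each query segment is paired with each candidate segment. The pairing scheme and the function names are assumptions for illustration, not the paper's stated procedure.

```python
from typing import List, Sequence, Tuple

SEGMENT_LEN = 254    # behavior chain segment length (Tab. 2)
MAX_INPUT_LEN = 510  # maximum input length (Tab. 2)


def split_segments(tokens: Sequence[str], size: int = SEGMENT_LEN) -> List[List[str]]:
    """Cut a token sequence into consecutive segments of at most `size` tokens."""
    return [list(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def pair_segments(
    query: Sequence[str], candidate: Sequence[str]
) -> List[Tuple[List[str], List[str]]]:
    """Pair every query segment with every candidate segment (assumed scheme)."""
    pairs = [(q, c) for q in split_segments(query) for c in split_segments(candidate)]
    # Each pair must fit into a single encoder input.
    assert all(len(q) + len(c) <= MAX_INPUT_LEN for q, c in pairs)
    return pairs


query_tokens = ["q"] * 300   # splits into segments of 254 and 46 tokens
cand_tokens = ["c"] * 600    # splits into segments of 254, 254 and 92 tokens
pairs = pair_segments(query_tokens, cand_tokens)
print(len(pairs))  # 2 query segments x 3 candidate segments
```

Under this assumption, the scores of all segment pairs would form a matrix with one row per query segment and one column per candidate segment.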

Tab. 3 Comparison of LCR experimental results

Method	P@5	P@10	MAP	NDCG@10	NDCG@20	NDCG@30
BM25	0.30	0.29	0.37	0.666	0.748	0.857
BERT	0.31	0.33	0.41	0.736	0.794	0.868
BERT-Crime	0.43	0.39	0.56	0.772	0.817	0.880
Lawformer	0.46	0.40	0.48	0.768	0.819	0.909
BERT-PLI	0.32	0.36	0.44	0.743	0.807	0.891
BERT-LF	0.49	0.45	0.59	0.816	0.864	0.919
SAILER	0.46	0.44	0.56	0.839	0.880	0.924
Proposed method	0.50	0.47	0.60	0.842	0.882	0.932
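For readers unfamiliar with the metrics in Tab. 3, the following sketch computes P@k, average precision (MAP is its mean over queries), and NDCG@k for one ranked list of graded relevance labels. The relevance threshold of 2 and the exponential gain 2^rel - 1 are assumptions of this sketch; the page does not state which variants the authors used.

```python
import math
from typing import List


def precision_at_k(rels: List[int], k: int, threshold: int = 2) -> float:
    """Fraction of the top-k results whose label reaches the threshold."""
    return sum(r >= threshold for r in rels[:k]) / k


def average_precision(rels: List[int], threshold: int = 2) -> float:
    """Mean of precision values at each relevant rank in this list.

    Normalizes by the relevant items present in the list, not in the
    whole collection (a simplification made for this sketch).
    """
    hits, total = 0, 0.0
    for rank, r in enumerate(rels, start=1):
        if r >= threshold:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg_at_k(rels: List[int], k: int) -> float:
    """NDCG@k with gain 2^rel - 1 and a log2 rank discount."""
    def dcg(xs: List[int]) -> float:
        return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(xs[:k]))

    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0


# Hypothetical graded labels (0 to 3) of one ranked candidate list.
ranked = [3, 0, 2, 1, 0]
print(precision_at_k(ranked, 5))
print(average_precision(ranked))
print(ndcg_at_k(ranked, 5))
```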

Tab. 4 Ablation experimental results

Method	P@5	P@10	MAP	NDCG@10	NDCG@20	NDCG@30
w/o temporal behavior chain	0.49	0.44	0.54	0.822	0.877	0.921
w/o event type	0.49	0.45	0.55	0.835	0.872	0.922
w/o temporal behavior chain and event type	0.42	0.43	0.49	0.820	0.830	0.910
w/o segmented coding	0.44	0.42	0.54	0.826	0.882	0.929
Proposed method	0.50	0.47	0.60	0.842	0.882	0.932
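The effect of each ablation can be read as a drop in percentage points relative to the full method. The short sketch below does that arithmetic on the P@5, P@10, and MAP columns of Tab. 4. The dictionary keys and the helper name are illustrative labels chosen for this sketch.

```python
from typing import Dict

Metrics = Dict[str, float]

# Values copied from Tab. 4 (P@5, P@10, MAP only).
full: Metrics = {"P@5": 0.50, "P@10": 0.47, "MAP": 0.60}
ablations: Dict[str, Metrics] = {
    "w/o temporal behavior chain": {"P@5": 0.49, "P@10": 0.44, "MAP": 0.54},
    "w/o event type": {"P@5": 0.49, "P@10": 0.45, "MAP": 0.55},
    "w/o both": {"P@5": 0.42, "P@10": 0.43, "MAP": 0.49},
    "w/o segmented coding": {"P@5": 0.44, "P@10": 0.42, "MAP": 0.54},
}


def drop_in_points(reference: Metrics, variant: Metrics) -> Metrics:
    """Difference to the reference in percentage points, rounded to 0.1."""
    return {m: round((reference[m] - variant[m]) * 100, 1) for m in reference}


drops = {name: drop_in_points(full, scores) for name, scores in ablations.items()}
for name, d in drops.items():
    print(f"{name}: {d}")
```

On these figures, removing both the temporal behavior chain and the event type costs 11 points of MAP, more than removing either one alone (6 and 5 points).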

Tab. 5 Parameter analysis experimental results

$α$	$β$	$θ$	P@5	P@10	MAP	NDCG@10	NDCG@20	NDCG@30
0.1	0.1	0.8	0.50	0.45	0.59	0.835	0.882	0.932
0.1	0.2	0.7	0.47	0.45	0.58	0.840	0.878	0.931
0.1	0.3	0.6	0.50	0.47	0.60	0.842	0.882	0.932
0.1	0.4	0.5	0.48	0.45	0.57	0.819	0.876	0.928
0.2	0.6	0.2	0.44	0.45	0.55	0.846	0.884	0.930
0.2	0.5	0.3	0.45	0.45	0.55	0.845	0.880	0.929
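In every row of Tab. 5 the three weights sum to 1, which is consistent with a weighted combination of the three relevance views named in the abstract. The sketch below assumes a linear weighted sum. Which weight belongs to which view is not stated on this page, so the mapping used here (α to behavior chain, β to event type, θ to crime type) is purely an assumption, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewScores:
    """Per-view similarity scores for one query-candidate pair (hypothetical)."""
    chain: float   # temporal behavior chain similarity
    event: float   # legal event type similarity
    crime: float   # crime type similarity


def aggregate(s: ViewScores, alpha: float = 0.1, beta: float = 0.3,
              theta: float = 0.6) -> float:
    """Linear weighted sum of the three views; the weights must sum to 1.

    Defaults follow the best-MAP row of Tab. 5; the weight-to-view
    mapping is an assumption of this sketch.
    """
    if abs(alpha + beta + theta - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return alpha * s.chain + beta * s.event + theta * s.crime


candidates = {
    "case_a": ViewScores(chain=0.9, event=0.2, crime=0.5),
    "case_b": ViewScores(chain=0.4, event=0.8, crime=0.7),
}
ranking = sorted(candidates, key=lambda c: aggregate(candidates[c]), reverse=True)
print(ranking)
```

With these made-up scores, `case_b` ranks first because the larger weights fall on the views where it scores higher.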

Fig. 3 Heat map of easily confused cases

Fig. 4 Comparison of experimental results with and without the vector matrix of the temporal behavior chain
