Large language model-driven stance-aware fact-checking

doi:10.11772/j.issn.1001-9081.2023101407

Abstract

To address imbalanced evidence stances and the neglect of stance information in Fact-Checking (FC), a Large Language Model-driven Stance-Aware fact-checking (LLM-SA) method is proposed. First, a large language model is used to generate a series of dialectical claims whose stance differs from that of the original claim, so that the fact-checking task can draw on different perspectives. Second, semantic similarity is used to measure the relevance of each evidence sentence to the original claim and to the dialectical claim separately, and the top k most similar sentences for each are selected as evidence that supports or opposes the original claim. The resulting evidence represents different stances, which helps the fact-checking model integrate information from multiple perspectives and assess the veracity of the claim more accurately. Finally, the BERT-StuSE (Bidirectional Encoder Representations from Transformers-based Stance-infused Semantic Encoding network) model is introduced; it fuses the semantic and stance information of the evidence through a multi-head attention mechanism in order to judge the relationship between the claim and the evidence more comprehensively and objectively. Experimental results on the CHEF dataset show that, compared with the BERT method, the proposed method improves Micro F1 and Macro F1 on the test set by 3.52 and 3.90 percentage points, respectively. These results demonstrate the effectiveness of the proposed method and show that considering evidence from different stances and exploiting the stance information of the evidence are both important for improving fact-checking performance.

Key words: Fact-Checking (FC), Natural Language Processing (NLP), Large Language Model (LLM), prompt engineering, stance awareness, multi-head attention mechanism
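The stance-aware evidence selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the character-bigram `embed` function is a toy stand-in for whatever sentence encoder the method actually uses, and the function names, example sentences, and dictionary keys are all assumptions made for this sketch.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence encoder (assumption): character-bigram counts."""
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_top_k(query: str, sentences: list[str], k: int) -> list[str]:
    """Return the k sentences most similar to the query, most similar first."""
    q = embed(query)
    ranked = sorted(sentences, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def stance_evidence(original: str, dialectical: str,
                    sentences: list[str], k: int) -> dict[str, list[str]]:
    """Pick top-k evidence for the original claim and for the dialectical claim."""
    return {
        "support": select_top_k(original, sentences, k),
        "oppose": select_top_k(dialectical, sentences, k),
    }


# Hypothetical example data, for illustration only.
original_claim = "the vaccine is safe for children"
dialectical_claim = "the vaccine is dangerous for children"
evidence_sentences = [
    "Trials found the vaccine is safe for children.",
    "Reports say the vaccine is dangerous for some children.",
    "The weather was sunny in Beijing.",
]
selected = stance_evidence(original_claim, dialectical_claim, evidence_sentences, k=2)
```

Ranking the same evidence pool twice, once per claim, is what yields evidence for both stances even when the retrieved pool leans toward one side. In practice the toy `embed` would be replaced by a learned sentence encoder.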

CLC Number: TP391.1

Yushan JIANG, Yangsen ZHANG. Large language model-driven stance-aware fact-checking[J]. Journal of Computer Applications, 2024, 44(10): 3067-3073.

Split	Supported (SUP)	Refuted (REF)	Not enough information (NEI)	Total
Total	3 543	5 065	1 442	10 050
Training set	2 877	4 399	776	8 052
Validation set	333	333	333	999
Test set	333	333	333	999
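As a quick arithmetic check, the row and column totals can be recomputed from the per-class counts, along with the class shares in the training split. The counts are copied from the table; the variable names are illustrative only.

```python
# Per-class claim counts for each split, copied from the table above.
counts = {
    "train": {"SUP": 2877, "REF": 4399, "NEI": 776},
    "validation": {"SUP": 333, "REF": 333, "NEI": 333},
    "test": {"SUP": 333, "REF": 333, "NEI": 333},
}
labels = ("SUP", "REF", "NEI")

# Row totals: number of claims in each split.
split_totals = {split: sum(by_label.values()) for split, by_label in counts.items()}

# Column totals: number of claims with each label across all splits.
label_totals = {label: sum(by_label[label] for by_label in counts.values())
                for label in labels}

grand_total = sum(split_totals.values())

# Share of each label within the training split.
train_share = {label: counts["train"][label] / split_totals["train"]
               for label in labels}

for label in labels:
    print(f"{label}: {train_share[label]:.1%} of training claims")
```

The validation and test splits are balanced across the three labels, while the training split is not: NEI claims make up under 10% of it.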

Parameter	Value	Parameter	Value
Batch Size	32	Pad_size	512
Learning rate	5×10^-5	require_improvement	1 000
Num_epochs	10	Optimizer	AdamW
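The settings above could be collected in a small configuration object, sketched below. The field names mirror the table, but their interpretation is an assumption: `pad_size` is read here as the length inputs are padded or truncated to, and `require_improvement` as the number of batches without validation improvement after which training stops early. The source does not define either parameter, and `TrainConfig` and `should_stop` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from the table; the comments give assumed meanings."""
    batch_size: int = 32
    pad_size: int = 512               # assumed: pad/truncate inputs to this length
    learning_rate: float = 5e-5
    require_improvement: int = 1000   # assumed: early-stopping patience, in batches
    num_epochs: int = 10
    optimizer: str = "AdamW"


def should_stop(current_batch: int, last_improve_batch: int, cfg: TrainConfig) -> bool:
    """Stop when more than `require_improvement` batches passed without improvement."""
    return current_batch - last_improve_batch > cfg.require_improvement


cfg = TrainConfig()
print(should_stop(current_batch=1501, last_improve_batch=500, cfg=cfg))
```

Freezing the dataclass keeps the configuration immutable during a run, so the values logged at the start are the ones actually used.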

Model	Validation Micro F1 (%)	Validation Macro F1 (%)	Test Micro F1 (%)	Test Macro F1 (%)
ReRead	71.79	69.98	71.24	69.52
DeClarE	69.72	68.81	70.26	69.59
MAC	67.97	66.63	68.77	67.70
LisT5	70.57	68.96	70.62	69.76
BERT	72.07	70.80	70.97	69.57
Proposed model	74.23	72.96	74.49	73.47
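For readers unfamiliar with the two metrics in the table, the sketch below shows one standard way to compute micro- and macro-averaged F1 for single-label, multi-class predictions. It is a generic illustration with made-up labels, not the authors' evaluation code; `f1_scores` is a hypothetical helper name.

```python
from collections import Counter


def f1_scores(y_true: list[str], y_pred: list[str], labels: list[str]) -> tuple[float, float]:
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    def f1(tp_n: int, fp_n: int, fn_n: int) -> float:
        denom = 2 * tp_n + fp_n + fn_n
        return 2 * tp_n / denom if denom else 0.0

    # Macro: average the per-class F1 values, each class weighted equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp[c] for c in labels),
               sum(fp[c] for c in labels),
               sum(fn[c] for c in labels))
    return micro, macro


# Made-up labels for illustration only.
y_true = ["SUP", "SUP", "REF", "REF", "NEI", "NEI"]
y_pred = ["SUP", "REF", "REF", "REF", "NEI", "SUP"]
micro_f1, macro_f1 = f1_scores(y_true, y_pred, ["SUP", "REF", "NEI"])
print(f"micro F1 = {micro_f1:.4f}, macro F1 = {macro_f1:.4f}")
```

Macro F1 treats each class equally, so it is more sensitive to performance on the less frequent classes. The gains quoted in the abstract follow directly from the test-set columns: 74.49 - 70.97 = 3.52 and 73.47 - 69.57 = 3.90 percentage points over BERT.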