Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (8): 2342-2350. DOI: 10.11772/j.issn.1001-9081.2023081176

• Artificial Intelligence •

Rumor detection by fusing ambiguity in comment sequences and generating user privacy features

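Wenfan MENG1, Lihua ZHOU1, Xiaoxu WANG2

  1. School of Information Science and Engineering, Yunnan University, Kunming 650504
  2. Dianchi College of Yunnan University, Kunming 650228
  • Received: 2023-08-31 Revised: 2023-09-14 Accepted: 2023-10-09 Online: 2024-08-22 Published: 2024-08-10
  • Corresponding author: Lihua ZHOU
  • About the authors: MENG Wenfan (born 1996), male, from Dushan, Guizhou, M.S. candidate, CCF member. Research interests: data mining, information diffusion, rumor detection.
    ZHOU Lihua (born 1968), female, from Huaping, Yunnan, professor, Ph.D., CCF member. Research interests: data mining, multi-view learning, social network analysis. lhzhou@ynu.edu.cn
    WANG Xiaoxu (born 1995), female, from Fuyang, Anhui, lecturer, M.S. Research interests: data mining, big data.
  • Funding:
    National Natural Science Foundation of China (62062066); Yunnan Province Basic Research Program (202201AS070015)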

Rumor detection by fusing ambiguity in comment sequences and generating user privacy features

Wenfan MENG1, Lihua ZHOU1, Xiaoxu WANG2

  1. School of Information Science and Engineering, Yunnan University, Kunming Yunnan 650504, China
  2. Dianchi College of Yunnan University, Kunming Yunnan 650228, China
  • Received:2023-08-31 Revised:2023-09-14 Accepted:2023-10-09 Online:2024-08-22 Published:2024-08-10
  • Contact: Lihua ZHOU
  • About the authors: MENG Wenfan, born in 1996, M.S. candidate. His research interests include data mining, information diffusion and rumor detection.
    ZHOU Lihua, born in 1968, Ph.D., professor. Her research interests include data mining, multi-view learning and social network analysis.
    WANG Xiaoxu, born in 1995, M.S., lecturer. Her research interests include data mining and big data.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (62062066) and the Yunnan Basic Research Program (202201AS070015).

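Abstract:

Existing rumor detection work has two problems: 1) it does not capture the text semantic features and the time periodic features of comment sequences at the same time; 2) in a privacy-protected environment, user profiles cannot be obtained, so the information in the propagation structure is hard to fuse fully. To address them, a Rumor Detection model fusing ambiguity in Comment Sequences and Generating User privacy features (RD-CSGU) is proposed. The model jointly considers text semantic features and time periodic features from different views of the comment sequence. It also builds a heterogeneous rumor propagation network that reflects the social interactions among users during propagation and, based on the semantic relations in that network, generates user privacy features with a Generative Adversarial Network (GAN), which resolves the restricted access to user profiles. Effectiveness was verified on the Twitter15, Twitter16 and Weibo datasets: compared with the suboptimal baseline GLAN (Global-Local Attention Network), RD-CSGU improved Accuracy (Acc) by 0.9, 2.2 and 1.8 percentage points, respectively, and the True Rumor F1 (TR-F1) score by 2.6, 6.8 and 1.9 percentage points, respectively. Together with the ablation experiments and the analysis of GAN-generated embeddings, the results show that RD-CSGU can effectively detect rumor posts published on social media platforms.

Key words: rumor detection, comment sequence, heterogeneous propagation network, generated feature, propagation structure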

Abstract:

Existing rumor detection works have two problems: they do not simultaneously capture the text semantic features and the time periodic features of comment sequences, and they cannot access user personal profiles in a privacy-protected environment, so the information within the propagation structure is not fully integrated. To address these problems, a Rumor Detection model fusing ambiguity in Comment Sequences and Generating User privacy features (RD-CSGU) was proposed. Text semantic features and time periodic features from different perspectives of comment sequences were comprehensively considered. Meanwhile, a heterogeneous rumor propagation network describing the social interaction relationships among users during the propagation process was constructed, and user privacy features were generated through a Generative Adversarial Network (GAN) based on the semantic relationships in this network, overcoming the restricted access to user personal profiles. The effectiveness of the proposed model was validated on the Twitter15, Twitter16 and Weibo datasets. Compared with the suboptimal baseline model GLAN (Global-Local Attention Network), RD-CSGU achieved improvements of 0.9, 2.2 and 1.8 percentage points in Accuracy (Acc), respectively, and improvements of 2.6, 6.8 and 1.9 percentage points in True Rumor F1 (TR-F1) score, respectively. These results, combined with those from ablation experiments and the analysis of GAN-generated embeddings, show that RD-CSGU can effectively detect rumor posts on social media platforms.

Key words: rumor detection, comment sequence, heterogeneous propagation network, generated feature, propagation structure
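
The abstract reports gains in Accuracy and in the F1 score of the True Rumor (TR) class, both as percentage-point differences. The Python sketch below shows how such figures are typically computed from predicted and true labels. It is only an illustration: the label strings, the toy data and the helper names (`accuracy`, `class_f1`, `percentage_point_gain`) are assumptions and are not taken from the paper.

```python
# Illustrative only: toy labels and helper names are hypothetical,
# not the evaluation code or data used for RD-CSGU.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def class_f1(y_true, y_pred, label):
    """F1 score of one class (e.g. "TR"), treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def percentage_point_gain(new_score, baseline_score):
    """Difference between two scores in [0, 1], expressed in percentage points."""
    return (new_score - baseline_score) * 100


# Toy example with assumed class names (TR = true rumor, as in the abstract).
y_true = ["TR", "TR", "FR", "NR", "UR", "TR"]
y_pred = ["TR", "FR", "FR", "NR", "UR", "TR"]

acc = accuracy(y_true, y_pred)
tr_f1 = class_f1(y_true, y_pred, "TR")
print(f"Acc = {acc:.3f}, TR-F1 = {tr_f1:.3f}")
```

A percentage-point gain is an absolute difference, so a move from 0.88 to 0.90 is a gain of 2 percentage points, not a relative gain of about 2.3%.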

CLC Number: