Journal of Computer Applications (《计算机应用》) ›› 2025, Vol. 45 ›› Issue (11): 3593-3600. DOI: 10.11772/j.issn.1001-9081.2024111574

• Data Science and Technology •

面向多维时间序列根因分析的概率生成图注意力网络方法

闫秋艳, 蒋辉, 姜竹郡, 李博雪

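  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Received: 2024-11-05 Revised: 2025-01-06 Accepted: 2025-01-07 Online: 2025-02-14 Published: 2025-11-10
  • Corresponding author: 闫秋艳 (Qiuyan YAN)
  • About the authors: 蒋辉 (JIANG Hui), born in 1999, male, a native of Suqian, Jiangsu, M. S., CCF member. Research interests: time series data mining, anomaly detection.
    姜竹郡 (JIANG Zhujun), born in 2000, female, a native of Weihai, Shandong, M. S. candidate. Research interests: time series data mining, anomaly detection.
    李博雪 (LI Boxue), born in 2001, female, a native of Huaibei, Anhui, M. S. candidate. Research interests: time series data mining, anomaly detection.
  • Supported by:
    Key Program of the National Natural Science Foundation of China (51934007); General Program of the National Natural Science Foundation of China (62277046); General Program of the National Natural Science Foundation of China (61977061); National Major Scientific Research Instrument Development Project of the National Natural Science Foundation of China (52227901)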

Probabilistic generative graph attention network method for multi-dimensional time series root cause analysis

Qiuyan YAN, Hui JIANG, Zhujun JIANG, Boxue LI

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Received: 2024-11-05 Revised: 2025-01-06 Accepted: 2025-01-07 Online: 2025-02-14 Published: 2025-11-10
  • Contact: Qiuyan YAN
  • About the authors: JIANG Hui, born in 1999, M. S. His research interests include time series data mining and anomaly detection.
    JIANG Zhujun, born in 2000, M. S. candidate. Her research interests include time series data mining and anomaly detection.
    LI Boxue, born in 2001, M. S. candidate. Her research interests include time series data mining and anomaly detection.
  • Supported by:
    This work is partially supported by the Key Program of the National Natural Science Foundation of China (51934007), the General Program of the National Natural Science Foundation of China (62277046, 61977061), and the National Major Scientific Research Instrument Development Project of the National Natural Science Foundation of China (52227901).

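Abstract:

Root cause analysis (RCA) is important for restoring systems quickly, assessing risk precisely, and keeping production safe. Existing methods characterize the dependencies among sensors poorly and struggle to capture the random fluctuations in time series. To address this, a probabilistic generative graph attention network method for multi-dimensional time series root cause analysis, called GPRCA, is proposed. The method defines the feature embedding of each dimension as a Gaussian-distributed vector that represents the latent features of a sensor, which captures the random fluctuations in multi-dimensional time series and makes the model more robust to noise. It also builds a deep probabilistic generative graph attention network to learn the nonlinear dependencies among dimensions and thereby model the dependencies in the sensor network. Finally, it combines the network topology causal score with the individual node causal score to perform root cause analysis. In experiments on two public datasets, Secure Water Treatment (SWaT) and Water Distribution (WADI), and one private dataset (Mine), GPRCA achieved the best value on some metrics. On SWaT, its precision at 5 candidates (P@5), mean average precision at 5 candidates (mAP@5), and mean reciprocal rank (MRR) exceeded those of the second-best method by 2.2%, 6.3%, and 11.6%, respectively; on WADI, the gains were 8.1%, 7.0%, and 11.0%, respectively; on Mine, mAP@3 and MRR improved by 3.6% and 1.8%, respectively. These results show that GPRCA is effective and outperforms the baseline methods.

Keywords: multi-dimensional time series, root cause analysis, probabilistic generative network, graph attention network, deep learning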
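
The abstract gives no formulas, so the following is only a minimal sketch of two ideas it names: treating a sensor embedding as a Gaussian vector (drawn here with the standard reparameterization z = mu + sigma * eps) and ranking sensors by a combination of a topology score and an individual score. The function names, the `log_var` parameterization, the weight `alpha`, and the convex-combination rule are all assumptions made for illustration, not the authors' formulation.

```python
import math
import random
from typing import Dict, List, Sequence


def sample_gaussian_embedding(mu: Sequence[float], log_var: Sequence[float],
                              rng: random.Random) -> List[float]:
    """Reparameterized draw: z_i = mu_i + sigma_i * eps_i with eps_i ~ N(0, 1).

    sigma_i is recovered from the log-variance as exp(0.5 * log_var_i).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def rank_root_causes(topology_score: Dict[str, float],
                     individual_score: Dict[str, float],
                     alpha: float = 0.5) -> List[str]:
    """Rank sensors by alpha * topology + (1 - alpha) * individual.

    The convex combination is an assumed rule; the paper may combine
    the two scores differently.
    """
    combined = {s: alpha * topology_score[s] + (1.0 - alpha) * individual_score[s]
                for s in topology_score}
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical sensors "a", "b", "c" with made-up scores.
rng = random.Random(0)
z = sample_gaussian_embedding([0.2, -1.0], [-2.0, -4.0], rng)
ranking = rank_root_causes({"a": 0.9, "b": 0.1, "c": 0.5},
                           {"a": 0.1, "b": 0.2, "c": 0.9})
print(z, ranking)
```

In this toy case sensor "c" ranks first because its individual score outweighs the higher topology score of "a"; shifting `alpha` toward 1 would favor "a" instead.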

Abstract:

Root Cause Analysis (RCA) is critical for rapid system recovery, accurate risk assessment, and production safety. Existing RCA methods characterize the dependencies among sensors poorly and fail to capture stochastic fluctuations in time series. To address these limitations, a probabilistic generative graph attention network method for multi-dimensional time series root cause analysis, called GPRCA, was proposed. The method defined the feature embedding of each dimension as a Gaussian distribution vector to characterize the latent features of each sensor, which captured the stochastic fluctuations in multi-dimensional time series and made the model more robust to noise. In addition, a deep probabilistic generative graph attention network was constructed to learn the nonlinear dependencies between dimensions and thereby model the dependencies within the sensor network. Finally, root cause analysis was performed by combining the network topology causal score with the individual node causal score. Experimental results on two public datasets, Secure Water Treatment (SWaT) and Water Distribution (WADI), and one private dataset (Mine) showed that GPRCA achieved the best values on some metrics. On the SWaT dataset, GPRCA improved Precision at 5 candidates (P@5), mean Average Precision at 5 candidates (mAP@5), and Mean Reciprocal Rank (MRR) by 2.2%, 6.3%, and 11.6%, respectively, over the second-best method; on the WADI dataset, the corresponding improvements were 8.1%, 7.0%, and 11.0%; on the Mine dataset, GPRCA improved mAP@3 and MRR by 3.6% and 1.8%, respectively, over the second-best method. These results indicate that GPRCA is effective and outperforms the baseline methods.

Key words: multi-dimensional time series, Root Cause Analysis (RCA), probabilistic generative network, Graph ATtention network (GAT), deep learning
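
The abstract reports P@5, mAP@5, and MRR but does not state how they are computed. The sketch below uses definitions common in root cause localization work, assumed here rather than taken from the paper: P@k is the number of true root causes in the top k divided by min(k, number of true root causes), mAP@k is the mean of P@1 through P@k, and the reciprocal rank is 1 over the position of the first true root cause. The sensor names are hypothetical.

```python
from typing import List, Sequence, Set


def precision_at_k(ranked: Sequence[str], truth: Set[str], k: int) -> float:
    """Fraction of true root causes found in the top k (assumed definition)."""
    if not truth or k < 1:
        raise ValueError("need at least one true root cause and k >= 1")
    hits = sum(1 for sensor in ranked[:k] if sensor in truth)
    return hits / min(k, len(truth))


def map_at_k(ranked: Sequence[str], truth: Set[str], k: int) -> float:
    """Mean of P@1 .. P@k for one anomaly event."""
    return sum(precision_at_k(ranked, truth, j) for j in range(1, k + 1)) / k


def reciprocal_rank(ranked: Sequence[str], truth: Set[str]) -> float:
    """1 / rank of the first true root cause, or 0.0 if none is ranked."""
    for position, sensor in enumerate(ranked, start=1):
        if sensor in truth:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings: List[Sequence[str]], truths: List[Set[str]]) -> float:
    """Average the reciprocal rank over several anomaly events."""
    return sum(reciprocal_rank(r, t) for r, t in zip(rankings, truths)) / len(rankings)


# One hypothetical anomaly event: a model's ranking and the labeled root causes.
ranked = ["s3", "s1", "s7", "s2", "s5"]
truth = {"s1", "s2"}
print(precision_at_k(ranked, truth, 5), map_at_k(ranked, truth, 5),
      reciprocal_rank(ranked, truth))
```

For this event both root causes appear within the top five, so P@5 is 1.0, while mAP@5 and the reciprocal rank are lower because the first hit sits at position 2.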

CLC number: