Journal of Computer Applications, 2024, Vol. 44, Issue (10): 3294-3299. DOI: 10.11772/j.issn.1001-9081.2023101521

• The 40th CCF National Database Conference (NDBC 2023) •

Symmetric positive definite autoencoder method for multivariate time series anomaly detection

Hui JIANG1, Qiuyan YAN1,2, Zhujun JIANG1

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou Jiangsu 221116, China
    2. Innovation Research Center of Disaster Intelligent Prevention and Emergency Rescue, China University of Mining and Technology, Xuzhou Jiangsu 221116, China
  • Received:2023-11-08 Revised:2024-01-09 Accepted:2024-01-10 Online:2024-10-15 Published:2024-10-10
  • Contact: Qiuyan YAN
  • About author: JIANG Hui, born in 1999, M. S. candidate. His research interests include time series data mining and anomaly detection.
    JIANG Zhujun, born in 2000, M. S. candidate. Her research interests include time series data mining and anomaly detection.
  • Supported by:
    National Natural Science Foundation of China(51934007);Graduate Innovation Program of China University of Mining and Technology(2023WLJCRCZL261)

Symmetric positive definite autoencoder method for multivariate time series anomaly detection

JIANG Hui (蒋辉)1, YAN Qiuyan (闫秋艳)1,2, JIANG Zhujun (姜竹郡)1

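  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116
    2. Innovation Research Center of Disaster Intelligent Prevention and Emergency Rescue, China University of Mining and Technology, Xuzhou, Jiangsu 221116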
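  • Corresponding author: YAN Qiuyan (闫秋艳)
  • About the authors: JIANG Hui (蒋辉, born 1999), male, from Suqian, Jiangsu, M. S. candidate, CCF member. His research interests include time series data mining and anomaly detection.
    YAN Qiuyan (闫秋艳, born 1978), female, from Xuzhou, Jiangsu, associate professor, Ph. D., CCF member. Her research interests include multimodal image behavior recognition, educational big data analysis, and time series data mining. yanqy@cumt.edu.cn
    JIANG Zhujun (姜竹郡, born 2000), female, from Weihai, Shandong, M. S. candidate. Her research interests include time series data mining and anomaly detection.
  • Supported by:
    National Natural Science Foundation of China (51934007); Graduate Innovation Program of China University of Mining and Technology (2023WLJCRCZL261)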

Abstract:

Detecting abnormal patterns in multivariate time series is of great importance for the normal operation of complex systems in industrial production, Internet services, and other scenarios. Multidimensional data over continuous time exhibit both temporal and spatial relationships, but most existing methods are deficient in modeling the spatial relationships between dimensions. Moreover, because the spatial topology constructed from multidimensional data is complex, traditional neural network models have difficulty preserving the spatial relationships once they have been modeled. To address these problems, a Symmetric Positive Definite AutoEncoder (SPDAE) method for multivariate time series anomaly detection was proposed. A Gaussian kernel function was used to calculate the relationship between each pair of dimensions of the original data, and multi-step, multi-window Symmetric Positive Definite (SPD) matrices were generated to capture the spatiotemporal features of multivariate time series. At the same time, a convolution-like AutoEncoder (AE) network was designed. The SPD feature matrices were taken as input at the encoder stage, and an attention mechanism was introduced at the decoder stage to aggregate the multi-step data obtained by each encoder layer, achieving multi-scale spatiotemporal feature reconstruction. In particular, to preserve the spatial structure of the input data, each layer of the encoder and the decoder used a convolution-like operation that conforms to the manifold topology to update the model weights, and a Log-Euclidean metric was used to calculate the reconstruction error. Experimental results on a private dataset show that the SPDAE method improves precision by 2.3 percentage points over the second-best baseline model MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder) and the F1 score by 3.0 percentage points over the second-best baseline model LSTM-ED (Long Short-Term Memory network based Encoder-Decoder). In addition, because SPD matrices are used to represent the spatial relationships among dimensions, abnormal dimensions can be preliminarily located from the difference values of the reconstructed matrix.
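
The two matrix-level ingredients named in the abstract, Gaussian-kernel SPD matrices over several windows and a Log-Euclidean reconstruction error, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the bandwidth `sigma`, the window sizes, the `eps` ridge that keeps the matrices strictly positive definite, and the row-sum scoring used to rank dimensions are all assumptions made for illustration, and the convolution-like autoencoder and attention mechanism are not modeled here.

```python
import numpy as np

def gaussian_spd(window, sigma=1.0, eps=1e-6):
    """Build an SPD matrix from one window of shape (n_dims, w).

    Entry (i, j) is a Gaussian kernel between the series of dimension i
    and dimension j inside the window (assumed form of the kernel).
    """
    sq_norms = np.sum(window ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (window @ window.T)
    sq_dists = np.maximum(sq_dists, 0.0)           # clip round-off negatives
    kernel = np.exp(-sq_dists / (2.0 * sigma ** 2))
    kernel = 0.5 * (kernel + kernel.T)             # enforce exact symmetry
    # Small ridge (assumption) so all eigenvalues are strictly positive.
    return kernel + eps * np.eye(window.shape[0])

def multi_window_spd(series, t, window_sizes=(10, 30, 60), sigma=1.0):
    """One SPD matrix per window length, each window ending at time t.

    series has shape (n_dims, T); requires t >= max(window_sizes).
    """
    return [gaussian_spd(series[:, t - w:t], sigma) for w in window_sizes]

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def log_euclidean_distance(a, b):
    """Log-Euclidean distance: Frobenius norm of log(a) - log(b)."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")

def dimension_scores(original, reconstructed):
    """Hypothetical per-dimension score: row sums of the absolute residual."""
    return np.abs(original - reconstructed).sum(axis=1)
```

Given a reconstructed matrix from any model, `log_euclidean_distance` would serve as the anomaly score for a time step, and the largest entries of `dimension_scores` would point to the dimensions whose pairwise relationships were reconstructed worst.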

Key words: multivariate time series, anomaly detection, deep learning, AutoEncoder (AE), non-Euclidean manifold

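Abstract:

Detecting anomalous patterns in multivariate time series is of great significance for the normal operation of complex systems in scenarios such as industrial production and Internet services. Multidimensional data over continuous time exhibit both temporal and spatial relationships, but most existing methods lack modeling of the spatial relationships between dimensions, and because the spatial topology constructed from multidimensional data is complex, traditional neural network models have difficulty preserving the spatial relationships once they have been modeled. To address these problems, a Symmetric Positive Definite AutoEncoder (SPDAE) method for multivariate time series anomaly detection is proposed. A Gaussian kernel function is used to compute the relationship between each pair of dimensions of the original data, generating multi-step, multi-window Symmetric Positive Definite (SPD) matrices that capture the spatiotemporal features of multivariate time series. Meanwhile, a convolution-like AutoEncoder (AE) network is designed: the encoder stage takes the SPD feature matrices as input, and the decoder stage introduces an attention mechanism to aggregate the multi-step data obtained by each encoder layer, achieving reconstruction of multi-scale spatiotemporal features. In particular, to preserve the spatial structure of the input data, each layer of the encoder and decoder uses a convolution-like operation that conforms to the manifold topology to update the model weights, and the loss computation uses the Log-Euclidean metric to calculate the reconstruction error. Experimental results on a private dataset show that SPDAE improves precision by 2.3 percentage points over the second-best baseline model MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder) and improves the F1 score by 3.0 percentage points over the second-best baseline model, the Long Short-Term Memory Encoder-Decoder network (LSTM-ED). In addition, because SPD matrices are used to represent the spatial relationships among multidimensional data, abnormal dimensions can be preliminarily located from the difference values of the reconstructed matrix.

Key words: multivariate time series, anomaly detection, deep learning, autoencoder, non-Euclidean manifold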

CLC Number: