Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (5): 1458-1463. DOI: 10.11772/j.issn.1001-9081.2023050726

• 2023 CCF Conference on Artificial Intelligence (CCFAI 2023)

Iterative active multivariable time series anomaly detection algorithm based on margin anomaly candidate set

MENG Fan1,2, YANG Qunli1, HUO Jing2, WANG Xinkuan3

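  1. Public Credit Information Center, Jiangsu Strategy and Development Research Center, Nanjing 210036
  2. State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210093
  3. Yangzhou Branch, China Mobile Jiangsu Company, Yangzhou, Jiangsu 225012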
  • Received: 2023-06-06 Revised: 2023-06-20 Accepted: 2023-07-04 Online: 2023-08-01 Published: 2024-05-10
  • Corresponding author: HUO Jing
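  • About the authors: MENG Fan, born in 1988 in Nanjing, Jiangsu, male, Ph.D., associate research fellow, senior engineer, CCF member. His research interests include machine learning, big data, and unsupervised learning.
    YANG Qunli, born in 1972 in Taizhou, Jiangsu, male, M.S., senior engineer. His research interests include credit system construction and social and economic development.
    WANG Xinkuan, born in 1980 in Zaozhuang, Shandong, male, M.S., professor-level senior engineer. His research interests include network and information security and compute first networking.
    First contact: HUO Jing, born in 1989 in Suzhou, Jiangsu, female, Ph.D., associate professor. Her research interests include machine learning and computer vision.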
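  • Supported by:
    Jiangsu Province Social Credit System Construction Special Fund (JSZC-G2018-393); Project of the State Key Laboratory for Novel Software Technology, Nanjing University (KFKT2022B27)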

EraseMTS: iterative active multivariable time series anomaly detection algorithm based on margin anomaly candidate set

Fan MENG1,2, Qunli YANG1, Jing HUO2, Xinkuan WANG3

  1. Public Credit Information Center, Jiangsu Strategy and Development Research Center, Nanjing Jiangsu 210036, China
  2. State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing Jiangsu 210093, China
  3. Yangzhou Branch, China Mobile Jiangsu Company, Yangzhou Jiangsu 225012, China
  • Received: 2023-06-06 Revised: 2023-06-20 Accepted: 2023-07-04 Online: 2023-08-01 Published: 2024-05-10
  • Contact: Jing HUO
  • About the authors: MENG Fan, born in 1988, Ph.D., senior engineer. His research interests include machine learning, big data, and unsupervised learning.
    YANG Qunli, born in 1972, M.S., senior engineer. His research interests include credit system construction and social and economic development.
    WANG Xinkuan, born in 1980, M.S., professor-level senior engineer. His research interests include network and information security and compute first networking.
  • Supported by:
    Jiangsu Province Social Credit System Construction Special Fund (JSZC-G2018-393); Open Project of State Key Laboratory for Novel Software Technology (Nanjing University) (KFKT2022B27)

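Abstract:

Unsupervised anomaly detection methods for Multivariable Time Series (MTS) have attracted wide attention because of their low labeling cost, but traditional methods generally rest on two assumptions: 1) the Independent and Identically Distributed (IID) assumption, i.e., no dependencies exist among the samples or among the attributes of the time series data; 2) the high-purity starting assumption, i.e., a time series dataset consisting entirely of normal data is available for training. These assumptions are often hard to satisfy in real-world scenarios. To this end, an iterative active MTS anomaly detection algorithm based on a margin anomaly candidate set (EraseMTS) is proposed. First, a multi-granularity temporal feature learning method captures the dependencies within and between subsequences, and the original MTS is re-represented on this basis. Second, a selection strategy using the margin anomaly candidate set is proposed, which chooses the range to be submitted for human interaction based on the subsequence anomaly scores while also taking the degree of anomaly into account. Finally, an iterative subsequence weight update mechanism is proposed, which incorporates anomaly feedback into the training of the unsupervised anomaly detection model and continuously improves the initially trained model through iteration. The detection performance, scalability, and stability of the proposed algorithm are verified on four datasets from the UCR time series archive and one synthetic dataset; the experimental results show that the algorithm runs effectively and stably.

Key words: anomaly detection, multivariable time series, weight update, multi-granularity representation, active learning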

Abstract:

Unsupervised anomaly detection methods for Multivariable Time Series (MTS) have attracted wide attention due to their low labeling cost. However, traditional unsupervised anomaly detection methods are usually based on two assumptions: 1) the Independent and Identically Distributed (IID) assumption, i.e., there are no dependencies among the samples or among the attributes of the MTS; 2) the high-purity starting assumption, i.e., a completely normal time series dataset is assumed to be available for training. These assumptions are often difficult to satisfy in practical scenarios. To address this problem, an iterative active MTS anomaly detection algorithm based on a margin anomaly candidate set (called EraseMTS) was proposed. Firstly, a multi-granularity representation learning method was used to capture the dependencies within and between subsequences, and the original MTS was re-represented on that basis. Secondly, a selection strategy based on the margin anomaly candidate set was proposed to choose the subsequences submitted to experts for interaction; the candidate set was determined by the subsequence anomaly score and the uncertainty of its anomaly degree. Finally, an iterative subsequence weight update mechanism was designed to integrate the anomaly feedback into the training process of the unsupervised anomaly detection model, so that the performance of the initially trained model was continuously improved through iteration. The detection performance, scalability, and stability of the proposed algorithm were verified on four datasets from the UCR time series archive and one synthetic dataset. Experimental results show that the proposed algorithm runs effectively and stably.

Key words: anomaly detection, Multivariable Time Series (MTS), weight update, multi-granularity representation, active learning
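
The three steps outlined in the abstract (re-represent subsequences, query an expert on a margin candidate set, update subsequence weights and re-score) can be illustrated with a toy loop. The sketch below is not the EraseMTS implementation: the abstract gives no formulas, so every choice here is an assumption made for illustration, including the hand-crafted mean/std features standing in for the learned multi-granularity representation, the weighted z-score detector, the "closest to a quantile threshold" margin rule, the rule of setting a confirmed anomaly's weight to zero, and all function and parameter names.

```python
import numpy as np

def windows(x, width, step):
    """Cut a (T, d) series into subsequences of shape (n, width, d)."""
    starts = range(0, len(x) - width + 1, step)
    return np.stack([x[s:s + width] for s in starts])

def features(subs):
    """Assumed stand-in for a learned representation: per-variable mean and std."""
    return np.concatenate([subs.mean(axis=1), subs.std(axis=1)], axis=1)

def scores(feats, weights):
    """Weighted z-score distance of each subsequence from the weighted centre."""
    w = weights / max(weights.sum(), 1e-12)
    mu = w @ feats
    var = w @ (feats - mu) ** 2 + 1e-8
    return np.sqrt(((feats - mu) ** 2 / var).mean(axis=1))

def margin_candidates(s, asked, quantile, k):
    """Assumed margin rule: the k unqueried subsequences nearest the score threshold."""
    order = np.argsort(np.abs(s - np.quantile(s, quantile)))
    return order[~asked[order]][:k]

def active_loop(feats, oracle, rounds=5, k=3, quantile=0.9):
    """Alternate between scoring, querying the oracle, and re-weighting."""
    weights = np.ones(len(feats))
    asked = np.zeros(len(feats), dtype=bool)
    for _ in range(rounds):
        s = scores(feats, weights)
        for i in margin_candidates(s, asked, quantile, k):
            asked[i] = True
            if oracle(i):          # expert confirms an anomaly
                weights[i] = 0.0   # assumed update: drop it from the model fit
    return scores(feats, weights), weights

# Synthetic 3-variable series with a level shift injected at t = 200..219.
rng = np.random.default_rng(0)
t = np.arange(400)
x = np.sin(t / 5.0)[:, None] * np.array([1.0, 0.5, 2.0]) + 0.1 * rng.normal(size=(400, 3))
x[200:220] += 3.0

width, step = 20, 10
starts = np.arange(0, len(x) - width + 1, step)
truth = (starts < 220) & (starts + width > 200)   # windows overlapping the shift

feats = features(windows(x, width, step))
final_scores, weights = active_loop(feats, oracle=lambda i: bool(truth[i]))
```

Here the ground-truth mask `truth` plays the role of the human expert. With a real annotator, `oracle` would be replaced by whatever labeling interface is available, and the weight update could be softer than the hard zeroing assumed above.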

CLC number: