Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (5): 1329-1334.DOI: 10.11772/j.issn.1001-9081.2019091631

• Data science and technology •

Time series anomaly detection method based on autoencoder and HMM

HUO Weigang, WANG Huifang   

  1. School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • Received:2019-09-24 Revised:2019-10-19 Online:2020-05-10 Published:2020-05-15
  • Contact: HUO Weigang
  • About the authors: HUO Weigang, born in 1978, Ph. D., associate professor. His research interests include data mining and fuzzy classification. WANG Huifang, born in 1993, M. S. candidate. Her research interests include data mining.
  • Supported by:

    This work is partially supported by the Civil Aviation Joint Research Fund of the Committee of National Natural Science Foundation of China and the Civil Aviation Administration of China (U1633110), and by the Special Fund for Civil Aviation University of China of the Fundamental Research Funds for the Central Universities (3122019190).


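  • Author biographies: HUO Weigang (born 1978), male, from Hongtong, Shanxi, associate professor, Ph. D., CCF member; main research interests: data mining and fuzzy classification. WANG Huifang (born 1993), female, from Datong, Shanxi, M. S. candidate; main research interest: data mining.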



To address the problem that the symbolization methods used in existing Hidden Markov Model (HMM)-based anomaly detection cannot represent the original time series well, an Autoencoder and HMM-based Anomaly Detection (AHMM-AD) method was proposed. Firstly, the time series samples were segmented with a sliding window, the segments were grouped into several segmented sample sets according to their positions, and one autoencoder was trained for each segment position on the corresponding segmented sample set drawn from the normal time series. Then, the low-dimensional feature representation of each time series segment was obtained with the autoencoder, and the time series sample set was symbolized by K-means clustering of these low-dimensional feature vectors. Finally, an HMM was built from the symbol sequence set of the normal time series, and anomaly detection was performed according to the output probability of each test sample under the established HMM. The experimental results on multiple common benchmark datasets show that AHMM-AD improves the accuracy, recall rate, and F1 value by 0.172, 0.477 and 0.313 respectively compared with the HMM-based time series anomaly detection model, and by 0.108, 0.450 and 0.319 respectively compared with the autoencoder-based time series anomaly detection model. These results indicate that the AHMM-AD method can extract the nonlinear features of time series, overcomes the poor representation of the time series during symbolization in existing HMM-based time series modeling, and improves the performance of time series anomaly detection.
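The first and last steps of the pipeline described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sliding_windows`, `log_likelihood`, `is_anomalous`), the toy HMM parameters, and the fixed log-likelihood threshold are all hypothetical, and the autoencoder and K-means steps are omitted. The symbols are assumed to be the cluster indices that K-means would assign to the encoded segments.

```python
import math

def sliding_windows(series, width, stride):
    """Cut one series into fixed-width segments; the list index is the segment position."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, stride)]

def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: log P(symbols | HMM) for a discrete-output HMM.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][k]  : probability that state s emits symbol k
    """
    n = len(start)
    alpha = None
    log_p = 0.0
    for sym in symbols:
        if alpha is None:
            # Initialisation: start distribution times emission of the first symbol.
            alpha = [start[s] * emit[s][sym] for s in range(n)]
        else:
            # Recursion: propagate through the transitions, then weight by the emission.
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][sym]
                     for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this HMM
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid numerical underflow
    return log_p

def is_anomalous(symbols, start, trans, emit, threshold):
    """Flag a symbol sequence whose log-likelihood falls below an assumed threshold."""
    return log_likelihood(symbols, start, trans, emit) < threshold

# Toy two-state HMM (illustrative values only, not fitted to any data).
START = [0.9, 0.1]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [[0.9, 0.1], [0.1, 0.9]]

if __name__ == "__main__":
    print(sliding_windows([1, 2, 3, 4, 5], 3, 1))
    print(log_likelihood([0, 0, 0, 0], START, TRANS, EMIT))
    print(log_likelihood([0, 1, 0, 1], START, TRANS, EMIT))
```

Under these toy parameters, a steady symbol sequence scores a higher log-likelihood than a rapidly alternating one, which is the property the detection step relies on. In practice the HMM parameters would be estimated from the normal symbol sequences, and the threshold would have to be chosen on validation data.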

Key words: autoencoder, symbol sequence, Hidden Markov Model (HMM), anomaly detection, time series



