Journal of Computer Applications (计算机应用), 2014, Vol. 34, Issue (8): 2217-2220. DOI: 10.11772/j.issn.1001-9081.2014.08.2217

• Paper from the 5th China Conference on Data Mining (CCDM 2014) •

Hydrological time series anomaly detection based on sliding window prediction

YU Yufeng (余宇峰)1, ZHU Yuelong (朱跃龙)1, WAN Dingsheng (万定生)1, GUAN Xingzhong (关兴中)2

  1. College of Computer and Information, Hohai University, Nanjing 210098
    2. Hydrology Bureau of Jiangxi Province, Nanchang 330018
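  • Received: 2014-04-29 Revised: 2014-05-07 Online: 2014-08-01 Published: 2014-08-10
  • Corresponding author: YU Yufeng
  • About the authors: YU Yufeng (born 1979), male, from Huanggang, Hubei, senior experimentalist and Ph.D. candidate; research interests: data mining, data quality, and water resources informatization. ZHU Yuelong (born 1959), male, from Jianhu, Jiangsu, professor, Ph.D.; research interests: intelligent information processing and data mining. WAN Dingsheng (born 1963), male, from Liyang, Jiangsu, professor; research interests: water resources informatization and data mining.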
  • Supported by:

    National Natural Science Foundation of China; 948 Project of the Ministry of Water Resources

Time series outlier detection based on sliding window prediction

YU Yufeng1,ZHU Yuelong1,WAN Dingsheng1,GUAN Xingzhong2   

  1. College of Computer and Information, Hohai University, Nanjing Jiangsu 210098, China;
    2. Hydrology Bureau of Jiangxi Province, Nanchang Jiangxi 330018, China
  • Received: 2014-04-29 Revised: 2014-05-07 Online: 2014-08-01 Published: 2014-08-10
  • Contact: YU Yufeng

Abstract (Chinese version):

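To address data quality problems in hydrological time series analysis and decision-making, an anomaly detection algorithm for hydrological time series based on sliding window prediction is proposed. The time series is first segmented into subsequences using a sliding window; a prediction model is then built on each subsequence to forecast future values, and any point whose predicted and observed values differ by more than a preset threshold is judged anomalous. The sliding window and parameter settings of the algorithm are discussed, and the algorithm is validated on real-world data. The experimental results show that the proposed algorithm not only finds anomalous points in hydrological time series effectively but also raises the sensitivity and specificity of anomaly detection to above 80% and 98%, respectively.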

Abstract:

To address data quality problems in hydrological time series analysis and decision-making, a prediction-based outlier detection algorithm is proposed. The method first splits a given hydrological time series into subsequences using a sliding window and builds a forecasting model on each subsequence to predict future values; a point is then flagged as an outlier if the difference between its predicted and observed values exceeds a preset threshold. The choice of sliding window and the parameter settings of the detection algorithm are analyzed, and the algorithm is validated on real data. The experimental results show that the proposed algorithm detects outliers in hydrological time series effectively and raises the sensitivity and specificity of detection to above 80% and 98%, respectively.
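
The detection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which forecasting model or threshold rule is used, so the sketch assumes the prediction is the mean of the preceding window and the threshold is `k` times that window's sample standard deviation. The names `detect_outliers`, `window`, `k`, and `min_tau` are hypothetical, and the sample series is invented for illustration.

```python
from statistics import mean, stdev


def detect_outliers(series, window=5, k=3.0, min_tau=1e-9):
    """Return indices whose observed value is far from a window-based prediction.

    Assumptions (not taken from the paper): the predicted value is the mean
    of the previous `window` observations, and the threshold is `k` times
    their sample standard deviation, floored at `min_tau`.
    """
    outliers = []
    for i in range(window, len(series)):
        history = series[i - window:i]           # sliding-window subsequence
        predicted = mean(history)                # stand-in forecasting model
        tau = max(k * stdev(history), min_tau)   # stand-in preset threshold
        if abs(series[i] - predicted) > tau:     # compare predicted vs. observed
            outliers.append(i)
    return outliers


# Illustrative (made-up) series with one injected spike at index 6.
levels = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0, 9.8, 10.2, 10.1]
print(detect_outliers(levels))  # [6]
```

In this simplified form a flagged point still enters later windows and inflates their standard deviation, so a practical variant would likely exclude or replace flagged points before predicting the next value.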

CLC number: