Anomaly detection method for hydrologic sensor data based on SparkR
LIU Zihao, LI Ling, YE Feng
Journal of Computer Applications 2019, 39 (2): 436-440. DOI: 10.11772/j.issn.1001-9081.2018081782
To efficiently detect outliers in massive hydrologic sensor data, an anomaly detection method for hydrological time series based on SparkR was proposed. Firstly, a sliding window and an Autoregressive Integrated Moving Average (ARIMA) model were used to forecast the cleaned data on the SparkR platform. Then, a confidence interval was calculated for the prediction results, and values falling outside the interval were judged to be anomalous. Finally, based on the detection results, the K-Means algorithm was used to cluster the original data, the state transition probability was calculated, and the quality of the anomalous data was evaluated. Using hydrologic sensor data obtained from the Chu River, experiments were carried out on detection time and on outlier detection performance. The results show that for millions of data points, computation with two slave nodes takes more time than with one, but for tens of millions of data points, two slave nodes take less time than one, with a maximum reduction of 16.21%. The sensitivity of the evaluation increased from 5.24% to 92.98%. This shows that, on a big data platform, the proposed algorithm, which is based on the characteristics of hydrological data and combines a forecast test with a cluster test, can effectively improve the computational efficiency of hydrologic time series detection for tens of millions of data points and significantly improves sensitivity.
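The forecast-and-interval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract's ARIMA forecaster is replaced here by a plain sliding-window mean so that the example runs with the Python standard library alone, and the function name `detect_anomalies`, the window size, the 1.96 multiplier, and the sample water levels are all assumptions made for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(series, window=5, z=1.96):
    """Return indices of points outside a sliding-window forecast interval.

    Assumption: the forecast is the mean of the previous `window` values,
    a simple stand-in for the ARIMA model named in the abstract.
    """
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        forecast = mean(history)
        spread = stdev(history)
        # Interval around the forecast; z = 1.96 approximates a 95% band
        # under a normality assumption.
        lower = forecast - z * spread
        upper = forecast + z * spread
        if not (lower <= series[i] <= upper):
            anomalies.append(i)
    return anomalies


# Hypothetical water-level readings with one injected spike at index 6.
levels = [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 9.5, 2.0, 2.1, 1.9]
flagged = detect_anomalies(levels)
print(flagged)
```

In this sketch the spike stays in the history of the following windows and widens their intervals, so later anomalies could be masked; this may be one reason to forecast on cleaned data, as the abstract does.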