Cyanobacterial bloom forecast method based on genetic algorithm-first order lag filter and long short-term memory network
YU Jiabin1, SHANG Fangfang1, WANG Xiaoyi1, XU Jiping1, WANG Li1, ZHANG Huiyan1, ZHENG Lei2
1. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China; 2. College of Water Sciences, Beijing Normal University, Beijing 100875, China
Abstract: Algal bloom evolution in rivers and lakes is sudden and uncertain, which leads to low accuracy in algal bloom prediction. To address this problem, chlorophyll a concentration was used as the indicator of the cyanobacterial bloom evolution process, and a cyanobacterial bloom forecast model based on a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) was proposed. Firstly, an improved Genetic algorithm-First order lag filter (GF) optimization algorithm was used as the data smoothing filter. Secondly, a GF-LSTM network model was built to predict the cyanobacterial bloom. Finally, data sampled from Meiliang Lake in the Taihu area were used to test the forecast model, and the model was compared with the traditional RNN and the LSTM network. The experimental results show that the mean relative error of the proposed GF-LSTM network model is 16%-18%, lower than that of the RNN model (28%-32%) and the LSTM network model (19%-22%). The proposed model smooths the data effectively, achieves higher prediction accuracy, and adapts better to the samples. It also avoids the two widely known problems of vanishing and exploding gradients that arise when a traditional RNN model is trained over long sequences.
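The two quantities the abstract relies on, first-order lag filtering and mean relative error, can be illustrated with a minimal Python sketch. The abstract does not give the filter equation, so the sketch assumes the conventional form y[n] = alpha * x[n] + (1 - alpha) * y[n-1]. The names `first_order_lag_filter`, `mean_relative_error` and `alpha` are illustrative and not taken from the paper. The coefficient is fixed by hand here; the genetic-algorithm tuning of the filter described in the abstract is not reproduced.

```python
def first_order_lag_filter(samples, alpha):
    """Smooth a series with an assumed first-order lag filter.

    Assumed form: y[n] = alpha * x[n] + (1 - alpha) * y[n-1], y[0] = x[0].
    alpha is hand-picked here; the paper tunes its filter with a
    genetic algorithm, which this sketch does not implement.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not samples:
        return []
    smoothed = [float(samples[0])]
    for x in samples[1:]:
        # Blend the new sample with the previous filtered value.
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def mean_relative_error(observed, predicted):
    """Average of |predicted - observed| / |observed| over all samples."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted))
    return total / len(observed)


if __name__ == "__main__":
    # Made-up chlorophyll a readings, for illustration only.
    raw = [1.0, 3.0, 5.0]
    print(first_order_lag_filter(raw, alpha=0.5))
    print(mean_relative_error([10.0, 20.0], [12.0, 18.0]))
```

A smaller `alpha` gives heavier smoothing at the cost of a slower response to sudden changes, which is the trade-off an optimizer such as a genetic algorithm would have to balance.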