Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (11): 3319-3325. DOI: 10.11772/j.issn.1001-9081.2018040789

• Frontier, Interdisciplinary and Comprehensive Applications •

Evaluation of susceptibility to debris flow hazards based on geological big data

ZHANG Yonghong1, GE Taotao1, TIAN Wei2, XIA Guanghao1, HE Jing1

  1. School of Information & Control, Nanjing University of Information Science & Technology, Nanjing Jiangsu 210044, China;
  2. School of Computer and Software, Nanjing University of Information Science & Technology, Nanjing Jiangsu 210044, China
  • Received: 2018-04-17 Revised: 2018-06-15 Online: 2018-11-10 Published: 2018-11-10
  • Corresponding author: TIAN Wei
  • About the authors: ZHANG Yonghong (born 1974), male, from Linyi, Shandong, professor, doctoral supervisor, Ph.D.; main research interests: pattern recognition, intelligent systems, image recognition and detection. GE Taotao (born 1994), male, from Yangzhou, Jiangsu, master's student; main research interests: disaster early warning, machine learning. TIAN Wei (born 1980), male, from Lianshui, Jiangsu, associate professor, Ph.D.; main research interests: computer software, big data processing, meteorological disasters. XIA Guanghao (born 1994), male, from Suqian, Jiangsu, master's student; main research interests: image processing, machine learning. HE Jing (born 1994), female, from Taizhou, Jiangsu, master's student; main research interest: mountain hazard databases.
  • Supported by:
    This work is partially supported by the International (Regional) Cooperation and Exchange Project of the National Natural Science Foundation of China (41661144039) and the General Program of the National Natural Science Foundation of China (51575283).

Abstract: Against the background of geological big data, and in order to assess debris flow susceptibility more accurately and objectively, a neural-network-based model for regional debris flow susceptibility assessment was proposed. The Mean Impact Value (MIV) algorithm, Genetic Algorithm (GA) and Borderline-SMOTE (Synthetic Minority Oversampling TEchnique) algorithm were combined to improve the accuracy of the model. In the preprocessing stage, the Borderline-SMOTE algorithm was used to handle the classification problem posed by the imbalanced dataset. A neural network was then used to fit the nonlinear relationship between the main indicators and the degree of susceptibility, with the genetic algorithm used to speed up the fitting. Finally, the MIV algorithm was used to quantify the correlation between each indicator and susceptibility. The middle and upper reaches of the Yarlung Zangbo River basin were selected as the study area. The experimental results show that the model can effectively reduce overfitting on the imbalanced dataset, optimize the original input dimension, and greatly improve the fitting speed. When the evaluation results were tested with the AUC (Area Under the Curve) metric, the classification accuracy on the test set reached 97.95%, indicating that the model can provide a reference for assessing debris flow susceptibility in the study area under imbalanced data.

Key words: geological big data, debris flow, susceptibility, Mean Impact Value (MIV) algorithm, Genetic Algorithm (GA), Borderline-SMOTE (Synthetic Minority Oversampling TEchnique) algorithm
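
As a rough illustration of the MIV step named in the abstract, the sketch below computes a mean impact value for each input indicator. It assumes the commonly used formulation in which each indicator is scaled up and down by 10% and the resulting change in model output is averaged over all samples. The 10% perturbation, the `mean_impact_values` helper and the `toy_model` stand-in for a trained network are illustrative assumptions, not details taken from the paper.

```python
from typing import Callable, List, Sequence


def mean_impact_values(
    predict: Callable[[Sequence[float]], float],
    samples: List[List[float]],
    delta: float = 0.10,
) -> List[float]:
    """Return one MIV per input indicator.

    For indicator j, every sample is evaluated twice: once with feature j
    scaled by (1 + delta) and once scaled by (1 - delta). The MIV is the
    mean of the output differences over all samples.
    """
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        total = 0.0
        for row in samples:
            up = list(row)
            down = list(row)
            up[j] = row[j] * (1 + delta)
            down[j] = row[j] * (1 - delta)
            total += predict(up) - predict(down)
        mivs.append(total / len(samples))
    return mivs


# Hypothetical stand-in for a trained network: a fixed linear scorer.
def toy_model(x: Sequence[float]) -> float:
    return 2.0 * x[0] - 0.5 * x[1] + 0.0 * x[2]


samples = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
mivs = mean_impact_values(toy_model, samples)

# Rank indicators by absolute MIV, largest influence first.
ranking = sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)
```

In this formulation the sign of each value gives the direction of an indicator's influence and the absolute value gives its relative importance, so indicators with an MIV near zero are candidates for removal when reducing the input dimension.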

CLC Number: