
Academic Conference on Information Storage Technology +48+ A Real-Time Data Analysis System Based on Spark Streaming and Its Application

Han Dezhi (韩德志)

  1. Shanghai Maritime University
  • Received (Chinese record): 2016-11-18 Revised: 2016-11-26 Published: 2016-11-26
  • Corresponding author: Han Dezhi (韩德志)

A Real-Time Data Analysis System Based on Spark Streaming and Its Application

  • Received: 2016-11-18 Revised: 2016-11-26 Online: 2016-11-26

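Abstract (translated from the Chinese original): Under concurrent network access, it is important to separate abnormal accesses from massive data streams quickly and effectively, to identify network attack flows, and to respond in time. To enable fast analysis of real-time network data streams, a distributed real-time data analysis system (DRDAS) is designed. It addresses the collection, storage, and real-time analysis of concurrent-access data streams and provides an effective data analysis platform for network security detection in big-data environments. Based on the operating principle of Spark Streaming, a dynamic-sampling parallel K-Means algorithm is designed; combined with DRDAS, it can detect various distributed denial of service (DDoS) attacks in big-data environments effectively and in real time. Experimental results show that DRDAS has good scalability, fault tolerance, and real-time processing capability, and that, combined with the dynamic-sampling parallel K-Means algorithm, it detects various DDoS attacks in real time and shortens attack detection time.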

Keywords (translated from the Chinese original): Spark Streaming framework, distributed stream processing, network data analysis, DDoS attack

Abstract: Under heavy concurrent network access, it is important to identify network attacks in massive data streams and to respond in time. To enable rapid analysis of massive real-time data, a distributed real-time data analysis system (DRDAS) was designed, which handles the collection, storage, and real-time analysis of massive concurrent data. In addition, based on the operating principle of Spark Streaming, a dynamic-sampling parallel K-Means algorithm was proposed, which can detect various kinds of DDoS attacks quickly and efficiently. In the experiments, DRDAS showed good scalability, fault tolerance, and real-time processing ability; combined with the new parallel K-Means algorithm, DRDAS can detect various DDoS attacks in real time and shorten attack detection time.

Keywords: Spark Streaming Framework, Distributed Stream Processing, Network Data Analysis, DDoS Attack

CLC number: