Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (3): 620-627. DOI: 10.11772/j.issn.1001-9081.2017.03.620

• The 4th Academic Conference on Big Data (CCF BIGDATA2016) •

Key technologies of distributed data stream processing based on big data

CHEN Fumei, HAN Dezhi, BI Kun, DAI Yongtao

  1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
  • Received: 2016-09-20 Revised: 2016-10-18 Online: 2017-03-10 Published: 2017-03-22
  • Corresponding author: HAN Dezhi
  • About the authors: CHEN Fumei (born 1989), female, from Linyi, Shandong, M.S. candidate; research interests: cloud computing, real-time big data analysis. HAN Dezhi (born 1966), male, from Xinyang, Henan, professor, Ph.D., CCF senior member; research interests: cloud computing, cloud storage and its security technology, big data application technology. BI Kun (born 1981), male, from Qingdao, Shandong, lecturer, Ph.D.; research interests: cloud computing, cloud storage, big data application technology. DAI Yongtao (born 1991), male, from Shaoyang, Hunan, M.S. candidate; research interests: cloud computing, distributed computing, data mining, network security technology.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61373028, 61672338).

Abstract: In big data environments, data stream processing has strict real-time requirements, and computation must be continuous and highly reliable. A Distributed Data Stream Processing System (DDSPS) can meet these requirements: in addition to the scalability and fault tolerance of a distributed system, it offers strong real-time processing capability. The four subsystems that make up a big-data-oriented DDSPS and their key technologies are introduced in detail, and the alternative technical schemes for each subsystem are discussed and compared. A case study of a data stream processing system architecture for detecting Distributed Denial of Service (DDoS) attacks is also presented. The work can serve as a technical reference for theoretical research on data stream processing and for application development in big data environments.

Key words: big data, stream processing, message queue, data processing, data storage

CLC Number: