Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (4): 928-935. DOI: 10.11772/j.issn.1001-9081.2017.04.0928

• Big data, cloud computing and their applications •

Real-time detailed classification energy consumption measurement system based on Spark Streaming

WU Zhixue (武志学)1,2

  1. Chengdu Wuzhou Handge Technology Limited, Chengdu Sichuan 611731, China;
  2. School of Information Security Engineering, Chengdu University of Information Technology, Chengdu Sichuan 610225, China
  • Received: 2016-10-10 Revised: 2016-12-21 Online: 2017-04-10 Published: 2017-04-19
  • Corresponding author: WU Zhixue
  • About the author: WU Zhixue (born 1960), male, from Hejin, Shanxi; professor, Ph.D. Main research interests: cloud computing, stream data processing, data mining.

Abstract: Detailed classification energy consumption measurement can reveal energy-use problems accurately, promptly and effectively, so that the most effective energy-saving measures can be formulated and implemented. A detailed classification energy consumption measurement system must compute the usage of each energy item at multiple granularities (time scales) according to the detailed classification coding. It has real-time requirements and also involves relatively complex statistical operations such as aggregation, de-duplication and joins. Because the data are generated quickly, must be processed in real time and are large in volume, it is difficult to collect them centrally and store them in a database before processing, so the traditional data processing architecture cannot meet these requirements. To address this, a real-time detailed classification energy consumption measurement system based on Spark Streaming, a big data stream processing technology, was designed and implemented. The system architecture and internal structure are described in detail, and the system's real-time data processing capability is analyzed using experimental data. Unlike the traditional architecture, the proposed system captures and processes data in real time as they flow: it reports captured anomalies to the front end promptly as alarms, and it saves the classified, itemized statistics to a database for offline analysis and data mining. This effectively solves the data processing problems described above.

Key words: stream computing, detailed classification energy consumption measurement, Spark Streaming, Apache Kafka, big data
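The abstract names three operations the system performs on the data stream (de-duplication, joining and aggregation) together with anomaly alarms. As a rough illustration only, the sketch below shows what those steps could look like for a single micro-batch in plain Python. It is not the paper's implementation and does not use the Spark Streaming API; the `Reading` type, the `meter_items` lookup table, the one-hour bucket and the `alert_kwh` threshold are all hypothetical names and values chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    meter_id: str    # hypothetical meter identifier
    timestamp: int   # seconds since epoch
    kwh: float       # energy used since the previous reading


def process_batch(readings, meter_items, bucket_seconds=3600, alert_kwh=50.0):
    """Process one micro-batch: de-duplicate, join, aggregate, flag anomalies."""
    seen = set()
    totals = defaultdict(float)
    alerts = []
    for r in readings:
        key = (r.meter_id, r.timestamp)
        if key in seen:                      # de-duplication: drop repeated deliveries
            continue
        seen.add(key)
        item = meter_items.get(r.meter_id)   # join: meter -> classification item
        if item is None:                     # unknown meter: skipped in this sketch
            continue
        if r.kwh > alert_kwh:                # anomaly: one reading above the threshold
            alerts.append(r)
        bucket = r.timestamp - r.timestamp % bucket_seconds
        totals[(item, bucket)] += r.kwh      # aggregation per item and time bucket
    return dict(totals), alerts


# Hypothetical metadata and readings, for illustration only.
meter_items = {"m1": "lighting", "m2": "lighting", "m3": "hvac"}
batch = [
    Reading("m1", 0, 1.5),
    Reading("m1", 0, 1.5),       # duplicate delivery of the same reading
    Reading("m2", 1800, 2.0),
    Reading("m3", 3600, 60.0),   # exceeds the alarm threshold
]
totals, alerts = process_batch(batch, meter_items)
print(totals)   # per-item, per-hour totals that would be saved to the database
print(alerts)   # readings that would be reported to the front end
```

In a streaming deployment the same per-batch logic would run once per batch interval, with the totals written to storage and the alerts pushed onward; changing `bucket_seconds` gives coarser or finer statistical granularity.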

CLC number: