Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (4): 941-944. DOI: 10.11772/j.issn.1001-9081.2017.04.0941

• Big data, cloud computing and their applications •

Three-dimensional storm tracking method based on a distributed computing framework

ZENG Qin1, LI Yongsheng2

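  1. Guangdong Meteorological Observatory, Guangzhou 510080;
  2. Guangdong Meteorological Observation Data Center, Guangzhou 510080
  • Received: 2016-10-08 Revised: 2017-01-06 Online: 2017-04-10 Published: 2017-04-19
  • Corresponding author: ZENG Qin
  • About the authors: ZENG Qin (born 1975), male, from Meizhou, Guangdong, senior engineer, M.S.; main research interests: big data, cloud computing, refined forecasting, meteorological big data analysis. LI Yongsheng (born 1980), male, from Chifeng, Inner Mongolia, senior engineer, M.S.; main research interests: big data, cloud computing, mass data storage, meteorological big data analysis.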

Three-dimensional storm tracking method based on distributed computing architecture

ZENG Qin1, LI Yongsheng2   

  1. Guangdong Meteorological Observatory, Guangzhou Guangdong 510080, China;
    2. Guangdong Meteorological Observation Data Center, Guangzhou Guangdong 510080, China
  • Received: 2016-10-08 Revised: 2017-01-06 Online: 2017-04-10 Published: 2017-04-19

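Abstract: Meteorological data now grows at a rate of terabytes per hour, which leaves traditional relational databases and file storage systems struggling to store and manage such volumes. As a result, applications built on large-scale heterogeneous meteorological data cannot scale, and researchers cannot explore massive meteorological datasets efficiently. To address these problems, researchers have separately applied the distributed computing and storage technologies of frameworks such as MapReduce and HBase, seeking effective technical means for exploring massive meteorological data; to the authors' knowledge, however, no study combining them has yet been carried out. Therefore, using the large volume of Doppler weather radar data accumulated in recent years, a three-dimensional storm tracking method combining MapReduce and HBase was studied, and several distributed service interfaces for point, line, surface and volume access to radar data were implemented on top of the conventional standardized REST interface. Compared with a conventional standardized REST interface for single-machine data storage and access, the implemented method improves efficiency by 100%. Finally, a storm tracking recomputation over three years of radar data (2007 to 2009) for the Pearl River Delta region further verified the performance advantages of the proposed method in computation and storage management.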

Keywords: distributed computing framework, storm tracking algorithm, long time series analysis

Abstract: Meteorological data has grown dramatically in recent years, and its volume now reaches the TB-per-hour level. Traditional relational databases and file storage systems struggle to store and manage data at this scale, so large-scale heterogeneous meteorological data cannot be used effectively in operational meteorology, and researchers find it difficult to explore such data efficiently. To tackle these problems, researchers have applied distributed computing and storage techniques based on frameworks such as MapReduce and HBase, which provide an effective way to exploit large-scale meteorological data. These techniques have been tested separately in meteorological applications; however, to the best of our knowledge, they have not been studied jointly. Therefore, a new 3D storm tracking method combining MapReduce and HBase was studied using the large amount of weather radar data accumulated in recent years. Moreover, on top of the original REST interface, a series of distributed service interfaces were implemented for accessing point, line and surface data. Compared with a standard single-machine data storage and access interface based on REST, the proposed method has better overall performance, with efficiency improved by about 100%. A practical application, tracking 3D storms over the Pearl River Delta (Zhujiang River Delta) from 2007 to 2009, was used to further validate the performance of the proposed method.
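
The abstract does not give implementation details, so the following is only a minimal single-process sketch of how a MapReduce-style pass over radar samples could feed a storm tracker. It is not the authors' algorithm. The record layout, the 35 dBZ threshold, the one-storm-per-scan simplification, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Assumed record layout: (scan_time_min, x_km, y_km, z_km, reflectivity_dbz).
THRESHOLD_DBZ = 35.0  # assumed storm threshold, not taken from the paper


def map_sample(record):
    """Map step: emit (scan_time, sample) for samples at or above the threshold."""
    t, x, y, z, dbz = record
    if dbz >= THRESHOLD_DBZ:
        yield t, (x, y, z, dbz)


def reduce_scan(t, samples):
    """Reduce step: reflectivity-weighted 3D centroid of one scan's samples."""
    total = sum(s[3] for s in samples)
    cx = sum(s[0] * s[3] for s in samples) / total
    cy = sum(s[1] * s[3] for s in samples) / total
    cz = sum(s[2] * s[3] for s in samples) / total
    return t, (cx, cy, cz)


def run_mapreduce(records):
    """Group map output by key (the shuffle), then reduce each group in time order."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_sample(record):
            groups[key].append(value)
    return [reduce_scan(t, samples) for t, samples in sorted(groups.items())]


def track(centroids):
    """Displacement of the centroid between each pair of consecutive scans."""
    return [
        (t1, t2, (c2[0] - c1[0], c2[1] - c1[1], c2[2] - c1[2]))
        for (t1, c1), (t2, c2) in zip(centroids, centroids[1:])
    ]
```

In a real deployment the map and reduce functions would run as distributed tasks and the centroids would be written to a store such as HBase; here a dictionary stands in for the shuffle so that the data flow is easy to follow.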

Key words: distributed computing framework, storm tracking method, long time series analysis

CLC number: