Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (2): 311-315. DOI: 10.11772/j.issn.1001-9081.2017.02.0311

• The 33rd National Database Conference of China (NDBC 2016)

HBase spatio-temporal index for massive traffic data

FANG Jun, LI Dong, GUO Huiyun, WANG Jiayi

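  1. Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, North China University of Technology, Beijing 100041
  • Received: 2016-08-12 Revised: 2016-09-06 Online: 2017-02-10 Published: 2017-02-11
  • Corresponding author: FANG Jun, fangjun@ncut.edu.cn
  • About the authors: FANG Jun (born 1976), male, from Nanjing, Jiangsu; associate research fellow, Ph.D.; research interests: cloud data management and massive spatio-temporal data management. LI Dong (born 1989), male, from Yongzhou, Hunan; master's student; research interest: cloud data management. GUO Huiyun (born 1992), female, from Luohe, Henan; master's student; research interest: distributed system scheduling. WANG Jiayi (born 1993), female, from Beijing; master's student; research interest: massive spatio-temporal data management.
  • Funding:

    Beijing Municipal Natural Science Foundation (4131001, 4142023).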

Spatio-temporal index for massive traffic data based on HBase

FANG Jun, LI Dong, GUO Huiyun, WANG Jiayi   

  1. Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, North China University of Technology, Beijing 100041, China
  • Received: 2016-08-12 Revised: 2016-09-06 Online: 2017-02-10 Published: 2017-02-11
  • Supported by:

    This work is partially supported by the Beijing Municipal Natural Science Foundation (4131001, 4142023).

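Abstract (translated from the Chinese text):

To address the poor traffic data query performance caused by HBase's inability to build a spatio-temporal index directly, an HBase spatio-temporal index for massive traffic data is designed on top of HBase row keys. First, the Geohash dimensionality reduction method converts two-dimensional spatial position data into a one-dimensional code, which is then combined with the time dimension. Next, four structural models are proposed according to the order of combination, and the composition of each model and its range of applicability in traffic data queries are discussed. Finally, the corresponding spatio-temporal index management algorithms and a traffic data query method based on the HBase spatio-temporal index are presented. Experiments verify that the proposed HBase spatio-temporal index structure effectively improves region query performance on massive traffic data. They also compare the four index structures across different data sizes, query radii, and time ranges, quantitatively verifying the scenarios each structure suits.

Key words: massive traffic data, HBase, Geohash, spatio-temporal index, region query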

Abstract:

To address the degraded traffic data query performance caused by HBase's lack of a native spatio-temporal index, HBase spatio-temporal indexes based on row keys were proposed for massive traffic data. Firstly, a Geohash-based dimensionality reduction method was used to convert two-dimensional spatial position data into a one-dimensional code, which was then combined with the temporal dimension. Secondly, four index models were put forward according to the order of combination, and the structure of each model and its applicability to traffic data queries were discussed. Finally, an index creation algorithm and a traffic data query algorithm were proposed. Experimental results show that the proposed HBase spatio-temporal index structure effectively improves region query performance on massive traffic data. In addition, the query performance of the four index structures was compared under different data sizes, query radii, and query time ranges, which quantitatively verified the scenarios to which each structure is suited.

Key words: massive traffic data, HBase, Geohash, spatio-temporal index, range query
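
The core idea of the abstract, encoding a position with Geohash and combining the code with a timestamp in a row key, can be illustrated with a minimal sketch. The Geohash encoding follows the standard public algorithm. The function names (`geohash_encode`, `row_key_space_first`, `row_key_time_first`), the `_` separator, the timestamp format, and the default precision are assumptions made for illustration only. The two orderings shown are not the four models of the paper, whose exact layouts are not given in the abstract.

```python
from datetime import datetime

# Standard Geohash base-32 alphabet (digits plus letters without a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    """Encode a latitude/longitude pair as a Geohash string.

    Bits alternate between longitude and latitude, starting with
    longitude; every 5 bits become one base-32 character.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0       # bits collected for the current character
    bit_count = 0
    use_lon = True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)


def _format_time(ts: datetime) -> str:
    # Hypothetical fixed-width format, so that lexicographic order of
    # row keys matches chronological order.
    return ts.strftime("%Y%m%d%H%M%S")


def row_key_space_first(lat: float, lon: float, ts: datetime,
                        precision: int = 6) -> str:
    """Hypothetical layout: Geohash prefix, then time."""
    return f"{geohash_encode(lat, lon, precision)}_{_format_time(ts)}"


def row_key_time_first(lat: float, lon: float, ts: datetime,
                       precision: int = 6) -> str:
    """Hypothetical layout: time prefix, then Geohash."""
    return f"{_format_time(ts)}_{geohash_encode(lat, lon, precision)}"


if __name__ == "__main__":
    ts = datetime(2016, 8, 12, 9, 30, 0)
    print(row_key_space_first(57.64911, 10.40744, ts))
    print(row_key_time_first(57.64911, 10.40744, ts))
```

Because HBase stores rows sorted by row key, a space-first key would tend to place records from the same grid cell next to each other, while a time-first key would tend to group records from the same time period. This is one plausible reason why the order of combination affects which queries each structure suits.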

CLC number: