Journal of Computer Applications (计算机应用) ›› 2015, Vol. 35 ›› Issue (7): 1849-1853. DOI: 10.11772/j.issn.1001-9081.2015.07.1849

• Advanced Computing •

Massive terrain data storage based on HBase

LI Zhenju1, LI Xuejun1, XIE Jianwei1, LI Yannan2   

  1. Department of Information Equipment, Equipment Academy, Beijing 101416, China;
  2. 96275 Troops, Luoyang, Henan 471003, China
  • Received: 2015-02-05 Revised: 2015-04-05 Online: 2015-07-10 Published: 2015-07-17
  • Corresponding author: LI Zhenju (李振举, born 1987), male, from Anyang, Henan; assistant engineer, Ph.D. candidate, CCF member. Research interests: cloud computing, remote sensing data management. belovings@163.com
  • About the authors: LI Xuejun (李学军, born 1967), male, from Jianli, Hubei; professor, Ph.D., CCF member. Research interests: communication and information systems, computer graphics. XIE Jianwei (谢剑薇, born 1970), female, from Huangshan, Anhui; associate professor, M.S. Research interests: remote sensing data management. LI Yannan (李雁南, born 1988), male, from Yexian, Henan; assistant engineer, M.S. Research interests: remote sensing data management.

Abstract:

With the development of remote sensing technology, the types and volume of remote sensing data have changed dramatically, which challenges traditional storage methods. To address the low efficiency of managing massive terrain data in HBase, an index design that combines a quadtree with Hilbert encoding is proposed. Firstly, traditional terrain data management approaches and the research status of HBase-based data storage, in China and abroad, are reviewed. Secondly, building on a quadtree organization of global data, the idea of combining the quadtree with Hilbert codes is put forward. Thirdly, two algorithms are designed: one computes the row and column numbers of terrain data from longitude and latitude, and the other computes the Hilbert code from the row and column numbers. Finally, the physical storage structure of the index is designed. The experimental results show that, when massive terrain data is loaded using the proposed index, loading on the Hadoop cluster is 63.79% to 78.45% faster than on a single machine. In range queries over terrain data, the proposed index reduces query time by 16.13% to 39.68% compared with a traditional row-order index. The lowest query speed is 14.71 MB/s, which meets the requirements of terrain data display.

Key words: HBase, terrain data, cloud storage, quadtree-Hilbert index, three-dimensional terrain visualization
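
The abstract names two algorithms (longitude/latitude to row and column numbers, and row and column numbers to a Hilbert code) without giving their details. The following Python sketch is one plausible reading, not the authors' implementation. It assumes that quadtree level `level` splits the globe into 2^level × 2^level tiles, with rows counted from the north and columns from longitude -180°; it uses the standard iterative Hilbert mapping; and the row-key layout `level_code` is purely hypothetical. The function names are likewise illustrative.

```python
def lonlat_to_row_col(lon, lat, level):
    """Map longitude/latitude in degrees to a (row, col) tile at a quadtree level.

    Assumption (not from the paper): the globe is split into
    2**level x 2**level tiles, rows start at the north pole and
    columns start at longitude -180.
    """
    n = 2 ** level
    col = int((lon + 180.0) / 360.0 * n)
    row = int((90.0 - lat) / 180.0 * n)
    # Clamp so that lon = 180 or lat = -90 fall into the last tile.
    return min(max(row, 0), n - 1), min(max(col, 0), n - 1)


def hilbert_code(n, x, y):
    """Return the Hilbert curve index of cell (x, y) on an n x n grid.

    n must be a power of two. This is the standard iterative mapping,
    used here as a stand-in for the paper's own algorithm.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def tile_row_key(lon, lat, level):
    """Build a hypothetical row key: zero-padded level plus zero-padded Hilbert code."""
    row, col = lonlat_to_row_col(lon, lat, level)
    code = hilbert_code(2 ** level, col, row)
    width = len(str(4 ** level - 1))  # digits needed for the largest code at this level
    return f"{level:02d}_{code:0{width}d}"


if __name__ == "__main__":
    # Tile containing a point near Beijing at quadtree level 5 (illustrative only).
    print(tile_row_key(116.4, 39.9, 5))
```

Because HBase orders rows lexicographically by key bytes, zero-padding both fields in this sketch keeps tiles of one level sorted in Hilbert order, so tiles that are close on the ground tend to sit close together in a scan range.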

CLC Number: