计算机应用 ›› 2014, Vol. 34 ›› Issue (11): 3218-3221.DOI: 10.11772/j.issn.1001-9081.2014.11.3218

• 先进计算 • 上一篇    下一篇

基于KD树和R树的多维云数据索引

何婧1,2,吴跃1,杨帆2,尹春雷3,周维2   

  1. 1. 电子科技大学 计算机科学与工程学院,成都 611731;
    2. 云南大学 软件学院,昆明 650091;
    3. 云南农业大学 建筑工程学院,昆明 650201
  • 收稿日期:2014-05-14 修回日期:2014-07-03 出版日期:2014-11-01 发布日期:2014-12-01
  • 通讯作者: 何婧
  • 作者简介:何婧(1978-),女,云南红河人,讲师,博士研究生,CCF会员,主要研究方向:云数据管理、数据挖掘、知识发现;吴跃(1958-),男,四川成都人,教授,博士生导师,主要研究方向:云计算、数据挖掘;杨帆(1990-),男,湖北武汉人,硕士研究生,主要研究方向:云存储、分布式计算;尹春雷(1974-),男,云南蒙自人,讲师,硕士研究生,主要研究方向:计算机网络、建筑自动化;周维(1974-),男,云南昆明人, 副教授,博士,主要研究方向:分布式计算、云计算。
  • 基金资助:

    国家自然科学基金资助项目

Multi-dimensional cloud index based on KD-tree and R-tree

HE Jing1,2,WU Yue1,YANG Fan2,YIN Chunlei3,ZHOU Wei2   

  1. 1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 611731, China;
    2. School of Software, Yunnan University, Kunming Yunnan 650091, China;
    3. School of Architecture Engineering, Yunnan Agricultural University, Kunming Yunnan 650201, China
  • Received:2014-05-14 Revised:2014-07-03 Online:2014-11-01 Published:2014-12-01
  • Contact: HE Jing

摘要:

针对云存储系统大多基于键值对模型存储数据,多维查询需要对整个数据集进行完全扫描,查询效率较低的问题,提出了一种基于KD树和R树的多维索引结构(简称KD-R索引)。KD-R索引采用双层索引模式,在全局服务器建立基于KD树的多维全局索引,在局部数据节点构建R树多维本地索引。基于性能损耗模型,选取索引代价较小的R树节点发布到全局KD树,从而优化多维查询性能。实验结果表明:与全局分布式R树索引相比,KD-R索引能够有效提高多维范围查询性能,并且在出现服务器节点失效的情况下,KD-R索引同样具有高可用性。

Abstract:

Most existing cloud storage systems are based on the model, which leads to a full dataset scan for multi-dimensional queries and low query efficiency. A KD-tree and R-tree based multi-dimensional cloud data index named KD-R index was proposed. KD-R index adopted two-layer architecture: a KD-tree based global index was built in the global server and R-tree based local indexes were built in local server. A cost model was used to adaptively select appropriate R-tree nodes to publish into global KD-tree index. The experimental results show that, compared with R-tree based global index, KD-R index is efficient for multi-dimensional range queries, and it has high availability in the case of server failure.

中图分类号: