
Academic Conference on Information Storage Technology +28+ Research on Optimization of Distributed Database Aggregation

Zi-Da XIAO (肖子达), Li-Gu ZHU (朱立谷), Dong-Yu FENG (冯东煜), Di ZHANG (张迪)

  1. Communication University of China
  • Received: 2016-11-17 Revised: 2016-11-26 Online: 2016-11-26
  • Contact: Zi-Da XIAO

Abstract: NoSQL distributed databases address the pain points of traditional relational databases in the big-data era, namely the need for large-scale storage and high concurrency. For analytical applications, NoSQL-based OLAP provides basic aggregation analysis. MongoDB, one class of NoSQL database, offers built-in aggregation based on basic MapReduce that covers basic analytical needs. Aggregation performance can be improved by scaling out the cluster; alternatively, when physical resources are limited, it can be improved by configuring the database to match the application's workload. Starting from a practical application, this paper proposes a method for improving MongoDB aggregation performance under limited resources. The method starts from shard key selection and index configuration: by analyzing MongoDB's sharding characteristics, it chooses a shard key and indexes that distribute data evenly across the nodes, so that existing hardware resources are fully used. Experiments on data from a real project are analyzed and interpreted to validate the method; the results show that aggregation performance reaches a high level, and they lay a foundation for further study.

Key words: NoSQL (Not Only SQL), MongoDB, MapReduce, aggregation, performance optimization
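The link between shard key choice and even data distribution can be illustrated with a small simulation. The sketch below is not the authors' method and does not call MongoDB: it assumes a hypothetical four-shard cluster, static chunk boundaries with no balancer, and an MD5-based stand-in for a hashed shard key (MongoDB's internal hash function is not reproduced here). It compares where new documents with monotonically increasing keys would land under range-based and hash-based placement.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4                    # hypothetical cluster size
BOUNDARIES = [1000, 2000, 3000]   # hypothetical static chunk upper bounds


def hashed_shard(key, num_shards=NUM_SHARDS):
    """Place a document by hashing its key (illustrative stand-in hash)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def ranged_shard(key, boundaries=BOUNDARIES):
    """Place a document in the first range whose upper bound exceeds the key."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)  # keys beyond the last bound go to the last shard


def imbalance(counts, num_shards=NUM_SHARDS):
    """Load of the busiest shard divided by the ideal even share (1.0 is perfect)."""
    total = sum(counts.values())
    busiest = max(counts.get(s, 0) for s in range(num_shards))
    return busiest / (total / num_shards)


# New documents arrive with monotonically increasing keys, e.g. timestamps or counters.
keys = range(3000, 13000)
range_counts = Counter(ranged_shard(k) for k in keys)
hash_counts = Counter(hashed_shard(k) for k in keys)

print("range-based imbalance:", imbalance(range_counts))
print("hash-based imbalance:", round(imbalance(hash_counts), 3))
```

Under these assumptions every new document falls into the last range, so one shard receives all of the load, while the hashed placement spreads the same keys almost evenly. A real deployment also splits and migrates chunks, so this only models where incoming data lands, not the long-term layout.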
