Lai Jinhui1, Xu Zichen1, Tu Yicheng2, Tan Guolong1
Abstract: Database researchers have observed that different queries can overlap in their data access paths and share computation, and have proposed the Multiple-Query-at-a-Time model, which batches the queries in a workload. Several multiple-query processing frameworks have been developed and shown to be effective, but none offers a general framework for building complete query processing and optimization methods. Our previous work built a query-time operator-merging optimization framework based on equivalence transformations. Building on that work, this paper proposes OmegaDB, a concurrent computing framework of relational operators for heterogeneous architectures. By studying a GPU-oriented stream-batch computing model for relational operators and constructing a relational data query pipeline, we implement a stream-batch computing method for aggregated multi-query processing on a CPU-GPU heterogeneous architecture. In the experiments and prototype implementation, theoretical analysis and empirical evaluation demonstrate the advantages of OmegaDB over modern RDBMSs and give a first indication of its potential to exploit new hardware. Based on a theoretical study of a query optimization framework that incorporates the traditional relational algebra rules into a multiple-query-at-a-time execution model, we propose several optimization methods and future research directions. On the TPC-H business intelligence benchmark, our experiments show that OmegaDB achieves up to a 24x end-to-end speedup over the modern commercial database system SQL Server while consuming less disk I/O and CPU time.
Key words: highly concurrent relational database, relational algebra, streaming batch computing, new hardware acceleration
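The Multiple-Query-at-a-Time idea described in the abstract can be illustrated with a minimal CPU-only sketch. It is not OmegaDB's implementation and involves no GPU code: the `Query` class, the function names, and the sample rows are all hypothetical, and the only point shown is that a batch of filter-and-sum queries can be answered from one shared scan instead of one scan per query.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class Query:
    """Hypothetical filter-then-sum query: SUM(column) WHERE predicate(row)."""
    name: str
    predicate: Callable[[Row], bool]
    column: str


def run_one_at_a_time(rows: List[Row], queries: List[Query]) -> Dict[str, float]:
    """Baseline: every query performs its own full scan of the table."""
    return {
        q.name: sum(r[q.column] for r in rows if q.predicate(r))
        for q in queries
    }


def run_batched(rows: List[Row], queries: List[Query]) -> Dict[str, float]:
    """Batched execution: one shared scan feeds every query in the batch."""
    totals = {q.name: 0.0 for q in queries}
    for r in rows:              # the table is read only once
        for q in queries:       # each row is offered to every query
            if q.predicate(r):
                totals[q.name] += r[q.column]
    return totals


# Illustrative data, not taken from TPC-H.
rows = [
    {"qty": 5, "price": 10.0},
    {"qty": 20, "price": 2.5},
    {"qty": 30, "price": 4.0},
]
queries = [
    Query("big_orders", lambda r: r["qty"] >= 20, "price"),
    Query("all_qty", lambda r: True, "qty"),
]

# Both strategies must return the same answers; only the number of scans differs.
assert run_batched(rows, queries) == run_one_at_a_time(rows, queries)
```

With two queries the baseline reads the table twice and the batched version reads it once. The saving in data access grows with the batch size, which is the sharing opportunity the abstract refers to.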
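Lai Jinhui, Xu Zichen, Tu Yicheng, Tan Guolong. OmegaDB: a concurrent computing framework of relational operators for heterogeneous architectures[J].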
URL: https://www.joca.cn/EN/