Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (8): 2056-2059.

• Advanced Computing •

Parallel astronomical cross-match for massive data

ZHAO Qing1, SUN Jizhou1, CUI Chenzhou2, YU Ce1, XIAO Jian3

  1. Tianjin University
    2. Chinese Academy of Sciences
    3.
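  • Received: 2010-02-11 Revised: 2010-03-20 Online: 2010-07-30 Published: 2010-08-01
  • Corresponding author: YU Ce
  • Supported by:
    National Natural Science Foundation of China; Natural Science Foundation of Tianjin; Tianjin Key Science and Technology Support Program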

Parallel massive data oriented astronomical cross-match

  • Received:2010-02-11 Revised:2010-03-20 Online:2010-07-30 Published:2010-08-01

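Abstract: Cross-matching is a key technique for fusing multi-band data, and the sheer volume of astronomical data means the problem has to be solved computationally. A parallel cross-match algorithm was designed following the PCAM parallel design model. To address the performance bottleneck that cross-matching faces in data I/O, the relationship between the volume of data read and the amount of computation was balanced by adjusting the partition granularity, filtering out blank regions, and optimizing the data loading and computation flow. Experiments show that the parallel method markedly improves the efficiency of cross-match computation. Support for HTM and HEALPix, the two most commonly used indexing schemes for astronomical data, was also considered, and their performance was compared experimentally, providing a technical reference for projects such as China's thematic astronomical database and virtual observatory.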

Keywords: astronomical cross-match, HTM, HEALPix, PCAM parallel programming model, large-scale data processing

Abstract: Cross-match is a key technique for multi-band data fusion. Because astronomical data sets are very large, it must be carried out computationally. This paper presents a parallel cross-match algorithm designed with the PCAM parallel programming model. Because the performance bottleneck of cross-match lies in data I/O, the algorithm balances the amount of data read against the amount of computation by tuning the partition granularity, filtering out blank areas, and optimizing the data loading and computation flow. Experiments show that it speeds up large-scale cross-match considerably. In addition, support for HTM and HEALPix, the two most widely used astronomical data indexing schemes, was considered, and their performance was compared experimentally. The cross-match algorithm presented here offers a technical reference for projects such as the national astronomical database and the virtual observatory.

Key words: astronomical cross-match, HTM, HEALPix, PCAM parallel programming model, massive data processing
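
The partition-then-match idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. Simple declination bands stand in for HTM or HEALPix cells, a thread pool stands in for the parallel mapping step, and every name here (`cross_match`, `zone_of`, `zone_height`, and so on) is hypothetical. Catalogs are assumed to be lists of `(id, ra_deg, dec_deg)` tuples.

```python
import math
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def zone_of(dec, zone_height):
    """Assumed partition key: index of the declination band holding dec."""
    return int(math.floor((dec + 90.0) / zone_height))


def bucket(catalog, zone_height):
    """Group (id, ra, dec) rows by zone; empty zones never get a key."""
    zones = defaultdict(list)
    for row in catalog:
        zones[zone_of(row[2], zone_height)].append(row)
    return zones


def match_zone(zone, zones_a, zones_b, radius):
    """Match catalog A sources in one zone against B in that zone and its neighbours."""
    candidates = [b for z in (zone - 1, zone, zone + 1) for b in zones_b.get(z, [])]
    pairs = []
    for id_a, ra_a, dec_a in zones_a[zone]:
        for id_b, ra_b, dec_b in candidates:
            if angular_sep_deg(ra_a, dec_a, ra_b, dec_b) <= radius:
                pairs.append((id_a, id_b))
    return pairs


def cross_match(cat_a, cat_b, radius, zone_height=1.0, workers=4):
    """Return sorted (id_a, id_b) pairs separated by at most radius degrees."""
    if zone_height < radius:
        # Neighbouring zones only cover every match when a band is at least one radius tall.
        raise ValueError("zone_height must be >= radius")
    zones_a = bucket(cat_a, zone_height)
    zones_b = bucket(cat_b, zone_height)
    # Blank-region filter: skip zones of A with no B sources in or next to them.
    tasks = [z for z in zones_a
             if any(n in zones_b for n in (z - 1, z, z + 1))]
    # Each zone is an independent task, so the tasks can be mapped onto workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda z: match_zone(z, zones_a, zones_b, radius), tasks)
        return sorted(pair for chunk in chunks for pair in chunk)
```

In this sketch `zone_height` plays the role of the partition granularity. Smaller bands mean fewer candidate comparisons per task but more tasks and more boundary overlap, which is the kind of trade-off between data read and computation that the abstract describes. A process pool or MPI ranks would be a more realistic mapping target for CPU-bound matching than threads.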