Journal of Computer Applications, 2019, Vol. 39, Issue (6): 1563-1568. DOI: 10.11772/j.issn.1001-9081.2018122602

• 2018 National Annual Conference on High Performance Computing (HPC China 2018) •

Real-time processing of space science satellite data based on stream computing

SUN Xiaojuan1,2,3, SHI Tao1,2, HU Yuxin1,2,3, TONG Jizhou3,4, LI Bing1,2, SONG Yao1,2,3   

  1. Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China;
    2. Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China;
    3. University of Chinese Academy of Sciences, Beijing 100049, China;
    4. National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2018-12-12; Revised: 2019-02-03; Online: 2019-06-10; Published: 2019-06-17
  • Supported by:
    This work is partially supported by the 13th Five-year Information Plan of the Chinese Academy of Sciences (XXH13505-04) and the Beijing Science and Technology Plan (Z181100002918002).

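  • Corresponding author: SUN Xiaojuan
  • About the authors: SUN Xiaojuan (1980-), female, born in Yantai, Shandong, Ph.D., associate research fellow, CCF member; research interests: space information processing, high performance computing. SHI Tao (1982-), male, born in Tianjin, M.S., associate research fellow, CCF member; research interests: space information processing. HU Yuxin (1981-), male, born in Chifeng, Inner Mongolia, Ph.D., research fellow; research interests: space information processing systems. TONG Jizhou (1976-), female, born in Beijing, M.S., associate research fellow; research interests: space science big data management. LI Bing (1990-), male, born in Yuncheng, Shanxi, M.S., assistant research fellow; research interests: space information processing. SONG Yao (1994-), male, born in Xinyu, Jiangxi, M.S.; research interests: space information processing.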

Abstract: To meet the increasingly stringent real-time processing requirements for space science satellite observation data, a real-time processing method based on a stream computing framework was proposed. Firstly, the data stream was abstracted and analyzed according to the data processing characteristics of space science satellites. Then, the input and output data structures of each processing unit were redefined. Finally, a parallel data stream processing structure was designed on the stream computing framework Storm to meet the requirements of parallel processing and distributed computing of large-scale data. A space science satellite data processing system developed with this method was tested and analyzed. The results show that, under the same conditions, the data processing time is half that of the original system, and that the data localization strategy achieves higher throughput than the round-robin strategy, with data tuple throughput increased by 29% on average. The use of a stream computing framework can therefore greatly shorten the data processing delay and improve the real-time performance of the space science satellite data processing system.

Key words: stream computing, data stream, Storm, space science satellite, data processing
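
The abstract contrasts a round-robin strategy with a data localization strategy for distributing data tuples among parallel processing units. The following is a minimal Python sketch of that contrast, not the authors' implementation: it does not use Storm, and the node names, task names, and tuple layout are all hypothetical. It only counts how many tuples would have to cross node boundaries under each strategy, which is one plausible reason a locality-aware strategy could achieve higher throughput. The idea is similar in spirit to Storm's `localOrShuffleGrouping`, though the strategy used in the paper may differ.

```python
from itertools import cycle


def round_robin_dispatch(tuples, downstream_tasks):
    """Assign tuples to downstream tasks in strict rotation."""
    rotation = cycle(downstream_tasks)
    return [(t, next(rotation)) for t in tuples]


def locality_dispatch(tuples, downstream_tasks, node_of):
    """Prefer a downstream task on the emitting node; otherwise rotate."""
    fallback = cycle(downstream_tasks)
    local_cycles = {}  # emitting node -> rotation over its local tasks
    assignments = []
    for t in tuples:
        src = t["node"]
        if src not in local_cycles:
            local = [d for d in downstream_tasks if node_of[d] == src]
            local_cycles[src] = cycle(local) if local else None
        chooser = local_cycles[src]
        task = next(chooser) if chooser else next(fallback)
        assignments.append((t, task))
    return assignments


def remote_transfers(assignments, node_of):
    """Count tuples whose target task runs on a different node."""
    return sum(1 for t, task in assignments if t["node"] != node_of[task])


# Hypothetical layout: two processing tasks, one per node (assumption).
NODE_OF = {"proc-0": "node-a", "proc-1": "node-b"}
TASKS = ["proc-0", "proc-1"]
# Hypothetical tuples: four emitted on node-a, then four on node-b.
TUPLES = [{"id": i, "node": "node-a" if i < 4 else "node-b"} for i in range(8)]

rr_remote = remote_transfers(round_robin_dispatch(TUPLES, TASKS), NODE_OF)
loc_remote = remote_transfers(locality_dispatch(TUPLES, TASKS, NODE_OF), NODE_OF)
print(rr_remote, loc_remote)
```

In this toy layout, round-robin sends half of the tuples to a task on another node, while the locality-aware variant keeps every tuple on its emitting node. The sketch makes no claim about the 29% figure reported in the abstract, which comes from the authors' own measurements.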

