Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (12): 3403-3407. DOI: 10.11772/j.issn.1001-9081.2015.12.3403

• Advanced computing •

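  • Corresponding author: HU Ying (born 1990), female, from Shaoyang, Hunan, master's student; main research interests: information security, anomaly detection
  • About the authors: JIANG Hua (born 1963), male, from Xinyang, Henan, professor, Ph.D.; main research interests: information security, databases.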

Information fusion algorithm for Argo buoy profile based on MapReduce

JIANG Hua, HU Ying   

  1. School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin Guangxi 541004, China
  • Received:2015-06-15 Revised:2015-09-09 Online:2015-12-10 Published:2015-12-10

Abstract: Current analyses of Argo buoy profiles treat a single buoy as the unit of analysis, which gives an incomplete picture, and single-machine processing makes the computation slow and complex. To address both problems, an information fusion algorithm is proposed that takes latitude-longitude grid cells as the unit of analysis and combines MapReduce with principal curve analysis. In the Map phase, the valid information of each Argo buoy is extracted from a large number of data files, and the extracted profiles are partitioned by latitude and longitude. In the Reduce phase, a principal profile is generated for each region: the data are first normalized, and Kégl's principal curve theory is then used to obtain a principal profile that consists of a small number of profile points joined by line segments and captures the characteristics of the regional profiles, thereby fusing the information of massive numbers of Argo buoys. The algorithm was verified on global Argo buoy sample data: for squared distances of 0.03 to 0.10, the mean residual error was below 0.1, and compared with traditional single-machine processing, storage space was reduced by 99.4% and computation speed increased by 36.4%. The results show that the proposed algorithm generates accurate principal profiles while greatly reducing storage space and improving computation speed.

Key words: Argo profile, principal curve, MapReduce, information fusion
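
The Map and Reduce phases described in the abstract can be sketched in a few lines of Python. This is a minimal single-process illustration under stated assumptions, not the authors' implementation. The profile dictionary layout, the function names, the 1-degree cell size, and the choice of min-max scaling for the normalization step are all hypothetical, since the abstract specifies none of them. The principal curve fitting itself is deliberately left out.

```python
from collections import defaultdict
from math import floor


def grid_cell(lat, lon, cell_deg=1.0):
    """Map-phase key: index of the latitude-longitude cell containing a point.

    The 1-degree default cell size is an assumption for illustration.
    """
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def map_profile(profile, cell_deg=1.0):
    """Emit (cell, samples) for one profile; profiles without samples are dropped.

    `profile` is a hypothetical dict with keys "lat", "lon" and "samples",
    where "samples" is a list of (depth, value) pairs.
    """
    if not profile["samples"]:
        return []
    key = grid_cell(profile["lat"], profile["lon"], cell_deg)
    return [(key, profile["samples"])]


def min_max_normalize(points):
    """Rescale each coordinate of the points to [0, 1].

    Min-max scaling is an assumed choice; the abstract only says "normalized".
    """
    scaled_columns = []
    for column in zip(*points):
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1.0  # constant column: avoid division by zero
        scaled_columns.append([(v - lo) / span for v in column])
    return list(zip(*scaled_columns))


def reduce_cell(sample_lists):
    """Reduce phase, first step only: pool and normalize one cell's samples.

    A principal curve fit would follow here; it is omitted from this sketch.
    """
    pooled = [pt for samples in sample_lists for pt in samples]
    return min_max_normalize(pooled)


def run(profiles, cell_deg=1.0):
    """Single-process stand-in for the shuffle between Map and Reduce."""
    groups = defaultdict(list)
    for profile in profiles:
        for key, samples in map_profile(profile, cell_deg):
            groups[key].append(samples)
    return {key: reduce_cell(lists) for key, lists in groups.items()}
```

Calling `run` on a list of such profile dictionaries returns one normalized point set per occupied grid cell. Grouping by cell is what lets the Reduce step work on each region independently, which is the property a MapReduce framework exploits to distribute the work.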

CLC number: