Journal of Computer Applications, 2011, Vol. 31, Issue (09): 2317-2320. DOI: 10.3724/SP.J.1087.2011.02317

• Network and communications •

Download performance optimization in Hadoop distributed file system based on P2P

LIAO Bin, YU Jiong, ZHANG Tao, YANG Xing-yao

  1. College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China
  • Received: 2011-03-07 Revised: 2011-05-13 Online: 2011-09-01 Published: 2011-09-01
  • Contact: LIAO Bin
  • About the authors: LIAO Bin (born 1986), male, from Neijiang, Sichuan, M.S. candidate; research interests: databases, grid and cloud computing;
    YU Jiong (born 1964), male, from Beijing, professor, Ph.D.; research interests: network security, grid and distributed computing;
    ZHANG Tao (born 1988), female, from Urumqi, Xinjiang, M.S. candidate; research interests: distributed computing, grid computing;
    YANG Xing-yao (born 1984), male, from Urumqi, Xinjiang, Ph.D. candidate; research interests: distributed computing, grid computing.
  • Supported by:
    National Natural Science Foundation of China (61003131; 61003138; 61073116); Xinjiang University Doctoral Research Start-up Fund (BS090153)

Abstract: The data block storage mechanism and download process inside a Hadoop Distributed File System (HDFS) cluster are analyzed. Drawing on the multi-point, multi-threaded download idea of Peer-to-Peer (P2P) systems, download efficiency optimization algorithms are proposed at three levels: data block, file, and cluster. Because multi-threaded download can unbalance the load inside the cluster, a download-point selection algorithm is proposed to optimize the choice of download points. Mathematical analysis and experimental results show that all three optimization algorithms improve download efficiency and that the download-point selection algorithm achieves good load balancing among the DataNodes in the cluster.

Key words: cloud computing, Hadoop Distributed File System (HDFS), Peer-to-Peer (P2P), parallel download, load balance
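
The abstract does not give the details of the download-point selection algorithm, so the following is only an illustrative sketch of the general idea: each block is fetched from one of its replica holders, the holder with the fewest assigned transfers is preferred, and the blocks are then downloaded in parallel. The function names, the greedy least-loaded rule, and the fetch_block callback are all assumptions made for illustration and are not taken from the paper or from the HDFS API.

```python
from concurrent.futures import ThreadPoolExecutor


def select_download_points(block_replicas, load=None):
    """Greedy, hypothetical selection rule (not the paper's algorithm):
    assign each block to its currently least-loaded replica holder."""
    load = dict(load or {})
    plan = {}
    for block_id, nodes in block_replicas.items():
        # Fewest assigned transfers wins; ties are broken by node name
        # so that the result is deterministic.
        node = min(nodes, key=lambda n: (load.get(n, 0), n))
        plan[block_id] = node
        load[node] = load.get(node, 0) + 1
    return plan, load


def parallel_download(plan, fetch_block, max_workers=4):
    """Fetch all blocks concurrently and reassemble them in block order.
    fetch_block(node, block_id) -> bytes is a caller-supplied stand-in
    for the real transfer from a DataNode."""
    block_ids = sorted(plan)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = list(pool.map(lambda b: fetch_block(plan[b], b), block_ids))
    return b"".join(parts)


if __name__ == "__main__":
    # Four blocks, each replicated on three of four hypothetical DataNodes.
    replicas = {
        0: ["dn1", "dn2", "dn3"],
        1: ["dn1", "dn2", "dn4"],
        2: ["dn1", "dn3", "dn4"],
        3: ["dn2", "dn3", "dn4"],
    }
    plan, load = select_download_points(replicas)
    data = parallel_download(plan, lambda node, b: f"{b}@{node};".encode())
    print(plan, load, data)
```

In this toy setup the greedy rule spreads the four transfers across four different nodes instead of sending every request to the first replica in each list, which is the kind of imbalance the abstract attributes to multi-threaded download.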
