Journal of Computer Applications, 2015, Vol. 35, Issue (2): 378-382. DOI: 10.11772/j.issn.1001-9081.2015.02.0378


Energy-efficient strategy of distributed file system based on data block clustering storage

WANG Zhengying1, YU Jiong1,2, YING Changtian1, LU Liang1   

  1. School of Information Science and Engineering, Xinjiang University, Urumqi Xinjiang 830046, China;
  2. School of Software, Xinjiang University, Urumqi Xinjiang 830008, China
  • Received: 2014-09-15 Revised: 2014-11-19 Online: 2015-02-10 Published: 2015-02-12

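  • Corresponding author: WANG Zhengying
  • About the authors: WANG Zhengying (王政英), born in 1984 in Jincheng, Shanxi, female, M.S. candidate, CCF member; research interests: cloud computing, green computing, data mining. YU Jiong (于炯), born in 1964 in Beijing, male, Ph.D., professor, doctoral supervisor, CCF senior member; research interests: network security, grid and distributed computing. YING Changtian (英昌甜), born in 1989 in Urumqi, Xinjiang, female, Ph.D. candidate; research interests: cloud computing, distributed storage. LU Liang (鲁亮), born in 1990 in Xiangtan, Hunan, male, Ph.D. candidate, CCF member; research interests: distributed computing, cloud computing.
  • Supported by:

    National Natural Science Foundation of China (61462079, 61262088, 61363083); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2013211A011).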

Abstract:

To address the low server utilization and complicated energy management caused by the random block placement strategy in distributed file systems, an access-feature vector was built for each data block to describe users' random access to the blocks. The K-means algorithm was applied to cluster the data blocks by these vectors, and the datanodes were then divided into multiple regions, each storing the data blocks of a different cluster. When the system load was low, the data blocks were dynamically reconfigured according to the clustering results, so that unnecessary datanodes could sleep and energy consumption was reduced. The inter-cluster distance parameter can be set flexibly, which makes the strategy suitable for scenarios with different requirements on energy consumption and resource utilization. Mathematical analysis and experimental results show that, compared with the hot-cold zoning strategy, the proposed method saves energy more efficiently, reducing energy consumption by 35% to 38%.

Key words: cloud computing, distributed file system, data clustering, dynamic reconfiguration, energy-efficient computing
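
The pipeline described in the abstract (feature vectors, K-means clustering, per-cluster datanode regions, sleeping idle nodes) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the vector contents, the region sizing, or the sleep rule, so the sketch assumes that each vector holds per-period access counts, that regions are equal-sized, and that a region may sleep when its recent access total is at or below a threshold. All function and parameter names (`kmeans`, `build_regions`, `sleep_candidates`, `threshold`) are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: List[Sequence[float]], k: int,
           iterations: int = 50, seed: int = 0) -> List[int]:
    """Plain Lloyd's K-means; returns one cluster label per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_regions(labels: List[int], vectors: List[Sequence[float]],
                  nodes: List[str], k: int) -> Dict[int, List[str]]:
    """Give each cluster a contiguous slice of datanodes, hottest cluster first.

    Assumption: regions are equal-sized; the last one takes any remainder.
    """
    load = {c: 0.0 for c in range(k)}
    for lab, vec in zip(labels, vectors):
        load[lab] += sum(vec)
    ordered = sorted(range(k), key=lambda c: load[c], reverse=True)
    size = len(nodes) // k
    regions = {}
    for rank, cluster in enumerate(ordered):
        start = rank * size
        end = len(nodes) if rank == k - 1 else start + size
        regions[cluster] = nodes[start:end]
    return regions


def sleep_candidates(regions: Dict[int, List[str]], labels: List[int],
                     recent_access: List[int], threshold: int) -> List[str]:
    """Nodes in regions whose blocks saw at most `threshold` recent accesses.

    Assumption: a simple threshold stands in for the low-load decision.
    """
    region_access = {c: 0 for c in regions}
    for lab, count in zip(labels, recent_access):
        region_access[lab] += count
    return [node for c, members in regions.items()
            if region_access[c] <= threshold for node in members]


if __name__ == "__main__":
    # Hypothetical access counts for five blocks over three periods.
    access = [[90, 80, 95], [85, 90, 88], [1, 0, 2], [0, 1, 0], [2, 1, 1]]
    block_labels = kmeans(access, k=2)
    node_regions = build_regions(block_labels, access,
                                 ["dn1", "dn2", "dn3", "dn4"], k=2)
    recent = [row[-1] for row in access]
    print(sleep_candidates(node_regions, block_labels, recent, threshold=5))
```

In this sketch the number of clusters `k` and the sleep threshold play the role of tuning knobs; the paper's inter-cluster distance parameter is not modeled here because the abstract does not define it.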

