Abstract: To address long data retrieval delay and unstable download performance in cloud storage systems, a scheduling scheme based on storage-node load information and erasure coding is proposed. First, erasure coding is used to improve the delay performance of data retrieval in cloud storage, with parallel threads downloading multiple data copies simultaneously. Second, a large amount of storage-node load information is analyzed to identify which performance indicators affect delay, and a new scheduling algorithm based on this load information is proposed. Finally, the open-source project OpenStack is used to build a real cloud computing platform, on which the algorithm is tested with real user request traces and erasure coding. Extensive experiments show that, compared with other scheduling policies, the proposed scheme lowers average delay by 15% and reduces delay volatility by 40%. The results indicate that the scheduling policy effectively improves the delay performance and stability of data retrieval on a real cloud computing platform, giving a better user experience.
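The general idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes an (n, k) erasure code in which any k of the n coded chunks are enough to rebuild the object, it uses a single hypothetical numeric load score per node, and `pick_nodes`, `fetch_chunk`, and `retrieve` are illustrative names with a simulated download in place of a real storage call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def pick_nodes(loads, m):
    """Return the m node ids with the smallest load score (hypothetical metric)."""
    return sorted(loads, key=loads.get)[:m]


def fetch_chunk(node, load):
    """Stand-in for a real chunk download; the simulated delay grows with load."""
    time.sleep(0.01 * load)
    return node, f"chunk-from-{node}"


def retrieve(loads, k, m):
    """Request chunks from the m least-loaded nodes in parallel and
    return as soon as any k of them have arrived (k <= m <= n)."""
    if not k <= m <= len(loads):
        raise ValueError("need k <= m <= number of nodes")
    targets = pick_nodes(loads, m)
    chunks = {}
    pool = ThreadPoolExecutor(max_workers=m)
    try:
        futures = [pool.submit(fetch_chunk, n, loads[n]) for n in targets]
        for fut in as_completed(futures):
            node, data = fut.result()
            chunks[node] = data
            if len(chunks) == k:
                break  # enough chunks to decode; ignore the stragglers
    finally:
        pool.shutdown(wait=False)  # do not block on the slower downloads
    return chunks


if __name__ == "__main__":
    # Assumed load scores for five storage nodes holding one chunk each.
    loads = {"a": 5, "b": 1, "c": 3, "d": 2, "e": 4}
    print(retrieve(loads, k=2, m=3))
```

Requesting more chunks than strictly needed (m > k) trades extra node load for a lower chance of waiting on a slow node, while choosing targets by load keeps requests away from busy nodes.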