《计算机应用》唯一官方网站 ›› 2023, Vol. 43 ›› Issue (3): 767-775.DOI: 10.11772/j.issn.1001-9081.2022010074
所属专题: 数据科学与技术
收稿日期:
2022-01-20
修回日期:
2022-03-19
接受日期:
2022-03-21
发布日期:
2022-04-21
出版日期:
2023-03-10
通讯作者:
唐聃
作者简介:
龙运波(1997—),男,贵州黔东南人,硕士研究生,主要研究方向:分布式存储、编码理论基金资助:
Yunbo LONG1,2, Dan TANG3()
Received:
2022-01-20
Revised:
2022-03-19
Accepted:
2022-03-21
Online:
2022-04-21
Published:
2023-03-10
Contact:
Dan TANG
About author:
LONG Yunbo, born in 1997, M. S. candidate. His research interests include distributed storage, coding theory.
Supported by:
摘要:
针对分布式存储中热数据访问性能低下的问题,提出一种基于局部修复码(LRC)的负载均衡方法,采用编码的方式规避节点的集中式访问,并提高热数据的访问效率。首先,利用平衡不完全区组设计(BIBD)构造一类适用于小规模存储系统的特殊LRC,从而为编码数据提供多种访问方式;然后,分别基于里所(RS)码和随机阵列码将LRC推广到更大规模,并使它满足存储系统一定的容错需求;最后,提出一种热数据访问算法以降低热数据的访问压力,并结合合理的数据布局方案实现存储系统在高频访问场景下的负载均衡。理论分析和实验结果表明,所提方法能以极小的代价实现负载均衡,明显优于传统方法中利用多副本及最大距离可分(MDS)码实现的负载均衡方法,尤其是解决了因冷热数据访问不均带来的负载失衡问题,可以有效提高热数据存储系统的访问效率。
中图分类号:
龙运波, 唐聃. 分布式存储中基于局部修复码的负载均衡方法[J]. 计算机应用, 2023, 43(3): 767-775.
Yunbo LONG, Dan TANG. Load balancing method based on local repair code in distributed storage[J]. Journal of Computer Applications, 2023, 43(3): 767-775.
修复集1 | 修复集2 |
---|---|
l1=m1⊕m4⊕m11 | l6=m2⊕m10⊕m12 |
l2=m6⊕m13⊕m14 | l7=m1⊕m3⊕m6 |
l3=m2⊕m5⊕m7 | l8=m7⊕m8⊕m11 |
l4=m3⊕m9⊕m10 | l9=m4⊕m13⊕m15 |
l5=m8⊕m12⊕m15 | l10=m5⊕m9⊕m14 |
表2 例4中局部校验块的生成式
Tab. 2 Generation formulas of local parity blocks in example 4
修复集1 | 修复集2 |
---|---|
l1=m1⊕m4⊕m11 | l6=m2⊕m10⊕m12 |
l2=m6⊕m13⊕m14 | l7=m1⊕m3⊕m6 |
l3=m2⊕m5⊕m7 | l8=m7⊕m8⊕m11 |
l4=m3⊕m9⊕m10 | l9=m4⊕m13⊕m15 |
l5=m8⊕m12⊕m15 | l10=m5⊕m9⊕m14 |
修复集1 | 修复集2 |
---|---|
表3 例5中局部校验块的生成式
Tab. 3 Generation formulas of local parity blocks in example 5
修复集1 | 修复集2 |
---|---|
1 | REINSEL D, GANTZ J, RYDNING J. Data age 2025: the digitization of the world from edge to core [EB/OL]. [2021-10-17]. . |
2 | DEAN J, BARROSO L A. The tail at scale[J]. Communications of the ACM, 2013, 56(2): 74-80. 10.1145/2408776.2408794 |
3 | 刘世贤. 数据分区架构下负载均衡技术的研究与应用[D]. 杭州:浙江大学, 2010:27-42. |
LIU S X. Research and application on load balance of data based application partitioning[D]. Hangzhou: Zhejiang University, 2010:27-42. | |
4 | BYERS J, CONSIDINE J, MITZENMACHER M. Simple load balancing for distributed Hash tables[C]// Proceedings of the 2003 International Workshop on Peer-to-Peer Systems, LNCS 2735. Berlin: Springer, 2003: 80-87. |
5 | 李振宇,谢高岗. 基于DHT的P2P系统的负载均衡算法[J]. 计算机研究与发展, 2006, 43(9): 1579-1585. 10.1360/crad20060914 |
LI Z Y, XIE G G. A load balancing algorithm for DHT-based P2P systems[J]. Journal of Computer Research and Development, 2006, 43(9): 1579-1585. 10.1360/crad20060914 | |
6 | 周健,洪佩琳,李津生. DHT网络中一种基于树型结构的负载均衡方案[J]. 小型微型计算机系统, 2006, 27(11): 2042-2046. 10.3969/j.issn.1000-1220.2006.11.012 |
ZHOU J, HONG P L, LI J S. Tree-based load balancing method in DHT networks[J]. Mini-Micro Systems, 2006, 27(11): 2042-2046. 10.3969/j.issn.1000-1220.2006.11.012 | |
7 | DATTA A, SCHMIDT R, ABERER K. Query-load balancing in structured overlays[C]// Proceedings of the 7th IEEE International Symposium on Cluster Computing and the Grid. Piscataway: IEEE, 2007:453-460. 10.1109/ccgrid.2007.90 |
8 | 孟宪福,陈晓令. 结构化P2P网络热点负载动态迁移策略[J]. 电子学报, 2011, 39(10): 2407-2411. 10.1631/jzus.C0910717 |
MENG X F, CHEN X L. Strategy for hotspot load dynamic migration in structured P2P networks[J]. Acta Electronica Sinica, 2011, 39(10): 2407-2411. 10.1631/jzus.C0910717 | |
9 | 陈晨. 结构化对等网络中访问热度引起的负载均衡技术研究[D]. 北京:北京交通大学, 2008:14-22. |
CHEN C. Research on load balancing caused by query hotspot based on structured peer-to-peer networks[D]. Beijing: Beijing Jiaotong University, 2008:14-22. | |
10 | 苏跃明,李晨,田丽华. 基于分片一致性哈希负载均衡策略与应用 [J]. 计算机技术与发展, 2017, 27(11): 62-65, 70. 10.3969/j.issn.1673-629X.2017.11.013 |
SU Y M, LI C, TIAN L H. A consistent Hashing load balancing strategy based on fragmentation and its application[J]. Computer Technology and Development, 2017, 27(11): 62-65, 70. 10.3969/j.issn.1673-629X.2017.11.013 | |
11 | HUANG L B, PAWAR S, ZHANG H, et al. Codes can reduce queueing delay in data centers[C]// Proceedings of the 2012 IEEE International Symposium on Information Theory. Piscataway: IEEE, 2012:2766-2770. 10.1109/isit.2012.6284026 |
12 | JOSHI G, LIU Y P, SOLJANIN E. Coding for fast content download[C]// Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing. Piscataway: IEEE, 2012:326-333. 10.1109/allerton.2012.6483236 |
13 | FERNER U J, MÉDARD M, SOLJANIN E. Toward sustainable networking: storage area networks with network coding[C]// Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing. Piscataway: IEEE, 2012:517-524. 10.1109/allerton.2012.6483262 |
14 | JOSHI G, LIU Y P, SOLJANIN E. On the delay-storage trade-off in content download from coded distributed storage systems[J]. IEEE Journal on Selected Areas in Communications, 2014, 32(5): 989-997. 10.1109/jsac.2014.140518 |
15 | JOSHI G, SOLJANIN E, WORNELL G. Efficient redundancy techniques for latency reduction in cloud systems[J]. ACM Transactions on Modeling and Performance Evaluation Computing Systems, 2017, 2(2): No.12. 10.1145/3055281 |
16 | SHUAI Q Q, LI V O K. Latency analysis of flexible redundant scheme in MDS-coded distributed storage systems[C]// Proceedings of the 2017 IEEE Global Telecommunications Conference. Piscataway: IEEE, 2017:1-6. 10.1109/glocom.2017.8254038 |
17 | SHUAI Q Q, LI V O K, LU Z Y. Which achieves lower latency with redundant requests, replication or coding?[C]// Proceedings of the 2017 IEEE Global Telecommunications Conference. Piscataway: IEEE, 2017:1-6. 10.1109/glocom.2017.8254986 |
18 | HU Y C, NIU D. Reducing access latency in erasure coded cloud storage with local block migration[C]// Proceedings of the 35th IEEE Annual IEEE International Conference on Computer Communications. Piscataway: IEEE, 2016:1-9. 10.1109/infocom.2016.7524628 |
19 | DING C W, WANG L, SUN L S. Load balancing technology based on erasure code in distributed storage system[C]// Proceedings of the 31st Chinese Control Conference. Piscataway: IEEE, 2012:5558-5563. |
20 | DIMAKIS A G, GODFREY P B, WU Y N, et al. Network coding for distributed storage systems[J]. IEEE Transactions on Information Theory, 2010, 56(9): 4539-4551. 10.1109/tit.2010.2054295 |
21 | HUANG C, CHEN M H, LI J. Pyramid codes: flexible schemes to trade space for access efficiency in reliable data storage systems[C]// Proceedings of the 6th IEEE International Symposium on Network Computing and Applications. Piscataway: IEEE, 2007:79-86. 10.1109/nca.2007.37 |
22 | JIN L F, MA L M, XING C P. Construction of optimal locally repairable codes via automorphism groups of rational function fields[J]. IEEE Transactions on Information Theory, 2020, 66(1): 210-221. 10.1109/tit.2019.2946637 |
23 | LI X D, MA L M, XING C P. Optimal locally repairable codes via elliptic curves[J]. IEEE Transactions on Information Theory, 2019, 65(1): 108-117. 10.1109/tit.2018.2844216 |
24 | CAI H, MIAO Y, SCHWARTZ M, et al. On optimal locally repairable codes with super-linear length[J]. IEEE Transactions on Information Theory, 2020, 66(8): 4853-4868. 10.1109/tit.2020.2977647 |
25 | FANG W J, FU F W. Optimal cyclic (r, δ) locally repairable codes with unbounded length[J]. Finite Fields and Their Applications, 2020, 63: No.101650. 10.1016/j.ffa.2020.101650 |
26 | CHEN B, XIA S T, HAO J, et al. Constructions of optimal cyclic (r, δ) locally repairable codes[J]. IEEE Transactions on Information Theory, 2018, 64(4): 2499-2511. 10.1109/tit.2017.2761120 |
27 | CHEN B, FANG W J, XIA S T, et al. Constructions of optimal (r, δ) locally repairable codes via constacyclic codes[J]. IEEE Transactions on Communications, 2019, 67(8): 5253-5263. 10.1109/tcomm.2019.2916085 |
28 | WANG A Y, ZHANG Z F. Repair locality with multiple erasure tolerance[J]. IEEE Transactions on Information Theory, 2014, 60(11): 6979-6987. 10.1109/tit.2014.2351404 |
29 | RAWAT A S, PAPAILIOPOULOS D S, DIMAKIS A G, et al. Locality and availability in distributed storage[J]. IEEE Transactions on Information Theory, 2016, 62(8): 4481-4493. 10.1109/tit.2016.2524510 |
30 | SU Y S. Design of membership matrices for (r, t)-availability in distributed storage[C]// Proceedings of the 2016 IEEE International Symposium on Information Theory. Piscataway: IEEE, 2016:998-1002. 10.1109/isit.2016.7541449 |
31 | HAO J, XIA S T. Constructions of optimal binary locally repairable codes with multiple repair groups[J]. IEEE Communications Letters, 2016, 20(6): 1060-1063. 10.1109/lcomm.2016.2539160 |
32 | BALAJI S B, PRASANTH K P, KUMAR P V. Binary codes with locality for multiple erasures having short block length[C]// Proceedings of the 2016 IEEE International Symposium on Information Theory. Piscataway: IEEE, 2016:655-659. 10.1109/isit.2016.7541380 |
33 | ZHANG Y, KAN H B. Locally repairable codes with strict availability from linear functions[J]. Science China Information Sciences, 2018, 61(10): No.109304. 10.1007/s11432-018-9444-4 |
34 | ZHANG Y, KAN H B. Locally repairable codes from combinatorial designs[J]. Science China Information Sciences, 2020, 63(2): No.122304. 10.1007/s11432-019-2649-5 |
35 | TAMO I, BARG A, FROLOV A. Bounds on the parameters of locally recoverable codes[J]. IEEE Transactions on Information Theory, 2016, 62(6): 3070-3083. 10.1109/tit.2016.2518663 |
36 | BALAJI S B, KUMAR P V. Bounds on the rate and minimum distance of codes with availability[C]// Proceedings of the 2017 IEEE International Symposium on Information Theory. Piscataway: IEEE, 2017:3155-3159. 10.1109/isit.2017.8007111 |
37 | KADHE S, SOLJANIN E, SPRINTSON A. When do the availability codes make the stored data more available?[C]// Proceedings of the 53rd Annual Allerton Conference on Communication, Control, and Computing. Piscataway: IEEE, 2015:956-963. 10.1109/allerton.2015.7447111 |
38 | AKTAŞ M F, NAJM E, SOLJANIN E. Simplex queues for hot-data download[C]// Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems. New York: ACM, 2017:35-36. 10.1145/3078505.3078553 |
39 | AL-ABBASI A O, AGGARWAL V. TTLCache: taming latency in erasure-coded storage through TTL caching[J]. IEEE Transactions on Network and Service Management, 2020, 17(3): 1582-1596. 10.1109/tnsm.2020.2998175 |
40 | AKTAŞ M, JOSHI G, KADHE S, et al. Service rate region: a new aspect of coded distributed system design[J]. IEEE Transactions on Information Theory, 2021, 67(12): 7940-7963. 10.1109/tit.2021.3117695 |
41 | KADHE S, SOLJANIN E, SPRINTSON A. Analyzing the download time of availability codes[C]// Proceedings of the 2015 IEEE International Symposium on Information Theory. Piscataway: IEEE, 2015:1467-1471. 10.1109/isit.2015.7282699 |
42 | AKTAŞ M F, KADHE S, SOLJANIN E, et al. Download time analysis for distributed storage codes with locality and availability[J]. IEEE Transactions on Communications, 2021, 69(6): 3898-3910. 10.1109/tcomm.2021.3067054 |
43 | 滕鹏国,张景中,陈亮,等. 随机阵列码:一种高容灾易扩展的RAID存储容灾方法[J]. 工程科学与技术, 2017, 49(3): 110-116. 10.15961/j.jsuese.201600492 |
TENG P G, ZHANG J Z, CHEN L, et al. Random RAID: a RAID storage scheme with high fault-tolerance and flexibility[J]. Advanced Engineering Sciences, 2017, 49(3): 110-116. 10.15961/j.jsuese.201600492 | |
44 | BARAT J. Combinatorics of symmetric designs[J]. Acta Scientiarum Mathematicarum, 2008, 74(3/4): 945-946. |
[1] | 倪瑞轩, 蔡淼, 叶保留. 内存高效的持久性分布式文件系统客户端缓存DFS-Cache[J]. 《计算机应用》唯一官方网站, 2024, 44(4): 1172-1180. |
[2] | 张明, 付乐, 王海峰. 面向边缘计算的并发数据流接转控制模型[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3876-3883. |
[3] | 孙亚男, 吴杰宏, 石峻岭, 高利军. 改进自组织映射的多无人机协同任务分配方法[J]. 《计算机应用》唯一官方网站, 2023, 43(5): 1551-1556. |
[4] | 杨力, 陈建廷, 向阳. 基于HBase的工业时序大数据分布式存储性能优化策略[J]. 《计算机应用》唯一官方网站, 2023, 43(3): 759-766. |
[5] | 周娜, 成茗, 贾孟霖, 杨杨. 基于缩略图加密和分布式存储的医学图像隐私保护[J]. 《计算机应用》唯一官方网站, 2023, 43(10): 3149-3155. |
[6] | 高旗, 吕娜, 缪竞成. 基于负载均衡的无线虚拟网络映射算法[J]. 《计算机应用》唯一官方网站, 2022, 42(10): 3148-3153. |
[7] | 卿欣艺, 陈玉玲, 周正强, 涂园超, 李涛. 基于中国剩余定理的区块链存储扩展模型[J]. 计算机应用, 2021, 41(7): 1977-1982. |
[8] | 许红亮, 杨桂芹, 蒋占军. 基于软件定义网络的数据中心自适应多路径负载均衡算法[J]. 计算机应用, 2021, 41(4): 1160-1164. |
[9] | 杨翎, 姜春茂. 基于三支决策的虚拟机节能迁移策略[J]. 计算机应用, 2021, 41(4): 990-998. |
[10] | 张茂, 李瑞虎, 郑尤良, 付强. 基于sunflower的局部修复码构造[J]. 计算机应用, 2021, 41(3): 763-767. |
[11] | 李翠, 陈庆奎. 基于DPDK并行通信的动态监控模型[J]. 《计算机应用》唯一官方网站, 2020, 40(2): 335-341. |
[12] | 张航, 刘善政, 唐聃, 蔡红亮. 分布式存储系统中的低修复成本纠删码[J]. 计算机应用, 2020, 40(10): 2942-2950. |
[13] | 张国潮, 王瑞锦. 基于门限秘密共享的区块链分片存储模型[J]. 计算机应用, 2019, 39(9): 2617-2622. |
[14] | 李祝红, 赵灿明, 闫龙, 张信明. 智能电网中电力线通信网络负载均衡的机会路由协议[J]. 计算机应用, 2019, 39(3): 812-816. |
[15] | 王泽武, 孙磊, 郭松辉, 孙瑞辰. 密码云中基于熵权评价的虚拟密码机调度方法[J]. 计算机应用, 2018, 38(5): 1353-1359. |
阅读次数 | ||||||
全文 |
|
|||||
摘要 |
|
|||||