Journal of Computer Applications, 2023, Vol. 43, Issue 5: 1543-1550. DOI: 10.11772/j.issn.1001-9081.2022050724

• Advanced computing •

Edge computing and service offloading algorithm based on improved deep reinforcement learning

Tengfei CAO, Yanliang LIU, Xiaoying WANG

  1. Department of Computer Technology and Applications, Qinghai University, Xining, Qinghai 810016, China
  • Received: 2022-05-19; Revised: 2022-06-25; Accepted: 2022-06-27; Online: 2022-06-30; Published: 2023-05-10
  • Contact: Tengfei CAO
  • About authors: CAO Tengfei, born 1987 in Zhongxiang, Hubei; Ph.D., associate professor, senior member of CCF. His research interests include edge computing in B5G networks. E-mail: caotf@qhu.edu.cn
    LIU Yanliang, born 2002 in Hengyang, Hunan; M.S. candidate. His research interests include edge computing and reinforcement learning.
    WANG Xiaoying, born 1982 in Da'an, Jilin; Ph.D., professor and doctoral supervisor. Her research interests include computer network architecture and mobile computing.
  • Supported by:
    National Natural Science Foundation of China (62101299); Natural Science Foundation of Qinghai Province (2020-ZJ-943Q)


Abstract:

To address the limited computing resources and storage space of edge nodes in Edge Computing (EC) networks, an Edge Computing and Service Offloading (ECSO) algorithm based on improved Deep Reinforcement Learning (DRL) was proposed to reduce node processing latency and improve service performance. Specifically, the service offloading problem of edge nodes was formulated as a resource-constrained Markov Decision Process (MDP). Because the request state transition probabilities of edge nodes are difficult to predict accurately, a DRL algorithm was used to solve the MDP. Since the state-action space of an edge node performing caching services is prohibitively large, new action behaviors were defined to replace the original actions, and the optimal action set was obtained by the proposed action selection algorithm; this improved the computation of action behavior rewards, greatly reduced the size of the action space, and improved the training efficiency and reward of the algorithm. Simulation results show that, compared with the original Deep Q-Network (DQN) algorithm, the Proximal Policy Optimization (PPO) algorithm, and the traditional Most Popular (MP) algorithm, the proposed ECSO algorithm increases the total reward by 7.0%, 12.7%, and 65.6% respectively, and reduces the service offloading latency of edge nodes by 13.0%, 18.8%, and 66.4% respectively, verifying the effectiveness of ECSO in improving the offloading performance of edge computing services.

Key words: Edge Computing (EC), caching service, service offloading, Deep Reinforcement Learning (DRL), action behavior reward
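As a rough illustration of the action-space reduction described in the abstract: rather than treating every subset of cacheable services as a single action (exponentially many actions), each service can be scored individually and a feasible action set built greedily under the node's storage constraint. The sketch below is hypothetical; the function name, the score-per-storage ranking rule, and the greedy selection are illustrative assumptions, not the paper's actual ECSO procedure.

```python
# Hypothetical sketch of reducing a subset-valued caching action (2^n
# possibilities) to per-service scores plus a greedy feasibility filter.

def select_action_set(q_values, sizes, capacity):
    """Greedily pick the highest-value services that fit in `capacity`.

    q_values: per-service scores (e.g. from a DQN head), one per service
    sizes:    positive storage cost of caching each service
    capacity: edge node's remaining storage budget
    """
    # Rank services by score per unit of storage, best first.
    order = sorted(range(len(q_values)),
                   key=lambda i: q_values[i] / sizes[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + sizes[i] <= capacity:  # keep the action set feasible
            chosen.append(i)
            used += sizes[i]
    return sorted(chosen)

if __name__ == "__main__":
    q = [8.0, 3.0, 5.0, 1.0]   # per-service Q-values (illustrative)
    s = [4, 2, 3, 1]           # storage cost of each service
    print(select_action_set(q, s, capacity=6))  # prints [0, 1]
```

With n services this evaluates n candidate actions instead of 2^n subsets, which is the kind of shrinkage that makes DQN-style training over caching decisions tractable.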

