Journal of Computer Applications, 2023, Vol. 43, Issue 5: 1543-1550. DOI: 10.11772/j.issn.1001-9081.2022050724

• Advanced computing •

Edge computing and service offloading algorithm based on improved deep reinforcement learning

Tengfei CAO, Yanliang LIU, Xiaoying WANG

  1. Department of Computer Technology and Applications, Qinghai University, Xining, Qinghai 810016, China
  • Received: 2022-05-19; Revised: 2022-06-25; Accepted: 2022-06-27; Online: 2022-06-30; Published: 2023-05-10
  • Contact: Tengfei CAO
  • About authors: CAO Tengfei, born 1987 in Zhongxiang, Hubei; Ph.D., associate professor, senior member of CCF. His research interests include edge computing in B5G networks. E-mail: caotf@qhu.edu.cn
    LIU Yanliang, born 2002 in Hengyang, Hunan; M.S. candidate. His research interests include edge computing and reinforcement learning.
    WANG Xiaoying, born 1982 in Da'an, Jilin; Ph.D., professor and doctoral supervisor. Her research interests include computer network architecture and mobile computing.
  • Supported by:
    National Natural Science Foundation of China (62101299); Natural Science Foundation of Qinghai Province (2020-ZJ-943Q)


Abstract:

To address the limited computing resources and storage space of edge nodes in Edge Computing (EC) networks, an Edge Computing and Service Offloading (ECSO) algorithm based on improved Deep Reinforcement Learning (DRL) was proposed to reduce node processing latency and improve service performance. Specifically, the service offloading problem of edge nodes was formulated as a resource-constrained Markov Decision Process (MDP). Because the request state transition probabilities of edge nodes are difficult to predict accurately, a DRL algorithm was used to solve the MDP. Since the state-action space of an edge node performing caching services is prohibitively large, new action behaviors were defined to replace the original actions, and the optimal action set was obtained by the proposed action selection algorithm; this improved the computation of action behavior rewards, greatly reduced the size of the action space, and improved the training efficiency and reward of the algorithm. Simulation results show that, compared with the original Deep Q-Network (DQN) algorithm, the Proximal Policy Optimization (PPO) algorithm, and the traditional Most Popular (MP) algorithm, the proposed ECSO algorithm increases the total reward by 7.0%, 12.7%, and 65.6% respectively, and reduces the service offloading latency of edge nodes by 13.0%, 18.8%, and 66.4% respectively, verifying the effectiveness of ECSO in improving the offloading performance of edge computing services.

Key words: Edge Computing (EC), caching service, service offloading, Deep Reinforcement Learning (DRL), action behavior reward
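As a rough illustration of the action-space reduction described in the abstract: rather than treating every subset of cacheable services as a single action (exponentially many actions), each service can be scored individually and a feasible action set built greedily under the node's storage constraint. The sketch below is hypothetical; the function name, the score-per-storage ranking rule, and the greedy selection are illustrative assumptions, not the paper's actual ECSO procedure.

```python
# Hypothetical sketch of reducing a subset-valued caching action (2^n
# possibilities) to per-service scores plus a greedy feasibility filter.

def select_action_set(q_values, sizes, capacity):
    """Greedily pick the highest-value services that fit in `capacity`.

    q_values: per-service scores (e.g. from a DQN head), one per service
    sizes:    positive storage cost of caching each service
    capacity: edge node's remaining storage budget
    """
    # Rank services by score per unit of storage, best first.
    order = sorted(range(len(q_values)),
                   key=lambda i: q_values[i] / sizes[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + sizes[i] <= capacity:  # keep the action set feasible
            chosen.append(i)
            used += sizes[i]
    return sorted(chosen)

if __name__ == "__main__":
    q = [8.0, 3.0, 5.0, 1.0]   # per-service Q-values (illustrative)
    s = [4, 2, 3, 1]           # storage cost of each service
    print(select_action_set(q, s, capacity=6))  # prints [0, 1]
```

With n services this evaluates n candidate actions instead of 2^n subsets, which is the kind of shrinkage that makes DQN-style training over caching decisions tractable.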

