Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (11): 3540-3550. DOI: 10.11772/j.issn.1001-9081.2022111732

• Network and communications •

Joint optimization method for SWIPT edge network based on deep reinforcement learning

Zhe WANG1,2,3, Qiming WANG2, Taoshen LI4, Lina GE1,3,5

  1. School of Artificial Intelligence, Guangxi Minzu University, Nanning Guangxi 530006, China
  2. College of Electronic Information, Guangxi Minzu University, Nanning Guangxi 530006, China
  3. Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis (Guangxi Minzu University), Nanning Guangxi 530006, China
  4. School of Computer, Electronics and Information, Guangxi University, Nanning Guangxi 530004, China
  5. Key Laboratory of Network Communication Engineering, Guangxi Minzu University, Nanning Guangxi 530006, China
  • Received: 2022-11-22 Revised: 2023-04-30 Accepted: 2023-05-12 Online: 2023-06-02 Published: 2023-11-10
  • Contact: Qiming WANG
  • About the authors: WANG Zhe, born in 1991 in Nanyang, Henan, Ph.D., associate professor, CCF member. His research interests include computer networks, simultaneous information and power transfer, and federated machine learning.
    WANG Qiming, born in 1997 in Suqian, Jiangsu, M.S. candidate. His research interests include computer networks, simultaneous information and power transfer, and machine learning. E-mail: wqm082199@163.com
    LI Taoshen, born in 1957 in Nanning, Guangxi, Ph.D., professor, CCF distinguished member. His research interests include mobile wireless networks, wireless energy transmission, the Internet of Things, and smart cities.
    GE Lina, born in 1969 in Huanjiang, Guangxi, Ph.D., professor, CCF senior member. Her research interests include network and information security, mobile computing, and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61862007); Natural Science Foundation of Guangxi Province (2020GXNSFBA297103); Scientific Research Start Project of Talents Introduced by Guangxi Minzu University (2019KJQD17)

Abstract:

Edge Computing (EC) and Simultaneous Wireless Information and Power Transfer (SWIPT) technologies can improve the performance of traditional networks, but they also make system decision-making more difficult and complex. System decisions designed with optimization methods often have high computational complexity and therefore struggle to meet the real-time requirements of the system. To address this, for a Wireless Sensor Network (WSN) assisted by EC and SWIPT, a mathematical model for optimizing system energy efficiency was established by jointly considering beamforming, computation offloading and power control in the network. Then, to handle the non-convexity and parameter coupling of this model, a joint optimization method based on deep reinforcement learning was proposed by designing the information exchange process of the system. This method does not need to build an environment model and uses a reward function instead of a Critic network to evaluate actions, which reduces the difficulty of decision-making and improves real-time performance. Finally, based on the joint optimization method, an Improved Deep Deterministic Policy Gradient (IDDPG) algorithm was designed. Simulation comparisons with a variety of optimization algorithms and machine learning algorithms verified the advantages of the joint optimization method in reducing computational complexity and improving the real-time performance of decision-making.

Key words: Wireless Sensor Network (WSN), deep reinforcement learning, Simultaneous Wireless Information and Power Transfer (SWIPT), Edge Computing (EC), joint optimization
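
The abstract states that the method evaluates actions with a reward function rather than a Critic network. The idea can be illustrated with a minimal sketch, which is not the paper's IDDPG algorithm. It assumes the reward is a known, differentiable function of the action, so the actor can follow the reward gradient directly. The single-link energy-efficiency reward, the sigmoid actor, and all names and constants below (`P_MAX`, `P_CIRCUIT`, `train_actor`, and so on) are hypothetical choices made for illustration only.

```python
import math
import random

P_MAX = 2.0      # maximum transmit power (hypothetical value)
P_CIRCUIT = 0.1  # fixed circuit power (hypothetical value)


def reward(gain, power):
    """Toy energy efficiency: achievable rate divided by total power."""
    return math.log2(1.0 + gain * power) / (power + P_CIRCUIT)


def reward_grad(gain, power):
    """Analytic derivative of reward() with respect to power."""
    total = power + P_CIRCUIT
    rate = math.log2(1.0 + gain * power)
    d_rate = gain / ((1.0 + gain * power) * math.log(2.0))
    return (d_rate * total - rate) / (total * total)


def actor(params, gain):
    """Deterministic policy: power = P_MAX * sigmoid(w * gain + b)."""
    w, b = params
    return P_MAX / (1.0 + math.exp(-(w * gain + b)))


def mean_reward(params, gains):
    """Average reward of the policy over a set of channel gains."""
    return sum(reward(g, actor(params, g)) for g in gains) / len(gains)


def train_actor(gains, steps=1500, lr=0.01):
    """Full-batch gradient ascent on the reward; no Critic is trained."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for g in gains:
            p = actor((w, b), g)
            # Chain rule: dr/dparam = dr/dp * dp/dz * dz/dparam,
            # where z = w * gain + b and dp/dz = p * (1 - p / P_MAX).
            common = reward_grad(g, p) * p * (1.0 - p / P_MAX)
            grad_w += common * g
            grad_b += common
        w += lr * grad_w / len(gains)
        b += lr * grad_b / len(gains)
    return w, b


if __name__ == "__main__":
    rng = random.Random(0)
    gains = [rng.uniform(0.5, 4.0) for _ in range(64)]
    print("before:", mean_reward((0.0, 0.0), gains))
    print("after: ", mean_reward(train_actor(gains), gains))
```

In this toy setting the reward gradient plays the role that the Critic's action-value gradient plays in standard DDPG, so only one set of parameters is trained. Whether and how that carries over to the paper's coupled beamforming, offloading and power-control problem is not something the abstract specifies.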