Journal of Computer Applications (official website) ›› 2024, Vol. 44 ›› Issue (5): 1562-1569. DOI: 10.11772/j.issn.1001-9081.2023050612

• Network and communications •

DDPG-based resource allocation in D2D communication-enhanced cellular networks

TANG Rui1,2, PANG Chuanlin2, ZHANG Ruizhi3, LIU Chuan1, YUE Shibo2

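  1. School of Electronic Information Engineering, China West Normal University, Nanchong, Sichuan 637002
    2. College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu 610059
    3. School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731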
  • Received: 2023-05-22 Revised: 2023-08-02 Accepted: 2023-08-08 Online: 2023-08-10 Published: 2024-05-10
  • Corresponding author: TANG Rui
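  • About the authors: PANG Chuanlin (born 1997), male, from Nanchong, Sichuan, M.S. candidate. Main research interest: deep reinforcement learning.
    ZHANG Ruizhi (born 1999), male, from Jining, Shandong, Ph.D. candidate. Main research interest: generalized optimization algorithms.
    LIU Chuan (born 1991), male, from Nanchong, Sichuan, lecturer, M.S. Main research interest: wireless communication.
    YUE Shibo (born 2000), male, from Bazhong, Sichuan, M.S. candidate, CCF member. Main research interest: resource allocation in unmanned aerial vehicle communication.
    First contact: TANG Rui (born 1988), male, from Lanzhou, Gansu, associate professor, Ph.D., CCF member. Main research interest: wireless communication.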
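  • Supported by:
    National Natural Science Foundation of China (62301450); Natural Science Foundation of the Sichuan Provincial Department of Science and Technology (24NSFSC5070); Regional Innovation Cooperation Project of the Sichuan Provincial Department of Science and Technology (2022YFQ0017); Fundamental Research Funds of Chengdu University of Technology (10912-KYQD2019_08164)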

DDPG-based resource allocation in D2D communication-empowered cellular network

Rui TANG1,2, Chuanlin PANG2, Ruizhi ZHANG3, Chuan LIU1, Shibo YUE2

  1. School of Electronic Information Engineering, China West Normal University, Nanchong, Sichuan 637002, China
    2. College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu, Sichuan 610059, China
    3. School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
  • Received: 2023-05-22 Revised: 2023-08-02 Accepted: 2023-08-08 Online: 2023-08-10 Published: 2024-05-10
  • Contact: Rui TANG
  • About the authors: PANG Chuanlin, born in 1997, M.S. candidate. His research interests include deep reinforcement learning.
    ZHANG Ruizhi, born in 1999, Ph.D. candidate. His research interests include generalized optimization algorithms.
    LIU Chuan, born in 1991, M.S., lecturer. His research interests include wireless communication.
    YUE Shibo, born in 2000, M.S. candidate. His research interests include resource allocation in unmanned aerial vehicle communication.
  • Supported by:
    National Natural Science Foundation of China (62301450); Sichuan Provincial Natural Science Foundation (24NSFSC5070); Sichuan Provincial Regional Innovation Cooperation Project (2022YFQ0017); Fundamental Research Funds of Chengdu University of Technology (10912-KYQD2019_08164)

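Abstract (translated from the Chinese):

To address the co-channel interference in cellular networks enhanced by Device-to-Device (D2D) communication, channel allocation and power control are jointly adjusted to maximize the sum rate of the D2D links while satisfying the power constraints and the Quality-of-Service (QoS) requirements of the cellular links. To solve the resulting mixed-integer non-convex programming problem efficiently, the original problem is transformed into a Markov decision process, and a mechanism based on the Deep Deterministic Policy Gradient (DDPG) algorithm is proposed. Offline training directly builds a mapping from channel state information to the optimal resource allocation policy without solving any optimization problem, so the mechanism can be deployed online. Simulation results show that, compared with the exhaustive search mechanism, the proposed mechanism reduces the computation time by 4 orders of magnitude (99.51%) while losing only 9.726% of the performance.

Key words: Device-to-Device communication, resource allocation, Markov decision process, deep reinforcement learning, deep deterministic policy gradient algorithm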

Abstract:

To address co-channel interference in Device-to-Device (D2D) communication-empowered cellular networks, channel allocation and power control were jointly optimized to maximize the sum rate of D2D links while satisfying the power constraints and the Quality-of-Service (QoS) requirements of cellular links. To efficiently solve the resulting mixed-integer non-convex programming problem, the original problem was transformed into a Markov decision process, and a mechanism based on the Deep Deterministic Policy Gradient (DDPG) algorithm was proposed. Through offline training, a mapping from channel state information to the optimal resource allocation policy was built directly, without solving any optimization problem, so the mechanism could be deployed online. Simulation results show that, compared with the exhaustive search-based mechanism, the proposed mechanism reduces the computation time by 4 orders of magnitude (99.51%) at the cost of only a 9.726% performance loss.
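
The objective and constraints summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `decode_action` and `evaluate`, the uplink-reuse interference model, the one-cellular-user-per-channel assumption, and the way a continuous actor output in [0, 1] is mapped to a channel index and a transmit power are all assumptions made here for illustration.

```python
import numpy as np

def decode_action(action, num_channels, p_max):
    """Map a continuous action in [0, 1]^(K x 2) to (channel index, power).

    Hypothetical decoding: column 0 selects the reused channel, column 1
    scales the transmit power. The paper's actual mapping may differ.
    """
    action = np.clip(np.asarray(action, dtype=float), 0.0, 1.0)
    channels = np.minimum((action[:, 0] * num_channels).astype(int),
                          num_channels - 1)
    powers = action[:, 1] * p_max
    return channels, powers

def evaluate(channels, p_d2d, p_cell, g_dd, g_cd, g_db, g_cb, noise, sinr_min):
    """Return (D2D sum rate in bit/s/Hz, whether cellular QoS is satisfied).

    Assumed model: D2D pair k reuses the channel of cellular user channels[k].
    g_dd[k, j]: gain from D2D transmitter j to D2D receiver k
    g_cd[m, k]: gain from cellular user m to D2D receiver k
    g_db[k]:    gain from D2D transmitter k to the base station
    g_cb[m]:    gain from cellular user m to the base station
    """
    channels = np.asarray(channels)
    p_d2d = np.asarray(p_d2d, dtype=float)
    p_cell = np.asarray(p_cell, dtype=float)
    g_dd, g_cd = np.asarray(g_dd, dtype=float), np.asarray(g_cd, dtype=float)
    g_db, g_cb = np.asarray(g_db, dtype=float), np.asarray(g_cb, dtype=float)
    k_idx = np.arange(len(channels))

    # Co-channel interference among D2D pairs sharing the same channel.
    same = channels[:, None] == channels[None, :]
    np.fill_diagonal(same, False)
    d2d_interf = (same * g_dd * p_d2d[None, :]).sum(axis=1)

    # Interference from the cellular user that owns the reused channel.
    cell_interf = p_cell[channels] * g_cd[channels, k_idx]

    sinr_d2d = p_d2d * np.diag(g_dd) / (noise + d2d_interf + cell_interf)
    sum_rate = float(np.log2(1.0 + sinr_d2d).sum())

    # Cellular QoS: SINR at the base station on every channel.
    interf_bs = np.bincount(channels, weights=p_d2d * g_db,
                            minlength=len(p_cell))
    sinr_cell = p_cell * g_cb / (noise + interf_bs)
    feasible = bool(np.all(sinr_cell >= sinr_min))
    return sum_rate, feasible
```

In a DDPG setting, a function like `evaluate` could serve as the basis of the reward (for example, the sum rate when `feasible` is true and a penalty otherwise); that reward shaping is likewise an assumption, not something stated in the abstract.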

Key words: Device-to-Device (D2D) communication, resource allocation, Markov decision process, deep reinforcement learning, Deep Deterministic Policy Gradient (DDPG) algorithm

CLC number: