Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (2): 571-577. DOI: 10.11772/j.issn.1001-9081.2024030306

• Network and communications •

Dynamic allocation algorithm for multi-beam subcarriers of low orbit satellites based on deep reinforcement learning

Huahua WANG1,2, Liang HUANG1,2, Jiajie CHEN1,2, Jiening FANG1,2

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  2. Satellite and Mobile Communication Protocol Innovation Team (Chongqing University of Posts and Telecommunications), Chongqing 400065, China
  • Received: 2024-03-21 Revised: 2024-04-17 Accepted: 2024-04-22 Online: 2024-05-07 Published: 2025-02-10
  • Contact: Liang HUANG
  • About author: WANG Huahua, born in 1981 in Linfen, Shanxi, M.S., professor-level senior engineer. His research interests include satellite communication.
    CHEN Jiajie, born in 2001 in Hengyang, Hunan, M.S. candidate. His research interests include mobile communication software protocols.
    FANG Jiening, born in 1999 in Guigang, Guangxi, M.S. candidate. His research interests include mobile communication software protocols.
  • Supported by:
    Innovation and Development Joint Fund of Chongqing Natural Science Foundation (China Satellite Network) (CSTB2023NSCQ-LZX0114)

Abstract:

Conventional dynamic subcarrier allocation algorithms for multi-beam Low Earth Orbit (LEO) satellites cannot adjust their parameters to follow changes in the communication environment, because inter-beam interference, noise and other factors in real satellite links are complex and variable. To address this resource allocation problem, traditional communication scheduling algorithms were combined with reinforcement learning: with the goal of minimizing the user packet loss rate, user scheduling was adjusted dynamically and the resources of the entire satellite communication system were allocated dynamically to adapt to environmental changes. The dynamic characteristics model of the LEO satellite was discretized by time slot division, and a Deep Reinforcement Learning (DRL)-based resource allocation strategy was proposed on the basis of a model of the LEO satellite resource allocation scenario. In this strategy, the scheduling queue of the satellite was adjusted to give users with large delay more scheduling opportunities; that is, the eligibility of users for the resource blocks in each beam of a single LEO satellite was adjusted, which reduced the user packet loss rate while maintaining a certain level of fairness. Simulation results show that, under the total power constraint, the user transmission fairness and system throughput of the proposed DRL-based Resource Allocation algorithm (DRL-RA) are relatively stable, and users with large delay obtain more scheduling opportunities in DRL-RA because their priority is raised. Compared with the Proportional Fairness (PF) algorithm and the Maximum Carrier-to-Interference ratio (Max C/I) algorithm, DRL-RA reduces the data packet loss rate by 13.9% and 15.6%, respectively. The proposed algorithm therefore effectively addresses the problem of packet loss during data transmission.
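The abstract names two baseline schedulers (PF and Max C/I) and describes raising the priority of users with large delay. The sketch below illustrates that idea for a single resource block in a single time slot. It is not the authors' DRL-RA: the fixed `delay_weighted_metric`, its `alpha` parameter, the `deadline` value, and all sample numbers are assumptions made for illustration, standing in for a priority that DRL-RA would learn.

```python
import numpy as np

def max_ci_metric(inst_rate, avg_rate, delay, deadline):
    """Max C/I baseline: rank users by instantaneous achievable rate only."""
    return inst_rate

def pf_metric(inst_rate, avg_rate, delay, deadline):
    """Proportional Fairness baseline: instantaneous rate over average served rate."""
    return inst_rate / np.maximum(avg_rate, 1e-9)

def delay_weighted_metric(inst_rate, avg_rate, delay, deadline, alpha=2.0):
    """Hypothetical delay-aware priority (an assumption, not the paper's rule):
    the PF metric is scaled up as a user's queueing delay nears the deadline."""
    urgency = 1.0 + alpha * np.clip(delay / deadline, 0.0, 1.0)
    return urgency * pf_metric(inst_rate, avg_rate, delay, deadline)

def schedule(metric, inst_rate, avg_rate, delay, deadline):
    """Return the index of the user that gets the resource block in this slot."""
    return int(np.argmax(metric(inst_rate, avg_rate, delay, deadline)))

# Made-up sample state for three users in one beam.
inst_rate = np.array([10.0, 6.0, 4.0])  # achievable rate in this slot
avg_rate = np.array([5.0, 2.0, 2.0])    # long-term average served rate
delay = np.array([1.0, 2.0, 9.0])       # head-of-line packet delay, in slots
deadline = 10.0                         # packet is dropped after this many slots

for name, metric in [("Max C/I", max_ci_metric),
                     ("PF", pf_metric),
                     ("delay-weighted", delay_weighted_metric)]:
    print(name, "->", schedule(metric, inst_rate, avg_rate, delay, deadline))
```

With these sample values, Max C/I selects user 0, PF selects user 1, and the delay-weighted rule selects user 2, whose packet is closest to being dropped. This is the kind of shift toward large-delay users that the abstract describes.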

Key words: Low Earth Orbit (LEO) satellite, time slot division, resource allocation, Deep Reinforcement Learning (DRL), priority adjustment

CLC Number: