Dynamic allocation algorithm for multi-beam subcarriers of low orbit satellites based on deep reinforcement learning

Journal of Computer Applications, 2025, Vol. 45, Issue (2): 571-577. DOI: 10.11772/j.issn.1001-9081.2024030306

• Network and communications •

Huahua WANG¹,², Liang HUANG¹,², Jiajie CHEN¹,², Jiening FANG¹,²

1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Satellite and Mobile Communication Protocol Innovation Team (Chongqing University of Posts and Telecommunications), Chongqing 400065, China

Received: 2024-03-21 Revised: 2024-04-17 Accepted: 2024-04-22 Online: 2024-05-07 Published: 2025-02-10
Contact: Liang HUANG
About author: WANG Huahua, born in 1981 in Linfen, Shanxi, M.S., professor-level senior engineer. His research interests include satellite communication.
CHEN Jiajie, born in 2001 in Hengyang, Hunan, M.S. candidate. His research interests include mobile communication software protocols.
FANG Jiening, born in 1999 in Guigang, Guangxi, M.S. candidate. His research interests include mobile communication software protocols.
Supported by:
Innovation and Development Joint Fund of Chongqing Natural Science Foundation (China Satellite Network) (CSTB2023NSCQ-LZX0114)

Abstract:

Resource allocation for Low Earth Orbit (LEO) satellites in multi-beam scenarios is difficult because inter-beam interference, noise, and other factors in real satellite communication environments are complex and time-varying, and conventional dynamic subcarrier allocation algorithms cannot adjust their parameters to follow these changes. To address this, traditional communication scheduling algorithms were combined with reinforcement learning: with the goal of minimizing the user packet loss rate, user scheduling was adjusted dynamically and the resources of the entire satellite communication system were allocated dynamically to adapt to environmental changes. The dynamic characteristic model of the LEO satellite was discretized by time slot division, and a Deep Reinforcement Learning (DRL)-based resource allocation strategy was proposed on the basis of a model of the LEO satellite resource allocation scenario. In this strategy, the satellite scheduling queue was adjusted to give users with large delay more scheduling opportunities, that is, the eligibility of users for the resource blocks in each beam of a single LEO satellite was adjusted, which reduced the user packet loss rate while maintaining a certain level of fairness. Simulation results show that, under the total power constraint, the user transmission fairness and system throughput of the proposed DRL-based Resource Allocation algorithm (DRL-RA) remain stable, and users with large delay obtain more scheduling opportunities because their priority is raised. Compared with the Proportional Fairness (PF) algorithm and the Maximum Carrier-to-Interference (Max C/I) algorithm, DRL-RA reduces the data packet loss rate by 13.9% and 15.6%, respectively. The proposed algorithm therefore effectively mitigates packet loss during data transmission.

Key words: Low Earth Orbit (LEO) satellite, time slot division, resource allocation, Deep Reinforcement Learning (DRL), priority adjustment
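
To make the two baselines named in the abstract concrete, the following minimal Python sketch shows the textbook per-slot selection rules for Max C/I and PF, plus a delay-weighted variant. The delay-weighted rule, its `weight` parameter, and all numeric inputs are illustrative assumptions and are not taken from the paper; the abstract does not specify the state, action, or reward used by DRL-RA.

```python
import numpy as np

def max_ci(rates):
    """Max C/I: pick the user with the highest instantaneous rate."""
    return int(np.argmax(rates))

def proportional_fair(rates, avg_throughput, eps=1e-9):
    """PF: pick the user maximizing instantaneous rate / average throughput."""
    return int(np.argmax(rates / (avg_throughput + eps)))

def delay_weighted_pf(rates, avg_throughput, hol_delay, weight=1.0, eps=1e-9):
    """Hypothetical variant (assumption, not the paper's rule):
    scale the PF metric by (1 + weight * head-of-line delay)."""
    metric = rates / (avg_throughput + eps) * (1.0 + weight * hol_delay)
    return int(np.argmax(metric))

# Made-up example values for three users.
rates = np.array([4.0, 2.0, 1.0])   # instantaneous achievable rates
avg = np.array([4.0, 1.0, 1.0])     # past average throughput per user
delay = np.array([0.0, 0.0, 5.0])   # head-of-line delay per user, in slots

print(max_ci(rates))                         # 0
print(proportional_fair(rates, avg))         # 1
print(delay_weighted_pf(rates, avg, delay))  # 2
```

In this toy case Max C/I always serves the strongest user, PF favors the user that has been under-served relative to its rate, and the delay-weighted rule moves the long-waiting user to the front. A learned policy could, in principle, tune a quantity like `weight` per slot, but that mapping is only an illustration of the priority-adjustment idea.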

CLC number: TN929.5

Huahua WANG, Liang HUANG, Jiajie CHEN, Jiening FANG. Dynamic allocation algorithm for multi-beam subcarriers of low orbit satellites based on deep reinforcement learning[J]. Journal of Computer Applications, 2025, 45(2): 571-577.

Figures/Tables: 10

Fig. 1 LEO multi-beam satellite coverage model

Tab. 1 Relationship between center beam radius and 3 dB angle

$\theta_{3\,\mathrm{dB},l}$ / (°)	Beam radius / km
10	176
9	158
8	141
7	123
6	105
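
The values in Tab. 1 agree with the simple geometric relation $r = h \tan\theta_{3\,\mathrm{dB}}$ when $h$ is the 1000 km orbit altitude listed in Tab. 2. The sketch below checks this. The formula is an inference from the tabulated numbers under a flat-ground approximation, not one stated on this page, and the function name `beam_radius_km` is hypothetical.

```python
import math

def beam_radius_km(theta_3db_deg, altitude_km=1000.0):
    """Center-beam footprint radius, assuming r = h * tan(theta)
    with a flat-ground approximation (an assumption, see lead-in)."""
    return altitude_km * math.tan(math.radians(theta_3db_deg))

for theta in (10, 9, 8, 7, 6):
    print(theta, round(beam_radius_km(theta)))
```

Rounded to the nearest kilometre, the computed radii reproduce the table entries 176, 158, 141, 123, and 105 km.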

Fig. 2 Proportional fairness scheduling scheme

Fig. 3 Dynamic subcarrier allocation

Fig. 4 Flow of optimization strategy

Tab. 2 Simulation parameter setting

Simulation parameter	Value
Number of users	150
Number of beams per satellite	16
System bandwidth ($B$)	500 MHz
Maximum antenna gain ($G_{\max}$)	26.8 dBi
Boltzmann constant ($\kappa$)	228.6 dB
Receiver noise temperature ($T$)	35 K
Rain attenuation ($\chi_{k,\mathrm{dB}}$)	20 dB
Ka-band carrier frequency ($f$)	20 GHz
Receive antenna gain ($G_{\mathrm{re}}$)	24 dBi
Noise power ($N$)	-110 dBm
Number of simulation frames	1000
Service transmission threshold ($C_{\mathrm{th}}$)	500 kbit/s
LEO satellite orbit altitude	1000 km
LEO satellite beam antenna -3 dB angle	10°
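
As a rough illustration of how the Tab. 2 parameters combine, the sketch below computes the free-space path loss at 20 GHz over a 1000 km path and the resulting end-to-end link gain. It assumes a zenith link (slant range equal to orbit altitude), the standard free-space formula $L = 20\log_{10}(4\pi d f / c)$, and peak antenna gains; the paper's actual channel model is not given on this page and may contain further terms. The function names are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s

def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C_LIGHT)

def link_gain_db(g_tx_dbi, g_rx_dbi, distance_m, freq_hz, rain_db):
    """Total link gain in dB: antenna gains minus path loss and rain attenuation.
    Assumes peak gains and no other losses (an illustrative simplification)."""
    return g_tx_dbi + g_rx_dbi - fspl_db(distance_m, freq_hz) - rain_db

# Values from Tab. 2; zenith geometry is an assumption.
loss = fspl_db(1000e3, 20e9)
gain = link_gain_db(26.8, 24.0, 1000e3, 20e9, 20.0)
print(f"FSPL: {loss:.2f} dB, link gain: {gain:.2f} dB")
```

Under these assumptions the path loss is roughly 178.5 dB, so the received power for a given transmit power is about 147.7 dB lower once the 20 dB rain attenuation and both antenna gains are included.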

Fig. 5 Throughput performance comparison

Fig. 6 Comparison of average latency performance

Fig. 7 Comparison of packet loss rate performance among different scheduling algorithms

Fig. 8 Comparison of user satisfaction performance

