Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (7): 2026-2032. DOI: 10.11772/j.issn.1001-9081.2020081249

Special topic: Network and communications

Power allocation algorithm for CR-NOMA system based on tabu search and Q-learning

ZHOU Shuo1,2, QIU Runhe1,2, TANG Minjun1,2

  1. College of Information Sciences and Technology, Donghua University, Shanghai 201620, China;
  2. Engineering Research Center of Digitized Textile and Fashion Technology, Ministry of Education (Donghua University), Shanghai 201620, China
  • Received: 2020-08-18 Revised: 2020-12-23 Published: 2021-07-10 Online: 2021-01-22
  • Corresponding author: ZHOU Shuo
  • About the authors: ZHOU Shuo (1996-), male, born in Shangrao, Jiangxi, M.S. candidate; research interests: cognitive radio, non-orthogonal multiple access. QIU Runhe (1961-), male, born in Shanghai, professor, Ph.D.; research interests: wireless communication systems, cooperative relay networks, cognitive radio networks. TANG Minjun (1994-), male, born in Shanghai, M.S. candidate; research interests: cognitive wireless communication systems.
  • Supported by:
    This work is partially supported by the General Program of the National Natural Science Foundation of China (61671143).

Abstract: To meet the demand of next-generation mobile communication for high data rates and massive connectivity, this work studies how optimizing power allocation can raise the total transmission rate of secondary users in a hybrid Cognitive Radio-Non-Orthogonal Multiple Access (CR-NOMA) system, and proposes a Power Allocation algorithm based on Tabu Search and Q-learning (PATSQ). First, the cognitive base station observes the system environment and learns the users' power allocation, while the secondary users access the licensed channel via NOMA. Second, the power allocation, channel state, and total transmission rate of the power allocation problem are formulated as the action, state, and reward of a Markov decision process, which is solved by combining tabu search with Q-learning to obtain an optimal tabu Q-table. Finally, subject to the Quality of Service (QoS) requirements of the primary and secondary users and to the maximum transmit power, the cognitive base station looks up the tabu Q-table to obtain the optimal power allocation factors, thereby maximizing the total transmission rate of the secondary users in the system. Simulation results show that, for the same total power, the proposed algorithm outperforms the Cognitive Mobile Radio Network (CMRN) algorithm, the Secondary user First Decode Mode (SFDM) algorithm, and the conventional equal power allocation algorithm in both the total transmission rate of secondary users and the number of users the system can accommodate.

Key words: Non-Orthogonal Multiple Access (NOMA), cognitive radio, power allocation, tabu search, Q-learning, Quality of Service (QoS), Markov decision process
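
The abstract does not specify the state discretization, the tabu rule, or the reward shaping, so the following Python sketch is only one plausible reading of the idea and not the authors' PATSQ implementation. It assumes a toy downlink with two secondary users sharing one licensed channel via NOMA, takes a discretized channel state as the MDP state, a power allocation factor as the action, and the QoS-constrained sum rate as the reward. The tabu rule used here (recently tried actions in a state are temporarily excluded) is an assumption, as are all constants and the names `STATES`, `ACTIONS`, `sum_rate`, `train`, and `best_factor`. The primary-user constraint is omitted for brevity.

```python
import math
import random
from collections import deque

NOISE = 1.0          # normalized noise power (assumed)
TOTAL_POWER = 10.0   # total secondary transmit power budget (assumed)
MIN_RATE = 0.5       # per-user QoS rate threshold in bit/s/Hz (assumed)
# Hypothetical discretized channel states: (strong-user gain, weak-user gain).
STATES = [(4.0, 0.5), (2.0, 0.4), (1.0, 0.2)]
# Candidate power allocation factors: share of the power given to the weak user.
ACTIONS = [i / 20 for i in range(1, 20)]


def sum_rate(alpha, gains):
    """Two-user downlink NOMA sum rate; returns 0 if any user misses its QoS."""
    g_strong, g_weak = gains
    p_weak = alpha * TOTAL_POWER
    p_strong = (1 - alpha) * TOTAL_POWER
    # The weak user decodes its own signal and treats the strong user's as noise.
    r_weak = math.log2(1 + p_weak * g_weak / (p_strong * g_weak + NOISE))
    # The strong user removes the weak user's signal by successive cancellation.
    r_strong = math.log2(1 + p_strong * g_strong / NOISE)
    if min(r_weak, r_strong) < MIN_RATE:
        return 0.0
    return r_weak + r_strong


def train(episodes=20000, lr=0.1, gamma=0.3, eps=0.2, tenure=5, seed=0):
    """Q-learning in which each state keeps a short tabu list of recent actions."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in STATES]
    # Tabu rule (assumed): the last `tenure` actions tried in a state are
    # excluded from selection, pushing the search toward other power factors.
    tabu = [deque(maxlen=tenure) for _ in STATES]
    s = rng.randrange(len(STATES))
    for _ in range(episodes):
        allowed = [a for a in range(len(ACTIONS)) if a not in tabu[s]]
        if rng.random() < eps:
            a = rng.choice(allowed)                    # random exploration
        else:
            a = max(allowed, key=lambda i: q[s][i])    # best non-tabu action
        reward = sum_rate(ACTIONS[a], STATES[s])
        s_next = rng.randrange(len(STATES))            # i.i.d. channel (assumed)
        # Standard Q-learning update.
        q[s][a] += lr * (reward + gamma * max(q[s_next]) - q[s][a])
        tabu[s].append(a)
        s = s_next
    return q


def best_factor(q, s):
    """Look up the learned table for the power factor to use in state s."""
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
```

After training, `best_factor(train(), s)` plays the role of the table lookup described in the abstract: for each channel state it returns a power factor that satisfies the assumed QoS threshold. Because the toy channel is i.i.d. and the learning rate is constant, the returned factor may sit a grid step or two away from the brute-force optimum.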

CLC Number: