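Transmission and scheduling scheme based on W-learning algorithm in wireless networks

Journal of Computer Applications, 2013, Vol. 33, Issue 11: 3005-3009.

ZHU Jiang, PENG Zhenzhen, ZHANG Yuping

Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Received: 2013-05-24 Revised: 2013-07-17 Online: 2013-12-04 Published: 2013-11-01
Contact: PENG Zhenzhen
About the authors: ZHU Jiang (born 1977), male, from Jingzhou, Hubei; associate professor, Ph.D.; research interests: cognitive radio, mobile communications. PENG Zhenzhen (born 1989), female, from Dazhou, Sichuan; master's student; research interest: cognitive radio. ZHANG Yuping (born 1987), male, from Tongliao, Inner Mongolia; master's student; research interest: cognitive radio.
Supported by: National Natural Science Foundation of China; Key Project of Science and Technology Research of the Ministry of Education; Natural Science Foundation of the Chongqing Science and Technology Commission; Science and Technology Research Project of the Chongqing Education Commission; Doctoral Start-up Fund of Chongqing University of Posts and Telecommunications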

Abstract

To address the transmission problem in wireless networks, an intelligent transmission and scheduling scheme for wireless networks is proposed. The system model is built on a Markov Decision Process (MDP). With the W-learning algorithm, the relay node learns the occupancy state of its buffer and the channel quality, and then intelligently selects what to transmit and in which transmission mode, so as to minimize packet loss while saving energy. State aggregation is used to overcome the curse of dimensionality caused by an overly large state space, and action set reduction is used to reduce the number of actions available in certain states. With these simplifications, the storage space compression ratio is 41% for the successive approximation method and 43% for the W-learning algorithm. Finally, simulation results show that the proposed scheme transmits as much data as possible while saving energy and reduces packet loss, and that state aggregation and action set reduction effectively simplify the computation without degrading the performance of the algorithms.

Key words: transmission and scheduling scheme, Markov Decision Process (MDP), W-learning algorithm, relay node, approximate optimal strategy
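To make the idea of state aggregation and action set reduction more concrete, the following is a minimal sketch of how the two simplifications shrink a table of state-action values. Everything in it is an assumption made for illustration: the buffer size, the channel states, the action names, the bin boundaries, and the pruning rules are hypothetical and are not taken from the paper. The sketch also does not reproduce the reported 41% and 43% figures, since the paper's state space and its exact definition of the storage space compression ratio are not given here.

```python
from itertools import product

# Hypothetical sizes, chosen only for illustration (not from the paper).
BUFFER_LEVELS = range(0, 11)                  # packets queued at the relay: 0..10
CHANNEL_STATES = ("bad", "medium", "good")
ACTIONS = ("idle", "send_low_power", "send_high_power")


def aggregate(buffer_level: int) -> str:
    """State aggregation: collapse raw buffer occupancy into coarse bins."""
    if buffer_level == 0:
        return "empty"
    if buffer_level <= 5:
        return "low"
    return "high"


def allowed_actions(buffer_bin: str, channel: str) -> tuple:
    """Action set reduction: drop actions assumed pointless in a state."""
    if buffer_bin == "empty":
        return ("idle",)                      # nothing to send
    if channel == "bad":
        # Assumption: low-power transmission is futile on a bad channel.
        return ("idle", "send_high_power")
    return ACTIONS


def table_sizes() -> tuple:
    """Return (full, reduced) counts of stored state-action values."""
    full = len(BUFFER_LEVELS) * len(CHANNEL_STATES) * len(ACTIONS)
    bins = sorted({aggregate(b) for b in BUFFER_LEVELS})
    reduced = sum(
        len(allowed_actions(b, c)) for b, c in product(bins, CHANNEL_STATES)
    )
    return full, reduced


full, reduced = table_sizes()
print(f"full table: {full} entries, reduced table: {reduced} entries")
print(f"fraction of storage saved: {1 - reduced / full:.2%}")
```

The same pattern applies whether the table is filled by successive approximation or by a learning rule: the aggregated state and the reduced action set only determine which entries need to be stored.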

CLC Number: TN925

ZHU Jiang, PENG Zhenzhen, ZHANG Yuping. Transmission and scheduling scheme based on W-learning algorithm in wireless networks[J]. Journal of Computer Applications, 2013, 33(11): 3005-3009.
