Reinforcement learning-based particle swarm optimization algorithm with adaptive dynamic strategy

doi:10.11772/j.issn.1001-9081.2025070848

Abstract

Abstract: To address the performance degradation of particle swarm optimization in high-dimensional complex optimization scenarios caused by improper parameter control, an adaptive dynamic strategy particle swarm optimization algorithm based on reinforcement learning was proposed. A reinforcement learning agent was constructed to generate nonlinear adjustment strategies for the inertia weight and the social learning factor dynamically and online, thereby autonomously balancing global exploration and local exploitation. A multimodal social learning mechanism was further designed to dynamically assign social learning targets to particles during the middle and late stages of the algorithm, using differentiated guidance strategies to overcome the limit that a single social learning mode places on population diversity. Experimental results demonstrate that, compared with improved algorithms such as NHPSO-JTVAC (new self-organising hierarchical particle swarm optimization with jumping time-varying acceleration coefficients), VP-PSO (velocity pausing particle swarm optimization), and EA-PSO (elite archives-driven particle swarm optimization), the proposed algorithm exhibits superior convergence accuracy and stability on the CEC2013 test suite, particularly on high-dimensional complex composition functions. Across the 28 test functions, the number of functions on which the proposed algorithm performs significantly better than or equal to each compared algorithm is no less than 18. It also ranks first in the Friedman test, indicating excellent overall performance and providing a more broadly applicable approach to complex nonlinear optimization problems.

Key words: reinforcement learning, particle swarm optimization algorithm, adaptive dynamic parameters, differentiated guidance
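
The abstract does not specify the agent's state, action, or reward design, so the following is only a minimal sketch of the general idea of letting a reinforcement learning agent choose the inertia weight and social learning factor online. Everything in it is an assumption made for illustration: a tabular Q-learning agent stands in for the paper's agent, the state is the search phase (early, middle, late), the actions are three hypothetical (w, c2) pairs, and the reward is 1 when the global best improves. The multimodal social learning mechanism is not modeled, and the names `rl_pso`, `sphere`, `ACTIONS`, and `N_STATES` are hypothetical.

```python
import random

# Hypothetical action set: each action is an (inertia weight w, social factor c2) pair.
ACTIONS = [(0.9, 1.0), (0.7, 1.5), (0.4, 2.0)]
N_STATES = 3  # assumed state: search progress split into early / middle / late


def sphere(x):
    """Simple test objective with minimum 0 at the origin."""
    return sum(v * v for v in x)


def rl_pso(objective, dim=5, n_particles=20, iters=200, seed=0,
           c1=1.5, alpha=0.1, gamma=0.9, epsilon=0.1, bound=5.0):
    """PSO whose (w, c2) are picked each iteration by tabular Q-learning."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]

    for t in range(iters):
        state = min(N_STATES - 1, t * N_STATES // iters)
        # Epsilon-greedy choice of the parameter pair for this iteration.
        if rng.random() < epsilon:
            action = rng.randrange(len(ACTIONS))
        else:
            action = max(range(len(ACTIONS)), key=q[state].__getitem__)
        w, c2 = ACTIONS[action]
        prev_best = gbest_val

        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp positions to the search box [-bound, bound].
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

        # Assumed reward: 1 if the global best improved this iteration, else 0.
        reward = 1.0 if gbest_val < prev_best else 0.0
        next_state = min(N_STATES - 1, (t + 1) * N_STATES // iters)
        # One-step Q-learning update.
        q[state][action] += alpha * (
            reward + gamma * max(q[next_state]) - q[state][action])
        history.append(gbest_val)

    return gbest, gbest_val, history, q
```

Calling `rl_pso(sphere)` returns the best position found, its objective value, the per-iteration best-value history, and the learned Q-table. A coarse phase-based state keeps the table tiny; a richer state (for example, population diversity) would be closer in spirit to an adaptive strategy, but the abstract gives no detail to ground that choice.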


CLC Number: TP301

韩煜熊宋尚校许攀李卓然熊敏于长华. Reinforcement learning-based particle swarm optimization algorithm with adaptive dynamic strategy [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025070848.

