Task-based assistive robot path planning in nursing home scenarios

doi:10.11772/j.issn.1001-9081.2024101534

Journal of Computer Applications, 2025, Vol. 45, Issue (10): 3270-3276. DOI: 10.11772/j.issn.1001-9081.2024101534

• Advanced Computing •

Yu WANG, Mingyue ZHAO, Xiaolin ZHOU

School of Automation, Shenyang Aerospace University, Shenyang, Liaoning 110136, China

Received: 2024-10-31 Revised: 2024-12-27 Accepted: 2024-12-30 Online: 2025-03-07 Published: 2025-10-10
Contact: Yu WANG
About the authors: WANG Yu, born in 1980 in Shenyang, Liaoning, Ph.D., associate professor. Her research interests include machine learning and intelligent decision-making. Email: wangyu@sau.edu.cn
ZHAO Mingyue, born in 1998 in Fuxin, Liaoning, M.S. candidate. Her research interests include reinforcement learning and the application of optimization algorithms in path planning.
ZHOU Xiaolin, born in 2001 in Liaoyang, Liaoning, M.S. candidate. Her research interests include task planning.
Supported by:
National Natural Science Foundation of China (61906125); National Natural Science Foundation of China (62373261); Basic Research Funds for Universities of Liaoning Province (LJ232410143020); Basic Research Funds for Universities of Liaoning Province (LJ212410143047)

Abstract

Global population aging is becoming increasingly severe, and elderly care services face a serious manpower shortage, creating an urgent need for robot technology with intelligent decision-making capabilities. To solve the autonomous path planning problem of assistive robots under a multi-task mechanism in nursing home scenarios, an improved decision-making algorithm based on the stochastic-policy Soft Actor-Critic (SAC) reinforcement learning algorithm was proposed. Firstly, an obstacle contour reconstruction method based on virtual circles was proposed, which reduced the difficulty of environmental modeling and improved radar detection efficiency. Secondly, because reinforcement learning algorithms struggle to optimize a policy from scratch when solving complex tasks in a continuous state space, the Whale Optimization Algorithm (WOA) was combined with the SAC algorithm to obtain the WOA-SAC algorithm, in which an auxiliary supervision mechanism provided directional guidance for the learning process, improving decision-making capability while significantly accelerating convergence. Finally, tasks were planned on the basis of the daily needs of the elderly, and model training was completed in environments comprising fixed tasks with static and dynamic obstacles as well as emergent random tasks. Simulation results show that, compared with the traditional SAC algorithm, the WOA-SAC algorithm reduces the average path length by 10.42%, increases the success rate by 6.66%, and decreases the average step size by 29.63%. The WOA-SAC algorithm therefore significantly improves the learning efficiency and decision-making capability of the SAC algorithm and solves the autonomous path planning problem under a multi-task mechanism.

Key words: elderly care robot, multi-task path planning, guided learning, Whale Optimization Algorithm (WOA), reinforcement learning
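
One way to see why approximating obstacles by circles can make simulated radar detection cheap is the following minimal sketch. It is not the authors' implementation: it assumes that each obstacle is covered by one or more virtual circles and that a radar beam is a 2D ray. The function names `ray_circle_distance` and `scan`, the beam count, and the maximum range are all hypothetical. Under these assumptions each beam reduces to a closed-form ray-circle intersection.

```python
import math


def ray_circle_distance(origin, angle, center, radius, max_range):
    """Distance along a ray to a circle, or max_range if there is no hit.

    Hypothetical helper: the ray starts at `origin` and points along `angle`
    (radians); the circle stands in for one virtual circle of an obstacle.
    """
    dx, dy = math.cos(angle), math.sin(angle)
    # Vector from the circle center to the ray origin.
    fx, fy = origin[0] - center[0], origin[1] - center[1]
    # Solve |f + t*d|^2 = r^2 for t with |d| = 1: t^2 + 2*b*t + c = 0.
    b = fx * dx + fy * dy
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return max_range  # the ray line never touches the circle
    t = -b - math.sqrt(disc)  # nearest intersection along the ray
    if t < 0.0:
        # Origin inside (or on) the circle counts as contact; otherwise the
        # circle lies entirely behind the ray origin.
        return 0.0 if c <= 0.0 else max_range
    return min(t, max_range)


def scan(origin, heading, circles, n_beams=8, max_range=10.0):
    """Return one range reading per beam, taking the nearest circle hit."""
    readings = []
    for k in range(n_beams):
        angle = heading + 2.0 * math.pi * k / n_beams
        hits = [ray_circle_distance(origin, angle, c, r, max_range)
                for c, r in circles]
        readings.append(min(hits + [max_range]))
    return readings


if __name__ == "__main__":
    # One virtual circle of radius 1 centered 5 units ahead of the robot.
    print(scan((0.0, 0.0), 0.0, [((5.0, 0.0), 1.0)], n_beams=4))
```

Each reading costs one square root per circle, so the cost of a scan grows with the number of beams times the number of virtual circles rather than with the number of polygon edges.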

CLC number: TP242

Yu WANG, Mingyue ZHAO, Xiaolin ZHOU. Task-based assistive robot path planning in nursing home scenarios[J]. Journal of Computer Applications, 2025, 45(10): 3270-3276.

Figures/Tables (16)

Fig. 1 Simulation map of nursing home environment

Fig. 2 Multi-task autonomous path planning decision-making system

Fig. 3 Kinematic model

Fig. 4 Obstacle contour reconstruction based on virtual circles

Fig. 5 Radar model

Fig. 6 Flow of proposed algorithm

Fig. 7 Fitness value of WOA

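Tab. 1 Hyperparameter details

Parameter	Symbol	Value	Parameter	Symbol	Value
Replay buffer size	M	65 536	Soft update rate	τ	0.01
Training batch size	N	64	Maximum training episodes	E	1 000
Actor network learning rate	l_a	3×10^-4	Weighting coefficients	λ₁, λ₂	0.5
Critic network learning rate	l_c	3×10^-4	Robot steps	T	500
Reward discount rate	γ	0.9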
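
The hyperparameters in Tab. 1 could be collected in code roughly as follows. This is a minimal sketch under assumptions: the dictionary keys are hypothetical names, both weighting coefficients are taken to be 0.5 as the table appears to indicate, and the soft update shown is the standard Polyak averaging commonly used for SAC target networks (the source lists only the rate τ). Plain Python lists stand in for network weights.

```python
# Values copied from Tab. 1; the key names are hypothetical.
HYPERPARAMS = {
    "replay_buffer_size": 65536,  # M
    "batch_size": 64,             # N
    "actor_lr": 3e-4,             # l_a
    "critic_lr": 3e-4,            # l_c
    "gamma": 0.9,                 # reward discount rate
    "tau": 0.01,                  # soft update rate
    "max_episodes": 1000,         # E
    "lambda_1": 0.5,              # weighting coefficient (assumed 0.5)
    "lambda_2": 0.5,              # weighting coefficient (assumed 0.5)
    "max_steps": 500,             # T, robot steps
}


def soft_update(target, online, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def discounted_return(rewards, gamma):
    """Sum of gamma**k * r_k, accumulated from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    print(soft_update([0.0, 1.0], [1.0, 1.0], HYPERPARAMS["tau"]))
    print(discounted_return([1.0, 1.0, 1.0], HYPERPARAMS["gamma"]))
```

With τ = 0.01 the target weights move 1% of the way toward the online weights at each update, and with γ = 0.9 a reward ten steps ahead is weighted by 0.9^10, roughly 0.35.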

Fig. 8 Robot path planning results of WOA

Fig. 9 Model training curve

Fig. 10 Avoiding dynamic obstacles during tasks

Fig. 11 Multi-task path planning

Fig. 12 Comparison of path planning results of two algorithms in static environment

Fig. 13 Comparison of reward changes of four algorithms in static and dynamic environments

Fig. 14 Comparison of path planning results of two algorithms in dynamic environment

Tab. 2 Performance comparison between SAC and WOA-SAC

Algorithm	Average path length/m	Success rate	Average step size
SAC	48	0.75	270
WOA-SAC	43	0.80	190
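
The percentage improvements quoted in the abstract can be reproduced from Tab. 2, assuming each figure is a relative change with respect to the SAC baseline. The snippet below is only an arithmetic check; the function name `relative_change` and the dictionary keys are hypothetical.

```python
def relative_change(baseline, new):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Values copied from Tab. 2.
sac = {"path_length_m": 48, "success_rate": 0.75, "avg_step": 270}
woa_sac = {"path_length_m": 43, "success_rate": 0.80, "avg_step": 190}

changes = {k: relative_change(sac[k], woa_sac[k]) for k in sac}

if __name__ == "__main__":
    for name, pct in changes.items():
        print(f"{name}: {pct:+.2f}%")
```

Under this assumption the path length and step figures come out at about 10.42% and 29.63%. The success-rate figure comes out at about 6.67% after rounding, which agrees with the reported 6.66% if that value was truncated rather than rounded.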

