Journal of Computer Applications (《计算机应用》), sole official website


Task-based assistive robot path planning in nursing home scenarios

WANG Yu (王昱), ZHAO Mingyue (赵明月), ZHOU Xiaolin (周小琳)

  1. School of Automation, Shenyang Aerospace University, Shenyang 110136
  • Received: 2024-10-30 Revised: 2024-12-27 Accepted: 2024-12-30 Online: 2025-03-07 Published: 2025-03-07
  • Corresponding author: WANG Yu
  • Supported by:
    National Natural Science Foundation of China; Fundamental Research Funds for Universities of Liaoning Province

Abstract: Global population aging is becoming increasingly severe, and the elderly care sector faces a serious manpower shortage, creating an urgent need for robots with intelligent decision-making capabilities. To address autonomous path planning for assistive robots under a multi-task mechanism in nursing home scenarios, an improved Soft Actor-Critic (SAC) reinforcement learning decision algorithm was proposed. Firstly, an obstacle contour reconstruction method based on virtual circles was introduced, which reduced the difficulty of environment modeling while improving radar detection efficiency. Secondly, because reinforcement learning algorithms have difficulty optimizing a policy from scratch when solving complex tasks in a continuous state space, the Whale Optimization Algorithm (WOA) was combined with the SAC algorithm: an auxiliary supervision mechanism was constructed to provide directional guidance for the learning process, improving decision-making capability while significantly accelerating convergence. Finally, tasks were planned according to the daily needs of the elderly, and the model was trained in environments that included fixed tasks with static and dynamic obstacles as well as sudden random tasks. Simulation results showed that, compared with the traditional SAC algorithm, the WOA-SAC algorithm shortened the average path length from 48 m to 43 m, raised the success rate by 5 percentage points from 75% to 80%, and reduced the average step count from 270 to 190. WOA-SAC thus significantly improved the learning efficiency and decision-making capability of the SAC algorithm and addressed autonomous path planning under a multi-task mechanism.

Key words: elderly care robot, multi-task path planning, guided learning, whale optimization algorithm, reinforcement learning

CLC number: TP242
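
The abstract states that WOA is combined with SAC to guide learning, but it does not describe how the two are coupled. As background only, the following minimal sketch shows the standard WOA update rules (encircling the best solution, random exploration, and the spiral move) on a toy objective. Everything here is an assumption for illustration: the function name `woa_minimize`, its parameters, the use of one scalar coefficient pair per whale, and the `sphere` test function are hypothetical and are not taken from the paper, and the sketch does not implement the WOA-SAC supervision mechanism.

```python
import math
import random


def woa_minimize(objective, dim, bounds, n_whales=20, n_iters=100, b=1.0, seed=0):
    """Standard-form Whale Optimization Algorithm sketch (minimization).

    Hypothetical helper for illustration; not the paper's WOA-SAC method.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population inside the box [lo, hi]^dim.
    whales = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_fit = objective(best)
    history = [best_fit]  # best-so-far fitness after each iteration

    for t in range(n_iters):
        a = 2.0 * (1.0 - t / n_iters)  # decreases linearly from 2 toward 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a  # step coefficient in [-a, a]
            C = 2.0 * rng.random()          # weighting coefficient in [0, 2)
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore
                # around a randomly chosen whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - xi) for r, xi in zip(ref, x)]
            else:
                # Logarithmic spiral move toward the best whale.
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(bj - xi) * spiral + bj for bj, xi in zip(best, x)]
            # Clamp each coordinate back into the search box.
            whales[i] = [min(hi, max(lo, v)) for v in new]
        # Keep the best solution seen so far (elitism).
        for x in whales:
            fit = objective(x)
            if fit < best_fit:
                best, best_fit = x[:], fit
        history.append(best_fit)
    return best, best_fit, history


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_fit, history = woa_minimize(sphere, dim=2, bounds=(-10.0, 10.0))
```

Because the best solution is only replaced when a strictly better one is found, `history` never increases. In a planning setting, one plausible use (again an assumption, not the paper's stated design) would be to treat candidate waypoints or actions as whale positions and a path cost as the objective.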