Journal of Computer Applications ›› 2013, Vol. 33 ›› Issue (05): 1301-1304. DOI: 10.3724/SP.J.1087.2013.01301

• Artificial Intelligence •

Similar key posture transformation based on hierarchical Option for humanoid robot

KE Wende1,2, PENG Zhiping1, CHEN Ke1, XIANG Shunbo1

  1. College of Computer and Electronic Information, Guangdong University of Petrochemical Technology, Maoming, Guangdong 525000, China
  2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
  • Received: 2012-11-29 Revised: 2013-01-04 Online: 2013-05-08 Published: 2013-05-01
  • Contact: KE Wende
  • About the authors: KE Wende (born 1976), male, from Maoming, Guangdong; Ph.D. candidate, associate professor, CCF member; research interests: robotics, computer architecture. PENG Zhiping (born 1969), male, from Quanzhou, Fujian; professor, Ph.D.; research interests: intelligent agents, robotics. CHEN Ke (born 1964), male, from Mudanjiang, Heilongjiang; associate professor, M.S.; research interests: multi-robot cooperation, data mining. XIANG Shunbo (born 1979), male, from Zongyang, Anhui; lecturer, M.S.; research interests: computer software, robotics.
  • Supported by: National Natural Science Foundation of China (61272382); Natural Science Foundation of Guangdong Province (8152500002000003, S2012010009963); Science and Technology Innovation Project of Guangdong Higher Education Institutions (2012KJCX0077); Guangdong University Engineering Center for Petrochemical Equipment Fault Diagnosis and Information Control (512009)

Abstract: Human motion trajectories recorded by a motion capture system are fixed, which makes them hard to use for transforming between key postures of a humanoid robot. To address this problem, a similarity-based key posture transformation method built on hierarchical Option learning is proposed. A multi-level tree of key postures is constructed, and the difference between key postures is described in terms of the joint similarity difference, the overall similarity difference at a single time instant, and the overall similarity difference over a period. Hierarchical reinforcement learning with Options is then introduced: sets of key postures and Option behaviors are established, and the SMDP-Q method uses the cumulative reward derived from key posture differences to approximate the optimal Option value function, which realizes the transformation between key postures. Experiments verify the effectiveness of the method.

Key words: humanoid robot, hierarchical reinforcement learning, similarity, posture
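
For readers unfamiliar with the SMDP-Q method named in the abstract, the following is a minimal sketch of the standard SMDP Q-learning update over Options, not the authors' implementation. The function name `smdp_q_update`, the posture and Option labels, and the choice of negative posture differences as per-step rewards are all illustrative assumptions; the abstract does not specify how the reward is computed.

```python
from collections import defaultdict


def smdp_q_update(q, state, option, rewards, next_state, next_options,
                  alpha=0.1, gamma=0.9):
    """Apply one SMDP Q-learning update after an Option has terminated.

    `rewards` holds the per-step rewards collected while the Option ran,
    so the Option lasted len(rewards) primitive steps.
    """
    k = len(rewards)
    # Discounted reward accumulated while the Option was executing.
    cumulative = sum((gamma ** i) * r for i, r in enumerate(rewards))
    # Best Option value in the state where the Option terminated
    # (0.0 if no Option is available there, e.g. a terminal posture).
    best_next = max((q[(next_state, o)] for o in next_options), default=0.0)
    # The bootstrap term is discounted by gamma**k because k steps elapsed.
    target = cumulative + (gamma ** k) * best_next
    q[(state, option)] += alpha * (target - q[(state, option)])
    return q[(state, option)]


# Hypothetical example: the Option "raise_arm" runs for three steps and
# moves the robot from key posture "stand" to key posture "wave".
# Rewards are assumed to be negative posture differences (closer is better).
q = defaultdict(float)
new_value = smdp_q_update(q, "stand", "raise_arm", [-0.5, -0.2, 0.0],
                          "wave", ["lower_arm", "hold"],
                          alpha=0.5, gamma=0.9)
```

Repeating this update over many Option executions moves the table `q` toward the optimal Option value function, which is the role the abstract assigns to SMDP-Q.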
