[1] PARKER L E. Multiple mobile robot systems[M]//Springer Handbook of Robotics. Berlin: Springer, 2008: 921-941.
[2] CHAKRABORTY J, MUKHOPADHYAY S. A robust cooperative multi-robot path-planning in noisy environment[C]//Proceedings of the 2010 IEEE International Conference on Industrial and Information Systems. Piscataway: IEEE, 2010: 626-631.
[3] DAI B, XIAO X, CAI Z. Current status and future development of mobile robot path planning technology[J]. Control Engineering of China, 2005, 12(3): 198-202. (戴博,肖晓明,蔡自兴.移动机器人路径规划技术的研究现状与展望[J].控制工程,2005,12(3):198-202.)
[4] SHI L, LUO Q, HAN B, et al. Research in biomimetic experiment of hexapod robot[J]. Journal of System Simulation, 2008, 20(19): 5384-5387.
[5] JARADAT M, GARIBEH M H, FEILAT E A. Dynamic motion planning for autonomous mobile robot using fuzzy potential field[C]//Proceedings of the 6th International Symposium on Mechatronics and Its Applications. Piscataway: IEEE, 2009: 24-26.
[6] GHATEE M, MOHADES A. Motion planning in order to optimize the length and clearance applying a Hopfield neural network[J]. Expert Systems with Applications, 2009, 36(3): 4688-4695.
[7] XU Y, YAO Y. Research on AUV global path planning considering ocean current[J]. Ship Building of China, 2008, 49(4): 109-114. (徐玉如,姚耀中.考虑海流影响的水下机器人全局路径规划研究[J].中国造船,2008,49(4):109-114.)
[8] HAO D, LIU B. Behavior fusion path planning method for mobile robot based on fuzzy logic[J]. Computer Engineering and Design, 2009, 30(3): 660-663. (郝冬,刘斌.基于模糊逻辑行为融合路径规划方法[J].计算机工程与设计,2009,30(3):660-663.)
[9] SONG Y, LI Y, LI C. Initialization in reinforcement learning for mobile robots path planning[J]. Control Theory & Applications, 2012, 29(12): 1623-1628. (宋勇,李贻斌,李彩虹.移动机器人路径规划强化学习的初始化[J].控制理论与应用,2012,29(12):1623-1628.)
[10] BARTO A G, MAHADEVAN S. Recent advances in hierarchical reinforcement learning[J]. Discrete Event Dynamic Systems, 2003, 13(4): 341-379.
[11] SABATTINI L, SECCHI C, FANTUZZI C. Arbitrarily shaped formations of mobile robots: artificial potential fields and coordinate transformation[J]. Autonomous Robots, 2011, 30(4): 385-397.
[12] KHATIB O. Real-time obstacle avoidance for manipulators and mobile robots[C]//Proceedings of the 1985 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 1985, 2: 500-505.
[13] LIANG T. A speedup convergent method for multi-agent reinforcement learning[C]//Proceedings of the 2009 International Conference on Information Engineering and Computer Science. Piscataway: IEEE, 2009: 1-4.
[14] SUTTON R S, PRECUP D, SINGH S P. Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning[J]. Artificial Intelligence, 1999, 112(1/2): 181-211.
[15] PARR R. Hierarchical control and learning for Markov decision processes[D]. Berkeley: University of California, 1998: 17-109.
[16] DIETTERICH T G. Hierarchical reinforcement learning with the MAXQ value function decomposition[J]. Journal of Artificial Intelligence Research, 2000, 13(1): 227-303.
[17] SHEN J, LIU H, ZHANG R, et al. Multi-robot hierarchical reinforcement learning based on semi-Markov games[J]. Journal of Shandong University: Engineering Science, 2010, 40(4): 1-7. (沈晶,刘海波,张汝波,等.基于半马尔可夫对策的多机器人分层强化学习[J].山东大学学报:工学版,2010,40(4):1-7.)
[18] SINGH S P, JAAKKOLA T, LITTMAN M L, et al. Convergence results for single-step on-policy reinforcement learning algorithms[J]. Machine Learning, 2000, 38(3): 287-308.