| 1 | LI Z, CHENG X, PENG X B, et al. Reinforcement learning for robust parameterized locomotion control of bipedal robots[C]// Proceedings of the 2021 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2021: 2811-2817.  10.1109/icra48506.2021.9560769 | 
																													
																						| 2 | KHAN A T, LI S, CAO X. Human guided cooperative robotic agents in smart home using beetle antennae search[J]. Science China Information Sciences, 2022, 65: 122204.  10.1007/s11432-020-3073-5 | 
																													
																						| 3 | XIN S, VIJAYAKUMAR S. Online dynamic motion planning and control for wheeled biped robots[C]// Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2020: 3892-3899.  10.1109/iros45743.2020.9340967 | 
																													
																						| 4 | KHAN A T, LI S, ZHOU X. Trajectory optimization of 5-link biped robot using beetle antennae search[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2021, 68(10): 3276-3280.  10.1109/tcsii.2021.3062639 | 
																													
																						| 5 | JEONG H, LEE I, OH J, et al. A robust walking controller based on online optimization of ankle, hip, and stepping strategies[J]. IEEE Transactions on Robotics, 2019, 35(6): 1367-1386.  10.1109/tro.2019.2926487 | 
																													
| 6 | LIAO F K, ZHOU Y L, ZHANG Q Z. Walking control and stability analysis of flexible biped robot with variable length legs[J]. Journal of Computer Applications, 2023, 43(1): 312-320. (in Chinese) | 
																													
| 7 | ZHANG R, ZHANG Q Z, ZHOU Y L. Starting and walking human-like control of semi-passive bipedal robot with variable length telescopic legs[J]. Journal of Computer Applications, 2022, 42(1): 252-257. (in Chinese) | 
																													
																						| 8 | YU J, LIU Y, LI R, et al. Stable walking of seven-link biped robot based on CPG-ZMP hybrid control method[C]// Proceedings of the 2021 IEEE International Conference on Robotics and Biomimetics. Piscataway: IEEE, 2021: 870-874.  10.1109/robio54168.2021.9739430 | 
																													
																						| 9 | YAMAMOTO T, SUGIHARA T. Foot-guided control of a biped robot through ZMP manipulation[J]. Advanced Robotics, 2020, 34(21/22): 1472-1489.  10.1080/01691864.2020.1827031 | 
																													
| 10 | TAN J, ZHANG T, COUMANS E, et al. Sim-to-real: learning agile locomotion for quadruped robots[EB/OL]. (2018-05-16) [2023-02-11].  10.15607/rss.2018.xiv.010 | 
																													
| 11 | HAARNOJA T, HA S, ZHOU A, et al. Learning to walk via deep reinforcement learning[EB/OL]. (2019-06-19) [2023-02-11].  10.15607/rss.2019.xv.011 | 
																													
																						| 12 | ARULKUMARAN K, DEISENROTH M P, BRUNDAGE M, et al. Deep reinforcement learning: a brief survey[J]. IEEE Signal Processing Magazine, 2017, 34(6): 26-38.  10.1109/msp.2017.2743240 | 
																													
| 13 | LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[EB/OL]. (2019-07-05) [2023-02-11]. | 
																													
																						| 14 | MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]// Proceedings of the 2016 International Conference on Machine Learning. New York: JMLR.org, 2016: 1928-1937. | 
																													
																						| 15 | WU Y, YAO D, XIAO X, et al. Intelligent controller for passivity-based biped robot using deep Q network[J]. Journal of Intelligent & Fuzzy Systems, 2019, 36(1): 731-745.  10.3233/jifs-172180 | 
																													
																						| 16 | WU X, LIU S, ZHANG T, et al. Motion control for biped robot via DDPG-based deep reinforcement learning[C]// Proceedings of the 2018 WRC Symposium on Advanced Robotics and Automation. Piscataway: IEEE, 2018: 40-45.  10.1109/wrc-sara.2018.8584227 | 
																													
| 17 | SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[EB/OL]. (2017-08-28) [2023-02-11]. | 
																													
																						| 18 | SCHULMAN J, LEVINE S, MORITZ P, et al. Trust region policy optimization[C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 1889-1897. | 
																													
																						| 19 | WU Y-H, YU Z-C, LI C-Y, et al. Reinforcement learning in dual-arm trajectory planning for a free-floating space robot[J]. Aerospace Science and Technology, 2020, 98: 105657.  10.1016/j.ast.2019.105657 | 
																													
| 20 | ZHAO Y T, HAN B L, LUO Q S. Walking stability control method based on deep Q-network for biped robot on uneven ground[J]. Journal of Computer Applications, 2018, 38(9): 2459-2463. (in Chinese) | 
																													
																						| 21 | TAO C, XUE J, ZHANG Z, et al. Parallel deep reinforcement learning method for gait control of biped robot[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2022, 69(6): 2802-2806.  10.1109/tcsii.2022.3145373 | 
																													
																						| 22 | RODRIGUEZ D, BEHNKE S. DeepWalk: omnidirectional bipedal gait by deep reinforcement learning[C]// Proceedings of the 2021 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2021: 3033-3039.  10.1109/icra48506.2021.9561717 | 
																													
																						| 23 | HAARNOJA T, PONG V, ZHOU A, et al. Composable deep reinforcement learning for robotic manipulation[C]// Proceedings of the 2018 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2018: 6244-6251.  10.1109/icra.2018.8460756 | 
																													
| 24 | MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[EB/OL]. (2013-12-19) [2023-02-11]. | 
																													
| 25 | HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]// Proceedings of the 2018 International Conference on Machine Learning. New York: JMLR.org, 2018: 1861-1870. | 
																													
| 26 | SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay[EB/OL]. (2016-02-25) [2023-02-11]. | 
																													
																						| 27 | FUJITA Y, NAGARAJAN P, KATAOKA T, et al. ChainerRL: a deep reinforcement learning library[J]. The Journal of Machine Learning Research, 2021, 22(1): 3557-3570. | 
																													
| 28 | CASTILLO G A, WENG B, HEREID A, et al. Reinforcement learning meets hybrid zero dynamics: a case study for RABBIT[C]// Proceedings of the 2019 International Conference on Robotics and Automation. Piscataway: IEEE, 2019: 284-290.  10.1109/icra.2019.8793627 |