[1] KARAMAN S,WALTER M R,PEREZ A,et al. Anytime motion planning using the RRT*[C]//Proceedings of the 2011 IEEE International Conference on Robotics and Automation. Piscataway:IEEE,2011:1478-1483.
[2] BELL F. Connectivism:its place in theory-informed research and innovation in technology-enabled learning[J]. International Review of Research in Open and Distance Learning,2011,12(3):98-118.
[3] KOREN Y,BORENSTEIN J. Potential field methods and their inherent limitations for mobile robot navigation[C]//Proceedings of the 1991 IEEE International Conference on Robotics and Automation. Piscataway:IEEE,1991:1398-1404.
[4] ZHANG B,CHEN W,FEI M. An optimized method for path planning based on artificial potential field[C]//Proceedings of the 6th International Conference on Intelligent Systems Design and Applications. Piscataway:IEEE,2006:35-39.
[5] SCHAAL S. Is imitation learning the route to humanoid robots?[J]. Trends in Cognitive Sciences,1999,3(6):233-242.
[6] SUTTON R S,BARTO A G. Reinforcement Learning:An Introduction[M]. Cambridge:MIT Press,1998:1-4.
[7] CHEN C,SEFF A,KORNHAUSER A,et al. DeepDriving:learning affordance for direct perception in autonomous driving[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway:IEEE,2015:2722-2730.
[8] BANSAL M,KRIZHEVSKY A,OGALE A. ChauffeurNet:learning to drive by imitating the best and synthesizing the worst[EB/OL].[2018-12-07]. https://arxiv.org/pdf/1812.03079.pdf.
[9] MNIH V,KAVUKCUOGLU K,SILVER D,et al. Playing Atari with deep reinforcement learning[EB/OL].[2018-12-12]. https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf.
[10] ZHAO Y T,HAN B L,LUO Q S. Walking stability control method based on deep Q-network for biped robot on uneven ground[J]. Journal of Computer Applications,2018,38(9):2459-2463.
[11] LILLICRAP T P,HUNT J J,PRITZEL A,et al. Continuous control with deep reinforcement learning[EB/OL].[2019-02-22]. https://arxiv.org/pdf/1509.02971v2.pdf.
[12] ZHANG Z,HE S,HE J Q. Video colorization method based on hybrid neural network model of long short term memory and convolutional neural network[J]. Journal of Computer Applications,2019,39(9):2726-2730.
[13] BOJARSKI M,DEL TESTA D,DWORAKOWSKI D,et al. End to end learning for self-driving cars[EB/OL].[2019-02-23]. http://arxiv.org/pdf/1604.07316.pdf.
[14] HOCHREITER S,SCHMIDHUBER J. Long short-term memory[J]. Neural Computation,1997,9(8):1735-1780.
[15] WATKINS C J C H,DAYAN P. Q-learning[J]. Machine Learning,1992,8(3/4):279-292.
[16] ZHANG R,LIU C,CHEN Q. End-to-end control of kart agent with deep reinforcement learning[C]//Proceedings of the 2018 IEEE International Conference on Robotics and Biomimetics. Piscataway:IEEE,2018:1688-1693.
[17] DOSOVITSKIY A,ROS G,CODEVILLA F,et al. CARLA:an open urban driving simulator[EB/OL].[2018-11-10]. https://arxiv.org/pdf/1711.03938.pdf.
[18] FANG H,YANG H R. Greedy algorithms and compressed sensing[J]. Acta Automatica Sinica,2011,37(12):1413-1421.
[19] MNIH V,BADIA A P,MIRZA M,et al. Asynchronous methods for deep reinforcement learning[EB/OL].[2018-11-10]. https://arxiv.org/pdf/1602.01783.pdf.