[1]
ZHANG Qiuhua, XUE Huifeng, WU Jiejun, et al. Multi-agent systems (MAS) and their applications[J]. Computer Simulation, 2007,24(6):133-137. (in Chinese)
[2]
BUSONIU L, BABUSKA R, DE SCHUTTER B. A comprehensive survey of multiagent reinforcement learning[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2008,38(2):156-172.
[3]
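SUN Yong, WU Bo, FENG Yanpeng. A POMDP algorithm based on policy iteration and value iteration[J]. Journal of Computer Research and Development, 2008,45(10):1763-1768. (in Chinese)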
[4]
NAIR R, ROTH M, YOKOO M, et al. Communication for improving policy computation in distributed POMDPs[C]// AAMAS04: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems. Washington, DC: IEEE Computer Society, 2004,3:1098-1105.
[5]
BERNSTEIN D S, GIVAN R, IMMERMAN N, et al. The complexity of decentralized control of Markov decision processes[J]. Mathematics of Operations Research, 2002,27(4):819-840.
[6]
PESHKIN L, KIM K E, MEULEAU N, et al. Learning to cooperate via policy search[C]// Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence. San Francisco: Morgan Kaufmann, 2000: 489-496.
[7]
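WU Feng. Research on planning problems of multi-agent systems based on decision theory[D]. Hefei: University of Science and Technology of China, 2011. (in Chinese)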
[8]
PYNADATH D V, TAMBE M. The communicative multiagent team decision problem: Analyzing teamwork theories and models[J]. Journal of Artificial Intelligence Research, 2002,16:389-423.
[9]
GOLDMAN C V, ZILBERSTEIN S. Decentralized control of cooperative systems: Categorization and complexity analysis[J]. Journal of Artificial Intelligence Research, 2004,22(1):143-174.
[10]
GAO Yang, CHEN Shifu, LU Xin. A survey of research on reinforcement learning[J]. Acta Automatica Sinica, 2004,30(1):86-100. (in Chinese)
[11]
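FAN Changjie. Research on planning problems based on Markov decision theory[D]. Hefei: University of Science and Technology of China, 2008. (in Chinese)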
[12]
ROTH M, SIMMONS R, VELOSO M. Reasoning about joint beliefs for execution-time communication decisions[C]// Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems. Dordrecht, Netherlands: Springer, 2005:786-793.
[13]
ROTH M, SIMMONS R, VELOSO M. Decentralized communication strategies for coordinated multi-agent policies[C]// Multi-Robot Systems: From Swarms to Intelligent Automata. Dordrecht, Netherlands: Springer, 2005,3:93-106.
[14]
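LIU Haitao, HONG Bingrong, QIAO Limin, et al. Research on decentralized communication decisions in multi-agent robot systems[J]. Robot, 2007,29(6):540-545. (in Chinese)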
[15]
WU Bo, WU Min. A real-time POMDP algorithm based on belief state compression[J]. Control and Decision, 2007,22(12):1417-1420. (in Chinese)