Publication - Coordinated Multi-Agent Learning for Decentralized POMDPs

Authors:	Zhang, Chongjie; Lesser, Victor
Title:	Coordinated Multi-Agent Learning for Decentralized POMDPs
Abstract:	In many multi-agent applications such as distributed sensor nets, a network of agents acts collaboratively under uncertainty and local interactions. Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach significantly outperforms the nearly optimal non-communication policies computed offline.
Publication:	Proceedings of 7th Workshop on Multiagent Sequential Decision Making Under Uncertainty, held in conjunction with 11th International Conference on Autonomous Agents and Multiagent Systems
Location:	Valencia, Spain
Date:	2012
Sources:	PDF: /Documents/msdm-12-chongjie.pdf
Reference:	Zhang, Chongjie; Lesser, Victor. Coordinated Multi-Agent Learning for Decentralized POMDPs. Proceedings of 7th Workshop on Multiagent Sequential Decision Making Under Uncertainty, held in conjunction with 11th International Conference on Autonomous Agents and Multiagent Systems. 2012.
`bibtex`:	Zhang, Chongjie; Lesser, Victor. "Coordinated Multi-Agent Learning for Decentralized POMDPs." Proceedings of 7th Annual Workshop on Multiagent Sequential Decision Making Under Uncertainty (MSDM-2012), held in conjunction with AAMAS, Valencia, Spain, June 2012.