Publication - Coordinated Multi-Agent Learning for Decentralized POMDPs

Authors: Zhang, Chongjie; Lesser, Victor
Title: Coordinated Multi-Agent Learning for Decentralized POMDPs
Abstract: In many multi-agent applications such as distributed sensor nets, a network of agents acts collaboratively under uncertainty and local interactions. Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach significantly outperforms the nearly optimal non-communication policies computed offline.
Publication: Proceedings of 7th Workshop on Multiagent Sequential Decision Making Under Uncertainty, held in conjunction with 11th International Conference on Autonomous Agents and Multiagent Systems
Location: Valencia, Spain
Date: 2012
Sources: PDF: /Documents/msdm-12-chongjie.pdf
Reference: Zhang, Chongjie; Lesser, Victor. Coordinated Multi-Agent Learning for Decentralized POMDPs. Proceedings of 7th Workshop on Multiagent Sequential Decision Making Under Uncertainty, held in conjunction with 11th International Conference on Autonomous Agents and Multiagent Systems. 2012.
bibtex:
Zhang, Chongjie; Lesser, Victor. "Coordinated Multi-Agent Learning for Decentralized POMDPs." Proceedings of 7th Annual Workshop on Multiagent Sequential Decision-Making Under Uncertainty (MSDM-2012), held in conjunction with AAMAS, Valencia, Spain, June 2012.