Publication - Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs

Authors: Zhang, Chongjie; and Lesser, Victor
Title: Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs
Abstract: In many multi-agent applications such as distributed sensor nets, a network of agents act collaboratively under uncertainty and local interactions. Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach significantly outperforms the nearly-optimal non-communication policies computed offline.
Keywords: Communication, Coordination, Distributed AI, Distributed MDP, Learning, Multi-Agent Systems, Uncertainty
Publication: Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI-11), pp. 764-770
Location: San Francisco, California, USA
Date: 2011
Sources: PDF: /Documents/aaai11-zhang.pdf
Reference: Zhang, Chongjie; and Lesser, Victor. Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs. Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI-11), pp. 764-770. 2011.
BibTeX:
@inproceedings{Zhang-505,
  author    = "Chongjie Zhang and Victor Lesser",
  title     = "{Coordinated Multi-Agent Reinforcement Learning in
               Networked Distributed POMDPs}",
  booktitle = "Proceedings of the Twenty-Fifth AAAI Conference on
               Artificial Intelligence (AAAI-11)",
  pages     = "764--770",
  year      = "2011",
  address   = "San Francisco, California, USA",
  url       = "http://mas.cs.umass.edu/paper/505",
}