Publication - Coordinated Multi-Agent Learning for Decentralized POMDPs
| Authors: | Zhang, Chongjie; Lesser, Victor | ||||
| Title: | Coordinated Multi-Agent Learning for Decentralized POMDPs | ||||
| Abstract: | In many multi-agent applications such as distributed sensor nets, a network of agents acts collaboratively under uncertainty and local interactions. The Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach significantly outperforms the nearly optimal non-communication policies computed offline. | ||||
| Publication: | Proceedings of 7th Workshop on Multiagent Sequential Decision Making Under Uncertainty, held in conjunction with 11th International Conference on Autonomous Agents and Multiagent Systems | ||||
| Location: | Valencia, Spain | ||||
| Date: | 2012 | ||||
| Sources: | PDF: /Documents/msdm-12-chongjie.pdf | ||||
| Reference: | Zhang, Chongjie; Lesser, Victor. Coordinated Multi-Agent Learning for Decentralized POMDPs. Proceedings of 7th Workshop on Multiagent Sequential Decision Making Under Uncertainty, held in conjunction with 11th International Conference on Autonomous Agents and Multiagent Systems. 2012. | ||||
| bibtex: | Zhang, Chongjie; Lesser, Victor. "Coordinated Multi-Agent Learning for Decentralized POMDPs." Proceedings of 7th Annual Workshop on Multiagent Sequential Decision-Making Under Uncertainty (MSDM-2012), held in conjunction with AAMAS, Valencia, Spain, June 2012. | ||||