Publication - Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning
| Authors: | Garant, Daniel; da Silva, Bruno C.; Lesser, Victor; Zhang, Chongjie |
| Title: | Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning |
| Abstract: | We introduce an approach to adaptively identify opportunities to periodically transfer experiences between agents in large-scale, stochastic, homogeneous, multi-agent systems. This algorithm operates in an on-line, distributed manner, using supervisor-directed transfer, leading to more rapid acquisition of appropriate policies in systems with a large number of cooperating reinforcement learning agents. Our method constructs high-level characterizations of the system, called contexts, and uses them to identify which agents operate under approximately similar dynamics. A set of supervisory agents compute and reason over contextual similarity between agents, identifying candidates for experience sharing, or co-learning. Using a tiered architecture, state, action, and reward tuples are propagated amongst the members of co-learning groups. We demonstrate the effectiveness of this approach on a large-scale distributed task allocation problem with hundreds of co-learning agents operating in an unknown environment with non-stationary neighbors. |
| Publication: | UMass Computer Science Technical Report UM-CS-2015-004 |
| Date: | 2015 |
| Sources: | PDF: https://web.cs.umass.edu/publication/docs/2015/UM-CS-2015-004.pdf |
| Notes: | Extended version of paper submitted to AAMAS 2015. |
| Reference: | Garant, Daniel; da Silva, Bruno C.; Lesser, Victor; Zhang, Chongjie. Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning. UMass Computer Science Technical Report UM-CS-2015-004. 2015. Extended version of paper submitted to AAMAS 2015. |
| bibtex: | @techreport{Garant-532,
  author = "Daniel Garant and Bruno C. da Silva and Victor Lesser and Chongjie Zhang",
  title = "{Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning}",
  institution = "UMass Computer Science",
  number = "UM-CS-2015-004",
  year = "2015",
  url = "http://mas.cs.umass.edu/paper/532",
}
|