Publication - Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning
Authors: Garant, Daniel; da Silva, Bruno C.; Lesser, Victor; Zhang, Chongjie
Title: Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning
Abstract: We introduce an approach to adaptively identify opportunities to periodically transfer experiences between agents in large-scale, stochastic, homogeneous, multi-agent systems. This algorithm operates in an on-line, distributed manner, using supervisor-directed transfer, leading to more rapid acquisition of appropriate policies in systems with a large number of cooperating reinforcement learning agents. Our method constructs high-level characterizations of the system, called contexts, and uses them to identify which agents operate under approximately similar dynamics. A set of supervisory agents compute and reason over contextual similarity between agents, identifying candidates for experience sharing, or co-learning. Using a tiered architecture, state, action, and reward tuples are propagated amongst the members of co-learning groups. We demonstrate the effectiveness of this approach on a large-scale distributed task allocation problem with hundreds of co-learning agents operating in an unknown environment with non-stationary neighbors.
Publication: UMass Computer Science Technical Report UM-CS-2015-004
Date: 2015
Sources: PDF: https://web.cs.umass.edu/publication/docs/2015/UM-CS-2015-004.pdf
Notes: Extended version of paper submitted to AAMAS 2015.
Reference: Garant, Daniel; da Silva, Bruno C.; Lesser, Victor; Zhang, Chongjie. Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning. UMass Computer Science Technical Report UM-CS-2015-004. 2015. Extended version of paper submitted to AAMAS 2015.
bibtex:
@techreport{Garant-532,
  author = "Daniel Garant and Bruno C. da Silva and Victor Lesser and Chongjie Zhang",
  title = "{Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning}",
  year = "2015",
  url = "http://mas.cs.umass.edu/paper/532",
}