Publication - Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning

Authors:	Garant, Daniel; da Silva, Bruno C.; Lesser, Victor; Zhang, Chongjie
Title:	Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning
Abstract:	We introduce an approach to adaptively identify opportunities to periodically transfer experiences between agents in large-scale, stochastic, homogeneous, multi-agent systems. This algorithm operates in an on-line, distributed manner, using supervisor-directed transfer, leading to more rapid acquisition of appropriate policies in systems with a large number of cooperating reinforcement learning agents. Our method constructs high-level characterizations of the system, called contexts, and uses them to identify which agents operate under approximately similar dynamics. A set of supervisory agents compute and reason over contextual similarity between agents, identifying candidates for experience sharing, or co-learning. Using a tiered architecture, state, action, and reward tuples are propagated amongst the members of co-learning groups. We demonstrate the effectiveness of this approach on a large-scale distributed task allocation problem with hundreds of co-learning agents operating in an unknown environment with non-stationary neighbors.
Publication:	UMass Computer Science Technical Report UM-CS-2015-004
Date:	2015
Sources:	PDF: https://web.cs.umass.edu/publication/docs/2015/UM-CS-2015-004.pdf
Notes:	Extended version of paper submitted to AAMAS 2015.
Reference:	Garant, Daniel; da Silva, Bruno C.; Lesser, Victor; Zhang, Chongjie. Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning. UMass Computer Science Technical Report UM-CS-2015-004. 2015. Extended version of paper submitted to AAMAS 2015.
BibTeX:
@techreport{Garant-532,
  author = "Daniel Garant and Bruno C. da Silva and Victor Lesser and Chongjie Zhang",
  title  = "{Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning}",
  year   = "2015",
  url    = "http://mas.cs.umass.edu/paper/532",
}
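
Illustrative sketch (not from the report): the abstract describes supervisors that compare agent contexts, form co-learning groups of agents with approximately similar dynamics, and propagate state, action, and reward tuples within each group. The Python below is a minimal sketch of that idea under loudly stated assumptions. The abstract does not define how contexts are represented or compared, so the context vectors, the Euclidean distance, the threshold, and the function names `context_distance`, `co_learning_groups`, and `share_experiences` are all hypothetical stand-ins, and a single supervisor is modeled rather than the tiered architecture.

```python
import math
from itertools import combinations


def context_distance(a, b):
    """Euclidean distance between two context vectors.

    ASSUMPTION: the report's actual similarity measure is not given in the
    abstract; Euclidean distance is used here only for illustration.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def co_learning_groups(contexts, threshold):
    """Group agents whose contexts lie within `threshold` of each other.

    `contexts` maps a hypothetical agent id to its context vector. Groups are
    the connected components of the "similar enough" graph, found with a
    small union-find structure.
    """
    parent = {agent: agent for agent in contexts}

    def find(agent):
        # Follow parent links to the root, shortening the path as we go.
        while parent[agent] != agent:
            parent[agent] = parent[parent[agent]]
            agent = parent[agent]
        return agent

    for a, b in combinations(contexts, 2):
        if context_distance(contexts[a], contexts[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for agent in contexts:
        groups.setdefault(find(agent), []).append(agent)
    return [sorted(members) for members in groups.values()]


def share_experiences(groups, buffers):
    """Give every agent the pooled (state, action, reward) tuples of its group.

    `buffers` maps an agent id to the list of tuples it collected itself.
    """
    shared = {}
    for members in groups:
        pooled = [t for m in members for t in buffers[m]]
        for m in members:
            shared[m] = list(pooled)
    return shared


if __name__ == "__main__":
    # Toy data: a1 and a2 have similar contexts, a3 does not.
    contexts = {"a1": (0.9, 0.1), "a2": (0.85, 0.15), "a3": (0.1, 0.9)}
    buffers = {
        "a1": [("s0", "left", 1.0)],
        "a2": [("s1", "right", 0.0)],
        "a3": [("s2", "left", 0.5)],
    }
    groups = co_learning_groups(contexts, threshold=0.2)
    print(groups)
    print(share_experiences(groups, buffers))
```

In this toy run, a1 and a2 end up in one group and receive each other's tuples, while a3 keeps only its own. An on-line system would recompute contexts and groups periodically as dynamics change; that scheduling is left out of the sketch.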