Publication - Learning Task Allocation via Multi-Level Policy Gradient Algorithm with Dynamic Learning Rate

Authors:	Abdallah, Sherief; Lesser, Victor
Title:	Learning Task Allocation via Multi-Level Policy Gradient Algorithm with Dynamic Learning Rate
Abstract:	Task allocation is the process of assigning tasks to appropriate resources. To achieve scalability, it is common to use a network of agents (also called mediators) that handles task allocation. This work proposes a novel multi-level policy gradient algorithm to solve the local decision problem at each mediator agent. The higher-level policy stochastically chooses a task decomposition. The lower-level policy then assigns the resulting subtasks to neighboring agents, also stochastically. Agents learn autonomously, cooperatively, and concurrently to increase system performance. No state information is used except for the task being allocated. Furthermore, the algorithm dynamically adjusts the learning rate using the ratio of action values, which speeds up convergence. Experimental results show that the proposed solution outperforms deterministic approaches by balancing the load over resources and converging faster to better policies.
Keywords:	Learning, Task Distribution
Publication:	Proceedings of Workshop on Planning and Learning in A Priori Unknown or Dynamic Domains, the International Joint Conference on Artificial Intelligence, IJCAI, pp. 76-82
Editor:	Bulitko, V.; Koenig, S.
Location:	Edinburgh, UK
Date:	2005
Sources:	Citeseer: PDF: /Documents/Abdallah_IJCAI05_WS.pdf
Reference:	Abdallah, Sherief; Lesser, Victor. Learning Task Allocation via Multi-Level Policy Gradient Algorithm with Dynamic Learning Rate. Proceedings of Workshop on Planning and Learning in A Priori Unknown or Dynamic Domains, the International Joint Conference on Artificial Intelligence, IJCAI, Bulitko, V.; Koenig, S., eds., pp. 76-82. 2005.
BibTeX:	@article{Abdallah-407, author = "Sherief Abdallah and Victor Lesser", title = "{Learning Task Allocation via Multi-Level Policy Gradient Algorithm with Dynamic Learning Rate}", journal = "Proceedings of Workshop on Planning and Learning in A Priori Unknown or Dynamic Domains, the International Joint Conference on Artificial Intelligence, IJCAI", editor = "V. Bulitko and S. Koenig", pages = "76-82", year = "2005", address = "Edinburgh, UK", url = "http://mas.cs.umass.edu/paper/407", }