Kagan Tumer's Publications



Multiagent Reward Analysis for Learning in Noisy Domains. A. Agogino and K. Tumer. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, Utrecht, Netherlands, July 2005.

Abstract

In many multiagent learning problems, it is difficult to determine, a priori, the agent reward structure that will lead to good performance. This problem is particularly pronounced in continuous, noisy domains ill-suited to the simple table backup schemes commonly used in TD(λ)/Q-learning. In this paper, we present a new reward evaluation method that allows the tradeoff between coordination among the agents and the difficulty of the learning problem each agent faces to be visualized. This method is independent of the learning algorithm and is only a function of the problem domain and the agents' reward structure. We then use this reward efficiency visualization method to determine an effective reward without performing extensive simulations. We test this method in both a static and a dynamic multi-rover learning domain where the agents have continuous state spaces and where their actions are noisy (e.g., the agents' movement decisions are not always carried out properly). Our results show that in the more difficult dynamic domain, the reward efficiency visualization method provides a two-order-of-magnitude speedup in selecting a good reward. Most importantly, it allows one to quickly create and verify rewards tailored to the observational limitations of the domain.
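
The abstract describes scoring candidate agent rewards directly from the problem domain and reward structure, without running a learner. As a rough illustration only, the Python sketch below samples states of a toy rover/POI domain and scores a candidate per-agent reward (here a difference-reward-style signal) on two counts: how often a change that improves the agent's reward also improves the system reward, and how strongly the agent's reward responds to its own action under actuator noise. The toy domain, the function names, and the two scores are assumptions made for illustration; they are not the paper's actual metrics or experimental setup.

    import random

    # Toy rover domain (illustrative only): rovers observe points of interest (POIs);
    # the system reward is the value of POIs covered by at least one nearby rover.

    def global_reward(positions, pois, radius=1.0):
        """Sum of POI values observed by at least one rover within `radius`."""
        total = 0.0
        for (px, py, value) in pois:
            if any(abs(px - x) + abs(py - y) <= radius for (x, y) in positions):
                total += value
        return total

    def difference_reward(i, positions, pois, radius=1.0):
        """Agent i's reward: system reward minus system reward with agent i removed."""
        without_i = positions[:i] + positions[i + 1:]
        return (global_reward(positions, pois, radius)
                - global_reward(without_i, pois, radius))

    def evaluate_reward(reward_fn, n_agents, pois, n_samples=1000, noise=0.2):
        """Estimate (a) how often improving an agent's reward also improves the
        system reward and (b) how strongly the agent's reward responds to its own
        move relative to noise in everyone else's positions."""
        aligned, sensitivity = 0, 0.0
        for _ in range(n_samples):
            positions = [(random.uniform(0, 5), random.uniform(0, 5))
                         for _ in range(n_agents)]
            i = random.randrange(n_agents)
            # Move agent i deliberately; jitter the others to model actuator noise.
            perturbed = [(x + random.gauss(0, noise), y + random.gauss(0, noise))
                         for (x, y) in positions]
            perturbed[i] = (positions[i][0] + 0.5, positions[i][1])
            d_agent = reward_fn(i, perturbed, pois) - reward_fn(i, positions, pois)
            d_global = global_reward(perturbed, pois) - global_reward(positions, pois)
            if d_agent * d_global >= 0:
                aligned += 1
            sensitivity += abs(d_agent)
        return aligned / n_samples, sensitivity / n_samples

    if __name__ == "__main__":
        pois = [(random.uniform(0, 5), random.uniform(0, 5), random.uniform(1, 3))
                for _ in range(10)]
        print(evaluate_reward(difference_reward, n_agents=4, pois=pois))

Because such an evaluation only queries the reward functions on sampled states, it can compare candidate rewards far more cheaply than training agents to convergence under each one, which is the kind of speedup the abstract reports for the dynamic domain.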

Download

[PDF] (258.2kB)

BibTeX Entry

@inproceedings{tumer-agogino_aamas05,
	author = {A. Agogino and K. Tumer},
	title = {Multiagent Reward Analysis for Learning in Noisy Domains},
	booktitle = {Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems},
	month = {July},
	address = {Utrecht, Netherlands},
	abstract = {In many multiagent learning problems, it is difficult to determine, a priori, the agent reward structure that will lead to good performance. This problem is particularly pronounced in continuous, noisy domains ill-suited to the simple table backup schemes commonly used in TD(\lambda)/Q-learning. In this paper, we present a new reward evaluation method that allows the tradeoff between coordination among the agents and the difficulty of the learning problem each agent faces to be visualized. This method is independent of the learning algorithm and is only a function of the problem domain and the agents' reward structure. We then use this reward efficiency visualization method to determine an effective reward without performing extensive simulations. We test this method in both a static and a dynamic multi-rover learning domain where the agents have continuous state spaces and where their actions are noisy (e.g., the agents' movement decisions are not always carried out properly). Our results show that in the more difficult dynamic domain, the reward efficiency visualization method provides a two-order-of-magnitude speedup in selecting a good reward. Most importantly, it allows one to quickly create and verify rewards tailored to the observational limitations of the domain.},
	bib2html_pubtype = {Refereed Conference Papers},
	bib2html_rescat = {Multiagent Systems, Reinforcement Learning},
	year = {2005}
}

Generated by bib2html.pl (written by Patrick Riley) on Wed Apr 01, 2020 17:39:43