Kagan Tumer's Publications



Learning Sequences of Actions in Collectives of Autonomous Agents. K. Tumer, A. Agogino, and D. Wolpert. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 378–385, Bologna, Italy, July 2002.

Abstract

In this paper we focus on the problem of designing a collective of autonomous agents that individually learn sequences of actions such that the resultant sequence of joint actions achieves a predetermined global objective. Directly applying Reinforcement Learning (RL) concepts to multi-agent systems often proves problematic, as agents may work at cross-purposes, or have difficulty evaluating their contribution to the achievement of the global objective, or both. Accordingly, the crucial step in designing multi-agent systems is setting the rewards for the RL algorithm of each agent so that as the agents attempt to maximize those rewards, the system reaches a globally "desirable" solution. In this work we consider a version of this problem involving multiple autonomous agents in a grid world. We use concepts from collective intelligence to design rewards for the agents that are "aligned" with the global reward, and are "learnable" in that agents can readily see how their behavior affects their reward. We show that reinforcement learning agents using those rewards outperform both "natural" extensions of single agent algorithms and global reinforcement learning solutions based on "team games."
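One common way to make an agent's reward both "aligned" with the global objective and "learnable" (in the collective-intelligence literature) is a difference reward: the global reward minus the global reward recomputed with the agent's contribution removed. The sketch below is a hypothetical illustration of that idea on a toy grid-world coverage objective, not the paper's actual code; the function names and the objective are assumptions for the example.

```python
# Difference-reward sketch: D_i = G(z) - G(z_-i), where G is the global
# reward and z_-i is the joint state with agent i's contribution removed.
# Toy global objective (an assumption, not the paper's): number of
# distinct grid cells covered by the agents' observations.

def global_reward(observations):
    """G(z): count of distinct grid cells covered."""
    return len(set(observations))

def difference_reward(observations, i):
    """D_i: global reward minus the counterfactual without agent i.
    High when agent i contributes something no other agent provides."""
    counterfactual = observations[:i] + observations[i + 1:]
    return global_reward(observations) - global_reward(counterfactual)

# Three agents; agents 0 and 1 redundantly cover the same cell, so
# removing either changes nothing, while agent 2 covers a unique cell.
obs = [(0, 0), (0, 0), (2, 3)]
print([difference_reward(obs, i) for i in range(3)])  # [0, 0, 1]
```

Because each agent's reward changes only through that agent's own contribution, the signal is far less noisy than handing every agent the raw global reward, which is the "learnability" advantage the abstract describes.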

Download

[PDF] (241.8 kB)

BibTeX Entry

@inproceedings{tumer-agogino_aamas02,
	author = {K. Tumer and A. Agogino and D. Wolpert},
	title = {Learning Sequences of Actions in Collectives of 
		Autonomous Agents},
	booktitle = {Proceedings of the First International Joint Conference on 
	Autonomous Agents and Multiagent Systems},
	pages = {378--385},
	month = {July},
	address = {Bologna, Italy},
	abstract = {In this paper we focus on the problem of designing a collective of autonomous agents that individually learn sequences of actions such that the resultant sequence of joint actions achieves a predetermined global objective. Directly applying Reinforcement Learning (RL) concepts to multi-agent systems often proves problematic, as agents may work at cross-purposes, or have difficulty evaluating their contribution to the achievement of the global objective, or both. Accordingly, the crucial step in designing multi-agent systems is setting the rewards for the RL algorithm of each agent so that as the agents attempt to maximize those rewards, the system reaches a globally "desirable" solution. In this work we consider a version of this problem involving multiple autonomous agents in a grid world. We use concepts from collective intelligence to design rewards for the agents that are "aligned" with the global reward, and are "learnable" in that agents can readily see how their behavior affects their reward. We show that reinforcement learning agents using those rewards outperform both "natural" extensions of single agent algorithms and global reinforcement learning solutions based on "team games."},
	bib2html_pubtype = {Refereed Conference Papers},
	bib2html_rescat = {Multiagent Systems, Reinforcement Learning},
	year = {2002}
}

Generated by bib2html.pl (written by Patrick Riley) on Wed Apr 01, 2020 17:39:43