Kagan Tumer's Publications



Traffic Congestion Management as a Learning Agent Coordination Problem. K. Tumer, A. K. Agogino, and Z. Welch. In A. Bazzan and F. Kluegl, editors, Multiagent Architectures for Traffic and Transportation Engineering, pp. 261–279, Lecture Notes in AI, Springer, 2009.

Abstract

Traffic management problems provide a unique environment to study how multiagent systems promote desired system-level behavior. In particular, they represent a special class of problems where the individual actions of the agents are neither intrinsically ``good'' nor ``bad'' for the system. Instead, it is the combinations of actions among agents that lead to desirable or undesirable outcomes. As a consequence, agents need to learn how to coordinate their actions with those of other agents, rather than learn a particular set of ``good'' actions. In this chapter, we focus on problems where there is no communication among the drivers, which puts the burden of coordination on the principled selection of the agent reward functions. We explore the impact of agent reward functions on two types of traffic problems. In the first problem, we study how agents learn the best departure times in a daily commuting environment and how following those departure times alleviates congestion. In the second problem, we study how agents learn to select desirable lanes to improve traffic flow and minimize delays for all drivers. In both cases, we focus on having an agent select the most suitable action for each driver using reinforcement learning, and explore the impact of different reward functions on system behavior. Our results show that agent rewards that are both aligned with, and sensitive to, the system reward lead to significantly better results than purely local or global agent rewards. We conclude this chapter by discussing how changing the way in which the system performance is measured affects the relative performance of these reward functions, and how agent rewards derived for one setting (timely arrivals) can be modified to meet a new system setting (maximize throughput).
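A common way to obtain rewards that are both aligned with and sensitive to the system reward is the difference reward, D_i = G(z) - G(z_{-i}): the system reward with agent i present minus the system reward with agent i removed. The sketch below is purely illustrative and is not code from the chapter; the toy congestion model, capacity value, and function names are all assumptions for exposition.

```python
# Illustrative sketch (hypothetical, not from the chapter): a difference
# reward D_i = G(z) - G(z_{-i}) on a toy lane-congestion model.

def system_reward(lane_counts, capacity=5):
    # Global reward G: penalize each car beyond a lane's capacity.
    return -sum(max(0, n - capacity) for n in lane_counts)

def difference_reward(lane_counts, agent_lane, capacity=5):
    # D_i = G(z) - G(z_-i): remove agent i's car from its lane and
    # measure how much the system reward changes. This reward is
    # aligned with G (improving D_i never hurts G) and sensitive to
    # agent i's own choice (other agents' actions largely cancel out).
    without_i = list(lane_counts)
    without_i[agent_lane] -= 1
    return system_reward(lane_counts, capacity) - system_reward(without_i, capacity)

lanes = [7, 3]  # 7 cars in lane 0 (over capacity), 3 cars in lane 1
print(system_reward(lanes))           # -2
print(difference_reward(lanes, 0))    # -1: agent in the congested lane
print(difference_reward(lanes, 1))    #  0: agent in the uncongested lane
```

An agent in the congested lane receives a strictly worse difference reward than one in the free lane, so a learner following D_i is pushed toward the lane choice that also improves the global reward G.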

Download

[PDF] (280.0kB)

BibTeX Entry

@incollection{tumer-welch_maatte09,
	title = {Traffic Congestion Management as a Learning Agent Coordination Problem}, 
	author = {K. Tumer and A. K. Agogino and Z. Welch},
	booktitle = {Multiagent Architectures for Traffic and Transportation Engineering},
	editor = {A. Bazzan and F. Kluegl},
	publisher = {Springer},
	series = {Lecture Notes in AI},
	abstract = {Traffic management problems provide a unique environment to study how multiagent systems promote desired system-level behavior. In particular, they represent a special class of problems where the individual actions of the agents are neither intrinsically ``good'' nor ``bad'' for the system. Instead, it is the combinations of actions among agents that lead to desirable or undesirable outcomes. As a consequence, agents need to learn how to coordinate their actions with those of other agents, rather than learn a particular set of ``good'' actions. In this chapter, we focus on problems where there is no communication among the drivers, which puts the burden of coordination on the principled selection of the agent reward functions.
We explore the impact of agent reward functions on two types of traffic problems. In the first problem, we study how agents learn the best departure times in a daily commuting environment and how following those departure times alleviates congestion. In the second problem, we study how agents learn to select desirable lanes to improve traffic flow and minimize delays for all drivers. In both cases, we focus on having an agent select the most suitable action for each driver using reinforcement learning, and explore the impact of different reward functions on system behavior. Our results show that agent rewards that are both aligned with, and sensitive to, the system reward lead to significantly better results than purely local or global agent rewards.
We conclude this chapter by discussing how changing the way in which the system performance is measured affects the relative performance of these reward functions, and how agent rewards derived for one setting (timely arrivals) can be modified to meet a new system setting (maximize throughput).},
	bib2html_pubtype = {Book Chapters},
	bib2html_rescat = {Multiagent Systems, Reinforcement Learning, Traffic and Transportation},
	pages = {261--279},
	year = {2009}
}

Generated by bib2html.pl (written by Patrick Riley) on Wed Apr 01, 2020 17:39:43