Kagan Tumer: Policy Progress Score for Automatic Task Selection in Curriculum Learning

Kagan Tumer's Publications


Policy Progress Score for Automatic Task Selection in Curriculum Learning. G. Rockefeller, S. Chow, Y. Tuladhar, and K. Tumer. In AAMAS-2018 Workshop on Adaptive and Learning Agents, Stockholm, Sweden, July 2018.

Abstract

This paper introduces policy progress scores as a principled means of selecting intermediate tasks that enable effective transfer learning. For some complex tasks, rewards can be sparse, delayed, noisy and/or uninformative, making it difficult to find good policies. For such tasks, a sequence of related tasks, i.e., a curriculum, can be designed to enable and speed up learning by transferring knowledge between successive tasks in the curriculum. However, how to develop an effective curriculum, particularly without in-depth domain knowledge, is a challenging problem. In this paper, we present an automatic curriculum generation process that scores tasks on the anticipated change in policy parameters. This "policy progress" score is defined as the weighted sum of the absolute expected changes in a policy's parameters. We apply the automated curriculum generation process using the progress score to both single-agent and multiagent Grid World domains. Our results on a complex multiagent Grid World domain show that policy progress scores create curricula similar to those of the current state of the art with reduced computation.
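The definition in the abstract (a weighted sum of absolute expected changes in policy parameters) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the expected parameter changes are estimated, how the weights are chosen, or which score the curriculum prefers, so the sketch assumes the expected changes are already given as numbers, defaults to uniform weights, and picks the highest-scoring task. The names `policy_progress_score` and `select_task` are hypothetical.

```python
from typing import Dict, Optional, Sequence


def policy_progress_score(
    expected_delta: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of absolute expected changes in policy parameters.

    expected_delta[i] is an (assumed, externally estimated) expected change
    of parameter i after training on a candidate task.
    """
    if weights is None:
        # Assumption: uniform weights when none are supplied.
        weights = [1.0] * len(expected_delta)
    if len(weights) != len(expected_delta):
        raise ValueError("weights and expected_delta must have equal length")
    return sum(w * abs(d) for w, d in zip(weights, expected_delta))


def select_task(
    candidates: Dict[str, Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Return the candidate task with the largest score.

    Assumption: a larger score is preferred; the abstract does not state
    the selection rule.
    """
    return max(
        candidates,
        key=lambda name: policy_progress_score(candidates[name], weights),
    )


if __name__ == "__main__":
    # Hypothetical candidate tasks with made-up expected parameter changes.
    tasks = {"small_grid": [0.1, 0.1], "large_grid": [-1.0, 0.5]}
    print(select_task(tasks))
```

Taking the absolute value means that increases and decreases in a parameter both count as progress, and the weights allow some parameters to matter more than others.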

Download

(unavailable)

BibTeX Entry

@INPROCEEDINGS {tumer-rockefeller_ala2018, 
author = {G. Rockefeller and S. Chow and Y. Tuladhar and K. Tumer}, 
title = {Policy Progress Score for Automatic Task Selection in Curriculum Learning}, 
booktitle = {AAMAS-2018 Workshop on Adaptive and Learning Agents}, 
address = {Stockholm, Sweden}, 
month = {July},
abstract={This paper introduces policy progress scores as a principled means of selecting intermediate tasks that enable effective transfer learning. For some complex tasks, rewards can be sparse, delayed, noisy and/or uninformative, making it difficult to find good policies. For such tasks, a sequence of related tasks, i.e., a curriculum, can be designed to enable and speed up learning by transferring knowledge between successive tasks in the curriculum. However, how to develop an effective curriculum, particularly without in-depth domain knowledge, is a challenging problem.
In this paper, we present an automatic curriculum generation process that scores tasks on the anticipated change in policy parameters. This ``policy progress'' score is defined as the weighted sum of the absolute expected changes in a policy's parameters. We apply the automated curriculum generation process using the progress score to both single-agent and multiagent Grid World domains. Our results on a complex multiagent Grid World domain show that policy progress scores create curricula similar to those of the current state of the art with reduced computation.},
bib2html_pubtype = {Workshop/Symposium Papers},
bib2html_rescat = {Multiagent Systems},
year = 2018,
}
