Kagan Tumer's Publications



Collaborative Evolutionary Reinforcement Learning. S. Khadka, S. Majumdar, T. Nassar, Z. Dwiel, E. Tumer, Y. Liu, and K. Tumer. In International Conference on Machine Learning (ICML), Long Beach, CA, June 2019.

Abstract

Deep reinforcement learning algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically struggle with achieving effective exploration and are extremely sensitive to the choice of hyperparameters. One reason is that most approaches use a noisy version of their operating policy to explore, thereby limiting the range of exploration. In this paper, we introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space. A collection of learners - typically proven algorithms like TD3 - optimize over varying time horizons, leading to this diverse portfolio. All learners contribute to and use a shared replay buffer to achieve greater sample efficiency. Computational resources are dynamically distributed to favor the best learners as a form of online algorithm selection. Neuroevolution binds this entire process to generate a single emergent learner that exceeds the capabilities of any individual learner. Experiments in a range of continuous control benchmarks demonstrate that the emergent learner significantly outperforms its composite learners while remaining overall more sample-efficient, notably solving the MuJoCo Humanoid benchmark where all of its composite learners (TD3) fail entirely in isolation.
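
The abstract describes CERL at the level of its control loop: a portfolio of off-policy learners operating over different time horizons feeds a shared replay buffer, rollouts are reallocated toward better-performing learners, and a neuroevolution population is bound to the whole process. The Python sketch below only illustrates that loop under stated assumptions; it is not the authors' implementation. The Learner class, the allocate_resources function, and all internals are hypothetical stubs (the TD3 updates and evolutionary operators are omitted), and a simple return-proportional allocation stands in for the paper's resource-allocation scheme.

# Illustrative sketch of the CERL control loop described in the abstract (not the
# authors' released code). Learner internals are stubbed; names are hypothetical.
import random
from collections import deque

class Learner:
    """Stand-in for a gradient-based learner (e.g., TD3) with its own discount horizon."""
    def __init__(self, gamma):
        self.gamma = gamma            # varying time horizon across the portfolio
        self.cumulative_return = 0.0  # used for resource allocation below

    def act_and_collect(self, shared_buffer, steps):
        # Roll out this learner's policy and push transitions into the SHARED buffer.
        episode_return = 0.0
        for _ in range(steps):
            transition = (random.random(), random.random())  # placeholder (state, reward)
            shared_buffer.append(transition)
            episode_return += transition[1]
        self.cumulative_return += episode_return
        return episode_return

    def update(self, shared_buffer, batch_size=32):
        # Placeholder for an off-policy gradient update sampled from the shared buffer.
        if len(shared_buffer) >= batch_size:
            random.sample(shared_buffer, batch_size)

def allocate_resources(learners, total_rollouts):
    """Give more rollouts to learners with higher observed return (online algorithm selection)."""
    scores = [max(l.cumulative_return, 1e-6) for l in learners]
    total = sum(scores)
    return [max(1, int(total_rollouts * s / total)) for s in scores]

# A portfolio of learners with different discount factors and one shared replay buffer.
portfolio = [Learner(gamma) for gamma in (0.9, 0.99, 0.997, 0.9995)]
replay_buffer = deque(maxlen=100_000)

for generation in range(10):
    rollouts = allocate_resources(portfolio, total_rollouts=20)
    for learner, n in zip(portfolio, rollouts):
        for _ in range(n):
            learner.act_and_collect(replay_buffer, steps=50)
            learner.update(replay_buffer)
    # Neuroevolution (stubbed): an evolutionary population would periodically copy in the
    # learners' policies and retain the best individuals, yielding the single emergent
    # learner the abstract refers to.

In the paper's framing, the shared replay buffer is what lets every learner benefit from every rollout regardless of which policy generated it, which is where the sample-efficiency claim comes from; the sketch above only mirrors that structure, not the actual update rules.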

Download

[PDF] 867.0kB

BibTeX Entry

@InProceedings{tumer-khadka_icml19,
  author = {S. Khadka and S. Majumdar and T. Nassar and Z. Dwiel and E. Tumer and Y. Liu and K. Tumer},
  title = {Collaborative Evolutionary Reinforcement Learning},
  booktitle = {International Conference on Machine Learning (ICML)},
  address = {Long Beach, CA},
  month = {June},
  abstract = {Deep reinforcement learning algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically struggle with achieving effective exploration and are extremely sensitive to the choice of hyperparameters. One reason is that most approaches use a noisy version of their operating policy to explore, thereby limiting the range of exploration. In this paper, we introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space. A collection of learners - typically proven algorithms like TD3 - optimize over varying time horizons, leading to this diverse portfolio. All learners contribute to and use a shared replay buffer to achieve greater sample efficiency. Computational resources are dynamically distributed to favor the best learners as a form of online algorithm selection. Neuroevolution binds this entire process to generate a single emergent learner that exceeds the capabilities of any individual learner. Experiments in a range of continuous control benchmarks demonstrate that the emergent learner significantly outperforms its composite learners while remaining overall more sample-efficient, notably solving the MuJoCo Humanoid benchmark where all of its composite learners (TD3) fail entirely in isolation.},
  bib2html_pubtype = {Refereed Conference Papers},
  bib2html_rescat = {Multiagent Systems},
  year = {2019}
}
