Interpolation-Based Q-Learning

Csaba Szepesvári and William D. Smart.
In "Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004)", Russell Greiner and Dale Schuurmans (eds.), pages 791-798, July 2004.

We consider a variant of Q-learning in continuous state spaces under the total expected discounted cost criterion combined with local function approximation methods. Provided that the function approximator satisfies certain interpolation properties, the resulting algorithm is shown to converge with probability one. The limit function is shown to satisfy a fixed point equation of the Bellman type, where the fixed point operator depends on the stationary distribution of the exploration policy and approximation properties of the function approximation method. The basic algorithm is extended in several ways. In particular, a variant of the algorithm is obtained that is shown to converge in probability to the optimal Q function. Preliminary computer simulations confirm the validity of the approach.
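
To make the idea in the abstract concrete, the following is a minimal sketch of Q-learning with a local interpolating function approximator under a discounted cost criterion. It is not the authors' algorithm as published: the one-dimensional state space, the piecewise-linear interpolation grid, the toy dynamics and cost, the uniform exploration policy, the constant step size, and all names (`GRID`, `weights`, `q_value`, `update`, `step`, `train`) are assumptions chosen for illustration. In particular, weighting each node's update by its interpolation coefficient is one plausible form of such an update, not a detail taken from the abstract.

```python
import random

# All constants below are illustrative assumptions, not taken from the paper.
GRID = [i / 10 for i in range(11)]   # interpolation nodes on [0, 1]
ACTIONS = (-0.1, 0.1)                # action 0 moves left, action 1 moves right
GAMMA = 0.9                          # discount factor


def weights(x):
    """Piecewise-linear (hat-function) interpolation weights of x on GRID."""
    x = min(max(x, GRID[0]), GRID[-1])
    w = [0.0] * len(GRID)
    for i in range(len(GRID) - 1):
        lo, hi = GRID[i], GRID[i + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            w[i], w[i + 1] = 1.0 - t, t
            break
    return w


def q_value(theta, x, a):
    """Interpolated Q(x, a): weighted average of the node values for action a."""
    return sum(wi * theta[a][i] for i, wi in enumerate(weights(x)))


def update(theta, x, a, cost, x_next, alpha):
    """Move each node near x toward the one-step target, scaled by its weight.

    Costs are minimized, so the target uses min over actions, not max.
    """
    target = cost + GAMMA * min(
        q_value(theta, x_next, b) for b in range(len(ACTIONS))
    )
    for i, wi in enumerate(weights(x)):
        if wi > 0.0:
            theta[a][i] += alpha * wi * (target - theta[a][i])


def step(x, a, rng):
    """Toy dynamics (assumed): noisy move, clamped to [0, 1]; cost 0 near x = 1."""
    x_next = min(max(x + ACTIONS[a] + rng.gauss(0.0, 0.02), 0.0), 1.0)
    cost = 0.0 if x_next >= 0.9 else 1.0
    return cost, x_next


def train(steps=20000, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * len(GRID) for _ in ACTIONS]
    x = rng.random()
    for _ in range(steps):
        a = rng.randrange(len(ACTIONS))   # uniform random exploration policy
        cost, x_next = step(x, a, rng)
        update(theta, x, a, cost, x_next, alpha=0.1)
        x = x_next
    return theta
```

Calling `train()` returns one list of node values per action, and `q_value(theta, x, a)` then evaluates the learned function at any state in [0, 1]. A constant step size is used here only for brevity; a convergence result with probability one would normally require decaying step sizes.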

  author = {Szepesv\'{a}ri, Csaba and Smart, William D.},
  editor = {Greiner, Russell and Schuurmans, Dale},
  title = {Interpolation-Based {Q}-Learning},
  booktitle = {Proceedings of the Twenty-First International Conference on Machine Learning ({ICML} 2004)},
  pages = {791--798},
  month = {July},
  year = {2004}