CS539: Probabilistic Agents

Course Description

This course will study how to construct an intelligent agent based on probabilistic reasoning. We will begin by studying reinforcement learning, including both Markov decision problems (MDPs) and partially observable Markov decision problems (POMDPs). Then we will consider several reinforcement learning algorithms, including prioritized sweeping, TD(lambda), Q-learning, and direct policy search. The need to understand probabilistic models of POMDPs will lead us to study belief networks (Bayesian networks), algorithms for reasoning in belief networks, and special kinds of networks, particularly Hidden Markov Models (HMMs). We will study the forward-backward algorithm for reasoning in HMMs as well as Monte Carlo methods. To learn HMMs, we will first study the EM algorithm for simple "naive Bayes" networks and then apply EM to HMMs. After taking some time to study applications of HMMs in biology, speech recognition, and robotics, we will combine HMMs with reinforcement learning to construct complete probabilistic agents. The coursework will consist primarily of reading and programming assignments, along with midterm and final exams.

Prerequisites: CS530 or consent of the instructor; basic knowledge of probability

Registration Information: 4 Units. MWF 9:00-9:50 Rogers 332 CRN 25321.

Course Handouts

Syllabus. (Updated February 29)
Suggested object-oriented design for RL programs.

Viewgraphs for Lectures

Part 1. Agents and Markov Decision Problems (Chapter 3, Sutton and Barto). Barto's slides. Additional slides.
Part 2. Dynamic Programming and Prioritized Sweeping. Barto's slides. Additional slides.
Part 3. Temporal Difference Methods. Barto's slides. Additional slides.
Part 4. Eligibility Traces. Barto's slides.
Part 5. Function Approximation. Barto's slides.
Part 6. Model-based Reinforcement Learning; Introduction to Belief Networks. Postscript Slides.
Part 7. Inference in Belief Networks. (Updated 2/21/2000.) Postscript Slides.
Part 8. Learning in Belief Networks. Postscript Slides.
Part 9. Hidden Markov Models. Postscript Slides.

Homework Assignments

Programming Assignments

Exam Solutions

Tom Dietterich, tgd@cs.orst.edu