Each student is responsible for his or her own work. The standard departmental rules for academic dishonesty apply to all assignments in this course. Collaboration on homework and programming assignments should be limited to answering questions that can be asked and answered without using any written medium (e.g., no pencils, pens, or email). In particular, no student should read any code written by another student.
Instructor: tgd@cs.orst.edu
Readings marked [SB n] refer to chapter n of Sutton and Barto, Reinforcement Learning: An Introduction.

Jan  5  Agents. Markov decision problems. Partially observable Markov decision problems (POMDPs).
Jan  7  Optimal value functions and policies [SB 3]
Jan 10  Policy evaluation, policy iteration [SB 4]
Jan 12  Value iteration, generalized value iteration, prioritized sweeping [SB 4 and handout]
Jan 14  Monte Carlo methods [SB 5]
Jan 17  MLK Holiday: No Class
Jan 19  TD(0), SARSA(0), Q-learning [SB 6.4]
Jan 21  Average-reward DP: R-learning [handout]
Jan 24  TD(lambda) [SB 7]
Jan 26  SARSA(lambda), Q(lambda) [SB 7]
Jan 28  TD(lambda) with function approximation [SB 8]
Jan 31  MIDTERM EXAM
Feb  2  Model-based learning. Compact models of the environment.
Feb  4  Review of belief networks
Feb  7  Belief net inference: SPI
Feb  9  Belief net inference: junction tree algorithm
Feb 11  Constructing junction trees using SPI
Feb 14  Learning in belief nets: fully observable, known structure
Feb 16  Learning in belief nets: hidden variables, known structure. The hard EM algorithm for Gaussian mixtures.
Feb 18  EM for naive Bayes mixture models
Feb 21  EM and overfitting, Dirichlet priors
Feb 23  Gibbs sampling with and without learning
Feb 25  Hidden Markov models: forward algorithm, MPE
Feb 28  Hidden Markov models: Viterbi algorithm
Mar  1  HMMs applied to speech recognition and DNA sequence modeling
Mar  3  Factorial HMMs and Monte Carlo inference in HMMs
Mar  6  HMMs applied to reinforcement learning of POMDPs. Greedy action selection in HMMs: value of information.
Mar  8  Decomposition methods for MDPs: MAXQ
Mar 10  Class Cancelled
Mar 16  FINAL EXAM (14:00)