Each student is responsible for his or her own work. The standard departmental rules for academic dishonesty apply to all assignments in this course. Collaboration on homework and programming assignments must be limited to answering questions that can be asked and answered without using any written medium (e.g., no pencils, pens, or email). In particular, no student should read code written by another student.
tgd@cs.orst.edu
Jan  5   Agents. Markov decision problems. Partially observable Markov decision problems.
Jan  7   Optimal value functions and policies [SB 3]
Jan 10   Policy evaluation, policy iteration [SB 4]
Jan 12   Value iteration, generalized value iteration, prioritized sweeping [SB 4 and handout]
Jan 14   Monte Carlo methods [SB 5]
Jan 17   MLK Holiday: no class
Jan 19   TD(0), SARSA(0), Q learning [SB 6.4] (see the sketch following the schedule)
Jan 21   Average-reward DP: R learning [handout]
Jan 24   TD(lambda) [SB 7]
Jan 26   SARSA(lambda), Q(lambda) [SB 7]
Jan 28   TD(lambda) with function approximation [SB 8]
Jan 31   MIDTERM EXAM
Feb  2   Model-based learning. Compact models of the environment
Feb  4   Review of belief networks
Feb  7   Belief net inference: SPI
Feb  9   Belief net inference: junction tree algorithm
Feb 11   Constructing junction trees using SPI
Feb 14   Learning in belief nets: fully observable, known structure
Feb 16   Learning in belief nets: hidden variables, known structure. The hard EM algorithm for Gaussian mixtures
Feb 18   EM for naive Bayes mixture models
Feb 21   EM and overfitting; Dirichlet priors
Feb 23   Gibbs sampling with and without learning
Feb 25   Hidden Markov models: forward algorithm, MPE
Feb 28   Hidden Markov models: Viterbi algorithm
Mar  1   HMMs applied to speech recognition and DNA sequence modeling
Mar  3   Factorial HMMs and Monte Carlo inference in HMMs
Mar  6   HMMs applied to reinforcement learning of POMDPs. Greedy action selection in HMMs: value of information
Mar  8   Decomposition methods for MDPs: MAXQ
Mar 10   Class cancelled
Mar 16   FINAL EXAM, 14:00
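For concreteness, here is a minimal sketch of tabular one-step Q-learning, the algorithm scheduled for Jan 19 [SB 6.4]. The environment interface (env.actions, env.reset, env.step) and all parameter values are illustrative assumptions, not part of the course materials or the course text.

    import random
    from collections import defaultdict

    def q_learning(env, n_episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
        """Tabular one-step Q-learning (cf. [SB 6.4]).

        Assumed environment interface (hypothetical, for illustration only):
          env.actions      -- list of all actions
          env.reset()      -- returns an initial state
          env.step(s, a)   -- returns (next_state, reward, done)
        """
        Q = defaultdict(float)  # Q[(state, action)]; unseen pairs default to 0.0

        for _ in range(n_episodes):
            s = env.reset()
            done = False
            while not done:
                # Epsilon-greedy action selection.
                if random.random() < epsilon:
                    a = random.choice(env.actions)
                else:
                    a = max(env.actions, key=lambda act: Q[(s, act)])
                s2, r, done = env.step(s, a)
                # One-step backup toward the greedy value of the successor state.
                best_next = 0.0 if done else max(Q[(s2, act)] for act in env.actions)
                Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
                s = s2
        return Q

Once training finishes, a greedy policy can be read off the table as pi(s) = argmax over a of Q[(s, a)].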