1.1, 1.4, 2 (all), 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 4.1, 4.2, 4.4, 6.1, 6.2, 7.5, 7.6, 7.7, 7.8.
We covered the following material since the midterm:
14 (all), 15.1, 15.2, 15.3 (530 only), 15.4 (530 only), 16.1, 16.3, 16.5, 16.6 (530 only), 17 (all), 18.1, 18.2, 18.3, 18.4, 19.2, 19.3, 19.4, 19.6, 19.7, 20.1, 20.4, 20.5, 20.6.
You are responsible for all of these except for the sections in Chapter 7.
Here is an outline of the most important points in the second half of the course.
1. Implementing agents using probability

Key ideas:
* Use the power of probability to represent uncertainty about the environment.
* Use probabilistic inference to implement the functions of the agent.
* Use a utility function to represent the utility of different states.
* Combine with dynamic programming search to find optimal policies.
* Probability theory
  - Random variables
  - Algebraic rules of probability
* Probabilistic inference
  - Belief networks (semantics, syntax, conditional independence, D-separation)
  - SPI algorithm (a small inference sketch follows this outline)
* Dynamic Programming Algorithms
  - Value iteration (sketched in code after this outline)
  - Policy iteration
  - Modified policy iteration
* Dynamic decision networks
  - Updating the belief about the current state based on the chosen action and observation.
  - Performing lookahead search a fixed number of steps to choose the optimal action.

2. Learning for probabilistic agents

Key ideas:
* Each state-action-result-reward step provides training examples for learning P(S'|S,A) and R(S'|S,A).
* A learning agent must explore (try actions not currently believed to be optimal) in order to learn more about the environment.
* Exploration strategies include random exploration, weighted random exploration (Boltzmann exploration), and optimism under uncertainty.
* Optimism under uncertainty is similar to A* search and avoids the need for exhaustive exploration in some cases.
* Dynamic programming can be applied after each step to derive the current best policy.
* Q learning is an alternative approach that avoids learning a model of the environment. It generally requires many more interactions with the environment to reach an optimal policy. Q learning relies on temporal averaging to compute expected values. (A Q-learning sketch with Boltzmann exploration follows this outline.)

Limitations:
* Each action must be performed in each state many times in order to learn the model. No generalization!
* Q learning can be very slow.

3. Supervised Learning

Key ideas:
* Supervised learning involves learning the definition of an unknown function from examples of that function.
* Learning algorithms use heuristic search through large spaces of potential hypotheses.
* There is a fundamental tradeoff in machine learning between the amount of data, the size of the hypothesis space, and the expected accuracy of the resulting hypothesis.
* Decision tree algorithms "grow" a decision tree using a 1-step greedy heuristic (sketched after this outline).
* Linear threshold units ("perceptrons") are learned by gradient descent search (sketched after this outline).
* Multilayer neural networks are also learned by gradient descent. Early stopping (using a halting set) is used to control the number of hypotheses explored.
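To illustrate the belief-network semantics listed above, here is a small Python sketch. It uses inference by enumeration (multiply CPT entries to get the joint, then sum out the hidden variables and normalize) rather than the SPI algorithm itself, and the Rain/Sprinkler/WetGrass network and all of its probability values are invented for illustration.

    # Inference by enumeration on a tiny, hypothetical belief network:
    # Rain -> WetGrass <- Sprinkler.  All CPT numbers are made up.
    from itertools import product

    P_rain = {True: 0.2, False: 0.8}
    P_sprinkler = {True: 0.1, False: 0.9}
    # P(WetGrass = true | Rain, Sprinkler)
    P_wet = {(True, True): 0.99, (True, False): 0.9,
             (False, True): 0.8, (False, False): 0.0}

    def joint(r, s, w):
        """Joint probability = product of CPT entries: P(R) P(S) P(W | R, S)."""
        pw = P_wet[(r, s)] if w else 1.0 - P_wet[(r, s)]
        return P_rain[r] * P_sprinkler[s] * pw

    # Query P(Rain | WetGrass = true): sum out Sprinkler, then normalize.
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    print("P(Rain | WetGrass = true) =", num / den)

SPI organizes the same computation by choosing an order in which to multiply factors and sum out variables, but it returns the same answer as this brute-force enumeration.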
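Next, a minimal value-iteration sketch on a tiny, hypothetical tabular MDP. The states, actions, transition probabilities, rewards, discount factor, and convergence threshold are all assumptions for illustration, not course data.

    # Value iteration on a hypothetical two-state MDP.
    GAMMA = 0.9   # discount factor
    THETA = 1e-6  # convergence threshold

    states = ["s0", "s1"]
    actions = ["stay", "move"]

    # Transition model: P[(s, a)] = list of (next state, probability).
    P = {
        ("s0", "stay"): [("s0", 1.0)],
        ("s0", "move"): [("s1", 0.8), ("s0", 0.2)],
        ("s1", "stay"): [("s1", 1.0)],
        ("s1", "move"): [("s0", 0.8), ("s1", 0.2)],
    }
    # Rewards R[(s, a, s')]; unlisted transitions give reward 0.
    R = {("s0", "move", "s1"): 1.0}

    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman backup: V(s) <- max_a sum_s' P(s'|s,a) [R(s,a,s') + gamma V(s')]
            best = max(sum(p * (R.get((s, a, s2), 0.0) + GAMMA * V[s2])
                           for s2, p in P[(s, a)])
                       for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < THETA:
            break

    # Read off the greedy (optimal) policy from the converged values.
    policy = {s: max(actions,
                     key=lambda a: sum(p * (R.get((s, a, s2), 0.0) + GAMMA * V[s2])
                                       for s2, p in P[(s, a)]))
              for s in states}
    print({s: round(v, 2) for s, v in V.items()}, policy)

Updating V in place (rather than from a separate copy) is still correct; it only changes the order in which value information propagates between states.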
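The following sketch shows Q learning combined with Boltzmann (weighted random) exploration. The two-state environment, learning rate, discount factor, and temperature are hypothetical; the point is the temporal-difference update toward a one-step sample of the Bellman target and the exponentially weighted action choice.

    # Q learning with Boltzmann exploration on a hypothetical environment.
    import math
    import random

    states = ["s0", "s1"]
    actions = ["stay", "move"]
    ALPHA, GAMMA, TAU = 0.1, 0.9, 0.5   # learning rate, discount, temperature

    Q = {(s, a): 0.0 for s in states for a in actions}

    def step(s, a):
        """Hypothetical environment: 'move' flips the state; reward 1 for landing in s1."""
        s2 = ("s1" if s == "s0" else "s0") if a == "move" else s
        return s2, (1.0 if s2 == "s1" else 0.0)

    def boltzmann(s):
        """Pick an action with probability proportional to exp(Q(s, a) / TAU)."""
        weights = [math.exp(Q[(s, a)] / TAU) for a in actions]
        return random.choices(actions, weights=weights)[0]

    s = "s0"
    for _ in range(5000):
        a = boltzmann(s)
        s2, r = step(s, a)
        # Temporal averaging: move Q(s, a) a little toward r + gamma * max_a' Q(s', a').
        target = r + GAMMA * max(Q[(s2, a2)] for a2 in actions)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

    print({k: round(v, 2) for k, v in Q.items()})

Note that no transition model P(S'|S,A) is ever built; the Q table is learned directly from sampled transitions, which is why Q learning typically needs many more interactions than the model-based approach.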
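Here is a sketch of the 1-step greedy heuristic for growing a decision tree: compute the information gain of each candidate attribute and split on the best one. The four-example dataset and attribute names are invented for illustration.

    # Choosing the best attribute to split on by information gain.
    import math

    def entropy(labels):
        """Entropy (in bits) of a list of boolean class labels."""
        if not labels:
            return 0.0
        p = sum(labels) / len(labels)
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def information_gain(examples, attr):
        """Reduction in entropy obtained by splitting the examples on attr."""
        gain = entropy([label for _, label in examples])
        for value in {features[attr] for features, _ in examples}:
            subset = [label for features, label in examples if features[attr] == value]
            gain -= len(subset) / len(examples) * entropy(subset)
        return gain

    # Each example is ({attribute: value}, class label).  The data are made up.
    examples = [
        ({"outlook": "sunny", "windy": False}, False),
        ({"outlook": "sunny", "windy": True},  False),
        ({"outlook": "rain",  "windy": False}, True),
        ({"outlook": "rain",  "windy": True},  True),
    ]
    attributes = ["outlook", "windy"]
    best = max(attributes, key=lambda a: information_gain(examples, a))
    print("best attribute to split on:", best)

The full algorithm recurses on each subset with the chosen attribute removed; this sketch only shows the greedy choice made at a single node.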
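Finally, a sketch of gradient-descent weight updates for a linear threshold unit, using the delta (LMS) rule on the unthresholded linear output. The boolean-OR training set, learning rate, and number of epochs are assumptions for illustration.

    # Training a linear threshold unit by gradient descent on squared error.
    ETA = 0.1            # learning rate
    w = [0.0, 0.0, 0.0]  # bias weight plus one weight per input

    # Training examples for boolean OR: ([x1, x2], target).
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

    def linear_output(x):
        """Weighted sum including the bias term (its input is fixed at 1)."""
        return w[0] + w[1] * x[0] + w[2] * x[1]

    def predict(x):
        """Threshold the linear output to get a 0/1 classification."""
        return 1 if linear_output(x) >= 0.5 else 0

    for _ in range(200):                       # epochs of stochastic updates
        for x, target in data:
            error = target - linear_output(x)
            # Gradient step on squared error: w <- w + eta * error * input.
            w[0] += ETA * error
            w[1] += ETA * error * x[0]
            w[2] += ETA * error * x[1]

    print([round(wi, 2) for wi in w], [predict(x) for x, _ in data])

A multilayer network is trained the same way, except that the error is propagated backwards through the hidden units and training is halted early when accuracy on the held-out halting set stops improving.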
Skills you should be able to demonstrate during the exam:
1. Know how to infer the joint distribution and conditional independencies from the structure of a belief network.
2. Be able to hand-execute the SPI algorithm.
3. Be able to hand-simulate value iteration, policy iteration (both value determination and policy improvement), and Q learning (a policy-iteration sketch follows this list).
4. Be able to compute the optimism-under-uncertainty policy.
5. Be able to hand-simulate decision tree and linear threshold unit algorithms.
6. Be able to write down belief networks and decision diagrams for simple situations.
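As a worked illustration of skill 3, the sketch below alternates value determination (iteratively evaluating the current policy) with policy improvement (acting greedily on the evaluated values). The two-state MDP, transition probabilities, and rewards are invented for illustration.

    # Policy iteration on a hypothetical two-state MDP.
    GAMMA = 0.9

    states = ["s0", "s1"]
    actions = ["stay", "move"]
    P = {
        ("s0", "stay"): [("s0", 1.0)],
        ("s0", "move"): [("s1", 0.8), ("s0", 0.2)],
        ("s1", "stay"): [("s1", 1.0)],
        ("s1", "move"): [("s0", 0.8), ("s1", 0.2)],
    }
    R = {("s0", "move", "s1"): 1.0}   # all other rewards are 0

    def q_value(s, a, V):
        """Expected value of doing a in s and then receiving V thereafter."""
        return sum(p * (R.get((s, a, s2), 0.0) + GAMMA * V[s2]) for s2, p in P[(s, a)])

    policy = {s: "stay" for s in states}
    while True:
        # Value determination: evaluate the current policy by iterating its backups.
        V = {s: 0.0 for s in states}
        for _ in range(200):
            V = {s: q_value(s, policy[s], V) for s in states}
        # Policy improvement: act greedily with respect to the evaluated values.
        new_policy = {s: max(actions, key=lambda a: q_value(s, a, V)) for s in states}
        if new_policy == policy:
            break
        policy = new_policy

    print(policy, {s: round(v, 2) for s, v in V.items()})

The loop stops when policy improvement leaves the policy unchanged, which is the same stopping condition to use when hand-simulating the algorithm.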
The most important items from the first half of the course are:
1. Definitions of different kinds of agents.
2. Definitions of different kinds of environments.
3. Key functions that must be implemented in a general agent.
Consult the midterm study guide for more details.