CS533 Final Exam Study Guide
The final exam will be held in Rogers Hall Room 440 at 12:00 on
Monday, December 4. It will last 110 minutes. Format: short answer
problems (like the midterm), open notes.
The final will cover everything in the course, but at least two-thirds
of the exam will cover material that we have discussed since the midterm.
- Machine Learning and Pattern Recognition
- Definition of classification learning problem.
- Neural networks:
- weights, bias values, sigmoid functions.
- error function J(S,W), gradient descent search
- stochastic gradient descent, conjugate gradient descent
- measuring error using holdout data.
- the problem of overfitting; solving via early stopping.
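As a review aid, the gradient descent search on J(S,W) can be sketched for a single sigmoid unit. This is a minimal illustration, not the course's implementation: it assumes J is half the summed squared error over the sample S, and the names `predict`, `error`, `gradient_step`, the learning rate, and the toy OR data are all made up for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # Output of one sigmoid unit with weights w and bias value b.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def error(S, w, b):
    # J(S, W): assumed here to be half the summed squared error over S.
    return 0.5 * sum((predict(w, b, x) - y) ** 2 for x, y in S)

def gradient_step(S, w, b, rate):
    # One batch gradient descent step: move against the gradient of J.
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in S:
        out = predict(w, b, x)
        delta = (out - y) * out * (1.0 - out)  # dJ/dz for this example
        for i, xi in enumerate(x):
            grad_w[i] += delta * xi
        grad_b += delta
    new_w = [wi - rate * g for wi, g in zip(w, grad_w)]
    return new_w, b - rate * grad_b

# Toy sample (hypothetical): the logical OR function.
S = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
w, b = [0.0, 0.0], 0.0
initial_error = error(S, w, b)
for _ in range(2000):
    w, b = gradient_step(S, w, b, rate=0.5)
final_error = error(S, w, b)
```

Stochastic gradient descent would instead update the weights after each example rather than after summing over all of S. Early stopping would track `error` on holdout data and keep the weights from the step where that holdout error was lowest.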
- Decision trees:
- structure and how they are executed to classify an example
- top-down divide-and-conquer method for growing a tree
- Scoring a proposed splitting test using mutual information.
- Pruning using a validation set.
- Converting trees to rules.
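Scoring a proposed splitting test by mutual information can be sketched as follows. This is an illustrative sketch only: the score is taken to be the entropy of the class labels minus the expected entropy after the split, and the function names and the tiny data set are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy (in bits) of the class labels in a list.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(examples, test):
    # examples: list of (features, label); test: function features -> outcome.
    labels = [y for _, y in examples]
    branches = {}
    for x, y in examples:
        branches.setdefault(test(x), []).append(y)
    # Expected entropy of the labels after the split, weighted by branch size.
    remainder = sum(len(b) / len(labels) * entropy(b) for b in branches.values())
    return entropy(labels) - remainder

# Hypothetical training examples.
examples = [
    ({"outlook": "sunny", "windy": False}, "no"),
    ({"outlook": "sunny", "windy": True}, "no"),
    ({"outlook": "rain", "windy": False}, "yes"),
    ({"outlook": "rain", "windy": True}, "yes"),
]
score_outlook = mutual_information(examples, lambda x: x["outlook"])
score_windy = mutual_information(examples, lambda x: x["windy"])
```

Here the test on `outlook` separates the classes perfectly and scores one bit, while the test on `windy` tells us nothing about the class and scores zero, so a top-down grower would pick `outlook`.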
- Nearest neighbor method:
- finding the k nearest neighbors.
- importance of the distance metric.
- problems with noisy or irrelevant features.
- finding the nearest neighbor using a kd tree.
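A minimal sketch of the k nearest neighbor method follows. It assumes Euclidean distance and a majority vote, and it finds neighbors by a plain linear scan rather than a kd tree. All names and data are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def k_nearest(train, query, k, distance=euclidean):
    # Linear scan: sort all training examples by distance to the query.
    return sorted(train, key=lambda ex: distance(ex[0], query))[:k]

def classify(train, query, k, distance=euclidean):
    # Majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in k_nearest(train, query, k, distance))
    return votes.most_common(1)[0][0]

# Hypothetical training set with two well-separated classes.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]
label_near_a = classify(train, (1.1, 1.0), k=3)
label_near_b = classify(train, (4.9, 5.0), k=3)
```

The `distance` parameter is where the choice of metric enters. An irrelevant feature with a large range would dominate `euclidean` and could change which neighbors are selected.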
- Reinforcement Learning:
- Definition of a reinforcement learning task: states, actions,
- Definition of policy, optimal policy, and value function.
- Computing the optimal policy from the optimal value function.
- The value iteration algorithm.
- Temporal difference learning using a neural network.
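Value iteration, and reading the optimal policy off the resulting value function, can be sketched on a tiny made-up task. The representation is an assumption of this example: `transitions[(s, a)]` is a list of (probability, next state, reward) triples, states with no actions are terminal, and the discount factor `gamma` is 0.9.

```python
def q_value(V, transitions, s, a, gamma):
    # Expected reward plus discounted value of the next state.
    return sum(p * (r + gamma * V[s2]) for p, s2, r in transitions[(s, a)])

def value_iteration(states, actions, transitions, gamma=0.9, tol=1e-9):
    V = {s: 0.0 for s in states}
    while True:
        change = 0.0
        for s in states:
            if not actions(s):
                continue  # terminal state keeps value 0
            best = max(q_value(V, transitions, s, a, gamma) for a in actions(s))
            change = max(change, abs(best - V[s]))
            V[s] = best
        if change < tol:
            return V

def greedy_policy(V, states, actions, transitions, gamma=0.9):
    # The optimal policy picks, in each state, the action with the best backup.
    return {s: max(actions(s), key=lambda a: q_value(V, transitions, s, a, gamma))
            for s in states if actions(s)}

# Hypothetical chain of four states; entering state 3 (the goal) pays reward 1.
states = [0, 1, 2, 3]

def actions(s):
    return [] if s == 3 else ["left", "right"]

transitions = {}
for s in range(3):
    transitions[(s, "left")] = [(1.0, max(s - 1, 0), 0.0)]
    transitions[(s, "right")] = [(1.0, s + 1, 1.0 if s + 1 == 3 else 0.0)]

V = value_iteration(states, actions, transitions)
policy = greedy_policy(V, states, actions, transitions)
```

In this chain the values fall off by a factor of `gamma` per step away from the goal, and the greedy policy moves right in every non-terminal state.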
- Diagnosis with Belief Networks
- Diagnosis problem: repair the device while minimizing total
average cost of repair.
- Computing optimal policies in the repair-only case with the
single-fault assumption (order the components by the ratio of
probability of failure to cost of observation).
- Computing optimal policies in the general case by complete
analysis of the decision tree (working backwards taking expected
values and max's).
- Computing approximately-optimal policies for the case where we
have a mix of repairable and purely observable components.
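The ratio rule for the repair-only, single-fault case can be checked on a small example. The sketch below makes several assumptions: exactly one component is faulty, components are checked one at a time until the fault is found, and each check has a fixed cost. The component names and numbers are invented. It compares the ratio ordering against a brute-force search over all orderings.

```python
from itertools import permutations

def expected_cost(order, prob, cost):
    # We pay for every component checked up to and including the faulty one.
    total, spent = 0.0, 0.0
    for comp in order:
        spent += cost[comp]
        total += prob[comp] * spent
    return total

def ratio_order(prob, cost):
    # Check components in decreasing order of probability / cost.
    return sorted(prob, key=lambda comp: prob[comp] / cost[comp], reverse=True)

# Hypothetical device: fault probabilities sum to 1 (single-fault assumption).
prob = {"fuse": 0.5, "battery": 0.3, "motor": 0.2}
cost = {"fuse": 10.0, "battery": 2.0, "motor": 20.0}

best_by_ratio = ratio_order(prob, cost)
cost_by_ratio = expected_cost(best_by_ratio, prob, cost)
cost_by_search = min(expected_cost(order, prob, cost)
                     for order in permutations(prob))
```

On this example the ratio ordering is battery, fuse, motor, and its expected cost matches the minimum found by exhaustive search.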
- Probabilistic Reasoning:
- Random variables, expected values, joint distribution, marginal
probability, conditional probability.
- Computing marginal and conditional probabilities from the joint
distribution.
- Algebra of probability distributions: chain rule, independence,
- Belief networks. What probability distribution is stored at each
node. Computing the joint distribution by taking the conformal
product of all of the individual distributions.
- Computing probabilities for diagnosis. Implementing the
single fault assumption.
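Building the joint distribution from a belief network and then reading marginal and conditional probabilities off it can be sketched as follows. The three-node network (battery, lights, starts) and all of its numbers are hypothetical. Each node stores a distribution conditioned on its parent, and the joint is the product of the individual distributions.

```python
from itertools import product

# Distribution stored at each node: P(battery), P(lights | battery),
# P(starts | battery). All numbers are made up for illustration.
p_battery = {"ok": 0.9, "dead": 0.1}
p_lights = {("ok", "on"): 0.95, ("ok", "off"): 0.05,
            ("dead", "on"): 0.0, ("dead", "off"): 1.0}
p_starts = {("ok", "yes"): 0.9, ("ok", "no"): 0.1,
            ("dead", "yes"): 0.0, ("dead", "no"): 1.0}

# Joint distribution over (battery, lights, starts) as a product of the tables.
joint = {
    (b, l, s): p_battery[b] * p_lights[(b, l)] * p_starts[(b, s)]
    for b, l, s in product(["ok", "dead"], ["on", "off"], ["yes", "no"])
}

def probability(joint, assignment):
    # Marginal probability of a partial assignment {variable index: value}.
    return sum(p for key, p in joint.items()
               if all(key[i] == v for i, v in assignment.items()))

def conditional(joint, query, evidence):
    # P(query | evidence) = P(query, evidence) / P(evidence).
    return probability(joint, {**evidence, **query}) / probability(joint, evidence)

BATTERY, LIGHTS, STARTS = 0, 1, 2
p_no_start = probability(joint, {STARTS: "no"})
p_dead_given_no_start = conditional(joint, {BATTERY: "dead"}, {STARTS: "no"})
```

The conditional query is the diagnosis step: given the observation that the car does not start, it returns the probability that the battery is the fault.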