# Midterm Study Guide -- CS534 -- Spring 2005

Topics to know for the midterm:

• Situations in which machine learning is useful.
• Definitions of terminology: training examples, features, classes, hypotheses, hypothesis classes, loss functions, adjustable parameters, VC dimension.
• Decision theory: How to use a loss function to decide what decision to make in order to minimize expected loss. How to handle reject options.
• Three main kinds of hypotheses: decision boundaries, conditional models P(y|X), and joint models P(X,y).
• How to make classification decisions using each of these.
• Types of hypothesis spaces: fixed versus variable size, stochastic versus deterministic. The debate about which kind of method is best, and the factors disputed in that debate.
• Criteria for off-the-shelf learning algorithms. What does each of them mean?
• Details of specific learning algorithms and hypothesis spaces (type of decision boundary, learning algorithms, advantages and disadvantages according to the criteria for "off-the-shelf" learning):
  • Linear threshold units (what can they express? What can't they express?) Ways of fitting LTUs via LMS, logistic regression, multivariate Gaussians, Naive Bayes (discrete case), and linear programming.
  • Decision trees (including the splitting rule and methods of handling missing values)
  • Neural networks (including both squared error and softmax error, and initialization of the weights)
  • Nearest neighbor (the curse of dimensionality)
  • Support vector machines (kernels, formulation as linear programming)
  • Naive Bayes (how to compute it for discrete attributes; Laplace corrections; kernel density estimation)
• Computational Learning Theory: Blumer bound for discrete hypothesis space and for continuous hypothesis space. Estimating the VC dimension by geometric analysis.
• Gradient descent search. How to design a gradient descent search algorithm. Difference between batch and incremental (stochastic) gradient descent.
• Linear programming. What is the standard form of a linear programming problem?
• Bayesian learning theory. What is Bayesian model averaging? What is the MAP (maximum a posteriori) hypothesis? How are they related?
• Bias-Variance Analysis. Definitions of Bias, Variance, Noise. Decomposition for squared error, 0-1 loss. Estimation using the bootstrap. Ensemble methods: Bagging and Boosting.
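For the decision theory topic, the following is a minimal sketch of choosing the action that minimizes expected loss, with a reject option. The loss matrix, posterior values, and reject cost below are hypothetical, purely for illustration:

```python
# Minimum-expected-loss classification with a reject option.
# The loss matrix and posterior probabilities here are made up.

def min_expected_loss(posterior, loss, reject_cost=None):
    """posterior[k] = P(y=k | x); loss[a][k] = cost of action a when the truth is k."""
    # Expected loss of each classification action a.
    exp_loss = [sum(loss[a][k] * posterior[k] for k in range(len(posterior)))
                for a in range(len(loss))]
    best = min(range(len(exp_loss)), key=lambda a: exp_loss[a])
    # Reject (refuse to classify) if even the best action costs more
    # than the fixed cost of rejecting.
    if reject_cost is not None and exp_loss[best] > reject_cost:
        return "reject"
    return best

# 0-1 loss on two classes: loss[a][k] = 0 if a == k, else 1.
loss01 = [[0, 1], [1, 0]]
print(min_expected_loss([0.7, 0.3], loss01))         # → 0 (expected loss 0.3)
print(min_expected_loss([0.55, 0.45], loss01, 0.3))  # → "reject" (best loss 0.45 > 0.3)
```

With 0-1 loss and no reject option, this rule reduces to picking the class with the highest posterior probability.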
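For the gradient descent topic, here is a sketch contrasting batch and incremental (stochastic) gradient descent on the LMS objective J(w) = ½ Σᵢ (w·xᵢ − yᵢ)². The toy data and learning rates are made up:

```python
# Batch vs. incremental (stochastic) gradient descent for LMS.
# Toy data and hyperparameters are hypothetical.

def predict(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def batch_gd(X, y, lr=0.1, epochs=100):
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        # One weight update from the gradient over the whole training set.
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j in range(len(w)):
                grad[j] += err * xi[j]
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def stochastic_gd(X, y, lr=0.1, epochs=100):
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        # One weight update per training example.
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w

# Toy data: y = 2*x1, with a constant bias feature x0 = 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0.0, 2.0, 4.0, 6.0]
print(batch_gd(X, y))       # approaches [0, 2]
print(stochastic_gd(X, y))  # approaches [0, 2]
```

Both reach the same minimizer here; the difference is that the batch version takes one exact gradient step per epoch, while the stochastic version takes one noisy step per example.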
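For the Naive Bayes topic, a small sketch of the discrete case with Laplace corrections. The data set and class/feature sizes are hypothetical:

```python
# Naive Bayes for discrete attributes with Laplace corrections.
# The tiny training set here is made up.
import math
from collections import Counter, defaultdict

def train_nb(examples, labels, num_values, num_classes):
    """Each P(x_j = v | y = c) is estimated with a Laplace correction:
       (count(v, c) + 1) / (count(c) + num_values[j])."""
    class_count = Counter(labels)
    feat_count = defaultdict(Counter)  # (feature j, class c) -> Counter over values
    for x, c in zip(examples, labels):
        for j, v in enumerate(x):
            feat_count[(j, c)][v] += 1
    n = len(labels)

    def classify(x):
        scores = []
        for c in range(num_classes):
            # Laplace-corrected class prior, then sum of log likelihoods.
            s = math.log((class_count[c] + 1) / (n + num_classes))
            for j, v in enumerate(x):
                s += math.log((feat_count[(j, c)][v] + 1) /
                              (class_count[c] + num_values[j]))
            scores.append(s)
        return scores.index(max(scores))

    return classify

# Two binary features, two classes.
X = [(0, 0), (0, 1), (1, 1), (1, 1)]
y = [0, 0, 1, 1]
classify = train_nb(X, y, num_values=[2, 2], num_classes=2)
print(classify((0, 0)))  # → 0
print(classify((1, 1)))  # → 1
```

The +1 in every numerator ensures no conditional probability is estimated as zero, so a single unseen feature value cannot veto a class.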
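For the computational learning theory topic, a sketch of the Blumer bound for a finite hypothesis space: with m ≥ (1/ε)(ln|H| + ln(1/δ)) examples, a learner that outputs a consistent hypothesis has, with probability at least 1 − δ, true error at most ε. The example hypothesis space (conjunctions over 10 boolean features) is just an illustration:

```python
# Blumer bound for a finite hypothesis space:
#   m >= (1/eps) * (ln|H| + ln(1/delta))
import math

def blumer_sample_size(h_size, eps, delta):
    """Number of examples sufficient for a consistent learner to be
    probably (1 - delta) approximately (error <= eps) correct."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / eps)

# e.g. conjunctions over n = 10 boolean features: |H| = 3^10
# (each feature appears positively, negatively, or not at all).
print(blumer_sample_size(3 ** 10, eps=0.1, delta=0.05))  # → 140
```

Note how the bound grows only logarithmically in |H| and 1/δ but linearly in 1/ε; the continuous-hypothesis-space version replaces ln|H| with a term involving the VC dimension.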
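For the linear programming topic, one commonly used statement of the standard form (conventions differ across textbooks, e.g. equality versus inequality constraints, so check the form used in lecture):

```latex
\min_{x} \; c^{T} x
\quad \text{subject to} \quad
A x = b, \qquad x \ge 0
```

Maximization problems, inequality constraints (via slack variables), and free variables can all be rewritten into this form.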