CS 434: Machine Learning and Data Mining

Fall 2008

MWF 15:00 - 15:50 Kelly 1001

Instructor: Xiaoli Fern

Email:	xfern@eecs.oregonstate.edu
Office:	Kelly 3073
Office hours:	MWF 2-3pm, or by appointment
Class email list:	cs434-f08@engr.oregonstate.edu

Machine learning and data mining is a subfield of artificial intelligence concerned with developing computer programs that learn from past experience and find useful patterns in data. The field has produced many tools that are widely used and have had a significant impact in both industrial and research settings. Application domains include personalized spam filters, HIV vaccine design, handwritten digit recognition, face recognition, credit card fraud detection, unmanned vehicle control, medical diagnosis, and intelligent web search.

This course provides a basic introduction to this dynamic and fast-advancing field. Topics cover its three basic branches: (1) supervised learning for prediction problems (learn to predict); (2) unsupervised learning for clustering data and discovering interesting patterns in data (learn to understand); and (3) reinforcement learning for learning to select actions based on positive and negative feedback (learn to act). The course has a special focus on the practical side: students will not only learn various machine learning and data mining techniques, but also learn how to apply them to real problems.

Syllabus

Course Policy

Course materials

No textbook is required. Lecture notes and reading materials will be posted on the webpage; please check regularly.
Useful References:

Machine Learning, Tom Mitchell, McGraw-Hill, 1997 (referred to as TM).

Pattern Recognition and Machine Learning, Chris Bishop, Springer (referred to as Bishop).

Learning objectives

Upon completing the course, students are expected to be able to:

1) Apply supervised learning algorithms to prediction problems and evaluate the results.

2) Apply unsupervised learning algorithms to data analysis problems and evaluate the results.

3) Apply reinforcement learning algorithms to control problems and evaluate the results.

4) Take a description of a new problem and decide what kind of problem (supervised, unsupervised, or reinforcement learning) it is.

Lecture Schedule

See the previous offering of the class, CS 434 Fall 2007, for a rough lecture schedule.

Date	Topics	Lecture Notes	Reading	Assignments
9/29 M	Introduction to basic concepts	slides	TM Chapter 1
10/1 W	The perceptron algorithm	slides	notes on perceptron by William Cohen
10/3 F	The nearest neighbor algorithm	slides		hw1, due Monday, Oct 13th, in class; solution
10/6 M	Decision tree algorithm	slides	J. R. Quinlan, Induction of decision trees, Machine learning 1: 81-106, 1986
10/8 W	Decision tree cont.	slides
10/10 F	Review of probability theory	slides
10/13 M	(Naive) Bayes classifier	slides		hw2, due Friday, Oct 24th, in class; solution to the written part
10/15 W	NBC cont., logistic regression	slides	generative model vs discriminative model
10/17 F	Logistic Regression	slides
10/20 M	Support Vector Machine	slides		Final project information
10/22 W	Support vector machines cont.	slides
10/24 F	Ensemble methods, bagging	slides
10/27 M	Boosting	slides	A short introduction to boosting
10/29 W	Feature Selection	slides
10/31 F	Clustering, HAC	slides		Assignment 3, due Nov 12th
11/3 M	Clustering cont., K-means	slides
11/5 W	Midterm exam
11/7 F	Gaussian Mixture modeling	slides
11/10 M	Discussion of midterm questions
11/12 W	Canceled class
11/14 F	GMM cont., unsupervised dimension reduction	slides		Assignment 4, due Nov 24th (cluster.csv; random.csv)
11/17 M	Guest lecture on sequence analysis
11/19 W	Markov Decision Processes	slides
11/21 F	MDPs cont.	slides
11/24 M	Reinforcement learning	slides		hw5, due 12/03
11/26 W	Reinforcement learning - passive learning	slides
11/28 F	No class - Thanksgiving holiday
12/1 M	Reinforcement learning - active learning	slides
12/3 W	Reinforcement learning - function approximation	slides
12/5 F	Association rule mining