Trustable Machine Learning
Course Description
This short course considers the problem of obtaining reliable
decisions from supervised machine learning. It attempts to summarize
the current state of knowledge about how we can create machine
learning classifiers that, when they make a prediction, can provide a
guarantee that the prediction is correct with high probability. These
classifiers reject test queries for which they are not sufficiently
confident. The course consists of four lectures, with each lecture
centered around a few recent papers but including material from other
publications.
- Lecture 1: Calibrated Probabilities. This lecture
discusses how to obtain calibrated probabilities from supervised
classifiers. These are useful not only for making rejection decisions, but
also for cost-sensitive classification, for handling class
imbalance, and for serving as a component of a larger AI system.
- Lecture 2: Classification with a Reject
Option. We do not need to obtain calibrated probabilities in
order to make reject decisions correctly. This lecture discusses
methods for setting a rejection threshold that provide accuracy
guarantees. This includes standard thresholding methods and also
the method of conformal prediction.
- Lecture 3: Open Category Detection. The first
two lectures consider only the case of a closed world with i.i.d.
training data. In this lecture, we discuss the problem of
detecting test queries that belong to classes not present in the
training data.
- Lecture 4: Anomaly Detection. Most open category
methods apply an anomaly detection method to detect the
novel-class queries. This lecture discusses a benchmark study of
eight anomaly detection algorithms. It then presents the Rare
Pattern Anomaly Detection theory developed by Alan Fern, Md. Amran
Siddiqui, and me that gives a PAC-style theory for anomaly
detection methods.
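As a rough illustration of the reject option described for Lecture 2, the following is a minimal sketch of split conformal prediction for a classifier that outputs class probabilities. It is not taken from the lecture materials. The function names `conformal_threshold` and `predict_or_reject`, the nonconformity score (one minus the probability of the true class), and the rule of rejecting whenever the prediction set is not a single label are all assumptions chosen for illustration.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal score threshold for miscoverage level alpha.

    cal_probs: list of per-class probability lists on held-out calibration data.
    cal_labels: list of true class indices for the same examples.
    (Both names are hypothetical; any held-out labeled set would do.)
    """
    # Assumed nonconformity score: one minus the probability of the true class.
    scores = sorted(1.0 - probs[label]
                    for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every label.
        return float("inf")
    return scores[rank - 1]

def predict_or_reject(probs, threshold):
    """Return the only label in the prediction set, or None to reject."""
    prediction_set = [k for k, p in enumerate(probs) if 1.0 - p <= threshold]
    return prediction_set[0] if len(prediction_set) == 1 else None

# Toy calibration data for a two-class problem (made up for illustration).
cal_probs = [[0.875, 0.125], [0.75, 0.25], [0.25, 0.75], [0.5, 0.5],
             [0.125, 0.875], [0.625, 0.375], [0.375, 0.625]]
cal_labels = [0, 0, 1, 1, 1, 0, 0]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
confident = predict_or_reject([0.875, 0.125], threshold)  # single-label set
ambiguous = predict_or_reject([0.5, 0.5], threshold)      # two labels: reject
```

If the calibration and test examples are exchangeable, the prediction set built this way contains the true label with probability at least 1 - alpha. That guarantee applies to the set, not to the accuracy of the accepted single-label predictions, so this sketch shows only one simple way to turn prediction sets into reject decisions.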
I was not able to cover ALL of the relevant literature in these
presentations. I would be grateful to receive email with pointers to
other papers that discuss these topics. Similarly, if you see errors
in these presentations, please send me email so that I can correct
them.
Tom Dietterich, tgd@cs.orst.edu