CS539: Probabilistic Relational Models

Course Description

Machine Learning is in the midst of a revolution. The "old" approach to machine learning focused on supervised learning from independent and identically distributed (iid) training examples. The goal was to learn a classifier f that given an object x would produce as output a classification label y = f(x).

The "new" approach focuses on learning a complex web of relationships among a collection of diverse objects. Examples include diagnosing the disease of a patient based not only on properties of that patient but also on properties of other people that patient lives with or has had contact with. A new formalism has been developed called Probabilistic Relational Models (PRMs) that can represent these webs of relationships and support learning and reasoning with them.

This course will provide an introduction to PRMs for graduate students interested in doing research in this area. The course will begin with a rapid review of Bayesian networks and Markov random fields, including representation and inference. Then we will read and discuss all of the papers published on Probabilistic Relational Models (PRMs) and Relational Markov Networks (RMNs). Students will make class presentations, develop PRMs and RMNs for various application problems, and identify problems for future research. The class will involve substantial work outside of class, including a class project.

Prerequisites: Consent of the instructor; basic knowledge of probability

Registration Information: 1-4 Units. TTh 2:00-2:50pm Strand 323

Course Handouts

Viewgraphs for Lectures

Part 1. Introduction to Probability and Bayesian Networks Postscript Slides.
Part 2. Inference in Belief Networks. Postscript Slides.
Part 3. Learning in Belief Networks. Postscript Slides. (Revised Oct 13 2003)
Part 4. Hidden Markov Models. Postscript Slides.
Part 5. Probabilistic Relational Models. PDF Slides.
Part 6. Relational Uncertainty in PRMs. PPT presentation.

Software Resources

Bayesian network tools in Java from Kansas State University

Reading Schedule

October 23: Learning Probabilistic Relational Models, L. Getoor, N. Friedman, D. Koller, and A. Pfeffer. Invited contribution to the book Relational Data Mining, S. Dzeroski and N. Lavrac, Eds., Springer-Verlag, 2001.
October 28 and 30: Learning Probabilistic Models of Relational Structure, L. Getoor, N. Friedman, D. Koller, and B. Taskar. Eighteenth International Conference on Machine Learning (ICML), Williams College, June 2001.
Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman, D. Koller, B. Taskar, Journal of Machine Learning Research, 2002.
November 4: [Qiang He] Excerpt of Markov Random Field Modeling in Computer Vision, by S.Z. Li. Available online from Microsoft Research.
S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. PAMI, 6:721-741, 1984.
Slides: MRFs and Gibbs Fields (He); Markov Chain Monte Carlo. (Bulatov)
November 6: Ben Taskar will visit and talk.
November 11: [Sriraam Natarajan] Perlich, C. and F. Provost. Aggregation-based Feature Invention and Relational Concept Classes. In Proceedings of the Ninth SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003). PDF slides.
[Ronald Bjarnason] Statistical Relational Learning for Link Prediction. Alexandrin Popescul, Lyle H. Ungar, Workshop on Learning Statistical Models from Relational Data at IJCAI 2003. PPT presentation.
November 13: [Phuoc Do] Autocorrelation and linkage cause bias in evaluation of relational learners. D. Jensen and J. Neville (2002). In Proceedings of The Twelfth International Conference on Inductive Logic Programming (ILP 2002). Springer-Verlag. Viewgraphs
[Matt McLaughlin] Linkage and autocorrelation cause feature selection bias in relational learning. D. Jensen and J. Neville (2002). Proceedings of the Nineteenth International Conference on Machine Learning (ICML2002). Morgan Kaufmann. pp. 259-266.
November 18: [Scott Proper, Mark Vulfson] On the Statistical Analysis of Dirty Pictures, Julian Besag, Journal of the Royal Statistical Society B, vol. 48, 1986, pp. 259-302. PPT Presentation
November 20: [Rongkun Shen] Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. John Lafferty, Andrew McCallum and Fernando Pereira. ICML-2001. PPT Presentation
[Guohua Hao] Discriminative Probabilistic Models for Relational Data, B. Taskar, P. Abbeel and D. Koller. Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI02), Edmonton, Canada, August 2002. PPT Presentation.
November 25: [Charles Parker] C. Anderson, P. Domingos and D. Weld, Relational Markov Models and their Application to Adaptive Web Navigation. Proceedings of the Eighth International Conference on Knowledge Discovery and Data Mining (pp. 143-152), 2002. Edmonton, Canada: ACM Press. PPT presentation
[Matteu Labbe] S. Sanghai, P. Domingos and D. Weld, Dynamic Probabilistic Relational Models. Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, 2003. Acapulco, Mexico: Morgan Kaufmann. PDF presentation
December 2: [Kiran Polavarapu] Learning on the Test Data: Leveraging Unseen Features, B. Taskar, M.F. Wong and D. Koller. Twentieth International Conference on Machine Learning (ICML03), Washington, DC, August 2003. PPT presentation.
December 4: [Pengcheng Wu] Max-Margin Markov Networks, B. Taskar, C. Guestrin and D. Koller. To appear in NIPS-2003. PDF slides.

Programming Tasks

Program Task Assignments (Dan Vega's page).

1. A module for acquiring the database schema from an existing RDB (MySQL?).
2. A module for importing data from an existing RDB.
3. A module for importing data in CSV format from multiple flat files.
4. A module for defining and executing a path language. Needs to support saving and restoring path expressions. This will be used for instantiating the PRM.
5. A module for entering, editing, and visualizing PRMs. By this, I mean that the module should display the PRM and allow the user to enter, inspect, and edit the path expressions that define the parents of each descriptive attribute.
6. A module for applying the PRM schema + paths to the RDB to create an "unrolled" Bayes net with tied parameters.
7. Various modules for approximate inference on the unrolled Bayes net. BNJ already contains many tools, but I believe it does not contain "loopy belief propagation".
8. A module for feature discovery that searches the space of path expressions.
9. Support for discretization either manually (e.g., as part of task 5), or automatically.
10. Support for representing, learning, and reasoning with conditional probability tables represented as mixtures of decision trees. This could be done using Friedman's tree boost technique.
11. Adding undirected arcs and potential functions into the basic inference machinery of the package. For marginal and MPE queries, this should be quite easy.
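
As a rough illustration of task 6, the following Python sketch unrolls a tiny PRM into a ground Bayes net. Everything in it is an assumption made for illustration: the Student/Course/Registration schema, the names objects, slots, prm_parents, and unroll, and the restriction to single-valued paths (no aggregators). It is not the BNJ API. Each ground node records the class-level attribute whose CPT it uses, which is one simple way to represent tied parameters.

```python
# Hypothetical toy database: objects grouped by class.
objects = {
    "Student": ["s1", "s2"],
    "Course": ["c1"],
    "Registration": ["r1", "r2"],
}

# Single-valued reference slots (foreign keys): (object, slot) -> object.
slots = {
    ("r1", "student"): "s1", ("r1", "course"): "c1",
    ("r2", "student"): "s2", ("r2", "course"): "c1",
}

# Class-level dependency structure. Each parent is (path, attribute),
# where path is a tuple of slot names followed from the child object.
prm_parents = {
    ("Student", "intelligence"): [],
    ("Course", "difficulty"): [],
    ("Registration", "grade"): [
        (("course",), "difficulty"),
        (("student",), "intelligence"),
    ],
}


def unroll(objects, slots, prm_parents):
    """Return {ground_node: (parent_ground_nodes, cpt_key)}.

    A ground node is an (object, attribute) pair. cpt_key names the
    class-level CPT shared by all ground nodes of that attribute.
    """
    ground = {}
    for (cls, attr), parents in prm_parents.items():
        for obj in objects[cls]:
            parent_nodes = []
            for path, parent_attr in parents:
                target = obj
                for slot in path:  # follow the path one slot at a time
                    target = slots[(target, slot)]
                parent_nodes.append((target, parent_attr))
            ground[(obj, attr)] = (parent_nodes, (cls, attr))
    return ground


ground = unroll(objects, slots, prm_parents)
```

Here ground maps each of the five ground nodes to its parents, and both ("r1", "grade") and ("r2", "grade") point at the same ("Registration", "grade") CPT key. Multi-valued paths would need an aggregator at the point where the path is followed, which this sketch deliberately leaves out.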

I'm listing here some tasks that I believe will require additional research (including searching to see what has already been done and possibly the development of new techniques).

1. Study the problem of how to handle probabilistic inference "through" aggregators. Example: Suppose A.x = AVG(A.B.y) and that you have observed A.x and SOME of the A.B.y's. How can you compute, for example, the marginal probability for a particular B.y value: P(B.y1 | A.x)? How can you compute the most likely combination of values for the B.y's? Are there applications that require this?
2. Develop automatic tools for handling the situation where the PRM schema does not exactly match the database schema.
3. Develop tools for handling inheritance (subclassing) within PRMs. Koller and Pfeffer may have already done research on this.
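
To make the query in research task 1 concrete, here is a brute-force Python sketch. It assumes (these are my assumptions for illustration, not part of the task statement) that the B.y values are binary, independent, and share one prior P(y = 1), and that A.x is exactly the average of the n related B.y values. The function name posterior_through_avg and its parameters are hypothetical. It enumerates every assignment to the unobserved B.y's, so its cost is exponential in their number; it only pins down what is being asked, not how to compute it efficiently.

```python
from fractions import Fraction
from itertools import product


def posterior_through_avg(prior_one, n, observed, x_avg, query):
    """P(y[query] = 1 | AVG(y) = x_avg, observed) by exhaustive enumeration.

    prior_one: prior P(y_i = 1), assumed identical and independent for all i.
    n:         number of B objects related to A.
    observed:  dict mapping index -> observed 0/1 value.
    x_avg:     observed value of A.x, as a Fraction.
    query:     index of the B.y whose posterior is wanted.
    """
    hidden = [i for i in range(n) if i not in observed]
    num = Fraction(0)
    den = Fraction(0)
    for values in product([0, 1], repeat=len(hidden)):
        y = dict(observed)
        y.update(zip(hidden, values))
        if Fraction(sum(y.values()), n) != x_avg:
            continue  # inconsistent with the deterministic aggregator
        # Prior weight of the hidden values only; the observed values
        # contribute the same factor to every term, so it cancels.
        weight = Fraction(1)
        for i in hidden:
            weight *= prior_one if y[i] == 1 else 1 - prior_one
        den += weight
        if y[query] == 1:
            num += weight
    if den == 0:
        raise ValueError("evidence has zero probability under the model")
    return num / den


# Four B objects, y[0] observed to be 1, and A.x observed to be 1/2.
p = posterior_through_avg(Fraction(3, 10), 4, {0: 1}, Fraction(1, 2), 1)
```

In this example exactly one of the three hidden values must be 1, and by symmetry each is equally likely to be that one, so p is 1/3 whatever the prior. The most likely joint assignment could be found with the same loop by keeping the consistent assignment with the largest weight.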

Reading List

We will be reading the following papers:

S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. PAMI, 6:721-741, 1984.
On the Statistical Analysis of Dirty Pictures, Julian Besag, Journal of the Royal Statistical Society B, vol. 48, 1986, pp. 259-302
Excerpt of Markov Random Field Modeling in Computer Vision, by S.Z. Li. Available online from Microsoft Research.
Object-Oriented Bayesian Networks, D. Koller and A. Pfeffer. Proceedings of the 13th Annual Conference on Uncertainty in AI (UAI), Providence, Rhode Island, August 1997, pages 302--313. Winner of the UAI '97 best student paper award.
Learning Probabilistic Relational Models, N. Friedman, L. Getoor, D. Koller and A. Pfeffer. Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, August 1999, pages 1300--1307.
From Instances to Classes in Probabilistic Relational Models. L. Getoor, D. Koller, N. Friedman. Proceedings of the ICML-2000 Workshop on Attribute-Value and Relational Learning: Crossing the Boundaries, Stanford, CA (June, 2000).
Learning Probabilistic Models of Relational Structure, L. Getoor, N. Friedman, D. Koller, and B. Taskar. Eighteenth International Conference on Machine Learning (ICML), Williams College, June 2001.
Probabilistic Models of Text and Link Structure for Hypertext Classification, L. Getoor, E. Segal, B. Taskar, D. Koller. IJCAI01 Workshop on "Text Learning: Beyond Supervision", Seattle, Washington, August 2001.
Probabilistic Clustering in Relational Data, B. Taskar, E. Segal, and D. Koller. Seventeenth International Joint Conference on Artificial Intelligence (IJCAI01), Seattle, Washington, August 2001.
Learning Probabilistic Relational Models, L. Getoor, N. Friedman, D. Koller, and A. Pfeffer. Invited contribution to the book Relational Data Mining, S. Dzeroski and N. Lavrac, Eds., Springer-Verlag, 2001 (to appear).
Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman, D. Koller, B. Taskar, Journal of Machine Learning Research, 2002.
Autocorrelation and linkage cause bias in evaluation of relational learners. D. Jensen and J. Neville (2002). In Proceedings of The Twelfth International Conference on Inductive Logic Programming (ILP 2002). Springer-Verlag.
Linkage and autocorrelation cause feature selection bias in relational learning. D. Jensen and J. Neville (2002). Proceedings of the Nineteenth International Conference on Machine Learning (ICML2002). Morgan Kaufmann. pp. 259-266.
Discriminative Probabilistic Models for Relational Data, B. Taskar, P. Abbeel and D. Koller. Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI02), Edmonton, Canada, August 2002.
C. Anderson, P. Domingos and D. Weld, Relational Markov Models and their Application to Adaptive Web Navigation. Proceedings of the Eighth International Conference on Knowledge Discovery and Data Mining (pp. 143-152), 2002. Edmonton, Canada: ACM Press.
Link-based Classification, Q. Lu and L. Getoor. International Conference on Machine Learning, Washington, DC, August 2003.
Neville, J., M. Rattigan and D. Jensen (2003). Statistical Relational Learning: Four Claims and a Survey. Proceedings of the Workshop on Learning Statistical Models from Relational Data, 18th International Joint Conference on Artificial Intelligence.
Jensen, D., J. Neville and M. Hay (2003). Avoiding Bias When Aggregating Relational Data with Degree Disparity. Proceedings of the 20th International Conference on Machine Learning.
Neville, J., D. Jensen, L. Friedland and M. Hay (2003). Learning Relational Probability Trees. Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Neville, J., D. Jensen and B. Gallagher (2003). Simple Estimators for Relational Bayesian Classifiers. Proceedings of The Third IEEE International Conference on Data Mining.
Statistical Relational Learning for Link Prediction. Alexandrin Popescul, Lyle H. Ungar, Workshop on Learning Statistical Models from Relational Data at IJCAI 2003.
S. Sanghai, P. Domingos and D. Weld, Dynamic Probabilistic Relational Models. Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, 2003. Acapulco, Mexico: Morgan Kaufmann.
Learning on the Test Data: Leveraging Unseen Features, B. Taskar, M.F. Wong and D. Koller. Twentieth International Conference on Machine Learning (ICML03), Washington, DC, August 2003.
Perlich, C. and F. Provost. Aggregation-based Feature Invention and Relational Concept Classes. In Proceedings of the Ninth SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003).
Max-Margin Markov Networks, B. Taskar, C. Guestrin and D. Koller. To appear in NIPS-2003.

Other Resources

Tutorial on Learning PRMs by Lise Getoor (large PPT file).

Tom Dietterich, tgd@cs.orst.edu