Thomas G. Dietterich

Distinguished Professor (Emeritus) and Director of Intelligent Systems
Institute for Collaborative Robotics and Intelligent Systems (CoRIS)
School of Electrical Engineering and Computer Science
1148 Kelley Engineering Center
Oregon State University
Corvallis, Oregon 97331-5501

E-mail: tgd@cs.orst.edu Twitter: @tdietterich Mastodon: @tdietterich@mastodon.social
Phone: +1-541-737-5559 Office: KEC 2063
PGP Public Key
(Last updated November 28, 2022)

Page Contents: Research, Prospective Students, Publications, Talks, CV, Software, Students and Staff, Course Materials, Bio Sketch

Current Research

"If you invent a breakthrough in artificial intelligence, so machines can learn," Mr. Gates responded, "that is worth 10 Microsofts." (Quoted in NY Times, Monday March 3, 2004). Deep learning is providing that breakthrough, and machine learning is now transforming computer science and many other disciplines as well. It is an exciting time to be working in machine learning!

The focus of my research is artificial intelligence and machine learning. How can we make computer systems that adapt and learn from their experience? How can we combine machine learning with other advances in AI to build Integrated Intelligent Systems? How can we make those integrated AI systems robust to errors, both known and unknown? How can we combine human knowledge with massive data sets to expand scientific knowledge and build more useful computer applications? My work combines research on machine learning and AI fundamentals with applications to problems in science and engineering.

Scientific Projects
- Robust Artificial Intelligence. AI methods are widely deployed in many parts of the economy including search engines, speech-enabled systems, computer vision, and natural language translation. Of particular concern are AI applications that involve life-and-death decision making such as self-driving cars, AI control of the power grid, and autonomous weapons systems. Before we deploy AI in such applications, we need to be confident that it will behave correctly.
  With my colleagues and students, I'm pursuing research in safe artificial intelligence including open category supervised learning, safe reinforcement learning, and explainable artificial intelligence. I've reviewed the research challenges in this area in my AAAI Presidential Address, Steps Toward Robust Artificial Intelligence. Slides, Video, Paper. More recently, I've set forth a research agenda based on studies of reliable human organizations, Toward High-Reliability Artificial Intelligence PDF slides.
- Ecosystem Informatics and Computational Sustainability. Oregon State University is a leader in combining computer science and the ecological sciences to build the new discipline of Ecosystem Informatics. Ecosystem Informatics studies methods for collecting, analyzing, and visualizing data on the structure and function of ecosystems. It is part of the larger field of Computational Sustainability.
  Oregon State is also part of the Institute for Computational Sustainability led by Cornell University. This effort seeks to develop novel computational methods to address problems in ecosystem science and sustainable management of the biosphere.
  My group is involved in many Ecosystem Informatics and Computational Sustainability activities:
  - Project TAHMO: Deployment, Cleaning, and Analysis of Sensor Network Data. We are part of Project TAHMO, which seeks to construct and deploy a network of 20,000 hydro-meteorological stations in Africa. We are developing algorithms for sensor placement, data cleaning, recovery from damaged sensors, and analysis of the resulting data. We are building on our previous work with Ethan Dereszynski on dynamic Bayesian network models for sensor data cleaning. Here is a recent presentation on our work: Toward automated quality control for hydro-meteorological weather station data PDF slides.
  - NIPS 2012 Posner Lecture: Challenges for Machine Learning in Computational Sustainability.
  - ICML 2011 Tutorial on Machine Learning in Ecology and Ecosystem Management
- Intelligent Desktop Assistants.
  - TaskTracer. When you come into work in the morning, you don't want to say to your computer "I want to run Word", but rather, "I want to work on my CS534 homework". In other words, you would like a user interface that is organized around your projects and activities rather than around application programs, files, folders, etc. You would also like all of your information in one place rather than scattered across the local file system, network file systems, Dropbox, web sites, email folders, calendar, contacts, etc. TaskTracer extends the Windows UI to provide exactly this functionality. This research has been supported by gifts and grants from Keysight, Inc., Google, Intel, and the DARPA CALO and PPAML projects.
- Fundamental Machine Learning and Artificial Intelligence Research
  - Anomaly Detection. An important capability for AI systems is to be able to detect when an input situation is unusual. For example, anomaly detection can allow machine learning systems to detect when an input case is very different from the training data and hence could lead to extrapolation and poor performance. Anomaly detection methods are also important for detecting novel failures in sensor networks and novel attacks on computer systems. We are developing a range of algorithms for anomaly detection under grants from DARPA, the National Science Foundation, and the Future of Life Institute, and a gift from Huawei, Inc.
    Here is a presentation that summarizes our work on anomaly detection for machine learning applications involving hand-engineered features: Advances in Anomaly Detection. PDF slides. It covers work on benchmarking anomaly detection algorithms, incorporating analyst feedback, feature-based explanations of anomalies, and a PAC theory of anomaly detection.
    I'm currently studying anomaly detection in deep learning, with a focus on anomaly detection in object classification for open category/open set learning. Here is a presentation summarizing my current hypotheses about how anomaly detection in deep learning is different from anomaly detection in feature vector data: Anomaly Detection for OOD and Novel Category Detection PDF slides.
    My students and I have also developed a new benchmarking methodology for comparing anomaly detection methods. Benchmark data sets and scripts are available for download.
  - Exogenous State MDPs. We are studying MDPs where some of the state variables cannot be controlled by the policy. This is a collaboration with George Trimponias. We published our initial results at ICML 2018: Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning. PDF slides. George and I recently submitted a journal manuscript that presents a comprehensive theoretical and experimental study of this area: Reinforcement learning with exogenous states and rewards. arXiv preprint. It shows that we can achieve dramatic speedups in reinforcement learning by identifying and removing exogenous state variables.
- Reviews, tutorials, and books. I have written several review articles and tutorials on machine learning.

Information for Prospective Students

I am no longer accepting new interns, students, or post-docs. I encourage students interested in the Oregon State graduate programs in CS or AI to contact my colleagues. They are listed here: AI and Robotics.

Publications, Curriculum Vita, Software, and Data

Publications
Talks
Curriculum Vita
Downloadable Collection of Error Correcting Codes for use with the error-correcting output coding technique.
Release 1.0 of MAXQ Hierarchical Reinforcement Learning code.
RSW (Recurrent Sliding Window) Package for WEKA for sequential supervised learning.
Implementation of the PCBR region detector developed as part of the BugID project.
Implementation of TreeCRF system and supporting materials from our JMLR paper. Java implementation by Brad Block.
STONEFLY9 Image Database.
EPT29 Image Database.
Starcraft Scouting dataset (for UAI 2012 paper)

Professional Service, Journals, and Book Series

I am the lead moderator for the machine learning (cs.LG) part of CoRR, which is the computer science sub-part of arXiv.
I am a member of the Advisory Board for the Journal of Machine Learning Research, which is an electronic (and hardcopy) journal covering all areas of machine learning.
I am a former President of the Association for the Advancement of Artificial Intelligence.
I am past President of the International Machine Learning Society.

Entrepreneurial Activities

I was a co-founder of Strands (formerly MyStrands; formerly MusicStrands), a recommendation company that was acquired by CRIF S.p.A. in 2020.
I am a co-founder of Smart Desktop. Smart Desktop is now part of Decho, Inc., which is a "cloud computing" effort of EMC. Decho was a spinout of the TaskTracer project.
I am a co-founder and Chief Scientist of BigML. BigML provides large scale cloud-based machine learning services. It has a free tier that is a great place to play around with various machine learning techniques.

Current Students and Staff

Former Students and Staff

Hussein Almuallim, Oil and Energy Professional, Calgary, Canada.
Eric Altendorf, Self-Employed.
Adam Ashenfelter, Tignis, Inc. Seattle, Washington.
Ghulum Bakiri, President at MicroCenter, Bahrain.
Christian Baumberger. Data Scientist SBB CFF FFS, Burgdorf, Bern, Switzerland.
Xinlong Bao. Distinguished Software Engineer, Google Mountain View.
Brian Breck.
Waranun Bunjongsat.
Giuseppe Cerbone. Independent Information Services Professional, Milan, Italy.
Martha Chamberlin.
Hei Chan. Assistant Professor / Project Researcher at the Transdisciplinary Research Integration Center, Japan.
Richard Charon.
Eric Chown, Full Professor, Bowdoin College.
Selina Chu, JPL, Pasadena, CA.
Dan Corpron
Mark Crowley, Assistant Professor, Department of Electrical and Computer Engineering, University of Waterloo.
Diane Damon, Damon Consulting, Portland, OR.
Ethan Dereszynski, ML Engineer, NVIDIA.
Phuoc Do, Vida Lab.
Andrew Emmott. Software Engineer, Lacework.
Nicholas Flann, Associate Professor, Utah State University.
Greg Foltz.
Dan Forrest.
Tony Fountain, Director of the Cyberinfrastructure Lab for Environmental Observing Systems (CLEOS), UC San Diego.
Ashit Gandhi, Founder and Vice-President, Prism Gem, LLC - The Art of Diamond Coloring.
Risheek Garrepalli, Senior Deep Learning Research Engineer at Qualcomm AI Research.
Colin Gerety, Fort Collins, CO.
Arwen Griffioen.
Alex Guyer.
Guohua Hao, Senior Manager, Data Science, TIBCO.
Brandon Harvey, Symantec and Linn-Benton Community College.
Hermann Hild, President, SMI Cognitive Software GmbH.
Jesse Hostetler
Rebecca Hutchinson, Associate Professor of Computer Science and Fisheries and Wildlife, Oregon State University.
Jed Irvine, Senior Faculty Research Assistant (Software Engineer).
Saket Joshi, Senior Engineering (Natural Language Understanding) at Google.
Varad Joshi, Director of Engineering at Elemental Technologies.
Caroline Koff, Hewlett-Packard Corporation, Fort Collins, CO.
Victoria Keiser, Research Programmer, CMU. Masters Thesis (PDF).
Michael Kelm, Research Scientist, Siemens Healthcare.
Eun Bae Kong, Professor, Computer Science, Chungnam National University, South Korea
Bill Langford, Research Associate at RMIT, Melbourne, Australia.
Junyuan Lin, VMware, Seattle.
Liping Liu, Assistant Professor, Tufts University.
Si Liu, Postdoc, Fred Hutchinson Cancer Institute, Seattle.
Dragos Margineantu, The Boeing Company.
Gonzalo Martinez, Assistant Professor, Autonomous University of Madrid.
Sean McGregor. XPRIZE.
Prafulla Mishra, Software Development Manager at eBay.
Thomas Noel. Software Engineer, Nova Dynamics.
Avis Ng.
Soumya Ray, Assistant Professor, Case Western Reserve University.
Angelo Restificar, Principal Machine Learning Engineer, eBay, Seattle.
Ritchey Ruff, Senior SDET, Microsoft.
Yuvraj Sharma. Data Scientist, Sightly.
Dan Sheldon, Associate Professor, University of Massachusetts, Amherst.
Jianqiang Shen. Senior Engineering Manager, Machine Learning at LinkedIn. Doctoral dissertation.
Rongkun Shen. Post-doc, Oregon Health and Science University, Portland.
Michael Shindler, Lecturer at the University of Southern California
Shriprakash Sinha. Ph.D. student TU Delft.
Shahed Sorower, Senior Manager, Data Science at Capital One.
Simone Stumpf. Senior Lecturer, City University London.
Amelia Snyder, Intern at World Resources Institute
Tao Sun, Graduate Student at UMass Amherst.
Majid Alkaee Taleghan. Senior Machine Learning Scientist at eSentire.
Irene Tematelewo. Data Scientist, Microsoft.
Dan Vega, Senior Software Engineer at Valley Inception, LLC.
Mark Vulfson. Microsoft Corporation.
Kiri Wagstaff. Senior Instructor, Oregon State University.
Xin Wang, Senior Scientist at Inome (Intelius).
Dietrich Wettschereck. tarent solutions GmbH, Bonn, Germany.
Pengcheng Wu.
Michael Wynkoop, Technical Fellow at PTC.
Qing Yao, College of Informatics and Electronics. Zhejiang Sci-Tech University. Hangzhou, China.
Tadesse ZeMicheal, Applied Scientist, AI Infrastructure, NVIDIA, Austin, TX.
Wei Zhang, The Boeing Company.
Wei Zhang. Startup Incubator, Beijing. Doctoral Dissertation (PDF).
Valentina Zubek, Principal Statistician, Boehringer Ingelheim.

Previous Courses and Courseware

Machine Learning Resources

ML-News is an email list operated by the IMLS for conference announcements, job positions (including graduate student and postdoc offers), and other items of relevance to the machine learning community. To join, you must have a Google account. Log in to Google and then go to the ML-News main page and click on "Apply to join group". Please include your affiliation in your join request.
How to be a Graduate Student. A great web page with pointers to lots of good resources for graduate students.
Standard Proofreading Symbols that I use when marking corrections on papers.
Computing Research Repository (part of arXiv). I am the moderator for machine learning.
Research Index at Penn State. An invaluable resource for finding online articles, citations, etc.
Journal of Machine Learning Research (JMLR). (Free electronic and hardcopy journal.)
Machine Learning. (Expensive hardcopy journal.)
Journal of Artificial Intelligence Research (JAIR). (Free electronic and hardcopy journal.)
The Machine Learning Database Repository at UC Irvine.
StatLib containing data, algorithms, and other information relevant to statistics.
Knowledge Discovery Mine containing information about knowledge discovery in databases.
NIPS Online Proceedings.
CMU Reinforcement learning group.
Bibliographies on Artificial Intelligence
The DBLP Computer Science Bibliography.
Numerical Recipes Homepage.

My Family's Activities

yOya on MySpace and yOya home page. My son Noah writes songs and plays keyboards for this band.
My daughter Hannah studies volcanoes at the Alaska Volcano Observatory.
Jubilate: The Women's Choir of Corvallis. My wife Carol sings in this choir.

Tom Dietterich, tgd@cs.orst.edu