Comparative Analysis of Constructed and Random DAGs for Automated Causal Discovery

Searching for Patterns in How We Search for Causality

Colin Shea-Blymyer | Vartan Kesiz Abnousi

c0lin@vt.edu | vkesizab@vt.edu

Product of Data Mining Large Networks and Time Series with Dr. Prakash

ABSTRACT

Automated causal discovery is used to find the relationships between variables without needing to intervene. These relationships can be represented as directed acyclic graphs (DAGs), where vertices are variables and edges are relationships between those variables. These causal graphs are used to analyze cause and effect in a wide array of fields, from urban planning to epidemiology. However, little research has been done on the structure of the graphs themselves, or on how they compare to randomly generated DAGs.

This project aims to compare and contrast large causal models, non-causal Bayesian networks, natural DAGs, and randomly generated DAGs. From this we hope to gain insight into a common structure (or lack thereof) for relationships between variables in the world, and to determine whether causal dynamics can be modeled, abstractly, through random graphs. Further, this analysis may serve to challenge assumptions about how causal networks from different domains look, and about how the choice of algorithm in causal inference affects an inferred graph's structure.
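
To make the comparison concrete, the following is a minimal sketch of one way a random DAG could be generated and summarized. It is an illustration under stated assumptions, not the project's actual code: the function names (random_dag, is_acyclic, degree_sequences, mean_directed_path_length), the edge probability p, and the choice of an ordered Erdős–Rényi style generator are all hypothetical.

```python
import random
from collections import deque


def random_dag(n, p, seed=None):
    """Return {node: set(children)} for a random DAG on n nodes.

    Assumption: nodes are placed in a random hidden order and each
    forward pair (earlier -> later) gets an edge with probability p,
    which guarantees acyclicity.
    """
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    adj = {v: set() for v in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[order[i]].add(order[j])
    return adj


def is_acyclic(adj):
    """Check acyclicity with Kahn's topological-sort algorithm."""
    indeg = {v: 0 for v in adj}
    for children in adj.values():
        for c in children:
            indeg[c] += 1
    queue = deque(v for v, d in indeg.items() if d == 0)
    visited = 0
    while queue:
        v = queue.popleft()
        visited += 1
        for c in adj[v]:
            indeg[c] -= 1
            if indeg[c] == 0:
                queue.append(c)
    return visited == len(adj)


def degree_sequences(adj):
    """Return (in_degree, out_degree) dictionaries keyed by node."""
    out_deg = {v: len(children) for v, children in adj.items()}
    in_deg = {v: 0 for v in adj}
    for children in adj.values():
        for c in children:
            in_deg[c] += 1
    return in_deg, out_deg


def mean_directed_path_length(adj):
    """Average shortest directed path length over reachable ordered pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for c in adj[v]:
                if c not in dist:
                    dist[c] = dist[v] + 1
                    queue.append(c)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0
```

The same summary functions could be applied to an inferred causal graph stored in the same adjacency-dictionary form, so that its degree sequences and path lengths can be set side by side with those of random DAGs of matching size.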

Project Report
We show some interesting trends in small clustering structure, path lengths, and degree distribution. Some causal properties are discussed, and our results are tied to assumptions made in causal inference. Finally, we discuss the future of this project and lay out the remaining work.

Project Presentation
A slide-based presentation of the project

Project Code
Collection of code used to perform our analysis

COLIN SHEA-BLYMYER

Colin is a computer science graduate student at Virginia Tech. He does research in machine learning, the automation of scientific discovery, and their applications to animal swarming models and behavior. He has done work in industry on natural language processing, cryptography, and system security.
Read More Here

VARTAN KESIZ ABNOUSI

Vartan is a PhD candidate in Economics at Virginia Tech. He does research in econometric methodology, time series analysis, and causal discovery and its extension in machine learning models. He has worked in the financial industry as a Quantitative Analyst on financial algorithmic modeling for portfolio optimization.

CONTACT

Colin
c0lin@vt.edu

Vartan
vkesizab@vt.edu