Introduction to Causal Discovery

DRP Autumn 2022

Mentee: Mutong (Mandy) Zhang

In this project, we will take a graphical approach to learn causal relationships between different variables in a system. First, we will learn how to represent causality using Directed Acyclic Graphs (DAGs). We will cover concepts such as d-separation, Markov property, and faithfulness. We will then describe two algorithms to learn causality from observational data. The first is a constraint-based algorithm called PC (named after Peter Spirtes and Clark Glymour who first described it). The second is a score-based algorithm called Greedy Equivalence Search (GES). Then, we will find a real dataset and apply these methods. If time permits, we can explore other ideas such as computational complexity, causal sufficienct, and background knowledge.

Schedule

This schedule is bound to change depending on how much we are able to do every week. Please check the schedule every week to find the correct readings. The following acronyms are used in assigned readings (see end of syllabus for full references):

Note On October 21, we decided to move away from the CPS textbook as it was a little archaic in notation and explanation.

September 28, 2022: Introduction & probability review
Optional reading: CMRI Chapter 1.1

October 3, 2022: Graphical models I
Readings: CPS Chapter 2.1-2.3
Bonus: CMRI Chapter 1.2-1.4

October 10, 2022: Skipped

October 17, 2022: Grahpical models II
Readings: CPS Chapter 3.4, 3.7.1
Bonus: A characterization of Markov equivalence classes for acyclic digraphs. URL

October 24, 2022: PC algorithm I
Readings: Read until page 11 (no need for score-based methods). Causal Structure Learning and Inference: A Selective Review URL
Bonus: Review of Causal Discovery Methods Based on Graphical Models URL

October 31, 2022: PC algorithm II
Readings: Background knowledge URL

November 7, 2022: GES algorithm
Readings: Read from page 11 to 13 (until start of 3.1.2). Causal Structure Learning and Inference: A Selective Review URL

November xx, 2022: Midterm feedback due

November 14, 2022: Implementation on real data
Readings: TBD

November 21, 2022: Implementation on real data
Readings: TBD

November 28, 2022: Implementation on real data
Readings: TBD

December 5, 2022: Practice presentation
Readings: TBD

December xx, 2022: Final presentation

December xx, 2022: Final report due

If we have time or there is interest, we may also discuss the following topics:

References

  1. Pearl, J. (2009). Causality. Cambridge university press.
  2. Spirtes, P., Glymour, C. N., Scheines, R., & Heckerman, D. (2000). Causation, prediction, and search. MIT press. URL
  3. Andersson, S. A., Madigan, D., & Perlman, M. D. (1997). A characterization of Markov equivalence classes for acyclic digraphs. The Annals of Statistics, 25(2), 505-541. URL
  4. Kalisch, M., & Bühlmann, P. (2014). Causal structure learning and inference: a selective review. Quality Technology & Quantitative Management, 11(1), 3-21. URL
  5. Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in genetics, 10, 524.
  6. Meek, C. (1995). Causal inference and causal explanation with background knowledge. In In Conference on Uncertainty in Artificial Intelligence. URL