CS 520 - Causal Inference and Learning

Contact:

Time: TTh 11am-12:15pm

Location: BBS 311

Office hours: Tue 2-4pm in SEO 1140

Two roads diverged in a yellow wood,
And sorry I could not travel both [...]
I took the one less traveled by,
And that has made all the difference.
--Robert Frost

"The dramatic success in machine learning has led to an explosion of AI applications and increasing expectations for autonomous systems that exhibit human-level intelligence. These expectations, however, have met with fundamental obstacles that cut across many application areas. [... One of the obstacles] concerns the understanding of cause-effect connections. This hallmark of human cognition is [...] a necessary (though not sufficient) ingredient for achieving human-level intelligence. This ingredient should allow computer systems to choreograph a parsimonious and modular representation of their environment, interrogate that representation, distort it by acts of imagination and finally answer "What if?" kind of questions. Examples are interventional questions: 'What if I make it happen?' and retrospective or explanatory questions: 'What if I had acted differently?' or 'what if my flight had not been late?' Such questions cannot be articulated, let alone answered by systems that operate in purely statistical mode, as do most learning machines today."
--From "The Seven Tools of Causal Inference with Reflections on Machine Learning" by Judea Pearl

Course Description

Causal reasoning is an integral part of data science and artificial intelligence. The goal of the course on Causal Inference and Learning is to introduce students to methodologies and algorithms for causal reasoning and connect various aspects of causal inference, including methods developed within computer science, statistics, and economics. The course will cover state-of-the-art research on causal reasoning and prepare students to conduct research in this area.

Course objectives

This is a seminar course. The goal of the course is to expose graduate students to state-of-the-art research on causal inference. The class project plays a central role in the course, and it should be taken as an opportunity to connect your research area of interest to the course topics.

Course format

Approximately one third of the course will be lecture-based using the following book by Judea Pearl, Madelyn Glymour and Nicholas Jewell: Causal Inference in Statistics: A Primer (Wiley Press 2016). The rest of the course will be student-led presentations of recent papers from the growing body of causal inference research.

The class will meet synchronously in person. We will be using Piazza for all course discussions and materials.

Student deliverables

Homework assignments
Written paper summaries
In-class participation
Research paper presentations
Course project (proposal, progress report, final presentation, final report)

Prerequisites

CS 412 Introduction to Machine Learning or consent of the instructor.

Textbooks

Required textbook:
[CISP] Judea Pearl, Madelyn Glymour, Nicholas Jewell (2016). Causal Inference in Statistics: A Primer. Wiley Press. Errata. (ebook from library). Solutions to selected problems. DAGitty solutions to selected problems.

Optional textbooks:
[WHY] Judea Pearl, Dana Mackenzie (2018): The Book of Why. Basic Books.
[CMRI] Judea Pearl (2009): Causality: Models, Reasoning, and Inference. Cambridge University Press. (ebook from library)
[ECI] Jonas Peters, Dominik Janzing, Bernhard Schölkopf (2017): Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press.
[CPS] Spirtes, Glymour, Scheines (2000): Causation, Prediction, and Search. MIT Press.
[CI] Hernán, Robins (2020): Causal Inference: What If. Chapman & Hall.
[CISSBS] Guido Imbens and Donald Rubin (2015): Causal Inference for Statistics, Social and Biomedical Sciences. Cambridge University Press.

Course schedule

Check Piazza for an up-to-date schedule.

Date	Topic	Assigned Reading
8/23	Introduction	Syllabus
8/25	Why causality?	Main paper: Judea Pearl. The Seven Tools of Causal Inference with Reflections on Machine Learning. Communications of the ACM. 2019. Optional: Brenden M. Lake, Tomer D. Ullman, Joshua B. Tenenbaum, Samuel J. Gershman. Building Machines That Learn and Think Like People. 2016.
8/30	Hypothesis testing and randomized controlled trials	CIT Ch. 11; PTDS Ch. 18
9/1	Probability and statistics: review	CISP Ch. 1.1-1.3
9/6	Structural causal models	CISP Ch. 1.4-1.5
9/8	Graphical models	CISP Ch. 2
9/13	Interventions, adjustment formula, back-door criterion	CISP Ch. 3.1-3.3
9/15	On learning in the presence of biased data and strategic behavior
9/20	Front-door criterion, covariate-specific effects, inverse probability weighting	CISP Ch. 3.4-3.6
9/22	Bias and emergent instabilities in socially-embedded algorithms
9/27	Mediation and causal inference in linear systems	CISP Ch. 3.7-3.8
9/29	Defining and computing counterfactuals	CISP Ch. 4.1-4.2
10/4	Counterfactual probabilities, counterfactuals in linear systems	CISP Ch. 4.3
10/6	Counterfactuals: attribution, mediation, practical uses	CISP Ch. 4.4-4.5
10/11	Do calculus and transportability	Main paper: Bareinboim, Pearl. Causal inference and the data fusion problem. PNAS 2016. Optional: Bareinboim, Pearl. External validity: From do-calculus to transportability across populations. Stat. Science 2014.
10/13	Causal estimands under interference	Main paper: Hudgens, Halloran. Toward Causal Inference With Interference. JASA 2008. Optional paper: Zheleva, Arbour. Causal Inference from Network Data. Tutorial. KDD 2021.
10/18	Two-stage randomization	Main paper: Fatemi, Zheleva. Minimizing interference and selection bias in network experiment design. ICWSM 2020. Optional paper: Ugander, Karrer, Backstrom, Kleinberg. Graph cluster randomization: Network exposure to multiple universes. KDD 2013.
10/20	Network experiment designs	Main paper: Toulis, Kao. Estimation of Causal Peer Influence Effects. ICML 2013. Optional paper: Eckles, Kizilcec, Bakshy. Estimating peer effects in networks with peer encouragement designs. PNAS 2016.
10/25	Social media and polarization	Main paper: Bail, Argyle, Brown, Bumpus, Chen, Hunzaker, Lee, Mann, Merhout, Volfovsky. Exposure to opposing views on social media can increase political polarization. PNAS 2018. Optional paper: Alvari, Shaabani, Sarkar, Beigi, Shakarian. Less is More: Semi-Supervised Causal Inference for Detecting Pathogenic Users in Social Media. WWW 2019.
10/27	Heterogeneous effects in network experiments	Main paper: Yuan, Altenburger, Kooti. Causal Network Motifs: Identifying Heterogeneous Spillover Effects in A/B Tests. WebConf 2021. Optional paper: Yuan, Altenburger. Characterizing Interference Heterogeneity and Improving Estimation for Experiments in Networks. SSRN 2022.
11/1	Interference and inverse probability weighting	Main paper: Tchetgen Tchetgen, VanderWeele. On causal inference in the presence of interference. SMMR 2012. Optional paper: Qu, Xiong, Liu, Imbens. Efficient Treatment Effect Estimation in Observational Studies under Heterogeneous Partial Interference. arXiv 2021.
11/3	Structural causal models for interference	Main paper: Ogburn, VanderWeele. Causal Diagrams for Interference. Statistical Science 2014. Optional paper: Tran & Zheleva. Heterogeneous Peer Effects in the Linear Threshold Model. AAAI 2022.
11/8	Election Day: no class
11/10	Abstract ground graphs and relational d-separation	Main paper: Maier, Marazopoulou, Arbour, Jensen. Reasoning about Independence in Probabilistic Models of Relational Data. arXiv 2013.
11/15	Relational causal discovery	Main paper: Maier, Marazopoulou, Arbour, Jensen. A sound and complete algorithm for learning causal models from relational data. UAI 2013. Optional paper: Spirtes, Zhang. Search for causal models. Chapter in Handbook of Graphical Models. 2018.
11/17	Chain graphs and interference	Main paper: Bhattacharya, Malinsky, Shpitser. Causal Inference Under Interference And Network Uncertainty. UAI 2019. Optional paper: Ogburn, Shpitser, Lee. Causal inference, social networks and chain graphs. JRSSB 2020.
11/22	Causal effects on hypergraphs	Main paper: Ma, Wan, Yang, Li, Hecht, Teevan. Learning Causal Effects on Hypergraphs. KDD 2022. Optional: Shalit, Johansson, Sontag. Estimating individual treatment effect: generalization bounds and algorithms. ICML 2017.
11/24	No class (Thanksgiving)
11/29	Presentations	1. On the potential causes of divergence from party policy in roll call votes 2. Causal inference of position and herding bias in helpfulness voting 3. Benchmarking causal discovery algorithms on medical data 4. Effect of words on tweet popularity 5. Spillover effect of business threats on review characteristics
12/1	Presentations	1. Causal analysis and the impact of COVID-19 on Reddit posts 2. Effects of social media on mental health 3. Causality between substance abuse and mental health 4. Causal analysis of hotel booking cancellations 5. A counterfactual evaluation of song recommendation systems
12/5	Finals week