Causal Data Science for Business Analytics
Hamburg University of Technology
Monday, 24. June 2024
Source: https://towardsdatascience.com (2023).
Session | Date | Topic |
---|---|---|
1 | April 15 & 16 | Introduction to Causal Inference |
2 | April 22 & 23 | Graphical Causal Models |
3 | April 29 & 30 | Randomized Experiments |
4 | May 6 & 7 | Observed Confounding |
5 | May 13 & 14 | Double Machine Learning |
- | May 20 & 21 | Holiday |
6 | May 27 & 28 | Effect Heterogeneity |
7 | June 3 & 4 | Unobserved Confounding & Instrumental Variables |
8 | June 10 & 11 | Difference-in-Difference |
9 | June 17 & 18 | Synthetic Control |
10 | June 24 & 25 | Regression Discontinuity |
11 | July 1 & 2 | Causal Mediation |
12 | July 8 & 9 | Further Topics in Causal Machine Learning |
Lecture - Causal Data Science:
Monday, 11.30 - 13.00, Building D, Room D - 1.023
Lab - Business Analytics with Causal Data Science:
Tuesday, 15.00 - 16.30, Building O, Room O - 0.007
Examination:
10 challenges related to each topic documented in a lab journal
Contact:
Oliver Mork (oliver.mork@tuhh.de)
(Democritus)
(Aristotle)
Source: Peters, Jonas. 2015. Causality: Lecture Notes, ETH Zurich.
Source: Neal, Brady (2020). Introduction to causal inference from a Machine Learning Perspective. Course Lecture Notes (draft).
library(ggdag)
library(ggplot2)
# Node positions used for plotting
coord_dag <- list(
  x = c(SEM = 0, Intent = 1, Sales = 2),
  y = c(SEM = 0, Intent = 1, Sales = 0)
)
# Each formula reads "child ~ parent":
# Intent -> SEM, SEM -> Sales, Intent -> Sales
dag <- ggdag::dagify(SEM ~ Intent,
                     Sales ~ SEM,
                     Sales ~ Intent,
                     coords = coord_dag)
# Draw nodes, edges, and labels
dag %>%
  ggplot(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_dag_point(colour = "grey") +
  geom_dag_edges() +
  geom_dag_text(colour = "black", size = 5) +
  theme_dag(legend.position = "none")
Position | Women | Men |
---|---|---|
Non-management: | $3,163.30 (87) | $3,015.18 (59) |
Management: | $5,592.44 (13) | $5,319.82 (41) |
\[ \left(\frac{87 + 59}{200} \cdot \$148.12\right) + \left(\frac{13 + 41}{200} \cdot \$272.62\right) \approx \$181.74 \]
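The weighted average above can be reproduced in a few lines of base R. The following is a minimal sketch that copies the cell means and head counts from the salary table; the variable names (`mean_women`, `n_women`, `position_share`, and so on) are illustrative and do not come from the slides. It also computes the unadjusted difference in overall means for comparison.

```r
# Cell means and head counts copied from the salary table
# (order: non-management, management). All names are illustrative.
mean_women <- c(non_mgmt = 3163.30, mgmt = 5592.44)
mean_men   <- c(non_mgmt = 3015.18, mgmt = 5319.82)
n_women    <- c(non_mgmt = 87, mgmt = 13)
n_men      <- c(non_mgmt = 59, mgmt = 41)

# Unadjusted comparison: overall mean salary per gender
overall_women  <- weighted.mean(mean_women, n_women)
overall_men    <- weighted.mean(mean_men, n_men)
unadjusted_gap <- overall_women - overall_men

# Adjusted comparison: within-position gaps (women minus men),
# weighted by each position's share of all 200 employees
position_share <- (n_women + n_men) / sum(n_women + n_men)
within_gap     <- mean_women - mean_men
adjusted_gap   <- sum(position_share * within_gap)

print(c(unadjusted = unadjusted_gap, adjusted = adjusted_gap))
```

The adjusted gap is about $181.74 in favor of women, while the unadjusted gap is about $481 in favor of men: the sign flips once position is held fixed.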
data <- data.frame(
Salary = c(5319.82, 3015.18, 5592.44, 3163.30, 3960.08, 3479.09),
Position = c("Management", "Non-Management", "Management", "Non-Management", "All Positions", "All Positions"),
Gender = c("Male", "Male", "Female", "Female", "Male", "Female")
)
library(ggplot2)
data |>
ggplot(aes(x=Gender, y=Salary, group=Position, colour=Position)) +
geom_line() + geom_point() +
theme_bw()
Position | Healthy Lifestyle | Unhealthy Lifestyle |
---|---|---|
Non-management: | $3,163.30 (87) | $3,015.18 (59) |
Management: | $5,592.44 (13) | $5,319.82 (41) |
library(ggdag)
library(ggplot2)
coord_dag <- list(
  x = c(Gender = 0, Management = 1, Salary = 2),
  y = c(Gender = 0, Management = 1, Salary = 0)
)
# "child ~ parent": Gender -> Management, Gender -> Salary, Management -> Salary.
# Management lies on a causal path from Gender to Salary (a mediator).
dag <- ggdag::dagify(Management ~ Gender,
                     Salary ~ Gender,
                     Salary ~ Management,
                     coords = coord_dag)
dag %>%
  ggplot(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_dag_point(colour = "grey") +
  geom_dag_edges() +
  geom_dag_text(colour = "black", size = 5) +
  theme_dag(legend.position = "none")
library(ggdag)
library(ggplot2)
coord_dag <- list(
  x = c(Lifestyle = 0, Management = 1, Salary = 2),
  y = c(Lifestyle = 0, Management = 1, Salary = 0)
)
# "child ~ parent": Management -> Lifestyle, Lifestyle -> Salary, Management -> Salary.
# Management is a common cause of Lifestyle and Salary (a confounder).
dag <- ggdag::dagify(Lifestyle ~ Management,
                     Salary ~ Lifestyle,
                     Salary ~ Management,
                     coords = coord_dag)
dag %>%
  ggplot(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_dag_point(colour = "grey") +
  geom_dag_edges() +
  geom_dag_text(colour = "black", size = 5) +
  theme_dag(legend.position = "none")
(Rubin, 1975; Holland, 1986)
(Bertrand and Mullainathan, 2004)
The potential outcomes framework (Neyman, 1923; Rubin, 1974) is a way to formalize this idea.
For the potential outcomes and the individual treatment effect (ITE) to be precisely defined, we need to make an initial set of assumptions:
Assumption 1: “No Interference”
Unit \(i\)’s potential outcomes do not depend on other units’ treatments.
\(Y_i(t_1, \ldots, t_{i-1}, t_i, t_{i+1}, \ldots, t_n) = Y_i(t_i)\)
Assumption 2: “Consistency”
There is only one version of each treatment level. Equivalently, we require that the treatment levels be well-defined, with no ambiguity at least for the outcome of interest. If the treatment is \(T\), then the observed outcome \(Y\) is the potential outcome under treatment \(T\).
Formally, \(T = t \Rightarrow Y = Y(t)\), or equivalently \(Y = Y(T)\).
Assumption 3: “Stable Unit Treatment Value Assumption (SUTVA)”
Both Assumptions 1 and 2 hold: \(Y_i = Y_i(T_i)\)
\(i\) | \(T_i\) | \(Y_i\) | \(Y_i(1)\) | \(Y_i(0)\) | \(Y_i(1) - Y_i(0)\) |
---|---|---|---|---|---|
1 | 0 | 0 | ? | 0 | ? |
2 | 1 | 1 | 1 | ? | ? |
3 | 1 | 0 | 0 | ? | ? |
4 | 0 | 0 | ? | 0 | ? |
5 | 0 | 1 | ? | 1 | ? |
6 | 1 | 1 | 1 | ? | ? |
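The table can be rebuilt from the observed columns alone, which makes the missing-data structure explicit. The sketch below is an illustration in base R; the data frame `obs` and its column names (`treat`, `y`, `y1`, `y0`, `ite`) are assumptions made for this example. Consistency fills in one potential outcome per unit, and the other stays `NA`.

```r
# Observed data from the table: treatment and outcome for six units
obs <- data.frame(
  i     = 1:6,
  treat = c(0, 1, 1, 0, 0, 1),
  y     = c(0, 1, 0, 0, 1, 1)
)

# Consistency: the observed outcome equals the potential outcome of the
# treatment actually received; the other potential outcome is missing (NA).
obs$y1 <- ifelse(obs$treat == 1, obs$y, NA)
obs$y0 <- ifelse(obs$treat == 0, obs$y, NA)

# The individual treatment effect needs both potential outcomes,
# so it is NA for every unit.
obs$ite <- obs$y1 - obs$y0

print(obs)
```

Every entry of `ite` is `NA`, matching the column of question marks in the table.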
Assumption 4: “Ignorability / Exchangeability”
Ignorability means that we can ignore how people selected their treatment and analyze the data as if treatment had been randomly assigned.
Exchangeability means that the treatment and control groups could be swapped, and one would still obtain the same outcomes. This implies that the groups are the same in all relevant aspects other than the treatment.
Formally, \((Y(1), Y(0)) \perp\!\!\!\perp T\).
\(i\) | \(T_i\) | \(Y_i\) | \(Y_i(1)\) | \(Y_i(0)\) |
---|---|---|---|---|
1 | 0 | 0 | ? | 0 |
4 | 0 | 0 | ? | 0 |
5 | 0 | 1 | ? | 1 |
2 | 1 | 1 | 1 | ? |
3 | 1 | 0 | 0 | ? |
6 | 1 | 1 | 1 | ? |
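Under Assumption 4, the mean of the observed \(Y_i(1)\) values in the treated group stands in for \(\mathbb{E}[Y(1)]\), and likewise for the control group, so the ATE reduces to a difference in observed group means. A minimal sketch with the six units from the table, in base R; the vector names `treat` and `y` are illustrative.

```r
# Units in the order shown in the table: 1, 4, 5 (control), 2, 3, 6 (treated)
treat <- c(0, 0, 0, 1, 1, 1)
y     <- c(0, 0, 1, 1, 0, 1)

# Under ignorability: E[Y(1)] = E[Y | T = 1] and E[Y(0)] = E[Y | T = 0]
mean_treated <- mean(y[treat == 1])
mean_control <- mean(y[treat == 0])

# Difference-in-means estimate of the ATE
ate_hat <- mean_treated - mean_control
print(ate_hat)
```

Here the treated mean is 2/3 and the control mean is 1/3, giving an estimate of 1/3. The estimate is only as credible as the ignorability assumption itself.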
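Assumption 5: “Conditional Exchangeability / Unconfoundedness”
Within groups of units that share the same covariate values \(X\), treatment is as good as randomly assigned.
Formally, \((Y(1), Y(0)) \perp\!\!\!\perp T \, | \, X\).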
Assumption 6: “Positivity / Overlap / Common Support”
For all values of covariates \(x\) present in the population of interest (i.e., \(x\) such that \(P(X=x) > 0\)), we have \(0 < P(T=1 \mid X=x) < 1\).
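With a discrete covariate, positivity can be inspected by tabulating the share of treated units in each stratum. The sketch below uses a small hypothetical data set (not from the slides); the names `d`, `x`, `treat`, `propensity`, and `violates` are assumptions made for this illustration.

```r
# Hypothetical data: x is a binary covariate, treat is the binary treatment
d <- data.frame(
  x     = c(0, 0, 0, 0, 1, 1, 1, 1),
  treat = c(0, 1, 0, 1, 1, 1, 1, 1)
)

# Empirical propensity P(T = 1 | X = x) within each covariate stratum
propensity <- tapply(d$treat, d$x, mean)

# Positivity needs both treated and control units in every stratum,
# i.e. a propensity strictly between 0 and 1
violates <- propensity <= 0 | propensity >= 1

print(propensity)
print(violates)
```

In this toy data the stratum `x = 1` contains only treated units, so there is no control group to compare against and the conditional effect for that stratum cannot be estimated from the data.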
Theorem 1: “Identification of the ATE”
Given no interference, consistency, unconfoundedness, and positivity:
\(\tau = \mathbb{E}[Y_i(1)] - \mathbb{E}[Y_i(0)] = \mathbb{E}_X[\mathbb{E}[Y_i \mid T_i=1, X_i] - \mathbb{E}[Y_i \mid T_i=0, X_i]]\)
\[\begin{align*}
\tau = \mathbb{E}[\tau_i] &= \mathbb{E}[Y_i(1) - Y_i(0)] \\
&= \mathbb{E}[Y_i(1)] - \mathbb{E}[Y_i(0)] \\
& \text{(linearity of expectation)} \\
&= \mathbb{E}_X [\mathbb{E}[Y_i(1) \mid X_i]] - \mathbb{E}_X [\mathbb{E}[Y_i(0) \mid X_i]] \\
&\text{(law of iterated expectations)} \\
&= \mathbb{E}_X [\mathbb{E}[Y_i(1) \mid T_i = 1, X_i]] - \mathbb{E}_X [\mathbb{E}[Y_i(0) \mid T_i = 0, X_i]] \\
&\text{(unconfoundedness and positivity)} \\
&= \mathbb{E}_X [\mathbb{E}[Y_i \mid T_i = 1, X_i]] - \mathbb{E}_X [\mathbb{E}[Y_i \mid T_i = 0, X_i]] \\
&\text{(consistency)}
\end{align*}\]
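For a discrete covariate, the last line of the derivation translates directly into a plug-in estimator: take the treated-minus-control difference in mean outcomes within each stratum of \(X\), then average those differences using the stratum shares as weights. The following is a minimal sketch in base R, not a definitive implementation; the function `adjusted_ate()` and the data frame `toy` are hypothetical and were made up for this illustration.

```r
# Plug-in version of the adjustment formula for a discrete covariate x:
#   sum over x of P(X = x) * (E[Y | T = 1, X = x] - E[Y | T = 0, X = x])
adjusted_ate <- function(y, treat, x) {
  strata <- split(data.frame(y = y, treat = treat), x)
  # Within-stratum difference in mean outcomes
  # (needs treated and control units in every stratum, i.e. positivity)
  diffs <- sapply(strata, function(s) {
    mean(s$y[s$treat == 1]) - mean(s$y[s$treat == 0])
  })
  # Weight each stratum by its share of the sample
  weights <- sapply(strata, nrow) / length(y)
  sum(weights * diffs)
}

# Hypothetical toy data (not from the slides): units with x = 1 are
# treated more often and also have higher outcomes
toy <- data.frame(
  x     = c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1),
  treat = c(1, 0, 0, 0, 1, 1, 1, 1, 0, 0),
  y     = c(3, 1, 1, 1, 6, 6, 6, 6, 5, 5)
)

naive    <- mean(toy$y[toy$treat == 1]) - mean(toy$y[toy$treat == 0])
adjusted <- adjusted_ate(toy$y, toy$treat, toy$x)
print(c(naive = naive, adjusted = adjusted))
```

In the toy data the naive difference in means is 2.8, while the adjusted estimate is 0.4 · 2 + 0.6 · 1 = 1.4, because the naive comparison mixes the treatment effect with the difference between strata.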
“Average Treatment Effect on the Treated” (ATT):
\(ATT = \mathbb{E}[Y_i(1)|T_i=1] - \mathbb{E}[Y_i(0)|T_i=1]\)
“Conditional Average Treatment Effect” (CATE):
\(CATE = \mathbb{E}[Y_i(1)|X_i=x] - \mathbb{E}[Y_i(0)|X_i=x]\)
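The difference between the ATE, ATT, and CATE is easiest to see in an artificial population where both potential outcomes are known. The sketch below is purely illustrative: the data frame `pop` and its columns (`x`, `treat`, `y0`, `y1`) are hypothetical, and real data never reveal both potential outcomes for the same unit.

```r
# Hypothetical population with both potential outcomes known
# (only possible in a simulation or thought experiment)
pop <- data.frame(
  x     = c(0, 0, 0, 1, 1, 1),
  treat = c(0, 0, 1, 0, 1, 1),
  y0    = c(1, 1, 1, 2, 2, 2),
  y1    = c(2, 2, 2, 5, 5, 5)
)

# Individual treatment effects
ite <- pop$y1 - pop$y0

# ATE: average over everyone
ate <- mean(ite)

# ATT: average over the units that were actually treated
att <- mean(ite[pop$treat == 1])

# CATE: average within each covariate value x
cate <- tapply(ite, pop$x, mean)

print(ate)
print(att)
print(cate)
```

Here the ATE is 2, the CATE is 1 for `x = 0` and 3 for `x = 1`, and the ATT is 7/3, since two of the three treated units come from the high-effect stratum.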
Thank you for your attention!
Causal Data Science: (1) Introduction to Causal Inference