Conditional Average Treatment Effect estimation with cross-fitting.
cate(
  response_model,
  propensity_model,
  cate_model = ~1,
  contrast = c(1, 0),
  data,
  nfolds = 1,
  rep = 1,
  silent = FALSE,
  stratify = FALSE,
  mc.cores,
  ...
)
response_model: formula or ml_model object (formula => glm)
propensity_model: formula or ml_model object (formula => glm)
cate_model: formula specifying the regression design for the conditional average treatment effects
contrast: treatment contrast (default 1 vs 0)
data: data.frame
nfolds: Number of folds
rep: Number of replications of the cross-fitting procedure
silent: suppress all messages and progress bars
stratify: If TRUE the response_model will be stratified by treatment
mc.cores: Optional number of cores; parallel::mcmapply is used instead of future
...: additional arguments to future.apply::future_mapply
Returns a cate.targeted object.
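As a minimal sketch of the interface (using the simulated data.frame d constructed in the examples below), the nuisance models can be supplied either as formulas, which are then fitted with glm, or as ml_model objects such as predictor_glm:

## Sketch only: formula interface vs. explicit ml_model objects
## (assumes the data.frame d with columns y, a, w1, w2 from the examples below)
cate(cate_model = ~1,
     response_model = y ~ a*(w1 + w2),              # formula => glm
     propensity_model = predictor_glm(a ~ w1 + w2, family = binomial),
     data = d)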
We have observed data \((Y,A,W)\) where \(Y\) is the response variable, \(A\) the binary treatment, and \(W\) covariates. We further let \(V\) be a subset of the covariates. Define the conditional potential mean outcome $$\Psi_{a}(P)(V) = E_{P}[E_{P}(Y\mid A=a, W)\mid V],$$ and let \(m(V; \beta)\) denote a parametric working model. The target parameter is then the minimizer of the mean squared error $$\beta(P) = \operatorname{argmin}_{\beta} E_{P}\big[\{\Psi_{1}(P)(V)-\Psi_{0}(P)(V) - m(V; \beta)\}^{2}\big].$$
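For example, with the intercept-only working model \(m(V; \beta)=\beta_{0}\) (cate_model=~1) the minimizer reduces to the average treatment effect $$\beta_{0}(P) = E_{P}[\Psi_{1}(P)(V)-\Psi_{0}(P)(V)] = E_{P}[E_{P}(Y\mid A=1, W) - E_{P}(Y\mid A=0, W)],$$ and with \(m(V; \beta)=\beta_{0}+\beta_{1}V\) the estimand is the least-squares projection of the conditional average treatment effect onto the linear working model. The working model does not need to be correctly specified for \(\beta(P)\) to be a well-defined target parameter.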
Mark J. van der Laan (2006) Statistical Inference for Variable Importance, The International Journal of Biostatistics.
## Simulate data: treatment probability depends on w1 (confounding),
## and the treatment effect is modified by w2 (true CATE = 1 + w2)
sim1 <- function(n=1000, ...) {
w1 <- rnorm(n)
w2 <- rnorm(n)
a <- rbinom(n, 1, expit(-1 + w1))
y <- cos(w1) + w2*a + 0.2*w2^2 + a + rnorm(n)
data.frame(y, a, w1, w2)
}
d <- sim1(5000)
## ATE
cate(cate_model=~1,
     response_model=y~a*(w1+w2),
     propensity_model=a~w1+w2,
     data=d)
#> Estimate Std.Err 2.5% 97.5% P-value
#> E[y(1)] 1.8047 0.04831 1.7100 1.8994 2.099e-305
#> E[y(0)] 0.8308 0.01984 0.7919 0.8697 0.000e+00
#> ───────────
#> (Intercept) 0.9740 0.05491 0.8664 1.0816 2.175e-70
## CATE
cate(cate_model=~1+w2,
     response_model=y~a*(w1+w2),
     propensity_model=a~w1+w2,
     data=d)
#> Estimate Std.Err 2.5% 97.5% P-value
#> E[y(1)] 1.8047 0.04831 1.7100 1.8994 2.099e-305
#> E[y(0)] 0.8308 0.01984 0.7919 0.8697 0.000e+00
#> ───────────
#> (Intercept) 0.9502 0.05280 0.8467 1.0536 2.093e-72
#> w2 1.0377 0.04756 0.9445 1.1309 1.586e-105
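The estimates above use no sample splitting (nfolds=1). A sketch of the same analysis with cross-fitting, where the random fold split is repeated twice to reduce the Monte Carlo variation from the partitioning (the numbers of folds and replications are illustrative choices, not package defaults):

## Sketch only: 5-fold cross-fitting, repeated twice
cate(cate_model = ~1 + w2,
     response_model = y ~ a*(w1 + w2),
     propensity_model = a ~ w1 + w2,
     data = d,
     nfolds = 5,
     rep = 2)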
## superlearner example (not run)
if (FALSE) {
  mod1 <- list(
    glm = predictor_glm(y ~ w1 + w2),
    gam = predictor_gam(y ~ s(w1) + s(w2))
  )
  s1 <- predictor_sl(mod1, nfolds = 5)
  cate(cate_model = ~1,
       response_model = s1,
       propensity_model = predictor_glm(a ~ w1 + w2, family = binomial),
       data = d,
       stratify = TRUE)
}
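A sketch of a cross-fitted analysis run in parallel: supplying mc.cores switches the computation from future.apply to parallel::mcmapply, which relies on forked processes.

## Sketch only: parallel cross-fitting via parallel::mcmapply
## (mc.cores > 1 uses forking; assumed to run on a Unix-like system)
cate(cate_model = ~1 + w2,
     response_model = y ~ a*(w1 + w2),
     propensity_model = a ~ w1 + w2,
     data = d,
     nfolds = 5,
     rep = 2,
     mc.cores = 2)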