Assumption-lean inference via cross-fitting (Double ML). See <doi:10.1111/rssb.12504>.
Usage

alean(
  response_model,
  exposure_model,
  data,
  link = "identity",
  g_model,
  nfolds = 1,
  silent = FALSE,
  mc.cores,
  ...
)
Arguments

response_model   formula or learner object for the response (a formula is
                 converted to a glm learner)
exposure_model   model for the exposure
data             data.frame
link             link function (\(g\))
g_model          model for \(E[g\{E(Y|A,W)\}\mid W]\)
nfolds           number of folds for cross-fitting
silent           suppress all messages and progress bars
mc.cores         optional number of cores; when supplied, parallel::mcmapply
                 is used instead of the future package
...              additional arguments to future.apply::future_mapply
Value

An alean.targeted object.
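Because response_model and exposure_model accept plain formulas (converted internally to glm learners, per the arguments above), a minimal call can be written directly with formulas. The sketch below is illustrative only, using toy simulated data and the default identity link; it assumes the formula interface behaves as documented.

## Minimal sketch (assumed formula interface; toy data, identity link)
n <- 500
dd <- data.frame(w = rnorm(n))
dd$a <- dd$w + rnorm(n)
dd$y <- dd$a + dd$w + rnorm(n)   ## true exposure coefficient is 1
alean(response_model = y ~ a + w, exposure_model = a ~ w, data = dd)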
Details

Let \(Y\) be the response variable, \(A\) the exposure, and \(W\) the covariates. The target parameter is
$$\Psi(P) = \frac{E(Cov[A, g\{E(Y|A,W)\}\mid W])}{E\{Var(A\mid W)\}}$$
The response_model is the model for \(E(Y|A,W)\), the exposure_model is the model for \(E(A|W)\), and link specifies \(g\).
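For intuition (an illustrative special case added here, not an assumption of the method): if the outcome follows a partially linear model on the link scale, \(g\{E(Y|A,W)\} = \beta A + \omega(W)\) for some function \(\omega\), then
$$Cov[A, \beta A + \omega(W)\mid W] = \beta\, Var(A\mid W),$$
and hence \(\Psi(P) = \beta\). The simulation in the examples below generates the response with coefficient 1 for the exposure on the logit scale, which is why the estimates come out close to 1.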
Examples

## Simulate covariate l, exposure a (mean l; binary if family is binomial),
## and binary response y with linear predictor a + l
sim1 <- function(n, family = gaussian(), ...) {
  m <- lava::lvm() |>
    lava::distribution(~ y, value = lava::binomial.lvm()) |>
    lava::regression('a', value = function(l) l) |>
    lava::regression('y', value = function(a, l) a + l)
  if (family$family == "binomial")
    lava::distribution(m, ~ a) <- lava::binomial.lvm()
  lava::sim(m, n)
}
library(splines)
f <- binomial()             ## binary exposure
d <- sim1(1e4, family = f)
## Spline (bs) basis for the covariate in both nuisance models
e <- alean(
  response_model = learner_glm(y ~ a + bs(l, df = 3), family = binomial),
  exposure_model = learner_glm(a ~ bs(l, df = 3), family = f),
  data = d,
  link = "logit", mc.cores = 1, nfolds = 1
)
e
#> Estimate Std.Err 2.5% 97.5% P-value
#> a 0.9716 0.05424 0.8653 1.078 9.237e-72
## Same analysis with linear nuisance models
e <- alean(
  response_model = learner_glm(y ~ a + l, family = binomial),
  exposure_model = learner_glm(a ~ l),
  data = d,
  link = "logit", mc.cores = 1, nfolds = 1
)
e
#> Estimate Std.Err 2.5% 97.5% P-value
#> a 0.9629 0.05411 0.8568 1.069 7.718e-71
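Cross-fitting, the Double ML ingredient referenced in the description, is controlled by the nfolds argument. The sketch below is illustrative (it reuses d, f and the spline learners from above and only changes nfolds; output not shown):

## Illustrative sketch: 5-fold cross-fitting of the nuisance models
e <- alean(
  response_model = learner_glm(y ~ a + bs(l, df = 3), family = binomial),
  exposure_model = learner_glm(a ~ bs(l, df = 3), family = f),
  data = d,
  link = "logit", mc.cores = 1, nfolds = 5
)
e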