
Mediation Analysis for survival data
Klaus Holst & Thomas Scheike
2026-05-23
Source:vignettes/mediation-survival.Rmd
mediation-survival.RmdOverview
We fit
- binomial regression with IPCW:
binreg - additive Lin-Ying model:
aalenMets - Cox model:
phreg - standard logistic regression via
binreg
in the context of mediation analysis using mediation weights as in
the medFlex package. We thus fit natural effects models;
for example, on the binary scale: \begin{align*}
\mbox{logit}(P(Y(x,M(x^*))=1| Z) = \beta_0+ \beta_1 x + \beta_2 x^* +
\beta_3^T Z,
\end{align*} In this case the Natural Direct Effect (NDE) for
fixed covariates Z is \begin{align*}
\mbox{OR}_{1,0|Z}^{\mbox{NDE}} =
\frac{\mbox{odds}(Y(1,M(x))|Z)}{\mbox{odds}(Y(0,M(x))|Z)} =
\exp(\beta_1),
\end{align*} and the Natural Indirect Effect (NIE) for fixed
covariates Z is \begin{align*}
\mbox{OR}_{1,0|Z}^{\mbox{NIE}} =
\frac{\mbox{odds}(Y(x,M(1))|Z)}{\mbox{odds}(Y(x,M(0))|Z)} =
\exp(\beta_2).
\end{align*} See the medFlex package for additional
discussion of the parametrisation.
The mediator can be:
- binomial, using
glmwith familybinomial. - multinomial, via the
mlogitfunction inmets.
Both mediator and exposure must be coded as factors.
In the example below:
- mediator:
gp.f - exposure:
dnr.f
The outcome model concerns the risk/hazard of cause 2.
Standard errors are computed using i.i.d. influence functions and a Taylor expansion to account for uncertainty in the mediation weights.
Simulated Data
First we simulate data that mimics the dataset from Kumar et
al. (2012), which consists of multiple myeloma patients treated with
allogeneic stem cell transplantation from the Center for International
Blood and Marrow Transplant Research (CIBMTR). The data cover patients
transplanted from 1995 to 2005, comparing outcomes between transplant
periods: 2001–2005 (N=488) versus
1995–2000 (N=375). The two competing
events were relapse (cause 2) and treatment-related mortality (TRM,
cause 1), defined as death without relapse. Kumar et al. (2012)
considered the following risk covariates: transplant period
(gp, the main variable of interest: 1 for 2001–2005, 0 for
1995–2000), donor type (dnr: 1 for unrelated or other
related donor (N=280), 0 for
HLA-identical sibling (N=584)), prior
autologous transplant (preauto: 1 for auto+allo transplant
(N=399), 0 for allogeneic alone (N=465)), and time to transplant
(ttt24: 1 for more than 24 months (N=289), 0 for 24 months or less (N=575)).
The interest is in the effect of transplant period (gp)
and possible mediation via the proportion of unrelated or related donors
(dnr) — a somewhat artificial example. All analyses are
adjusted for other important confounders.
library(mets)
runb <- 0
options(warn=-1)
set.seed(1000) # to control output in simulations for p-values below.
n <- 200; k.boot <- 10;
dat <- kumarsimRCT(n,rho1=0.5,rho2=0.5,rct=2,censpar=c(0,0,0,0),
beta = c(-0.67, 0.59, 0.55, 0.25, 0.98, 0.18, 0.45, 0.31),
treatmodel = c(-0.18, 0.56, 0.56, 0.54),restrict=1)
dfactor(dat) <- dnr.f~dnr
dfactor(dat) <- gp.f~gp
drename(dat) <- ttt24~"ttt24*"
dat$id <- 1:n
dat$ftime <- 1Binomial Regression
A simple multivariate regression of the probability of relapse at 50 months with both exposure and mediator (given the other covariates):
aaMss2 <- binreg(Event(time,status)~gp+dnr+preauto+ttt24+cluster(id),data=dat,time=50,cause=2)
summary(aaMss2)
#> n events
#> 200 97
#>
#> 200 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) -1.01508 0.31869 -1.63971 -0.39046 0.0014
#> gp 1.08533 0.34216 0.41471 1.75594 0.0015
#> dnr 0.51969 0.35757 -0.18113 1.22051 0.1461
#> preauto 0.39417 0.35936 -0.31017 1.09851 0.2727
#> ttt24 0.50469 0.38681 -0.25344 1.26283 0.1920
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> (Intercept) 0.36237 0.19404 0.6767
#> gp 2.96041 1.51394 5.7889
#> dnr 1.68151 0.83433 3.3889
#> preauto 1.48316 0.73332 2.9997
#> ttt24 1.65648 0.77612 3.5354Binomial regression IPCW Mediation Analysis
We first look at the probability of relapse at 50 months:
### binomial regression ###########################################################
aaMss <- binreg(Event(time,status)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id),data=wdata,
time=50,weights=wdata$weights,cause=2)
summary(aaMss)
#> n events
#> 400 194
#>
#> 200 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) -0.535534 0.256218 -1.037712 -0.033356 0.0366
#> dnr.f01 0.375817 0.348618 -0.307462 1.059095 0.2810
#> dnr.f11 0.275383 0.071199 0.135836 0.414931 0.0001
#> preauto 0.588221 0.350437 -0.098624 1.275066 0.0932
#> ttt24 0.266179 0.363603 -0.446469 0.978827 0.4641
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> (Intercept) 0.58536 0.35426 0.9672
#> dnr.f01 1.45618 0.73531 2.8838
#> dnr.f11 1.31704 1.14549 1.5143
#> preauto 1.80078 0.90608 3.5789
#> ttt24 1.30497 0.63988 2.6613
ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata)
summary(ll)
#> n events
#> 400 194
#>
#> 200 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) -0.535534 0.254832 -1.034995 -0.036073 0.0356
#> dnr.f01 0.375817 0.317732 -0.246927 0.998560 0.2369
#> dnr.f11 0.275383 0.117175 0.045726 0.505041 0.0188
#> preauto 0.588221 0.346523 -0.090951 1.267394 0.0896
#> ttt24 0.266179 0.366361 -0.451875 0.984233 0.4675
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> (Intercept) 0.58536 0.35523 0.9646
#> dnr.f01 1.45618 0.78120 2.7144
#> dnr.f11 1.31704 1.04679 1.6571
#> preauto 1.80078 0.91306 3.5516
#> ttt24 1.30497 0.63643 2.6758
if (runb>0) { bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=k.boot); summary(bll)}The estimated NDE is 1.40 (95% CI: 0.72, 2.76) and the NIE is 1.32 (95% CI: 1.05, 1.66).
Mediation Analysis
We illustrate how to use the other models mentioned above.
### lin-ying model ################################################################
aaMss <- aalenMets(Surv(time/100,status==2)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id),data=wdata,
weights=wdata$weights)
ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata)
summary(ll)
#> n events
#> 400 196
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> dnr.f01 1.169592 0.739323 -0.279454 2.618637 0.1137
#> dnr.f11 0.206757 0.131289 -0.050565 0.464078 0.1153
#> preauto 0.617537 0.504302 -0.370877 1.605950 0.2207
#> ttt24 0.457736 0.517822 -0.557175 1.472648 0.3767
if (runb>0) { bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=k.boot); summary(bll)}
### cox model ###############################################################################
aaMss <- phreg(Surv(time,status==2)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id),data=wdata,
weights=wdata$weights)
summary(aaMss)
#>
#> n events
#> 400 196
#> coefficients:
#> Estimate S.E. dU^-1/2 P-value
#> dnr.f01 0.414565 0.213724 0.157231 0.0524
#> dnr.f11 0.100656 0.039308 0.144971 0.0104
#> preauto 0.284460 0.232166 0.162375 0.2205
#> ttt24 0.185561 0.226044 0.160886 0.4117
#>
#> exp(coefficients):
#> Estimate 2.5% 97.5%
#> dnr.f01 1.51371 0.99568 2.3013
#> dnr.f11 1.10590 1.02389 1.1945
#> preauto 1.32904 0.84318 2.0949
#> ttt24 1.20389 0.77300 1.8750
ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata)
summary(ll)
#> n events
#> 400 196
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> dnr.f01 0.41456472 0.20869639 0.00552731 0.82360212 0.0470
#> dnr.f11 0.10065575 0.05121458 0.00027702 0.20103448 0.0494
#> preauto 0.28445952 0.23037280 -0.16706288 0.73598192 0.2169
#> ttt24 0.18556110 0.22549763 -0.25640614 0.62752835 0.4106
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> dnr.f01 1.51371 1.00554 2.2787
#> dnr.f11 1.10590 1.00028 1.2227
#> preauto 1.32904 0.84615 2.0875
#> ttt24 1.20389 0.77383 1.8730
if (runb>0) { bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=k.boot); summary(bll)}
### Fine-Gray #############################################################3
aaMss <- cifreg(Event(time,status)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id),data=wdata,
weights=wdata$weights,propodds=NULL,cause=2)
summary(aaMss)
#>
#> n events
#> 400 196
#>
#> 200 clusters
#> coefficients:
#> Estimate S.E. dU^-1/2 P-value
#> dnr.f01 0.18943 0.21986 0.15855 0.3889
#> dnr.f11 0.18730 0.04083 0.14503 0.0000
#> preauto 0.41452 0.22783 0.16098 0.0688
#> ttt24 0.17304 0.22892 0.16308 0.4497
#>
#> exp(coefficients):
#> Estimate 2.5% 97.5%
#> dnr.f01 1.20856 0.78545 1.8596
#> dnr.f11 1.20599 1.11324 1.3065
#> preauto 1.51364 0.96849 2.3656
#> ttt24 1.18892 0.75910 1.8621
ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata)
summary(ll)
#> n events
#> 400 196
#>
#> 200 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> dnr.f01 0.189426 0.233939 -0.269087 0.647939 0.4181
#> dnr.f11 0.187298 0.047733 0.093744 0.280853 0.0001
#> preauto 0.414517 0.230676 -0.037600 0.866634 0.0723
#> ttt24 0.173042 0.230810 -0.279338 0.625422 0.4534
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> dnr.f01 1.20856 0.76408 1.9116
#> dnr.f11 1.20599 1.09828 1.3243
#> preauto 1.51364 0.96310 2.3789
#> ttt24 1.18892 0.75628 1.8690
if (runb>0) { bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=k.boot); summary(bll)}
### logit model #############################################################3
aaMss <- cifreg(Event(time,status)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id),data=wdata,
weights=wdata$weights,cause=2)
summary(aaMss)
#>
#> n events
#> 400 196
#>
#> 200 clusters
#> coefficients:
#> Estimate S.E. dU^-1/2 P-value
#> dnr.f01 0.357168 0.339848 0.158937 0.2933
#> dnr.f11 0.272392 0.064166 0.145076 0.0000
#> preauto 0.657010 0.326082 0.160361 0.0439
#> ttt24 0.191333 0.353606 0.167443 0.5884
#>
#> exp(coefficients):
#> Estimate 2.5% 97.5%
#> dnr.f01 1.42928 0.73424 2.7822
#> dnr.f11 1.31310 1.15792 1.4891
#> preauto 1.92902 1.01806 3.6551
#> ttt24 1.21086 0.60549 2.4215
ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata)
summary(ll)
#> n events
#> 400 196
#>
#> 200 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> dnr.f01 0.357168 0.351089 -0.330953 1.045289 0.3090
#> dnr.f11 0.272392 0.068131 0.138857 0.405927 0.0001
#> preauto 0.657010 0.328207 0.013736 1.300284 0.0453
#> ttt24 0.191333 0.356086 -0.506583 0.889250 0.5910
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> dnr.f01 1.42928 0.71824 2.8442
#> dnr.f11 1.31310 1.14896 1.5007
#> preauto 1.92902 1.01383 3.6703
#> ttt24 1.21086 0.60255 2.4333
if (runb>0) { bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=k.boot); summary(bll)}
### binomial outcome ############################
aaMss <- binreg(Event(ftime,status)~dnr.f0+dnr.f1+preauto+ttt24+cluster(id),data=wdata,
time=50,weights=wdata$weights,cens.weights=1,cause=2)
summary(aaMss)
#> n events
#> 400 196
#>
#> 200 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) -0.674433 0.235285 -1.135583 -0.213284 0.0042
#> dnr.f01 0.221834 0.318264 -0.401952 0.845620 0.4858
#> dnr.f11 0.262722 0.060281 0.144572 0.380871 0.0000
#> preauto 0.578077 0.319091 -0.047331 1.203484 0.0700
#> ttt24 0.214442 0.328183 -0.428784 0.857669 0.5135
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> (Intercept) 0.50944 0.32123 0.8079
#> dnr.f01 1.24836 0.66901 2.3294
#> dnr.f11 1.30046 1.15555 1.4636
#> preauto 1.78261 0.95377 3.3317
#> ttt24 1.23917 0.65130 2.3577
ll <- mediatorSurv(aaMss,fit,data=dat,wdata=wdata)
summary(ll)
#> n events
#> 400 196
#>
#> 200 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) -0.674433 0.235022 -1.135069 -0.213798 0.0041
#> dnr.f01 0.221834 0.286717 -0.340122 0.783789 0.4391
#> dnr.f11 0.262722 0.107508 0.052011 0.473432 0.0145
#> preauto 0.578077 0.315260 -0.039822 1.195975 0.0667
#> ttt24 0.214442 0.329107 -0.430596 0.859480 0.5147
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> (Intercept) 0.50944 0.32140 0.8075
#> dnr.f01 1.24836 0.71168 2.1898
#> dnr.f11 1.30046 1.05339 1.6055
#> preauto 1.78261 0.96096 3.3068
#> ttt24 1.23917 0.65012 2.3619
if (runb>0) { bll <- BootmediatorSurv(aaMss,fit,data=dat,k.boot=k.boot); summary(bll)}Multinomial regression
Also works with a mediator with more than two levels:
- mediator:
wmiin 4 categories - exposure:
agein 4 categories
library(mets)
data(tTRACE)
dcut(tTRACE) <- ~.
weightmodel <- fit <- mlogit(wmicat.4 ~agecat.4+vf+chf,data=tTRACE,family=binomial)
wdata <- medweight(fit,data=tTRACE)
aaMss <- binreg(Event(time,status)~agecat.40+ agecat.41+ vf+chf+cluster(id),data=wdata,
time=7,weights=wdata$weights,cause=9)
summary(aaMss)
MultMed <- mediatorSurv(aaMss,fit,data=tTRACE,wdata=wdata)
summary(MultMed)
summary(MultMed)
#> n events
#> 4000 2016
#>
#> 1000 clusters
#> coeffients:
#> Estimate Std.Err 2.5% 97.5% P-value
#> (Intercept) -1.839306 0.174541 -2.181401 -1.497211 0.0000
#> agecat.40(60.1,68.6] 0.916646 0.223488 0.478618 1.354674 0.0000
#> agecat.40(68.6,75.6] 1.363830 0.222418 0.927898 1.799762 0.0000
#> agecat.40(75.6,96.3] 2.277415 0.249815 1.787786 2.767044 0.0000
#> agecat.41(60.1,68.6] 0.121100 0.053334 0.016567 0.225633 0.0232
#> agecat.41(68.6,75.6] 0.119374 0.053193 0.015118 0.223631 0.0248
#> agecat.41(75.6,96.3] 0.095356 0.053874 -0.010234 0.200947 0.0767
#> vf 0.712461 0.293627 0.136962 1.287960 0.0152
#> chf 1.166578 0.154721 0.863331 1.469825 0.0000
#>
#> exp(coeffients):
#> Estimate 2.5% 97.5%
#> (Intercept) 0.15893 0.11288 0.2238
#> agecat.40(60.1,68.6] 2.50089 1.61384 3.8755
#> agecat.40(68.6,75.6] 3.91114 2.52919 6.0482
#> agecat.40(75.6,96.3] 9.75144 5.97621 15.9115
#> agecat.41(60.1,68.6] 1.12874 1.01671 1.2531
#> agecat.41(68.6,75.6] 1.12679 1.01523 1.2506
#> agecat.41(75.6,96.3] 1.10005 0.98982 1.2226
#> vf 2.03900 1.14678 3.6254
#> chf 3.21099 2.37105 4.3485SessionInfo
sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] mets_1.3.10
#>
#> loaded via a namespace (and not attached):
#> [1] cli_3.6.6 knitr_1.51 rlang_1.2.0
#> [4] xfun_0.57 textshaping_1.0.5 jsonlite_2.0.0
#> [7] listenv_0.10.1 future.apply_1.20.2 lava_1.9.1
#> [10] htmltools_0.5.9 ragg_1.5.2 sass_0.4.10
#> [13] rmarkdown_2.31 grid_4.6.0 evaluate_1.0.5
#> [16] jquerylib_0.1.4 fastmap_1.2.0 numDeriv_2016.8-1.1
#> [19] yaml_2.3.12 mvtnorm_1.3-7 lifecycle_1.0.5
#> [22] timereg_2.0.7 compiler_4.6.0 codetools_0.2-20
#> [25] fs_2.1.0 htmlwidgets_1.6.4 Rcpp_1.1.1-1.1
#> [28] future_1.70.0 lattice_0.22-9 systemfonts_1.3.2
#> [31] digest_0.6.39 R6_2.6.1 parallelly_1.47.0
#> [34] parallel_4.6.0 splines_4.6.0 Matrix_1.7-5
#> [37] bslib_0.11.0 tools_4.6.0 RcppArmadillo_15.2.6-1
#> [40] globals_0.19.1 survival_3.8-6 pkgdown_2.2.0
#> [43] cachem_1.1.0 desc_1.4.3