G-Computation or standardization for the Cox, Fine-Gray and binomial regression models for survival data

G-computation for the Cox and Fine-Gray models

Computing the standardized estimate (G-estimation) based on the Cox or Fine-Gray model : $\hat S(t,A=a) = n^{-1} \sum_i S(t,A=a,X_i)$ and this estimator has influence function $S(t,A=a,X_i) - S(t,A=a) + E( D_{A_0(t), \beta} S(t,A=a,X_i) ) \epsilon_i(t)$ where $\epsilon_i(t)$ is the iid decomposition of $(\hat A(t) - A(t), \hat \beta- \beta)$ .

These estimates have a causal interpration under the assumption of no-unmeasured confounders, and even without the causal assumptions this standardization can still be a useful summary measure.

First looking cumulative incidence via the Fine-Gray model for the two causes and making a plot of the standardized cumulative incidence for cause 1.

set.seed(100)

data(bmt); bmt$time <- bmt$time+runif(nrow(bmt))*0.001
dfactor(bmt) <- tcell~tcell
bmt$event <- (bmt$cause!=0)*1

fg1 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,propodds=NULL)
summary(survivalG(fg1,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4331 0.02749 0.3793 0.4870 6.321e-56
#> risk1   0.2727 0.05863 0.1577 0.3876 3.313e-06
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err   2.5%    97.5% P-value
#> ps0  -0.1605 0.06353 -0.285 -0.03597 0.01153
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>        Estimate  Std.Err      2.5%     97.5%    P-value
#> [ps0] 0.6295004 0.139248 0.3565794 0.9024214 0.00779742
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL

fg2 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=2,propodds=NULL)
summary(survivalG(fg2,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.2127 0.02314 0.1674 0.2581 3.757e-20
#> risk1   0.3336 0.06799 0.2003 0.4668 9.281e-07
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err     2.5%  97.5% P-value
#> ps0   0.1208 0.07189 -0.02009 0.2617 0.09285
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%    97.5%   P-value
#> [ps0] 1.567915 0.3627528 0.8569321 2.278897 0.1174496
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL

cif1time <- survivalGtime(fg1,bmt)
plot(cif1time,type="risk");

Now looking at the survival probability

ss <- phreg(Surv(time,event)~tcell+platelet+age,bmt)
sss <- survivalG(ss,bmt,50)
summary(sss)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%    P-value
#> risk0   0.6539 0.02709 0.6008 0.7070 9.218e-129
#> risk1   0.5640 0.05971 0.4470 0.6811  3.531e-21
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err    2.5%   97.5% P-value
#> ps0 -0.08992  0.0629 -0.2132 0.03337  0.1529
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>        Estimate    Std.Err      2.5%    97.5%   P-value
#> [ps0] 0.8624974 0.09446477 0.6773499 1.047645 0.1455042
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%    97.5%   P-value
#> [ps0] 1.259836 0.1894627 0.8884963 1.631176 0.1702385

Gtime <- survivalGtime(ss,bmt)
plot(Gtime)

G-computation for the binomial regression

We compare with the similar estimates using the Doubly Robust estimating equations using binregATE. The standardization from the G-computation can also be computed using a specialized function that takes less memory and is quicker (for large data).


## survival situation
sr1 <- binregATE(Event(time,event)~tcell+platelet+age,bmt,cause=1,
         time=40, treat.model=tcell~platelet+age)
summary(sr1)
#> 
#>    n events
#>  408    241
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept)  0.676409  0.137007  0.407880  0.944939  0.0000
#> tcell1      -0.023675  0.346994 -0.703770  0.656420  0.9456
#> platelet    -0.492952  0.246158 -0.975412 -0.010492  0.0452
#> age          0.343939  0.115561  0.117444  0.570434  0.0029
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  1.96680 1.50363 2.5727
#> tcell1       0.97660 0.49472 1.9279
#> platelet     0.61082 0.37704 0.9896
#> age          1.41049 1.12462 1.7690
#> 
#> Average Treatment effects (G-formula) :
#>             Estimate    Std.Err       2.5%      97.5% P-value
#> treat0     0.6230976  0.0273827  0.5694284  0.6767667  0.0000
#> treat1     0.6177595  0.0731712  0.4743466  0.7611723  0.0000
#> treat:1-0 -0.0053381  0.0783973 -0.1589940  0.1483179  0.9457
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.623337  0.027508  0.569422  0.677253  0.0000
#> treat1     0.644397  0.085942  0.475954  0.812840  0.0000
#> treat:1-0  0.021059  0.090305 -0.155935  0.198054  0.8156

## relative risk effect 
estimate(coef=sr1$riskDR,vcov=sr1$var.riskDR,f=function(p) p[2]/p[1],null=1)
#>          Estimate Std.Err   2.5% 97.5% P-value
#> [treat1]    1.034  0.1453 0.7489 1.319  0.8162
#> 
#>  Null Hypothesis: 
#>   [treat1] = 1

## competing risks 
br1 <- binregATE(Event(time,cause)~tcell+platelet+age,bmt,cause=1,
         time=40,treat.model=tcell~platelet+age)
summary(br1)
#> 
#>    n events
#>  408    157
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept) -0.191519  0.130883 -0.448044  0.065007  0.1434
#> tcell1      -0.712880  0.351489 -1.401786 -0.023974  0.0425
#> platelet    -0.531919  0.244495 -1.011119 -0.052718  0.0296
#> age          0.432939  0.107314  0.222607  0.643271  0.0001
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  0.82570 0.63888 1.0672
#> tcell1       0.49023 0.24616 0.9763
#> platelet     0.58748 0.36381 0.9486
#> age          1.54178 1.24933 1.9027
#> 
#> Average Treatment effects (G-formula) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417746  0.027030  0.364768  0.470724  0.0000
#> treat1     0.267097  0.061849  0.145874  0.388319  0.0000
#> treat:1-0 -0.150649  0.067578 -0.283100 -0.018199  0.0258
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417320  0.027122  0.364163  0.470478  0.0000
#> treat1     0.231149  0.060651  0.112275  0.350023  0.0001
#> treat:1-0 -0.186171  0.066053 -0.315633 -0.056710  0.0048

and using the specialized function

br1 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,time=40)
Gbr1 <- binregG(br1,data=bmt)
summary(Gbr1)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4177 0.02727 0.3643 0.4712 5.588e-53
#> risk1   0.2671 0.06183 0.1459 0.3883 1.562e-05
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err    2.5%    97.5% P-value
#> p1  -0.1506 0.06759 -0.2831 -0.01817 0.02583
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%     97.5%    P-value
#> [p1] 0.6393758 0.1538101 0.3379136 0.9408381 0.01904716
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL

## contrasting average age to +2-sd age, Avalues
Gbr2 <- binregG(br1,data=bmt,varname="age",Avalues=c(0,2))
summary(Gbr2)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.3932 0.02537 0.3434 0.4429 3.566e-54
#> risk2   0.5997 0.05544 0.4911 0.7084 2.874e-27
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err   2.5%  97.5%   P-value
#> p1   0.2066 0.04998 0.1086 0.3045 3.584e-05
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>      Estimate   Std.Err     2.5%    97.5%      P-value
#> [p1] 1.525375 0.1324356 1.265806 1.784945 7.277532e-05
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL

SessionInfo

sessionInfo()
#> R version 4.5.0 (2025-04-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] mets_1.3.6
#> 
#> loaded via a namespace (and not attached):
#>  [1] cli_3.6.5           knitr_1.50          rlang_1.1.6        
#>  [4] xfun_0.52           textshaping_1.0.0   jsonlite_2.0.0     
#>  [7] listenv_0.9.1       future.apply_1.11.3 lava_1.8.1         
#> [10] htmltools_0.5.8.1   ragg_1.4.0          sass_0.4.10        
#> [13] rmarkdown_2.29      grid_4.5.0          evaluate_1.0.3     
#> [16] jquerylib_0.1.4     fastmap_1.2.0       numDeriv_2016.8-1.1
#> [19] yaml_2.3.10         mvtnorm_1.3-3       lifecycle_1.0.4    
#> [22] timereg_2.0.6       compiler_4.5.0      codetools_0.2-20   
#> [25] fs_1.6.6            Rcpp_1.0.14         future_1.40.0      
#> [28] lattice_0.22-6      systemfonts_1.2.2   digest_0.6.37      
#> [31] R6_2.6.1            parallelly_1.43.0   parallel_4.5.0     
#> [34] splines_4.5.0       Matrix_1.7-3        bslib_0.9.0        
#> [37] tools_4.5.0         globals_0.17.0      survival_3.8-3     
#> [40] pkgdown_2.1.1       cachem_1.1.0        desc_1.4.3

Klaus Holst & Thomas Scheike

2025-04-23

G-computation for the Cox and Fine-Gray models

G-computation for the binomial regression

SessionInfo