Introduction

Let YY be a binary response, AA a binary exposure, and VV a vector of covariates.

DAG for the statistical model with the dashed edge representing a potential interaction between exposure AA and covariates VV.

In a common setting, the main interest lies in quantifying the treatment effect, ν\nu, of AA on YY adjusting for the set of covariates, and often a standard approach is to use a Generalized Linear Model (GLM):

g{E(YA,V)}=AνtW+μtZnuisanceg\{ E(Y\mid A,V) \} = A\nu^tW + \underset{\mathrm{nuisance}}{\mu^tZ}

with link function gg, and W=w(V)W = w(V), Z=v(V)Z= v(V) known vector functions of VV.

The canonical link (logit) leads to nice computational properties (logistic regression) and parameters with an odds-ratio interpretation. But ORs are not collapsible even under randomization. For example

E(YX)=E[E(YX,Z)X]=E[expit(μ+αX+βZ)X]expit[μ+αX+βE(ZX)], E(Y\mid X) = E[ E(Y\mid X,Z) \mid X ] = E[\operatorname{expit}( \mu + \alpha X + \beta Z ) \mid X] \neq \operatorname{expit}[\mu + \alpha X + \beta E(Z\mid X)],

When marginalizing we leave the class of logistic regression. This non-collapsibility makes it hard to interpret odds-ratios and to compare results from different studies

Relative risks (and risk differences) are collapsible and generally considered easier to interpret than odds-ratios. Richardson et al (JASA, 2017) proposed a regression model for a binary exposures which solves the computational problems and need for parameter contraints that are associated with using for example binomial regression with a log-link function (or identify link for the risk difference) to obtain such parameter estimates. In the following we consider the relative risk as the target parameter

RR(v)=P(Y=1A=1,V=v)P(Y=1A=0,V=v). \mathrm{RR}(v) = \frac{P(Y=1\mid A=1, V=v)}{P(Y=1\mid A=0, V=v)}.

Let pa(V)=P(YA=a,V),a{0,1}p_a(V) = P(Y \mid A=a, V), a\in\{0,1\}, the idea is then to posit a linear model for θ(v)=log(RR(v)) \theta(v) = \log \big(RR(v)\big) , i.e., log(RR(v))=αTv,\log \big(RR(v)\big) = \alpha^Tv,

and a nuisance model for the odds-product ϕ(v)=log(p0(v)p1(v)(1p0(v))(1p1(v))) \phi(v) = \log\left(\frac{p_{0}(v)p_{1}(v)}{(1-p_{0}(v))(1-p_{1}(v))}\right)

noting that these two parameters are variation independent as illustrated by the below L’Abbé plot.

  p0 <- seq(0,1,length.out=100)
  p1 <- function(p0,op) 1/(1+(op*(1-p0)/p0)^-1)
  plot(0, type="n", xlim=c(0,1), ylim=c(0,1),
     xlab="P(Y=1|A=0)", ylab="P(Y=1|A=1)", main="Constant odds product")
  for (op in exp(seq(-6,6,by=.25))) lines(p0,p1(p0,op), col="lightblue")

  p0 <- seq(0,1,length.out=100)
  p1 <- function(p0,rr) rr*p0
  plot(0, type="n", xlim=c(0,1), ylim=c(0,1),
     xlab="P(Y=1|A=0)", ylab="P(Y=1|A=1)", main="Constant relative risk")
  for (rr in exp(seq(-3,3,by=.25))) lines(p0,p1(p0,rr), col="lightblue")

Similarly, a model can be constructed for the risk-difference on the following scale

θ(v)=arctanh(RD(v)).\theta(v) = \operatorname{arctanh} \big(RD(v)\big).

Simulation

First the targeted package is loaded

This automatically imports lava (CRAN) which we can use to simulate from the Relative-Risk Odds-Product (RR-OP) model.

m <- lvm(a ~ x,
         lp.target ~ 1,
         lp.nuisance ~ x+z)
m <- binomial.rr(m, response="y", exposure="a", target.model="lp.target", nuisance.model="lp.nuisance")

The lvm call first defines the linear predictor for the exposure to be of the form

LPA:=μA+αX\mathrm{LP}_A := \mu_A + \alpha X

and the linear predictors for the /target parameter/ (relative risk) and the /nuisance parameter/ (odds product) to be of the form

LPRR:=μRR,\mathrm{LP}_{RR} := \mu_{RR},

LPOP:=μOP+βxX+βzZ.\mathrm{LP}_{OP} := \mu_{OP} + \beta_x X + \beta_z Z.

The covariates are by default assumed to be independent and standard normal X,Z𝒩(0,1)X, Z\sim\mathcal{N}(0,1), but their distribution can easily be altered using the lava::distribution method.

The binomial.rr function

args(binomial.rr)
#> function (x, response, exposure, target.model, nuisance.model, 
#>     exposure.model = binomial.lvm(), ...) 
#> NULL

then defines the link functions, i.e.,

logit(E[AX,Z])=μA+αX,\operatorname{logit}(E[A\mid X,Z]) = \mu_A + \alpha X,

log(E[YX,Z,A=1]/E[YX,A=0])=μRR,\operatorname{log}(E[Y\mid X,Z, A=1]/E[Y\mid X, A=0]) = \mu_{RR},

log{p1(X,Z)p0(X,Z)/[(1p1(X,Z))(1p0(X,Z))]}=μOP+βxX+βzZ\operatorname{log}\{p_1(X,Z)p_0(X,Z)/[(1-p_1(X,Z))(1-p_0(X,Z))]\} = \mu_{OP}+\beta_x X + \beta_z Z

with pa(X,Z)=E(YA=a,X,Z)p_a(X,Z)=E(Y\mid A=a,X,Z).

The risk-difference model with the RD parameter modeled on the arctanh\operatorname{arctanh} scale can be defined similarly using the binomial.rd method

args(binomial.rd)
#> function (x, response, exposure, target.model, nuisance.model, 
#>     exposure.model = binomial.lvm(), ...) 
#> NULL

We can inspect the parameter names of the modeled

coef(m)
#>              m1              m2              m3              p1              p2 
#>             "a"     "lp.target"   "lp.nuisance"           "a~x" "lp.nuisance~x" 
#>              p3              p4 
#> "lp.nuisance~z"          "a~~a"

Here the intercepts of the model are simply given the same name as the variables, such that μA\mu_A becomes a, and the other regression coefficients are labeled using the “~”-formula notation, e.g., α\alpha becomes a~x.

Intercepts are by default set to zero and regression parameters set to one in the simulation. Hence to simulate from the model with (muA,μRR,μOP,α,βx,βz)T=(1,1,2,2,1,1)T(mu_A, \mu_{RR}, \mu_{OP}, \alpha, \beta_x, \beta_z)^T = (-1,1,-2,2,1,1)^T, we define the parameter vector p given by

p <- c('a'=-1, 'lp.target'=1, 'lp.nuisance'=-1, 'a~x'=2)

and then simulate from the model using the sim method

d <- sim(m, 1e4, p=p, seed=1)

head(d)
#>   a          x lp.target lp.nuisance          z y
#> 1 0 -0.6264538         1  -2.4307854 -0.8043316 0
#> 2 0  0.1836433         1  -1.8728823 -1.0565257 0
#> 3 0 -0.8356286         1  -2.8710244 -1.0353958 0
#> 4 1  1.5952808         1  -0.5902796 -1.1855604 1
#> 5 0  0.3295078         1  -1.1709317 -0.5004395 1
#> 6 0 -0.8204684         1  -2.3454571 -0.5249887 0

Notice, that in this simulated data the target parameter μRR\mu_{RR} has been set to lp.target = 1.

Estimation

MLE

We start by fitting the model using the maximum likelihood estimator.

args(riskreg_mle)
#> function (y, a, x1, x2 = x1, weights = rep(1, length(y)), std.err = TRUE, 
#>     type = "rr", start = NULL, control = list(), ...) 
#> NULL

The riskreg_mle function takes vectors/matrices as input arguments with the response y, exposure a, target parameter design matrix x1 (i.e., the matrix WW at the start of this text), and the nuisance model design matrix x2 (odds product).

We first consider the case of a correctly specified model, hence we do not consider any interactions with the exposure for the odds product and simply let x1 be a vector of ones, whereas the design matrix for the log-odds-product depends on both XX and ZZ

x1 <- model.matrix(~1, d)
x2 <- model.matrix(~x+z, d)

fit1 <- with(d, riskreg_mle(y, a, x1, x2, type="rr"))
fit1
#>                          Estimate Std.Err    2.5%   97.5%    P-value
#> (Intercept)                0.9512 0.03319  0.8862  1.0163 1.204e-180
#> odds-product:(Intercept)  -1.0610 0.05199 -1.1629 -0.9591  1.377e-92
#> odds-product:x             1.0330 0.05944  0.9165  1.1495  1.230e-67
#> odds-product:z             1.0421 0.05285  0.9386  1.1457  1.523e-86

The parameters are presented in the same order as the columns of x1and x2, hence the target parameter estimate is in the first row

estimate(fit1, keep=1)
#>             Estimate Std.Err   2.5% 97.5%    P-value
#> (Intercept)   0.9512 0.03336 0.8858 1.017 7.159e-179

DRE

We next fit the model using a double robust estimator (DRE) which introduces a model for the exposure E(A=1V)E(A=1\mid V) (propensity model). The double-robustness stems from the fact that the this estimator remains consistent in the union model where either the odds-product model or the propensity model is correctly specified. With both models correctly specified the estimator is efficient.

with(d, riskreg_fit(y, a, target=x1, nuisance=x2, propensity=x2, type="rr"))
#>             Estimate Std.Err   2.5% 97.5%    P-value
#> (Intercept)   0.9372  0.0339 0.8708 1.004 3.004e-168

The usual /formula/-syntax can be applied using the riskreg function. Here we illustrate the double-robustness by using a wrong propensity model but a correct nuisance paramter (odds-product) model:

  riskreg(y~a, nuisance=~x+z, propensity=~z, data=d, type="rr")
#>             Estimate Std.Err   2.5% 97.5%    P-value
#> (Intercept)   0.9511 0.03333 0.8857 1.016 4.547e-179

Or vice-versa

  riskreg(y~a, nuisance=~z, propensity=~x+z, data=d, type="rr")
#>             Estimate Std.Err   2.5% 97.5%    P-value
#> (Intercept)   0.9404 0.03727 0.8673 1.013 1.736e-140

whereas the MLE in this case yields a biased estimate of the relative risk:

  fit2 <- with(d, riskreg_mle(y, a, x1=model.matrix(~1,d), x2=model.matrix(~z, d)))
  estimate(fit2, keep=1)
#>             Estimate Std.Err  2.5% 97.5% P-value
#> (Intercept)    1.243 0.02778 1.189 1.298       0

Interactions

The more general model where logRR(V)=AαTV\log RR(V) = A \alpha^TV for a subset VV of the covariates can be estimated using the target argument:

fit <- riskreg(y~a, target=~x, nuisance=~x+z, data=d)
fit
#>             Estimate Std.Err     2.5%   97.5%    P-value
#> (Intercept)  0.95267 0.03365  0.88673 1.01862 2.361e-176
#> x           -0.01078 0.03804 -0.08534 0.06378  7.769e-01

As expected we do not see any evidence of an effect of XX on the relative risk with the 95% confidence limits clearly overlapping zero.

Note, that when the propensity argument is omitted as above, the same design matrix is used for both the odds-product model and the propensity model.

Risk-difference

The syntax for fitting the risk-difference model is similar. To illustrate this we simulate some new data from this model

m2 <- binomial.rd(m, response="y", exposure="a", target.model="lp.target", nuisance.model="lp.nuisance")
d2 <- sim(m2, 1e4, p=p)

And we can then fit the DRE with the syntax

riskreg(y~a, nuisance=~x+z, data=d2, type="rd")
#>             Estimate Std.Err   2.5% 97.5% P-value
#> (Intercept)    1.027 0.02081 0.9857 1.067       0

Influence-function

The DRE is a regular and asymptotic linear (RAL) estimator, hence n(α̂DREα)=1ni=1nϕeff(Zi)+op(1)\sqrt{n}(\widehat{\alpha}_{\mathrm{DRE}} - \alpha) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} \phi_{\mathrm{eff}}(Z_{i}) + o_{p}(1) where Zi=(Yi,Ai,Vi),i=1,,nZ_i = (Y_i, A_i, V_i), i=1,\ldots,n are the i.i.d. observations and ϕeff\phi_{\mathrm{eff}} is the influence function.

The influence function can be extracted using the IC method

head(IC(fit))
#>   (Intercept)          x
#> 1   0.6226459 -0.4585424
#> 2   1.2319960  0.7925974
#> 3   0.3941325 -0.5798067
#> 4  -0.8854890  2.9621436
#> 5  -6.9949142 -5.2133643
#> 6   0.5853933 -0.7571452

SessionInfo

sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] targeted_0.6 lava_1.8.0  
#> 
#> loaded via a namespace (and not attached):
#>  [1] Matrix_1.7-1        future.apply_1.11.3 jsonlite_1.8.9     
#>  [4] compiler_4.4.2      Rcpp_1.0.13-1       parallel_4.4.2     
#>  [7] jquerylib_0.1.4     globals_0.16.3      splines_4.4.2      
#> [10] systemfonts_1.1.0   textshaping_0.4.0   yaml_2.3.10        
#> [13] fastmap_1.2.0       lattice_0.22-6      R6_2.5.1           
#> [16] knitr_1.49          future_1.34.0       nloptr_2.1.1       
#> [19] desc_1.4.3          bslib_0.8.0         rlang_1.1.4        
#> [22] cachem_1.1.0        xfun_0.49           fs_1.6.5           
#> [25] sass_0.4.9          cli_3.6.3           pkgdown_2.1.1      
#> [28] digest_0.6.37       grid_4.4.2          mvtnorm_1.3-2      
#> [31] lifecycle_1.0.4     timereg_2.0.6       evaluate_1.0.1     
#> [34] pracma_2.4.4        data.table_1.16.2   numDeriv_2016.8-1.1
#> [37] listenv_0.9.1       codetools_0.2-20    ragg_1.3.3         
#> [40] survival_3.7-0      optimx_2023-10.21   parallelly_1.39.0  
#> [43] rmarkdown_2.29      mets_1.3.4          tools_4.4.2        
#> [46] htmltools_0.5.8.1