
Fits a cumulative odds model for discrete time-to-event data, handling interval censoring where the event time is known only to lie within an interval \((t_l, t_r]\). The model assumes: $$ \text{logit}(P(T \leq t | x)) = \log(G(t)) + x \beta $$ where \(G(t)\) is the baseline cumulative odds function and \(\beta\) are the regression coefficients. This is equivalent to: $$ P(T \leq t | x) = \frac{G(t) \exp(x \beta)}{1 + G(t) \exp(x \beta)} $$
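The equivalence of the two forms can be checked numerically. The following is a minimal base R sketch; the values of `G`, `x`, and `beta` are illustrative assumptions, not estimates from any data set or output of this function.

```r
# Illustrative values only (assumptions, not fitted estimates)
G    <- 0.4            # baseline cumulative odds G(t) at some time t
x    <- c(1, 0)        # covariate vector
beta <- c(0.9, 0.3)    # regression coefficients

lp <- sum(x * beta)    # linear predictor x beta

# Form 1: invert the logit of log(G(t)) + x beta
p_logit <- plogis(log(G) + lp)

# Form 2: closed-form ratio G(t) exp(x beta) / (1 + G(t) exp(x beta))
p_ratio <- G * exp(lp) / (1 + G * exp(lp))

# TRUE when the two forms agree up to numerical tolerance
all.equal(p_logit, p_ratio)
```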

Usage

interval_logitsurv_discrete(
  formula,
  data,
  beta = NULL,
  no.opt = FALSE,
  method = "NR",
  stderr = TRUE,
  weights = NULL,
  offsets = NULL,
  exp.link = 1,
  increment = 1,
  ...
)

Arguments

formula

Formula with an Interval object (e.g., Interval(entry, time)) on the left-hand side and covariates on the right. Can include cluster() for correlated data.

data

Data frame containing the variables in the formula.

beta

Starting values for the optimization (vector of length \(p + k\), where \(p\) is the number of covariates and \(k\) is the number of time intervals).

no.opt

Logical; if TRUE, skips optimization and returns estimates based on the provided beta (useful for initialization).

method

Optimization method: "NR" (Newton-Raphson, default) or "nlm".

stderr

Logical; if FALSE, returns only the coefficient estimates.

weights

Observation weights (follows ID).

offsets

Offsets (follows ID).

exp.link

Logical; if TRUE, parameterizes increments as \(\exp(\alpha) > 0\).

increment

Logical; if TRUE, uses increments \(dG(t) = \exp(\alpha)\) as parameters.

...

Additional arguments passed to the optimizer (lava::NR or nlm).

Value

An object of class "cumoddsreg" containing:

coef

Estimated coefficients (baseline time effects and covariate effects).

se.coef

Standard errors of the coefficients.

var

Variance-covariance matrix.

iid

Influence function (IID) decomposition for robust variance estimation.

ntimes

Number of distinct time intervals.

utimes

Unique time points.

ploglik

Log-likelihood at convergence.

gradient, hessian

Optimization results.

call

Original function call.

Details

The baseline \(G(t)\) is parameterized as a cumulative sum of exponentials, \(G(t_j) = \sum_{m \le j} \exp(\alpha_m)\), which ensures that it is positive and increasing. The regression coefficients are log odds ratios for the event having occurred by time \(t\).
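As a rough illustration of this parameterization, the base R sketch below builds a baseline from log-increments and shows that a covariate shifts the cumulative odds by the same factor at every time point. The values of `alpha` and `beta` are hypothetical, and the code does not reproduce the internals of the function.

```r
# Hypothetical log-increments alpha_1..alpha_3 (illustrative, not fitted values)
alpha <- c(-2.0, -2.2, -1.5)

dG <- exp(alpha)   # increments dG(t) = exp(alpha), positive by construction
G  <- cumsum(dG)   # baseline cumulative odds G(t_1), G(t_2), G(t_3)

# Hypothetical log odds ratio for a single binary covariate
beta <- 0.7

odds_x0 <- G               # cumulative odds when x = 0
odds_x1 <- G * exp(beta)   # cumulative odds when x = 1

# The ratio equals exp(beta) at every time point
odds_ratio <- odds_x1 / odds_x0
```

Because the covariate effect multiplies \(G(t)\), the odds ratio does not depend on \(t\), which is why a single coefficient per covariate suffices.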

Estimation maximizes the likelihood of the observed intervals: $$ L = \prod_i [ P(T_i > t_{il} | x_i) - P(T_i > t_{ir} | x_i) ] $$ where \(t_{il}\) and \(t_{ir}\) are the left and right endpoints of the interval for subject \(i\). Right-censored intervals have \(t_{ir} = \infty\).
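The likelihood contributions can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: `interval_prob` is a hypothetical helper, time points are taken to be the integers 1, ..., k, and the data and parameter values are made up.

```r
# Hypothetical helper: P(t_l < T <= t_r | x) under the cumulative odds model.
# Assumes integer time points 1..k, left endpoints in 0..k, and right
# endpoints in 1..k or Inf (right-censored).
interval_prob <- function(alpha, beta, left, right, X) {
  k  <- length(alpha)
  G  <- c(0, cumsum(exp(alpha)))   # G(0) = 0, then G(1), ..., G(k)
  lp <- drop(X %*% beta)           # linear predictor x beta per subject
  # P(T > t | x) = 1 / (1 + G(t) exp(x beta)), with P(T > Inf | x) = 0
  surv <- function(t) {
    ifelse(is.finite(t), 1 / (1 + G[pmin(t, k) + 1] * exp(lp)), 0)
  }
  surv(left) - surv(right)
}

# Made-up data: four subjects, one covariate, k = 2 time points
left  <- c(0, 1, 2, 2)
right <- c(1, 2, Inf, Inf)
X     <- matrix(c(0, 1, 0, 1), ncol = 1)

# Made-up parameter values
alpha <- c(-1, -0.5)
beta  <- 0.7

# Log-likelihood: sum of log interval probabilities
ll <- sum(log(interval_prob(alpha, beta, left, right, X)))
```

For a fixed covariate value, the probabilities of the intervals (0, 1], (1, 2], and (2, Inf] telescope to one, which is a convenient sanity check for any implementation of this likelihood.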

References

Scheike, T. H. (2024). Discrete time survival analysis with interval censoring. mets package documentation.

See also

cumoddsreg, predictlogitSurvd, simlogitSurvd

Author

Thomas Scheike

Examples

data(ttpd) 
dtable(ttpd,~entry+time2)
#> 
#>       time2   1   2   3   4   5   6 Inf
#> entry                                  
#> 0           316   0   0   0   0   0   0
#> 1             0 133   0   0   0   0   0
#> 2             0   0 150   0   0   0   0
#> 3             0   0   0  23   0   0   0
#> 4             0   0   0   0  90   0   0
#> 5             0   0   0   0   0  68   0
#> 6             0   0   0   0   0   0 220

out <- interval_logitsurv_discrete(Interval(entry,time2)~X1+X2+X3+X4,ttpd)
summary(out)
#> $baseline
#>       Estimate Std.Err   2.5%   97.5%   P-value
#> time1  -2.0064  0.1523 -2.305 -1.7079 1.273e-39
#> time2  -2.1749  0.1599 -2.488 -1.8614 4.118e-42
#> time3  -1.4581  0.1544 -1.761 -1.1554 3.636e-21
#> time4  -2.9260  0.2453 -3.407 -2.4453 8.379e-33
#> time5  -1.2051  0.1706 -1.539 -0.8706 1.633e-12
#> time6  -0.9102  0.1860 -1.275 -0.5457 9.843e-07
#> 
#> $logor
#>    Estimate Std.Err    2.5%  97.5%   P-value
#> X1   0.9913  0.1179 0.76024 1.2223 4.100e-17
#> X2   0.6962  0.1162 0.46847 0.9238 2.064e-09
#> X3   0.3466  0.1159 0.11941 0.5738 2.788e-03
#> X4   0.3223  0.1151 0.09668 0.5478 5.111e-03
#> 
#> $or
#>    Estimate     2.5%    97.5%
#> X1 2.694610 2.138791 3.394874
#> X2 2.006032 1.597554 2.518953
#> X3 1.414239 1.126834 1.774950
#> X4 1.380231 1.101503 1.729490
#> 
head(iid(out)) 
#>            [,1]         [,2]          [,3]          [,4]          [,5]
#> 1  0.0045687959  0.004769499  0.0053427163  0.0059138018  0.0066308444
#> 2  0.0016959549  0.002038630  0.0025477402  0.0029776943 -0.0102830496
#> 3  0.0045687959  0.004769499  0.0053427163  0.0059138018  0.0066308444
#> 4  0.0027545442 -0.006047556 -0.0007244072 -0.0006949805 -0.0006704063
#> 5 -0.0002919658 -0.008889214 -0.0026820744 -0.0026532556 -0.0026268232
#> 6  0.0001497624 -0.008530642 -0.0033151419 -0.0032325395 -0.0031636812
#>            [,6]          [,7]          [,8]          [,9]         [,10]
#> 1  0.0081721788 -0.0033482398 -0.0034168560  0.0034308192 -0.0034212419
#> 2 -0.0012875717 -0.0007883982 -0.0005310631 -0.0004080546 -0.0000776067
#> 3  0.0081721788 -0.0033482398 -0.0034168560  0.0034308192 -0.0034212419
#> 4 -0.0006379316  0.0003557924  0.0007697270  0.0008855193 -0.0013506040
#> 5 -0.0025608456 -0.0026170215  0.0016772465  0.0020412533  0.0017043055
#> 6 -0.0030621110  0.0015290328  0.0016662399  0.0020179143  0.0017471657

pred <- predictlogitSurvd(out,se=FALSE)
plotSurvd(pred)


ttpd <- dfactor(ttpd,fentry~entry)
out <- cumoddsreg(fentry~X1+X2+X3+X4,ttpd)
summary(out)
#> $baseline
#>       Estimate Std.Err   2.5%   97.5%   P-value
#> time1  -2.0064  0.1523 -2.305 -1.7079 1.273e-39
#> time2  -2.1749  0.1599 -2.488 -1.8614 4.118e-42
#> time3  -1.4581  0.1544 -1.761 -1.1554 3.636e-21
#> time4  -2.9260  0.2453 -3.407 -2.4453 8.379e-33
#> time5  -1.2051  0.1706 -1.539 -0.8706 1.633e-12
#> time6  -0.9102  0.1860 -1.275 -0.5457 9.843e-07
#> 
#> $logor
#>    Estimate Std.Err    2.5%  97.5%   P-value
#> X1   0.9913  0.1179 0.76024 1.2223 4.100e-17
#> X2   0.6962  0.1162 0.46847 0.9238 2.064e-09
#> X3   0.3466  0.1159 0.11941 0.5738 2.788e-03
#> X4   0.3223  0.1151 0.09668 0.5478 5.111e-03
#> 
#> $or
#>    Estimate     2.5%    97.5%
#> X1 2.694610 2.138791 3.394874
#> X2 2.006032 1.597554 2.518953
#> X3 1.414239 1.126834 1.774950
#> X4 1.380231 1.101503 1.729490
#>