
Interface for statistical and machine learning models to be used for nuisance model estimation in targeted learning.

The following list provides an overview of constructors for many commonly used models.

Regression and classification: learner_glm, learner_gam, learner_grf, learner_hal, learner_glmnet_cv, learner_svm, learner_xgboost, learner_mars
Regression: learner_isoreg
Classification: learner_naivebayes
Ensemble (super learner): learner_sl

Author

Klaus Kähler Holst, Benedikt Sommer

Public fields

info

Optional information/name of the model

Active bindings

clear

Remove fitted model from the learner object

fit

Return estimated model object.

formula

Return model formula. Use learner$update() to update the formula.

predict.filter

Return instantiated prediction filter function

predict.filter.generator

Return prediction filter generator function

Methods


learner$new()

Create a new prediction model object

Usage

learner$new(
  formula = NULL,
  estimate,
  predict = stats::predict,
  predict.args = NULL,
  estimate.args = NULL,
  info = NULL,
  specials = c(),
  formula.keep.specials = FALSE,
  predict.filter = function(data, ...) function(pred, newdata, ...) pred,
  intercept = FALSE
)

Arguments

formula

formula specifying outcome and design matrix

estimate

function for fitting the model. This must be a function with arguments for the response, 'y', and the design matrix, 'x', or alternatively a function with formula and data arguments. See the examples section.

predict

prediction function (must be a function of model object, 'object', and new design matrix, 'newdata')

predict.args

optional arguments to prediction function

estimate.args

optional arguments to estimate function

info

optional description of the model

specials

optional specials terms (weights, offset, id, subset, ...) passed on to design

formula.keep.specials

if TRUE, special terms defined by specials are kept in the formula passed to the estimate function; if FALSE (the default), they are removed from the formula before it is passed on

predict.filter

function to post-process predictions. Useful to bound predictions or handle NAs. The argument is experimental and its behavior may change in the future.

intercept

(logical) include intercept in design matrix
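
As a minimal sketch of how estimate.args and predict.args are forwarded, the following wraps stats::glm in a learner using the formula/data form of estimate described above. The data set (mtcars), the variables and the info label are illustrative choices, not taken from the package documentation.

```r
library(targeted)

# Logistic regression learner: `family` is forwarded to the estimate
# function via estimate.args, and `type` to stats::predict via predict.args.
lr <- learner$new(
  am ~ wt,
  estimate = function(formula, data, ...) glm(formula, data = data, ...),
  estimate.args = list(family = binomial()),
  predict.args = list(type = "response"),
  info = "logistic regression (illustrative)"
)

lr$estimate(mtcars)
pr <- lr$predict(head(mtcars)) # predicted probabilities for the first rows
pr
```

Because predict is left at its default (stats::predict), the fitted glm object is passed to predict.glm together with the entries of predict.args.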


learner$estimate()

Estimation method

Usage

learner$estimate(data, ..., store = TRUE)

Arguments

data

data.frame

...

Additional arguments to estimation and prediction filter generator function

store

Logical determining if estimated model should be stored inside the class.


learner$predict()

Prediction method

Usage

learner$predict(newdata, ..., object = NULL)

Arguments

newdata

data.frame

...

Additional arguments to prediction method and prediction filter function

object

Optional model fit object
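
The interplay between store and object can be sketched as follows. This assumes that learner$estimate() returns the fitted model, so that a fit obtained with store = FALSE can be handed back to learner$predict() through object; the data set and the row subset are illustrative.

```r
library(targeted)

lr <- learner_glm(mpg ~ wt + hp)

# Default (store = TRUE): the fit is kept inside the learner object
lr$estimate(mtcars)
pr1 <- lr$predict(head(mtcars))

# store = FALSE: assumed to leave the stored fit untouched and to return
# the fitted model, which is then supplied explicitly through `object`
fit <- lr$estimate(mtcars[1:20, ], store = FALSE)
pr2 <- lr$predict(head(mtcars), object = fit)
```

This pattern is convenient when several fits of the same learner are needed, for example across resampling folds, without overwriting the model stored in the object.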


learner$update()

Update formula

Usage

learner$update(formula)

Arguments

formula

formula or character which defines the new response
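
A short sketch of changing the model formula of an existing learner, here by supplying a complete formula. The variable names come from mtcars and are purely illustrative.

```r
library(targeted)

lr <- learner_glm(mpg ~ wt)
lr$update(hp ~ wt) # the response is now hp instead of mpg
lr$formula         # inspect the updated formula

# The learner is then refitted with the new formula
lr$estimate(mtcars)
```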


learner$print()

Print method

Usage

learner$print()


learner$summary()

Summary method to provide more extensive information than learner$print().

Usage

learner$summary()

Returns

summarized_learner object, which is a list with the following elements:

info

description of the learner

formula

formula specifying outcome and design matrix

estimate

function for fitting the model

estimate.args

arguments to estimate function

predict

function for making predictions from fitted model

predict.args

arguments to predict function

specials

provided special terms

intercept

include intercept in design matrix

Examples

lr <- learner_glm(y ~ x, family = "nb")
lr$summary()

lr_sum <- lr$summary() # store returned summary in new object
names(lr_sum)
print(lr_sum)


learner$response()

Extract response from data

Usage

learner$response(data, eval = TRUE, ...)

Arguments

data

data.frame

eval

when FALSE, return the untransformed outcome (e.g., return 'a' if the formula is defined as I(a==1) ~ ...)

...

additional arguments to design
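
The effect of eval can be sketched with a transformed response. The learner and the iris data are illustrative, and the comments describe the behavior as documented above rather than verified output.

```r
library(targeted)

lr <- learner_glm(I(Species == "setosa") ~ Sepal.Length, family = binomial())

# eval = TRUE (default): the response as evaluated from the formula,
# i.e. the indicator Species == "setosa"
y <- lr$response(iris)

# eval = FALSE: the untransformed outcome, i.e. the Species column itself
y0 <- lr$response(iris, eval = FALSE)
```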


learner$design()

Generate design object (design matrix and response) from data

Usage

learner$design(data, ...)

Arguments

data

data.frame

...

additional arguments to design
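
To inspect what is handed to an estimate function of the (y, x) form, the design object can be generated directly. This sketch assumes that the returned object exposes the design matrix and the response as elements x and y; those element names are an assumption and do not appear in the text above.

```r
library(targeted)

lr <- learner$new(
  mpg ~ wt + hp,
  estimate = function(y, x, ...) lm.fit(x = x, y = y),
  predict = function(object, newdata, ...) newdata %*% object$coefficients
)

d <- lr$design(mtcars)
dim(d$x)  # design matrix (assumed element name `x`)
head(d$y) # response (assumed element name `y`)
```

With the default intercept = FALSE, the design matrix is expected to contain no intercept column.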


learner$opt()

Get options

Usage

learner$opt(arg)

Arguments

arg

name of option to get value of


learner$clone()

The objects of this class are cloneable with this method.

Usage

learner$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

data(iris)
rf <- function(formula, ...) {
  learner$new(formula,
    info = "grf::probability_forest",
    estimate = function(x, y, ...) {
      grf::probability_forest(X = x, Y = y, ...)
    },
    predict = function(object, newdata) {
      predict(object, newdata)$predictions
    },
    estimate.args = list(...)
  )
}

args <- expand.list(
  num.trees = c(100, 200), mtry = 1:3,
  formula = c(Species ~ ., Species ~ Sepal.Length + Sepal.Width)
)
models <- lapply(args, function(par) do.call(rf, par))

x <- models[[1]]$clone()
x$estimate(iris)
predict(x, newdata = head(iris))
#>         setosa  versicolor   virginica
#> [1,] 0.9960000 0.003000000 0.001000000
#> [2,] 0.9612500 0.035083333 0.003666667
#> [3,] 0.9950000 0.003333333 0.001666667
#> [4,] 0.9852381 0.013095238 0.001666667
#> [5,] 1.0000000 0.000000000 0.000000000
#> [6,] 0.9528413 0.044492063 0.002666667

# \donttest{
# Reduce Ex. timing
a <- targeted::cv(models, data = iris)
cbind(coef(a), attr(args, "table"))
#>              brier -logscore
#> model1  0.09933114 0.2176723
#> model2  0.09852394 0.2190576
#> model3  0.08930160 0.1951197
#> model4  0.08693167 0.1859497
#> model5  0.07920760 0.1667053
#> model6  0.07973235 0.1642473
#> model7  0.33889722 0.5576471
#> model8  0.34434199 0.5659671
#> model9  0.33340380 0.5510511
#> model10 0.33373744 0.5450570
#> model11 0.34011615 0.5492692
#> model12 0.33269580 0.5396498
# }

# defining learner via function with arguments y (response)
# and x (design matrix)
f1 <- learner$new(
  estimate = function(y, x) lm.fit(x = x, y = y),
  predict = function(object, newdata) newdata %*% object$coefficients
)
# defining the learner via arguments formula and data
f2 <- learner$new(
  estimate = function(formula, data, ...) glm(formula, data, ...)
)
# generic learner defined from function (predict method derived per default
# from stats::predict)
f3 <- learner$new(
  estimate = function(dt, ...) {
    lm(y ~ x, data = dt)
  }
)

## ------------------------------------------------
## Method `learner$summary()`
## ------------------------------------------------

lr <- learner_glm(y ~ x, family = "nb")
lr$summary()
#> ────────── learner object ──────────
#> glm 
#> 
#> formula: y ~ x <environment: 0x564a99a00128> 
#> estimate: formula, data, family, ... 
#> estimate.args: family=nb 
#> predict: object, newdata, ... 
#> predict.args:   
#> specials:  

lr_sum <- lr$summary() # store returned summary in new object
names(lr_sum)
#> [1] "formula"       "info"          "estimate.args" "predict.args" 
#> [5] "estimate"      "predict"       "specials"      "intercept"    
print(lr_sum)
#> ────────── learner object ──────────
#> glm 
#> 
#> formula: y ~ x <environment: 0x564a99a00128> 
#> estimate: formula, data, family, ... 
#> estimate.args: family=nb 
#> predict: object, newdata, ... 
#> predict.args:   
#> specials: