Interface for statistical and machine learning models to be used for nuisance model estimation in targeted learning.
The following list provides an overview of constructors for many commonly used models.
Regression and classification: learner_glm, learner_gam, learner_grf,
learner_hal, learner_glmnet_cv, learner_svm, learner_xgboost,
learner_mars
Regression: learner_isoreg
Classification: learner_naivebayes
Ensemble (super learner): learner_sl
Active bindings
- clear
Remove fitted model from the learner object
- fit
Return estimated model object.
- formula
Return model formula. Use learner$update() to update the formula.
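A minimal sketch of how the active bindings might be used, assuming the targeted package is installed. The data frame d is made up for illustration, and the comment on clear restates the description above rather than verified output.

```r
library(targeted)

# Hypothetical toy data, for illustration only
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 1 + 2 * d$x + rnorm(50)

lr <- learner_glm(y ~ x)
lr$formula      # model formula stored in the learner

lr$estimate(d)  # fit the model on d
fit <- lr$fit   # fitted model object, as returned by the estimate function

lr$clear        # per the description above, removes the fitted model
```

Active bindings are accessed like fields, without parentheses.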
Methods
Method new()
Create a new prediction model object
Arguments
- formula
formula specifying outcome and design matrix
- estimate
function for fitting the model. This must be a function with arguments for the response, 'y', and the design matrix, 'x'. Alternatively, it can be a function with a formula and a data argument. See the examples section.
- predict
prediction function (must be a function of the model object, 'object', and the new design matrix, 'newdata')
- predict.args
optional arguments to the prediction function
- estimate.args
optional arguments to the estimate function
- info
optional description of the model
- specials
optional special terms (weights, offset, id, subset, ...) passed on to design
- formula.keep.specials
if TRUE then special terms defined by specials will be removed from the formula before it is passed to the estimate function
- intercept
(logical) include intercept in design matrix
Method estimate()
Estimation method
Method predict()
Prediction method
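As a rough illustration of the estimate and predict methods, the following sketch fits a learner and then predicts on a few rows. It assumes the targeted package is installed; the simulated data frame d is hypothetical, and passing the new data as the first positional argument of predict is an assumption based on the examples section.

```r
library(targeted)

# Hypothetical toy data, for illustration only
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 1 + 2 * d$x + rnorm(50)

lr <- learner_glm(y ~ x)
lr$estimate(d)              # fit the model and store it in the learner
pr <- lr$predict(head(d))   # predictions for the first six rows of d
```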
Method update()
Update formula
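A short sketch of updating the formula of an existing learner, assuming the targeted package is installed. The covariate z is hypothetical and only serves to show the change in the stored formula.

```r
library(targeted)

lr <- learner_glm(y ~ x)
lr$update(y ~ x + z)   # z is a hypothetical second covariate
lr$formula             # inspect the formula now stored in the learner
```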
Method summary()
Summary method to provide more extensive information than learner$print().
Returns
summarized_learner object, which is a list with the following elements:
- info
description of the learner
- formula
formula specifying outcome and design matrix
- estimate
function for fitting the model
- estimate.args
arguments to estimate function
- predict
function for making predictions from fitted model
- predict.args
arguments to predict function
- specials
provided special terms
- intercept
include intercept in design matrix
Examples
lr <- learner_glm(y ~ x, family = "nb")
lr$summary()
lr_sum <- lr$summary() # store returned summary in new object
names(lr_sum)
print(lr_sum)
Method response()
Extract response from data
Arguments
- data
data.frame
- eval
when FALSE, return the untransformed outcome (i.e., return 'a' if the formula is defined as I(a==1) ~ ...)
- ...
additional arguments to design
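The effect of eval can be sketched as follows, assuming the targeted package is installed. The data frame d and its columns a and x are made up for illustration, and the learner is never fitted here; only the response is extracted.

```r
library(targeted)

# Hypothetical toy data with a binary variable 'a'
d <- data.frame(a = c(0, 1, 1, 0), x = c(0.5, -1.2, 0.3, 2.0))

lr <- learner_glm(I(a == 1) ~ x)
y_eval <- lr$response(d)                # evaluated outcome, I(a == 1)
y_raw  <- lr$response(d, eval = FALSE)  # untransformed outcome 'a'
```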
Method design()
Generate design object (design matrix and response) from data
Arguments
- data
data.frame
- ...
additional arguments to design
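A minimal sketch of generating a design object from data, assuming the targeted package is installed. The data frame d is hypothetical, and the returned object is inspected with str() instead of assuming the names of its elements.

```r
library(targeted)

# Hypothetical toy data, for illustration only
d <- data.frame(y = c(1.2, 0.7, 3.1, 2.4), x = c(0.5, -1.2, 0.3, 2.0))

lr <- learner_glm(y ~ x)
des <- lr$design(d)        # design object built from the learner's formula
class(des)
str(des, max.level = 1)    # inspect the top-level elements
```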
Examples
data(iris)
rf <- function(formula, ...) {
  learner$new(formula,
    info = "grf::probability_forest",
    estimate = function(x, y, ...) {
      grf::probability_forest(X = x, Y = y, ...)
    },
    predict = function(object, newdata) {
      predict(object, newdata)$predictions
    },
    estimate.args = list(...)
  )
}
args <- expand.list(
  num.trees = c(100, 200), mtry = 1:3,
  formula = c(Species ~ ., Species ~ Sepal.Length + Sepal.Width)
)
models <- lapply(args, function(par) do.call(rf, par))
x <- models[[1]]$clone()
x$estimate(iris)
predict(x, newdata = head(iris))
#> setosa versicolor virginica
#> [1,] 0.9823929 0.014178571 0.003428571
#> [2,] 0.9510833 0.043488095 0.005428571
#> [3,] 0.9844048 0.013095238 0.002500000
#> [4,] 0.9687738 0.028726190 0.002500000
#> [5,] 0.9898929 0.006678571 0.003428571
#> [6,] 0.9064936 0.068506410 0.025000000
# \donttest{
# Reduce Ex. timing
a <- targeted::cv(models, data = iris)
cbind(coef(a), attr(args, "table"))
#> brier -logscore
#> model1 0.10005912 0.2205740
#> model2 0.10032820 0.2232710
#> model3 0.08405392 0.1823416
#> model4 0.08297912 0.1784703
#> model5 0.08659444 0.1799173
#> model6 0.08517396 0.1764445
#> model7 0.35036559 0.5715464
#> model8 0.34923672 0.5664500
#> model9 0.34400827 0.5539763
#> model10 0.34324098 0.5507146
#> model11 0.34244349 0.5466884
#> model12 0.34120762 0.5492056
# }
# defining learner via function with arguments y (response)
# and x (design matrix)
f1 <- learner$new(
  estimate = function(y, x) lm.fit(x = x, y = y),
  predict = function(object, newdata) newdata %*% object$coefficients
)
# defining the learner via arguments formula and data
f2 <- learner$new(
  estimate = function(formula, data, ...) glm(formula, data, ...)
)
# generic learner defined from a function (the predict method is derived
# by default from stats::predict)
f3 <- learner$new(
  estimate = function(dt, ...) {
    lm(y ~ x, data = dt)
  }
)
## ------------------------------------------------
## Method `learner$summary`
## ------------------------------------------------
lr <- learner_glm(y ~ x, family = "nb")
lr$summary()
#> ────────── learner object ──────────
#> glm
#>
#> formula: y ~ x <environment: 0x561eb200fcf8>
#> estimate: formula, data, family, ...
#> estimate.args: family=nb
#> predict: object, newdata, ...
#> predict.args:
#> specials:
lr_sum <- lr$summary() # store returned summary in new object
names(lr_sum)
#> [1] "formula" "info" "estimate.args" "predict.args"
#> [5] "estimate" "predict" "specials" "intercept"
print(lr_sum)
#> ────────── learner object ──────────
#> glm
#>
#> formula: y ~ x <environment: 0x561eb200fcf8>
#> estimate: formula, data, family, ...
#> estimate.args: family=nb
#> predict: object, newdata, ...
#> predict.args:
#> specials:
