Cross-validation estimate of the generalization error of the super learner and of each separate model in the ensemble. Both the chosen model scoring metrics and the model weights of the stacked ensemble are reported.
Usage
# S3 method for class 'learner_sl'
cv(object, data, nfolds = 5, rep = 1, model.score = scoring, ...)
Arguments
- object
(learner_sl) Instantiated learner_sl object.
- data
data.frame or matrix
- nfolds
Number of folds. With nfolds=0, a simple train/test split into two folds is used, 1:([n]/2) and ([n]/2+1):n, with the last part used for testing.
- rep
Number of repetitions (default 1)
- model.score
Model scoring metric (default: MSE / Brier score). Must be a function with arguments response and prediction, and may optionally include weights, object and newdata arguments
- ...
Additional arguments passed to elements in object
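The nfolds = 0 split described under nfolds can be illustrated with a small base R sketch. It is not the package's internal code: it assumes that [n]/2 means integer division of the number of rows by two, and the names n, half, train_idx and test_idx are hypothetical.

```r
# Sketch of a simple train/test split into two parts (assumption: [n]/2 is
# taken as integer division; this is not the package's internal code).
n <- 11                     # hypothetical number of rows
half <- n %/% 2             # size of the first part
train_idx <- seq_len(half)  # rows 1:([n]/2), used for training
test_idx <- (half + 1):n    # remaining rows, used for testing

train_idx
test_idx
```

Under this assumption the two index sets are disjoint and together cover every row, with the extra row going to the test part when n is odd.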
Examples
sim1 <- function(n = 5e2) {
  x1 <- rnorm(n, sd = 2)
  x2 <- rnorm(n)
  y <- x1 + cos(x1) + rnorm(n, sd = 0.5**.5)
  data.frame(y, x1, x2)
}
sl <- learner_sl(list(
  "mean" = learner_glm(y ~ 1),
  "glm" = learner_glm(y ~ x1),
  "glm2" = learner_glm(y ~ x1 + x2)
))
cv(sl, data = sim1(), rep = 2)
#>
#> 5-fold cross-validation with 2 repetitions
#>
#> ── mse
#>         mean      sd     min     max
#> sl   0.97783 0.10834 0.84327 1.16670
#> mean 4.83617 0.55126 3.76855 5.61918
#> glm  0.93604 0.10606 0.81326 1.15954
#> glm2 0.93807 0.10821 0.81342 1.16182
#>
#> ── mae
#>         mean      sd     min     max
#> sl   0.79538 0.04366 0.72901 0.85986
#> mean 1.74969 0.10167 1.54912 1.88131
#> glm  0.78390 0.05189 0.71345 0.87016
#> glm2 0.78475 0.05216 0.71386 0.86993
#>
#> ── weight
#>         mean      sd     min     max
#> sl         -       -       -       -
#> mean 0.04846 0.09221 0.00000 0.24134
#> glm  0.88012 0.17292 0.47291 1.00000
#> glm2 0.07141 0.17056 0.00000 0.52709
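
A custom scoring function for model.score could look like the sketch below. The function name rmse is hypothetical. The argument names response, prediction and weights follow the model.score description; returning a named value is an assumption that mirrors the named metrics (mse, mae) in the output above. The final cv() call is left commented out because its exact output with a custom metric is not shown in this documentation.

```r
# Hypothetical custom metric: (weighted) root mean squared error.
# `response` and `prediction` are the required arguments; `weights` is one of
# the optional arguments; `...` absorbs anything else that may be supplied.
rmse <- function(response, prediction, weights = NULL, ...) {
  # Fall back to equal weights when none are supplied
  if (is.null(weights)) weights <- rep(1, length(response))
  c(rmse = sqrt(sum(weights * (response - prediction)^2) / sum(weights)))
}

# Stand-alone check on toy vectors (base R only)
rmse(response = c(1, 2, 3), prediction = c(1, 2, 5))

# Assumed usage with the objects defined in the example above:
# cv(sl, data = sim1(), model.score = rmse)
```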
