Cross-validated estimates of the generalization error of the super learner and of each separate model in the ensemble. Both the chosen model scoring metrics and the model weights of the stacked ensemble are reported.
# S3 method for class 'learner_sl'
cv(object, data, nfolds = 5, rep = 1, model.score = scoring, ...)
object: (learner_sl) Instantiated learner_sl object.
data: data.frame or matrix
nfolds: Number of folds. With nfolds = 0, a simple train/test split into two folds, 1:[n/2] and ([n/2]+1):n, is used, with the last part used for testing.
rep: Number of repetitions (default 1)
model.score: Model scoring metric (default: MSE / Brier score). Must be a function with arguments response and prediction, and may optionally include weights, object and newdata arguments.
...: Additional arguments passed to elements in object
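As an illustration of the model.score argument, the following is a minimal sketch of a custom scoring function. The name rmse_score is hypothetical, and it is an assumption (not stated above) that the function should return a named numeric vector and that extra arguments can be absorbed with `...`.

```r
# Hypothetical custom metric: root mean squared error.
# Assumption: a named numeric vector is an acceptable return value,
# and optional arguments (weights, object, newdata) can be absorbed by `...`.
rmse_score <- function(response, prediction, ...) {
  res <- as.numeric(response) - as.numeric(prediction)
  c(rmse = sqrt(mean(res^2)))
}

# Standalone check on toy values (no learner object needed)
rmse_score(response = c(1, 2, 3), prediction = c(1, 2, 5))
```

If the assumptions hold, such a function could be supplied as cv(object, data, model.score = rmse_score).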
sim1 <- function(n = 5e2) {
  x1 <- rnorm(n, sd = 2)
  x2 <- rnorm(n)
  y <- x1 + cos(x1) + rnorm(n, sd = 0.5**.5)
  data.frame(y, x1, x2)
}
sl <- learner_sl(list(
  "mean" = learner_glm(y ~ 1),
  "glm" = learner_glm(y ~ x1),
  "glm2" = learner_glm(y ~ x1 + x2)
))
cv(sl, data = sim1(), rep = 2)
#>
#> 5-fold cross-validation with 2 repetitions
#>
#> ── mse
#>         mean      sd     min     max
#> sl   0.97783 0.10834 0.84327 1.16670
#> mean 4.83617 0.55126 3.76855 5.61918
#> glm  0.93604 0.10606 0.81326 1.15954
#> glm2 0.93807 0.10821 0.81342 1.16182
#>
#> ── mae
#>         mean      sd     min     max
#> sl   0.79538 0.04366 0.72901 0.85986
#> mean 1.74969 0.10167 1.54912 1.88131
#> glm  0.78390 0.05189 0.71345 0.87016
#> glm2 0.78475 0.05216 0.71386 0.86993
#>
#> ── weight
#>         mean      sd     min     max
#> sl         -       -       -       -
#> mean 0.04846 0.09221 0.00000 0.24134
#> glm  0.88012 0.17292 0.47291 1.00000
#> glm2 0.07141 0.17056 0.00000 0.52709
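
To help read the mean, sd, min and max columns above, the following base R sketch runs a repeated 5-fold cross-validation of the MSE for a single linear model. It is not the package's implementation: lm() stands in for the learner, the names d, nrep and fold are hypothetical, and it is an assumption that the summary is taken over the individual fold scores.

```r
set.seed(1)
n <- 200
x1 <- rnorm(n, sd = 2)
d <- data.frame(y = x1 + cos(x1) + rnorm(n, sd = sqrt(0.5)), x1 = x1)

nfolds <- 5
nrep <- 2
mse <- c()
for (r in seq_len(nrep)) {
  # Random assignment of each observation to one of the folds
  fold <- sample(rep(seq_len(nfolds), length.out = n))
  for (k in seq_len(nfolds)) {
    # Fit on all folds except k, score on the held-out fold k
    fit <- lm(y ~ x1, data = d[fold != k, ])
    pred <- predict(fit, newdata = d[fold == k, ])
    mse <- c(mse, mean((d$y[fold == k] - pred)^2))
  }
}

# Summary over the nfolds * nrep held-out scores (assumed layout)
c(mean = mean(mse), sd = sd(mse), min = min(mse), max = max(mse))
```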