Cross-validation — cv • targeted

Generic cross-validation function

cv(
  models,
  data,
  response = NULL,
  nfolds = 5,
  rep = 1,
  weights = NULL,
  model.score = NULL,
  seed = NULL,
  shared = NULL,
  args.pred = NULL,
  args.future = list(),
  mc.cores,
  ...
)

Arguments

models: List of fitting functions
data: data.frame or matrix
response: Response variable (vector or name of column in data).
nfolds: Number of folds K (default 5). K=0 splits the data into 1:n/2 and n/2:n, with the last part used for testing.
rep: Number of repetitions (default 1)
weights: Optional frequency weights
model.score: Model scoring metric (default: MSE / Brier score). Must be a function with arguments: response, prediction, weights, object, ...
seed: Random seed (argument passed to future.apply::future_lapply)
shared: Function applied to each fold, with the results sent to each model
args.pred: Optional arguments to prediction function (see details below)
args.future: Arguments to future.apply::future_mapply
mc.cores: Optional number of cores. If set, parallel::mcmapply is used instead of future
...: Additional arguments passed to the fitting functions in models
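
As an illustration of the model.score argument, the following is a minimal sketch of a scoring function that takes the documented arguments (response, prediction, weights, object, ...). The name weighted_scores and the choice to return a named numeric vector are assumptions made for this example; the argument description above does not specify the return format.

```r
# Hypothetical scoring function using the argument names listed under model.score.
# Returns the weighted mean squared error and weighted mean absolute error.
weighted_scores <- function(response, prediction, weights = NULL, object = NULL, ...) {
  # Fall back to equal weights when none are supplied
  if (is.null(weights)) weights <- rep(1, length(response))
  res <- response - prediction
  c(
    mse = sum(weights * res^2) / sum(weights),
    mae = sum(weights * abs(res)) / sum(weights)
  )
}

# Stand-alone check on toy numbers (no cross-validation involved)
score <- weighted_scores(response = c(1, 2, 3), prediction = c(1, 2, 5))
score

# Assumed usage with cv (not run here):
# cv(models, data, model.score = weighted_scores)
```

The object argument is left unused in this sketch; it is kept in the signature only because the documentation lists it.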

Value

An object of class 'cross_validated' is returned. See cross_validated-class for more details about this class and its generic functions.

Details

models should be a list of objects of class ml_model. Alternatively, each element of models should be a list with a fitting function and a prediction function.
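
To make the pairing of a fitting function with a prediction function concrete, here is a base-R sketch of what K-fold cross-validation computes for one such pair. It is not the package's implementation: the element names fit and predict, the helper kfold_mse, and the way rows are assigned to folds are all assumptions chosen for illustration.

```r
# Hypothetical model specification: a fitting function and a prediction function
model <- list(
  fit = function(data, ...) lm(Sepal.Length ~ Species + Petal.Length, data = data),
  predict = function(object, newdata, ...) predict(object, newdata = newdata)
)

# Hand-rolled K-fold cross-validation returning the average test MSE
kfold_mse <- function(model, data, response, nfolds = 5, seed = 1) {
  set.seed(seed)
  n <- nrow(data)
  # Randomly assign each row to one of nfolds roughly equal-sized folds
  fold <- sample(rep(seq_len(nfolds), length.out = n))
  mse <- numeric(nfolds)
  for (k in seq_len(nfolds)) {
    test <- fold == k
    # Fit on all rows outside fold k, then predict the held-out rows
    obj <- model$fit(data[!test, , drop = FALSE])
    pred <- model$predict(obj, data[test, , drop = FALSE])
    mse[k] <- mean((data[[response]][test] - pred)^2)
  }
  mean(mse)
}

res <- kfold_mse(model, iris, response = "Sepal.Length")
res
```

Repeating this procedure with different random fold assignments and averaging the results corresponds to what the rep argument describes.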

The response argument can optionally be a named list where the name is then used as the name of the response argument in models. Similarly, if data is a named list with a single data.frame/matrix then this name will be used as the name of the data/design matrix argument in models.

Author

Klaus K. Holst

Examples

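# m0 forwards extra arguments (here: formula) to lm; m1 and m2 use fixed formulas
f0 <- function(data, ...) lm(..., data = data)
f1 <- function(data, ...) lm(Sepal.Length ~ Species, data = data)
f2 <- function(data, ...) lm(Sepal.Length ~ Species + Petal.Length, data = data)
# 5-fold cross-validation repeated 10 times; formula is passed on via ...
x <- cv(list(m0 = f0, m1 = f1, m2 = f2), rep = 10, data = iris, formula = Sepal.Length ~ .)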
x
#> Call: cv(models = list(m0 = f0, m1 = f1, m2 = f2), data = iris, rep = 10, 
#>     formula = Sepal.Length ~ .)
#> 
#> 5-fold cross-validation with 10 repetitions
#> 
#>           mse       mae
#> m0 0.09955688 0.2553216
#> m1 0.27017313 0.4062745
#> m2 0.11754607 0.2769748