
Fits an XGBoost model by grid search on a subsample of the input dataset, drawn at the defined rate (r_subsample). With proper settings, users can utilize graphics processing units (GPUs) to speed up training.

Usage

fit_base_xgb(
  dt_imputed,
  folds = NULL,
  tune_mode = "grid",
  tune_bayes_iter = 50L,
  learn_rate = 0.1,
  yvar = "Arithmetic.Mean",
  xvar = seq(5, ncol(dt_imputed)),
  vfold = 5L,
  device = "cuda:0",
  trim_resamples = TRUE,
  return_best = FALSE,
  ...
)

Arguments

dt_imputed

The input data table to be used for fitting.

folds

A pre-generated rset object with a minimal number of columns. If NULL, vfold must be numeric and is passed to rsample::vfold_cv.

tune_mode

character(1). Hyperparameter tuning mode. One of "grid" (default) or "bayes".

tune_bayes_iter

integer(1). The number of iterations for Bayesian optimization. Default is 50. Only used when tune_mode = "bayes".

learn_rate

numeric(1). The learning rate for the model, exposed as a separate argument so that tuning runs can branch on it. Default is 0.1.

yvar

The name of the target variable. Default is "Arithmetic.Mean".

xvar

The predictor variables. Defaults to all columns from the fifth onward (seq(5, ncol(dt_imputed))).

vfold

integer(1). The number of cross-validation folds. Used only when folds is NULL.

device

character(1). The device to be used for training. Default is "cuda:0". Make sure that your system is equipped with CUDA-enabled graphics processing units.

trim_resamples

logical(1). Default is TRUE, which replaces the actual data frames in the splits column of the tune_results object with NA to reduce the size of the returned object.

return_best

logical(1). If TRUE, the best tuned model is returned.

...

Additional arguments to be passed.
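For the folds argument, a compatible rset object can be pre-generated with the rsample package. A minimal sketch, assuming dt_imputed is already available as a data frame:

```r
library(rsample)

# Pre-generate a 5-fold cross-validation rset to pass as `folds`;
# when this is supplied, the `vfold` argument is ignored.
folds <- vfold_cv(dt_imputed, v = 5L)
```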

Value

The fitted workflow.

Details

Hyperparameters mtry, ntrees, and learn_rate are tuned. With tune_mode = "grid", users can modify learn_rate explicitly, while the other hyperparameters are drawn from a predefined grid (30 combinations per learn_rate value).
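A typical grid-search call might look like the following. This is a hypothetical sketch: dt_imputed is assumed to exist, and the learn_rate value shown is illustrative rather than the default.

```r
# Grid-search tuning on GPU; learn_rate overrides the 0.1 default,
# while mtry and ntrees come from the predefined grid.
fitted <- fit_base_xgb(
  dt_imputed = dt_imputed,
  tune_mode = "grid",
  learn_rate = 0.05,
  vfold = 5L,
  device = "cuda:0",
  return_best = TRUE
)
```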

Note

The tune package must be version 1.2.0 or higher, and xgboost must be installed with GPU support.
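These prerequisites can be probed up front. A sketch, assuming both packages are installed and a recent xgboost (2.0+) in which the device parameter selects the accelerator; a build without GPU support will error at training time:

```r
# Verify the tune version requirement noted above
stopifnot(utils::packageVersion("tune") >= "1.2.0")

# Probe GPU support by training a tiny throwaway model on the requested device
xgboost::xgb.train(
  params = list(device = "cuda:0", objective = "reg:squarederror"),
  data = xgboost::xgb.DMatrix(matrix(rnorm(20), 10, 2), label = rnorm(10)),
  nrounds = 1L
)
```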