This function subsets the full data by column subsampling (rate = 50%). The optimal hyperparameter search is based on spatiotemporal cross-validation schemes. As of version 0.4.5, users can define the metric used for selecting the best hyperparameter set (default = "rmse").
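The metric-based selection described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the `cv_results` table, its values, and the `select_best_candidate` helper are hypothetical. Error metrics ("rmse", "mae") are minimized, while "rsq" is maximized.

```r
# Hypothetical table of cross-validated results, one row per candidate
# hyperparameter set (values are made up for illustration).
cv_results <- data.frame(
  candidate = c("a", "b", "c"),
  rmse = c(2.10, 1.85, 1.92),
  rsq  = c(0.71, 0.78, 0.80),
  mae  = c(1.60, 1.42, 1.38)
)

# Pick the best row for a given metric: smaller is better for the error
# metrics (rmse, mae), larger is better for rsq.
select_best_candidate <- function(results, metric = "rmse") {
  metric <- match.arg(metric, c("rmse", "rsq", "mae"))
  idx <- if (metric == "rsq") {
    which.max(results[[metric]])
  } else {
    which.min(results[[metric]])
  }
  results[idx, , drop = FALSE]
}

best_rmse <- select_best_candidate(cv_results, "rmse")
best_rsq <- select_best_candidate(cv_results, "rsq")
```

Different metrics can favor different candidates, which is why the choice of metric matters when selecting the best hyperparameter set.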
Arguments
- data
 data.frame. Full data.frame of base learner predictions and AQS spatiotemporal identifiers. See attach_pred.
- c_subsample
 numeric(1). Rate of column resampling. Default is 0.5.
- r_subsample
 numeric(1). The proportion of rows to be used. Default is 1.0, which uses the full dataset, but the setting is required to balance groups generated with make_subdata.
- yvar
 character(1). Outcome variable name.
- target_cols
 character. Columns from data to be retained during column resampling. Default is c("site_id", "time", "Event.Type", "lon", "lat").
- args_generate_cv
 List of arguments to be passed to the switch_generate_cv_rset function.
- tune_iter
 integer(1). Bayesian optimization iterations. Default is 50.
- trim_resamples
 logical(1). Default is TRUE, which replaces the actual data.frames in the splits column of the tune_results object with NA. Passed to fit_base_tune.
- return_best
 logical(1). If TRUE, the best tuned model is returned. Passed to fit_base_tune.
- metric
 character(1). The metric to be used for selecting the best hyperparameter set. Must be one of "rmse", "rsq", "mae". Default = "rmse". Passed to fit_base_tune.
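How c_subsample, r_subsample, yvar, and target_cols might interact can be sketched in base R. This is a minimal illustration under stated assumptions, not the function's actual code: the toy data, the outcome name `pm25`, the `pred_*` column names, and the `subsample_sketch` helper are all hypothetical.

```r
set.seed(2024)  # for a reproducible sketch only

# Toy stand-in for the base learner prediction table (made-up values).
n <- 20
toy <- data.frame(
  site_id = rep(sprintf("s%02d", 1:5), each = 4),
  time = rep(1:4, times = 5),
  Event.Type = "None",
  lon = runif(n), lat = runif(n),
  pm25 = rnorm(n),                       # outcome (yvar), hypothetical name
  pred_1 = rnorm(n), pred_2 = rnorm(n),
  pred_3 = rnorm(n), pred_4 = rnorm(n)
)

subsample_sketch <- function(data, yvar, target_cols,
                             c_subsample = 0.5, r_subsample = 1.0) {
  # Columns eligible for resampling: everything except the outcome and
  # the identifier columns that must be retained.
  candidates <- setdiff(names(data), c(yvar, target_cols))
  n_keep <- max(1L, floor(length(candidates) * c_subsample))
  kept <- sample(candidates, n_keep)
  # Row subsampling; r_subsample = 1.0 keeps every row.
  rows <- sort(sample.int(nrow(data), floor(nrow(data) * r_subsample)))
  data[rows, c(target_cols, yvar, kept), drop = FALSE]
}

sub <- subsample_sketch(
  toy, yvar = "pm25",
  target_cols = c("site_id", "time", "Event.Type", "lon", "lat")
)
```

With the defaults, half of the four prediction columns are kept alongside the identifier columns and the outcome, and all rows are retained.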
Value
List of 3, including the best-fit model, the best hyperparameters,
and all performance records from tune::tune_bayes().
Note that the meta learner function returns the best-fit model,
not predicted values.
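Because the return value holds a fitted model and not predictions, a separate predict() call is needed to obtain predicted values. The sketch below uses a plain lm() fit as a stand-in for the best-fit model; the list element names (`best_fit`, `best_params`, `tune_results`) and all values are hypothetical and may differ from the actual return value.

```r
# Stand-in for the returned list; the element names used here are
# hypothetical and may differ from the actual return value.
train <- data.frame(x = 1:10, y = 2 * (1:10) + 1)
result <- list(
  best_fit = lm(y ~ x, data = train),   # stand-in for the best-fit model
  best_params = list(penalty = 0.01),   # made-up hyperparameters
  tune_results = NULL                   # placeholder for the tuning records
)

# Predictions are produced in a separate step from the fitted model.
new_data <- data.frame(x = c(11, 12))
preds <- predict(result$best_fit, newdata = new_data)
```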