Fits a single Random Forest model with an additional bagging layer and calculates performance metrics on the training and test sets.

rf_ens_fn(
  v,
  form,
  max_split,
  weights = FALSE,
  ntree = 100,
  mtry = 5,
  importance = TRUE
)

Arguments

v

A data frame object created by erf_data_prep() or internally in ens_random_forests()

form

A formula object specifying the RF model formulation (created by erf_formula_prep() or internally in ens_random_forests())

max_split

The maximum number of samples in the RF bagging procedure (created internally by ens_random_forests())

weights

A logical flag for whether to include weights, default is FALSE

ntree

The number of decision trees to use in each RF, default is 100

mtry

The number of covariates to try at each node split, default is 5

importance

A logical flag for whether the randomForest model should calculate variable importance, default is TRUE
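The arguments above map closely onto options of randomForest::randomForest(). As a rough illustration of what a bagging layer with a capped sample size can look like, the sketch below fits a class-balanced forest on synthetic data. It is not the source of rf_ens_fn(): the data frame d, the variable n_min, and the idea that max_split behaves like the sampsize argument are all assumptions made for this example.

```r
# Hypothetical illustration only; this is NOT the implementation of rf_ens_fn().
library(randomForest)

set.seed(1)
n <- 500

# Synthetic covariates and an imbalanced binary response (mostly zeros)
d <- data.frame(cov1 = rnorm(n), cov2 = rnorm(n), cov3 = rnorm(n))
d$obs <- factor(rbinom(n, 1, plogis(-2.5 + 1.5 * d$cov1)))

# Cap each tree's bootstrap sample at the size of the rarer class
# (assumed here to play a role similar to max_split)
n_min <- min(table(d$obs))

fit <- randomForest(
  obs ~ cov1 + cov2 + cov3,
  data = d,
  strata = d$obs,              # stratify the bootstrap by class
  sampsize = c(n_min, n_min),  # draw the same number of rows from each class
  ntree = 100,                 # number of trees, as in the ntree argument
  mtry = 2,                    # covariates tried at each split
  importance = TRUE            # compute variable importance
)

# Out-of-bag class probabilities, one column per class
head(predict(fit, type = "prob"))
```

Drawing equal-sized samples from each class keeps the common class from dominating every tree, which is one common motivation for bagging on top of a Random Forest when the response is imbalanced.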

Value

A list containing: mod, the randomForest model; preds, the predictions; roc_train, the Receiver Operating Characteristic (ROC) curve performance metrics calculated by rocr_ens() on the training set; and roc_test, the same metrics calculated on the test set.
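For readers unfamiliar with ROC metrics such as the auc element, the sketch below computes an AUC and an ROC curve directly with the ROCR package on simulated scores. It does not call rocr_ens(), and it makes no claim about how that function works internally. The vectors labels and scores are made up for the illustration.

```r
# Illustrative only: simulated labels and scores, not output from rf_ens_fn()
library(ROCR)

set.seed(42)
labels <- rbinom(200, 1, 0.3)                        # true classes (0/1)
scores <- plogis(rnorm(200, mean = 2 * labels - 1))  # higher on average for 1s

# Build the ROCR prediction object, then extract the AUC and the ROC curve
pred <- prediction(scores, labels)
auc  <- performance(pred, measure = "auc")@y.values[[1]]
roc  <- performance(pred, measure = "tpr", x.measure = "fpr")

auc
plot(roc, las = 1)
```

An AUC of 0.5 corresponds to random ranking and 1 to perfect separation, so a training AUC well above the test AUC suggests the model fits the training set more closely than it generalizes.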

Examples

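#prepare the model formula and the data from the simulated samples
form <- erf_formula_prep(var='obs', covariates=grep('cov', colnames(simData$samples), value=TRUE))
data <- erf_data_prep(df=simData$samples, var='obs', covariate=grep('cov', colnames(simData$samples), value=TRUE))
#set the maximum number of samples used in the bagging procedure
max_split <- max_splitter(data)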

#fit a single Random Forest with 50 trees
rf_ex <- rf_ens_fn(v=data, form=form, max_split=max_split, ntree=50)

#see the training and test AUC values
rf_ex$roc_train$auc
#> [1] 0.9698518
rf_ex$roc_test$auc
#> [1] 0.7526798

#see the distribution of predictions
par(mar=c(4,4,1,1))
plot(density(rf_ex$preds[,2],from=0,to=1,adj=2), main="", las=1)