Fits a single Random Forest model with an additional bagging layer and calculates performance metrics on the training and test sets.
rf_ens_fn(
v,
form,
max_split,
weights = FALSE,
ntree = 100,
mtry = 5,
importance = TRUE
)
v: A data frame object created by erf_data_prep() or internally in ens_random_forests()
form: A formula class object specifying the RF model formulation (created by erf_formula_prep() or internally in ens_random_forests())
max_split: The maximum number of samples in the RF bagging procedure (created internally by ens_random_forests())
weights: A logical flag indicating whether to include weights, default is FALSE
ntree: The number of decision trees to use in each RF, default is 100
mtry: The number of covariates to try at each node split, default is 5
importance: A logical flag for the randomForest model to calculate the variable importance, default is TRUE
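The max_split argument caps the number of samples drawn in the bagging step. As a rough illustration of what such a cap can look like, the base R sketch below draws a class-balanced bootstrap sample with at most max_split rows per class. The helper balanced_bag() and the toy data are hypothetical and are not part of the package; the sampling that rf_ens_fn() performs internally may differ.

```r
# Hypothetical helper (not part of the package): draw a bootstrap sample
# with at most `max_split` rows from each class of the response `var`.
balanced_bag <- function(df, var, max_split) {
  # Row indices grouped by class of the response
  idx_by_class <- split(seq_len(nrow(df)), df[[var]])
  picked <- unlist(lapply(idx_by_class, function(idx) {
    n <- min(length(idx), max_split)               # cap the per-class draw
    idx[sample.int(length(idx), n, replace = TRUE)] # sample with replacement
  }), use.names = FALSE)
  df[picked, , drop = FALSE]
}

# Toy, heavily imbalanced data set (assumed for illustration only)
set.seed(1)
toy <- data.frame(obs = factor(c(rep(0, 95), rep(1, 5))), cov1 = rnorm(100))
bag <- balanced_bag(toy, "obs", max_split = 5)
table(bag$obs)
```

Capping both classes at the same size keeps the rare class from being swamped in each bag, which is the usual motivation for this kind of limit.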
A list containing:
mod: the randomForest model
preds: the predictions
roc_train: the Receiver Operating Characteristic (ROC) curve performance metrics calculated by rocr_ens() on the training set
roc_test: the ROC curve performance metrics calculated by rocr_ens() on the test set
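For readers unfamiliar with the auc element of the ROC metrics, the following base R sketch computes the area under the ROC curve from predicted probabilities using the rank-based (Mann-Whitney) formula. It is only an illustration of the quantity; auc_rank() and the toy vectors are hypothetical, and this is not how rocr_ens() is implemented.

```r
# Hypothetical helper: rank-based (Mann-Whitney) AUC.
# prob: predicted probability of the positive class
# obs:  observed outcomes coded 0/1
auc_rank <- function(prob, obs) {
  n_pos <- sum(obs == 1)
  n_neg <- sum(obs == 0)
  r <- rank(prob)  # average ranks, so ties are handled
  (sum(r[obs == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy predictions and observations (assumed for illustration only)
prob <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
obs  <- c(1,   1,   1,    0,   0,   0)
auc_toy <- auc_rank(prob, obs)
auc_toy
```

The value is the fraction of positive/negative pairs in which the positive case receives the higher predicted probability, so 0.5 corresponds to random ranking and 1 to perfect separation.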
form <- erf_formula_prep(var='obs', covariates=grep('cov',colnames(simData$samples),value=TRUE))
data <- erf_data_prep(df=simData$samples, var='obs', covariate=grep('cov', colnames(simData$samples), value=TRUE))
max_split <- max_splitter(data)
# Fit a single Random Forest
rf_ex <- rf_ens_fn(v=data, form=form, max_split=max_split, ntree=50)
# View the training and test AUC values
rf_ex$roc_train$auc
#> [1] 0.9698518
rf_ex$roc_test$auc
#> [1] 0.7526798
# Plot the distribution of predictions
par(mar=c(4,4,1,1))
plot(density(rf_ex$preds[, 2], from = 0, to = 1, adjust = 2), main = "", las = 1)