Calculates the Accumulated Local Effects (ALE) from an ERF object

calc_ALE_multi(
  fit,
  var,
  save = TRUE,
  out.folder = NULL,
  cores = parallel::detectCores() - 4,
  type = "response"
)

Arguments

fit

The fitted object returned from calling ens_random_forests()

var

The name of the response variable

save

A logical flag indicating whether to save the output as an RData object. Default is TRUE.

out.folder

A path to the folder to write the output to. If NULL, a folder is generated in the working directory.

cores

An integer giving either the number of cores to use for parallel processing (positive value) or the number of cores to leave free (negative value). The default, parallel::detectCores() - 4, uses all but four of the detected cores.

type

The prediction type passed to predict.randomForest, either "response" or "prob". If "prob", n sets of predictions are returned, one for each of the n levels in var; if "response", the predicted response values are returned as a factor.
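
The sign convention for cores described above can be sketched as follows. This is only an illustration: resolve_cores() and its available argument are hypothetical names, not part of the package, and the clamping to at least one worker is an assumption about sensible behaviour rather than a description of what calc_ALE_multi() does internally.

```r
# resolve_cores() is a hypothetical helper illustrating the documented convention:
# positive values are the number of cores to use, negative values are the number
# of cores to leave free.
resolve_cores <- function(cores, available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core
  if (is.na(available)) available <- 1L
  n <- if (cores < 0) available + cores else cores
  # Assumption: clamp the result to the range [1, available]
  max(1L, min(as.integer(n), as.integer(available)))
}

resolve_cores(4, available = 8)   # use four cores
resolve_cores(-2, available = 8)  # leave two cores free, so six are used
resolve_cores(-4, available = 2)  # never fewer than one worker
```

Note that the default in the usage above, parallel::detectCores() - 4, evaluates to zero or less on machines with four or fewer cores, so passing cores explicitly may be the safer choice on small machines.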

Value

A list that contains a data.frame for each variable, ordered by the mean variable importance, and a vector of the covariate values (used for rug plot in plot_ALE). The columns in each data.frame are as follows:

  • x: the covariate values that the ALE was calculated for

  • class: the class of the covariate; used by the subsequent plot_ALE function

  • q: the quantile of the x value of the covariate

  • f.X: the ALEs evaluated at a given x value
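
To make the columns above concrete, here is a minimal first-order ALE sketch for a single numeric covariate, written in base R with a linear model standing in for the ensemble. It is not the package's implementation: ale_sketch(), its k argument, the quantile binning, the centering step, and the use of ecdf() for q are all assumptions chosen for illustration.

```r
# ale_sketch() is a hypothetical helper, not a function from the package.
ale_sketch <- function(model, data, cov, k = 10) {
  x <- data[[cov]]
  # Bin edges at evenly spaced quantiles of the covariate
  z <- unique(as.numeric(quantile(x, probs = seq(0, 1, length.out = k + 1))))
  n_bins <- length(z) - 1
  # Index of the interval between z[j] and z[j + 1] that each observation falls in
  bin <- cut(x, breaks = z, include.lowest = TRUE, labels = FALSE)
  # Move the covariate to the lower and upper edge of its own bin
  lo <- data
  hi <- data
  lo[[cov]] <- z[bin]
  hi[[cov]] <- z[bin + 1]
  delta <- as.numeric(predict(model, newdata = hi) - predict(model, newdata = lo))
  # Local effect: mean prediction difference within each bin (0 for empty bins)
  local_eff <- as.numeric(tapply(delta, factor(bin, levels = seq_len(n_bins)), mean))
  local_eff[is.na(local_eff)] <- 0
  # Accumulate the local effects across bins
  f <- c(0, cumsum(local_eff))
  # Center so the effect averages to zero over the observations
  counts <- tabulate(bin, nbins = n_bins)
  f <- f - sum((f[-1] + f[-length(f)]) / 2 * counts) / sum(counts)
  data.frame(x = z, class = class(x)[1], q = ecdf(x)(z), f.X = f)
}

# Toy data with a known linear effect of x1
set.seed(1)
dat <- data.frame(x1 = rnorm(500), x2 = rnorm(500))
dat$y <- 2 * dat$x1 - dat$x2 + rnorm(500, sd = 0.1)
fit_lm <- lm(y ~ x1 + x2, data = dat)
ale <- ale_sketch(fit_lm, dat, "x1")
head(ale)
```

Because the stand-in model is linear, f.X rises with x at a constant slope equal to the fitted coefficient of x1, which is a convenient sanity check. For a random forest the curve would generally be nonlinear.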

Examples

#run an ERF with 10 RFs and 
logit <- function(x){log(x/(1-x))}
inv_logit <- function(x){exp(x)/(exp(x)+1)}

x_mat <- as.data.frame(replicate(4, rnorm(1e4)))
x_mat_bin <- t(rmultinom(1e4,1,prob=c(0.33,0.33,0.33)))
x_mat$V5 <- factor(apply(x_mat_bin,1,function(x)which(x==1)))

y_gin <- function(){
  eff <- rnorm(ncol(x_mat)-1)
  eff_bin <- rnorm(ncol(x_mat_bin))
  y_logit <- t(eff %*% t(x_mat[,-5])) + t(eff_bin %*% t(x_mat_bin))
  y <- inv_logit(y_logit)
  return(y)
}

y_poss <- data.frame(y1 = y_gin(),
                    y2 = y_gin(),
                    y3 = y_gin())

y_bin <- apply(y_poss,1,function(x) rmultinom(1,1,x))
y <- apply(y_bin,2,function(x)which(x==1))
df <- data.frame(y = factor(y),x_mat)

# the response column in df is named "y", so var must match it
ens_rf_ex <- ens_random_forests(df=df, var="y", covariates=colnames(df[,-1]), save=FALSE, cores=1)
#> rounding n.forests to the nearest one

# calculate the ALEs for the multi-class response
ALEdf <- calc_ALE_multi(ens_rf_ex, var="y", save=FALSE)
head(ALEdf[[1]]$df)