R/calc_ALE_multi.R
calc_ALE_multi.Rd
Calculates the Accumulated Local Effects (ALE) from an ERF object
calc_ALE_multi(
fit,
var,
save = TRUE,
out.folder = NULL,
cores = parallel::detectCores() - 4,
type = "response"
)
fit: The fitted object returned from calling ens_random_forests()
var: The name of the response variable
save: A logical flag to save the output as an RData object; default is TRUE.
out.folder: A path to the folder to write output to. If NULL, a folder is generated in the working directory.
cores: An integer giving either the number of cores to use for parallel processing or, if negative, the number of cores to leave free. The default, parallel::detectCores() - 4, uses all but four of the detected cores.
type: Either 'response' or 'prob', as in predict.randomForest. If 'prob', n sets of predictions are returned for the n levels in var; if 'response', the factorized predicted response values are returned.
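The cores argument accepts either a positive count or a negative "leave free" value. The sketch below shows one way such a value could be turned into a worker count. The helper resolve_cores() and its available parameter are hypothetical and are not part of the package; how calc_ALE_multi handles this internally is not documented here.

```r
# Hypothetical helper, not part of the package: one way a `cores` value
# could be converted into a number of workers.
resolve_cores <- function(cores, available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core
  if (is.na(available)) available <- 1L
  # A negative value means "leave this many cores free"
  n <- if (cores < 0) available + cores else cores
  # Clamp the result to the range [1, available]
  as.integer(max(1L, min(n, available)))
}

resolve_cores(2, available = 8)    # request two cores
resolve_cores(-2, available = 8)   # leave two of eight cores free
resolve_cores(-10, available = 8)  # more cores "left free" than exist: falls back to one
```

Note that on a machine with four or fewer cores the default expression parallel::detectCores() - 4 evaluates to zero or less, so passing cores explicitly (as the example below does with cores=1) may be the safer choice.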
A list that contains a data.frame for each variable, ordered by the mean variable importance, and a vector of the covariate values (used for rug plot in plot_ALE). The columns in each data.frame are as follows:
x: the covariate values that the ALE was calculated for
class: the class of the covariate; used by subsequent plot_ALE function
q: the quantile of the x value of the covariate
f.X: the ALEs evaluated at a given x value
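To make the column layout above concrete, here is a mock of what one list element might look like. It is only a sketch built from made-up data: the f.X values are placeholders rather than real ALEs, the element name x_values is hypothetical, and the use of ecdf() for the q column is an assumption about how the quantile could be derived.

```r
set.seed(1)
x_obs  <- rnorm(200)  # made-up observed covariate values
x_grid <- quantile(x_obs, probs = seq(0, 1, by = 0.1), names = FALSE)

mock_ale <- list(
  df = data.frame(
    x     = x_grid,                  # covariate values the ALE is evaluated at
    class = "numeric",               # class of the covariate
    q     = ecdf(x_obs)(x_grid),     # empirical quantile of each x value
    f.X   = x_grid - mean(x_grid)    # placeholder values, NOT real ALEs
  ),
  x_values = x_obs                   # hypothetical name for the rug-plot vector
)

head(mock_ale$df)
```

With the real return value, the same columns would be accessed as in the example below, e.g. ALEdf[[1]]$df$f.X.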
#run an ERF with 10 RFs and
logit <- function(x){log(x/(1-x))}
inv_logit <- function(x){exp(x)/(exp(x)+1)}
x_mat <- as.data.frame(replicate(4, rnorm(1e4)))
x_mat_bin <- t(rmultinom(1e4,1,prob=c(0.33,0.33,0.33)))
x_mat$V5 <- factor(apply(x_mat_bin,1,function(x)which(x==1)))
y_gin <- function(){
eff <- rnorm(ncol(x_mat)-1)
eff_bin <- rnorm(ncol(x_mat_bin))
y_logit <- t(eff %*% t(x_mat[,-5])) + t(eff_bin %*% t(x_mat_bin))
y <- inv_logit(y_logit)
return(y)
}
y_poss <- data.frame(y1 = y_gin(),
y2 = y_gin(),
y3 = y_gin())
y_bin <- apply(y_poss,1,function(x) rmultinom(1,1,x))
y <- apply(y_bin,2,function(x)which(x==1))
df <- data.frame(y = factor(y),x_mat)
# The response column in df is named "y", so var must be "y" (not "obs")
ens_rf_ex <- ens_random_forests(df=df, var="y", covariates=colnames(df[,-1]), save=FALSE, cores=1)
#> rounding n.forests to the nearest one
# Call the function documented on this page; var has no default and must be supplied
ALEdf <- calc_ALE_multi(ens_rf_ex, var="y", save=FALSE, cores=1)
# Inspect the ALE data.frame for the most important covariate
head(ALEdf[[1]]$df)