Hi, I'm wondering if it would be possible (or even make sense) to have the option to specify random effects in the model explainer?
I thought about this because, when looking at feature importance, the full-model RMSE reported by the explainer is quite different from the RMSE of a model that accounts for random effects. For example:
library(tidyverse)
library(tidymodels)
library(multilevelmod)  # registers the "lmer" engine for linear_reg()
library(lme4)
library(DALEXtra)
df <- nlme::Oxboys
df
# model using lmer
lmr_mod <- lme4::lmer(height ~ age + Occasion + (1|Subject), df)
sjstats::rmse(lmr_mod)
# RMSE is 1.2
# model with tidymodels
mixed_model_spec <- linear_reg() %>% set_engine("lmer")
mixed_model_wf <- workflow() %>%
  add_model(mixed_model_spec, formula = height ~ age + Occasion + (1 | Subject)) %>%
  add_variables(outcomes = height, predictors = c(age, Occasion, Subject))
fit <- fit(mixed_model_wf, df)
explainer <- explain_tidymodels(
  fit,
  data = dplyr::select(df, c(age, Occasion, Subject)),
  y = df$height,
  label = "lmm",
  verbose = TRUE
)
var_imp <- feature_importance(explainer)
# full model RMSE is 8.0
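One plausible reading of the gap, not confirmed in this thread, is that the two RMSE values come from different kinds of prediction: conditional predictions that include each Subject's random intercept, versus population-level predictions that use the fixed effects only. The sketch below compares the two directly with `lme4::predict()`; the helper `rmse()` and the names `rmse_conditional` and `rmse_population` are introduced here purely for illustration, and the values it produces are not claimed to match the numbers above.

```r
library(lme4)

# Plain data frame copy of the data used in the post
df <- as.data.frame(nlme::Oxboys)
mod <- lmer(height ~ age + Occasion + (1 | Subject), data = df)

# Hypothetical helper: root mean squared error
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Conditional predictions: fixed effects plus each Subject's random intercept
rmse_conditional <- rmse(df$height, predict(mod, re.form = NULL))

# Population-level predictions: fixed effects only, random effects set to zero
rmse_population <- rmse(df$height, predict(mod, re.form = NA))

c(conditional = rmse_conditional, population = rmse_population)
```

If the explainer's predict function returns population-level predictions, its full-model RMSE would be expected to resemble the second value rather than the first.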
ML workflows with clustered data are delicate. A clean train/test split (a grouped split on Subject), followed by evaluating the model on the test data, is often a good choice. Then you won't have this problem.
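A minimal sketch of that suggestion, assuming a manual split by Subject in base R (the choice of 6 held-out subjects, the seed, and the variable names are illustrative assumptions, not something specified in this thread). Because the held-out subjects are unseen during fitting, predictions are made at the population level:

```r
library(lme4)

df <- as.data.frame(nlme::Oxboys)

# Grouped split: hold out whole subjects so no subject is in both sets
set.seed(1)
subjects <- unique(as.character(df$Subject))
test_subjects <- sample(subjects, size = 6)  # illustrative hold-out size

is_test <- df$Subject %in% test_subjects
train <- droplevels(df[!is_test, ])
test <- df[is_test, ]

# Fit on the training subjects only
mod <- lmer(height ~ age + Occasion + (1 | Subject), data = train)

# Held-out subjects have no estimated random intercept, so predict
# from the fixed effects only
pred <- predict(mod, newdata = test, re.form = NA, allow.new.levels = TRUE)

test_rmse <- sqrt(mean((test$height - pred)^2))
test_rmse
```

The same `train` and `test` data frames could then be passed to the workflow fit and to `explain_tidymodels()` respectively, so that the explainer's RMSE is computed on subjects the model has not seen.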