In some social science fields large data sets do not exist, and researchers must make decisions using a small number of samples (the p >> n problem).
Good to see support in R (the tfprobability and brnn packages).
Wondering if the DALEX team has any thoughts/comments on this?
@asheetal the size of the data should not matter for the implemented XAI techniques (neither local nor global),
but let's try. Do you have any trained models for tests?
In a recent experiment with p >> n, what I did was as follows: create a p x l array (p = number of predictors, l = 1000 runs), then
for (i in 1:1000) {
  randomize the seed
  build a keras model
  generate a variable-importance ranking with DALEX
  append each predictor's rank from DALEX to its list
}
Finally, sort the predictor array by how many times each predictor received rank 1, then rank 2, and so on.
It indeed helped. The final ranking is effectively a histogram of ranks for each predictor. I found that if I had run it only once (l = 1), I would have gotten completely inaccurate results.
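The loop above can be sketched language-agnostically (Python here; the original used keras + DALEX in R). `noisy_importance` is a hypothetical stand-in for "train a model with a fresh seed and score predictors with DALEX":

```python
import random
from collections import Counter

def rank_histogram(n_predictors, n_runs, importance_fn):
    """Aggregate variable-importance ranks across repeated runs.

    importance_fn(seed) stands in for training a model and scoring
    predictors; it must return one importance score per predictor.
    """
    histograms = [Counter() for _ in range(n_predictors)]
    for seed in range(n_runs):
        scores = importance_fn(seed)
        # rank 1 = most important predictor in this run
        order = sorted(range(n_predictors), key=lambda j: -scores[j])
        for rank, j in enumerate(order, start=1):
            histograms[j][rank] += 1

    # sort predictors by how often they got rank 1, then rank 2, etc.
    def sort_key(j):
        return tuple(-histograms[j][r] for r in range(1, n_predictors + 1))

    return sorted(range(n_predictors), key=sort_key), histograms

# toy stand-in: predictor 0 is truly strongest, the rest are noise
def noisy_importance(seed):
    rng = random.Random(seed)
    return [3.0 + rng.random(), 1.0 + rng.random(), 1.0 + rng.random()]

order, hists = rank_histogram(3, 1000, noisy_importance)
```

With this toy importance function, predictor 0 should come out first in `order`, while predictors 1 and 2 split ranks 2 and 3 roughly evenly across the 1000 runs.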
Forgot to add: the problem is not within DALEX. The problem is the model itself. For p >> n, the model must be Bayesian/probabilistic, so it must work in conjunction with tfprobability-style models, so that the variable importance is not a single rank but rather a probabilistic range of ranks. The researcher can then choose how to infer the rank: median, max, min, or overlap.
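To illustrate the "probabilistic range of ranks" idea: given rank samples for one predictor across repeated probabilistic model fits, the researcher can summarize them however suits the inference. The samples below are purely illustrative, not from any real model:

```python
import statistics

# Hypothetical rank samples for one predictor across repeated fits of a
# probabilistic (e.g. tfprobability) model; values are made up.
rank_samples = [1, 1, 2, 1, 3, 2, 1, 1, 2, 1]

summary = {
    "median": statistics.median(rank_samples),  # typical rank
    "min": min(rank_samples),                   # best case
    "max": max(rank_samples),                   # worst case
}
```

Here the predictor's rank ranges from 1 to 3 with a median of 1, so a median-based rule would still call it the top predictor despite the run-to-run variability.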