Categorical takes C-1 inputs for C classes #55

Currently, the CategoricalLikelihood is defined to take a vector of C-1 inputs and produce a distribution over C classes by appending a 0 to the input vector before passing it through the softmax. This is fine for simple cases, but I think there are cases where you'd want to supply all C inputs - e.g. if you're doing multi-class classification with a different kernel for each class, I don't think this would be possible with the current version?

Should I make a PR to change it, or is there a reason to keep it as is? (It would always still be possible to get the current behaviour by appending a zero yourself.)

@willtebbutt
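For concreteness, here is a minimal sketch of the current behaviour in plain Julia (a standalone softmax helper for illustration, not the package's actual API):

```julia
# Map C-1 latent inputs to C class probabilities by appending a zero
# (the reference class) before the softmax, as the issue describes.
function softmax(x::AbstractVector)
    e = exp.(x .- maximum(x))  # subtract the maximum for numerical stability
    return e ./ sum(e)
end

f = [0.5, -1.0]            # C - 1 = 2 inputs
p = softmax(vcat(f, 0.0))  # distribution over C = 3 classes
@assert length(p) == 3 && isapprox(sum(p), 1.0)
```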
Comments
The motivation for fixing one input was to ensure that the mapping is invertible: we map C-1 inputs to the (C-1)-dimensional simplex. It is the natural generalization of the logistic function, as used e.g. in multinomial logistic regression. I can see, though, that it can be a bit inconvenient sometimes.
What do you think then - change it or leave as is?
I have been dealing with categorical likelihoods again recently and I think both are just as valid (interestingly, having C inputs adds an unnecessary degree of freedom, and I am not sure what the effects on inference are). I will make a PR to allow for all these options; maybe I can find an elegant formulation. Related to this is #58.
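To illustrate the extra degree of freedom mentioned above, a quick standalone check (plain Julia, with the softmax helper inlined, not the package API): adding a constant to all C inputs leaves the softmax output unchanged, so the map from C inputs to C probabilities is not invertible.

```julia
softmax(x) = (e = exp.(x .- maximum(x)); e ./ sum(e))

# Shifting every input by the same constant yields the same distribution,
# so infinitely many input vectors correspond to one set of probabilities.
g = [1.0, 2.0, -0.5]
@assert softmax(g) ≈ softmax(g .+ 3.7)
```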
Solved by #61, I believe.
@devmotion Could you comment on the exchangeability of the classes when using the C-1 inputs? Would it still be valid?
I'm not sure - what exactly do you mean?
In the case of C inputs and C classes, I can interchange any two classes by interchanging their inputs, right?
If you interchange two of the C-1 inputs, then the probabilities of the corresponding two classes will be interchanged as well. And if you want to interchange some class with the reference class, you can either change the reference class or set its input to the additive inverse and subtract it from all other inputs. Is that what you're after? E.g., if the inputs are (f_1, ..., f_{C-1}) and you want to swap class i with the reference class, you replace f_i with -f_i and every other f_j with f_j - f_i.
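A standalone numerical check of that recipe (plain Julia, softmax inlined; the inputs and the index i are arbitrary illustrative values):

```julia
softmax(x) = (e = exp.(x .- maximum(x)); e ./ sum(e))

f = [0.8, -0.3, 1.5]  # C - 1 = 3 inputs, so C = 4 classes
i = 2                 # swap class i with the reference class (class C)
g = f .- f[i]         # subtract f_i from every other input ...
g[i] = -f[i]          # ... and set input i to its additive inverse

p  = softmax(vcat(f, 0.0))
p2 = softmax(vcat(g, 0.0))
@assert p2[i] ≈ p[end] && p2[end] ≈ p[i]                   # i and C swapped
@assert all(p2[j] ≈ p[j] for j in eachindex(f) if j != i)  # others unchanged
```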
Thanks, that is really insightful. My PI was having doubts about this version and was arguing about the exchangeability, but I could not find proper arguments. So, interestingly, I made a few experiments with my logistic-softmax link: on a simple 1-D example I generated data with C-1 inputs and fit it with both C-1 and C GPs.
The C-1 parameterization is common in multinomial logistic regression (and, of course, logistic regression): https://en.wikipedia.org/wiki/Multinomial_logistic_regression#As_a_set_of_independent_binary_regressions
With C-1 inputs one also has the nice interpretation of the inputs as log odds, which is lost in the case of C inputs.
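A quick standalone check of that log-odds interpretation (plain Julia, softmax inlined): with the implicit 0 input for the reference class C, each input f_i is exactly log(p_i / p_C).

```julia
softmax(x) = (e = exp.(x .- maximum(x)); e ./ sum(e))

f = [0.4, -1.2]
p = softmax(vcat(f, 0.0))
# Each input equals the log odds of its class against the reference class.
@assert all(log(p[i] / p[end]) ≈ f[i] for i in eachindex(f))
```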
Sure! I think he directly had in mind processes where the order matters, like the stick-breaking process (https://en.wikipedia.org/wiki/Dirichlet_process#The_stick-breaking_process), but probably got confused.