I think it is reasonable that the authors only care about recall. The activated neurons contribute most of the output, while non-activated neurons matter far less. So we want to find all activated neurons to preserve model accuracy. A neuron that is not activated but predicted as activated does not hurt the results (it only costs some extra compute), whereas an activated neuron predicted as not activated degrades the output significantly. That's why recall is used.
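To make the asymmetry concrete, here is a minimal sketch of how such a recall could be computed over an FFN activation mask. The function name, the `threshold=0.0` convention (activated means positive, as with ReLU), and the toy tensors are all illustrative, not taken from the repo:

```python
import torch

def activation_recall(hidden, pred_mask, threshold=0.0):
    """Recall of a predicted activation mask against the true activations.

    `hidden` is the FFN intermediate activation; a neuron counts as
    "activated" if its value exceeds `threshold` (ReLU-style convention).
    """
    true_mask = hidden > threshold           # ground-truth activated neurons
    hit = (true_mask & pred_mask).sum()      # activated AND predicted activated
    return hit.float() / true_mask.sum().clamp(min=1)

# Missing an activated neuron (false negative) lowers recall,
# while predicting an extra inactive neuron (false positive) does not.
h = torch.tensor([0.9, 0.0, 0.4, 0.0])      # neurons 0 and 2 are activated
pred = torch.tensor([True, True, False, False])
print(activation_recall(h, pred))            # 0.5: neuron 2 was missed
```

Note that `pred = [True, True, True, True]` would give perfect recall at the cost of doing all the compute, which is exactly why recall alone is acceptable here: false positives trade speed, not accuracy.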
Hi guys, I would like to ask whether the term 'activated neurons' in the FFN in the paper refers to a row or a column of parameters in a linear layer. For example, suppose a network has a single linear layer mapping a (1, 256) input x to a (1, 512) output. For predicting neuron activation, should the MLP predictor take x as input and output a (1, 256) or a (1, 512) tensor as the activation_mask indicating which rows/columns of the weights are activated? I am not sure whether I understand this correctly.
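For reference, a minimal sketch of the shape bookkeeping the question describes, under the common reading that a "neuron" is one unit of the layer's output dimension, i.e. one row of a PyTorch `nn.Linear` weight (stored as `(out_features, in_features)`). The predictor architecture below is purely illustrative, not the one from the repo:

```python
import torch
import torch.nn as nn

d_in, d_out = 256, 512
ffn = nn.Linear(d_in, d_out)        # weight shape: (512, 256) in PyTorch
x = torch.randn(1, d_in)

# Under this reading, each of the 512 output units is one "neuron"
# (one row of ffn.weight), so the predictor masks the output dimension.
predictor = nn.Sequential(          # illustrative small MLP predictor
    nn.Linear(d_in, 64),
    nn.ReLU(),
    nn.Linear(64, d_out),
)
mask = predictor(x) > 0             # shape (1, 512), one flag per neuron
out = ffn(x) * mask                 # masked-out neurons contribute nothing
print(mask.shape, out.shape)        # torch.Size([1, 512]) for both
```

In a two-layer FFN, the same intermediate neuron corresponds to a row of the first weight matrix and a column of the second, so one (1, 512) mask can skip both the up-projection rows and the down-projection columns.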
Hi, DejaVu is really fascinating! Thanks a lot for releasing the corresponding code.
I have some questions about implementation details.
Thank you! Hope to hear from you!