About selective_scan #19
Comments
I have the same question as you! About u and x, here is my understanding
It doesn't break out of the Mamba training loop. The hidden state is initialized to 0 when training on each batch, just like in an RNN.
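To make the roles of u and x concrete, here is a minimal scalar sketch of the recurrence. It is not the repository's tensor code: it assumes the usual discretized form x_t = exp(delta_t * A) * x_{t-1} + delta_t * B_t * u_t and y_t = C_t * x_t + D * u_t, with one channel and a one-dimensional state. The function name `scalar_selective_scan` and the `x0` argument are hypothetical. Here u is the input sequence and x is the hidden state, which starts at 0 on every call.

```python
import math

def scalar_selective_scan(u, delta, A, B, C, D, x0=0.0):
    """Illustrative sequential scan for one channel with a scalar state.

    u     : list of inputs, one per time step
    delta : list of step sizes, one per time step
    A, D  : scalars shared across time steps
    B, C  : lists of input-dependent coefficients, one per time step
    x0    : initial hidden state (assumption: zero, as at the start of a batch)
    """
    x = x0  # hidden state, analogous to an RNN's initial state
    ys = []
    for u_t, d_t, b_t, c_t in zip(u, delta, B, C):
        # State update: decay the previous state, then add the new input.
        x = math.exp(d_t * A) * x + d_t * b_t * u_t
        # Readout plus a skip connection from the input.
        ys.append(c_t * x + D * u_t)
    return ys, x
```

With a zero initial state, the first output depends only on the first input; every later output also depends on earlier inputs through x.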
I think x shouldn't be 0. For a long time series, it should be the stored hidden state.
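The idea of storing the hidden state can be sketched by splitting a long series into chunks and passing the final state of one chunk in as the initial state of the next. This is a scalar illustration under the same assumed recurrence (x_t = exp(delta_t * A) * x_{t-1} + delta_t * B_t * u_t, y_t = C_t * x_t), not something the quoted selective_scan does; the helper `scan_chunk` and its `x0` argument are hypothetical.

```python
import math

def scan_chunk(u, delta, A, B, C, x0=0.0):
    """Scan one chunk and return (outputs, final hidden state)."""
    x = x0
    ys = []
    for u_t, d_t, b_t, c_t in zip(u, delta, B, C):
        x = math.exp(d_t * A) * x + d_t * b_t * u_t
        ys.append(c_t * x)
    return ys, x

# Toy data, chosen only for illustration.
u = [1.0, 2.0, 3.0, 4.0]
delta = [0.5, 0.5, 0.5, 0.5]
B = [1.0, 1.0, 1.0, 1.0]
C = [1.0, 1.0, 1.0, 1.0]
A = -1.0

# Whole sequence in one pass, starting from a zero state.
full, _ = scan_chunk(u, delta, A, B, C)

# Same sequence in two chunks, carrying the state across the boundary.
first, x_mid = scan_chunk(u[:2], delta[:2], A, B[:2], C[:2])
second_carried, _ = scan_chunk(u[2:], delta[2:], A, B[2:], C[2:], x0=x_mid)

# Second chunk with the state reset to zero instead.
second_reset, _ = scan_chunk(u[2:], delta[2:], A, B[2:], C[2:])
```

Carrying the state reproduces the single-pass outputs, while resetting it treats the second chunk as an unrelated sequence. Which behavior is appropriate depends on whether consecutive batches are meant to be continuations of one series.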
Hi, great work!
Could you please explain why selective_scan contains "x = torch.zeros((b, d_in, n), device=deltaA.device)"?
In addition, I am confused about u and x.
Thanks.