
About selective_scan #19

Open
ZK-Zhou opened this issue Feb 21, 2024 · 4 comments

Comments


ZK-Zhou commented Feb 21, 2024

Hi, great work!
Could you please explain why selective_scan has `x = torch.zeros((b, d_in, n), device=deltaA.device)`?
In addition, I am confused about `u` and `x`.

Thanks.


Ykiiii commented Jun 20, 2024

I have the same question as you!
Why is `x` reset to 0 in the selective_scan function?

Regarding `u` and `x`, here is my understanding. The selective_scan function implements the state-space recurrence

x(t + 1) = A x(t) + B u(t)
y(t) = C x(t) + D u(t)

Here `u` is the incoming input (the `x` that is passed in), and `x` is the hidden state; it can be understood as `h`.
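To make the recurrence concrete, here is a minimal sketch of a selective_scan in the style of the quoted line (the shapes are my assumption, following mamba-minimal-style implementations; this is an illustration, not necessarily this repo's exact code):

```python
import torch

def selective_scan_sketch(u, delta, A, B, C, D):
    # Assumed shapes (an assumption, not taken from this repo):
    #   u:     (b, l, d_in)  incoming sequence -- the "u" in the SSM equations
    #   delta: (b, l, d_in)  per-step discretization step
    #   A:     (d_in, n)     continuous-time state matrix
    #   B, C:  (b, l, n)     input-dependent projections
    #   D:     (d_in,)       skip connection
    b, l, d_in = u.shape
    n = A.shape[1]

    # Discretize: A_bar = exp(delta * A), B_bar * u = delta * B * u
    deltaA = torch.exp(torch.einsum('bld,dn->bldn', delta, A))
    deltaB_u = torch.einsum('bld,bln,bld->bldn', delta, B, u)

    # The line being asked about: the hidden state x (i.e. h) starts at 0
    # for every sequence in the batch, just like an RNN's initial state.
    x = torch.zeros((b, d_in, n), device=deltaA.device)

    ys = []
    for i in range(l):
        x = deltaA[:, i] * x + deltaB_u[:, i]              # x(t+1) = A_bar x(t) + B_bar u(t)
        ys.append(torch.einsum('bdn,bn->bd', x, C[:, i]))  # y(t) = C x(t)
    y = torch.stack(ys, dim=1)                             # (b, l, d_in)
    return y + u * D                                       # + D u(t) skip term
```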

@ZhangXG001

"x = torch.zeros((b, d_in, n), device=deltaA.device)" is out of the loop, I think it is init the hidden state with 0(x = torch.zeros((b, d_in, n), device=deltaA.device)) @Ykiiii @ZK-Zhou


Ykiiii commented Sep 19, 2024

It is outside the scan loop, but not outside the Mamba training loop: the hidden state is re-initialized to 0 for every batch, just like an RNN.
My confusion is: when training on a long time series, why not keep using the previous hidden state? That would be more in line with the idea of the state-space equations, wouldn't it?
@ZhangXG001
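What Ykiiii describes would amount to truncated-BPTT-style training: split the long series into chunks and carry the final state into the next chunk instead of re-zeroing it. A hypothetical sketch of such a stateful variant (the `x0` argument and the `detach` are assumptions, not code from this repo):

```python
import torch

def selective_scan_stateful(u, delta, A, B, C, D, x0=None):
    # Same assumed shapes as the sketch above. x0, if given, is a
    # carried-over hidden state of shape (b, d_in, n).
    b, l, d_in = u.shape
    deltaA = torch.exp(torch.einsum('bld,dn->bldn', delta, A))
    deltaB_u = torch.einsum('bld,bln,bld->bldn', delta, B, u)

    # Start from the previous chunk's state instead of zeros when provided.
    if x0 is None:
        x = torch.zeros((b, d_in, A.shape[1]), device=deltaA.device)
    else:
        x = x0

    ys = []
    for i in range(l):
        x = deltaA[:, i] * x + deltaB_u[:, i]
        ys.append(torch.einsum('bdn,bn->bd', x, C[:, i]))
    y = torch.stack(ys, dim=1) + u * D

    # Detach so gradients are truncated at chunk boundaries (TBPTT-style);
    # the caller feeds the returned state back in as x0 for the next chunk.
    return y, x.detach()
```

One reason zero initialization is the usual default: in ordinary training each sequence in a batch is an independent sample, so there is no earlier state that belongs to it; carrying state only makes sense when consecutive chunks come from the same underlying series.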

@shawnnjupt

I think `x` shouldn't always be 0; for a long time series, the hidden state should be stored and carried over.
