Replies: 1 comment
-
Hi, you are right: the update rule can be applied on each time step, but it does not necessarily have to act on each time step. You can design your reward function to deliver a reward at any point during the time steps and to send zero reward on the rest.
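For example, a minimal sketch (the helper name and the way the per-step reward reaches the learning rule are illustrative, not BindsNET's actual API):

```python
import numpy as np

# Minimal sketch: a per-timestep reward signal that is zero on every step
# except the one(s) where you decide to deliver the reward.
def make_sparse_reward(total_steps: int, final_reward: float) -> np.ndarray:
    reward = np.zeros(total_steps)
    reward[-1] = final_reward  # deliver the whole scalar reward on the last step
    return reward

reward_per_step = make_sparse_reward(total_steps=100, final_reward=1.0)
# On timestep t the learning rule then sees reward_per_step[t]:
# 0.0 for t < 99, and 1.0 on the final step, so only that step contributes
# to the reward-modulated term of the update.
```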
-
Hi,
I got another question about an implementation detail I just came across. Looking at the learning rules, and especially reward-based learning, I can see that in the Network.run() method the connection weights are updated at every timestep. The total number of timesteps is calculated as in the sketch below.
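Roughly like this (my paraphrase with toy stand-ins, not the actual Network.run source):

```python
# Toy paraphrase of the structure I am referring to (not the actual source):
time, dt = 100.0, 1.0
timesteps = int(time / dt)   # total number of simulation steps

weight, reward, lr = 0.0, 1.0, 0.01
for t in range(timesteps):
    # ... one step of spike propagation would happen here ...
    weight += lr * reward    # the learning rule touches the weights on every single step
```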
This implies, IMHO, that the weights are updated with every approximation step. Coming from a reinforcement learning setting, where an action as a whole is rewarded, this seems strange to me. The agent interacts with the environment by performing an action and receiving a (scalar) reward. To compute the action it is necessary to simulate multiple time steps, especially in the case of population coding. Therefore I would expect the update step to be taken at the end of the sequence, i.e. at the last timestep of the run() method, or it should even be possible to supply the reward later, implying a pattern like the one sketched below.
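Something like this (all names here are made up to illustrate the pattern; this is not BindsNET's API):

```python
import numpy as np

def run_network_for_action(rates: np.ndarray, timesteps: int) -> np.ndarray:
    """Stand-in for simulating the SNN over many timesteps without learning."""
    rng = np.random.default_rng(0)
    return rng.poisson(rates, size=(timesteps, rates.size))  # fake spike trains

def decode_action(spikes: np.ndarray) -> int:
    """Stand-in for a population-coded readout: pick the most active neuron."""
    return int(spikes.sum(axis=0).argmax())

def apply_reward_update(w: np.ndarray, eligibility: np.ndarray,
                        reward: float, lr: float = 1e-2) -> np.ndarray:
    """One reward-modulated weight update, applied once per action."""
    return w + lr * reward * eligibility

rates = np.array([5.0, 1.0, 1.0])
spikes = run_network_for_action(rates, timesteps=50)        # 1) simulate the whole action
action = decode_action(spikes)                              # 2) decode the action
reward = 1.0 if action == 0 else 0.0                        # 3) one scalar reward for the action
w = apply_reward_update(np.zeros(3), spikes.sum(axis=0), reward)  # 4) single update at the end
```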
Does that make sense or am I confusing things?
In the case of MSTDP(ET) I can see that the reward is multiplied by connection.dt. Is it correct that this is done to spread the reward across time steps?
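My reading of the dt factor is that the update is an Euler-style discretization of a continuous-time rule, so scaling by dt keeps the total weight change roughly independent of the step size. A toy illustration (not the actual MSTDPET code):

```python
def mstdp_like_step(w: float, eligibility: float, reward: float,
                    lr: float, dt: float) -> float:
    # dw/dt = lr * reward * eligibility  ->  per-step change is scaled by dt
    return w + lr * reward * eligibility * dt

# Same total simulation time (100 ms), two different step sizes:
w = 0.0
for _ in range(100):                               # 100 steps of dt = 1.0
    w = mstdp_like_step(w, 0.5, 1.0, 0.01, 1.0)
print(w)                                           # ~0.5

w = 0.0
for _ in range(200):                               # 200 steps of dt = 0.5
    w = mstdp_like_step(w, 0.5, 1.0, 0.01, 0.5)
print(w)                                           # ~0.5 again: the dt factor compensates
```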
Thanks and best,
Peter