Add aggregation (reduction) method for N-steps calculations #32

jamartinh · 2024-01-20T20:06:37Z

jamartinh
Jan 20, 2024

Hi, would it be possible to add a parameter to the N-step sub-dict of the ReplayBuffer constructor so that one can specify the aggregation/reduction method used for the N-step returns?

The standard case is an accumulated sum (cumsum). However, for average-reward reinforcement learning, for instance, it would be nice to have the mean instead of the sum, so that N-step returns can be used for average-reward RL as well.

I am currently unable to compile cpprb on my machines because my OS is CentOS 7 and its compiler is too old.

Thanks!

ymd-h · 2024-01-21T02:46:26Z

ymd-h
Jan 21, 2024
Maintainer

@jamartinh
Thank you for your proposal.

Can you give us more detail on the calculation you want?
Is the reduction applied over the future N steps, like the ordinary N-step reward?

For a simple average, isn't it enough to set gamma = 1.0 and divide the rewards by N afterwards?
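
As a worked form of the suggestion above, written in the standard N-step notation (the symbols $G_t^{(N)}$, $r_{t+k}$ and $\bar{r}_t^{(N)}$ are introduced here for illustration and are not taken from the cpprb documentation):

```latex
G_t^{(N)} = \sum_{k=0}^{N-1} \gamma^{k}\, r_{t+k}
\qquad\Longrightarrow\qquad
\left. G_t^{(N)} \right|_{\gamma = 1} = \sum_{k=0}^{N-1} r_{t+k},
\qquad
\bar{r}_t^{(N)} = \frac{1}{N} \sum_{k=0}^{N-1} r_{t+k}
               = \frac{1}{N} \left. G_t^{(N)} \right|_{\gamma = 1}
```

So, assuming the buffer sums exactly N rewards, the undiscounted N-step reward divided by N is the mean over those N steps.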

0 replies

jamartinh · 2024-01-22T12:12:25Z

jamartinh
Jan 22, 2024
Author

Hi @ymd-h, do you mean that when creating the N-step dict that goes with the env_dict I can use, for instance:

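N = 10
n_step_dict = {
    "size": N,           # number of steps to aggregate
    "gamma": 1.0,        # no discounting, so rewards are simply summed
    "rew": "rew",        # env_dict key holding the reward
    "next": "next_obs",  # env_dict key holding the next observation
}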

And then the samples will give me sample["rew"], so I can simply compute AVG = sample["rew"] / N?

It seems that this may work!
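
A minimal pure-Python sketch of that post-processing step, for checking the arithmetic without cpprb installed. The helper `nstep_return` and the reward values are hypothetical and only imitate an N-step sum; they are not cpprb's implementation.

```python
def nstep_return(rewards, start, n, gamma):
    """Discounted sum of up to n rewards beginning at index start.

    Hypothetical stand-in for an N-step reward; not cpprb code.
    """
    window = rewards[start:start + n]
    return sum((gamma ** k) * r for k, r in enumerate(window))


N = 4
rewards = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up per-step rewards

# With gamma = 1.0 the N-step value is the plain sum of the window.
nstep_sum = nstep_return(rewards, start=1, n=N, gamma=1.0)

# Dividing afterwards gives the mean over the same window.
avg = nstep_sum / N

# Direct mean of the same four rewards, for comparison.
direct_mean = sum(rewards[1:1 + N]) / N
```

One caveat under these assumptions: if fewer than N rewards end up in the sum (for example, a window cut short at an episode boundary), dividing by a fixed N no longer gives the mean of the rewards actually summed, so the divisor would need to be the real window length.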

1 reply

ymd-h Jan 22, 2024
Maintainer

@jamartinh
Yes, that is what I mean.

If you find any problems, please let us know.

jamartinh · 2024-01-22T12:52:18Z

jamartinh
Jan 22, 2024
Author

Thanks!

0 replies
