You are right, that will mess up the next action. That said, I'm questioning the purpose of that reset in the first place. @wsxzwps @PeterAJansen any thoughts?
Yes, I think that's a bug, and it should mirror lines 142-145 on reset. It might either be legacy code or my mistake. The evaluation frequency is generally quite low in our runs (e.g. every 1k-5k steps), so I don't think this would negatively affect performance in our evaluation much or at all.
IIRC, the ScienceWorld DRRN keeps two instances of the environments: one for training, and one for evaluation (e.g. on the dev or test set). I think the call to reset() is intended as a safety call, to reset the training environments to a new variation/start of the game just in case evaluation does anything to the model states that might need to be reset. If you determine that it's not needed, then we can always remove it and it should continue training in each episode from where it left off when it started evaluation.
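The two-instance setup described above can be illustrated with a toy sketch. `ToyEnv` and `evaluate` are hypothetical stand-ins, not the ScienceWorld API; the sketch only shows that if the evaluation environment is a truly separate object, evaluating on it leaves the training environment where it was. Whether the real environments are this independent is the open question in this thread.

```python
class ToyEnv:
    """Hypothetical stand-in for one environment instance (not ScienceWorld)."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        # Start a new episode.
        self.steps = 0
        return "start"

    def step(self, action):
        # Advance the episode by one step.
        self.steps += 1
        return f"obs {self.steps}"


def evaluate(eval_env, n_steps=3):
    """Run a short evaluation episode on the evaluation instance only."""
    eval_env.reset()
    for _ in range(n_steps):
        eval_env.step("look")
    return eval_env.steps


# Separate instances for training and evaluation.
train_env, eval_env = ToyEnv(), ToyEnv()

train_env.reset()
train_env.step("look")
train_env.step("look")

steps_before = train_env.steps
evaluate(eval_env)
steps_after = train_env.steps  # unchanged: evaluation never touched train_env
```

Under that assumption the extra reset is not needed, and training could continue from where it left off.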
Shouldn't the state representation and valid_ids be rebuilt when we reset all environments by calling envs.reset() after every evaluation?
drrn-scienceworld/drrn/train-scienceworld.py
Line 232 in 4ed8909
Shouldn't line 232 be replaced by the following code:
obs, infos = envs.reset()
states = agent.build_state(obs, infos)
valid_ids = [agent.encode(info['valid']) for info in infos]
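To see why the rebuild matters, here is a minimal, self-contained sketch. `ToyEnvs` and `ToyAgent` are hypothetical stand-ins that only mimic the three calls used above (`reset`, `build_state`, `encode`); they are not the repository's classes. The sketch shows that after a reset, states and valid_ids computed before the reset no longer describe the current episode.

```python
class ToyEnvs:
    """Hypothetical stand-in for the vectorized training environments."""

    def __init__(self, n):
        self.n = n
        self.episode = 0

    def reset(self):
        # Each reset starts a new episode with different valid actions.
        self.episode += 1
        obs = [f"start of episode {self.episode}" for _ in range(self.n)]
        infos = [{"valid": ["look", f"go to room {self.episode}"]}
                 for _ in range(self.n)]
        return obs, infos


class ToyAgent:
    """Hypothetical stand-in for the agent; only the two calls from the issue."""

    def build_state(self, obs, infos):
        return list(obs)

    def encode(self, actions):
        return [a.upper() for a in actions]


envs, agent = ToyEnvs(2), ToyAgent()

# Initial reset, with states and valid_ids built from it.
obs, infos = envs.reset()
states = agent.build_state(obs, infos)
valid_ids = [agent.encode(info["valid"]) for info in infos]

# Pattern questioned in the issue: reset again but keep the old values.
stale_states, stale_valid_ids = states, valid_ids
obs, infos = envs.reset()

# Proposed fix: rebuild both from the fresh observations.
states = agent.build_state(obs, infos)
valid_ids = [agent.encode(info["valid"]) for info in infos]

print(stale_states[0])  # describes the episode that was discarded
print(states[0])        # describes the episode actually in progress
```

With the stale values, the next action would be chosen from the valid actions of an episode that no longer exists.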