
Incorrect Starting State representation, and Valid Ids after every evaluation is run? #5

Open
hitzkrieg opened this issue Nov 23, 2022 · 3 comments

Comments

@hitzkrieg

Shouldn't the state representation and valid_ids be rebuilt when we reset all environments with envs.reset() after every evaluation?

Shouldn't line 232 be replaced by the following code:
obs, infos = envs.reset()
states = agent.build_state(obs, infos)
valid_ids = [agent.encode(info['valid']) for info in infos]

@MarcCote
Collaborator

You are right, that will mess up the next action. That said, I'm questioning the purpose of that reset in the first place. @wsxzwps @PeterAJansen any thoughts?

@PeterAJansen
Contributor

Yes, I think that's a bug, and it should mirror lines 142-145 on reset. It might either be legacy code or my mistake. The evaluation frequency is generally quite low in our runs (e.g. every 1k-5k steps), so I don't think this would have negatively affected our evaluation performance much, if at all.

IIRC, the ScienceWorld DRRN keeps two instances of the environments: one for training, and one for evaluation (e.g. on the dev or test set). I think the call to reset() is intended as a safety call, to reset the training environments to a new variation/start of the game just in case evaluation does anything to the model states that might need to be reset. If you determine that it's not needed, then we can always remove it and it should continue training in each episode from where it left off when it started evaluation.
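Editor's note: a minimal sketch (not the repository's exact code) of the loop structure described above, assuming the names used in this thread (envs, agent.build_state, agent.encode, info['valid']) and a hypothetical evaluate() helper running on the separate evaluation environments. It shows states and valid_ids being rebuilt right after the post-evaluation safety reset, mirroring the initial reset.

def run_training(agent, envs, eval_envs, max_steps, eval_freq):
    # Initial reset of the training environments (mirrors lines 142-145).
    obs, infos = envs.reset()
    states = agent.build_state(obs, infos)
    valid_ids = [agent.encode(info['valid']) for info in infos]

    for step in range(max_steps):
        # ... choose actions from (states, valid_ids), step envs, update the agent ...

        if (step + 1) % eval_freq == 0:
            # Evaluation uses its own environment instances (dev/test variations).
            evaluate(agent, eval_envs)

            # Safety reset of the *training* environments (the call discussed above).
            # The fix: rebuild states and valid_ids from the fresh observations so
            # the next action is not chosen from stale, pre-reset inputs.
            obs, infos = envs.reset()
            states = agent.build_state(obs, infos)
            valid_ids = [agent.encode(info['valid']) for info in infos]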

@hitzkrieg
Author

Thank you for the clarifications! :)
