Policy combiner #357
given by combine_tfa_policies_lib.get_input_signature() and action spec given by combine_tfa_policies_lib.get_action_spec(). The combiner policy uses a new timestep spec feature, "model_selector", to select the requested policy at the current state. The feature is computed as an MD5 hash of the respective policies' names.
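For illustration only, here is a minimal pure-Python sketch of the selection idea described above. It is not the code in this PR: the helper names `policy_name_to_key` and `combine_policies`, the split of the digest into two 64-bit integers, and the dict-based observation are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict, Tuple

Key = Tuple[int, int]


def policy_name_to_key(name: str) -> Key:
    """Hash a policy name with MD5 and split the 128-bit digest in two.

    Splitting into two 64-bit integers is an assumption of this sketch,
    chosen so the key could fit in a fixed-size integer feature.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()  # 16 bytes
    high = int.from_bytes(digest[:8], "little")
    low = int.from_bytes(digest[8:], "little")
    return high, low


def combine_policies(
    policies: Dict[str, Callable[[dict], int]],
) -> Callable[[dict], int]:
    """Return a callable that dispatches on the "model_selector" feature."""
    # Precompute the hash key for every policy name (hypothetical layout).
    table = {policy_name_to_key(name): fn for name, fn in policies.items()}

    def combined(observation: dict) -> int:
        key = tuple(observation["model_selector"])
        # Raises KeyError if the selector matches no known policy.
        return table[key](observation)

    return combined
```

A caller would set `observation["model_selector"] = policy_name_to_key("policy_a")` to route the step to the policy registered under that name.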
This looks good, mostly just nits.
Just nits from my end.
Two nits that should be fixed before landing. Otherwise LGTM.
Combines two tf-agents policies using the signature spec given by get_input_signature.