-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
explore_function part of ExplorationWorker #384
explore_function part of ExplorationWorker #384
Conversation
interactive clang compilation together with exploration for a given module. This commit adds the class with its compile_module function which will compile the module with a given compiler policy.
Explores the module using the given policy and the exploration distr. This implementation assumes the action set is only {0,1} and will just switch the action played at explore_step.
@@ -32,6 +33,30 @@ | |||
from compiler_opt.rl import env | |||
|
|||
|
|||
@dataclasses.dataclass | |||
class SequenceExampleFeatureNames: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to rebase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think this got caught up in some other changes that I made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, but to keep the change scoped, can you move it back where it was?
module_name: str = 'module_name' | ||
|
||
|
||
def get_loss(seq_example: tf.train.SequenceExample, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neat, I wonder if we should move the sequence example stuff in its own utility library? if you think that's a good idea, can you tag an issue on this, we can do it after.
self.explore_step = 0 | ||
self.gap = np.inf | ||
self.explore_on_features = explore_on_features | ||
self.explore_step: int = len(replay_prefix) - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._explore_step
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, OK. I see what is also happening. Can you separate the stylistic changes in their own PR, and just submit it once it passes the CI (no review needed, I suggest naming it like "[NFC] add _
for member fields"). Then merge this.
This helps both with this review and later: the mechanical change doesn't interfere (and one knows not to spend much time looking at it in the future, when we want to understand something)
Explores the module using the given policy and the exploration distr.
This implementation assumes the action set is only {0,1} and will just
switch the action played at explore_step.