-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial ExplorationWorker commit #383
Conversation
interactive clang compilation together with exploration for a given module. This commit adds the class with its compile_module function which will compile the module with a given compiler policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ptal at your commit message, probably formatting split it.
from compiler_opt.rl import env | ||
|
||
|
||
def add_int_feature( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this take a type
parameter defaulted to np.int64? the type of the value should match the type of the element value in the feature list, right?
same q for the add_float_feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type value should match the type of the element value in the feature list. All of these are helper functions for add_feature_list. I can definitely make this take a type parameter but I don't see how this would help since at the end of the day I would have to cast whatever type is specified there to an int64 so that we create an int64 feature list. My understanding is that there are only a few types supported for feature lists and the three different add functions are these types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I see. sgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aaah! one thing I was missing: SequenceExamples really knows int64 and float32 IIRC, right? so the constraint here is what SequenceExamples knows about.
May be worth stressing that somehow. add_float_feature_to_seq_example
may be a mouthful, maybe just documenting that is sufficient (in the docstring) (i.e. that "int" is really int64 and always will be, because SeqExample. Then the reader gets it that it's not a point in time choice we made; the choice was made for us)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah exactly, I will add something to the documentation to emphasize this.
lst.extend([feature_value]) | ||
|
||
|
||
def add_string_feature( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why, do we ever think we'd use this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imagine that during training we want to have specific weights only specific modules, then we can read off the module name from here and compute the relevant weight. It's something that we experimented with, however, it did not significantly improve on our final method. I can imagine this being useful in similar contexts though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we don't currently (as in, part of this patch or known upcoming ones) use it, let's not add it. It's easy to add later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have used it in part of the training for some runs just not the ones that ended up giving the best policy. It would be more effort to remove this and the module_name feature than actually keep it at this point but really up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! just wanted to make sure it has an envisioned use (other than test)
raise ValueError( | ||
f'Compilation loop exited at step type {0} before last step'.format( | ||
curr_obs_dict.step_type)) | ||
native_size = curr_obs_dict.score_policy[self.reward_key] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
native_size
is tied to inlining, perhaps score
? same then below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed it to reward
curr_obs_dict.step_type)) | ||
native_size = curr_obs_dict.score_policy[self.reward_key] | ||
native_size_list = np.float32(native_size) * np.float32(np.ones(horizon)) | ||
add_feature_list(sequence_example, native_size_list, 'reward') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I understand: the reward key could be hi_mom
, but we rename it here to reward
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sequence_examples are what is going to be used for training. reward_key is the the key to be used in the environment to determine what we use as a reward, the two are somewhat independent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be worth commenting that, easier for a reader to get that "it's intentional"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See if the docstring for the function makes this more clear now.
native_size_list = np.float32(native_size) * np.float32(np.ones(horizon)) | ||
add_feature_list(sequence_example, native_size_list, 'reward') | ||
module_name_list = [self.loaded_module_spec.name for _ in range(horizon)] | ||
add_feature_list(sequence_example, module_name_list, 'module_name') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may want to pull up in a global variable the feature names that are basically forming an API: IIUC module_name
(1) can't be a feature in the original policy, and (2) is used as part of the implementation here.
You can also validate then that the given trace doesn't have this feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if this looks good, otherwise I haven't understood what you mean here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if the feature names already appear in what is returned by the env is done in _process_obs now.
self, | ||
policy: Callable[[Optional[time_step.TimeStep]], np.ndarray], | ||
) -> tf.train.SequenceExample: | ||
"""Compiles the module with the given policy and outputs a seq. example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the documentation should capture that the sequence example also has some special, specifically-named features (module_name
...) - IIUC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
dtype=tf.int32, | ||
name='step_type'), | ||
reward=tf.convert_to_tensor([reward], dtype=tf.float32, name='reward'), | ||
discount=tf.convert_to_tensor([0.0], dtype=tf.float32, name='discount'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reward
and discount
here are features required by TimeStep
, right? one thing that could clarify what feature names mean what is to maybe group them as static members into a couple of classes - just so there's a namespace-ing that helps readability (i.e. here it would be name=TFAgentsTimeStep.DiscountFeatureName
)
Or, since there don't seem to be many names, just having globals and grouping them with comments also works.
Either way the reader should get sufficient hinting where these names come from or what the intent is with them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what this means. The names should be documented by tfagents already, why do we need to do anything here? Or do you mean the names of the tensors in tf.convert_to_tensor, if that's the case the names just match whatever is expected by TimeStep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to make things easier for a reader. Rather than realizing that these may be tfagents things (not the most discoverable thing) and then reading on that, 1-2 sentences allows me to "get it" and move forward.
@@ -111,6 +114,13 @@ def add_feature_list(seq_example: tf.train.SequenceExample, | |||
add_function(seq_example, feature, feature_name) | |||
|
|||
|
|||
@dataclasses.dataclass | |||
class SequenceExampleFeatureNames: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a comment or doc string saying what this is for ("feature names that we depend on" for example).
also use these in the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
Initial ExplorationWorker commit. ExplorationWorker implements the
interactive clang compilation together with exploration for a
given module. This commit adds the class with its compile_module
function which will compile the module with a given compiler policy.