
Initial ExplorationWorker commit #383

Merged: 4 commits merged into google:main on Oct 17, 2024

Conversation

tvmarino (Collaborator):

Initial ExplorationWorker commit. ExplorationWorker implements the
interactive clang compilation together with exploration for a
given module. This commit adds the class with its compile_module
function which will compile the module with a given compiler policy.

@mtrofin (Collaborator) left a comment:

ptal at your commit message, probably formatting split it.

from compiler_opt.rl import env


def add_int_feature(
mtrofin (Collaborator):

should this take a type parameter defaulted to np.int64? the type of the value should match the type of the element value in the feature list, right?

same q for the add_float_feature

tvmarino (Collaborator, Author):

The type of the value should match the type of the element value in the feature list. All of these are helper functions for add_feature_list. I can definitely make this take a type parameter, but I don't see how that would help: at the end of the day I would have to cast whatever type is specified there to an int64 so that we create an int64 feature list. My understanding is that only a few types are supported for feature lists, and the three different add functions correspond to exactly those types.

mtrofin (Collaborator):

ah, I see. sgtm

mtrofin (Collaborator):

aaah! one thing I was missing: SequenceExample really only knows int64 and float32 IIRC, right? So the constraint here is what SequenceExample knows about.

May be worth stressing that somehow. add_float_feature_to_seq_example may be a mouthful; maybe just documenting it in the docstring is sufficient, i.e. that "int" is really int64 and always will be, because of SequenceExample. Then the reader gets that it's not a point-in-time choice we made; the choice was made for us.

tvmarino (Collaborator, Author):

Yeah exactly, I will add something to the documentation to emphasize this.
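
The constraint discussed above comes from the tf.train.SequenceExample proto itself, which only stores int64, float, and bytes feature lists. A minimal sketch of such typed helpers, assuming signatures that may differ from the actual patch:

import numpy as np
import tensorflow as tf


def add_int_feature(seq_example: tf.train.SequenceExample,
                    feature_value: np.int64, feature_name: str) -> None:
  # "int" here is really int64 and always will be: int64_list is the only
  # integer container a SequenceExample feature can hold.
  f = seq_example.feature_lists.feature_list[feature_name].feature.add()
  lst = f.int64_list.value
  lst.extend([feature_value])


def add_float_feature(seq_example: tf.train.SequenceExample,
                      feature_value: np.float32, feature_name: str) -> None:
  # Likewise, float features are stored as float32 in float_list.
  f = seq_example.feature_lists.feature_list[feature_name].feature.add()
  lst = f.float_list.value
  lst.extend([feature_value])


def add_string_feature(seq_example: tf.train.SequenceExample,
                       feature_value: str, feature_name: str) -> None:
  # Strings go into bytes_list, so they are encoded before being appended.
  f = seq_example.feature_lists.feature_list[feature_name].feature.add()
  lst = f.bytes_list.value
  lst.extend([feature_value.encode('utf-8')])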

lst.extend([feature_value])


def add_string_feature(
mtrofin (Collaborator):

why, do we ever think we'd use this?

tvmarino (Collaborator, Author):

Imagine that during training we want specific weights only for specific modules; then we can read off the module name from here and compute the relevant weight. It's something we experimented with; however, it did not significantly improve on our final method. I can imagine this being useful in similar contexts though.

mtrofin (Collaborator):

if we don't currently (as in, part of this patch or known upcoming ones) use it, let's not add it. It's easy to add later.

tvmarino (Collaborator, Author):

I have used it in part of the training for some runs, just not the ones that ended up giving the best policy. It would be more effort to remove this and the module_name feature than to keep them at this point, but it's really up to you.

mtrofin (Collaborator):

ok! just wanted to make sure it has an envisioned use (other than test)
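
A purely hypothetical illustration of the envisioned use (not part of this patch): the module_name feature could drive a per-module example weight when the sequence examples are consumed, along these lines:

import tensorflow as tf

# Assumed weighting table; the experiment mentioned above may have used a
# different scheme entirely.
_MODULE_WEIGHTS = {'some_hot_module': 4.0}


def example_weight(seq_example: tf.train.SequenceExample) -> float:
  # Read the module name back from the 'module_name' string feature and map
  # it to a training weight, defaulting to 1.0 for unlisted modules.
  feature_lists = seq_example.feature_lists.feature_list
  if 'module_name' not in feature_lists:
    return 1.0
  name_bytes = feature_lists['module_name'].feature[0].bytes_list.value[0]
  return _MODULE_WEIGHTS.get(name_bytes.decode('utf-8'), 1.0)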

compiler_opt/rl/generate_bc_trajectories.py: two further review threads (outdated, resolved).
raise ValueError(
    'Compilation loop exited at step type {0} before last step'.format(
        curr_obs_dict.step_type))
native_size = curr_obs_dict.score_policy[self.reward_key]
mtrofin (Collaborator):

native_size is tied to inlining, perhaps score? same then below

tvmarino (Collaborator, Author):

changed it to reward

curr_obs_dict.step_type))
native_size = curr_obs_dict.score_policy[self.reward_key]
native_size_list = np.float32(native_size) * np.float32(np.ones(horizon))
add_feature_list(sequence_example, native_size_list, 'reward')
mtrofin (Collaborator):

so I understand: the reward key could be hi_mom, but we rename it here to reward?

tvmarino (Collaborator, Author):

The sequence examples are what is going to be used for training. reward_key is the key used in the environment to determine what we use as a reward; the two are somewhat independent.

mtrofin (Collaborator):

may be worth commenting that, easier for a reader to get that "it's intentional"

tvmarino (Collaborator, Author):

See if the docstring for the function makes this more clear now.
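
To spell out the intent in code as suggested, here is a sketch with explanatory comments; the helper name is hypothetical, and only add_feature_list is reused from the patch, so the merged code may phrase this differently:

import numpy as np
import tensorflow as tf


def add_reward_feature(sequence_example: tf.train.SequenceExample,
                       curr_obs_dict, reward_key: str, horizon: int) -> None:
  # The env reports its score under `reward_key` (which could be any metric
  # name), but the SequenceExample consumed by training always stores it
  # under the fixed feature name 'reward'. The rename is intentional.
  reward = curr_obs_dict.score_policy[reward_key]
  # Broadcast the final scalar reward across all `horizon` steps.
  reward_list = np.float32(reward) * np.float32(np.ones(horizon))
  add_feature_list(sequence_example, reward_list, 'reward')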

native_size_list = np.float32(native_size) * np.float32(np.ones(horizon))
add_feature_list(sequence_example, native_size_list, 'reward')
module_name_list = [self.loaded_module_spec.name for _ in range(horizon)]
add_feature_list(sequence_example, module_name_list, 'module_name')
mtrofin (Collaborator):

you may want to pull the feature names that basically form an API up into a global variable: IIUC module_name (1) can't be a feature in the original policy, and (2) is used as part of the implementation here.

You can then also validate that the given trace doesn't have this feature.

tvmarino (Collaborator, Author):

Check if this looks good, otherwise I haven't understood what you mean here.

tvmarino (Collaborator, Author):

Checking whether the feature names already appear in what is returned by the env is done in _process_obs now.
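
A hypothetical sketch of that validation (the real check in _process_obs is likely structured differently):

# Feature names reserved by the SequenceExample writer; the env must not
# emit observations under these names.
_RESERVED_FEATURE_NAMES = ('module_name', 'reward')


def check_no_reserved_features(obs_dict: dict) -> None:
  # Raise early if an environment observation would clash with a feature
  # the trajectory writer adds itself.
  clashes = [k for k in _RESERVED_FEATURE_NAMES if k in obs_dict]
  if clashes:
    raise ValueError(f'Observation uses reserved feature names: {clashes}')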

self,
policy: Callable[[Optional[time_step.TimeStep]], np.ndarray],
) -> tf.train.SequenceExample:
"""Compiles the module with the given policy and outputs a seq. example.
mtrofin (Collaborator):

the documentation should capture that the sequence example also has some special, specifically-named features (module_name...) - IIUC

tvmarino (Collaborator, Author):

done.

dtype=tf.int32,
name='step_type'),
reward=tf.convert_to_tensor([reward], dtype=tf.float32, name='reward'),
discount=tf.convert_to_tensor([0.0], dtype=tf.float32, name='discount'),
mtrofin (Collaborator):

reward and discount here are features required by TimeStep, right? One thing that could clarify which feature names mean what is to group them as static members into a couple of classes, just so there's a namespacing that helps readability (i.e. here it would be name=TFAgentsTimeStep.DiscountFeatureName).

Or, since there don't seem to be many names, just having globals and grouping them with comments also works.

Either way the reader should get sufficient hinting about where these names come from or what the intent is with them.

tvmarino (Collaborator, Author):

I am not sure what this means. The names should be documented by tf-agents already; why do we need to do anything here? Or do you mean the names of the tensors in tf.convert_to_tensor? If that's the case, the names just match whatever is expected by TimeStep.

mtrofin (Collaborator):

just to make things easier for a reader. Rather than realizing that these may be tf-agents things (not the most discoverable thing) and then reading up on that, 1-2 sentences lets me "get it" and move forward.
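
A sketch of the suggested namespacing; the class and constant names here are illustrative rather than what landed in the patch, and assume the tf-agents TimeStep structure:

import tensorflow as tf
from tf_agents.trajectories import time_step


class TFAgentsTimeStepNames:
  """Tensor names mandated by the tf-agents TimeStep structure, not by us."""
  STEP_TYPE = 'step_type'
  REWARD = 'reward'
  DISCOUNT = 'discount'


def make_last_time_step(reward: float, observation) -> time_step.TimeStep:
  # TimeStep is a plain namedtuple of (step_type, reward, discount,
  # observation); the named tensors simply mirror what tf-agents expects.
  return time_step.TimeStep(
      step_type=tf.convert_to_tensor([int(time_step.StepType.LAST)],
                                     dtype=tf.int32,
                                     name=TFAgentsTimeStepNames.STEP_TYPE),
      reward=tf.convert_to_tensor([reward],
                                  dtype=tf.float32,
                                  name=TFAgentsTimeStepNames.REWARD),
      discount=tf.convert_to_tensor([0.0],
                                    dtype=tf.float32,
                                    name=TFAgentsTimeStepNames.DISCOUNT),
      observation=observation)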

@@ -111,6 +114,13 @@ def add_feature_list(seq_example: tf.train.SequenceExample,
add_function(seq_example, feature, feature_name)


@dataclasses.dataclass
class SequenceExampleFeatureNames:
mtrofin (Collaborator):

nit: add a comment or docstring saying what this is for ("feature names that we depend on", for example).

Also use these in the test.

tvmarino (Collaborator, Author):

done.
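
A sketch of what the dataclass from the hunk above might look like after this comment; the field set is taken from the features discussed in this review and may not match the merged code exactly:

import dataclasses


@dataclasses.dataclass
class SequenceExampleFeatureNames:
  """Feature names in the generated SequenceExamples that we depend on.

  These names form the implicit contract between trajectory generation and
  training: 'reward' is always the training-side reward feature regardless of
  the env's reward_key, and 'module_name' identifies the module a trajectory
  came from, so it must not clash with any feature of the original policy.
  """
  module_name: str = 'module_name'
  reward: str = 'reward'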

@tvmarino changed the title from "Initial ExplorationWorker commit. ExplorationWorker implements the" to "Initial ExplorationWorker commit" on Oct 17, 2024.
@tvmarino merged commit f9dee45 into google:main on Oct 17, 2024; 15 checks passed.
@tvmarino deleted the exploration_worker branch on October 17, 2024 at 14:43.