Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TRIS-722] Custom Tagging #2144

Open
wants to merge 30 commits into
base: master
Choose a base branch
from

Conversation

RikishK
Copy link

@RikishK RikishK commented Nov 13, 2024

Options to add custom tags are in the order:

environment variable dictionary,
step functions cli create flag,
batch decorator variable dictionary

This PR builds ontop of this PR: #1628 - this original PR did not work when testing but I would like to credit it here as some of the changes are still used.

@RikishK RikishK force-pushed the fsat/pr-1627--rebased--2.12.28 branch from 7a3be98 to 66e562e Compare November 14, 2024 05:36
@RikishK RikishK force-pushed the fsat/pr-1627--rebased--2.12.28 branch from edd8055 to e178c81 Compare November 18, 2024 04:28
@RikishK RikishK force-pushed the fsat/pr-1627--rebased--2.12.28 branch from dd7073a to 00be0f0 Compare November 19, 2024 01:52
@RikishK RikishK force-pushed the fsat/pr-1627--rebased--2.12.28 branch 3 times, most recently from 2810093 to 0d45ad5 Compare November 21, 2024 05:13
@RikishK RikishK force-pushed the fsat/pr-1627--rebased--2.12.28 branch from 0d45ad5 to 4c5cebf Compare November 22, 2024 02:38
@RikishK
Copy link
Author

RikishK commented Nov 25, 2024

Testing

Setup:

Environment variable:

{ name: 'METAFLOW_BATCH_EMIT_TAGS', value: 'True' },
{ name: 'METAFLOW_BATCH_DEFAULT_TAGS', value: '{"default_greeting": "default_world"}' }

Step functions create command:

step-functions create --aws-tags "hello=world"

Flow:

from metaflow import (
    FlowSpec,
    step,
    batch,
    retry,
    schedule,
    project,
    conda,
    current
)
import logging
import logging_setup

logging_setup.configure()


custom_step_tags = {
    "goodbye": "world",
    "hello": "universe",
   #"inv@lid": "t@g"
    }

@project(name="ml_apac_trisolaris")
@schedule(hourly=True)
class CanarySimpleAdhoc(FlowSpec):
    @step
    def start(self):
        self.logger = logging.getLogger(self.__class__.__name__)
        self.logger.info(f"Canary Hello")
        self.next(self.hello)

    @batch(cpu=1, memory=500, tags=custom_step_tags)
    @retry
    @step
    def hello(self):
        self.logger.info("Canary Hello World in Batch!")
        self.next(self.end)

    @step
    def end(self):
        self.logger.info("HelloAWS is finished.")


if __name__ == "__main__":
    CanarySimpleAdhoc()

Output/Result:

Start step batch tags:
image

Hello step batch tags:
image

Additional notes:
Tag validation was tested with "inv@lid": "t@g" which caused an error as expected.

Copy link
Collaborator

@saikonen saikonen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notes for testing, the config variable

METAFLOW_BATCH_EMIT_TAGS=True 

needs to be set in order for tags to have any effect (even validation).

For a better UX, there could be a check somewhere in the path that if aws tags are being set, but emit tags is set to false, we print a warning on this.

Validation could also maybe be run despite the emit_tags feature flag, in order to catch issues.

metaflow/plugins/aws/batch/batch_decorator.py Outdated Show resolved Hide resolved
metaflow/plugins/aws/batch/batch_decorator.py Outdated Show resolved Hide resolved
metaflow/plugins/aws/batch/batch_decorator.py Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to add similar logic that is present in f.ex. the kubernetes_decorator.py#runtime_step_cli which json.dumps some attributes to the cli_args.commands, namely the new aws_tags in this case.

otherwise the passed in value will be not handled correctly in the CLI

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for example replacing the cli_args.commands.update(self.attributes) with the following should work:

# dictionaries need to be json dumped for the CLI
            json_types = ["aws_tags"]
            for k, v in self.attributes.items():
                if k in json_types:
                    cli_args.command[k] = json.dumps(v)
                else:
                    cli_args.command_options[k] = v

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also for the batch_cli.py side, the --aws_tags option needs to be

@click.option("--aws-tags", multiple=False, type=JSONTypeClass(), default=None, help="AWS tags.")

in order to convert types correctly. note the multiple=False also, was there a reason to have the plural option act as multiple?

Comment on lines +332 to +366

if not isinstance(BATCH_DEFAULT_TAGS, dict):
raise BatchException(
"The BATCH_DEFAULT_TAGS config option must be a dictionary of key-value tags."
)

for name, value in BATCH_DEFAULT_TAGS.items():
aws_tag = {'key': name, 'value': value}
validate_aws_tag(aws_tag)
job.tag(name, value)

if step_function_tags is not None:
aws_tags_list = []
for tag in step_function_tags:
key_value = tag.split("=", 1)
if len(key_value) == 2:
aws_tags_list.append({
'key': key_value[0],
'value': key_value[1]
})
for tag in aws_tags_list:
validate_aws_tag(tag)
job.tag(tag['key'], tag['value'])

if aws_tags is not None:
if not isinstance(aws_tags, dict):
raise BatchException("tags must be a dictionary of key-value tags.")
decorator_aws_tags_list = [
{'key': key,
'value': val} for key, val in aws_tags.items()
]
for tag in decorator_aws_tags_list:
validate_aws_tag(tag)
job.tag(tag['key'], tag['value'])

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't test this part thoroughly, but there is one significant caveat with running validations in the batch setup phase, namely: when validation fails, the metadata for the run/step/task have already been registered in this case, leading to a failed task with no logs.

An alternative approach with a better UX would be to perform batch tag validation inside the batch_decorator.py#step_init, which will happen before job submission. This allows for erroring out fairly quickly due to malformed tags.

Is there any issue with moving the validation and the BATCH_DEFAULT_TAGS usage into the decorator completely?

@saikonen
Copy link
Collaborator

saikonen commented Feb 4, 2025

also can you verify after the changes that the following flows run as expected? both with batch, and step-functions

expected to fail

from metaflow import step, FlowSpec, batch

class BatchTags(FlowSpec):
    @batch(aws_tags={"inv@lid": "t@g"})
    @step
    def start(self):
        print("Starting 👋")
        self.next(self.end)

    @step
    def end(self):
        print("Done! 🏁")


if __name__ == "__main__":
    BatchTags()

expected to run

from metaflow import (
    FlowSpec,
    step,
    batch,
)

custom_step_tags = {
    "goodbye": "world",
    "hello": "universe",
    }

class BatchTags2(FlowSpec):
    @step
    def start(self):
        self.next(self.hello)

    @batch(aws_tags=custom_step_tags)
    @step
    def hello(self):
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    BatchTags2()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants