Using hierarchical YAML (or YAML unions/merges) in lieu of #FLUX in batch scripts #3278
-
Steve,
I am not sure how many users know YAML, so it may be like pulling teeth to get them to use it.
I am also not sure why there is resistance to having #FLUX commands in the user’s scripts. This is a lab standard that has persisted through all our batch systems since LCRM, so it is much more known and user friendly.
Phil
-
Well, we've been trying to build a system that doesn't take for granted that past solutions are necessarily the best solutions, but it's totally reasonable to ask us to defend decisions that impact users.

I think the biggest problem with batch script directives is that they would seem to encourage users to create scripts that contain "application logic" as well as parameters that need to be hand edited from run to run. I think it's evident these days that we want code to be captured under source control, and that input and output parameters should be captured but likely not with the application source code. (In fact they may be curated by different groups: users and code teams.) Looking around at the various examples I find with Google, a lot of people deal with this by either generating the batch script from parameterized input and a script template, or moving the application logic to another script and calling that from the batch script, so that the batch script is just parameters and the script command line.

If the best practice is to separate input parameters from code, then it would be a better design to provide an input file for the parameters that allows no code. It also allows that input file to be more expressive and readable. Flux jobspec was intended to be that sort of input file. Does that make sense? Just trying to say why those directives might be considered harmful. Let's not go there if we can solve our usability/transition issues a better way.
-
By the way @SteVwonder, I like your idea. I haven't used YAML much, but I can cut and paste from examples and modify them as well as the next guy, and I shouldn't think our users would be any worse off than me. It seems to me like a nice usability boost to be able to capture a set of job specific defaults and reuse them run to run!
-
tl;dr: k8s does something similar with their configuration resource, and it is quite nice. Random, but while taking a K8s class, the implementation of ConfigMaps reminded me of this suggestion. In K8s, a ConfigMap is essentially a key-value map that lets you provide configuration information to a pod/container. You can create a ConfigMap from a file or a directory of files using with |
-
Problem
Users typically have a common set of settings/attributes that they want to apply to every job in their workflow. Common things include `bank`, `queue`, and `walltime`. In Slurm, LSF, and many, many other schedulers, they handle this by putting `#SCHEDNAME --cli-option value` directives at the top of the batch scripts. On more exotic systems that include burst buffers, the staging commands also make their way into the top of the batch script as a `#` directive.

I don't think we ever really intended or wanted to special case "batch scripts" and read `#FLUX` directives from the top of the file to include in the jobspec. Without ruling that out as a possibility, I was thinking of an alternative.

Idea
Allow the user to provide YAML file(s) that serve as the "base" jobspec layer(s). The `flux mini` commands would then build the "top-level" YAML jobspec and apply a union/merge/flattening of the layers of YAML.

Strawman, illustrative example:
my-campaign1.yaml
staging.yaml
Illustrative Run lines
Submit a job with an included yaml but override the walltime in the "base" jobspec to be only 5 minutes:
flux mini run --include my-campaign1.yaml -N1 -n1 -t 5m prep.exe
Submit a job but include two yamls (latter includes take precedence over former includes):
flux mini run --include my-campaign1.yaml --include staging.yaml -N4 -n16 run.exe
Submit a job with an included yaml but override the queue:
flux mini run --include my-campaign1.yaml -N1 -n16 --setattr system.queue=viz viz.exe
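To make the intended precedence concrete, here is a minimal sketch of the layered merge in Python. It is not Flux code: `merge_layers` and `flatten` are hypothetical helper names, and the dictionaries are made-up stand-ins for the parsed contents of `my-campaign1.yaml`, `staging.yaml`, and the layer built from command-line options. The key names are only illustrative of a jobspec-like layout.

```python
from copy import deepcopy
from functools import reduce


def merge_layers(base, overlay):
    """Return a new dict with overlay merged on top of base.

    Nested mappings are merged recursively; any other value in the
    overlay (scalars, lists) replaces the value in the base.
    """
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_layers(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def flatten(layers):
    """Merge layers left to right, so later layers take precedence."""
    return reduce(merge_layers, layers, {})


# Hypothetical contents: stand-ins for the two included YAML files.
campaign = {"attributes": {"system": {"bank": "mybank", "queue": "batch", "duration": 3600}}}
staging = {"attributes": {"system": {"queue": "staging", "stage_in": ["input.dat"]}}}
# Hypothetical top-level layer, as might be built from an option like -t 5m.
cli = {"attributes": {"system": {"duration": 300}}}

jobspec = flatten([campaign, staging, cli])
print(jobspec)
```

In this sketch the includes are applied in order and the command-line layer goes last, which matches the run lines above: later includes win over earlier ones, and explicit options win over every included file. Lists are replaced rather than concatenated, which is one possible choice; a real implementation would have to decide that case deliberately.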
Reference Implementations of the Merge/Union/Flattening