idea: add FLUX environment variable that holds "closest enclosing jobid" #6474
Comments
A random idea I thought of: if we want to avoid spreading too many environment variables, could we support a new command instead, like (hypothetically) …?
Would …
I think if …
An environment variable or flux attribute (either of which works consistently across cases) would be great. Another thing we would like to have is the equivalent for the very top-level instance id, e.g. FLUX_TOP_LEVEL_ID=xxx. I am adding a command … I think whatever you decide to come up with will be hugely helpful, so thank you in advance!
Yeah, this is what I was thinking. Just wrap the logic into it.
That would be fine I think; it's consistency that matters most. Having it be a command might be best, since that means things like … I admit to a personal preference for having access to at least the innermost job ID and matching Flux URI be very easy, though, since that's what people are most used to from other systems and what they will need in order to do naive ports of job scripts that use the enclosing jobid and talk to the system scheduler in batch scripts.
Having it as an environment variable is more consistent with the behavior of Slurm, where there is an environment variable for the id of the allocation and another for the id of the subjob inside the allocation. It also saves me from having to do something like an execve inside of my process.
Good point about the extra pain of requiring an execve (or using the Flux C API) to get the jobid attribute. We could do something simple like … This captures only the enclosing jobid. If …
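(For reference, a minimal sketch of what getting the enclosing jobid looks like today from a shell without a dedicated environment variable; this assumes the `flux getattr jobid` route mentioned in the issue description below, not whatever command was elided above.)

```sh
#!/bin/sh
# Inside the initial program of a flux batch/alloc job, FLUX_JOB_ID is not set,
# but the jobid of the enclosing job is exposed as a broker attribute.
# If this instance was not started as a job, the attribute lookup fails and we
# fall back to an empty value.
enclosing_id=$(flux getattr jobid 2>/dev/null) || enclosing_id=""

if [ -n "$enclosing_id" ]; then
    echo "enclosing jobid: $enclosing_id"
else
    echo "no enclosing job (this instance was not started as a job)"
fi
```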
Oh, @MrBurmark, it just occurred to me that as of flux-core v0.70.0 you can request an environment variable be set containing the batch jobid using template substitution, e.g.:

```
$ flux batch -N1 --env=FLUX_BATCH_ID={{id}} ...
ƒV2VX34zvb
```

This would result in FLUX_BATCH_ID=ƒV2VX34zvb set in your batch environment. Be cautious that this will be propagated to all jobs (including batch/alloc jobs) run within that batch job.
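(As a usage illustration, a batch script submitted that way might consume the variable like this; the script name and echoed messages are hypothetical.)

```sh
#!/bin/sh
# batch.sh, submitted with: flux batch -N1 --env=FLUX_BATCH_ID={{id}} batch.sh
# The {{id}} template is expanded to this batch job's id at submission time.
echo "running as batch job ${FLUX_BATCH_ID}"

# The variable is part of the batch environment, so every job launched below
# (including nested batch/alloc jobs) inherits the *outer* batch job's id.
flux run -n1 printenv FLUX_BATCH_ID
```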
The FLUX_ENCLOSING_ID sounds perfect. The flux batch or flux alloc with a flux run inside is the use case I am thinking about.
Problem: It is inconvenient to get the jobid of the closest enclosing instance (e.g. the ID of a batch or alloc job in the parent instance) because the broker drops FLUX_JOB_ID in favor of a jobid attribute. However, a case has been made that an environment variable would be more convenient, since it can be accessed without use of system(3) or the Flux API.

Introduce FLUX_ENCLOSING_ID, which is set by the broker whenever the jobid attribute is set (i.e. when the broker is started as a job in a Flux instance). This will be available in the initial program as well as being inherited by jobs run within the instance. Add the variable to the env_blocklist so that it is unset when the current instance is not a job.

Fixes flux-framework#6474
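Based on that commit message, a rough sketch of how the new variable would behave in practice (illustrative only; the variable is defined by the change referenced above, not by this snippet):

```sh
# In the initial program of a flux batch/alloc job:
#   - FLUX_JOB_ID is dropped by the broker
#   - FLUX_ENCLOSING_ID holds the id of the batch/alloc job in the parent instance
echo "enclosing job: ${FLUX_ENCLOSING_ID:-<unset: this instance is not a job>}"

# Jobs run within the instance inherit FLUX_ENCLOSING_ID from the instance's
# environment, so a task sees both its own id and the enclosing instance's id.
flux run -n1 sh -c 'echo "job=$FLUX_JOB_ID enclosing=$FLUX_ENCLOSING_ID"'
```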
This idea was brought up by @trws on Slack and in a project meeting.

Flux currently doesn't have a consistent way to determine the "nearest" enclosing jobid. The cases are somewhat delineated in #3817, though the information there may be outdated (i.e. there does now exist a flux_job_timeleft(3) function). While it makes sense that `FLUX_JOB_ID` is set in the environment of tasks launched by `flux run` and `flux submit` but not in the environment of the initial program in `flux alloc` and `flux batch`, this will likely continue to cause confusion and annoyance for users.

One idea put forward by @trws is to add another `FLUX_` jobid variable that is always set by the job shell and is not cleared. (Please correct me if I'm mistaken.) This variable would leak through to initial programs, which could then use it to determine the jobid of their parent instance (if there is an associated jobid); this is equivalent to, but more straightforward than, using `flux getattr jobid`. It would also be available in `flux run` and `flux submit`, where it would be the same as `FLUX_JOB_ID`. Comparing the two variables would allow users to easily determine whether they are in an initial program environment or within a job. The absence of this new environment variable would indicate that there is no enclosing job, i.e. the current process is not within an instance, or the enclosing instance is not itself a job.

Edit: if we enable this feature, that may allow us to close #3817.