job-list: access user protected data from job-info #5120

chu11 · 2023-04-26T21:48:03Z

Just brainstorming here a bit. The conversation in flux-framework/flux-docs#229 made me realize it sort of sucks that data has to be gathered from two locations sometimes. Seems to be the case from slurm days too.

It is sort of a historical side effect of the fact there is data for everyone to see (jobids, number of nodes, etc.) which eventually gets listed in tools like flux-jobs and data that is not for everyone to see (full commandline of job, jobspec, etc.) that is only retrievable by individuals that specifically want that data.

Because of this historical split, some information simply has never been available in job-list (flux jobs); you have to go to job-info to get it (flux job info).

For example, jobspec and R are read / cached in job-list, so they could conceptually be offered to callers if there was an access control mechanism. The same access controls that are done in job-info could possibly be copied into job-list, allowing users to access that not-for-everyone data.
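To make the idea concrete, here is a minimal Python sketch of what such a check might look like. It is not flux-core code: `CachedJob`, `may_read_protected`, `list_attrs`, and the key sets are hypothetical names made up for illustration, and the rule "job owner or instance owner may read" is an assumption about the kind of check job-info performs, not a copy of it. Protected keys that the requester may not read are simply left out of the reply, which is only one of several possible behaviors.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable

# Hypothetical split between world-readable and owner-only attributes.
PUBLIC_KEYS = {"id", "userid", "nnodes", "state"}
PROTECTED_KEYS = {"jobspec", "R"}


@dataclass
class CachedJob:
    """Stand-in for a job entry cached by a listing service (hypothetical)."""
    id: int
    userid: int
    attrs: Dict[str, Any] = field(default_factory=dict)


def may_read_protected(job: CachedJob, requester_uid: int,
                       instance_owner_uid: int) -> bool:
    # Assumed rule: only the job owner or the instance owner may read
    # protected data.
    return requester_uid in (job.userid, instance_owner_uid)


def list_attrs(job: CachedJob, requested: Iterable[str],
               requester_uid: int, instance_owner_uid: int) -> Dict[str, Any]:
    """Return the requested attributes the requester is allowed to see."""
    allowed = may_read_protected(job, requester_uid, instance_owner_uid)
    result: Dict[str, Any] = {}
    for key in requested:
        if key in PROTECTED_KEYS and not allowed:
            continue  # silently omit; an error reply would be another option
        if key in PUBLIC_KEYS or key in PROTECTED_KEYS:
            if key in job.attrs:
                result[key] = job.attrs[key]
    return result


job = CachedJob(id=1234, userid=1000,
                attrs={"id": 1234, "userid": 1000, "nnodes": 2,
                       "jobspec": {"tasks": [{"command": ["hostname"]}]}})

owner_view = list_attrs(job, ["id", "nnodes", "jobspec"], 1000, 0)
guest_view = list_attrs(job, ["id", "nnodes", "jobspec"], 2000, 0)
```

In this sketch the job owner (uid 1000) gets `jobspec` back while another user (uid 2000) only gets the public fields. Whether to omit the key, return a placeholder, or fail the whole request is exactly the kind of open question a real design would have to settle.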

I'm not sure what side effects there could be for this. Off the top of my head:

- what to display / return when there's data that the user isn't allowed to retrieve
- we wouldn't want users to output data in flux-jobs and then cut and paste it into some communication (e.g. Slack) that they shouldn't, so maybe this isn't a wise idea. But this could be controlled by just not supporting this output in flux-jobs and making it available only via the API or something.

Alternately, if the "everything goes into a database" idea is someday done, everything could be redirected to the database with appropriate controls (see conversation #4914).

grondo · 2023-04-26T22:08:47Z

It doesn't seem like we'd want another copy of everything in the job-list module. At least when sitting in the KVS the content doesn't have to reside in memory for all time.

I'm not sure it is a problem that some detailed information (like the entire submitted environment or job script) has to be obtained by fetching the jobspec directly. You'd also have to cache the signed J in case the user wanted to verify the data had not been modified after submission, plus there is the redacted jobspec which the instance has indeed modified for its own use...

chu11 · 2023-04-26T22:20:39Z

> It doesn't seem like we'd want another copy of everything in the job-list module. At least when sitting in the KVS the content doesn't have to reside in memory for all time.

Yeah, we don't need everything. I guess I was specifically thinking of additional data in R / jobspec (or eventlog), because that's already read into job-list anyways, and (I'd have to verify) I think is already cached in job-list.

And whatever isn't cached, it is in #4336 b/c at some point in time it has to be dumped to sqlite.

chu11 mentioned this issue Apr 26, 2023

How to access a wide range of job information for a user tool #5119

Closed

chu11 mentioned this issue Apr 26, 2023

flux-job(1): document flux job info #5121

Closed
