-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace min/max CPU fields with single CPU cores field #616
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me for the most part. There was at least one API incompatibility (and possibly two) in this change. Those will have to be fixed before this can be merged.
@@ -127,8 +127,8 @@ const StepResourceRequirementsForm = ({ | |||
<Grid item xs={12}> | |||
<FastField | |||
id={buildID(baseId, ids.RESOURCE_REQUESTS.TOOL_CPU)} | |||
name={`requirements.${index}.min_cpu_cores`} | |||
label={t("minCPUCores")} | |||
name={`requirements.${index}.cpu_cores`} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The api is still expecting min_cpu_cores
here, so this field name should stay the same as before. We may change the APIs eventually, but this can be revisited then if we do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to say the same as @slr71, except I would suggest the opposite: I think the min_cpu_cores
field should be replaced with the max_cpu_cores
field.
If I remember correctly, the max_cpu_cores
field was added for VICE apps, so those apps need this field to limit users' CPU consumption; and I just tested a Condor job with only the max_cpu_cores
field set to 1
, and the job only consumed 1 CPU / hour.
This will also still allow the default max CPU of 4
to be selected automatically (which was added to fix a previous issue where users were accidentally using the max set by the app, which can be 128 or more).
So I would remove the {t("selectMins")}
typography label above, and replace this min_cpu_cores
field with the max_cpu_cores
field that was below this section; and the selectMaxes
typography label would no longer be needed, and the new cpuCores
label would no longer be required, as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some more testing with the API, and I think that we can get away with specifying only min_cpu_cores
as long as the max_cpu_cores
is always set to zero in the request. When the max_cpu_cores
field is set to zero, the API will automatically adjust it to be the same as the min_cpu_cores
field to ensure that the job submission makes sense. For example, submitting an analysis with this request body:
{
"description": "",
"requirements": [
{
"min_cpu_cores": 1.0,
"min_memory_limit": 0,
"min_disk_space": 0,
"step_number": 0,
"max_cpu_cores": 0.0
}
],
"notify_periodic": false,
"config": {},
"name": "JupyterLab_Datascience_analysis1",
"app_id": "c2227314-1995-11ed-986c-008cfa5ae621",
"periodic_period": 14400,
"system_id": "de",
"debug": false,
"create_output_subdir": true,
"app_version_id": "d7927408-1937-11ef-9ba0-008cfa5ae621",
"output_dir": "/iplant/home/sarahr/analyses",
"notify": true
}
produces these limits and requests:
{
"limits": {
"cpu": "1",
"memory": "8Gi"
},
"requests": {
"cpu": "1",
"ephemeral-storage": "1Gi",
"memory": "2Gi"
}
}
My concern with setting only the max_cpu_cores
field is that if the default number of CPUs for the app is lower than the requested maximum then there's a chance that the analysis will be scheduled on a node that doesn't have enough CPUs to run the analysis. For example, submitting an analysis with this request body:
{
"description": "",
"requirements": [
{
"min_cpu_cores": 0.0,
"min_memory_limit": 0,
"min_disk_space": 0,
"step_number": 0,
"max_cpu_cores": 16.0
}
],
"notify_periodic": false,
"config": {},
"name": "JupyterLab_Datascience_analysis1",
"app_id": "c2227314-1995-11ed-986c-008cfa5ae621",
"periodic_period": 14400,
"system_id": "de",
"debug": false,
"create_output_subdir": true,
"app_version_id": "d7927408-1937-11ef-9ba0-008cfa5ae621",
"output_dir": "/iplant/home/sarahr/analyses",
"notify": true
}
Produces these limits and requests:
$ kubectl -n vice-apps get pods -l username=sarahr -o json | jq '.items[0].spec.containers[] | select(.name == "analysis") | .resources'
{
"limits": {
"cpu": "16",
"memory": "8Gi"
},
"requests": {
"cpu": "1",
"ephemeral-storage": "1Gi",
"memory": "2Gi"
}
}
If the user actually needs 16 CPUs, it's possible that the analysis will run on a node that has fewer than 16 CPUs, which could cause an analysis to fail.
Another option would be to set both the min_cpu_cores
and max_cpu_cores
fields to the same value, which may be preferable simply because it's more explicit. Now that I think about it, being explicit is probably the best option simply because it avoids confusion, so my recommendation would be to set both the min_cpu_cores
and max_cpu_cores
fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern with setting only the
max_cpu_cores
field is that if the default number of CPUs for the app is lower than the requested maximum then there's a chance that the analysis will be scheduled on a node that doesn't have enough CPUs to run the analysis.
That's a good point, I didn't think of that.
Another option would be to set both the
min_cpu_cores
andmax_cpu_cores
fields to the same value, which may be preferable simply because it's more explicit. Now that I think about it, being explicit is probably the best option simply because it avoids confusion, so my recommendation would be to set both themin_cpu_cores
andmax_cpu_cores
fields.
This does sound like the best option. In that case, I still recommend using the max_cpu_cores
field in the UI form, and removing the min_cpu_cores
form field from displaying in the UI. This way the logic in the initAppLaunchValues
formatter does not have to be updated to set a default selected value, so that the max a tool allows will not be used by the API by default:
sonora/src/components/apps/launch/formatters.js
Lines 44 to 64 in 886c95d
// If no default_max_cpu_cores is returned from the API, | |
// then use the default from configs (if it's less than the actual max) | |
// so the max is not automatically submitted by the services. | |
const reqInitValues = requirements?.map( | |
({ | |
step_number, | |
max_cpu_cores, | |
default_max_cpu_cores = max_cpu_cores < defaultMaxCpuCores | |
? max_cpu_cores | |
: defaultMaxCpuCores, | |
default_cpu_cores = 0, | |
default_memory = 0, | |
default_disk_space = 0, | |
}) => ({ | |
step_number, | |
max_cpu_cores: default_max_cpu_cores, | |
min_cpu_cores: default_cpu_cores, | |
min_memory_limit: default_memory, | |
min_disk_space: default_disk_space, | |
}) | |
); |
The submission formatter will also need to be updated so that each requirement sets its min_cpu_cores
value with the max_cpu_cores
form value (maybe with a requirements: requirements.map( ... )
function):
sonora/src/components/apps/launch/formatters.js
Lines 185 to 215 in 886c95d
const formatSubmission = ( | |
defaultOutputDir, | |
{ | |
notify, | |
notifyPeriodic, | |
periodicPeriod, | |
debug, | |
name, | |
description, | |
output_dir, | |
system_id, | |
app_id, | |
app_version_id, | |
requirements, | |
groups, | |
} | |
) => ({ | |
notify, | |
notify_periodic: notifyPeriodic, | |
periodic_period: periodicPeriod, | |
debug, | |
create_output_subdir: output_dir === defaultOutputDir, | |
name: name.trim(), | |
description, | |
output_dir, | |
system_id, | |
app_id, | |
app_version_id, | |
requirements, | |
config: groups?.reduce(paramConfigsReducer, {}), | |
}); |
So we would still remove the selectMins
and selectMaxes
typography labels, but the max_cpu_cores
field can now be labeled with the new cpuCores
label.
label={t("minCPUCores")} | ||
value={min_cpu_cores} | ||
label={t("cpuCores")} | ||
value={cpu_cores} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could be wrong, but I think that this line might have to be left unchanged as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also replace this Review Row with the max_cpu_cores
row from below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking more closely at this, I think that @psarando is correct about this. The Discovery Environment UI sets the max_cpu_cores
field to whatever it thinks the default number of CPUs requested for the analysis should be, which is what users would expect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the value
here comes from the JSON returned by the initAppLaunchValues
formatter. So we can rename this label with the new cpuCores
label, but in order to change the form field name, both the initAppLaunchValues
formatter and the submission formatter would need to be updated to convert to and from the API JSON and the JSON used by the form fields.
Since both max_cpu_cores
and min_cpu_cores
are still needed to populate the CPU cores selection list values, I think it might be less work to just use max_cpu_cores
as the form field name (though the display label can be updated):
sonora/src/components/apps/launch/ResourceRequirements.js
Lines 103 to 107 in 886c95d
const cpuCoreList = buildLimitList( | |
1, | |
min_cpu_cores || 0, | |
max_cpu_cores || defaultMaxCPUCores || 8 | |
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend replacing the min_cpu_cores
form field with the max_cpu_cores
form field, since I think the max_cpu_cores
field is more essential is controlling how many CPUs per hour a user will consume for either VICE or Condor analyses.
Also, I don't see any Chromatic builds associated with this PR, so you probably still need to set up a Chromatic token for your sonora
fork.
@@ -127,8 +127,8 @@ const StepResourceRequirementsForm = ({ | |||
<Grid item xs={12}> | |||
<FastField | |||
id={buildID(baseId, ids.RESOURCE_REQUESTS.TOOL_CPU)} | |||
name={`requirements.${index}.min_cpu_cores`} | |||
label={t("minCPUCores")} | |||
name={`requirements.${index}.cpu_cores`} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to say the same as @slr71, except I would suggest the opposite: I think the min_cpu_cores
field should be replaced with the max_cpu_cores
field.
If I remember correctly, the max_cpu_cores
field was added for VICE apps, so those apps need this field to limit users' CPU consumption; and I just tested a Condor job with only the max_cpu_cores
field set to 1
, and the job only consumed 1 CPU / hour.
This will also still allow the default max CPU of 4
to be selected automatically (which was added to fix a previous issue where users were accidentally using the max set by the app, which can be 128 or more).
So I would remove the {t("selectMins")}
typography label above, and replace this min_cpu_cores
field with the max_cpu_cores
field that was below this section; and the selectMaxes
typography label would no longer be needed, and the new cpuCores
label would no longer be required, as well.
const { | ||
step_number, | ||
min_cpu_cores, | ||
min_memory_limit, | ||
min_disk_space, | ||
max_cpu_cores, | ||
} = stepRequirements; | ||
const { step_number, min_memory_limit, min_disk_space, cpu_cores } = | ||
stepRequirements; | ||
|
||
const hasRequest = !!( | ||
min_cpu_cores || | ||
min_memory_limit || | ||
min_disk_space || | ||
max_cpu_cores | ||
); | ||
const hasRequest = !!(min_memory_limit || min_disk_space || cpu_cores); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only min_cpu_cores
need to be removed in these params.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that we want to set min_cpu_cores
and max_cpu_cores
explicitly if the user selects any value for the CPU Cores field, I think this might have to be left as it was before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is only for displaying values in the review step that have been set in the previous step of the form. If we are only displaying the max_cpu_cores
field, then the check for min_cpu_cores
does not need to be included in hasRequest
here.
label={t("minCPUCores")} | ||
value={min_cpu_cores} | ||
label={t("cpuCores")} | ||
value={cpu_cores} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also replace this Review Row with the max_cpu_cores
row from below.
stepRequirementErrors?.min_memory_limit || | ||
stepRequirementErrors?.min_disk_space || | ||
stepRequirementErrors?.max_cpu_cores | ||
stepRequirementErrors?.cpu_cores |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stepRequirementErrors?.cpu_cores | |
stepRequirementErrors?.max_cpu_cores |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this have to change also if both the min_cpu_cores
and max_cpu_cores
fields are specified in the request to the API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This controls if the resource requirements are displayed in the review step, but only if those form fields have any values selected. If we are only displaying the max_cpu_cores
form field, and hiding min_cpu_cores
in the form, then the stepRequirementErrors?.min_cpu_cores ||
line can be removed here.
values.requirements.forEach((req, i) => { | ||
if (req?.min_cpu_cores && req?.max_cpu_cores) { | ||
const err = validateNotAbove( | ||
req.min_cpu_cores, | ||
req.max_cpu_cores, | ||
t | ||
); | ||
if (err) { | ||
reqErrors[i] = { min_cpu_cores: err }; | ||
} | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave this validation in place, because it's still possible for an app to have a min_cpu_cores
requirement set in the API, and we don't want the possibility of someone being able to somehow submit a max requirement that is less than the min set by the app creator (although the way the CPU core select list is built should prevent that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I overlooked this in my last review, but if we're now hiding min_cpu_cores
in the form, then this error won't display anywhere anymore, so reqErrors[i] = { min_cpu_cores: err }
here should be updated to reqErrors[i] = { max_cpu_cores: err }
.
I'm seeing Chromatic builds for this PR now, so I guess I was wrong about this (I switched to a new "merge experience" view, but also there were no visual storybook changes, so the Chromatic builds were hidden from my view by default). |
name={`requirements.${index}.min_cpu_cores`} | ||
label={t("minCPUCores")} | ||
name={`requirements.${index}.max_cpu_cores`} | ||
label={t("cpuCores")} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also remove the selectMins
grid item just above this, since these are no longer all minimum requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
I also tested the DE Word Count story, using the Chromatic build from the Storybook Publish link under the checks
section of this PR, and it appears to be submitting the correct requirements for both min and max CPUs 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me as well! 👍
"Syncing branch with main
These changes simplify the CPU resource selection process and reduce user confusion around CPU requirements.