Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace min/max CPU fields with single CPU cores field #616

Merged
merged 5 commits into from
Jan 25, 2025

Conversation

suemkwon
Copy link
Contributor

  • Changed resource requirements form to use a single CPU cores field instead of separate min/max fields
  • Updated form validation to remove min/max CPU validation logic
  • Updated review display to show single CPU cores value
  • Added new translation key for "CPU Cores" label
  • Removed unused code related to min/max CPU validation

These changes simplify the CPU resource selection process and reduce user confusion around CPU requirements.
Screenshot 2025-01-22 at 3 25 21 PM

@suemkwon suemkwon requested review from psarando and slr71 January 22, 2025 22:42
Copy link
Member

@slr71 slr71 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me for the most part. There was at least one API incompatibility (and possibly two) in this change. Those will have to be fixed before this can be merged.

@@ -127,8 +127,8 @@ const StepResourceRequirementsForm = ({
<Grid item xs={12}>
<FastField
id={buildID(baseId, ids.RESOURCE_REQUESTS.TOOL_CPU)}
name={`requirements.${index}.min_cpu_cores`}
label={t("minCPUCores")}
name={`requirements.${index}.cpu_cores`}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The api is still expecting min_cpu_cores here, so this field name should stay the same as before. We may change the APIs eventually, but this can be revisited then if we do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to say the same as @slr71, except I would suggest the opposite: I think the min_cpu_cores field should be replaced with the max_cpu_cores field.

If I remember correctly, the max_cpu_cores field was added for VICE apps, so those apps need this field to limit users' CPU consumption; and I just tested a Condor job with only the max_cpu_cores field set to 1, and the job only consumed 1 CPU / hour.

This will also still allow the default max CPU of 4 to be selected automatically (which was added to fix a previous issue where users were accidentally using the max set by the app, which can be 128 or more).

So I would remove the {t("selectMins")} typography label above, and replace this min_cpu_cores field with the max_cpu_cores field that was below this section; and the selectMaxes typography label would no longer be needed, and the new cpuCores label would no longer be required, as well.

Copy link
Member

@slr71 slr71 Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some more testing with the API, and I think that we can get away with specifying only min_cpu_cores as long as the max_cpu_cores is always set to zero in the request. When the max_cpu_cores field is set to zero, the API will automatically adjust it to be the same as the min_cpu_cores field to ensure that the job submission makes sense. For example, submitting an analysis with this request body:

{
  "description": "",
  "requirements": [
    {
      "min_cpu_cores": 1.0,
      "min_memory_limit": 0,
      "min_disk_space": 0,
      "step_number": 0,
      "max_cpu_cores": 0.0
    }
  ],
  "notify_periodic": false,
  "config": {},
  "name": "JupyterLab_Datascience_analysis1",
  "app_id": "c2227314-1995-11ed-986c-008cfa5ae621",
  "periodic_period": 14400,
  "system_id": "de",
  "debug": false,
  "create_output_subdir": true,
  "app_version_id": "d7927408-1937-11ef-9ba0-008cfa5ae621",
  "output_dir": "/iplant/home/sarahr/analyses",
  "notify": true
}

produces these limits and requests:

{
  "limits": {
    "cpu": "1",
    "memory": "8Gi"
  },
  "requests": {
    "cpu": "1",
    "ephemeral-storage": "1Gi",
    "memory": "2Gi"
  }
}

My concern with setting only the max_cpu_cores field is that if the default number of CPUs for the app is lower than the requested maximum then there's a chance that the analysis will be scheduled on a node that doesn't have enough CPUs to run the analysis. For example, submitting an analysis with this request body:

{
  "description": "",
  "requirements": [
    {
      "min_cpu_cores": 0.0,
      "min_memory_limit": 0,
      "min_disk_space": 0,
      "step_number": 0,
      "max_cpu_cores": 16.0
    }
  ],
  "notify_periodic": false,
  "config": {},
  "name": "JupyterLab_Datascience_analysis1",
  "app_id": "c2227314-1995-11ed-986c-008cfa5ae621",
  "periodic_period": 14400,
  "system_id": "de",
  "debug": false,
  "create_output_subdir": true,
  "app_version_id": "d7927408-1937-11ef-9ba0-008cfa5ae621",
  "output_dir": "/iplant/home/sarahr/analyses",
  "notify": true
}

Produces these limits and requests:

$ kubectl -n vice-apps get pods -l username=sarahr -o json | jq '.items[0].spec.containers[] | select(.name == "analysis") | .resources'
{
  "limits": {
    "cpu": "16",
    "memory": "8Gi"
  },
  "requests": {
    "cpu": "1",
    "ephemeral-storage": "1Gi",
    "memory": "2Gi"
  }
}

If the user actually needs 16 CPUs, it's possible that the analysis will run on a node that has fewer than 16 CPUs, which could cause an analysis to fail.

Another option would be to set both the min_cpu_cores and max_cpu_cores fields to the same value, which may be preferable simply because it's more explicit. Now that I think about it, being explicit is probably the best option simply because it avoids confusion, so my recommendation would be to set both the min_cpu_cores and max_cpu_cores fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with setting only the max_cpu_cores field is that if the default number of CPUs for the app is lower than the requested maximum then there's a chance that the analysis will be scheduled on a node that doesn't have enough CPUs to run the analysis.

That's a good point, I didn't think of that.

Another option would be to set both the min_cpu_cores and max_cpu_cores fields to the same value, which may be preferable simply because it's more explicit. Now that I think about it, being explicit is probably the best option simply because it avoids confusion, so my recommendation would be to set both the min_cpu_cores and max_cpu_cores fields.

This does sound like the best option. In that case, I still recommend using the max_cpu_cores field in the UI form, and removing the min_cpu_cores form field from displaying in the UI. This way the logic in the initAppLaunchValues formatter does not have to be updated to set a default selected value, so that the max a tool allows will not be used by the API by default:

// If no default_max_cpu_cores is returned from the API,
// then use the default from configs (if it's less than the actual max)
// so the max is not automatically submitted by the services.
const reqInitValues = requirements?.map(
({
step_number,
max_cpu_cores,
default_max_cpu_cores = max_cpu_cores < defaultMaxCpuCores
? max_cpu_cores
: defaultMaxCpuCores,
default_cpu_cores = 0,
default_memory = 0,
default_disk_space = 0,
}) => ({
step_number,
max_cpu_cores: default_max_cpu_cores,
min_cpu_cores: default_cpu_cores,
min_memory_limit: default_memory,
min_disk_space: default_disk_space,
})
);

The submission formatter will also need to be updated so that each requirement sets its min_cpu_cores value with the max_cpu_cores form value (maybe with a requirements: requirements.map( ... ) function):

const formatSubmission = (
defaultOutputDir,
{
notify,
notifyPeriodic,
periodicPeriod,
debug,
name,
description,
output_dir,
system_id,
app_id,
app_version_id,
requirements,
groups,
}
) => ({
notify,
notify_periodic: notifyPeriodic,
periodic_period: periodicPeriod,
debug,
create_output_subdir: output_dir === defaultOutputDir,
name: name.trim(),
description,
output_dir,
system_id,
app_id,
app_version_id,
requirements,
config: groups?.reduce(paramConfigsReducer, {}),
});

So we would still remove the selectMins and selectMaxes typography labels, but the max_cpu_cores field can now be labeled with the new cpuCores label.

label={t("minCPUCores")}
value={min_cpu_cores}
label={t("cpuCores")}
value={cpu_cores}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong, but I think that this line might have to be left unchanged as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also replace this Review Row with the max_cpu_cores row from below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking more closely at this, I think that @psarando is correct about this. The Discovery Environment UI sets the max_cpu_cores field to whatever it thinks the default number of CPUs requested for the analysis should be, which is what users would expect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the value here comes from the JSON returned by the initAppLaunchValues formatter. So we can rename this label with the new cpuCores label, but in order to change the form field name, both the initAppLaunchValues formatter and the submission formatter would need to be updated to convert to and from the API JSON and the JSON used by the form fields.

Since both max_cpu_cores and min_cpu_cores are still needed to populate the CPU cores selection list values, I think it might be less work to just use max_cpu_cores as the form field name (though the display label can be updated):

const cpuCoreList = buildLimitList(
1,
min_cpu_cores || 0,
max_cpu_cores || defaultMaxCPUCores || 8
);

Copy link
Member

@psarando psarando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend replacing the min_cpu_cores form field with the max_cpu_cores form field, since I think the max_cpu_cores field is more essential is controlling how many CPUs per hour a user will consume for either VICE or Condor analyses.

Also, I don't see any Chromatic builds associated with this PR, so you probably still need to set up a Chromatic token for your sonora fork.

@@ -127,8 +127,8 @@ const StepResourceRequirementsForm = ({
<Grid item xs={12}>
<FastField
id={buildID(baseId, ids.RESOURCE_REQUESTS.TOOL_CPU)}
name={`requirements.${index}.min_cpu_cores`}
label={t("minCPUCores")}
name={`requirements.${index}.cpu_cores`}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to say the same as @slr71, except I would suggest the opposite: I think the min_cpu_cores field should be replaced with the max_cpu_cores field.

If I remember correctly, the max_cpu_cores field was added for VICE apps, so those apps need this field to limit users' CPU consumption; and I just tested a Condor job with only the max_cpu_cores field set to 1, and the job only consumed 1 CPU / hour.

This will also still allow the default max CPU of 4 to be selected automatically (which was added to fix a previous issue where users were accidentally using the max set by the app, which can be 128 or more).

So I would remove the {t("selectMins")} typography label above, and replace this min_cpu_cores field with the max_cpu_cores field that was below this section; and the selectMaxes typography label would no longer be needed, and the new cpuCores label would no longer be required, as well.

Comment on lines 323 to 304
const {
step_number,
min_cpu_cores,
min_memory_limit,
min_disk_space,
max_cpu_cores,
} = stepRequirements;
const { step_number, min_memory_limit, min_disk_space, cpu_cores } =
stepRequirements;

const hasRequest = !!(
min_cpu_cores ||
min_memory_limit ||
min_disk_space ||
max_cpu_cores
);
const hasRequest = !!(min_memory_limit || min_disk_space || cpu_cores);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only min_cpu_cores need to be removed in these params.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that we want to set min_cpu_cores and max_cpu_cores explicitly if the user selects any value for the CPU Cores field, I think this might have to be left as it was before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is only for displaying values in the review step that have been set in the previous step of the form. If we are only displaying the max_cpu_cores field, then the check for min_cpu_cores does not need to be included in hasRequest here.

label={t("minCPUCores")}
value={min_cpu_cores}
label={t("cpuCores")}
value={cpu_cores}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also replace this Review Row with the max_cpu_cores row from below.

stepRequirementErrors?.min_memory_limit ||
stepRequirementErrors?.min_disk_space ||
stepRequirementErrors?.max_cpu_cores
stepRequirementErrors?.cpu_cores
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stepRequirementErrors?.cpu_cores
stepRequirementErrors?.max_cpu_cores

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this have to change also if both the min_cpu_cores and max_cpu_cores fields are specified in the request to the API?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This controls if the resource requirements are displayed in the review step, but only if those form fields have any values selected. If we are only displaying the max_cpu_cores form field, and hiding min_cpu_cores in the form, then the stepRequirementErrors?.min_cpu_cores || line can be removed here.

Comment on lines 241 to 252
values.requirements.forEach((req, i) => {
if (req?.min_cpu_cores && req?.max_cpu_cores) {
const err = validateNotAbove(
req.min_cpu_cores,
req.max_cpu_cores,
t
);
if (err) {
reqErrors[i] = { min_cpu_cores: err };
}
}
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave this validation in place, because it's still possible for an app to have a min_cpu_cores requirement set in the API, and we don't want the possibility of someone being able to somehow submit a max requirement that is less than the min set by the app creator (although the way the CPU core select list is built should prevent that).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I overlooked this in my last review, but if we're now hiding min_cpu_cores in the form, then this error won't display anywhere anymore, so reqErrors[i] = { min_cpu_cores: err } here should be updated to reqErrors[i] = { max_cpu_cores: err }.

@psarando
Copy link
Member

psarando commented Jan 23, 2025

Also, I don't see any Chromatic builds associated with this PR, so you probably still need to set up a Chromatic token for your sonora fork.

I'm seeing Chromatic builds for this PR now, so I guess I was wrong about this (I switched to a new "merge experience" view, but also there were no visual storybook changes, so the Chromatic builds were hidden from my view by default).

name={`requirements.${index}.min_cpu_cores`}
label={t("minCPUCores")}
name={`requirements.${index}.max_cpu_cores`}
label={t("cpuCores")}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also remove the selectMins grid item just above this, since these are no longer all minimum requirements.

Copy link
Member

@psarando psarando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

I also tested the DE Word Count story, using the Chromatic build from the Storybook Publish link under the checks section of this PR, and it appears to be submitting the correct requirements for both min and max CPUs 🎉

Copy link
Member

@slr71 slr71 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as well! 👍 :shipit:

@suemkwon suemkwon merged commit 4ed0ff6 into main Jan 25, 2025
8 checks passed
@psarando psarando deleted the CPU-request-fields branch January 31, 2025 00:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants