Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial higher limits and default behavior for private tool resource restrictions #132

Closed
wants to merge 1 commit into from

Conversation

ianmcorvidae
Copy link
Member

This is a pretty small PR with work I did weeks ago, but coming back to it I'm not actually sure there's more that needs changing in apps. I'd appreciate a check from those who work a bit more with the apps code to make sure I'm not crazy and there's not more needed here!

@@ -69,7 +78,7 @@
restrict-private-tool
(assoc :implementation (ensure-default-implementation user implementation))
persistence/add-tool)]
(containers/add-tool-container tool-id (restrict-private-tool-container container))
(containers/add-tool-container tool-id (restrict-private-tool-container (set-private-tool-defaults container)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The basic gist here is: set-private-tool-defaults should set anything that isn't provided to the (lower) default limits, where restrict-private-tool-container has been repurposed to restrict things down to the higher maximum limits. I've kept it so the latter will still default the values to the default limits, but I wanted to be slightly more explicit, particularly because ultimately we may want to move restrict-private-tool-container to another namespace and apply it to all tools, not just private ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original intent was to restrict private tools to a maximum of the apps.tools.private.memory-limit config (16GB), whether the user provided no limit, or if the provided limit was higher than this configured max.

So is it now the intent to allow the user to set a higher limit for private tools, up to the apps.tools.memory-limit max for all tools (32GB)? Or do we still want to restrict private tools to a lower limit than public tools (of course, an admin should still be able to lift the limit to the apps.tools.memory-limit max)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That all tools would be subject to the same effective maximum was my understanding, yeah -- with the prior limits for private tools becoming the defaults instead (and we'll also be hiding this slightly behind something saying it's advanced behaviour). Not much benefit to adding an interface for this unless it can be adjusted both up and down, and I don't think we want to bring the defaults down (though we could -- VICE defaults to 2 cores and I forget how much memory, but I think less than the current private tool defaults). It's also worth noting that the trajectory here is towards this being only the max, adding a field also for the minimum (default whatever value we consider acceptable as the smallest slot), and then allowing users to select a value between them at runtime. Once again there, we'd probably peg the defaults somewhere like these older values, I think

@ianmcorvidae
Copy link
Member Author

This PR is part of #123

Copy link
Member

@slr71 slr71 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't spot anything that you missed. If you are crazy, then I'm apparently at least as crazy. 😆 :shipit:

Copy link
Member

@psarando psarando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good, but I had a question about the intent or policy of resource restrictions for private tools vs. all tools.

@@ -69,7 +78,7 @@
restrict-private-tool
(assoc :implementation (ensure-default-implementation user implementation))
persistence/add-tool)]
(containers/add-tool-container tool-id (restrict-private-tool-container container))
(containers/add-tool-container tool-id (restrict-private-tool-container (set-private-tool-defaults container)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original intent was to restrict private tools to a maximum of the apps.tools.private.memory-limit config (16GB), whether the user provided no limit, or if the provided limit was higher than this configured max.

So is it now the intent to allow the user to set a higher limit for private tools, up to the apps.tools.memory-limit max for all tools (32GB)? Or do we still want to restrict private tools to a lower limit than public tools (of course, an admin should still be able to lift the limit to the apps.tools.memory-limit max)?

Copy link
Member

@psarando psarando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, if we decided to allow users to (eventually) set their private tools to use as much RAM/CPU as public tools, then these changes LGTM 👍

@ianmcorvidae
Copy link
Member Author

After discussion elsewhere I think I'm going to close this and rework it. I think the plan is:

  • Add configurations much like this PR, for system maximums (and later, minimums maybe), private tool (and maybe VICE) maximums, and defaults for different types of tools.
  • set up the DB to have columns for tools' "recommended" values, copying what's currently in the maximum fields to that one, and using that (for now) for setting limits
  • change it so apps always resolves the limits to real values to pass down (even for public tools -- use the default values if stuff is unset, otherwise use those values)
  • change defaults that currently are set elsewhere so that they yell more loudly if they're used, since they shouldn't be now

Future work will allow also setting minimums/maximums for tools, allow choosing a value when launching analyses, and possibly adding recommended/minimum/maximum values at the app level as well. But to start it'll just be tool-level recommended values and configurable maximums resolved in the apps service.

@cyverse-org
Copy link

cyverse-org commented Nov 13, 2018 via email

@ianmcorvidae
Copy link
Member Author

We haven't really fleshed out a way to limit disk/data size, so we'd only really be able to add some sort of "advisory" limit, or something that kills jobs outright if they exceed the size. We also haven't yet figured out GPU integration stuff at all, and probably anything dealing with that architecture and/or CPU architecture would need to be designed and done at once. Basically, what I've laid out is the first steps, and I think what you've said goes in the "future work" section :)

@cyverse-org
Copy link

cyverse-org commented Nov 13, 2018 via email

@ianmcorvidae
Copy link
Member Author

Yeah, we already add all the groups the submitting user is part of to the classad. We don't currently support any sort of "launch as a member of X group" feature, so you're always considered as a member of every group you're part of, but that's quite sufficient for a class or workshop probably.

@ianmcorvidae
Copy link
Member Author

And, the way it's currently set up, we basically have a template that gets used to create the submit file, which has access to the full job definition. Most likely we don't benefit from doing much more than that, since we don't want to encode details of our condor configuration in our high-level models or vice versa.

@cyverse-org
Copy link

cyverse-org commented Nov 13, 2018 via email

@cyverse-org
Copy link

cyverse-org commented Nov 13, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants