Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

policy: expose internals of resource module policy factory to allow custom policies #1268

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

wihobbs
Copy link
Member

@wihobbs wihobbs commented Aug 13, 2024

Problem: the current fluxion policy matching factory doesn't allow for customizable policies.

Add a custom option and refactor existing policies to behave similarly. Add unit testing for the string parsing convenience functions.

Outstanding to-do:

  • add an option in the scheduler config to actually pass a custom string in, and probably some validation too

@wihobbs
Copy link
Member Author

wihobbs commented Aug 13, 2024

And I know the testing and the code should be in separate commits... sigh (will fix)

@wihobbs wihobbs force-pushed the new-policy-refactor branch from f7f1df3 to cd87d59 Compare August 13, 2024 01:09
Copy link
Member

@cmoussa1 cmoussa1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this @wihobbs! I have just some preliminary questions/comments on a first pass. Apologies in advance that I'm not very familiar with the sched code at all so I could definitely be misunderstanding how this works. :-)

resource/policies/dfu_match_policy_factory.cpp Outdated Show resolved Hide resolved
resource_type_t node_rt ("node");
std::shared_ptr<dfu_match_cb_t> create_match_cb (const std::string &policy_requested)
{
std::string policy = policies.find (policy_requested)->second;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this .find () call have any sort of error checking after the lookup of policy_requested? Maybe this is always guaranteed to find something successfully. If not, maybe you can check the contents of policy before moving on.

if (policy == "")
    // return an error?
    // set matcher to something default?
    // sorry that idk what the right behavior should be here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm good catch. It should call known_match_policy first to see if the policy exists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, actually, known_match_policy is called when the config is validated in resource/modules/resource_match_opts.cpp so it's already been validated before it gets here.

This will need additional work to support custom.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, a comment making note that it is already validated by the time it gets to that lookup might be helpful here!


if (option_exists ("stop_on_k_matches", policy)) {
ptr->set_stop_on_k_matches (parse_int_match_options ("stop_on_k_matches", policy));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this if-else branch need an else statement? Excuse my ignorance on this - I'm not sure if the addition of custom policies allows for something other than low or high

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK low and high are the only options, unless locality or variation are chosen.

@wihobbs wihobbs force-pushed the new-policy-refactor branch 2 times, most recently from 5f03ae3 to ba80f7b Compare August 13, 2024 16:31
Copy link
Member

@jameshcorbett jameshcorbett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did a quick look-through, it looks reasonable to me. I think the commits could be broken up better though (as you noted already)

bool known_match_policy (const std::string &policy);

const std::map<std::string, std::string> policies =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do a deeper look over this, but first thing this should be split. Put the declaration here with extern and the definition in the cpp file so this variable doesn't end up getting defined in all includers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, that's what the old code did, but it shouldn't 😬

@garlick
Copy link
Member

garlick commented Aug 20, 2024

PR title seems truncated?

@wihobbs
Copy link
Member Author

wihobbs commented Aug 20, 2024

Well, custom is a new type of policy that this PR is enabling. Or will enable eventually. Maybe "resource: refactor policy options to allow a user-defined policy" is more descriptive?

Note that this is still WIP and needs some config work and testing for the custom options. Happy to incorporate any feedback on the initial draft (and thanks to those who have already reviewed!), but it's going to stay in draft state until the code can:

  • accept a custom string with policy options
  • reject certain strings that are requesting invalid custom options, such as set_stop_on_k_matches=-100 and such
  • test some custom options and make sure they behave as expected
  • break out the commit history in a better way

@wihobbs
Copy link
Member Author

wihobbs commented Sep 5, 2024

Per discussion with Tom:

  • To handle custom policies, "don't validate, parse" -- parse the input on a delimiter first and then verify that each key is valid, give an error message if it isn't, possibly demote this to a warning based on a config option
  • Fluxion should issue a fatal warning if a policy named is not valid -- don't fall back on first (might already be done?)
  • set_stop_on_k_matches probably only takes a value of 1, however, we should test this
  • we should also think about having "node_centric" have a score factor of 10000, maybe we don't want to allow users to configure this, but maybe that number doesn't cover all our bases

@wihobbs wihobbs force-pushed the new-policy-refactor branch from ba80f7b to 2d176c4 Compare November 15, 2024 17:25
@wihobbs wihobbs force-pushed the new-policy-refactor branch 3 times, most recently from 147eebf to 3484d1c Compare December 23, 2024 18:07
@wihobbs wihobbs requested a review from jameshcorbett January 16, 2025 18:19
Copy link
Member

@jameshcorbett jameshcorbett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good! Just noticed one thing.

resource/policies/dfu_match_policy_factory.hpp Outdated Show resolved Hide resolved
@grondo
Copy link
Contributor

grondo commented Jan 16, 2025

Don't forget to address @garlick's comment about the PR title, which will end up as a line in the release notes.

@wihobbs
Copy link
Member Author

wihobbs commented Jan 16, 2025

I just asked @jameshcorbett to do a first pass to get me started. Note this still is in draft state until it has working sharness tests and additional reviews.

@wihobbs wihobbs force-pushed the new-policy-refactor branch from 3484d1c to 2efc97e Compare January 22, 2025 21:54
@wihobbs wihobbs changed the title policy: refactor options to allow custom policy: expose internals of resource module policy factory to allow custom policies Jan 22, 2025
Problem: policies in fluxion are sets of options that are
specified in a shared pointer, and, when a new set of options
(a new policy) is created, this requires at least a re-compilation
of flux-sched. In other words, the internal options on which policies
are built are not exposed to end users for creating new, custom policies.

Solution: continue to support the existing policy options, but make them
sets of specified options, and allow users to create their own sets and
pass them to the scheduler as a string.
Problem: the custom policy code added in a1a54e0c does not
have sharness testing.

Solution: building from existing, validated outputs, provide
the full string for certain policy options (rather than the
short string) and check that the matches are the same.
Problem: the policy factory in the resource module has no
unit testing.

Solution: add a series of unit tests to validate the parsing
and selection of policy options in the factory.
@wihobbs wihobbs force-pushed the new-policy-refactor branch from 2efc97e to bdff4d5 Compare January 22, 2025 21:57
@wihobbs
Copy link
Member Author

wihobbs commented Jan 22, 2025

Once the automated tests pass I'll drop draft from this PR.

Copy link

codecov bot commented Jan 22, 2025

Codecov Report

Attention: Patch coverage is 95.94595% with 3 lines in your changes missing coverage. Please review.

Project coverage is 75.3%. Comparing base (22c24d5) to head (bdff4d5).

Files with missing lines Patch % Lines
resource/policies/dfu_match_policy_factory.cpp 93.8% 3 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##           master   #1268   +/-   ##
======================================
  Coverage    75.3%   75.3%           
======================================
  Files         111     112    +1     
  Lines       16042   16088   +46     
======================================
+ Hits        12081   12125   +44     
- Misses       3961    3963    +2     
Files with missing lines Coverage Δ
...licies/base/test/matcher_policy_factory_test02.cpp 100.0% <100.0%> (ø)
resource/policies/dfu_match_policy_factory.cpp 89.2% <93.8%> (+0.5%) ⬆️

@wihobbs wihobbs marked this pull request as ready for review January 22, 2025 22:48
@wihobbs wihobbs requested a review from jameshcorbett January 22, 2025 22:48
@trws
Copy link
Member

trws commented Jan 23, 2025

This is starting to look really good @wihobbs! There's one tweak I'd like to see to this so we can keep extending it. I like the way the options are handled, this is exactly what I was hoping for. The one tweak I'd make is to have a policy=* option rather than separate low= and high=, since the base policy object types are low, high, locality and variation. That way, we get the "you can't set both low and high at the same time" for free, and can even collapse the two low and high branches to create a matcher based on which it is then set options on the resulting object after the condition. We can also check that policy has been set, either with the bare name or by having policy=whatever to allow options. Does that sound workable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants