-
Notifications
You must be signed in to change notification settings - Fork 836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(scheduler): add partial scheduling based on min replicas #6221
Conversation
// also send an update if the model is not yet at desired replicas, if we have partial scheduling | ||
|
||
if replicaStateUpdated || modelVersion.state.State == ScheduleFailed || model.IsDeleted() || modelVersion.state.State == ModelProgressing || | ||
(modelVersion.state.State == ModelAvailable && len(modelVersion.GetAssignment()) < modelVersion.DesiredReplicas()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof.. this statement is becoming unwieldy (if it wasn't already). I wonder if it's worth having an explicit task to figure out how to refactor it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we should fix it, it keeps growing. I will add a todo in the code.
065a56e
to
6170cdf
Compare
What this PR does / why we need it:
This PR introduces the ability to do partial scheduling for a given model if:
In this case we load the model on as many replicas as possible assuming it meets min replicas requirement; we mark the model as
Available
/Ready
(instead of theScheduleFailed
) . We also make sure that when there is more capacity available we retry scheduling to make sure so that the model can potentially make use of this extra capacity.We also favour servers that have the models loaded to minimise ping-ping behaviour.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer: