🚀 Feature
Support reloading a model when a new version becomes available, and serving multiple models.
Motivation
Other serving frameworks support this, so adding it would make this project more attractive.
Pitch
Right now it is obvious how to serve one model, but what if there are multiple, with the request (binary payload or HTTP arguments) indicating which model should be used?
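A minimal sketch of what this could look like: a registry that routes each request to a model by name and hot-swaps a model when a newer version is registered. All names here (`ModelRegistry`, `register`, `predict`) are illustrative, not an existing API, and plain callables stand in for real models.

```python
class ModelRegistry:
    """Hypothetical registry routing requests to one of several named models."""

    def __init__(self):
        # name -> (version, model object)
        self._models = {}

    def register(self, name, version, model):
        """Add a model, or replace it when a newer version arrives (reload)."""
        current = self._models.get(name)
        if current is None or version > current[0]:
            self._models[name] = (version, model)

    def predict(self, name, payload):
        """Route a request to the model named in the request arguments."""
        version, model = self._models[name]
        return model(payload)


registry = ModelRegistry()
registry.register("sentiment", 1, lambda x: "v1:" + x)
registry.register("ner", 1, lambda x: "ner:" + x)
print(registry.predict("sentiment", "hello"))  # prints "v1:hello"

# A new version appears: registering it swaps the model in place,
# without restarting the server or touching the other models.
registry.register("sentiment", 2, lambda x: "v2:" + x)
print(registry.predict("sentiment", "hello"))  # prints "v2:hello"
```

In a real server the `name` would come from the request path or an HTTP argument, and `register` would be driven by a watcher that notices new model versions on disk.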
Alternatives
Run N instances for the N models present at a given time; however, if a new model appears, that approach breaks down.
Additional context
We have an internal C++ server that supports this, and TorchServe supports it too, via what I believe they call an orchestrator.