Custom model orchestration #6145
-
Hi,
Replies: 3 comments
-
Are you using a model to load/unload other models? That sounds like a poor separation of duties and can run into a lot of trouble. The intent of the model loading/unloading API is to be called by the user. Could you describe your use case further? Python backend models are created in their own processes, so it's possible they are not communicating on the same network as the Triton server. CC: @Tabrizian
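For context, here is a minimal sketch of the user-facing model control flow referred to above, assuming the server was started with `--model-control-mode=explicit` and the `tritonclient` Python package is installed; the model name `my_model` is just a placeholder:

```python
# Minimal sketch: a client calling Triton's model load/unload API,
# assuming the server runs with --model-control-mode=explicit.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Ask the server to load a model from the model repository.
client.load_model("my_model")
print("ready:", client.is_model_ready("my_model"))

# Unload it when it is no longer needed, freeing its resources.
client.unload_model("my_model")
```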
-
Ray was designed to handle situations like this. https://github.com/autonomi-ai/nos supports serving multiple models on the same hardware (with loading/unloading) if you want to try it out.
-
@tingc9 We've added model load/unload support in the Python backend starting from 23.07, which might be what you're looking for: https://github.com/triton-inference-server/python_backend?tab=readme-ov-file#model-loading-api
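For reference, a minimal sketch of that in-process model loading API based on the linked README; it assumes the server runs in explicit model control mode, and `onnx_model` is only a placeholder for the model being orchestrated:

```python
# Minimal sketch of the Python backend model loading API (23.07+),
# assuming --model-control-mode=explicit; "onnx_model" is a placeholder.
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        self.model_name = "onnx_model"
        # Load the dependent model from the model repository if it is not ready.
        if not pb_utils.is_model_ready(model_name=self.model_name):
            pb_utils.load_model(model_name=self.model_name)

    def execute(self, requests):
        # ... run inference, possibly delegating work to self.model_name ...
        return []

    def finalize(self):
        # Unload the dependent model when this orchestrator model is torn down.
        pb_utils.unload_model(model_name=self.model_name)
```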