SageMaker multi-model endpoints allow deploying multiple models behind a single endpoint. This is particularly useful for serving a large number of models that are invoked infrequently but must remain available for invocation without any application code changes.
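A minimal sketch of invoking one model on a multi-model endpoint with the `boto3` `sagemaker-runtime` client: the `TargetModel` parameter selects which model artifact handles the request, and SageMaker loads it on demand. The endpoint name and artifact name below are placeholders, not values from this repo.

```python
import json

# Placeholder (assumption): substitute the name of your deployed endpoint.
ENDPOINT_NAME = "my-multi-model-endpoint"

def invoke_model(runtime, target_model, payload):
    """Invoke one model hosted on a multi-model endpoint.

    `runtime` is a boto3 "sagemaker-runtime" client. TargetModel names the
    model artifact (relative to the endpoint's S3 model prefix); SageMaker
    loads it on first use and caches it, so adding new artifacts requires
    no endpoint redeployment.
    """
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        TargetModel=target_model,  # e.g. "model-42.tar.gz"
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())

# Usage (requires AWS credentials and a deployed multi-model endpoint):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# result = invoke_model(runtime, "model-42.tar.gz", {"instances": [[1.0, 2.0]]})
```

Because the target model is chosen per request, routing a call to a newly uploaded model is purely a data-plane change.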
At present the following scenarios are illustrated through example code (adapted from the amazon-sagemaker-examples repo):
- Multi-model Serving
- Building SageMaker Containers
- Building SageMaker Scikit-learn Containers
- Building SageMaker PyTorch Containers
- Building SageMaker XGBoost Containers
- SageMaker Inference Toolkit
- SageMaker SparkML Serving Container
- SageMaker TensorFlow Serving Container
- SageMaker MXNet Serving Container
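Common to the container-building scenarios above is the model definition that turns an endpoint into a multi-model one: the container is created with `Mode="MultiModel"` and `ModelDataUrl` pointing at an S3 *prefix* rather than a single `model.tar.gz`. A hedged sketch, where the image URI, role ARN, and bucket names are placeholders:

```python
def multi_model_container(image_uri, model_data_prefix):
    """Build a Containers entry for sagemaker create_model().

    With Mode="MultiModel", any model artifact uploaded under
    `model_data_prefix` becomes invocable via TargetModel without
    redeploying the endpoint.
    """
    return {
        "Image": image_uri,
        "Mode": "MultiModel",
        "ModelDataUrl": model_data_prefix,  # S3 prefix, not a single tarball
    }

# Usage with a boto3 SageMaker client (placeholder names throughout):
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_model(
#     ModelName="my-multi-model",
#     ExecutionRoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",
#     Containers=[multi_model_container(
#         "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
#         "s3://my-bucket/models/",
#     )],
# )
```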