This is a step-by-step guide on how to deploy OpenVINO™ Model Server on Linux, using a Docker Container. Links are provided for different compatible hardware.
Before you start, make sure you have:
- Docker Engine installed (How to Install Docker Engine)
- Intel® Core™ processor (6-12th gen.) or Intel® Xeon® processor
- (optional) AI accelerators supported by OpenVINO
- Linux, macOS or Windows via WSL
NOTE: accelerators are only tested on bare-metal Linux hosts.
- Pull OpenVINO™ Model Server Image.
- Start a Docker Container with OVMS and your chosen model from cloud storage.
- Provide the input files (arrange an input dataset).
- Prepare a client package.
- Run the prediction using ovmsclient.
Here is an example of this process using a ResNet50 model for image classification:
Pull an image from Docker Hub or the Red Hat Ecosystem Catalog
docker pull openvino/model_server:latest
# or, alternatively
docker pull
Start the container
# start the container
docker run -p 9000:9000 openvino/model_server:latest \
--model_name resnet --model_path gs://ovms-public-eu/resnet50-binary \
--layout NHWC --port 9000
# download input files, an image, and a label mapping file
# Install the Python-based ovmsclient package
pip3 install ovmsclient
Run prediction
import numpy as np
from classes import imagenet_classes
from ovmsclient import make_grpc_client
client = make_grpc_client("localhost:9000")
with open("path/to/img.jpeg", "rb") as f:
    img = f.read()  # raw JPEG bytes, sent to the server as binary input
output = client.predict({"0": img}, "resnet")
result_index = np.argmax(output[0])
predicted_class = imagenet_classes[result_index]
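A single argmax hides how confident the model is about its runner-up classes. The sketch below ranks the top few predictions instead. It is a minimal illustration, assuming that `output[0]` is a flat vector of per-class scores and that `imagenet_classes` maps a class index to a label; the helper name `top_k` and the demo labels are hypothetical and not part of ovmsclient.

```python
def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first.

    `scores` is any indexable sequence of numbers (a list or a 1-D array),
    `labels` maps a class index to a human-readable name.
    """
    # Sort class indices by descending score, then keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], float(scores[i])) for i in ranked[:k]]


if __name__ == "__main__":
    # Stand-in data; with the example above this would be
    # top_k(output[0], imagenet_classes).
    demo_scores = [0.05, 0.70, 0.25]
    demo_labels = {0: "cat", 1: "dog", 2: "fox"}
    for label, score in top_k(demo_scores, demo_labels, k=2):
        print(f"{label}: {score:.2f}")
```

With the prediction example above, the call would be `top_k(output[0], imagenet_classes)`.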
To learn how to set up OpenVINO Model Server, refer to the Quick Start guide.
You can build your own Docker image by executing the make docker_build command in the git repository root folder.
In the ./dist directory it will generate:
- image tagged as openvino/model_server:latest - with CPU, NCS, and HDDL support
- image tagged as openvino/model_server:latest-gpu - with CPU, NCS, HDDL, and iGPU support
- image tagged as openvino/model_server:latest-nginx-mtls - with CPU, NCS, and HDDL support and a reference nginx setup of mTLS integration
- release package (.tar.gz, with ovms binary and necessary libraries)
Note: The OVMS docker image can be created with the ubi8-minimal base image or the default ubuntu20. OVMS with the ubi base image doesn’t support NCS and HDDL accelerators.
To do so, use either of these commands:
Running the inference operation on GPU requires the ovms process security context account to have correct permissions. It has to belong to the render group identified by the command:
stat -c "group_name=%G group_id=%g" /dev/dri/render*
The default account in the docker image is already preconfigured. In case you change the security context, use the following command to start the ovms container:
docker run --rm -it --device=/dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) -u $(id -u):$(id -g) \
-v /opt/model:/opt/model -p 9001:9001 openvino/model_server:latest-gpu \
--model_path /opt/model --model_name my_model --port 9001 --target_device GPU
Note: The public docker image includes the OpenCL drivers for GPU in version 21.38.21026.
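The same lookup and command assembly can be scripted, which may help when the render group id differs between hosts. The following is a sketch only: it mirrors the `stat` and `docker run` commands above, the function names are hypothetical, and the model directory, model name, and port are the placeholder values from the example.

```python
import glob
import os
import shlex


def render_group_id(pattern="/dev/dri/render*"):
    """Return the numeric group id owning the first render node, or None."""
    nodes = sorted(glob.glob(pattern))
    if not nodes:
        return None
    return os.stat(nodes[0]).st_gid


def gpu_run_command(render_gid, uid, gid, model_dir="/opt/model",
                    model_name="my_model", port=9001,
                    image="openvino/model_server:latest-gpu"):
    """Assemble the argument list of the `docker run` command shown above."""
    return [
        "docker", "run", "--rm", "-it",
        "--device=/dev/dri",
        f"--group-add={render_gid}",   # grant access to the render node
        "-u", f"{uid}:{gid}",          # custom security context
        "-v", f"{model_dir}:{model_dir}",
        "-p", f"{port}:{port}",
        image,
        "--model_path", model_dir,
        "--model_name", model_name,
        "--port", str(port),
        "--target_device", "GPU",
    ]


if __name__ == "__main__":
    rgid = render_group_id()
    if rgid is None:
        print("No /dev/dri/render* node found on this host.")
    else:
        # Print the command instead of running it, so it can be reviewed first.
        print(shlex.join(gpu_run_command(rgid, os.getuid(), os.getgid())))
```

Printing the command rather than executing it keeps the sketch free of side effects; the resulting line can be pasted into a shell once the values look right.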
If you have multiple inference devices available (e.g. Myriad VPUs and CPU), you can increase inference throughput by enabling the Multi-Device Plugin. With the Multi-Device Plugin enabled, inference requests are load balanced between the devices. For more detailed information, read OpenVINO's Multi-Device plugin documentation.
To use this feature in OpenVINO™ Model Server, the following step is required:
Set target_device for the model in the configuration JSON file to MULTI:DEVICE_1,DEVICE_2 (e.g. MULTI:MYRIAD,CPU). The order of the devices defines their priority, so MYRIAD devices will be used first in this example.
Below is an example config.json that sets up the Multi-Device Plugin for a resnet model, using Intel® Movidius™ Neural Compute Stick and CPU devices:
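One way to produce such a configuration is to generate it, so that the device list and its priority order live in one place. This is a minimal sketch: the `model_config_list`, `config`, `name`, `base_path`, and `target_device` keys follow the model server's configuration file layout, while the helper name and the `/opt/ml/resnet` path are assumptions made for illustration.

```python
import json


def multi_device_config(name, base_path, devices):
    """Build a single-model config that targets the MULTI plugin.

    `devices` is listed in priority order, e.g. ["MYRIAD", "CPU"].
    """
    target = "MULTI:" + ",".join(devices)
    return {
        "model_config_list": [
            {
                "config": {
                    "name": name,
                    "base_path": base_path,   # hypothetical model location
                    "target_device": target,
                }
            }
        ]
    }


if __name__ == "__main__":
    config = multi_device_config("resnet", "/opt/ml/resnet", ["MYRIAD", "CPU"])
    # Write the result to config.json (or redirect this output to a file).
    print(json.dumps(config, indent=2))
```

Reordering the `devices` list changes which device is preferred, matching the priority rule described above.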
make docker_build BASE_OS=ubuntu
Additionally, you can use the INSTALL_DRIVER_VERSION argument to choose which GPU driver version is used by the produced image.
If it is not provided, the default version for the selected base OS is used.
Currently, the following versions are available:
- 21.38.21026 - Red Hat
- 21.48.21782 - Ubuntu
make docker_build INSTALL_DRIVER_VERSION=21.38.21026
If not provided, version 21.38.21026 is used for Red Hat and 21.48.21782 is used for Ubuntu.