
vllm: improve container support #200

Open · russellb opened this issue Oct 6, 2024 · 2 comments
Labels
enhancement

Comments

russellb (Contributor) commented Oct 6, 2024

This is a follow-up issue for #181.

The vllm inline inference adapter works in both the conda and docker stack types, but some features fail in the docker case because the base image does not include all of the necessary dependencies (some CUDA libraries, in particular). The specific case that failed for me was setting tensor_parallel_size greater than 1: NCCL fails to initialize because the NCCL library isn't present in the image.
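For reference, here is a minimal sketch of the failing case using vLLM's Python API (the model name and GPU count are illustrative; any tensor_parallel_size greater than 1 forces vLLM to initialize NCCL for inter-GPU communication):

```python
# Minimal sketch of the failing case. With tensor_parallel_size > 1,
# vLLM shards the model across GPUs and initializes NCCL for collective
# communication; this is the step that fails when the NCCL library is
# missing from the docker base image. The model name is illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # any supported model
    tensor_parallel_size=2,  # > 1 triggers NCCL initialization
)

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Hello!"], sampling_params=params)
print(outputs[0].outputs[0].text)
```

The same configuration works in the conda stack type, where the host environment already provides the CUDA/NCCL libraries.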

I started working on this but haven't gotten it fully working yet.

My WIP is here: russellb@3a61246

yanxi0830 added the enhancement label on Oct 15, 2024