lm_eval --model vllm did not work when data_parallel_size > 1 #2379
Comments
Hi! I cannot reproduce this (on an A40 node). Are you already using a Ray instance? I think that might be the issue, since I don't get the autoscaler messages that appear in your log. I also haven't been able to initialize multiple models inside a multiprocessing context, because vLLM wants to create its own child processes and that's not allowed. cc @mgoin in case they have any tips!
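For context, here is a minimal sketch of the pattern being discussed: one vLLM replica per GPU, each in its own process. This is not how lm-eval launches replicas; the model name and prompts are placeholders. The key point is that the worker processes must be non-daemonic (plain multiprocessing pool workers are daemonic and cannot have children, which is exactly the restriction mentioned above, since vLLM spawns its own worker processes).

```python
# Sketch only: data-parallel vLLM replicas, one non-daemonic process per GPU.
# Daemonic pool workers fail here because vLLM forks/spawns worker processes,
# and daemonic processes are not allowed to have children.
import os
import multiprocessing as mp

def run_replica(gpu_id: int, prompts: list, results: mp.Queue):
    # Pin this replica to a single GPU before vLLM initializes CUDA.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    from vllm import LLM, SamplingParams  # import after setting the env var

    llm = LLM(model="facebook/opt-125m", tensor_parallel_size=1)
    outputs = llm.generate(prompts, SamplingParams(max_tokens=16))
    results.put((gpu_id, [o.outputs[0].text for o in outputs]))

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)  # fork + CUDA is unsafe
    queue = mp.Queue()
    procs = [mp.Process(target=run_replica, args=(i, ["Hello"], queue)) for i in range(2)]
    for p in procs:
        p.start()  # p.daemon stays False, so vLLM may create its own workers
    for p in procs:
        p.join()
    while not queue.empty():
        print(queue.get())
```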
My command:
It should use […]. You could try […] here, and maybe also calling […]. Alternatively, you could serve the models separately and use […].
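The "serve the models separately" alternative would look roughly like the sketch below: start an OpenAI-compatible vLLM server per GPU and point lm-eval's local-completions backend at it. The model name, port, and model_args keys are illustrative assumptions and may differ across lm-eval and vLLM versions.

```python
# Sketch, assuming a vLLM server is already running per GPU, e.g.:
#   CUDA_VISIBLE_DEVICES=0 vllm serve facebook/opt-125m --port 8000
# The model_args keys below are taken from the local-completions docs and
# may need adjusting for your lm-eval version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="local-completions",
    model_args=(
        "model=facebook/opt-125m,"
        "base_url=http://localhost:8000/v1/completions,"
        "num_concurrent=8,tokenized_requests=False"
    ),
    tasks=["hellaswag"],
)
print(results["results"])
```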
We noticed that lm_eval --model vllm did not work when data_parallel_size > 1 and got

Error: No available node types can fulfill resource request

from Ray. After some research, I believe that when tensor_parallel_size=1 we should use multiprocessing instead of Ray (in this line) for the latest vLLM; a sketch of that change is included after this report. My code works with data_parallel_size=1 but fails with the following error when data_parallel_size > 1. The logs are below, please help!

Log:
Meanwhile, ray status shows 4 GPUs available:
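A minimal sketch of the change suggested above, assuming a vLLM version whose LLM constructor accepts distributed_executor_backend. This is not the actual lm-eval code, only an illustration of picking the multiprocessing backend for single-GPU replicas so that no Ray placement group is ever requested.

```python
# Illustrative only: choose vLLM's executor backend from tensor_parallel_size,
# so data-parallel replicas with tensor_parallel_size=1 skip Ray scheduling
# (and the "No available node types can fulfill resource request" error).
# Assumes a vLLM version that accepts distributed_executor_backend.
from vllm import LLM

def make_engine(model_name: str, tensor_parallel_size: int) -> LLM:
    backend = "mp" if tensor_parallel_size == 1 else "ray"
    return LLM(
        model=model_name,
        tensor_parallel_size=tensor_parallel_size,
        distributed_executor_backend=backend,
    )

if __name__ == "__main__":
    # Hypothetical usage: one single-GPU replica on the multiprocessing backend.
    engine = make_engine("facebook/opt-125m", tensor_parallel_size=1)
    print(engine.generate(["Hello"])[0].outputs[0].text)
```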