I already posted this as issue #13294, and several users answered that they have the same problem, so it seems to be a bug :)
Your current environment
Hey guys :)
Since version 0.6.6 up to the current v0.7.2, I have had a slightly annoying problem. When I start up my AI server with vLLM, everything works fine: the model is loaded and can be used as desired.
However, as soon as I end my start script and want to load the same model again, or even a different model, vLLM always freezes at this point:
[W214 15:20:56.973624780 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W214 15:20:56.008814328 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W214 15:20:56.104798409 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W214 15:20:56.107680744 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W214 15:20:56.196595399 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W214 15:20:56.199089483 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W214 15:20:56.205991785 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W214 15:20:56.207522727 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
(VllmWorkerProcess pid=8968) INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=8969) INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=8968) INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=8969) INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
(VllmWorkerProcess pid=8971) INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=8967) INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
(VllmWorkerProcess pid=8971) INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
(VllmWorkerProcess pid=8967) INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
(VllmWorkerProcess pid=8970) INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=8972) INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=8973) INFO 02-14 15:20:56 utils.py:950] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=8970) INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
(VllmWorkerProcess pid=8972) INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
(VllmWorkerProcess pid=8973) INFO 02-14 15:20:56 pynccl.py:69] vLLM is using nccl==2.21.5
When I quit vLLM before, I always get this warning:
[rank0]:[W214 15:18:06.519697808 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
Unfortunately, I cannot find any examples of how to run the start script so that the shutdown is performed correctly. Can you help me with this?
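From the warning text, I gather the application should call destroy_process_group before exiting. Here is a minimal sketch of what I imagine (assuming the server is driven from a Python script and torch.distributed has been initialized; `shutdown_distributed` is just a name I made up, not something from vLLM):

```python
import atexit

import torch.distributed as dist


def shutdown_distributed() -> None:
    """Tear down the NCCL process group before the interpreter exits."""
    # Only destroy the group if torch.distributed was actually initialized.
    if dist.is_initialized():
        dist.destroy_process_group()


# Register the cleanup to run on normal interpreter exit, so the
# "application should call destroy_process_group" warning no longer fires.
atexit.register(shutdown_distributed)
```

Is something like this the intended way to shut down cleanly, or is there a supported hook I am missing?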
How would you like to use vllm
I would like it if I didn't always have to restart the AI machine to reload a model with vLLM
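To illustrate what I mean, here is a minimal sketch of the workflow I am after (assuming the offline `LLM` API; the model names are placeholders, and `destroy_model_parallel` is my guess at the relevant cleanup hook, not a confirmed recipe):

```python
import gc

import torch
from vllm import LLM
from vllm.distributed.parallel_state import destroy_model_parallel

# Load and use a first model (placeholder name, not my actual model).
llm = LLM(model="facebook/opt-125m")
print(llm.generate("Hello, my name is")[0].outputs[0].text)

# Tear the engine down so the GPUs are free for another model
# in the same process, without rebooting the machine.
destroy_model_parallel()
del llm
gc.collect()
torch.cuda.empty_cache()

# Load a second model (placeholder name) in the same session.
llm = LLM(model="facebook/opt-350m")
```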
Before submitting a new issue...
Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
🐛 Describe the bug
My startskript.sh: