@PeterYang12
This looks like a mismatch between the vllm-fork version and the vllm-hpu-extension version.
Try uninstalling all vllm-related packages (the two above; you may need to run the uninstall more than once for each).
Then run:
pip install -e .
from the vllm-fork base folder (this should also install vllm-hpu-extension automatically).
Then run the calibration script again.
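A minimal shell sketch of that sequence, assuming the fork installs under the package name vllm and your shell is already inside the vllm-fork checkout:

```shell
# Repeat until pip reports the packages are not installed;
# stale copies can survive a single uninstall
pip uninstall -y vllm vllm-hpu-extension
pip uninstall -y vllm vllm-hpu-extension

# Editable install from the vllm-fork base folder;
# this should pull in the matching vllm-hpu-extension
pip install -e .
```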
Another option is to uninstall only vllm-hpu-extension (again, you may need to do it more than once) and then reinstall it.
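For the extension-only route, something like the following; the git URL is an assumption based on the project's repository name, and reinstalling the fork with pip install -e . should work just as well:

```shell
# Repeat until pip reports vllm-hpu-extension is not installed
pip uninstall -y vllm-hpu-extension
pip uninstall -y vllm-hpu-extension

# Reinstall; the URL below is an assumption based on the project name.
# Re-running `pip install -e .` in the vllm-fork checkout would also restore it.
pip install git+https://github.com/HabanaAI/vllm-hpu-extension.git
```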
I followed the README but failed to run quantization for meta-llama/Llama-3.1-8B-Instruct.
Run command:
The error message is below: