
Failed to do quantization #68

Open
PeterYang12 opened this issue Jan 3, 2025 · 2 comments
Comments

@PeterYang12

I followed the README but failed to quantize meta-llama/Llama-3.1-8B-Instruct.

Run command:

./calibrate_model.sh -m meta-llama/Llama-3.1-8B-Instruct -d /workspace/vllm-hpu-extension/calibration/open_orca/open_orca_gpt4_tokenized_llama.calibration_1000.pkl -o /workspace/vllm-hpu-extension/calibration/inc

The error message is below:

[rank0]:   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1606, in _call_impl
[rank0]:     result = forward_call(*args, **kwargs)
[rank0]: TypeError: PatchedVLLMKVCache.forward_measure() missing 2 required positional arguments: 'block_indices' and 'block_offset'
Step 2/4 done

3/4 Postprocessing scales
[]
[]
finished fix_measurements script
cp: cannot stat 'inc_tmp/llama-3.1-8b-instruct/g2/*': No such file or directory

nirda7 (Contributor) commented Jan 5, 2025

@PeterYang12
It looks like a mismatch between the vllm-fork version and the vllm-hpu-extension version.
Try uninstalling all vllm-related packages (the two above; you might need to do it more than once for each), then run:
pip install -e .
in the vllm-fork base folder (this should also install vllm-hpu-extension automatically), then run the calibration script again.

Another option is to uninstall only vllm-hpu-extension (again, you might need to do it more than once) and then install it again.
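The first option above can be sketched as a shell sequence. This is a sketch under the thread's assumptions: the package names are `vllm` and `vllm-hpu-extension`, and the vllm-fork checkout path (`./vllm-fork` here) is a placeholder you should adjust to your environment:

```shell
# Uninstall may need to run more than once per package if multiple
# copies are installed; loop until pip no longer finds the package.
for pkg in vllm vllm-hpu-extension; do
    while pip show "$pkg" > /dev/null 2>&1; do
        pip uninstall -y "$pkg"
    done
done

# Reinstall in editable mode from the vllm-fork source tree; this
# should also pull in vllm-hpu-extension as a dependency.
cd vllm-fork   # placeholder path to your vllm-fork checkout
pip install -e .

# Then rerun the calibration script as before.
```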

@PeterYang12
Author

Thank you. I am curious why I must provide some data to do the quantization. Is it Gaudi-specific?
