AttributeError: 'NoneType' object has no attribute 'to' when using wanda pruning #67
Update: The same issue arises when compressing using
Note: A similar issue is observed even with
The only change that was made was related to #36. Tagging coauthor @liuzhuang13 to see if something is broken.
Hello, have you solved the problem?
@Logan-007L - Please refer to #51. @Eric-mingjie, @liuzhuang13: closing this as it resolved the issue, though I am not sure whether it is technically correct. If it is, it would be helpful to make the corresponding code changes, since Llama-2 is a popular model (the paper even reports results with it).
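For anyone hitting this, here is a minimal sketch of the kind of guard discussed in #51, assuming the calibration code moves the captured `attention_mask` and `position_ids` to the target device with `.to(dev)`. The helper name below is hypothetical and not the exact repository code:

```python
import torch

def move_if_present(tensor, dev):
    """Hypothetical helper: move a captured tensor to dev, tolerating None."""
    return tensor.to(dev) if tensor is not None else None

# Newer transformers releases pass attention_mask=None for Llama-2, so an
# unconditional attention_mask.to(dev) raises AttributeError. Guarding keeps
# the None and only moves real tensors.
dev = "cpu"
attention_mask = move_if_present(None, dev)                        # stays None, no crash
position_ids = move_if_present(torch.arange(4096).unsqueeze(0), dev)
```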
Thanks. I'll give it a try.
Hi @Eric-mingjie,
I am also facing the same issue (as in #51) when trying to prune llama-2-7b-chat-hf.
Here's the command:
```
python main.py --model meta-llama/Llama-2-7b-chat-hf --prune_method wanda --sparsity_ratio 0.5 --sparsity_type unstructured --save out/llama_7b/unstructured/wanda/0.5/
```
And the `attention_mask` seems to be `None`.
These are the `kwargs` in the `prepare_calibration_input` function:

```
{'attention_mask': None, 'position_ids': tensor([[ 0, 1, 2, ..., 4093, 4094, 4095]], device='cuda:0'), 'past_key_value': None, 'output_attentions': False, 'use_cache': False, 'cache_position': tensor([ 0, 1, 2, ..., 4093, 4094, 4095], device='cuda:0')}
```
Any idea how to go forward with this?
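For context, the failure mode reduces to calling `.to(...)` on that `None` mask, which is exactly the error in the title. A minimal illustration (not the repository code):

```python
# The hook captures attention_mask=None, and a later unconditional .to(dev)
# on it raises the error reported in the title.
attention_mask = None
try:
    attention_mask.to("cuda:0")
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'to'
```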