
AttributeError: 'NoneType' object has no attribute 'to' when using wanda pruning #67

Closed
NamburiSrinath opened this issue Aug 29, 2024 · 4 comments

NamburiSrinath commented Aug 29, 2024

Hi @Eric-mingjie,

I am facing the same issue as #51 when trying to prune Llama-2-7b-chat-hf.

Here's the command:

python main.py --model meta-llama/Llama-2-7b-chat-hf --prune_method wanda --sparsity_ratio 0.5 --sparsity_type unstructured --save out/llama_7b/unstructured/wanda/0.5/

The attention_mask appears to be None.

These are the kwargs in the prepare_calibration_input function:

{'attention_mask': None, 'position_ids': tensor([[ 0, 1, 2, ..., 4093, 4094, 4095]], device='cuda:0'), 'past_key_value': None, 'output_attentions': False, 'use_cache': False, 'cache_position': tensor([ 0, 1, 2, ..., 4093, 4094, 4095], device='cuda:0')}

Any idea how to go forward with this?
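(For anyone hitting the same traceback: the crash presumably comes from code that unconditionally calls `attention_mask.to(dev)`, while newer transformers versions pass `attention_mask=None` for unpadded causal inputs, as the kwargs above show. A minimal sketch of a None-safe guard follows; the helper name `move_to_device` is hypothetical, not the repo's actual code.)

```python
# Hypothetical sketch: pass None through unchanged instead of calling
# .to(dev) on it, which raises AttributeError for NoneType.
def move_to_device(value, dev):
    """Move a tensor to `dev`; return None unchanged if the value is None."""
    return value if value is None else value.to(dev)

# Usage at a prepare_calibration_input-style call site (illustrative):
#   attention_mask = move_to_device(cache["attention_mask"], dev)
#   position_ids = move_to_device(cache["position_ids"], dev)
```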

@NamburiSrinath NamburiSrinath changed the title AttributeError: 'NoneType' object has no attribute 'to' AttributeError: 'NoneType' object has no attribute 'to' when using wanda pruning Aug 29, 2024

NamburiSrinath commented Aug 30, 2024

Update:

The same issue arises when pruning with sparsegpt as well.

python main.py --model 'meta-llama/Llama-2-7b-chat-hf' --prune_method sparsegpt --sparsity_ratio 0.1 --sparsity_type unstructured --save out/llama_7b/unstructured/sparsegpt/0.1/ --save_model out/llama_7b/unstructured/sparsegpt/0.1/

Note: A similar issue is observed even with llama-2-7b-hf, i.e., the base model!

The only change that was made was related to #36.

Tagging co-author @liuzhuang13 to see if something is broken.

@Logan-007L

Hello, have you solved the problem?


NamburiSrinath commented Sep 1, 2024

@Logan-007L - Please refer to #51.

@Eric-mingjie, @liuzhuang13: Closing this, as #51 resolved the issue. I'm not sure it's technically correct, though! If it is, it would be helpful to make the corresponding code changes, since Llama-2 is a popular model (the paper even reports results on it).

@Logan-007L

Thanks, I'll give it a try.
