
Problem with def apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1): (index out of bounds error) #82


ukaprch commented Feb 14, 2025

I'm running the model on an image. Note: I am using a quantized (qint8) version of the model.

Call stack at the point of failure:

    deepseek_vl2.models.modeling_deepseek.apply_rotary_pos_emb : 360    Python
    deepseek_vl2.models.modeling_deepseek.forward : 886 (Current frame) Python
    torch.nn.modules.module._call_impl : 1750                           Python
    torch.nn.modules.module._wrapped_call_impl : 1739                   Python
    deepseek_vl2.models.modeling_deepseek.forward : 1298                Python
    torch.nn.modules.module._call_impl : 1750                           Python
    torch.nn.modules.module._wrapped_call_impl : 1739                   Python
    deepseek_vl2.models.modeling_deepseek.forward : 1585                Python
    torch.nn.modules.module._call_impl : 1750                           Python
When `apply_rotary_pos_emb` is called, the `cos` and `sin` tensors are empty, which produces the index-out-of-bounds error:

[screenshot of the index-out-of-bounds traceback]

So DeepSeek changed the original Llama code. These buffers apparently are not initialized in this model. How do we get around this error?
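
For context, the Llama-style helper looks roughly like this (a sketch of the upstream pattern, not the exact DeepSeek-VL2 code). If the rotary module's cos/sin caches were never built, e.g. because the buffers were dropped during quantization or serialization, `cos` has zero rows and the `cos[position_ids]` gather is what raises the index-out-of-bounds error:

    import torch

    def rotate_half(x):
        # Rotate half the hidden dims: (x1, x2) -> (-x2, x1).
        x1 = x[..., : x.shape[-1] // 2]
        x2 = x[..., x.shape[-1] // 2 :]
        return torch.cat((-x2, x1), dim=-1)

    def apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1):
        # With an uninitialized cache, cos/sin have shape (0, head_dim),
        # so this indexing fails with "index out of bounds".
        cos = cos[position_ids].unsqueeze(unsqueeze_dim)
        sin = sin[position_ids].unsqueeze(unsqueeze_dim)
        q_embed = (q * cos) + (rotate_half(q) * sin)
        k_embed = (k * cos) + (rotate_half(k) * sin)
        return q_embed, k_embed

One possible workaround is forcing the rotary module to rebuild its cache before the forward pass; the `_set_cos_sin_cache(seq_len, device, dtype)` method name follows the Llama/DeepSeek pattern, so verify it exists in your checkout.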

After setting chunk_size=512 (the prefilling size), the error above went away, but it was replaced by a new one: an attempt to update the version counter of a torch tensor. First problem: why would the model be updating a tensor's version counter at all during inference? Second problem: Quanto does not handle this internally, since the tensor in question is quantized.
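
For anyone else hitting the first error: the chunked-prefilling path is invoked roughly like this, following the repo's inference example (a sketch; argument names and the exact `incremental_prefilling` signature may differ in your checkout, and `prepare_inputs` here stands for the processor output):

    # Sketch of chunked prefilling, based on the DeepSeek-VL2 inference
    # example; verify the method signature against your version of the repo.
    with torch.no_grad():
        inputs_embeds, past_key_values = model.incremental_prefilling(
            input_ids=prepare_inputs.input_ids,
            images=prepare_inputs.images,
            images_seq_mask=prepare_inputs.images_seq_mask,
            images_spatial_crop=prepare_inputs.images_spatial_crop,
            attention_mask=prepare_inputs.attention_mask,
            chunk_size=512,  # prefilling size
        )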

Here's the relevant code in def forward (modeling_deepseek.py, line 898) where I had to make a change.

    # Inserted: clone the weight to bypass the version-counter error on the
    # quantized tensor.
    qclone = self.kv_b_proj.weight.detach().clone()
    qclone = qclone.to(dtype=self.kv_b_proj.weight.dtype, device=self.kv_b_proj.weight.device)
    # Original line, which raises the version error on the quantized tensor:
    # kv_b_proj = self.kv_b_proj.weight.view(self.num_heads, -1, self.kv_lora_rank)
    kv_b_proj = qclone.view(self.num_heads, -1, self.kv_lora_rank)  # use the cloned tensor
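
One caveat: this clones the projection weight on every forward call. Since the weight should be frozen during inference, a cached variant could avoid the repeated copies (a sketch; the `_kv_b_proj_cached` attribute is hypothetical):

    # Hypothetical variant: build the cloned view once and reuse it.
    # Assumes the weight does not change during inference; trades one
    # extra weight-sized buffer for skipping a copy per forward call.
    if not hasattr(self, "_kv_b_proj_cached"):
        self._kv_b_proj_cached = (
            self.kv_b_proj.weight.detach().clone()
            .view(self.num_heads, -1, self.kv_lora_rank)
        )
    kv_b_proj = self._kv_b_proj_cached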
MirkoDeVita98 commented

I have the same error. Were you able to solve it?
