vpux-compiler error occurred when using qwen2.5-7B with a large context or prompt #12736

Open
dockerg opened this issue Jan 22, 2025 · 5 comments

dockerg commented Jan 22, 2025

The first error occurs when max-context-len >= 10000:
Image
The second error occurs when max-prompt-len >= 6000:
Image

Device: NUC 14 Pro
CPU: Core Ultra 5 125H
Memory: 96 GB total
NPU driver: 32.0.100.3104
Others:
Image

@plusbang
Contributor

Hi @dockerg, max-context-len >= 10000 is not supported yet. You could try 2K instead.

@dockerg
Author

dockerg commented Jan 23, 2025

What about the "DDR" memory allocation issue?

@plusbang
Contributor

> What about the "DDR" memory allocation issue?

Does this issue also occur with a smaller value of max-prompt-len (such as 1024)? Very long prompts are not supported yet either.

@dockerg
Author

dockerg commented Jan 23, 2025

> > What about the "DDR" memory allocation issue?
>
> Does this issue also occur with a smaller value of max-prompt-len (such as 1024)? Very long prompts are not supported yet either.

It occurred when max-prompt-len > 4000.
What limits the prompt length: the hardware, the NPU driver, or missing support in IPEX-LLM?

@plusbang
Contributor

> It occurred when max-prompt-len > 4000. What limits the prompt length: the hardware, the NPU driver, or missing support in IPEX-LLM?

The limit comes from the hardware and the NPU driver.
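
For anyone hitting the same errors, the limits discussed in this thread can be checked before loading the model. The following is only a minimal sketch: `check_npu_lengths` and the constant names are hypothetical helpers, not part of ipex-llm, and the thresholds are simply the values reported above (failures at max-context-len >= 10000 and max-prompt-len > 4000), not an official specification. Reading the suggested "2K" as 2048 is also an assumption.

```python
# Hypothetical pre-flight check. The thresholds are the values reported in
# this thread for one machine and one NPU driver version; they are not
# official limits and may differ on other hardware or drivers.
FAILING_CONTEXT_LEN = 10000   # first error reported at max-context-len >= 10000
FAILING_PROMPT_LEN = 4000     # DDR allocation error reported at max-prompt-len > 4000
SUGGESTED_CONTEXT_LEN = 2048  # assumption: the suggested "2K" read as 2048
SUGGESTED_PROMPT_LEN = 1024   # value suggested for comparison in the thread


def check_npu_lengths(max_context_len, max_prompt_len):
    """Return warnings for settings that were reported to fail in this thread."""
    warnings = []
    if max_context_len >= FAILING_CONTEXT_LEN:
        warnings.append(
            f"max-context-len={max_context_len} was reported to fail; "
            f"consider a value such as {SUGGESTED_CONTEXT_LEN}"
        )
    if max_prompt_len > FAILING_PROMPT_LEN:
        warnings.append(
            f"max-prompt-len={max_prompt_len} was reported to fail; "
            f"consider a value such as {SUGGESTED_PROMPT_LEN}"
        )
    return warnings


if __name__ == "__main__":
    # Example: the settings from the original report.
    for message in check_npu_lengths(10000, 6000):
        print("warning:", message)
```

An empty list means only that the settings are below the values that failed here. It does not guarantee that the model compiles on a given NPU.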
