Issues: fla-org/native-sparse-attention
#9 [Bug] parallel_nsa_with_compression is slower than flash-attn (bug), opened Feb 28, 2025 by suhmily10
#8 RuntimeError: Triton Error [CUDA]: an illegal memory access was encountered, opened Feb 27, 2025 by Lau-Jonathan
#6 [Bug] Illegal Memory Access in Training Long-Context Models with NSA (bug), opened Feb 25, 2025 by imhuim982
#4 [Feature Request] How can we conduct tests by integrating software like sglang or vllm? (enhancement), opened Feb 24, 2025 by BigCousin-z
#2 [RFC] Could you please provide the latest sample code for using the parallel_nsa function? (enhancement), opened Feb 24, 2025 by Kyfafyd