Releases · vllm-project/flash-attention
v2.6.2
v2.6.1
What's Changed
- Adds Python 3.12 to publish.yml by @mgoin in #10
- Sync with FA v2.6.0 to support soft capping by @WoosukKwon in #13 (a usage sketch follows at the end of these notes)
- Support non-default CUDA version by @WoosukKwon in #14
- Bump up to v2.6.1 by @WoosukKwon in #15
New Contributors
- @mgoin made their first contribution in #10
Full Changelog: v2.6.0...v2.6.1
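The soft capping sync in #13 tracks upstream FlashAttention 2.6, which applies a tanh cap to attention scores before the softmax (as used by models such as Gemma 2). Below is a minimal usage sketch, assuming the wheel exposes the upstream-style `flash_attn_func` signature with a `softcap` keyword and that the package imports as `vllm_flash_attn`; both are assumptions, not confirmed by these notes.

```python
# Hedged sketch: exercises soft capping through the assumed upstream-style API.
import torch
from vllm_flash_attn import flash_attn_func  # assumed import path

batch, seqlen, nheads, headdim = 2, 128, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# softcap > 0 rescales scores as softcap * tanh(scores / softcap) before softmax;
# softcap=0.0 (the upstream default) disables the cap.
out = flash_attn_func(q, k, v, causal=True, softcap=30.0)
print(out.shape)  # (batch, seqlen, nheads, headdim)
```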
v2.6.0
What's Changed
- Upgrade to torch 2.3.1 by @WoosukKwon in #5
- Upgrade to v2.5.9.post1 by @WoosukKwon in #6
- use global function rather than lambda by @youkaichao in #7 (see the illustrative sketch at the end of these notes)
- Update torch to 2.4 by @SageMoore in #8
- Add CUDA 11.8 by @WoosukKwon in #9
New Contributors
- @youkaichao made their first contribution in #7
- @SageMoore made their first contribution in #8
Full Changelog: v2.5.9...v2.6.0
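The change in #7 swaps a lambda for a module-level (global) function. A common reason for this kind of change in Python is that lambdas cannot be pickled, which breaks anything that serializes callables (multiprocessing, caching, and similar). The sketch below only illustrates that general behavior; it does not claim to reproduce the code touched by #7, and the names are illustrative.

```python
# Illustrative only: a module-level function pickles by reference; a lambda does not.
import pickle

square_lambda = lambda x: x * x  # bound to a name, but __qualname__ is "<lambda>"

def square_global(x):
    # Named, module-level function: picklable by module + qualified name.
    return x * x

pickle.dumps(square_global)  # succeeds

try:
    pickle.dumps(square_lambda)
except pickle.PicklingError as exc:
    print(f"lambda is not picklable: {exc!r}")
```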
v2.5.9.post1
What's Changed
- Upgrade to torch 2.3.1 by @WoosukKwon in #5
- Upgrade to v2.5.9.post1 by @WoosukKwon in #6
Full Changelog: v2.5.9...v2.5.9.post1
v2.5.9
What's Changed
Full Changelog: v2.5.8.post3...v2.5.9
v2.5.8.post3
v2.5.8.post2
Full Changelog: v2.5.8.post1...v2.5.8.post2
v2.5.8.post1
What's Changed
- Sync up by @WoosukKwon in #1
New Contributors
- @WoosukKwon made their first contribution in #1
Full Changelog: https://github.com/vllm-project/flash-attention/commits/v2.5.8.post1