Embark on a journey of unparalleled computational prowess with FJFormer - an arsenal of custom JAX/Flax functions and utilities that elevate your AI endeavors to new heights!
FJFormer is a collection of functions and utilities that help with various tasks when using Flax and JAX. It includes checkpoint savers, partitioning tools, and other helpful functions. The goal of FJFormer is to make your life easier when working with Flax and JAX. Whether you are training a new model, fine-tuning an existing one, or just exploring the capabilities of these powerful frameworks, FJFormer offers:
- FlashAttention on TPU/GPU
- 🧬 Bit computations for 8-, 6-, and 4-bit Flax models 🤏
- Smart Dataset Loading
- Built-in helper functions and loss functions
- GPU Pallas (Triton-like) implementations of Softmax, FlashAttention, RMSNorm, and LayerNorm
- Distributed and sharded model loaders and checkpoint savers
- Monitoring utils for TPU/GPU/CPU memory footprint (see the memory sketch after this list)
- Easy-to-use custom optimizers with schedulers
- Partitioning utils (see the sharding sketch after this list)
- LoRA with XRapture 🤠 (see the LoRA sketch after this list)
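
FJFormer's partitioning utilities sit on top of JAX's sharding API. The snippet below is a minimal plain-JAX sketch of the pattern they help manage (building a device mesh and placing an array with a `PartitionSpec`); it uses only standard JAX names and none of FJFormer's own helpers, whose exact API is best taken from the documentation linked below.

```python
import jax
import jax.numpy as jnp
from jax.experimental import mesh_utils
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Arrange all visible devices (CPU/GPU/TPU) into a 1-D mesh named "dp".
devices = mesh_utils.create_device_mesh((jax.device_count(),))
mesh = Mesh(devices, axis_names=("dp",))

# Shard the leading axis of an array across the "dp" mesh axis;
# the second axis stays replicated (None).
sharding = NamedSharding(mesh, PartitionSpec("dp", None))

x = jnp.ones((jax.device_count() * 4, 128))  # divisible along the sharded axis
x = jax.device_put(x, sharding)
jax.debug.visualize_array_sharding(x)  # show how the rows are split across devices
```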
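
Accelerator memory footprint can also be read straight from JAX. The sketch below uses only the standard `Device.memory_stats()` call (exposed by GPU/TPU backends; it may be unavailable on CPU), not FJFormer's own monitoring utils, whose names are not shown here.

```python
import jax

def report_memory():
    """Print per-device memory usage where the backend exposes it."""
    for device in jax.devices():
        try:
            stats = device.memory_stats()  # available on GPU/TPU backends
        except Exception:
            stats = None
        if not stats:
            print(f"{device}: no memory stats available")
            continue
        in_use = stats.get("bytes_in_use", 0) / 2**20
        limit = stats.get("bytes_limit", 0) / 2**20
        print(f"{device}: {in_use:.1f} MiB in use / {limit:.1f} MiB limit")

report_memory()
```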
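
LoRA itself keeps a base weight frozen and learns a low-rank update `A @ B` on top of it. The sketch below shows that core computation in plain JAX; it does not use XRapture's actual API (the function and variable names here are generic illustrations), so refer to the documentation for the real interface.

```python
import jax
import jax.numpy as jnp

def lora_apply(x, w_frozen, lora_a, lora_b, scaling=1.0):
    """y = x @ (W + scaling * A @ B): frozen base weight plus low-rank update."""
    return x @ w_frozen + scaling * (x @ lora_a) @ lora_b

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
d_in, d_out, rank = 512, 512, 8
w_frozen = jax.random.normal(k1, (d_in, d_out)) * 0.02  # frozen base weight
lora_a = jax.random.normal(k2, (d_in, rank)) * 0.02     # trainable
lora_b = jnp.zeros((rank, d_out))                       # trainable, zero-init
x = jax.random.normal(k3, (4, d_in))
y = lora_apply(x, w_frozen, lora_a, lora_b)
print(y.shape)  # (4, 512)
```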
Many of these features are fully documented, so FJFormer has something to offer beyond being just a computation backend for EasyDel.
Check out the documentation here.
FJFormer is an open-source project, and contributions are always welcome! If you have a feature request, bug report, or just want to help out with development, please check out our GitHub repository and feel free to submit a pull request or open an issue.
Thank you for using FJFormer, and happy training!