docs: using fms-acceleration flags as json
Signed-off-by: Anh Uong <[email protected]>
anhuong committed Sep 25, 2024
1 parent fab2943 commit ad943f5
Showing 1 changed file with 11 additions and 0 deletions.
README.md
@@ -634,6 +634,7 @@ Notes:
- pass `--fast_kernels True True True` for full finetuning/LoRA
- pass `--fast_kernels True True True --auto_gptq triton_v2 --fused_lora auto_gptq True` for GPTQ-LoRA
- pass `--fast_kernels True True True --bitsandbytes nf4 --fused_lora bitsandbytes True` for QLoRA (see the example command sketched after these notes)
- Note the list of supported models [here](https://github.com/foundation-model-stack/fms-acceleration/blob/main/plugins/fused-ops-and-kernels/README.md#supported-models).
* Notes on Padding Free
- works for both *single* and *multi-gpu*.
- works on both *pretokenized* and *untokenized* datasets.
@@ -642,6 +643,16 @@ Notes:
- works only for *multi-gpu*.
- currently only includes the version of *multipack* optimized for linear attention implementations like *flash-attn*.
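
For reference, a sketch of how a combination of these flags might look on the command line. The entry point `tuning/sft_trainer.py`, the model placeholder, and the particular flag combination are assumptions made for illustration, not taken from this change:

```sh
# A minimal sketch (assumptions: script path, model placeholder, and
# flag combination). It mirrors the JSON example given below.
python tuning/sft_trainer.py \
  --model_name_or_path <base-model> \
  --fast_kernels True True True \
  --padding_free huggingface \
  --multipack 16 \
  --auto_gptq triton_v2
```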

Note: When passing the above flags via a JSON config, each flag expects its value as a mixed-type list, so every value must be written as a list (even a single value). For example:
```json
{
"fast_kernels": [true, true, true],
"padding_free": ["huggingface"],
"multipack": [16],
"auto_gptq": ["triton_v2"]
}
```
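
One way such a JSON config might be supplied is via an environment variable pointing at the file; the `SFT_TRAINER_CONFIG_JSON_PATH` variable used below is an assumption about this repository's config mechanism, shown only as a sketch:

```sh
# A hedged sketch: save the JSON above as config.json and point the
# trainer at it. SFT_TRAINER_CONFIG_JSON_PATH is an assumption, not
# confirmed by this change.
export SFT_TRAINER_CONFIG_JSON_PATH=/path/to/config.json
python tuning/sft_trainer.py
```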

Set `TRANSFORMERS_VERBOSITY=info` to see the Hugging Face trainer printouts and verify that `AccelerationFramework` is activated!
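
For example, in a shell:

```sh
# Enable info-level logs from Hugging Face transformers so the
# AccelerationFramework activation message appears in the trainer output.
export TRANSFORMERS_VERBOSITY=info
```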

