
Support bfloat16 execution #1121

Closed
guillaumekln opened this issue Mar 9, 2023 · 2 comments · Fixed by #1316
Labels
enhancement New feature or request

Comments

@guillaumekln
Collaborator

Models that are trained with bfloat16 can have numerical issues when run with float16. See #1074 for example.

We should consider supporting bfloat16 execution which is supported on recent Intel CPUs and NVIDIA GPUs.
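The tradeoff can be illustrated with a small sketch (pure-Python bit manipulation, not any CTranslate2 API): bfloat16 keeps float32's 8 exponent bits at the cost of mantissa precision, while float16 has only 5 exponent bits, so large activations from a bfloat16-trained model can overflow to infinity when run in float16.

```python
import math
import struct

def to_float16(x: float) -> float:
    """Round-trip a value through IEEE float16 (5 exponent bits, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # Values beyond float16's maximum (~65504) overflow to infinity.
        return math.copysign(math.inf, x)

def to_bfloat16(x: float) -> float:
    """Round-trip a value through bfloat16 (8 exponent bits, 7 mantissa bits)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest even, then drop the low 16 bits of the float32 encoding.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

x = 1e5  # a large activation, comfortably inside float32/bfloat16 range
print(to_float16(x))   # overflows to inf in float16
print(to_bfloat16(x))  # coarse but finite in bfloat16
```

This is why a model whose weights and activation statistics were shaped by bfloat16 training can produce NaN/inf garbage under float16 while remaining usable under bfloat16.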

@wsxiaoys
Contributor

wsxiaoys commented Jun 13, 2023

To add another potentially related data point: I converted SantaCoder (a float16-trained model) to CTranslate2 for inference. With beam_size = 1:

  1. int8 / int8_float16 generate reasonable outputs.
  2. float16 generates unusable outputs.

The converted model is located here: https://huggingface.co/TabbyML/SantaCoder-1B/tree/main/ctranslate2

The reported issue: TabbyML/tabby#236

@guillaumekln
Collaborator Author

The linked PR adds support for bfloat16.

Just like the other types, you can select it during conversion with --quantization bfloat16 or when loading the model with compute_type="bfloat16". The quantization mode int8_bfloat16 is also supported.
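As a sketch of the loading path described above (the model path is a placeholder for any converted model directory, and this needs a suitable GPU, so it is illustrative rather than copy-paste runnable):

```python
import ctranslate2

# Load a converted generation model with bfloat16 compute.
# "model-ct2" is a placeholder for your --output_dir from the converter.
generator = ctranslate2.Generator(
    "model-ct2",
    device="cuda",
    compute_type="bfloat16",  # or "int8_bfloat16"
)

# Inputs are token lists produced by the model's own tokenizer.
results = generator.generate_batch([["<placeholder>", "tokens"]], beam_size=1)
```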

It would be helpful if anyone watching this issue could test the implementation and give feedback. To install the development build:

  1. Go to this build page
  2. Download the artifact "python-wheels"
  3. Extract the archive
  4. Install the wheel matching your system and Python version with pip install --force-reinstall <wheel>

Note that a GPU with Compute Capability 8 or greater is required.
