It is difficult to run LLM inference with f32/f16 weights on a PC, and on edge devices Q4 quantization is practically a requirement. Perhaps Int4 could be supported as a built-in type?
We can't upload int8 or int4 tensors to the GPU yet, but @laggui is working on quantization support in Burn. We will probably create abstractions that make it easier to write quantized kernels.
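To illustrate what Q4 quantization involves, here is a minimal Rust sketch of symmetric 4-bit quantization with two values packed per byte. This is a hypothetical illustration, not Burn's actual implementation or API; the function names and the simple per-tensor scale (real Q4 schemes typically use per-block scales) are assumptions for the example.

```rust
/// Quantize f32 values to signed int4 ([-8, 7]) with a single symmetric
/// scale, packing two 4-bit values per byte (low nibble first).
/// Hypothetical sketch; real Q4 formats use per-block scales and offsets.
fn quantize_q4(values: &[f32]) -> (Vec<u8>, f32) {
    let max_abs = values.iter().fold(0f32, |m, v| m.max(v.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 7.0 };
    let q = |v: f32| ((v / scale).round().clamp(-8.0, 7.0) as i8 & 0x0F) as u8;

    let mut packed = Vec::with_capacity((values.len() + 1) / 2);
    for pair in values.chunks(2) {
        let lo = q(pair[0]);
        let hi = if pair.len() > 1 { q(pair[1]) } else { 0 };
        packed.push(lo | (hi << 4));
    }
    (packed, scale)
}

/// Unpack and dequantize back to f32. `len` is the original element count,
/// needed because an odd count leaves a padding nibble in the last byte.
fn dequantize_q4(packed: &[u8], scale: f32, len: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(len);
    for byte in packed {
        for nibble in [byte & 0x0F, byte >> 4] {
            if out.len() < len {
                // Sign-extend the 4-bit value: shift into the high nibble
                // of an i8, then arithmetic-shift back down.
                let q = ((nibble << 4) as i8) >> 4;
                out.push(q as f32 * scale);
            }
        }
    }
    out
}

fn main() {
    let weights = [0.7f32, -0.35, 0.1, -0.05, 0.0];
    let (packed, scale) = quantize_q4(&weights);
    let restored = dequantize_q4(&packed, scale, weights.len());
    // Each value is recovered to within one quantization step.
    for (w, r) in weights.iter().zip(&restored) {
        assert!((w - r).abs() <= scale);
    }
    // 5 values fit in 3 bytes instead of 20 (f32) or 10 (f16).
    println!("packed bytes: {}, scale: {}", packed.len(), scale);
}
```

The 4x size reduction versus f32 is what makes Q4 attractive on the edge; the open question in this issue is whether such nibble-packed storage should be exposed as a first-class tensor type rather than handled inside custom kernels.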