Releases: huggingface/optimum-quanto

release: 0.0.11

19 Jan 15:21

New features:

  • support int2 and int4 weights.
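As an illustration of what 4-bit weights involve, here is a minimal pure-Python sketch of symmetric int4 quantization with two values packed per byte. It is not the library's implementation: the function names (`quantize_int4`, `pack_int4`, `unpack_int4`, `dequantize`), the symmetric scheme, and the low-nibble-first packing order are all assumptions made for this example.

```python
def quantize_int4(values):
    """Symmetric 4-bit quantization: map floats to integers in [-8, 7]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole list; fall back to 1.0 for an all-zero input.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return q, scale


def pack_int4(q):
    """Pack pairs of signed 4-bit integers into bytes (low nibble first)."""
    if len(q) % 2:
        q = q + [0]  # pad to an even count
    return bytes(
        (q[i] & 0xF) | ((q[i + 1] & 0xF) << 4) for i in range(0, len(q), 2)
    )


def unpack_int4(packed, count):
    """Recover `count` signed 4-bit integers from packed bytes."""
    out = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1.
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out[:count]


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]


weights = [3.5, -1.5, 0.5, -3.5]
q, scale = quantize_int4(weights)
packed = pack_int4(q)
restored = dequantize(unpack_int4(packed, len(q)), scale)
```

Four weights occupy two bytes after packing, and each restored value differs from the original by at most half a quantization step.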

New contributors:

@younesbelkada
@a-r-r-o-w

release: 0.0.10

19 Jan 15:13

New features:

  • calibration streamline option to remove spurious quantize/dequantize,
  • calibration debug mode.

release: 0.0.9

15 Dec 14:52

New features:

  • quantize weights and activations parameters,
  • float8 activations.

release: 0.0.8

08 Dec 15:31

New features:

  • weight-only quantization,
  • integer matmul acceleration on CUDA.

Bug fixes:

  • actually use float16 weights,
  • avoid float16 overflows,
  • correct device placement,
  • robust serialization.

release: 0.0.7

01 Dec 15:23

New features:

  • per-axis quantization.
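To show why per-axis scales matter, the sketch below compares a single per-tensor scale with one scale per row on a small matrix. It is a generic pure-Python illustration, not the library's code: the helper names, the int8 range (`qmax=127`), and the choice of rows as the quantization axis are assumptions for this example.

```python
def _quantize_row(row, scale, qmax):
    """Round each value to the nearest integer step and clamp to the range."""
    return [max(-qmax, min(qmax, round(v / scale))) for v in row]


def quantize_per_tensor(matrix, qmax=127):
    """Symmetric quantization with one scale shared by the whole matrix."""
    max_abs = max(abs(v) for row in matrix for v in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [_quantize_row(row, scale, qmax) for row in matrix], scale


def quantize_per_axis(matrix, qmax=127):
    """Symmetric quantization with one scale per row (axis 0)."""
    scales = []
    for row in matrix:
        max_abs = max(abs(v) for v in row)
        scales.append(max_abs / qmax if max_abs > 0 else 1.0)
    q = [_quantize_row(row, s, qmax) for row, s in zip(matrix, scales)]
    return q, scales


# The second row is several orders of magnitude smaller than the first.
matrix = [[100.0, -50.0], [0.01, -0.02]]
q_tensor, tensor_scale = quantize_per_tensor(matrix)
q_axis, axis_scales = quantize_per_axis(matrix)
```

With a single scale, the small second row collapses to zeros because the scale is dictated by the largest value in the matrix. With one scale per row, that row uses the full integer range.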

release: 0.0.6

27 Oct 14:48

New features:

  • support OPT models,
  • support GPT-NeoX models,
  • support CodeGen models.

release: 0.0.5

19 Oct 07:40

New features:

  • support MPS devices,
  • support Transformer models.

release: 0.0.4

09 Oct 09:06

Fix release to add correct package metadata.

release: 0.0.1

02 Oct 09:54
Pre-release

Initial import of the sources.