This project is mirrored from https://github.com/huggingface/optimum-quanto.
v0.2.0 (96ab5d3e)

New:
- `requantize` helper by @calmitchell617 (see the sketch below),
- StableDiffusion example by @thliang01,
- improved linear backward path,
- AWQ int4 kernels.
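A minimal sketch of the `requantize` helper for reloading a previously serialized quantized model; the file names, the toy model and the exact call signature are assumptions, not taken from the release notes:

```python
import json

import torch
from torch import nn
from optimum.quanto import requantize

# Stand-in architecture; in practice, rebuild the same model class that was quantized.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8))

# Load the serialized quantized weights and the quantization map (file names are placeholders).
state_dict = torch.load("model_state_dict.pt")
with open("quantization_map.json") as f:
    qmap = json.load(f)

# Recreate the quantized modules and load their weights in one step.
requantize(model, state_dict, qmap, device=torch.device("cpu"))
```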
v0.0.13 (addd7122)

New:
- new `QConv2d` quantized module,
- official support for `float8` weights (see the sketch below).

Bug fixes:
- fix `QBitsTensor.to()` that was not moving the inner tensors,
- prevent shallow `QTensor` copies when loading weights.
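A minimal sketch of `float8` weight quantization, under which `quantize()` swaps `nn.Conv2d` layers for `QConv2d`; it assumes the current `optimum.quanto` import path, a PyTorch build with float8 dtypes, and a toy model:

```python
import torch
from torch import nn
from optimum.quanto import quantize, freeze, qfloat8

# Toy convolutional model; quantize() is expected to replace nn.Conv2d with QConv2d.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1),
)

quantize(model, weights=qfloat8)  # float8 weights
freeze(model)                     # materialize the quantized weights

print(type(model[0]).__name__)                 # expected: QConv2d
print(model(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```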
0.0.10 (5ab7e6a9)

New features:
- calibration streamline option to remove spurious quantize/dequantize (see the sketch below),
- calibration debug mode.
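A minimal sketch of the calibration options; the keyword names `streamline` and `debug` are assumptions inferred from the release note, and the toy model is made up:

```python
import torch
from torch import nn
from optimum.quanto import quantize, freeze, Calibration, qint8

model = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 8))
quantize(model, weights=qint8, activations=qint8)

# Run a few batches under calibration to record activation ranges.
# streamline=True removes spurious quantize/dequantize pairs; debug=True prints per-module info.
with Calibration(streamline=True, debug=True):
    for _ in range(8):
        model(torch.randn(4, 32))

freeze(model)
```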
0.0.9 (8acbefc1)

New features:
- `weights` and `activations` parameters for `quantize` (see the sketch below),
- `float8` activations.
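A minimal sketch of the `weights` and `activations` parameters, combining `qint8` weights with `qfloat8` activations; it assumes the current `optimum.quanto` import path, a PyTorch build with float8 support, a toy model, and a calibration pass like the one shown above:

```python
import torch
from torch import nn
from optimum.quanto import quantize, freeze, Calibration, qint8, qfloat8

model = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 8))

# Pick the weight and activation dtypes independently.
quantize(model, weights=qint8, activations=qfloat8)

# Activation quantization needs a calibration pass before freezing.
with Calibration():
    model(torch.randn(16, 32))
freeze(model)
```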
0.0.8 (63041a49)

New features:
- weight-only quantization (see the sketch below),
- integer matmul acceleration on CUDA.

Bug fixes:
- actually use float16 weights,
- avoid float16 overflows,
- correct device placement,
- robust serialization.
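A minimal sketch of weight-only quantization with float16 activations; the toy model is an assumption, and the example simply falls back to CPU/float32 when no GPU is available:

```python
import torch
from torch import nn
from optimum.quanto import quantize, freeze, qint8

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8)).to(device, dtype)

# Weight-only: only `weights` is passed, activations stay in float16 (or float32 on CPU).
quantize(model, weights=qint8)
freeze(model)

# On CUDA, the int8-weight matmuls can use the accelerated integer kernels.
print(model(torch.randn(2, 64, device=device, dtype=dtype)).shape)
```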
0.0.6 (fe330f0d)

New features:
- support `opt` models (see the sketch below),
- support `gpt-neox` models,
- support `codegen` models.
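A minimal sketch of quantizing one of the newly supported decoder models (`facebook/opt-125m` is just an example checkpoint); it uses the current `optimum.quanto` import path rather than the standalone `quanto` package that shipped at the time:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.quanto import quantize, freeze, qint8

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Quantize the decoder weights to int8, then freeze them.
quantize(model, weights=qint8)
freeze(model)

inputs = tokenizer("Quantization makes models", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```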