Optimum-Quanto 0.0.8

63041a49 · chore: version 0.0.8 · Dec 08, 2023

release: 0.0.8

New features:

- weight-only quantization,
- integer matmul acceleration on CUDA.
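
To make "weight-only quantization" concrete, here is a minimal sketch in plain Python. It is not Quanto's API or its implementation: the function names `quantize_weights` and `linear` are hypothetical, and symmetric per-tensor int8 scaling is assumed purely for illustration. Only the weights are stored as integers; activations stay in floating point.

```python
# Illustrative sketch only: NOT Quanto's API or implementation.
# Assumes symmetric, per-tensor quantization to signed integers.

def quantize_weights(weights, bits=8):
    """Map a list of float weights to signed integers plus one float scale."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for int8
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def linear(x, q_weights, scale):
    """Dot product with float activations; weights are dequantized on the fly."""
    return sum(xi * (qi * scale) for xi, qi in zip(x, q_weights))


weights = [0.4, -1.0, 0.25]
q_weights, scale = quantize_weights(weights)

x = [1.0, 2.0, 3.0]
exact = sum(xi * wi for xi, wi in zip(x, weights))
approx = linear(x, q_weights, scale)
print(q_weights, scale, exact, approx)
```

The rounding error per weight is bounded by half the scale, so the approximate output stays close to the float result while the weights take a quarter of the storage of float32.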

Bug fixes:

- actually use float16 weights,
- avoid float16 overflows,
- correct device placement,
- robust serialization.
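
For context on the float16 overflow fix, the sketch below shows why float16 intermediates are fragile. It uses only the Python standard library and does not reflect how Quanto itself handles the problem; `fits_in_float16` is a hypothetical helper. The largest finite float16 value is 65504, so even a product of two modest numbers can exceed the range.

```python
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_float16(value):
    """Return True if a finite float can be packed as float16 without overflow."""
    try:
        struct.pack("<e", value)  # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


a, b = 300.0, 300.0
print(fits_in_float16(a))      # each operand is representable
print(fits_in_float16(a * b))  # their product (90000.0) is not
```

A common way to avoid this, stated here as a general technique rather than as what this release does, is to accumulate in a wider type such as float32 or to rescale operands before multiplying.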