Branches · mirrored_repos / MachineLearning / huggingface / Optimum-Quanto · GitLab

This project is mirrored from https://github.com/huggingface/optimum-quanto. Pull mirroring updated Sep 19, 2024.

main (default)
2075da7a · feat: e4m3fnuz added · Sep 17, 2024

add_e4m3fnuz
f37c58b6 · feat: e4m3fnuz added · Sep 17, 2024

disable_int_mm_cpu
0a832208 · fix(library): disable int_mm for CPU · Sep 16, 2024

add_marlin_int4_kernel
6e888fda · fix(marlin): avoid kernel crash on H100 · Sep 13, 2024

marlin_neural_magic_step_by_step
06629972 · fix(marlin_: avoid kernel crash on H100 · Sep 05, 2024

marlin_neural_magic
6751ecad · style: reapply NeuralMagic formatting · Sep 03, 2024

fix_bugs
1fa6771b · fix(qmodule): register default scales with correct dtype · Sep 02, 2024

stop_enforcing_conventional_commits
960f7341 · chore: one commit more than allowed · Sep 02, 2024

marlin-fp8-kernel-only
160f4c05 · feat(library): add marlin float8/float8 mm kernel · Aug 27, 2024

feat-hub-support
b2d3a390 · chore: apply make style. · Aug 15, 2024

load_strict_false
65ace79d · fix: support strict=False for quantized weights · Aug 14, 2024

marlin-fp8
07a56520 · fix: remove extension load on unsupported system · Jul 29, 2024

refactor_qtensor_dispatch
858950e4 · refactor(qtensor): split torch.nn.functional.linear dispatch · Jul 26, 2024

release-v0.2.4
832f7f5c · release: 0.2.4 · Jul 26, 2024

fix_diffusers_model_import
05312bf8 · chore: bump dev version · Jul 26, 2024

release-v0.2.3
4b71e426 · release: 0.2.3 · Jul 25, 2024

cleanup_readme
2dfef8d5 · docs: cleanup README · Jul 25, 2024

fix_serialization_issues
b3de2a28 · fix(qbits): correct stride when packing · Jul 25, 2024

feat-diffusers-models
7986c7cb · fix: force pos_embed initialization · Jul 24, 2024

reintroduce_qtensor_quantize
cea4ff6d · ci: temporarily use pt test channel · Jul 24, 2024

