This project is mirrored from https://github.com/huggingface/optimum-quanto. Pull mirroring updated Sep 19, 2024.
reintroduce_qtensor_quantize · cea4ff6d · ci: temporarily use pt test channel · Jul 24, 2024
update_bench_numbers · 48788f57 · feat(bench): add dtype to evaluate_configurations and gen_barchart · Jul 24, 2024
investigate_ci_compilation_issues · 5dcabd24 · try installing editable version · Jul 23, 2024
cuda_examples · b026fbda · fix(awq): disable AWQ kernel if output features < 128 · Jul 22, 2024
disable_awq_gemm_out_feats_less_128 · a56027e3 · fix(awq): disable AWQ kernel if output features < 128 · Jul 22, 2024
fix_int8_regression · f24c2d91 · fix(qbytes_mm): restore previous order of int8 mm · Jul 20, 2024
refactor_qmodule · 7ed2bf85 · fix(calibration): only disable output quantization when streamlining · Jul 19, 2024
sigma-xl · 3e65dc1b · feat(pixart): add example · Jul 18, 2024
fix_tinygemm_move · 60d19829 · workaround to have all test pass · Jul 17, 2024
modeling_classes · 726d20e3 · feat(modeling): add QuantizedModel · Jul 11, 2024
cleanup_tests · ae05a6e6 · test(tinygemm): skip mps device until pt 2.4 · Jul 11, 2024
_weight_int4pack_mm · 928d323b · feat(qbits): accept float shifts · Jul 03, 2024
cuda_multiple_devices · 00c5d4dd · feat(awq): align extension on max arch · Jul 03, 2024
serialization_in_readme · 134cb67e · fix(examples): unpin transformers · Jul 03, 2024
release-v0.2.2 · 8c8aa97e · chore: release 0.2.2 · Jun 28, 2024
extension_lifecycle · a4b3432f · fix(extensions): rebuild extensions when pytorch is updated · Jun 28, 2024
_weight_int8pack_mm · 16daaef4 · wip · Jun 26, 2024