This project is mirrored from https://github.com/huggingface/optimum-quanto. Pull mirroring updated Sep 19, 2024.
refactor · 35d04332 · refactor(group): align terminology · Apr 16, 2024
quantize-stablediffusion · 1c004145 · fix(example): fix serval error · Apr 17, 2024
explicit_qlinear_gradient · f7e605d0 · refactor(quanto): avoid qlinear composite gradients · Apr 22, 2024
awq_kernel_int8_zeros · 814991d3 · wip · Apr 26, 2024
announce_migration · 89076e00 · Update README.md · Apr 29, 2024
yet_another_tensor_refactoring · 685b4154 · test(compile): still not working with pt 2.3.0 · May 03, 2024
prepare_for_gemm_kernels · 5be14203 · feat(quantize): do not use a group_size lower than 128 · May 16, 2024
awq_kernels_refactor · e3e11d7d · feat(qbits): use AWQ CUDA gemm whenever possible · May 20, 2024
awq_kernels_v1 · 725b889a · wip · May 21, 2024
release-v0.2.0 · 96ab5d3e · release: version 0.2.0 · May 24, 2024
namespace_package · 1a95dc20 · refactor: make optimum-quanto a namespace package · May 30, 2024
release-v0.2.1 · 1f35ceb2 · release: 0.2.1 · May 31, 2024
detect_cuda_version · 327b9884 · fix(qbits): only use cuda kernels if arch >= sm80 · Jun 03, 2024
optimum-subpackage · 0360de14 · feat: add subpackage for optimum integration · Jun 06, 2024
add_owlv2_detection_example · 6d8166e2 · feat(examples): add owl detection · Jun 11, 2024
fix_cuda_compilation · e5be8cad · wip · Jun 12, 2024
fix_cuda_regression · e72b1ebe · wip · Jun 12, 2024