Projects with this topic
- 🔗 https://github.com/microsoft/BitBLAS: BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
- 🔗 https://github.com/IST-DASLab/marlin: Marlin is an FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
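To illustrate what "FP16xINT4" means in these projects, here is a minimal NumPy sketch of weight-only INT4 quantization: weights are stored as 4-bit integers with a per-column FP16 scale and dequantized back to FP16 before a standard matmul. This is an illustration of the general technique only, not the actual fused GPU kernels BitBLAS or Marlin implement; all function names here are hypothetical.

```python
import numpy as np

def quantize_int4(w_fp16):
    # Symmetric per-output-channel quantization to the INT4 range [-8, 7].
    scale = np.abs(w_fp16).max(axis=0) / 7.0
    q = np.clip(np.round(w_fp16 / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def matmul_fp16_int4(x_fp16, q, scale):
    # Dequantize INT4 weights to FP16, then multiply.
    # Real kernels fuse dequantization into the matmul to save bandwidth.
    w = q.astype(np.float16) * scale
    return x_fp16 @ w

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32)).astype(np.float16)
x = rng.standard_normal((4, 64)).astype(np.float16)

q, s = quantize_int4(w)
y_ref = x @ w                      # full-precision reference
y_q = matmul_fp16_int4(x, q, s)    # quantized path, close to reference
```

The speedups these kernels report come from memory bandwidth: INT4 weights are 4x smaller than FP16, and LLM inference at small batch sizes is dominated by reading the weight matrix, which is why the gains taper off at larger batch sizes.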