Projects with this topic
🔧 🔗 https://github.com/vllm-project/llm-compressor Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM.
🔧 🔗 https://github.com/microsoft/BitBLAS BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.