Projects with this topic
🔧 🔗 https://github.com/axboe/liburing Library providing helpers for the Linux kernel io_uring support.
🔧 🔗 https://github.com/IST-DASLab/marlin FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.