Projects with this topic
Sort by:
-
🔧 🔗 https://github.com/vllm-project/vllmA high-throughput and memory-efficient inference and serving engine for LLMs
Updated -
https://github.com/InternLM/lmdeploy LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
🔗 lmdeploy.readthedocs.io/en/latest/Updated