Projects with this topic
🔧 🔗 https://github.com/codelion/optillm Optimizing inference proxy for LLMs
🔧 🔗 https://github.com/pytorch/serve Serve, optimize and scale PyTorch models in production