Projects with this topic
🔧 🔗 https://github.com/containers/ramalama RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all
Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community: https://discord.com/invite/TgHXuSJEk6