Projects with this topic
🔧 🔗 https://github.com/vllm-project/vllm A high-throughput and memory-efficient inference and serving engine for LLMs
🔧 🔗 https://github.com/vllm-project/vllm-ascend Community-maintained hardware plugin for vLLM on Ascend
Build applications that make decisions (chatbots, agents, simulations, etc.). Monitor, persist, and execute on your own infrastructure. burr.dagworks.io
🔧 🔗 https://github.com/pytorch/serve Serve, optimize and scale PyTorch models in production
🤖 Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: ~ source code + 11 hands-on lessons https://github.com/decodingml/llm-twin-course
https://github.com/Upsonic/On-Prem Self-Driven Autonomous Python Libraries
https://github.com/Upsonic/Server Self-Driven Autonomous Python Libraries