Projects with this topic
https://github.com/vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
https://github.com/InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Docs: https://lmdeploy.readthedocs.io/en/latest/
https://github.com/mistralai/mistral-inference
Official inference library for Mistral models
https://mistral.ai/
https://github.com/noco-ai/spellbook-docker/wiki
AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS, and many other AI models