inference

Projects with this topic

mirrored_repos / MachineLearning / vllm-project / Vllm

🔧🔗https://github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

vllm amd cuda inference pytorch transformer Llama gpt rocm model-serving tpu hpu mlops xpu inferentia Large Langua... llm-inference llmops

Updated Sep 04, 2025

mirrored_repos / MachineLearning / sgl-project / Sglang

🔧🔗https://github.com/sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.

cuda inference pytorch transformer moe Llama vlm Large Langua... llm-serving llava deepseek llama3

Updated Sep 04, 2025

mirrored_repos / MachineLearning / flashinfer-ai / Debug Print

🔧🔗https://github.com/flashinfer-ai/debug-print

Debug print operator for cudagraph debugging

cuda cuda-kernels debug inference

Updated Jan 24, 2025
