Projects with this topic
- https://github.com/vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs (usage sketch after this list).
- https://github.com/flashinfer-ai/flashinfer: FlashInfer, a kernel library for LLM serving.
- https://github.com/vllm-project/vllm-ascend: Community-maintained hardware plugin for vLLM on Ascend.
- https://github.com/EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models (usage sketch after this list).
- https://github.com/InternLM/lmdeploy: LMDeploy is a toolkit for compressing, deploying, and serving LLMs. Docs: lmdeploy.readthedocs.io/en/latest/ (usage sketch after this list).
- https://github.com/mlc-ai/mlc-llm: Universal LLM deployment engine with ML compilation.
- https://github.com/princeton-nlp/SWE-agent: SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4 or your LM of choice. It solves 12.47% of bugs in the SWE-bench evaluation set and takes just 1 minute to run.
- https://github.com/sgl-project/sglang: SGLang is a fast serving framework for large language models and vision language models (usage sketch after this list).
- https://github.com/karpathy/llm.c: LLM.c, LLM training in simple, raw C/CUDA with no need for 245MB of PyTorch or 107MB of cPython. Current focus is on pretraining, in particular reproducing the GPT-2 and GPT-3 miniseries, along with a parallel PyTorch reference implementation.
- https://github.com/andrewkchan/yalm: Yet Another Language Model, LLM inference in C++/CUDA with no libraries except for I/O.
- https://github.com/microsoft/BitBLAS: BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
- https://github.com/modelscope/dash-infer: DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
- https://github.com/janhq/nitro.git (now https://github.com/janhq/cortex.git): Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM, ONNX). Powers 👋 Jan.
- https://github.com/janhq/cortex.tensorrt-llm: Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA's TensorRT-LLM for GPU-accelerated inference on NVIDIA GPUs.
- https://github.com/QwenLM/qwen.cpp: C++ implementation of Qwen-LM.
- https://github.com/FoundationVision/Groma: [ECCV 2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization.
- https://github.com/IST-DASLab/marlin: FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
- https://github.com/THUDM/APAR: "APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding".
- https://github.com/LibreTranslate/RemoveDup: Remove duplicates from parallel corpora.
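
For the vLLM entry above, a minimal offline-inference sketch using vLLM's Python API. The model ID and prompt are placeholder assumptions, not anything the listing specifies:

```python
# Minimal vLLM offline-inference sketch. The model ID below is an arbitrary
# example; swap in any Hugging Face model available to you.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # assumed example model
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```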
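For the lm-evaluation-harness entry, a sketch of the programmatic entry point (the `simple_evaluate` helper in recent 0.4.x releases); the model and task choices are assumptions made purely for illustration:

```python
# Sketch of few-shot evaluation via lm-evaluation-harness's Python API.
# The model and task below are illustrative assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face transformers backend
    model_args="pretrained=EleutherAI/pythia-160m",  # assumed example model
    tasks=["hellaswag"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["hellaswag"])               # per-task metrics dict
```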
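For the LMDeploy entry, a minimal sketch of its `pipeline` API; the model ID is an assumption, and the same model can alternatively be exposed as a server through LMDeploy's CLI:

```python
# Minimal LMDeploy pipeline sketch; the model ID is an illustrative assumption.
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")         # loads the model on first use
responses = pipe(["Hi, please introduce yourself."])  # batch of one prompt
print(responses[0].text)
```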
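For the SGLang entry, a sketch of its Python frontend language driving an already running local server; the endpoint URL, question, and token budget are assumptions, and the frontend API shown is one of several ways to talk to an SGLang server:

```python
# Sketch of SGLang's Python frontend; assumes a server is already running
# locally (e.g. started with `python -m sglang.launch_server`).
import sglang as sgl

@sgl.function
def qa(s, question):
    # Build the prompt incrementally, then ask the backend to generate.
    s += "Q: " + question + "\n"
    s += "A: " + sgl.gen("answer", max_tokens=64)

sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))  # assumed port
state = qa.run(question="What is the capital of France?")
print(state["answer"])
```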