Projects with this topic
https://github.com/vllm-project/vllm A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/flashinfer-ai/flashinfer FlashInfer: Kernel Library for LLM Serving
https://github.com/InternLM/lmdeploy LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
lmdeploy.readthedocs.io/en/latest/
https://github.com/Lightning-AI/litgpt 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
https://lightning.ai/
https://github.com/google/gemma.cpp Lightweight, standalone C++ inference engine for Google's Gemma models.
https://github.com/modelscope/dash-infer DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
https://github.com/nbursa/inception-core-server A modular, extensible Rust-based server providing short-term, long-term, and latent memory services, a chat endpoint with a BaseAgent + Sentience DSL, and integration with ChromaDB and LLM services.
https://github.com/harishsg993010/neural-state-manipulator A tool for manipulating the internal neural activations of language models
https://github.com/cognitivecomputations/dolphin-logger A proxy that records your chats so you can build datasets from your conversations with any chat LLM. It supports multiple LLM backends, including OpenAI, Anthropic, Google, and Ollama.
https://github.com/bytedance/ShadowKV ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
https://github.com/mistralai/mistral-inference Official inference library for Mistral models
mistral.ai/
https://github.com/huggingface/parler-tts Inference and training library for high-quality TTS models.
The Arcee client for executing domain-adapted language model routines
https://github.com/coqui-ai/inference-engine Coqui Inference Engine
AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models
https://github.com/noco-ai/spellbook-docker/wiki