llm-inference
Projects with this topic
- https://github.com/InternLM/lmdeploy LMDeploy is a toolkit for compressing, deploying, and serving LLMs. (docs: lmdeploy.readthedocs.io/en/latest/; usage sketch after this list)
- https://github.com/modelscope/dash-infer DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
- https://github.com/huggingface/parler-tts Inference and training library for high-quality TTS models. (usage sketch after this list)
- https://github.com/Lightning-AI/litgpt 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale. (site: https://lightning.ai/; usage sketch after this list)
- https://github.com/bytedance/ShadowKV ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference.
- https://github.com/mistralai/mistral-inference Official inference library for Mistral models. (site: mistral.ai/)
- The Arcee client for executing domain-adapted language model routines.
- https://github.com/coqui-ai/inference-engine Coqui Inference Engine.
- https://github.com/noco-ai/spellbook-docker/wiki AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS, and many other AI models.
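
To give a feel for LMDeploy's serving API, here is a minimal sketch using its `pipeline` entry point. The model name is an assumption for illustration; any chat model LMDeploy supports can be substituted.

```python
from lmdeploy import pipeline

# Hypothetical model choice; swap in any chat model LMDeploy supports.
pipe = pipeline("internlm/internlm2-chat-7b")

# The pipeline accepts a batch of prompts and returns one response per prompt.
responses = pipe(["Explain KV-cache quantization in one sentence."])
print(responses[0].text)
```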
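Parler-TTS follows the familiar Hugging Face `generate` pattern, conditioning on a free-text voice description plus the text to speak. A minimal sketch, assuming the `parler-tts/parler-tts-mini-v1` checkpoint; the description and prompt strings are illustrative.

```python
import torch
import soundfile as sf
from transformers import AutoTokenizer
from parler_tts import ParlerTTSForConditionalGeneration

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Checkpoint name is an assumption; other Parler-TTS checkpoints work the same way.
model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler-tts-mini-v1").to(device)
tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")

# A free-text description of the target voice, and the text to synthesize.
description = "A calm female voice with clear articulation and little background noise."
prompt = "Hello! This is a quick Parler-TTS smoke test."

input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Generate a waveform and write it out at the model's native sampling rate.
audio = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
sf.write("out.wav", audio.cpu().numpy().squeeze(), model.config.sampling_rate)
```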
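LitGPT exposes both a CLI and a small Python API. A minimal sketch of the latter, assuming the `microsoft/phi-2` checkpoint is available locally or downloadable from the Hugging Face hub (e.g. via `litgpt download microsoft/phi-2`); the model choice is illustrative.

```python
from litgpt import LLM

# Checkpoint is illustrative; LitGPT ships recipes for 20+ model families.
llm = LLM.load("microsoft/phi-2")

# Plain text in, plain text out.
print(llm.generate("What do llamas eat?", max_new_tokens=64))
```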