Explore projects
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/huggingface/transformers 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. Docs: huggingface.co/transformers
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
MLX: An array framework for Apple silicon
Swift API for MLX
https://github.com/AnswerDotAI/gpu.cpp A lightweight library for portable low-level GPU computation using WebGPU.
https://github.com/janhq/cortex Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM, ONNX). Powers 👋 Jan.