Explore projects
User-Mode Driver for Tenstorrent hardware
A high-throughput and memory-efficient inference and serving engine for LLMs
https://github.com/janhq/nitro.git (now: https://github.com/janhq/cortex.git) Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM, ONNX). Powers Jan.
https://github.com/InternLM/lmdeploy LMDeploy is a toolkit for compressing, deploying, and serving LLMs. Docs: lmdeploy.readthedocs.io/en/latest/
https://github.com/element-hq/element-x-ios Next generation Matrix client for iOS built with SwiftUI on top of matrix-rust-sdk.
MLX: An array framework for Apple silicon
https://github.com/janhq/cortex.tensorrt-llm Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA's TensorRT-LLM for GPU-accelerated inference on NVIDIA GPUs.
Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
https://github.com/moderngl/moderngl Modern OpenGL binding for Python. Docs: moderngl.readthedocs.io/
Port of OpenAI's Whisper model in C/C++
https://github.com/NVIDIA/apex A PyTorch extension: tools for easy mixed precision and distributed training in PyTorch
https://github.com/Mozilla-Ocho/llamafile Distribute and run LLMs with a single file. llamafile.ai