Projects with this topic
mlx-vlm
https://github.com/Blaizzy/mlx-vlm MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
https://github.com/InternLM/InternLM-XComposer InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
https://github.com/princeton-nlp/CharXiv CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
https://github.com/deepseek-ai/DeepSeek-VL DeepSeek-VL: Towards Real-World Vision-Language Understanding
https://github.com/FoundationVision/Groma [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization