Projects with this topic
FoundationVision/Groma (https://github.com/FoundationVision/Groma): [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization