tokenization
Projects with this topic
🔧 🔗 https://github.com/cohere-ai/magikarp This repository contains the code and extended results for the paper "Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models" (https://arxiv.org/abs/2405.05417).
🔧 🔗 https://github.com/FoundationVision/OmniTokenizer OmniTokenizer: one model and one weight for image-video joint tokenization.