tokenization

Projects with this topic

  • Magikarp

    mirrored_repos / MachineLearning / cohere-ai / Magikarp

    🔧🔗https://github.com/cohere-ai/magikarp

    This repository contains the code and extended results for the paper "Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models" (https://arxiv.org/abs/2405.05417).

    cohere tokenization Large Langua...
    Updated May 15, 2025
  • OmniTokenizer

    mirrored_repos / MachineLearning / FoundationVision / OmniTokenizer

    🔧🔗https://github.com/FoundationVision/OmniTokenizer

    OmniTokenizer: one model and one weight for image-video joint tokenization.

    vae image-genera... auto-regress... tokenization video-genera... vqvae
    Updated Oct 19, 2024
