Explore projects
mirrored_repos / MachineLearning / cohere-ai / Unstructured
Apache License 2.0
https://github.com/cohere-ai/unstructured Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
mirrored_repos / Web Maker
MIT License
mirrored_repos / Tinkerbell / Sandbox
Apache License 2.0
Scaffold project for WPLShortcodes consumption
mirrored_repos / MachineLearning / princeton-nlp / CharXiv
Apache License 2.0
https://github.com/princeton-nlp/CharXiv CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
https://github.com/princeton-nlp/ELIZA-Transformer Representing Rule-based Chatbots with Transformers
https://github.com/princeton-nlp/PTP Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
https://github.com/princeton-nlp/LESS [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
https://github.com/princeton-nlp/Edge-Pruning Code and data for the paper "Finding Transformer Circuits with Edge Pruning".
https://github.com/princeton-nlp/tree-of-thought-llm [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
https://github.com/princeton-nlp/QuRating [ICML 2024] Selecting High-Quality Data for Training Language Models
https://github.com/princeton-nlp/CEPE [ACL 2024] Long-Context Language Modeling with Parallel Encodings
https://github.com/princeton-nlp/Heuristic-Core [ACL 2024] The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models - https://arxiv.org/abs/2403.03942
https://github.com/princeton-nlp/c-sts [EMNLP 2023] C-STS: Conditional Semantic Textual Similarity
https://github.com/princeton-nlp/TransformerPrograms [NeurIPS 2023] Learning Transformer Programs
https://github.com/princeton-nlp/intercode [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898
https://github.com/princeton-nlp/LLM-Shearing [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
https://github.com/princeton-nlp/TRIME [EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674