From 297d6d75dcc1252f34e6b4975e58067af26b35f1 Mon Sep 17 00:00:00 2001
From: Zirui Wang <ziruiw2000@gmail.com>
Date: Wed, 25 Dec 2024 13:00:51 +0800
Subject: [PATCH] Update README.md

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 96d9d35..dd5879c 100644
--- a/README.md
+++ b/README.md
@@ -10,9 +10,9 @@ This repository contains the code to evaluate models on CharXiv from the paper [
 https://github.com/princeton-nlp/CharXiv/assets/59942464/ab9b293b-8fd6-4735-b8b3-0079ee978b61
 
 ## 📰 News
-**[12/25/2024]** 🚀 We updated the [leaderboard]((https://charxiv.github.io/#leaderboard)) with the latest models: o1, Qwen2-VL, Pixtral, InternVL 2.5, Llama 3.2 Vision, NVLM, Molmo, Llava OneVision, Phi 3.5, and more!
-**[10/10/2024]** 🚀 CharXiv is accepted at **NeurIPS 2024 Datasets & Benchmarks Track** and NeurIPS 2024 Multimodal Algorithmic Reasoning Workshop as a **spotlight** paper.
-**[07/26/2024]** 🚀 Upcoming this week: we'll be releasing scores for [GPT-4o-mini](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/) as well as the largest and most capable open-weight VLM in our benchmark: [InternVL2 LLaMA-3 76B](https://huggingface.co/OpenGVLab/InternVL2-Llama3-76B). Alongside scores, we find some [interesting patterns](https://x.com/zwcolin/status/1816948825036071196) in the trend of model improvement with respect to differnet chart understanding benchmarks on X.
+**[12/25/2024]** 🚀 We updated the [leaderboard](https://charxiv.github.io/#leaderboard) with the latest models: o1, Qwen2-VL, Pixtral, InternVL 2.5, Llama 3.2 Vision, NVLM, Molmo, Llava OneVision, Phi 3.5, and more!
+**[10/10/2024]** 🚀 CharXiv is accepted at **NeurIPS 2024 Datasets & Benchmarks Track** and NeurIPS 2024 Multimodal Algorithmic Reasoning Workshop as a **spotlight** paper.
+**[07/26/2024]** 🚀 Upcoming this week: we'll be releasing scores for [GPT-4o-mini](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/) as well as the largest and most capable open-weight VLM in our benchmark: [InternVL2 LLaMA-3 76B](https://huggingface.co/OpenGVLab/InternVL2-Llama3-76B). Alongside scores, we find some [interesting patterns](https://x.com/zwcolin/status/1816948825036071196) in the trend of model improvement with respect to different chart understanding benchmarks on X.
 **[07/24/2024]** 🚀 We released the [full evaluation pipeline](https://github.com/princeton-nlp/CharXiv) (i.e., v1.0).
 **[07/23/2024]** 🚀 We released our [evaluation results](https://huggingface.co/datasets/princeton-nlp/CharXiv/tree/main/existing_evaluations) on **all 34 MLLMs** that we have tested so far -- this includes all models' responses to CharXiv's challenging questions, scores graded by GPT-4o, as well as aggregated stats.
 **[07/14/2024]** 🚀 We further evaluated the latest [InternVL Chat V2.0 26B](https://huggingface.co/OpenGVLab/InternVL2-26B) and [Cambrian 34B models](https://huggingface.co/nyu-visionx/cambrian-34b) on CharXiv with some **State-of-the-Art results**. More analysis are [here](https://x.com/zwcolin/status/1812650435808792731).
--
GitLab