From 7f026ae834e115da22a63f5494170a611a585555 Mon Sep 17 00:00:00 2001
From: xiamengzhou <296337231@qq.com>
Date: Mon, 5 Feb 2024 11:51:25 -0500
Subject: [PATCH] Update README with link to processed data

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a222499..04601cb 100644
--- a/README.md
+++ b/README.md
@@ -37,7 +37,7 @@ pip install -e .
 
 
 ## Data Preparation
-We follow the [open-instruct](https://github.com/allenai/open-instruct?tab=readme-ov-file#dataset-preparation) repo to prepare hour instruction tuning datasets. In our project, we utilize a combination of four training datasets: Flan v2, COT, Dolly, and Open Assistant. For the purposes of evaluation, we employ three additional datasets: MMLU, Tydiqa, and BBH. A processed version of these files will be made available [here] [TODO].
+We follow the [open-instruct](https://github.com/allenai/open-instruct?tab=readme-ov-file#dataset-preparation) repo to prepare our instruction tuning datasets. In our project, we utilize a combination of four training datasets: Flan v2, COT, Dolly, and Open Assistant. For the purposes of evaluation, we employ three additional datasets: MMLU, Tydiqa, and BBH. A processed version of these files is available [here](https://huggingface.co/datasets/princeton-nlp/less_data).
 
 ## Data Selection Pipeline
 
-- 
GitLab