From a0d1390aacc8159ccfc866b2cbc2681b69e79dfe Mon Sep 17 00:00:00 2001
From: Matthias Reso <13337103+mreso@users.noreply.github.com>
Date: Tue, 3 Oct 2023 05:46:38 -0700
Subject: [PATCH] Fixed typos

---
 docs/Dataset.md                      | 4 ++--
 scripts/spellcheck_conf/wordlist.txt | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/Dataset.md b/docs/Dataset.md
index 034b00a5..4f31907c 100644
--- a/docs/Dataset.md
+++ b/docs/Dataset.md
@@ -10,10 +10,10 @@ The provided fine tuning script allows you to select between three datasets by p
 ## Batching Strategies
 Llama-recipes supports two strategies to batch requests together.
 The default setting is `packing` which concatenates the tokenized samples into long sequences filling up the context length of the model.
-This is the most compute efficient variant as it avoids any padding and all sequences have the same langth.
+This is the most compute efficient variant as it avoids any padding and all sequences have the same length.
 Samples at the boundary of the context length are truncated and the remainder of the cut sequence is used as the start of the next long sequence.
 
-If the amount of training data is small this proceedure might introduce a lot of noise into the training data which can hurt the prediction performance of the fine-tune model.
+If the amount of training data is small, this procedure might introduce a lot of noise into the training data, which can hurt the prediction performance of the fine-tuned model.
 Therefore, we also support a `padding` strategy which does not introduce the additional noise due to truncated sequences.
 The strategy tries to minimize the efficiency loss by batching samples of similar length together so only minimal padding is necessary.
 
diff --git a/scripts/spellcheck_conf/wordlist.txt b/scripts/spellcheck_conf/wordlist.txt
index f58d0425..27b3286f 100644
--- a/scripts/spellcheck_conf/wordlist.txt
+++ b/scripts/spellcheck_conf/wordlist.txt
@@ -1156,4 +1156,5 @@ Autocast
 FN
 GBs
 MLP
-learnable
\ No newline at end of file
+learnable
+tokenized
-- 
GitLab
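
The two batching strategies described in the patched docs/Dataset.md can be sketched as below. This is a minimal illustrative sketch, not the actual llama-recipes implementation; the function names `pack_samples` and `pad_batches` and the toy token lists are assumptions for demonstration.

```python
def pack_samples(tokenized_samples, context_length):
    """Sketch of `packing`: concatenate tokenized samples and slice them
    into fixed-size chunks that fill the model's context length.

    Samples crossing a chunk boundary are truncated; the remainder seeds
    the next chunk, as described in the docs. Leftover tokens shorter
    than context_length are returned separately.
    """
    buffer, packed = [], []
    for sample in tokenized_samples:
        buffer.extend(sample)
        while len(buffer) >= context_length:
            packed.append(buffer[:context_length])
            buffer = buffer[context_length:]
    return packed, buffer


def pad_batches(tokenized_samples, batch_size, pad_id=0):
    """Sketch of `padding`: group samples of similar length so that only
    minimal padding is needed within each batch."""
    ordered = sorted(tokenized_samples, key=len)
    batches = []
    for i in range(0, len(ordered), batch_size):
        batch = ordered[i:i + batch_size]
        width = len(batch[-1])  # longest sample in this batch (sorted order)
        batches.append([s + [pad_id] * (width - len(s)) for s in batch])
    return batches


packed, leftover = pack_samples([[1, 2, 3], [4, 5, 6, 7], [8, 9]], context_length=4)
# packed == [[1, 2, 3, 4], [5, 6, 7, 8]], leftover == [9]

padded = pad_batches([[1], [2, 3], [4, 5, 6]], batch_size=2)
# padded == [[[1, 0], [2, 3]], [[4, 5, 6]]]
```

Packing wastes no compute on pad tokens but cuts samples at chunk boundaries; length-grouped padding keeps every sample intact at the cost of a few pad tokens per batch.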