From 1d3780f3abbbf1af1c1adfba073e05b670167a22 Mon Sep 17 00:00:00 2001
From: Matthias Reso <13337103+mreso@users.noreply.github.com>
Date: Wed, 30 Aug 2023 23:38:15 +0000
Subject: [PATCH] Address spell check issues

---
 docs/single_gpu.md                         |  4 ++--
 scripts/spellcheck_conf/wordlist.txt       | 23 +++++++++++++++++++
 .../hf_text_generation_inference/README.md |  2 +-
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/docs/single_gpu.md b/docs/single_gpu.md
index bcd2a317..4aec7810 100644
--- a/docs/single_gpu.md
+++ b/docs/single_gpu.md
@@ -4,7 +4,7 @@ To run fine-tuning on a single GPU, we will make use of two packages
 
 1- [PEFT](https://huggingface.co/blog/peft) methods and in specific using HuggingFace [PEFT](https://github.com/huggingface/peft)library.
 
-2- [BitandBytes](https://github.com/TimDettmers/bitsandbytes) int8 quantization.
+2- [bitandbytes](https://github.com/TimDettmers/bitsandbytes) int8 quantization.
 
 Given combination of PEFT and Int8 quantization, we would be able to fine_tune a Llama 2 7B model on one consumer grade GPU such as A10.
 
@@ -15,7 +15,7 @@ To run the examples, make sure to install the llama-recipes package (See [README
 
 ## How to run it?
 
-Get access to a machine with one GPU or if using a multi-GPU macine please make sure to only make one of them visible using `export CUDA_VISIBLE_DEVICES=GPU:id` and run the following. It runs by default with `samsum_dataset` for summarization application.
+Get access to a machine with one GPU or if using a multi-GPU machine please make sure to only make one of them visible using `export CUDA_VISIBLE_DEVICES=GPU:id` and run the following. It runs by default with `samsum_dataset` for summarization application.
 
 
 ```bash
diff --git a/scripts/spellcheck_conf/wordlist.txt b/scripts/spellcheck_conf/wordlist.txt
index 27c7323c..113f2166 100644
--- a/scripts/spellcheck_conf/wordlist.txt
+++ b/scripts/spellcheck_conf/wordlist.txt
@@ -1121,3 +1121,26 @@ summarization
 xA
 Sanitization
 tokenization
+hatchling
+setuptools
+BoolQ
+CausalLM
+Dyck
+GSM
+HellaSwag
+HumanEval
+MMLU
+NarrativeQA
+NaturalQuestions
+OpenbookQA
+PREPROC
+QuAC
+TruthfulQA
+WinoGender
+bAbI
+dataclass
+datafiles
+davinci
+GPU's
+HuggingFace's
+LoRA
\ No newline at end of file
diff --git a/src/llama_recipes/inference/hf_text_generation_inference/README.md b/src/llama_recipes/inference/hf_text_generation_inference/README.md
index 7a4f72c6..d6c3ada0 100644
--- a/src/llama_recipes/inference/hf_text_generation_inference/README.md
+++ b/src/llama_recipes/inference/hf_text_generation_inference/README.md
@@ -1,4 +1,4 @@
-# Serving a fine tuned LLaMA model with HuggingFace text-generation-inference server
+# Serving a fine tuned Llama model with HuggingFace text-generation-inference server
 
 This document shows how to serve a fine tuned LLaMA mode with HuggingFace's text-generation-inference server. This option is currently only available for models that were trained using the LoRA method or without using the `--use_peft` argument.
 
--
GitLab
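A mail-formatted patch like the one above can be applied locally with `git am`, which recreates the commit using the author, date, and subject from the mail header. A minimal sketch, assuming the patch has been saved as `0001-Address-spell-check-issues.patch` (an illustrative filename following the `git format-patch` naming convention) and that you are at the root of a llama-recipes checkout:

```bash
# Preview which files the patch touches; the output should match the
# diffstat in the header: 3 files changed, 26 insertions(+), 3 deletions(-).
git apply --stat 0001-Address-spell-check-issues.patch

# Apply the patch as a commit, keeping the original author and message.
# --3way falls back to a three-way merge using the blob IDs on the
# "index" lines if the context lines no longer match your checkout.
git am --3way 0001-Address-spell-check-issues.patch
```

If `git am` stops partway through, `git am --abort` restores the branch to its state before the apply was attempted.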