From d28fc9898aa55312eb4b3fce435167ce335ca4be Mon Sep 17 00:00:00 2001
From: Hamid Shojanazeri <hamid.nazeri2010@gmail.com>
Date: Sun, 27 Aug 2023 19:02:23 +0000
Subject: [PATCH] addressing doc comments

---
 docs/inference.md | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/docs/inference.md b/docs/inference.md
index 74e9b124..ddb1d003 100644
--- a/docs/inference.md
+++ b/docs/inference.md
@@ -41,35 +41,42 @@ model.resize_token_embeddings(model.config.vocab_size + 1)
 ```
 Padding would be required for batch inference. In this this [example](../inference/inference.py), batch size = 1 so essentially padding is not required. However,We added the code pointer as an example in case of batch inference.
+**Chat completion**
+The inference folder also includes a chat completion example, that adds built-in safety features in fine-tuned models to the prompt tokens. To run the example:
+
+```bash
+python inference/chat_completion.py --model_name "PATH/TO/MODEL/7B/" --prompt_file inference/chats.json --quantization --use_auditnlg
+
+```
 **Code Llama**
 Code llama was recently released with three flavors, base-model that support multiple programming languages, Python fine-tuned model and an instruction fine-tuned and aligned variation of Code Llama, please read more [here](https://ai.meta.com/blog/code-llama-large-language-model-coding/). Find the scripts to run Code Llama [here](../inference/code-llama/), where there are two examples of running code completion and infilling.
-**Note** Please find the right model on HF side [here](https://huggingface.co/codellama).
+**Note** Please find the right model on HF side [here](https://huggingface.co/codellama). 
-To run the code completion example:
+Make sure to install Transformers from source for now:
 ```bash
-python code_completion_example.py --model_name MODEL_NAME --prompt_file code_completion_prompt.txt --temperature 0.2 --top_p 0.9
+pip install git+https://github.com/huggingface/transformers
 ```
-To run the code infilling example:
+To run the code completion example:
 ```bash
-python code_infilling_example.py --model_name MODEL_NAME --prompt_file code_infilling_prompt.txt --temperature 0.2 --top_p 0.9
+python code_completion_example.py --model_name MODEL_NAME --prompt_file code_completion_prompt.txt --temperature 0.2 --top_p 0.9
 ```
-**Chat completion**
-The inference folder also includes a chat completion example, that adds built-in safety features in fine-tuned models to the prompt tokens. To run the example:
+To run the code infilling example:
 ```bash
-python inference/chat_completion.py --model_name "PATH/TO/MODEL/7B/" --prompt_file inference/chats.json --quantization --use_auditnlg
+
+python code_infilling_example.py --model_name MODEL_NAME --prompt_file code_infilling_prompt.txt --temperature 0.2 --top_p 0.9
 ```
-- 
GitLab