From beab5726cc7c7e5274406560a1ec77007cd8ea82 Mon Sep 17 00:00:00 2001
From: Hamid Shojanazeri <hamid.nazeri2010@gmail.com>
Date: Tue, 8 Aug 2023 17:24:16 +0000
Subject: [PATCH] add notes for padding

---
 docs/inference.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/docs/inference.md b/docs/inference.md
index c08c91f1..509eb2bc 100644
--- a/docs/inference.md
+++ b/docs/inference.md
@@ -27,6 +27,20 @@ inference/samsum_prompt.txt
 ...
 ```
 
+**Note**
+The pad token in the [Hugging Face Llama tokenizer is `None` by default](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/tokenization_llama.py#L110). We add the padding token as a special token to the tokenizer, which in this case requires resizing the token embeddings as shown below:
+
+```python
+tokenizer.add_special_tokens(
+    {
+        "pad_token": "<PAD>",
+    }
+)
+model.resize_token_embeddings(model.config.vocab_size + 1)
+```
+Padding is required for batch inference. In this [example](../inference/inference.py), the batch size is 1, so padding is not strictly needed. However, we include the code pointer as an example for the batch-inference case.
+
+
 **Chat completion**
 
 The inference folder also includes a chat completion example, that adds built-in safety features in fine-tuned models to the prompt tokens. To run the example:
-- 
GitLab
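As background for the note added in this patch: batching only works on rectangular inputs, so sequences of unequal length must be padded to a common length with the pad token's id. A minimal plain-Python sketch of that step (the `PAD_ID` value is hypothetical, standing in for whatever id the tokenizer assigns to the newly added `<PAD>` token):

```python
# Hypothetical id for the added "<PAD>" token; in practice you would use
# tokenizer.pad_token_id after calling tokenizer.add_special_tokens(...).
PAD_ID = 32000

def pad_batch(token_id_lists, pad_id=PAD_ID):
    """Right-pad each sequence of token ids to the batch's max length."""
    max_len = max(len(seq) for seq in token_id_lists)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in token_id_lists]

# Three prompts of different lengths become one rectangular batch.
batch = pad_batch([[1, 5, 9], [1, 7], [1]])
# → [[1, 5, 9], [1, 7, 32000], [1, 32000, 32000]]
```

With a batch size of 1, as in the linked example, every "batch" is already rectangular, which is why padding is not strictly needed there.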