By default, we use OpenAI's `text-davinci-003` model. But you may choose to customize
the underlying LLM being used.
Below we show a few examples of LLM customization. This includes:
- changing the underlying LLM
- changing the number of output tokens (for OpenAI, Cohere, or AI21)
- having more fine-grained control over all parameters for any LLM, from input size to chunk overlap
## Example: Changing the underlying LLM
An example snippet of customizing the LLM being used is shown below.
In this example, we use `text-davinci-002` instead of `text-davinci-003`. Note that
you may plug in any LLM shown on Langchain's
[LLM](https://langchain.readthedocs.io/en/latest/modules/llms.html) page.
```python
from gpt_index import (
    GPTKeywordTableIndex,
    SimpleDirectoryReader,
    LLMPredictor,
)
from langchain import OpenAI

documents = SimpleDirectoryReader('data').load_data()

# define LLM
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-002"))

# build index over the loaded documents
index = GPTKeywordTableIndex(documents, llm_predictor=llm_predictor)

# get response from query
response = index.query("What did the author do after his time at Y Combinator?")
```
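Since `LLMPredictor` simply wraps a Langchain LLM, swapping in a non-OpenAI model only changes the wrapper you construct. Below is a minimal sketch using Langchain's `Cohere` class; the choice of Cohere is purely illustrative, and it assumes the `cohere` package is installed and a `COHERE_API_KEY` environment variable is set.

```python
from gpt_index import GPTKeywordTableIndex, SimpleDirectoryReader, LLMPredictor
from langchain.llms import Cohere

documents = SimpleDirectoryReader('data').load_data()

# define a non-OpenAI LLM (reads COHERE_API_KEY from the environment)
llm_predictor = LLMPredictor(llm=Cohere(temperature=0))

# build the index and query it exactly as before
index = GPTKeywordTableIndex(documents, llm_predictor=llm_predictor)
response = index.query("What did the author do after his time at Y Combinator?")
```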
## Example: Changing the number of output tokens (for OpenAI, Cohere, AI21)
The number of output tokens is usually set to some low number by default (for instance,
with OpenAI the default is 256).
For OpenAI, Cohere, and AI21, you just need to set the `max_tokens` parameter
(or `maxTokens` for AI21). We will handle text chunking/calculations under the hood.
```python
from gpt_index import (
    GPTKeywordTableIndex,
    SimpleDirectoryReader,
    LLMPredictor,
)
from langchain import OpenAI

documents = SimpleDirectoryReader('data').load_data()

# define LLM with a larger output budget
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-002", max_tokens=512))

# build index over the loaded documents
index = GPTKeywordTableIndex(documents, llm_predictor=llm_predictor)

# get response from query
response = index.query("What did the author do after his time at Y Combinator?")
```
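For the other two providers, the parameter goes on the corresponding Langchain wrapper instead. A short sketch, assuming Langchain's `Cohere` and `AI21` classes and the usual `COHERE_API_KEY`/`AI21_API_KEY` environment variables:

```python
from langchain.llms import AI21, Cohere

# Cohere uses the same max_tokens parameter name
cohere_llm = Cohere(temperature=0, max_tokens=512)

# AI21 spells the parameter maxTokens
ai21_llm = AI21(temperature=0, maxTokens=512)
```

Either one can then be passed to `LLMPredictor` exactly as in the OpenAI snippet above.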
If you are using other LLM classes from langchain, please see below.
## Example: Fine-grained control over all parameters
To have fine-grained control over all parameters, you will need to define
a custom `PromptHelper`.
```python
from gpt_index import (
    GPTKeywordTableIndex,
    SimpleDirectoryReader,
    LLMPredictor,
    PromptHelper
)
from langchain import OpenAI

documents = SimpleDirectoryReader('data').load_data()

# define prompt helper
# set maximum input size
max_input_size = 4096
# set number of output tokens
num_output = 256
# set maximum chunk overlap
max_chunk_overlap = 20
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

# define LLM
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-002", max_tokens=num_output))

# build index over the loaded documents
index = GPTKeywordTableIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)

# get response from query
response = index.query("What did the author do after his time at Y Combinator?")
```
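The custom predictor and prompt helper also apply when reloading a persisted index: `load_from_disk` accepts the same `llm_predictor` and `prompt_helper` keyword arguments, so an index that was previously built and saved picks up the new predictor and prompt parameters. A sketch, assuming the index was saved with the paired `save_to_disk` method:

```python
# persist the index built above
index.save_to_disk('index_table.json')

# later: load the existing index, swapping in the new LLMPredictor and PromptHelper
index = GPTKeywordTableIndex.load_from_disk(
    'index_table.json', llm_predictor=llm_predictor, prompt_helper=prompt_helper
)
response = index.query("What did the author do after his time at Y Combinator?")
```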