diff --git a/.readthedocs.yaml b/.readthedocs.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..91d3113047d1e96be0f7720becda736d8e7cca7a
--- /dev/null
+++ b/.readthedocs.yaml
@@ -0,0 +1,10 @@
+version: 2
+sphinx:
+  configuration: docs/conf.py
+formats: all
+python:
+  version: 3.8
+  install:
+    - requirements: docs/requirements.txt
+    - method: pip
+      path: .
\ No newline at end of file
diff --git a/README.md b/README.md
index b00595a634e14ee768d4dfebe99bed5a7a05390d..657f7fcf241105c2b098ea247bd41d0eeee6ba89 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,38 @@
 # 🗂️ GPT Index
 
-GPT Index is a project consisting of a set of *data structures* that are created using GPT-3 and can be traversed using GPT-3 in order to answer queries.
+GPT Index is a project consisting of a set of *data structures* that are created using LLMs and can be traversed using LLMs in order to answer queries.
+
+PyPI: https://pypi.org/project/gpt-index/.
+
+Documentation: https://gpt-index.readthedocs.io/en/latest/.
 
 ## 🚀 Overview
 
 #### Context
-- GPT-3 is a phenomenonal piece of technology for knowledge generation and reasoning.
-- A big limitation of GPT-3 is context size (e.g. Davinci's limit is 4096 tokens. Large, but not infinite).
-- The ability to feed "knowledge" to GPT-3 is restricted to this limited prompt size and model weights.
-- **Thought**: What if GPT-3 can have access to potentially a much larger database of knowledge without retraining/finetuning? 
+- LLMs are a phenomenal technology for knowledge generation and reasoning.
+- A big limitation of LLMs is context size (e.g. OpenAI's `davinci` model for GPT-3 has a [limit](https://openai.com/api/pricing/) of 4096 tokens. Large, but not infinite).
+- The ability to feed "knowledge" to LLMs is restricted to this limited prompt size and model weights.
+- **Thought**: What if LLMs could have access to a potentially much larger database of knowledge without retraining/finetuning?
 
 #### Proposed Solution
-That's where the **GPT Index** data structures come in. Instead of relying on world knowledge encoded in the model weights, a GPT Index data structure does the following:
-- Uses a pre-trained GPT-3 model primarily for *reasoning*/*summarization* instead of prior knowledge.
-- Takes as input a large corpus of text data and build a structured index over it (using GPT-3 or heuristics).
-- Allow users to _query_ the index in order to synthesize an answer to the question - this requires both _traversal_ of the index as well as a synthesis of the answer.
+That's where the **GPT Index** comes in. GPT Index is a simple, flexible interface between your external data and LLMs. It resolves the following pain points:
+
+- Provides simple data structures to resolve prompt size limitations.
+- Offers data connectors to your external data sources.
+- Offers a comprehensive toolset for trading off cost and performance.
+
+At the core of GPT Index is a **data structure**. Instead of relying on world knowledge encoded in the model weights, a GPT Index data structure does the following:
+
+- Uses a pre-trained LLM primarily for *reasoning*/*summarization* instead of prior knowledge.
+- Takes as input a large corpus of text data and builds a structured index over it (using an LLM or heuristics).
+- Allows users to *query* the index in order to synthesize an answer to the question - this requires both *traversal* of the index as well as a synthesis of the answer.
+
+## 📄 Documentation
+
+Full documentation can be found here: https://gpt-index.readthedocs.io/en/latest/. 
+
+Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources! 
 
-The high-level design exercise of this project is to test the capability of GPT-3 as a general-purpose processor to organize and retrieve data. From our current understanding, related works have used GPT-3 to reason with external db sources (see below); this work links reasoning with knowledge building.
 
 ## 💻 Example Usage
 
@@ -53,32 +69,6 @@ The main third-party package requirements are `transformers`, `openai`, and `lan
 All requirements should be contained within the `setup.py` file. To run the package locally without building the wheel, simply do `pip install -r requirements.txt`. 
 
 
-## Index Data Structures
-
-- [`Tree Index`](gpt_index/indices/tree/README.md): Tree data structures
-    - **Creation**: with GPT hierarchical summarization over sub-documents
-    - **Query**: with GPT recursive querying over multiple choice problems
-- [`Keyword Table Index`](gpt_index/indices/keyword_table/README.md): a keyword-based table
-    - **Creation**: with GPT keyword extraction over each sub-document
-    - **Query**: with GPT keyword extraction over question, match to sub-documents. *Create and refine* an answer over candidate sub-documents.
-- [`List Index`](gpt_index/indices/list/README.md): a simple list-based data structure
-    - **Creation**: by splitting documents into a list of text chunks
-    - **Query**: use GPT with a create and refine prompt iterately over the list of sub-documents
-
-
-## Data Connectors
-
-We currently offer connectors into the following data sources. External data sources are retrieved through their APIs + corresponding authentication token.
-- Notion (`NotionPageReader`)
-- Google Drive (`GoogleDocsReader`)
-- Slack (`SlackReader`)
-- MongoDB (local) (`SimpleMongoReader`)
-- Wikipedia (`WikipediaReader`)
-- local file directory (`SimpleDirectoryReader`)
-
-Example notebooks of how to use data connectors are found in the [Data Connector Example Notebooks](examples/data_connectors).
-
-
 ## 🔬 Related Work [WIP]
 
 [Measuring and Narrowing the Compositionality Gap in Language Models, by Press et al.](https://arxiv.org/abs/2210.03350)
diff --git a/docs/Makefile b/docs/Makefile
new file mode 100644
index 0000000000000000000000000000000000000000..d4bb2cbb9eddb1bb1b4f366623044af8e4830919
--- /dev/null
+++ b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/_static/composability/diagram.png b/docs/_static/composability/diagram.png
new file mode 100644
index 0000000000000000000000000000000000000000..9734b399757d0e7d5aae85768423cd59b51873b0
Binary files /dev/null and b/docs/_static/composability/diagram.png differ
diff --git a/docs/_static/composability/diagram_b0.png b/docs/_static/composability/diagram_b0.png
new file mode 100644
index 0000000000000000000000000000000000000000..0e080daaac368b2f8497d5815637cc9f0d4a85f7
Binary files /dev/null and b/docs/_static/composability/diagram_b0.png differ
diff --git a/docs/_static/composability/diagram_b1.png b/docs/_static/composability/diagram_b1.png
new file mode 100644
index 0000000000000000000000000000000000000000..31a6d3055ff6259c6bf46c4cd8b23f1fa9f32ecc
Binary files /dev/null and b/docs/_static/composability/diagram_b1.png differ
diff --git a/docs/_static/composability/diagram_q1.png b/docs/_static/composability/diagram_q1.png
new file mode 100644
index 0000000000000000000000000000000000000000..feb6cbb723f8c537a029e36f2863288ecf9616ee
Binary files /dev/null and b/docs/_static/composability/diagram_q1.png differ
diff --git a/docs/_static/composability/diagram_q2.png b/docs/_static/composability/diagram_q2.png
new file mode 100644
index 0000000000000000000000000000000000000000..f89af4216df9e549bb5a6105709667c2bd9151de
Binary files /dev/null and b/docs/_static/composability/diagram_q2.png differ
diff --git a/docs/conf.py b/docs/conf.py
new file mode 100644
index 0000000000000000000000000000000000000000..e82983c975bdeb7ff91ee0d3308a020eb0dd8b36
--- /dev/null
+++ b/docs/conf.py
@@ -0,0 +1,50 @@
+"""Configuration for sphinx."""
+# Configuration file for the Sphinx documentation builder.
+#
+# For the full list of built-in configuration values, see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+# -- Path setup --------------------------------------------------------------
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+import os
+import sys
+
+import sphinx_rtd_theme  # noqa: F401
+
+sys.path.insert(0, os.path.abspath("../"))
+
+# -- Project information -----------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+
+
+project = "GPT Index"
+copyright = "2022, Jerry Liu"
+author = "Jerry Liu"
+
+# -- General configuration ---------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+
+extensions = [
+    "sphinx.ext.autodoc",
+    "sphinx.ext.coverage",
+    "sphinx.ext.autodoc.typehints",
+    "sphinx.ext.autosummary",
+    "sphinx.ext.napoleon",
+    "sphinx_rtd_theme",
+    "sphinx.ext.mathjax",
+    "myst_parser",
+]
+
+templates_path = ["_templates"]
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
+
+
+# -- Options for HTML output -------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+
+html_theme = "sphinx_rtd_theme"
+html_static_path = ["_static"]
diff --git a/docs/getting_started/installation.md b/docs/getting_started/installation.md
new file mode 100644
index 0000000000000000000000000000000000000000..676e9d36b94ce284ae0bcd7a8b7ac16ea6f9225d
--- /dev/null
+++ b/docs/getting_started/installation.md
@@ -0,0 +1,20 @@
+# Installation and Setup
+
+### Installation from Pip
+
+You can simply do:
+```
+pip install gpt-index
+```
+
+### Installation from Source
+Git clone this repository: `git clone git@github.com:jerryjliu/gpt_index.git`. Then do:
+
+- `pip install -e .` if you want to do an editable install (you can modify source files) of just the package itself.
+- `pip install -r requirements.txt` if you want to install optional dependencies + dependencies used for development (e.g. unit testing).
+
+
+### Environment Setup
+
+By default, we use the OpenAI GPT-3 `text-davinci-002` model. In order to use this, you must have an OPENAI_API_KEY set up.
+You can register an API key by logging into [OpenAI's page and creating a new API token](https://beta.openai.com/account/api-keys).
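On macOS/Linux, one way to make the key available is to export it in your shell before running any GPT Index code. The key value below is a placeholder, not a real token:

```shell
# Make the API key available to the current shell session.
# Replace the placeholder with your actual key from the OpenAI dashboard.
export OPENAI_API_KEY="sk-your-key-here"
```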
diff --git a/docs/getting_started/overview.rst b/docs/getting_started/overview.rst
new file mode 100644
index 0000000000000000000000000000000000000000..25a789dafe51c43a85f3af6eeedb0aef11a25a1a
--- /dev/null
+++ b/docs/getting_started/overview.rst
@@ -0,0 +1,4 @@
+Overview
+=====================================
+
+This section shows you how to quickly get up and running with GPT Index.
diff --git a/docs/getting_started/starter_example.md b/docs/getting_started/starter_example.md
new file mode 100644
index 0000000000000000000000000000000000000000..d75a8532a78d49b0189193b8372c604b529818d3
--- /dev/null
+++ b/docs/getting_started/starter_example.md
@@ -0,0 +1,69 @@
+# Starter Tutorial
+
+Here is a starter example for using GPT Index. Make sure you've followed the [installation](installation.md) steps first.
+
+
+### Download
+GPT Index examples can be found in the `examples` folder of the GPT Index repository. 
+We first want to download this `examples` folder. An easy way to do this is to just clone the repo: 
+
+```bash
+$ git clone git@github.com:jerryjliu/gpt_index.git
+```
+
+Next, navigate to your newly-cloned repository, and verify the contents:
+
+```bash
+$ cd gpt_index
+$ ls
+LICENSE                data_requirements.txt  tests/
+MANIFEST.in            examples/              pyproject.toml
+Makefile               experimental/          requirements.txt
+README.md              gpt_index/             setup.py
+```
+
+
+We now want to navigate to the following folder:
+```bash
+$ cd examples/paul_graham_essay
+```
+
+This contains GPT Index examples around Paul Graham's essay, ["What I Worked On"](http://paulgraham.com/worked.html). A comprehensive set of examples is already provided in `TestEssay.ipynb`. For the purposes of this tutorial, we can focus on a simple example of getting GPT Index up and running.
+
+
+### Build and Query Index
+Create a new `.py` file with the following:
+
+```python
+from gpt_index import GPTTreeIndex, SimpleDirectoryReader
+from IPython.display import Markdown, display
+
+documents = SimpleDirectoryReader('data').load_data()
+index = GPTTreeIndex(documents)
+```
+
+This builds an index over the documents in the `data` folder (which in this case just consists of the essay text). We then run the following:
+```python
+response = index.query("What did the author do growing up?")
+print(response)
+```
+
+You should get back a response similar to the following: `The author wrote short stories and tried to program on an IBM 1401.`
+
+### Saving and Loading
+
+To save to disk and load from disk, do:
+
+```python
+# save to disk
+index.save_to_disk('index.json')
+# load from disk
+index = GPTTreeIndex.load_from_disk('index.json')
+```
+
+
+### Next Steps
+
+That's it! For more information on GPT Index features, please check out the numerous "How-To Guides" to the left.
+Additionally, if you would like to play around with Example Notebooks, check out [this link](/reference/example_notebooks.rst).
+
diff --git a/docs/how_to/composability.md b/docs/how_to/composability.md
new file mode 100644
index 0000000000000000000000000000000000000000..6b46516f32eb49cb6d2d7d569be7d128348f918d
--- /dev/null
+++ b/docs/how_to/composability.md
@@ -0,0 +1,53 @@
+# Composability
+
+
+GPT Index offers **composability** of your indices, meaning that you can build indices on top of other indices. This allows you to more effectively index your entire document tree in order to feed custom knowledge to the LLM.
+
+Composability allows you to define lower-level indices for each document, and higher-order indices over a collection of documents. For example, you might define 1) a tree index for the text within each document, and 2) a list index over each tree index (one document) within your collection.
+
+To see how this works, imagine you have 3 documents: `doc1`, `doc2`, and `doc3`.
+
+```python
+doc1 = SimpleDirectoryReader('data1').load_data()
+doc2 = SimpleDirectoryReader('data2').load_data()
+doc3 = SimpleDirectoryReader('data3').load_data()
+```
+
+![](/_static/composability/diagram_b0.png)
+
+Now let's define a tree index for each document. In Python, we have:
+
+```python
+index1 = GPTTreeIndex(doc1)
+index2 = GPTTreeIndex(doc2)
+index3 = GPTTreeIndex(doc3)
+```
+
+![](/_static/composability/diagram_b1.png)
+
+We can then create a list index on these 3 tree indices:
+
+```python
+list_index = GPTListIndex([index1, index2, index3])
+```
+
+![](/_static/composability/diagram.png)
+
+During a query, we would start with the top-level list index. Each node in the list corresponds to an underlying tree index. 
+
+```python
+response = list_index.query("Where did the author grow up?")
+```
+
+![](/_static/composability/diagram_q1.png)
+
+So within a node, instead of fetching the text, we would recursively query the stored tree index to retrieve our answer.
+
+![](/_static/composability/diagram_q2.png)
+
+NOTE: You can stack indices as many times as you want, depending on the hierarchies of your knowledge base! 
+
+
+We can take a look at a code example below as well. We first build two tree indices, one over the Wikipedia NYC page, and the other over Paul Graham's essay. We then define a keyword table index over the two tree indices.
+
+[Here is an example notebook](https://github.com/jerryjliu/gpt_index/blob/main/examples/composable_indices/ComposableIndices.ipynb).
diff --git a/docs/how_to/cost_analysis.md b/docs/how_to/cost_analysis.md
new file mode 100644
index 0000000000000000000000000000000000000000..891060b2438baae0277fc2b68a8aa5e7c0528e44
--- /dev/null
+++ b/docs/how_to/cost_analysis.md
@@ -0,0 +1,39 @@
+# Cost Analysis
+
+Each call to an LLM will cost some amount of money - for instance, OpenAI's Davinci costs $0.02 / 1k tokens. The cost of building an index and querying depends on 
+1. the type of LLM used
+2. the type of data structure used
+3. parameters used during building 
+4. parameters used during querying
+
+The cost of building and querying each index is a TODO in the reference documentation. In the meantime, here is a high-level overview of the cost structure of the indices.
+
+### Index Building
+
+
+#### Indices with no LLM calls
+The following indices don't require LLM calls at all during building (0 cost):
+- `GPTListIndex`
+- `GPTSimpleKeywordTableIndex` - uses a regex keyword extractor to extract keywords from each document
+- `GPTRAKEKeywordTableIndex` - uses a RAKE keyword extractor to extract keywords from each document
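As a rough illustration of why these indices cost nothing to build, here is a toy regex-based keyword extractor in the spirit of what `GPTSimpleKeywordTableIndex` relies on. This is our own minimal sketch, not the library's actual implementation, and the stopword list is a made-up minimal one:

```python
import re

# A tiny toy stopword list; a real extractor would use a much fuller set.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is"}

def extract_keywords(text: str, max_keywords: int = 10) -> list:
    """Extract candidate keywords from text with a simple regex - no LLM calls."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    keywords = []
    for word in words:
        if word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords[:max_keywords]
```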
+
+#### Indices with LLM calls
+The following indices do require LLM calls during build time:
+- `GPTTreeIndex` - use LLM to hierarchically summarize the text to build the tree
+- `GPTKeywordTableIndex` - use LLM to extract keywords from each document
+
+
+### Query Time
+
+There will always be >= 1 LLM call during query time, in order to synthesize the final answer. 
+Some indices contain cost tradeoffs between index building and querying. `GPTListIndex`, for instance,
+is free to build, but running a query over a list index (without filtering or embedding lookups) will
+call the LLM {math}`N` times.
+
+Here are some notes regarding each of the indices:
+- `GPTListIndex`: by default requires {math}`N` LLM calls, where N is the number of nodes.
+    - However, can do `index.query(..., keyword="<keyword>")` to filter out nodes that don't contain the keyword
+- `GPTTreeIndex`: by default requires {math}`\log (N)` LLM calls, where N is the number of leaf nodes. 
+    - Setting `child_branch_factor=2` will be more expensive than the default `child_branch_factor=1` (polynomial vs logarithmic), because we traverse 2 children instead of just 1 for each parent node.
+- `GPTKeywordTableIndex`: by default requires an LLM call to extract query keywords.
+    - Can do `index.query(..., mode="simple")` or `index.query(..., mode="rake")` to also use regex/RAKE keyword extractors on your query text.
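To make the call counts above concrete, here is a back-of-the-envelope sketch of query-time LLM calls for a list index versus a tree index. This is our own simplified model, not library code, and it assumes a hypothetical default of 10 children per parent node in the tree:

```python
def estimated_query_calls(num_nodes: int, index_type: str, num_children: int = 10) -> int:
    """Rough estimate of LLM calls for a single query, under a simplified model."""
    if index_type == "list":
        # A plain list query creates/refines an answer over every node once.
        return num_nodes
    if index_type == "tree":
        # One call per tree level while descending from the root toward a leaf,
        # i.e. roughly log base `num_children` of the number of leaf nodes.
        calls = 1
        reachable = num_children
        while reachable < num_nodes:
            calls += 1
            reachable *= num_children
        return calls
    raise ValueError(f"unknown index type: {index_type}")
```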
\ No newline at end of file
diff --git a/docs/how_to/custom_llms.md b/docs/how_to/custom_llms.md
new file mode 100644
index 0000000000000000000000000000000000000000..99581ef4bd64b596023084c9a1df884fcb05873f
--- /dev/null
+++ b/docs/how_to/custom_llms.md
@@ -0,0 +1,43 @@
+# Defining LLMs
+
+The goal of GPT Index is to provide a toolkit of data structures that can organize external information in a manner that 
+is easily compatible with the prompt limitations of an LLM. Therefore LLMs are always used to construct the final
+answer.
+Depending on the [type of index](/reference/indices.rst) being used,
+LLMs may also be used during index construction, insertion, and query traversal.
+
+GPT Index uses Langchain's [LLM](https://langchain.readthedocs.io/en/latest/modules/llms.html) 
+and [LLMChain](https://langchain.readthedocs.io/en/latest/modules/chains.html) module to define
+the underlying abstraction. We introduce a wrapper class, 
+[`LLMPredictor`](/reference/llm_predictor.rst), for integration into GPT Index.
+
+By default, we use OpenAI's `text-davinci-002` model. But you may choose to customize
+the underlying LLM being used.
+
+
+## Example
+
+An example snippet of customizing the LLM being used is shown below. 
+In this example, we use `text-davinci-003` instead of `text-davinci-002`. Note that 
+you may plug in any LLM shown on Langchain's 
+[LLM](https://langchain.readthedocs.io/en/latest/modules/llms.html) page.
+
+
+```python
+
+from gpt_index import GPTKeywordTableIndex, SimpleDirectoryReader, LLMPredictor
+from langchain import OpenAI
+
+# define LLM
+llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))
+
+# load index from disk
+index = GPTKeywordTableIndex.load_from_disk('index_table.json', llm_predictor=llm_predictor)
+
+# get response from query
+response = index.query("What did the author do after his time at Y Combinator?")
+
+```
+
+In this snippet, the index has already been created and saved to disk. We load
+the existing index, and swap in a new `LLMPredictor` that is used during query time.
\ No newline at end of file
diff --git a/docs/how_to/custom_prompts.md b/docs/how_to/custom_prompts.md
new file mode 100644
index 0000000000000000000000000000000000000000..f1cf36c14e29799c8cd93e5a05f935f57d8e193e
--- /dev/null
+++ b/docs/how_to/custom_prompts.md
@@ -0,0 +1,54 @@
+# Defining Prompts
+
+Prompting is the fundamental input that gives LLMs their expressive power. GPT Index uses prompts to build the index, do insertion, 
+perform traversal during querying, and to synthesize the final answer.
+
+GPT Index uses a finite set of *prompt types*, described [here](/reference/prompts.rst). 
+All index classes, along with their associated queries, utilize a subset of these prompts. The user may provide their own prompt.
+If the user does not provide their own prompt, default prompts are used.
+
+An API reference of all index classes and query classes is found below. The definition of each index class and query
+contains optional prompts that the user may pass in.
+- [Indices](/reference/indices.rst)
+- [Queries](/reference/query.rst)
+
+
+### Example
+
+An example can be found in [this notebook](https://github.com/jerryjliu/gpt_index/blob/main/examples/paul_graham_essay/TestEssay.ipynb).
+
+The corresponding snippet is below. We show how to define a custom Summarization Prompt that not only
+contains a `text` field, but also a `query_str` field, during construction of `GPTTreeIndex`, so that 
+the answer to the query can be simply synthesized from the root nodes.
+
+```python
+
+from gpt_index import Prompt, GPTTreeIndex, SimpleDirectoryReader
+
+# load documents
+documents = SimpleDirectoryReader('data').load_data()
+# define custom prompt
+query_str = "What did the author do growing up?"
+summary_prompt_tmpl = (
+    "Context information is below. \n"
+    "---------------------\n"
+    "{text}"
+    "\n---------------------\n"
+    "Given the context information and not prior knowledge, "
+    "answer the question: {query_str}\n"
+)
+
+summary_prompt = Prompt(
+    input_variables=["query_str", "text"],
+    template=summary_prompt_tmpl
+)
+# Build GPTTreeIndex: pass in custom prompt, also pass in query_str
+index_with_query = GPTTreeIndex(documents, summary_template=summary_prompt, query_str=query_str)
+
+```
+
+Once the index is built, we can retrieve our answer:
+```python
+# directly retrieve response from root nodes instead of traversing tree
+response = index_with_query.query(query_str, mode="retrieve")
+```
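Under the hood, filling a prompt template with input variables behaves much like standard Python string formatting. This standalone sketch (no GPT Index imports, purely illustrative) shows how the `{text}` and `{query_str}` fields above get substituted:

```python
# A standalone illustration of template filling; GPT Index's Prompt class
# wraps this kind of substitution (plus validation of input_variables).
summary_prompt_tmpl = (
    "Context information is below. \n"
    "---------------------\n"
    "{text}"
    "\n---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

filled = summary_prompt_tmpl.format(
    text="The author grew up writing short stories.",
    query_str="What did the author do growing up?",
)
print(filled)
```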
diff --git a/docs/how_to/data_connectors.md b/docs/how_to/data_connectors.md
new file mode 100644
index 0000000000000000000000000000000000000000..d9b39b3ba974f8a0fdd55ca86876c0f86c2d6c95
--- /dev/null
+++ b/docs/how_to/data_connectors.md
@@ -0,0 +1,13 @@
+# Data Connectors
+
+We currently offer connectors into the following data sources. External data sources are retrieved through their APIs and a corresponding authentication token.
+The API reference documentation can be found [here](/reference/readers.rst).
+
+- [Notion](https://developers.notion.com/) (`NotionPageReader`)
+- [Google Docs](https://developers.google.com/docs/api) (`GoogleDocsReader`)
+- [Slack](https://api.slack.com/) (`SlackReader`)
+- MongoDB (`SimpleMongoReader`)
+- Wikipedia (`WikipediaReader`)
+- local file directory (`SimpleDirectoryReader`)
+
+We offer [example notebooks of connecting to different data sources](https://github.com/jerryjliu/gpt_index/tree/main/examples/data_connectors). Please check them out!
\ No newline at end of file
diff --git a/docs/how_to/embeddings.md b/docs/how_to/embeddings.md
new file mode 100644
index 0000000000000000000000000000000000000000..cc713330d6568d9076fa11b849e41998f0d20205
--- /dev/null
+++ b/docs/how_to/embeddings.md
@@ -0,0 +1,27 @@
+# Embedding support
+
+GPT Index provides embedding support to our tree and list indices. In addition to each node storing text, each node can optionally store an embedding.
+During query-time, we can use embeddings to do max-similarity retrieval of nodes before calling the LLM to synthesize an answer. 
+Since similarity lookup using embeddings (e.g. cosine similarity) does not require an LLM call, embeddings serve as a cheaper lookup mechanism than
+using LLMs to traverse nodes.
+
+NOTE: we currently support OpenAI embeddings. External embeddings are coming soon!
+
+**How are Embeddings Generated?**
+
+Embeddings are lazily generated and then cached at query time (if `mode="embedding"` is specified during `index.query`), not during index construction.
+This design choice prevents the need to generate embeddings for all text chunks during index construction.
+
+**Embedding Lookups**
+
+For the list index:
+- We iterate through every node in the list, and identify the top k nodes through embedding similarity. We use these nodes to synthesize an answer.
+- See the [List Query API](/reference/indices/list_query.rst) for more details.
+
+For the tree index:
+- We start with the root nodes, and traverse down the tree by picking the child node through embedding similarity.
+- See the [Tree Query API](/reference/indices/tree_query.rst) for more details.
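The max-similarity retrieval step itself is plain vector math and involves no LLM. Here is a minimal sketch using made-up 3-dimensional embeddings (real OpenAI embeddings have far more dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_nodes(query_emb, node_embs, k=1):
    """Return indices of the k node embeddings most similar to the query embedding."""
    ranked = sorted(
        range(len(node_embs)),
        key=lambda i: cosine_similarity(query_emb, node_embs[i]),
        reverse=True,
    )
    return ranked[:k]
```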
+
+**Example Notebook**
+
+An example notebook is given [here](https://github.com/jerryjliu/gpt_index/blob/main/examples/test_wiki/TestNYC_Embeddings.ipynb).
+
diff --git a/docs/how_to/insert.md b/docs/how_to/insert.md
new file mode 100644
index 0000000000000000000000000000000000000000..3b28eb8f7ad06bbbd1219ad272d7a998bbf1c9fd
--- /dev/null
+++ b/docs/how_to/insert.md
@@ -0,0 +1,5 @@
+# Insert Capabilities
+
+Every GPT Index data structure allows insertion.
+
+An example notebook showcasing our insert capabilities is given [here](https://github.com/jerryjliu/gpt_index/blob/main/examples/paul_graham_essay/InsertDemo.ipynb).
\ No newline at end of file
diff --git a/docs/how_to/overview.rst b/docs/how_to/overview.rst
new file mode 100644
index 0000000000000000000000000000000000000000..1c6cc0c6d12bc104e11e000c590647526c733432
--- /dev/null
+++ b/docs/how_to/overview.rst
@@ -0,0 +1,4 @@
+Overview
+=====================================
+
+The how-to section contains guides on some of the core features of GPT Index:
diff --git a/docs/index.rst b/docs/index.rst
new file mode 100644
index 0000000000000000000000000000000000000000..081304af61e8a3a0cca5429b14c500389214b5d7
--- /dev/null
+++ b/docs/index.rst
@@ -0,0 +1,75 @@
+.. GPT Index documentation master file, created by
+   sphinx-quickstart on Sun Dec 11 14:30:34 2022.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to GPT Index!
+=====================================
+
+GPT Index is a project consisting of a set of data structures that are created using LLMs and can be traversed using LLMs in order to answer queries.
+
+The GitHub project page is here: https://github.com/jerryjliu/gpt_index.
+
+The PyPI package is here: https://pypi.org/project/gpt-index/.
+
+
+🚀 Overview
+-----------
+
+Context
+^^^^^^^
+- LLMs are a phenomenal technology for knowledge generation and reasoning.
+- A big limitation of LLMs is context size (e.g. Davinci's limit is 4096 tokens. Large, but not infinite).
+- The ability to feed "knowledge" to LLMs is restricted to this limited prompt size and model weights.
+- **Thought**: What if LLMs could have access to a potentially much larger database of knowledge without retraining/finetuning? 
+
+Proposed Solution
+^^^^^^^^^^^^^^^^^
+That's where the **GPT Index** comes in. GPT Index is a simple, flexible interface between your external data and LLMs. It resolves the following pain points:
+
+- Provides simple data structures to resolve prompt size limitations.
+- Offers data connectors to your external data sources.
+- Offers a comprehensive toolset for trading off cost and performance.
+
+At the core of GPT Index is a **data structure**. Instead of relying on world knowledge encoded in the model weights, a GPT Index data structure does the following:
+
+- Uses a pre-trained LLM primarily for *reasoning*/*summarization* instead of prior knowledge.
+- Takes as input a large corpus of text data and builds a structured index over it (using an LLM or heuristics).
+- Allows users to *query* the index in order to synthesize an answer to the question - this requires both *traversal* of the index as well as a synthesis of the answer.
+
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Getting Started
+
+   getting_started/overview.rst
+   getting_started/installation.md
+   getting_started/starter_example.md
+
+
+.. toctree::
+   :maxdepth: 1
+   :caption: How To
+
+   how_to/overview.rst
+   how_to/data_connectors.md
+   how_to/composability.md
+   how_to/insert.md
+   how_to/cost_analysis.md
+   how_to/embeddings.md
+   how_to/custom_prompts.md
+   how_to/custom_llms.md
+
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Reference
+
+   reference/overview.rst
+   reference/indices.rst
+   reference/query.rst
+   reference/readers.rst
+   reference/prompts.rst
+   reference/example_notebooks.rst
+   reference/llm_predictor.rst
+
diff --git a/docs/make.bat b/docs/make.bat
new file mode 100644
index 0000000000000000000000000000000000000000..32bb24529f92346af26219baed295b7488b77534
--- /dev/null
+++ b/docs/make.bat
@@ -0,0 +1,35 @@
+@ECHO OFF
+
+pushd %~dp0
+
+REM Command file for Sphinx documentation
+
+if "%SPHINXBUILD%" == "" (
+	set SPHINXBUILD=sphinx-build
+)
+set SOURCEDIR=.
+set BUILDDIR=_build
+
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+	echo.
+	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+	echo.installed, then set the SPHINXBUILD environment variable to point
+	echo.to the full path of the 'sphinx-build' executable. Alternatively you
+	echo.may add the Sphinx directory to PATH.
+	echo.
+	echo.If you don't have Sphinx installed, grab it from
+	echo.https://www.sphinx-doc.org/
+	exit /b 1
+)
+
+if "%1" == "" goto help
+
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+goto end
+
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+
+:end
+popd
diff --git a/docs/reference/example_notebooks.rst b/docs/reference/example_notebooks.rst
new file mode 100644
index 0000000000000000000000000000000000000000..df5e3e58bc5b72c2174fc3112c628060d58f0ae3
--- /dev/null
+++ b/docs/reference/example_notebooks.rst
@@ -0,0 +1,8 @@
+.. _Ref-Example-Notebooks:
+
+Example Notebooks
+=================
+
+We offer a wide variety of example notebooks. They are referenced throughout the documentation.
+
+Example notebooks are found `here <https://github.com/jerryjliu/gpt_index/tree/main/examples>`_.
\ No newline at end of file
diff --git a/docs/reference/indices.rst b/docs/reference/indices.rst
new file mode 100644
index 0000000000000000000000000000000000000000..89a8b7fcd39fc52abfee6257b5b11f1b7569abb9
--- /dev/null
+++ b/docs/reference/indices.rst
@@ -0,0 +1,15 @@
+.. _Ref-Indices:
+
+Indices
+=======
+
+This doc shows the overarching classes used to represent an index. These
+classes allow for index creation, insertion, and also querying.
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Index Data Structures
+
+   indices/list.rst
+   indices/table.rst
+   indices/tree.rst
diff --git a/docs/reference/indices/list.rst b/docs/reference/indices/list.rst
new file mode 100644
index 0000000000000000000000000000000000000000..65c0ea734f6a9a35acb499174c7118d4e7cf35d0
--- /dev/null
+++ b/docs/reference/indices/list.rst
@@ -0,0 +1,9 @@
+List Index
+==========
+
+Building the List Index
+
+.. automodule:: gpt_index.indices.list
+   :members:
+   :inherited-members:
+   :exclude-members: delete, docstore, index_struct, index_struct_cls
\ No newline at end of file
diff --git a/docs/reference/indices/list_query.rst b/docs/reference/indices/list_query.rst
new file mode 100644
index 0000000000000000000000000000000000000000..fa65e44982c4ab5ad49c754966ad81069b627117
--- /dev/null
+++ b/docs/reference/indices/list_query.rst
@@ -0,0 +1,7 @@
+Querying a List Index
+=====================
+
+.. automodule:: gpt_index.indices.query.list
+   :members:
+   :inherited-members:
+   :exclude-members: index_struct, query, set_llm_predictor, set_prompt_helper
\ No newline at end of file
diff --git a/docs/reference/indices/table.rst b/docs/reference/indices/table.rst
new file mode 100644
index 0000000000000000000000000000000000000000..bdda65a5ddf297762578faf155a827f287ef3c86
--- /dev/null
+++ b/docs/reference/indices/table.rst
@@ -0,0 +1,8 @@
+Table Index
+===========
+
+Building the Keyword Table Index
+
+.. automodule:: gpt_index.indices.keyword_table
+   :members:
+   :inherited-members:
\ No newline at end of file
diff --git a/docs/reference/indices/table_query.rst b/docs/reference/indices/table_query.rst
new file mode 100644
index 0000000000000000000000000000000000000000..abcb62a70114e605846d26ce53b2e8405e879f9b
--- /dev/null
+++ b/docs/reference/indices/table_query.rst
@@ -0,0 +1,7 @@
+Querying a Keyword Table Index
+==============================
+
+.. automodule:: gpt_index.indices.query.keyword_table
+   :members:
+   :inherited-members:
+   :exclude-members: index_struct, query, set_llm_predictor, set_prompt_helper
\ No newline at end of file
diff --git a/docs/reference/indices/tree.rst b/docs/reference/indices/tree.rst
new file mode 100644
index 0000000000000000000000000000000000000000..5e582f92a6eadefc0c678e34e2c0f984800c2a8b
--- /dev/null
+++ b/docs/reference/indices/tree.rst
@@ -0,0 +1,8 @@
+Tree Index
+==========
+
+Building the Tree Index
+
+.. automodule:: gpt_index.indices.tree
+   :members:
+   :inherited-members:
\ No newline at end of file
diff --git a/docs/reference/indices/tree_query.rst b/docs/reference/indices/tree_query.rst
new file mode 100644
index 0000000000000000000000000000000000000000..46f5e67f520427500aac49459cf98ad5b3e0a223
--- /dev/null
+++ b/docs/reference/indices/tree_query.rst
@@ -0,0 +1,7 @@
+Querying a Tree Index
+=====================
+
+.. automodule:: gpt_index.indices.query.tree
+   :members:
+   :inherited-members:
+   :exclude-members: index_struct, query, set_llm_predictor, set_prompt_helper
\ No newline at end of file
diff --git a/docs/reference/llm_predictor.rst b/docs/reference/llm_predictor.rst
new file mode 100644
index 0000000000000000000000000000000000000000..ffc096c78b795ee7b096b16b5b931c54930c092b
--- /dev/null
+++ b/docs/reference/llm_predictor.rst
@@ -0,0 +1,10 @@
+.. _Ref-LLM-Predictor:
+
+LLMPredictor
+=================
+
+Our LLMPredictor is a wrapper around Langchain's `LLMChain` that allows easy integration into GPT Index.
+
+.. automodule:: gpt_index.langchain_helpers.chain_wrapper
+   :members:
+   :inherited-members:
diff --git a/docs/reference/overview.rst b/docs/reference/overview.rst
new file mode 100644
index 0000000000000000000000000000000000000000..b5ef28e5bfc0bcdb6bc7bebfbc0c25dfa5fb8a93
--- /dev/null
+++ b/docs/reference/overview.rst
@@ -0,0 +1,4 @@
+Overview
+=====================================
+
+The reference section contains comprehensive API documentation for all index data structures and query modes.
diff --git a/docs/reference/prompts.rst b/docs/reference/prompts.rst
new file mode 100644
index 0000000000000000000000000000000000000000..eb8c0cfdaca4dd6aaff86eb3ae7175a62e17a8b0
--- /dev/null
+++ b/docs/reference/prompts.rst
@@ -0,0 +1,70 @@
+.. _Prompt-Templates:
+
+Prompt Templates
+=================
+
+These are the reference prompt templates.
+We first document each prompt, along with its required input variables.
+
+We then show the base prompt class,
+derived from `Langchain <https://langchain.readthedocs.io/en/latest/modules/prompt.html>`_.
+
+
+**Summarization Prompt**
+
+- Prompt to summarize the provided `text`.
+- input variables: `["text"]`
+
+**Tree Insert Prompt**
+
+- Prompt to insert a new chunk of text `new_chunk_text` into the tree index. More specifically,
+    this prompt has the LLM select the relevant candidate child node to continue tree traversal.
+- input variables: `["num_chunks", "context_list", "new_chunk_text"]`
+
+**Question-Answer Prompt**
+
+- Prompt to answer a question `query_str` given a context `context_str`.
+- input variables: `["context_str", "query_str"]`
+
+**Refinement Prompt**
+
+- Prompt to refine an existing answer `existing_answer` given a context `context_msg`,
+    and a query `query_str`.
+- input variables: `["query_str", "existing_answer", "context_msg"]`
+
+**Keyword Extraction Prompt**
+
+- Prompt to extract keywords from a text `text` with a maximum of `max_keywords` keywords.
+- input variables: `["text", "max_keywords"]`
+
+**Query Keyword Extraction Prompt**
+
+- Prompt to extract keywords from a query `query_str` with a maximum of `max_keywords` keywords.
+- input variables: `["question", "max_keywords"]`
+
+
+**Tree Select Query Prompt**
+
+- Prompt to select a candidate child node out of all child nodes provided in `context_list`,
+    given a query `query_str`. `num_chunks` is the number of child nodes in `context_list`.
+
+- input variables: `["num_chunks", "context_list", "query_str"]`
+
+
+**Tree Select Query Prompt (Multiple)**
+
+- Prompt to select multiple candidate child nodes out of all child nodes provided in `context_list`,
+    given a query `query_str`. `branching_factor` refers to the number of child nodes to select, and
+    `num_chunks` is the number of child nodes in `context_list`.
+
+- input variables: `["num_chunks", "context_list", "query_str", "branching_factor"]`
+
+
+**Base Prompt Class**
+
+.. automodule:: gpt_index.prompts
+   :members:
+   :inherited-members:
+   :exclude-members: Config, construct, copy, dict, from_examples, from_file, get_full_format_args, output_parser, save, template, template_format, update_forward_refs, validate_variable_names, json, template_is_valid
+
+
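Each prompt above pairs a template with a fixed set of input variables. As an aside, the idea can be sketched with a plain format string; this is a toy illustration of the template/variable contract, not the library's actual `Prompt` class:

```python
# Toy sketch: a prompt template with declared input variables, mirroring
# how each prompt above pairs a template with its required variables.
# (Illustrative only -- not the actual gpt_index / Langchain Prompt class.)

QA_TEMPLATE = (
    "Context information is below.\n"
    "{context_str}\n"
    "Given the context, answer the question: {query_str}\n"
)
QA_INPUT_VARIABLES = ["context_str", "query_str"]


def format_prompt(template: str, input_variables: list, **kwargs: str) -> str:
    """Fill in a template, checking that all declared variables are provided."""
    missing = set(input_variables) - set(kwargs)
    if missing:
        raise ValueError(f"Missing prompt variables: {missing}")
    return template.format(**kwargs)


formatted = format_prompt(
    QA_TEMPLATE,
    QA_INPUT_VARIABLES,
    context_str="Paris is the capital of France.",
    query_str="What is the capital of France?",
)
```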
diff --git a/docs/reference/query.rst b/docs/reference/query.rst
new file mode 100644
index 0000000000000000000000000000000000000000..84525694f72a6fd9a4f6fa60590b41146c4c1d3f
--- /dev/null
+++ b/docs/reference/query.rst
@@ -0,0 +1,14 @@
+.. _Ref-Query:
+
+Querying an Index
+=================
+
+This doc shows the classes used to query indices.
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Query classes
+
+   indices/list_query.rst
+   indices/table_query.rst
+   indices/tree_query.rst
diff --git a/docs/reference/readers.rst b/docs/reference/readers.rst
new file mode 100644
index 0000000000000000000000000000000000000000..8fc062f54396a9c346158c94e48c3040d4d4ee62
--- /dev/null
+++ b/docs/reference/readers.rst
@@ -0,0 +1,6 @@
+Data Connectors
+===============
+
+.. automodule:: gpt_index.readers
+   :members:
+   :inherited-members:
\ No newline at end of file
diff --git a/docs/requirements.txt b/docs/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..51d8970186396d0997c5375d934bf65ae05193a5
--- /dev/null
+++ b/docs/requirements.txt
@@ -0,0 +1,4 @@
+-e .
+sphinx
+sphinx_rtd_theme
+myst-parser
\ No newline at end of file
diff --git a/gpt_index/__init__.py b/gpt_index/__init__.py
index 53553541bbe855e007bb3ac25625298015ea9353..d024cd75d8a54731133847f1939582f8e5d40aaf 100644
--- a/gpt_index/__init__.py
+++ b/gpt_index/__init__.py
@@ -6,13 +6,14 @@ with open(Path(__file__).absolute().parents[0] / "VERSION") as _f:
     __version__ = _f.read().strip()
 
 
-from gpt_index.indices.keyword_table.base import GPTKeywordTableIndex
-from gpt_index.indices.keyword_table.rake_base import GPTRAKEKeywordTableIndex
-from gpt_index.indices.keyword_table.simple_base import GPTSimpleKeywordTableIndex
-from gpt_index.indices.list.base import GPTListIndex
-
 # indices
-from gpt_index.indices.tree.base import GPTTreeIndex
+from gpt_index.indices.keyword_table import (
+    GPTKeywordTableIndex,
+    GPTRAKEKeywordTableIndex,
+    GPTSimpleKeywordTableIndex,
+)
+from gpt_index.indices.list import GPTListIndex
+from gpt_index.indices.tree import GPTTreeIndex
 
 # langchain helper
 from gpt_index.langchain_helpers.chain_wrapper import LLMPredictor
diff --git a/gpt_index/indices/__init__.py b/gpt_index/indices/__init__.py
index 0eee67e42b4e6e9f4517dee43a39a8a4e15e6ccf..10f6e05ee4f8421b751e647fe3062d17da843799 100644
--- a/gpt_index/indices/__init__.py
+++ b/gpt_index/indices/__init__.py
@@ -1 +1,16 @@
-"""Init file for indices."""
+"""GPT Index data structures."""
+
+# indices
+from gpt_index.indices.keyword_table.base import GPTKeywordTableIndex
+from gpt_index.indices.keyword_table.rake_base import GPTRAKEKeywordTableIndex
+from gpt_index.indices.keyword_table.simple_base import GPTSimpleKeywordTableIndex
+from gpt_index.indices.list.base import GPTListIndex
+from gpt_index.indices.tree.base import GPTTreeIndex
+
+__all__ = [
+    "GPTKeywordTableIndex",
+    "GPTSimpleKeywordTableIndex",
+    "GPTRAKEKeywordTableIndex",
+    "GPTListIndex",
+    "GPTTreeIndex",
+]
diff --git a/gpt_index/indices/base.py b/gpt_index/indices/base.py
index 43c7eab83023bdf1c95f59bf265a72ed0309f34d..915fddbd12c301be024e5530cbd5a72e8a049b72 100644
--- a/gpt_index/indices/base.py
+++ b/gpt_index/indices/base.py
@@ -152,7 +152,16 @@ class BaseGPTIndex(Generic[IS]):
         mode: str = DEFAULT_MODE,
         **query_kwargs: Any
     ) -> str:
-        """Answer a query."""
+        """Answer a query.
+
+        When `query` is called, we query the index with the given `mode` and
+        `query_kwargs`. The `mode` determines the type of query to run, and
+        `query_kwargs` are parameters that are specific to the query type.
+
+        For comprehensive documentation of the available `mode` and
+        `query_kwargs` options for a given index, please visit :ref:`Ref-Query`.
+
+        """
         # TODO: remove _mode_to_query and consolidate with query_runner
         if mode == "recursive":
             if "query_configs" not in query_kwargs:
@@ -175,7 +185,20 @@ class BaseGPTIndex(Generic[IS]):
 
     @classmethod
     def load_from_disk(cls, save_path: str, **kwargs: Any) -> "BaseGPTIndex":
-        """Load from disk."""
+        """Load index from disk.
+
+        This method loads the index from a JSON file stored on disk. The index data
+        structure itself is preserved completely. If the index is defined over
+        subindices, those subindices will also be preserved (and subindices of
+        those subindices, etc.).
+
+        Args:
+            save_path (str): The save_path of the file.
+
+        Returns:
+            BaseGPTIndex: The loaded index.
+
+        """
         with open(save_path, "r") as f:
             result_dict = json.load(f)
             index_struct = cls.index_struct_cls.from_dict(result_dict["index_struct"])
@@ -183,7 +206,14 @@ class BaseGPTIndex(Generic[IS]):
             return cls(index_struct=index_struct, docstore=docstore, **kwargs)
 
     def save_to_disk(self, save_path: str) -> None:
-        """Safe to file."""
+        """Save to file.
+
+        This method stores the index into a JSON file stored on disk.
+
+        Args:
+            save_path (str): The save_path of the file.
+
+        """
         out_dict: Dict[str, dict] = {
             "index_struct": self.index_struct.to_dict(),
             "docstore": self.docstore.to_dict(),
diff --git a/gpt_index/indices/keyword_table/__init__.py b/gpt_index/indices/keyword_table/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..43a973b9b0920e2d6a499bee82a9aee62fe2321f 100644
--- a/gpt_index/indices/keyword_table/__init__.py
+++ b/gpt_index/indices/keyword_table/__init__.py
@@ -1 +1,12 @@
-"""Init file."""
+"""Keyword Table Index Data Structures."""
+
+# indices
+from gpt_index.indices.keyword_table.base import GPTKeywordTableIndex
+from gpt_index.indices.keyword_table.rake_base import GPTRAKEKeywordTableIndex
+from gpt_index.indices.keyword_table.simple_base import GPTSimpleKeywordTableIndex
+
+__all__ = [
+    "GPTKeywordTableIndex",
+    "GPTSimpleKeywordTableIndex",
+    "GPTRAKEKeywordTableIndex",
+]
diff --git a/gpt_index/indices/keyword_table/base.py b/gpt_index/indices/keyword_table/base.py
index 640f40217bbe927047ed17857792bdaf0fbf04dc..70f1556dc85f0fd7449176371c5286c8d5dfabbc 100644
--- a/gpt_index/indices/keyword_table/base.py
+++ b/gpt_index/indices/keyword_table/base.py
@@ -39,7 +39,26 @@ DQKET = DEFAULT_QUERY_KEYWORD_EXTRACT_TEMPLATE
 
 
 class BaseGPTKeywordTableIndex(BaseGPTIndex[KeywordTable]):
-    """Base GPT Index."""
+    """GPT Keyword Table Index.
+
+    This index extracts keywords from the text, and maps each
+    keyword to the node(s) that it corresponds to. In this sense it mimics a
+    "hash table". During index construction, the keyword table is constructed
+    by extracting keywords from each node and creating an internal mapping.
+
+    During query time, the keywords are extracted from the query text, and these
+    keywords are used to index into the keyword table. The retrieved nodes
+    are then used to answer the query.
+
+    Args:
+        keyword_extract_template (Optional[Prompt]): A Keyword Extraction Prompt
+            (see :ref:`Prompt-Templates`).
+        max_keywords_per_query (int): The maximum number of keywords to extract
+            per query.
+        max_keywords_per_chunk (int): The maximum number of keywords to extract
+            per chunk.
+
+    """
 
     index_struct_cls = KeywordTable
 
@@ -151,7 +170,7 @@ class BaseGPTKeywordTableIndex(BaseGPTIndex[KeywordTable]):
 class GPTKeywordTableIndex(BaseGPTKeywordTableIndex):
     """GPT Keyword Table Index.
 
-    Uses GPT to build keyword table.
+    This index uses a GPT model to extract keywords from the text.
 
     """
 
diff --git a/gpt_index/indices/keyword_table/rake_base.py b/gpt_index/indices/keyword_table/rake_base.py
index b610e3a541ab3253a68ff67d8bfbff2e3a954d05..61b26ce6dc58e56ef751b4db489efa42582a4dc6 100644
--- a/gpt_index/indices/keyword_table/rake_base.py
+++ b/gpt_index/indices/keyword_table/rake_base.py
@@ -11,7 +11,11 @@ from gpt_index.indices.keyword_table.utils import rake_extract_keywords
 
 
 class GPTRAKEKeywordTableIndex(BaseGPTKeywordTableIndex):
-    """GPT Index."""
+    """GPT RAKE Keyword Table Index.
+
+    This index uses a RAKE keyword extractor to extract keywords from the text.
+
+    """
 
     def _extract_keywords(self, text: str) -> Set[str]:
         """Extract keywords from text."""
diff --git a/gpt_index/indices/keyword_table/simple_base.py b/gpt_index/indices/keyword_table/simple_base.py
index e8f506884bb0795a07821404333f614e13873c33..4d542c38da0d37d7504500a56ee9473f24f51eff 100644
--- a/gpt_index/indices/keyword_table/simple_base.py
+++ b/gpt_index/indices/keyword_table/simple_base.py
@@ -15,7 +15,11 @@ DQKET = DEFAULT_QUERY_KEYWORD_EXTRACT_TEMPLATE
 
 
 class GPTSimpleKeywordTableIndex(BaseGPTKeywordTableIndex):
-    """GPT Index."""
+    """GPT Simple Keyword Table Index.
+
+    This index uses a simple regex extractor to extract keywords from the text.
+
+    """
 
     def _extract_keywords(self, text: str) -> Set[str]:
         """Extract keywords from text."""
diff --git a/gpt_index/indices/list/__init__.py b/gpt_index/indices/list/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..b24c607f33df4130279f81016b5fdc2a22d19fd0 100644
--- a/gpt_index/indices/list/__init__.py
+++ b/gpt_index/indices/list/__init__.py
@@ -1 +1,7 @@
-"""Init file."""
+"""List-based data structures."""
+
+from gpt_index.indices.list.base import GPTListIndex
+
+__all__ = [
+    "GPTListIndex",
+]
diff --git a/gpt_index/indices/list/base.py b/gpt_index/indices/list/base.py
index e93f8a827de776cfc180e2ad786a5b9542edb01a..5eaeb598df9b88f4a91a2888d30a6afbf57caadb 100644
--- a/gpt_index/indices/list/base.py
+++ b/gpt_index/indices/list/base.py
@@ -29,7 +29,21 @@ GENERATE_TEXT_QUERY = "What is a concise summary of this document?"
 
 
 class GPTListIndex(BaseGPTIndex[IndexList]):
-    """GPT List Index."""
+    """GPT List Index.
+
+    The list index is a simple data structure where nodes are stored in
+    a sequence. During index construction, the document texts are
+    chunked up, converted to nodes, and stored in a list.
+
+    During query time, the list index iterates through the nodes
+    with some optional filter parameters, and synthesizes an
+    answer from all the nodes.
+
+    Args:
+        text_qa_template (Optional[Prompt]): A Question-Answer Prompt
+            (see :ref:`Prompt-Templates`).
+
+    """
 
     index_struct_cls = IndexList
 
@@ -69,7 +83,14 @@ class GPTListIndex(BaseGPTIndex[IndexList]):
     def build_index_from_documents(
         self, documents: Sequence[BaseDocument]
     ) -> IndexList:
-        """Build the index from documents."""
+        """Build the index from documents.
+
+        Args:
+            documents (List[BaseDocument]): A list of documents.
+
+        Returns:
+            IndexList: The created list index.
+        """
         text_splitter = self._prompt_helper.get_text_splitter_given_prompt(
             self.text_qa_template, 1
         )
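The construction described in the docstring above (chunk the document texts, turn chunks into nodes, keep them in sequence) can be sketched as a toy example. The real index uses a prompt-aware token splitter; this sketch uses a fixed character width, and the optional keyword filter simulates the Ctrl+F-style lookup the query classes support:

```python
# Toy sketch of list-index construction: split each document into chunks,
# store the chunks as nodes in order, and optionally filter nodes by keyword
# at query time. The real index uses a prompt-aware token splitter; this
# uses a fixed character width purely for illustration.


def build_list_index(documents: list, chunk_size: int = 20) -> list:
    nodes = []
    for text in documents:
        for start in range(0, len(text), chunk_size):
            nodes.append(text[start:start + chunk_size])
    return nodes


def filter_nodes(nodes: list, keyword: str = None) -> list:
    """Optional keyword filter -- simulates a Ctrl+F lookup over the nodes."""
    if keyword is None:
        return nodes
    return [n for n in nodes if keyword in n]


nodes = build_list_index(["alpha beta gamma delta epsilon zeta"], chunk_size=12)
```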
diff --git a/gpt_index/indices/query/keyword_table/__init__.py b/gpt_index/indices/query/keyword_table/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..9b0b308e6e04d88056cbab99ebbdc33a0fac5a3d 100644
--- a/gpt_index/indices/query/keyword_table/__init__.py
+++ b/gpt_index/indices/query/keyword_table/__init__.py
@@ -1 +1,13 @@
-"""Init file."""
+"""Query classes for keyword table indices."""
+
+from gpt_index.indices.query.keyword_table.query import (
+    GPTKeywordTableGPTQuery,
+    GPTKeywordTableRAKEQuery,
+    GPTKeywordTableSimpleQuery,
+)
+
+__all__ = [
+    "GPTKeywordTableGPTQuery",
+    "GPTKeywordTableRAKEQuery",
+    "GPTKeywordTableSimpleQuery",
+]
diff --git a/gpt_index/indices/query/keyword_table/query.py b/gpt_index/indices/query/keyword_table/query.py
index 98b3a5235d48c1914b15917626691ec0a34c86eb..52b00270bc1195fb5a0e3a250875702a0c5fc146 100644
--- a/gpt_index/indices/query/keyword_table/query.py
+++ b/gpt_index/indices/query/keyword_table/query.py
@@ -23,7 +23,23 @@ DQKET = DEFAULT_QUERY_KEYWORD_EXTRACT_TEMPLATE
 
 
 class BaseGPTKeywordTableQuery(BaseGPTIndexQuery[KeywordTable]):
-    """Base GPT Keyword Table Index Query."""
+    """Base GPT Keyword Table Index Query.
+
+    Arguments are shared among subclasses.
+
+    Args:
+        keyword_extract_template (Optional[Prompt]): A Keyword Extraction Prompt
+            (see :ref:`Prompt-Templates`).
+        query_keyword_extract_template (Optional[Prompt]): A Query Keyword Extraction
+            Prompt (see :ref:`Prompt-Templates`).
+        refine_template (Optional[Prompt]): A Refinement Prompt
+            (see :ref:`Prompt-Templates`).
+        text_qa_template (Optional[Prompt]): A Question Answering Prompt
+            (see :ref:`Prompt-Templates`).
+        max_keywords_per_query (int): Maximum number of keywords to extract from query.
+        num_chunks_per_query (int): Maximum number of text chunks to query.
+
+    """
 
     def __init__(
         self,
@@ -89,7 +105,14 @@ class BaseGPTKeywordTableQuery(BaseGPTIndexQuery[KeywordTable]):
 class GPTKeywordTableGPTQuery(BaseGPTKeywordTableQuery):
     """GPT Keyword Table Index Query.
 
-    Extracts keywords using GPT.
+    Extracts keywords using GPT. Set when `mode="default"` in `query` method of
+    `GPTKeywordTableIndex`.
+
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="default")
+
+    See BaseGPTKeywordTableQuery for arguments.
 
     """
 
@@ -107,7 +130,14 @@ class GPTKeywordTableGPTQuery(BaseGPTKeywordTableQuery):
 class GPTKeywordTableSimpleQuery(BaseGPTKeywordTableQuery):
     """GPT Keyword Table Index Simple Query.
 
-    Extracts keywords using Simple keyword extractor.
+    Extracts keywords using simple regex-based keyword extractor.
+    Set when `mode="simple"` in `query` method of `GPTKeywordTableIndex`.
+
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="simple")
+
+    See BaseGPTKeywordTableQuery for arguments.
 
     """
 
@@ -122,6 +152,13 @@ class GPTKeywordTableRAKEQuery(BaseGPTKeywordTableQuery):
     """GPT Keyword Table Index RAKE Query.
 
     Extracts keywords using RAKE keyword extractor.
+    Set when `mode="rake"` in `query` method of `GPTKeywordTableIndex`.
+
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="rake")
+
+    See BaseGPTKeywordTableQuery for arguments.
 
     """
 
diff --git a/gpt_index/indices/query/list/__init__.py b/gpt_index/indices/query/list/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..795cc1c5df2839027cbf4cb474d8acb4cddbd939 100644
--- a/gpt_index/indices/query/list/__init__.py
+++ b/gpt_index/indices/query/list/__init__.py
@@ -1 +1,6 @@
-"""Init file."""
+"""Query classes for list indices."""
+
+from gpt_index.indices.query.list.embedding_query import GPTListIndexEmbeddingQuery
+from gpt_index.indices.query.list.query import GPTListIndexQuery
+
+__all__ = ["GPTListIndexEmbeddingQuery", "GPTListIndexQuery"]
diff --git a/gpt_index/indices/query/list/embedding_query.py b/gpt_index/indices/query/list/embedding_query.py
index fd2fe99e196f14d914feaea4762920d22f962a72..4accc862097cb94b16164d55b1a26a93e705c340 100644
--- a/gpt_index/indices/query/list/embedding_query.py
+++ b/gpt_index/indices/query/list/embedding_query.py
@@ -8,7 +8,20 @@ from gpt_index.prompts.base import Prompt
 
 
 class GPTListIndexEmbeddingQuery(BaseGPTListIndexQuery):
-    """GPTListIndex query."""
+    """GPTListIndex query.
+
+    An embedding-based query for GPTListIndex, which traverses
+    each node in sequence and retrieves top-k nodes by
+    embedding similarity to the query.
+    Set when `mode="embedding"` in `query` method of `GPTListIndex`.
+
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="embedding")
+
+    See BaseGPTListIndexQuery for arguments.
+
+    """
 
     def __init__(
         self,
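The retrieval step behind the embedding mode above (score every node by similarity to the query, keep the top k) reduces to a cosine-similarity ranking. A toy sketch with hand-made vectors; the real query embeds text with an OpenAI embedding model:

```python
import math

# Toy sketch of embedding-based retrieval: rank nodes by cosine similarity
# to the query vector and keep the top k. Vectors here are hand-made; the
# real query obtains them from an OpenAI embedding model.


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm


def top_k_nodes(query_vec: list, node_vecs: dict, k: int = 1) -> list:
    scored = sorted(
        node_vecs,
        key=lambda n: cosine(query_vec, node_vecs[n]),
        reverse=True,
    )
    return scored[:k]


node_vecs = {"paris": [1.0, 0.0], "tokyo": [0.0, 1.0]}
best = top_k_nodes([0.9, 0.1], node_vecs, k=1)
```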
diff --git a/gpt_index/indices/query/list/query.py b/gpt_index/indices/query/list/query.py
index b51746b02d48f03277f0d989a638137a84b55dfc..bf0b18f4174c75bb252b8ce41e65ad7e5882d418 100644
--- a/gpt_index/indices/query/list/query.py
+++ b/gpt_index/indices/query/list/query.py
@@ -12,7 +12,19 @@ from gpt_index.prompts.default_prompts import (
 
 
 class BaseGPTListIndexQuery(BaseGPTIndexQuery[IndexList]):
-    """GPTListIndex query."""
+    """GPTListIndex query.
+
+    Arguments are shared among subclasses.
+
+    Args:
+        text_qa_template (Optional[Prompt]): A Question Answering Prompt
+            (see :ref:`Prompt-Templates`).
+        refine_template (Optional[Prompt]): A Refinement Prompt
+            (see :ref:`Prompt-Templates`).
+        keyword (Optional[str]): If specified, keyword to filter nodes.
+            Simulates Ctrl+F lookup in a document.
+
+    """
 
     def __init__(
         self,
@@ -62,7 +74,20 @@ class BaseGPTListIndexQuery(BaseGPTIndexQuery[IndexList]):
 
 
 class GPTListIndexQuery(BaseGPTListIndexQuery):
-    """GPTListIndex query."""
+    """GPTListIndex query.
+
+    The default query mode for GPTListIndex, which traverses
+    each node in sequence and synthesizes a response across all nodes
+    (with an optional keyword filter).
+    Set when `mode="default"` in `query` method of `GPTListIndex`.
+
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="default")
+
+    See BaseGPTListIndexQuery for arguments.
+
+    """
 
     def _get_nodes_for_response(
         self, query_str: str, verbose: bool = False
diff --git a/gpt_index/indices/query/tree/__init__.py b/gpt_index/indices/query/tree/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..f269b72b009c4da94d70b83a9b6b9f03af0345da 100644
--- a/gpt_index/indices/query/tree/__init__.py
+++ b/gpt_index/indices/query/tree/__init__.py
@@ -1 +1,11 @@
-"""Init file."""
+"""Query classes for tree indices."""
+
+from gpt_index.indices.query.tree.embedding_query import GPTTreeIndexEmbeddingQuery
+from gpt_index.indices.query.tree.leaf_query import GPTTreeIndexLeafQuery
+from gpt_index.indices.query.tree.retrieve_query import GPTTreeIndexRetQuery
+
+__all__ = [
+    "GPTTreeIndexLeafQuery",
+    "GPTTreeIndexRetQuery",
+    "GPTTreeIndexEmbeddingQuery",
+]
diff --git a/gpt_index/indices/query/tree/embedding_query.py b/gpt_index/indices/query/tree/embedding_query.py
index 7d474e0447a73269b0ed0ef4636e92b1fc1c7240..ba78dcbe32fda6f62d2b1813588b69d0d1a506f2 100644
--- a/gpt_index/indices/query/tree/embedding_query.py
+++ b/gpt_index/indices/query/tree/embedding_query.py
@@ -16,6 +16,26 @@ class GPTTreeIndexEmbeddingQuery(GPTTreeIndexLeafQuery):
     This class traverses the index graph using the embedding similarity between the
     query and the node text.
 
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="embedding")
+
+    Args:
+        query_template (Optional[Prompt]): Tree Select Query Prompt
+            (see :ref:`Prompt-Templates`).
+        query_template_multiple (Optional[Prompt]): Tree Select Query Prompt (Multiple)
+            (see :ref:`Prompt-Templates`).
+        text_qa_template (Optional[Prompt]): Question-Answer Prompt
+            (see :ref:`Prompt-Templates`).
+        refine_template (Optional[Prompt]): Refinement Prompt
+            (see :ref:`Prompt-Templates`).
+        child_branch_factor (int): Number of child nodes to consider at each level.
+            If child_branch_factor is 1, then the query will only choose one child node
+            to traverse for any given parent node.
+            If child_branch_factor is 2, then the query will choose two child nodes.
+        embed_model (Optional[OpenAIEmbedding]): Embedding model to use for
+            embedding similarity.
+
     """
 
     def __init__(
diff --git a/gpt_index/indices/query/tree/leaf_query.py b/gpt_index/indices/query/tree/leaf_query.py
index b76d38898193b32b35177e842ed2ec82b759130c..80531d38a08ccf2498617fe0521f14fe5bca02ad 100644
--- a/gpt_index/indices/query/tree/leaf_query.py
+++ b/gpt_index/indices/query/tree/leaf_query.py
@@ -20,6 +20,24 @@ class GPTTreeIndexLeafQuery(BaseGPTIndexQuery[IndexGraph]):
     This class traverses the index graph and searches for a leaf node that can best
     answer the query.
 
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="default")
+
+    Args:
+        query_template (Optional[Prompt]): Tree Select Query Prompt
+            (see :ref:`Prompt-Templates`).
+        query_template_multiple (Optional[Prompt]): Tree Select Query Prompt (Multiple)
+            (see :ref:`Prompt-Templates`).
+        text_qa_template (Optional[Prompt]): Question-Answer Prompt
+            (see :ref:`Prompt-Templates`).
+        refine_template (Optional[Prompt]): Refinement Prompt
+            (see :ref:`Prompt-Templates`).
+        child_branch_factor (int): Number of child nodes to consider at each level.
+            If child_branch_factor is 1, then the query will only choose one child node
+            to traverse for any given parent node.
+            If child_branch_factor is 2, then the query will choose two child nodes.
+
     """
 
     def __init__(
diff --git a/gpt_index/indices/query/tree/retrieve_query.py b/gpt_index/indices/query/tree/retrieve_query.py
index f7f9cad538ee30b5fe7152199cf8be37a3d41627..bc22bebf514c2efeb985e7c8b3e52a85efe7a807 100644
--- a/gpt_index/indices/query/tree/retrieve_query.py
+++ b/gpt_index/indices/query/tree/retrieve_query.py
@@ -19,6 +19,14 @@ class GPTTreeIndexRetQuery(BaseGPTIndexQuery[IndexGraph]):
     the answer (because it was constructed with a query_str), so it does not
     attempt to parse information down the graph in order to synthesize an answer.
 
+    .. code-block:: python
+
+        response = index.query("<query_str>", mode="retrieve")
+
+    Args:
+        text_qa_template (Optional[Prompt]): Question-Answer Prompt
+            (see :ref:`Prompt-Templates`).
+
     """
 
     def __init__(
diff --git a/gpt_index/indices/tree/__init__.py b/gpt_index/indices/tree/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..c13b792b07486dc27964e134193fcbfbe2b877a5 100644
--- a/gpt_index/indices/tree/__init__.py
+++ b/gpt_index/indices/tree/__init__.py
@@ -1 +1,8 @@
-"""Init file."""
+"""Tree-structured Index Data Structures."""
+
+# indices
+from gpt_index.indices.tree.base import GPTTreeIndex
+
+__all__ = [
+    "GPTTreeIndex",
+]
diff --git a/gpt_index/indices/tree/base.py b/gpt_index/indices/tree/base.py
index 18b7c99bf470537cb1923da27f70991e91615c90..a946c4e8313ae8c7d898dd5c0a46d1da290164b7 100644
--- a/gpt_index/indices/tree/base.py
+++ b/gpt_index/indices/tree/base.py
@@ -119,7 +119,23 @@ class GPTTreeIndexBuilder:
 
 
 class GPTTreeIndex(BaseGPTIndex[IndexGraph]):
-    """GPT Index."""
+    """GPT Tree Index.
+
+    The tree index is a tree-structured index, where each node is a summary of
+    its child nodes. During index construction, the tree is built
+    in a bottom-up fashion until we end up with a set of root nodes.
+
+    There are a few different options during query time (see :ref:`Ref-Query`).
+    The main option is to traverse down the tree from the root nodes.
+    A secondary option is to synthesize the answer directly from the root nodes.
+
+    Args:
+        summary_template (Optional[Prompt]): A Summarization Prompt
+            (see :ref:`Prompt-Templates`).
+        insert_prompt (Optional[Prompt]): A Tree Insert Prompt
+            (see :ref:`Prompt-Templates`).
+
+    """
 
     index_struct_cls = IndexGraph
 
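The bottom-up construction described in the docstring above can be sketched as repeated grouping and summarizing: leaf chunks are grouped, each group is condensed into a parent node, and the process repeats until only root nodes remain. In the toy sketch below a placeholder join-and-truncate stands in for the LLM summarization prompt:

```python
# Toy sketch of bottom-up tree construction: group child nodes, "summarize"
# each group into a parent node, and repeat until only root nodes remain.
# The real index produces summaries with an LLM; summarize() here is a
# placeholder that joins and truncates the children.


def summarize(children: list) -> str:
    return " / ".join(children)[:40]  # placeholder for the LLM summary


def build_tree(leaves: list, num_children: int = 2) -> list:
    """Return successive levels, from the leaves up to the root nodes."""
    levels = [leaves]
    while len(levels[-1]) > num_children:
        current = levels[-1]
        parents = [
            summarize(current[i:i + num_children])
            for i in range(0, len(current), num_children)
        ]
        levels.append(parents)
    return levels


levels = build_tree(["c1", "c2", "c3", "c4"], num_children=2)
```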
diff --git a/gpt_index/langchain_helpers/chain_wrapper.py b/gpt_index/langchain_helpers/chain_wrapper.py
index 23a7c99570a5516fa1d3c7a5755c18331268b08f..390dff4c632cbe7e054f2cb1cbdea571017cc893 100644
--- a/gpt_index/langchain_helpers/chain_wrapper.py
+++ b/gpt_index/langchain_helpers/chain_wrapper.py
@@ -9,14 +9,34 @@ from gpt_index.prompts.base import Prompt
 
 
 class LLMPredictor:
-    """LLM predictor class."""
+    """LLM predictor class.
+
+    Wrapper around an LLMChain from Langchain.
+
+    Args:
+        llm (Optional[LLM]): LLM from Langchain to use for predictions.
+            Defaults to OpenAI's text-davinci-002 model.
+            Please see
+            `Langchain's LLM Page
+            <https://langchain.readthedocs.io/en/latest/modules/llms.html>`_
+            for more details.
+
+    """
 
     def __init__(self, llm: Optional[LLM] = None) -> None:
         """Initialize params."""
         self._llm = llm or OpenAI(temperature=0, model_name="text-davinci-002")
 
     def predict(self, prompt: Prompt, **prompt_args: Any) -> Tuple[str, str]:
-        """Predict the answer to a query."""
+        """Predict the answer to a query.
+
+        Args:
+            prompt (Prompt): Prompt to use for prediction.
+
+        Returns:
+            Tuple[str, str]: Tuple of the predicted answer and the formatted prompt.
+
+        """
         llm_chain = LLMChain(prompt=prompt, llm=self._llm)
 
         formatted_prompt = prompt.format(**prompt_args)
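The predictor's contract above (format the prompt, call the underlying LLM, return both the answer and the formatted prompt) can be sketched with a stub callable standing in for Langchain's `LLMChain` and the OpenAI model; this is an illustration of the wrapper pattern, not the library's real class:

```python
# Toy sketch of the predictor wrapper pattern: format the prompt, call the
# underlying LLM, and return both the answer and the formatted prompt.
# StubLLM stands in for Langchain's LLMChain / an OpenAI model.


class StubLLM:
    def __call__(self, prompt: str) -> str:
        return f"echo: {prompt}"


class SimplePredictor:
    def __init__(self, llm=None):
        self._llm = llm or StubLLM()

    def predict(self, template: str, **prompt_args: str):
        formatted_prompt = template.format(**prompt_args)
        return self._llm(formatted_prompt), formatted_prompt


predictor = SimplePredictor()
answer, formatted = predictor.predict("Q: {query_str}", query_str="hi")
```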
diff --git a/gpt_index/prompts/__init__.py b/gpt_index/prompts/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..70a9a06c14c38bb373b8afc1d1e6b4b7febf726e 100644
--- a/gpt_index/prompts/__init__.py
+++ b/gpt_index/prompts/__init__.py
@@ -1 +1,5 @@
-"""Init file."""
+"""Prompt class."""
+
+from gpt_index.prompts.base import Prompt
+
+__all__ = ["Prompt"]
diff --git a/gpt_index/readers/__init__.py b/gpt_index/readers/__init__.py
index 04de8b7ec7833cb2aa90a5f07615f556b7b9e46b..0fd7ca35c0aefbe908adb7a219113944127199c3 100644
--- a/gpt_index/readers/__init__.py
+++ b/gpt_index/readers/__init__.py
@@ -1 +1,25 @@
-"""Init file for readers."""
+"""Data Connectors for GPT Index.
+
+This module contains the data connectors for GPT Index. Each connector inherits
+from the `BaseReader` class, connects to a data source, and loads `BaseDocument`
+objects from that data source.
+
+"""
+
+# readers
+from gpt_index.readers.file import SimpleDirectoryReader
+from gpt_index.readers.google.gdocs import GoogleDocsReader
+from gpt_index.readers.mongo import SimpleMongoReader
+from gpt_index.readers.notion import NotionPageReader
+from gpt_index.readers.slack import SlackReader
+from gpt_index.readers.wikipedia import WikipediaReader
+
+__all__ = [
+    "WikipediaReader",
+    "SimpleDirectoryReader",
+    "SimpleMongoReader",
+    "NotionPageReader",
+    "GoogleDocsReader",
+    "SlackReader",
+]
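The connector pattern described in the module docstring (subclass `BaseReader`, connect to a data source, return documents from `load_data`) can be sketched without any external service. The stripped-down `Document` and the `ListReader` below are hypothetical stand-ins for illustration, not the library's actual classes:

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Document:
    """Bare-bones document: just the loaded text."""

    text: str


class BaseReader:
    """Interface every connector implements."""

    def load_data(self, **load_kwargs: Any) -> List[Document]:
        raise NotImplementedError


class ListReader(BaseReader):
    """Toy connector whose 'data source' is an in-memory list of strings."""

    def __init__(self, source: List[str]) -> None:
        self._source = source

    def load_data(self, **load_kwargs: Any) -> List[Document]:
        return [Document(text=s) for s in self._source]


docs = ListReader(["hello", "world"]).load_data()
```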
diff --git a/gpt_index/readers/file.py b/gpt_index/readers/file.py
index ea20cd2ea0f0b470e3436e36cc52609edca1db16..f9ca48c17e65ac46102b5c23b0793235b8eea272 100644
--- a/gpt_index/readers/file.py
+++ b/gpt_index/readers/file.py
@@ -12,6 +12,9 @@ class SimpleDirectoryReader(BaseReader):
     Can read files into separate documents, or concatenates
     files into one document text.
 
+    Args:
+        input_dir (str): Path to the directory.
+        exclude_hidden (bool): Whether to exclude hidden files (dotfiles).
     """
 
     def __init__(self, input_dir: str, exclude_hidden: bool = True) -> None:
@@ -26,7 +29,15 @@ class SimpleDirectoryReader(BaseReader):
         self.input_files = input_files
 
     def load_data(self, **load_kwargs: Any) -> List[Document]:
-        """Load data from the input directory."""
+        """Load data from the input directory.
+
+        Args:
+            concatenate (bool): whether to concatenate all files into one document.
+
+        Returns:
+            List[Document]: A list of documents.
+
+        """
         concatenate = load_kwargs.get("concatenate", True)
         data = ""
         data_list = []
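The `concatenate` flag documented above decides whether the directory reader returns one merged document or one per file. A rough, self-contained approximation of that behavior (plus the `exclude_hidden` dotfile filter), using a hypothetical `load_dir` helper rather than the real `SimpleDirectoryReader`; joining files with a single newline is an assumption of this sketch:

```python
import pathlib
import tempfile
from typing import List


def load_dir(input_dir: str, concatenate: bool = True) -> List[str]:
    """Sketch of the concatenate switch in SimpleDirectoryReader.load_data."""
    texts = [
        p.read_text()
        for p in sorted(pathlib.Path(input_dir).iterdir())
        if p.is_file() and not p.name.startswith(".")  # exclude_hidden
    ]
    # concatenate=True -> one merged document; False -> one per file.
    return ["\n".join(texts)] if concatenate else texts


with tempfile.TemporaryDirectory() as d:
    pathlib.Path(d, "a.txt").write_text("alpha")
    pathlib.Path(d, "b.txt").write_text("beta")
    one_doc = load_dir(d)
    per_file = load_dir(d, concatenate=False)
```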
diff --git a/gpt_index/readers/mongo.py b/gpt_index/readers/mongo.py
index 16a5b24c2079bb978ea36b2ce491a409da579c14..7404223e88d5c9e4763000e3818f42aaaa64d746 100644
--- a/gpt_index/readers/mongo.py
+++ b/gpt_index/readers/mongo.py
@@ -11,6 +11,11 @@ class SimpleMongoReader(BaseReader):
 
     Concatenates each Mongo doc into Document used by GPT Index.
 
+    Args:
+        host (str): Mongo host.
+        port (int): Mongo port.
+        max_docs (int): Maximum number of documents to load.
+
     """
 
     def __init__(self, host: str, port: int, max_docs: int = 1000) -> None:
@@ -43,7 +48,16 @@ class SimpleMongoReader(BaseReader):
         return documents
 
     def load_data(self, **load_kwargs: Any) -> List[Document]:
-        """Load data from the input directory."""
+        """Load data from the input directory.
+
+        Args:
+            db_name (str): name of the database.
+            collection_name (str): name of the collection.
+
+        Returns:
+            List[Document]: A list of documents.
+
+        """
         if "db_name" not in load_kwargs:
             raise ValueError("`db_name` not found in load_kwargs.")
         else:
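As the hunk above shows, `SimpleMongoReader.load_data` validates its required `load_kwargs` before touching Mongo. The sketch below reproduces just that validation step with a standalone function (no Mongo client), so the error behavior can be seen in isolation; the `collection_name` message mirrors the `db_name` one and is an assumption of this sketch:

```python
from typing import Any, List


def load_data(**load_kwargs: Any) -> List[str]:
    """Sketch of the required-kwarg checks in SimpleMongoReader.load_data."""
    if "db_name" not in load_kwargs:
        raise ValueError("`db_name` not found in load_kwargs.")
    if "collection_name" not in load_kwargs:
        raise ValueError("`collection_name` not found in load_kwargs.")
    # Stand-in for the actual Mongo query.
    return [f"{load_kwargs['db_name']}/{load_kwargs['collection_name']}"]


try:
    load_data(db_name="db")  # missing collection_name -> ValueError
except ValueError as exc:
    caught = str(exc)
```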
diff --git a/gpt_index/readers/notion.py b/gpt_index/readers/notion.py
index 8fc98550599114f50da855cd2a07d95f5a152d6b..1bec6a4fd83ab063e09be9f954b4345d31889bbd 100644
--- a/gpt_index/readers/notion.py
+++ b/gpt_index/readers/notion.py
@@ -18,6 +18,9 @@ class NotionPageReader(BaseReader):
 
     Reads a set of Notion pages.
 
+    Args:
+        integration_token (str): Notion integration token.
+
     """
 
     def __init__(self, integration_token: Optional[str] = None) -> None:
@@ -115,7 +118,15 @@ class NotionPageReader(BaseReader):
         return page_ids
 
     def load_data(self, **load_kwargs: Any) -> List[Document]:
-        """Load data from the input directory."""
+        """Load data from the input directory.
+
+        Args:
+            page_ids (List[str]): List of page ids to load.
+
+        Returns:
+            List[Document]: List of documents.
+
+        """
         if "page_ids" not in load_kwargs:
             raise ValueError('Must specify a "page_ids" in `load_kwargs`.')
         docs = []
diff --git a/gpt_index/readers/slack.py b/gpt_index/readers/slack.py
index 73cae8651de3c1b7db1c31aed9f1fd493696e2e0..3fd22a4978fb6f48aecd435190d47effe95cbad0 100644
--- a/gpt_index/readers/slack.py
+++ b/gpt_index/readers/slack.py
@@ -14,6 +14,10 @@ class SlackReader(BaseReader):
 
     Reads conversations from channels.
 
+    Args:
+        slack_token (Optional[str]): Slack token. If not provided, we
+            assume the environment variable `SLACK_BOT_TOKEN` is set.
+
     """
 
     def __init__(self, slack_token: Optional[str] = None) -> None:
@@ -100,7 +104,15 @@ class SlackReader(BaseReader):
         return "\n\n".join(result_messages)
 
     def load_data(self, **load_kwargs: Any) -> List[Document]:
-        """Load data from the input directory."""
+        """Load data from the input directory.
+
+        Args:
+            channel_ids (List[str]): List of channel ids to read.
+
+        Returns:
+            List[Document]: List of documents.
+
+        """
         channel_ids = load_kwargs.pop("channel_ids", None)
         if channel_ids is None:
             raise ValueError('Must specify a "channel_id" in `load_kwargs`.')
diff --git a/gpt_index/readers/wikipedia.py b/gpt_index/readers/wikipedia.py
index fd361aa1db92b75d12b86993f9f5a655901d9170..2db0fb0cd47d8a466496cce530199c8ac7f45ab8 100644
--- a/gpt_index/readers/wikipedia.py
+++ b/gpt_index/readers/wikipedia.py
@@ -22,7 +22,12 @@ class WikipediaReader(BaseReader):
             )
 
     def load_data(self, **load_kwargs: Any) -> List[Document]:
-        """Load data from the input directory."""
+        """Load data from the input directory.
+
+        Args:
+            pages (List[str]): List of pages to read.
+
+        """
         import wikipedia
 
         pages: List[str] = load_kwargs.pop("pages", None)
diff --git a/tests/indices/list/__init__.py b/tests/indices/list/__init__.py
index 1d4640565ae2765d9ca96a509dc9809217f62f2f..b24c607f33df4130279f81016b5fdc2a22d19fd0 100644
--- a/tests/indices/list/__init__.py
+++ b/tests/indices/list/__init__.py
@@ -1 +1,7 @@
-"""Init file."""
+"""List-based data structures."""
+
+from gpt_index.indices.list.base import GPTListIndex
+
+__all__ = [
+    "GPTListIndex",
+]