diff --git a/docs/docs/examples/structured_outputs/structured_outputs.ipynb b/docs/docs/examples/structured_outputs/structured_outputs.ipynb
index b248c1a5c6e70e7a1d1e30acadfa6c4d45ce1a72..0d6dd22ea4ba89d8e5d304a6c6d468e830b47b10 100644
--- a/docs/docs/examples/structured_outputs/structured_outputs.ipynb
+++ b/docs/docs/examples/structured_outputs/structured_outputs.ipynb
@@ -5,13 +5,13 @@
    "id": "ca3bd17d-aec3-4848-ac82-def6e2d6fa18",
    "metadata": {},
    "source": [
-    "# A Simple Guide to Structured Outputs\n",
+    "# Examples of Structured Data Extraction in LlamaIndex\n",
     "\n",
     "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/structured_outputs/structured_outputs.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
     "\n",
-    "This is a simple guide to structured outputs with LLMs. At a high-level, we can attach a Pydantic class to any LLM and have the output format be natively structured, even if the LLM is used in upstream modules.\n",
+    "If you haven't yet read our [structured data extraction tutorial](../../understanding/extraction/index.md), we recommend starting there. This notebook demonstrates some of the techniques introduced in the tutorial.\n",
     "\n",
-    "We start with the simple syntax around LLMs, and then move on to how to plug it in within query pipelines, and also higher-level modules like a query engine and agent.\n",
+    "We start with the simple syntax around LLMs, then move on to how to use it with higher-level modules like a query engine and agent.\n",
     "\n",
     "A lot of the underlying behavior around structured outputs is powered by our Pydantic Program modules. Check out our [in-depth structured outputs guide](https://docs.llamaindex.ai/en/stable/module_guides/querying/structured_outputs/) for more details."
    ]
@@ -261,58 +261,12 @@
     "    pprint(partial_output.raw.dict())"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "de32c5b4-bdfb-4f7c-b667-ad3ea81e17d0",
-   "metadata": {},
-   "source": [
-    "### 1.b Example using Query Pipelines\n",
-    "\n",
-    "You can plug in structured LLMs in query pipelines - the output will be directly the structured object."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "ff82d41f-5397-4ba2-a600-68248056af85",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "Album(name='Inside Out Soundtrack', artist='Various Artists', songs=[Song(title='Bundle of Joy', length_seconds=150), Song(title='Team Building', length_seconds=120), Song(title='Nomanisone Island/National Movers', length_seconds=180), Song(title='Overcoming Sadness', length_seconds=210), Song(title='Free Skating', length_seconds=160), Song(title='First Day of School', length_seconds=140), Song(title='Riled Up', length_seconds=130), Song(title='Goofball No Longer', length_seconds=170), Song(title='Memory Lanes', length_seconds=200), Song(title='The Forgetters', length_seconds=110)])"
-      ]
-     },
-     "execution_count": null,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# use query pipelines\n",
-    "from llama_index.core.prompts import ChatPromptTemplate\n",
-    "from llama_index.core.query_pipeline import QueryPipeline as QP\n",
-    "from llama_index.core.llms import ChatMessage\n",
-    "\n",
-    "chat_prompt_tmpl = ChatPromptTemplate(\n",
-    "    message_templates=[\n",
-    "        ChatMessage.from_str(\n",
-    "            \"Generate an example album from {movie_name}\", role=\"user\"\n",
-    "        )\n",
-    "    ]\n",
-    ")\n",
-    "\n",
-    "qp = QP(chain=[chat_prompt_tmpl, sllm])\n",
-    "response = qp.run(movie_name=\"Inside Out\")\n",
-    "response"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "9d1b4bc9-56ee-4bf7-a63c-50ce0cadeeae",
    "metadata": {},
    "source": [
-    "### 1.c Use the `structured_predict` Function\n",
+    "### 1.b Use the `structured_predict` Function\n",
     "\n",
     "Instead of explicitly doing `llm.as_structured_llm(...)`, every LLM class has a `structured_predict` function which allows you to more easily call the LLM with a prompt template + template variables to return a strutured output in one line of code."
    ]
diff --git a/docs/docs/module_guides/querying/structured_outputs/index.md b/docs/docs/module_guides/querying/structured_outputs/index.md
index 6e44510b9f86d54d0669d9613b04e48a66bb8589..06cb704f5fea9fbd61a58f6bee92a2328e3044b2 100644
--- a/docs/docs/module_guides/querying/structured_outputs/index.md
+++ b/docs/docs/module_guides/querying/structured_outputs/index.md
@@ -25,8 +25,9 @@ append format instructions to the prompt. After the LLM call, the output parser
 
 With function calling APIs, the output is inherently in a structured format, and the input can take in the signature of the desired object. The structured output just needs to be cast in the right object format (e.g. Pydantic).
 
-## Starter Guide
-- [Simple Guide to Structured Outputs](../../../examples/structured_outputs/structured_outputs.ipynb)
+## Starter Guides
+- [Structured data extraction tutorial](../../../understanding/extraction/index.md)
+- [Examples of Structured Outputs](../../../examples/structured_outputs/structured_outputs.ipynb)
 
 ## Other Resources
 
diff --git a/docs/docs/module_guides/querying/structured_outputs/pydantic_program.md b/docs/docs/module_guides/querying/structured_outputs/pydantic_program.md
index f4b3d47350a4714674478baf1e5cce63e38e378d..eba45b7e9d036deb875f5722241a667e92a8dbd5 100644
--- a/docs/docs/module_guides/querying/structured_outputs/pydantic_program.md
+++ b/docs/docs/module_guides/querying/structured_outputs/pydantic_program.md
@@ -1,7 +1,7 @@
-# Pydantic Program
+# Pydantic Programs
 
 !!! tip
-    The Pydantic Program is a lower-level abstraction for structured output extraction. The default way to perform structured output extraction is with our LLM classes, which lets you plug these LLMs easily into higher-level workflows. Check out our [structured output starter guide](../../../examples/structured_outputs/structured_outputs.ipynb).
+    Pydantic Programs are a lower-level abstraction for structured output extraction. The default way to perform structured output extraction is with our LLM classes, which lets you plug these LLMs easily into higher-level workflows. Check out our [structured data extraction tutorial](../../../understanding/extraction/index.md).
 
 A pydantic program is a generic abstraction that takes in an input string and converts it to a structured Pydantic object type.
 
@@ -10,7 +10,7 @@ Because this abstraction is so generic, it encompasses a broad range of LLM work
 There's a few general types of Pydantic Programs:
 
 - **Text Completion Pydantic Programs**: These convert input text into a user-specified structured object through a text completion API + output parsing.
-- **Function Calling Pydantic Program**: These convert input text into a user-specified structured object through an LLM function calling API.
+- **Function Calling Pydantic Programs**: These convert input text into a user-specified structured object through an LLM function calling API.
 - **Prepackaged Pydantic Programs**: These convert input text into prespecified structured objects.
 
 ## Text Completion Pydantic Programs
diff --git a/docs/docs/understanding/extraction/index.md b/docs/docs/understanding/extraction/index.md
new file mode 100644
index 0000000000000000000000000000000000000000..e8e4206981979b4456b26c29bb1c4e32decb79b4
--- /dev/null
+++ b/docs/docs/understanding/extraction/index.md
@@ -0,0 +1,163 @@
+# Introduction to Structured Data Extraction
+
+LLMs excel at data understanding, leading to one of their most important use cases: the ability to turn regular human language (which we refer to as **unstructured data**) into specific, regular, expected formats for consumption by computer programs. We call the output of this process **structured data**. Since in the process of conversion a lot of superfluous data is often ignored, we call it **extraction**.
+
+The core of the way structured data extraction works in LlamaIndex is [Pydantic](https://docs.pydantic.dev/latest/) classes: you define a data structure in Pydantic and LlamaIndex works with Pydantic to coerce the output of the LLM into that structure.
+
+## What is Pydantic?
+
+Pydantic is a widely-used data validation and conversion library. It relies heavily on Python type declarations. There is an [extensive guide](https://docs.pydantic.dev/latest/concepts/models/) to Pydantic in that project’s documentation, but we’ll cover the very basics here.
+
+To create a Pydantic class, inherit from Pydantic’s `BaseModel` class:
+
+```python
+from pydantic import BaseModel
+
+
+class User(BaseModel):
+    id: int
+    name: str = "Jane Doe"
+```
+
+In this example, you’ve created a `User` class with two fields, `id` and `name`. You’ve defined `id` as an integer, and `name` as a string that defaults to `Jane Doe`.
+
+You can create more complex structures by nesting these models:
+
+```python
+from typing import List, Optional
+from pydantic import BaseModel
+
+
+class Foo(BaseModel):
+    count: int
+    size: Optional[float] = None
+
+
+class Bar(BaseModel):
+    apple: str = "x"
+    banana: str = "y"
+
+
+class Spam(BaseModel):
+    foo: Foo
+    bars: List[Bar]
+```
+
+Now `Spam` has a `foo` and a `bars`. `Foo` has a `count` and an optional `size`, and `bars` is a list of `Bar` objects, each of which has an `apple` and a `banana` property.
+
+## Converting Pydantic objects to JSON schemas
+
+Pydantic supports converting Pydantic classes into JSON-serialized schema objects which conform to [popular standards](https://docs.pydantic.dev/latest/concepts/json_schema/). The `User` class above for instance serializes into this:
+
+```json
+{
+  "properties": {
+    "id": {
+      "title": "Id",
+      "type": "integer"
+    },
+    "name": {
+      "default": "Jane Doe",
+      "title": "Name",
+      "type": "string"
+    }
+  },
+  "required": [
+    "id"
+  ],
+  "title": "User",
+  "type": "object"
+}
+```
+
+This property is crucial: these JSON-formatted schemas are often passed to LLMs and the LLMs in turn use them as instructions on how to return data.
+
+## Using annotations
+
+As mentioned, LLMs are using JSON schemas from Pydantic as instructions on how to return data. To assist them and improve the accuracy of your returned data, it’s helpful to include natural-language descriptions of objects and fields and what they’re used for. Pydantic has support for this with [docstrings](https://www.geeksforgeeks.org/python-docstrings/) and [Fields](https://docs.pydantic.dev/latest/concepts/fields/).
+
+We’ll be using the following example Pydantic classes in all of our examples going forward:
+
+```python
+from datetime import datetime
+from pydantic import BaseModel, Field
+
+class LineItem(BaseModel):
+    """A line item in an invoice."""
+
+    item_name: str = Field(description="The name of this item")
+    price: float = Field(description="The price of this item")
+
+
+class Invoice(BaseModel):
+    """A representation of information from an invoice."""
+
+    invoice_id: str = Field(
+        description="A unique identifier for this invoice, often a number"
+    )
+    date: datetime = Field(description="The date this invoice was created")
+    line_items: list[LineItem] = Field(
+        description="A list of all the items in this invoice"
+    )
+```
+
+This expands to a much more complex JSON schema:
+
+```json
+{
+  "$defs": {
+    "LineItem": {
+      "description": "A line item in an invoice.",
+      "properties": {
+        "item_name": {
+          "description": "The name of this item",
+          "title": "Item Name",
+          "type": "string"
+        },
+        "price": {
+          "description": "The price of this item",
+          "title": "Price",
+          "type": "number"
+        }
+      },
+      "required": [
+        "item_name",
+        "price"
+      ],
+      "title": "LineItem",
+      "type": "object"
+    }
+  },
+  "description": "A representation of information from an invoice.",
+  "properties": {
+    "invoice_id": {
+      "description": "A unique identifier for this invoice, often a number",
+      "title": "Invoice Id",
+      "type": "string"
+    },
+    "date": {
+      "description": "The date this invoice was created",
+      "format": "date-time",
+      "title": "Date",
+      "type": "string"
+    },
+    "line_items": {
+      "description": "A list of all the items in this invoice",
+      "items": {
+        "$ref": "#/$defs/LineItem"
+      },
+      "title": "Line Items",
+      "type": "array"
+    }
+  },
+  "required": [
+    "invoice_id",
+    "date",
+    "line_items"
+  ],
+  "title": "Invoice",
+  "type": "object"
+}
+```
+
+Now that you have a basic understanding of Pydantic and the schemas it generates, you can move on to using Pydantic classes for structured data extraction in LlamaIndex, starting with [Structured LLMs](structured_llms.md).
diff --git a/docs/docs/understanding/extraction/lower_level.md b/docs/docs/understanding/extraction/lower_level.md
new file mode 100644
index 0000000000000000000000000000000000000000..469e0276d0a20a765d72df78e4ad92af37bf8e03
--- /dev/null
+++ b/docs/docs/understanding/extraction/lower_level.md
@@ -0,0 +1,96 @@
+# Low-level structured data extraction
+
+If your LLM supports tool calling and you need more direct control over how LlamaIndex extracts data, you can use `chat_with_tools` on an LLM directly. If your LLM does not support tool calling, you can instruct your LLM directly and parse the output yourself. We’ll show how to do both.
+
+## Calling tools directly
+
+```python
+from llama_index.core.program.function_program import get_function_tool
+
+tool = get_function_tool(Invoice)
+
+resp = llm.chat_with_tools(
+    [tool],
+    # chat_history=chat_history,  # can optionally pass in chat history instead of user_msg
+    user_msg="Extract an invoice from the following text: " + text,
+    # tool_choice="Invoice",  # can optionally force the tool call
+)
+
+tool_calls = llm.get_tool_calls_from_response(
+    resp, error_on_no_tool_calls=False
+)
+
+outputs = []
+for tool_call in tool_calls:
+    if tool_call.tool_name == "Invoice":
+        outputs.append(Invoice(**tool_call.tool_kwargs))
+
+# use your outputs
+print(outputs[0])
+```
+
+This is identical to `structured_predict` if the LLM has a tool calling API. However, if the LLM supports it you can optionally allow multiple tool calls. This has the effect of extracting multiple objects from the same input, as in this example:
+
+```python
+from llama_index.core.program.function_program import get_function_tool
+
+tool = get_function_tool(LineItem)
+
+resp = llm.chat_with_tools(
+    [tool],
+    user_msg="Extract line items from the following text: " + text,
+    allow_parallel_tool_calls=True,
+)
+
+tool_calls = llm.get_tool_calls_from_response(
+    resp, error_on_no_tool_calls=False
+)
+
+outputs = []
+for tool_call in tool_calls:
+    if tool_call.tool_name == "LineItem":
+        outputs.append(LineItem(**tool_call.tool_kwargs))
+
+# use your outputs
+print(outputs)
+```
+
+If extracting multiple Pydantic objects from a single LLM call is your goal, this is how to do that.
+
+## Direct prompting
+
+If for some reason none of LlamaIndex’s attempts to make extraction easier are working for you, you can dispense with them and prompt the LLM directly and parse the output yourself, as here:
+
+```python
+import json
+
+schema = Invoice.model_json_schema()
+prompt = "Here is a JSON schema for an invoice: " + json.dumps(schema, indent=2)
+prompt += (
+    """
+  Extract an invoice from the following text.
+  Format your output as a JSON object according to the schema above.
+  Do not include any other text than the JSON object.
+  Omit any markdown formatting. Do not include any preamble or explanation.
+"""
+    + text
+)
+
+response = llm.complete(prompt)
+
+print(response)
+
+invoice = Invoice.model_validate_json(response.text)
+
+pprint(invoice)
+```
+
+Congratulations! You have now covered the main ways to extract structured data in LlamaIndex, from Structured LLMs down to direct prompting.
+
+## Other Guides
+
+For a deeper look at structured data extraction with LlamaIndex, check out the following guides:
+
+- [Structured Outputs](../../module_guides/querying/structured_outputs/index.md)
+- [Pydantic Programs](../../module_guides/querying/structured_outputs/pydantic_program.md)
+- [Output Parsing](../../module_guides/querying/structured_outputs/output_parser.md)
diff --git a/docs/docs/understanding/extraction/structured_llms.md b/docs/docs/understanding/extraction/structured_llms.md
new file mode 100644
index 0000000000000000000000000000000000000000..cf8dd438733f530d704afdc8bfa1803ad5be5de4
--- /dev/null
+++ b/docs/docs/understanding/extraction/structured_llms.md
@@ -0,0 +1,106 @@
+# Using Structured LLMs
+
+The highest-level way to extract structured data in LlamaIndex is to instantiate a Structured LLM. First, let’s define our Pydantic classes as before:
+
+```python
+from datetime import datetime
+from pydantic import BaseModel, Field
+
+class LineItem(BaseModel):
+    """A line item in an invoice."""
+
+    item_name: str = Field(description="The name of this item")
+    price: float = Field(description="The price of this item")
+
+
+class Invoice(BaseModel):
+    """A representation of information from an invoice."""
+
+    invoice_id: str = Field(
+        description="A unique identifier for this invoice, often a number"
+    )
+    date: datetime = Field(description="The date this invoice was created")
+    line_items: list[LineItem] = Field(
+        description="A list of all the items in this invoice"
+    )
+```
+
+If this is your first time using LlamaIndex, let’s get our dependencies:
+
+- `pip install llama-index-core llama-index-llms-openai` to get the LLM (we’ll be using OpenAI for simplicity, but you can always use another one)
+- Get an OpenAI API key and set it as an environment variable called `OPENAI_API_KEY`
+- `pip install llama-index-readers-file` to get the PDFReader
+    - Note: for better parsing of PDFs, we recommend [LlamaParse](https://docs.cloud.llamaindex.ai/llamaparse/getting_started)
+
+Now let’s load in the text of an actual invoice:
+
+```python
+from llama_index.readers.file import PDFReader
+from pathlib import Path
+
+pdf_reader = PDFReader()
+documents = pdf_reader.load_data(file=Path("./uber_receipt.pdf"))
+text = documents[0].text
+```
+
+And let’s instantiate an LLM, give it our Pydantic class, and then ask it to `complete` using the plain text of the invoice:
+
+```python
+from llama_index.llms.openai import OpenAI
+
+llm = OpenAI(model="gpt-4o")
+sllm = llm.as_structured_llm(Invoice)
+
+response = sllm.complete(text)
+```
+
+`response` is a LlamaIndex `CompletionResponse` with two properties: `text` and `raw`. `text` contains the JSON-serialized form of the Pydantic-ingested response:
+
+```python
+json_response = json.loads(response.text)
+print(json.dumps(json_response, indent=2))
+```
+
+```python
+{
+    "invoice_id": "Visa \u2022\u2022\u2022\u20224469",
+    "date": "2024-10-10T19:49:00",
+    "line_items": [
+        {"item_name": "Trip fare", "price": 12.18},
+        {"item_name": "Access for All Fee", "price": 0.1},
+        {"item_name": "CA Driver Benefits", "price": 0.32},
+        {"item_name": "Booking Fee", "price": 2.0},
+        {"item_name": "San Francisco City Tax", "price": 0.21},
+    ],
+}
+```
+
+Note that this invoice didn’t have an ID, so the LLM has tried its best and used the credit card number. Pydantic validation is not a guarantee!
+
+The `raw` property of response (somewhat confusingly) contains the Pydantic object itself:
+
+```python
+from pprint import pprint
+
+pprint(response.raw)
+```
+
+```python
+Invoice(
+    invoice_id="Visa ••••4469",
+    date=datetime.datetime(2024, 10, 10, 19, 49),
+    line_items=[
+        LineItem(item_name="Trip fare", price=12.18),
+        LineItem(item_name="Access for All Fee", price=0.1),
+        LineItem(item_name="CA Driver Benefits", price=0.32),
+        LineItem(item_name="Booking Fee", price=2.0),
+        LineItem(item_name="San Francisco City Tax", price=0.21),
+    ],
+)
+```
+
+Note that Pydantic is creating a full `datetime` object and not just translating a string.
+
+A structured LLM works exactly like a regular LLM class: you can call `chat`, `stream`, `achat`, `astream` etc. and it will respond with Pydantic objects in all cases. You can also pass in your Structured LLM as a parameter to `VectorStoreIndex.as_query_engine(llm=sllm)` and it will automatically respond to your RAG queries with structured objects.
+
+The Structured LLM takes care of all the prompting for you. If you want more control over the prompt, move on to [Structured Prediction](structured_prediction.md).
diff --git a/docs/docs/understanding/extraction/structured_prediction.md b/docs/docs/understanding/extraction/structured_prediction.md
new file mode 100644
index 0000000000000000000000000000000000000000..06d867dff8d3c533476b0341fbbdedd843ff70d9
--- /dev/null
+++ b/docs/docs/understanding/extraction/structured_prediction.md
@@ -0,0 +1,103 @@
+# Structured Prediction
+
+Structured Prediction gives you more granular control over how your application calls the LLM and uses Pydantic. We will use the same `Invoice` class, load the PDF as we did in the previous example, and use OpenAI as before. Instead of creating a structured LLM, we will call `structured_predict` on the LLM itself; this is a method of every LLM class.
+
+Structured predict takes a Pydantic class and a Prompt Template as arguments, along with keyword arguments of any variables in the prompt template.
+
+```python
+from llama_index.core.prompts import PromptTemplate
+
+prompt = PromptTemplate(
+    "Extract an invoice from the following text. If you cannot find an invoice ID, use the company name '{company_name}' and the date as the invoice ID: {text}"
+)
+
+response = llm.structured_predict(
+    Invoice, prompt, text=text, company_name="Uber"
+)
+```
+
+As you can see, this allows us to include additional prompt direction for what the LLM should do if Pydantic isn’t quite enough to parse the data correctly. The response object in this case is the Pydantic object itself. We can get the output as JSON if we want:
+
+```python
+json_output = response.model_dump_json()
+print(json.dumps(json.loads(json_output), indent=2))
+```
+
+```python
+{
+    "invoice_id": "Uber-2024-10-10",
+    "date": "2024-10-10T19:49:00",
+    "line_items": [
+        {"item_name": "Trip fare", "price": 12.18},
+        {"item_name": "Access for All Fee", "price": 0.1},
+        ...,
+    ],
+}
+```
+
+`structured_predict` has several variants available for different use cases, including async (`astructured_predict`) and streaming (`stream_structured_predict`, `astream_structured_predict`).
+
+## Under the hood
+
+Depending on which LLM you use, `structured_predict` is using one of two different classes to handle calling the LLM and parsing the output.
+
+### FunctionCallingProgram
+
+If the LLM you are using has a function calling API, `FunctionCallingProgram` will
+
+- Convert the Pydantic object into a tool
+- Prompt the LLM while forcing it to use this tool
+- Return the Pydantic object generated
+
+This is generally a more reliable method and will be used by preference if available. However, some LLMs are text-only and they will use the other method.
+
+### LLMTextCompletionProgram
+
+If the LLM is text-only, `LLMTextCompletionProgram` will
+
+- Output the Pydantic schema as JSON
+- Send the schema and the data to the LLM with prompt instructions to respond in a form that conforms to the schema
+- Call `model_validate_json()` on the Pydantic object, passing in the raw text returned from the LLM
+
+This is notably less reliable, but supported by all text-based LLMs.
+
+## Calling prediction classes directly
+
+In practice `structured_predict` should work well for any LLM, but if you need lower-level control it is possible to call `FunctionCallingProgram` and `LLMTextCompletionProgram` directly and further customize what’s happening:
+
+```python
+textCompletion = LLMTextCompletionProgram.from_defaults(
+    output_cls=Invoice,
+    llm=llm,
+    prompt=PromptTemplate(
+        "Extract an invoice from the following text. If you cannot find an invoice ID, use the company name '{company_name}' and the date as the invoice ID: {text}"
+    ),
+)
+
+output = textCompletion(company_name="Uber", text=text)
+```
+
+The above is identical to calling `structured_predict` on an LLM without function calling APIs and returns a Pydantic object just like `structured_predict` does. However, you can customize how the output is parsed by subclassing the `PydanticOutputParser`:
+
+```python
+from llama_index.core.output_parsers import PydanticOutputParser
+
+
+class MyOutputParser(PydanticOutputParser):
+    def get_pydantic_object(self, text: str):
+        # do something more clever than this
+        return self.output_cls.model_validate_json(text)
+
+
+textCompletion = LLMTextCompletionProgram.from_defaults(
+    llm=llm,
+    prompt=PromptTemplate(
+        "Extract an invoice from the following text. If you cannot find an invoice ID, use the company name '{company_name}' and the date as the invoice ID: {text}"
+    ),
+    output_parser=MyOutputParser(output_cls=Invoice),
+)
+```
+
+This is useful if you are using a low-powered LLM that needs help with the parsing.
+
+In the final section we will take a look at even [lower-level calls to extract structured data](lower_level.md), including extracting multiple structures in the same call.
diff --git a/docs/docs/use_cases/extraction.md b/docs/docs/use_cases/extraction.md
index 7dfe92d2edcac1f8c1671e80ee7fb7d6da0655ad..9dfa11a7ea725d05e767309a0a12281e5603f584 100644
--- a/docs/docs/use_cases/extraction.md
+++ b/docs/docs/use_cases/extraction.md
@@ -8,27 +8,25 @@ This can be especially useful when you have unstructured source material like ch
 
 Once you have structured data you can send them to a database, or you can parse structured outputs in code to automate workflows.
 
-## Core Guides
+## Full tutorial
 
-#### Quickstart
-The simplest way to perform structured extraction is with our LLM classes. Take a look at the following starter resources:
-- [Simple Guide to Structured Outputs](../examples/structured_outputs/structured_outputs.ipynb)
+Our Learn section has a [full tutorial on structured data extraction](../understanding/extraction/index.md). We recommend starting out there.
 
-There are also relevant sections for our LLM guides: [OpenAI](../examples/llm/openai.ipynb), [Anthropic](../examples/llm/anthropic.ipynb), and [Mistral](../examples/llm/mistralai.ipynb).
+There is also an [example notebook](../examples/structured_outputs/structured_outputs.ipynb) demonstrating some of the techniques from the tutorial.
 
-#### In-depth Guides
-For a more comprehensive overview of structured data extraction with LlamaIndex, including lower-level modules, check out the following guides. Check out our standalone lower-level modules like Pydantic programs or as part of a RAG workflow.
-We also have standalone output parsing modules that you can use yourself with an LLM / prompt.
+## Other Guides
+
+For a more comprehensive overview of structured data extraction with LlamaIndex, including lower-level modules, check out the following guides:
 
 - [Structured Outputs](../module_guides/querying/structured_outputs/index.md)
-- [Pydantic Program](../module_guides/querying/structured_outputs/pydantic_program.md)
+- [Pydantic Programs](../module_guides/querying/structured_outputs/pydantic_program.md)
 - [Output Parsing](../module_guides/querying/structured_outputs/output_parser.md)
 
 We also have multi-modal structured data extraction. [Check it out](../use_cases/multimodal.md#simple-evaluation-of-multi-modal-rag).
 
-## Misc Examples
+## Miscellaneous Examples
 
-Some additional miscellaneous examples highlighting use cases:
+Some additional examples highlighting use cases:
 
 - [Extracting names and locations from descriptions of people](../examples/output_parsing/df_program.ipynb)
 - [Extracting album data from music reviews](../examples/llm/llama_api.ipynb)
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 8dc70caacb1cbd643253aa44e374a1235b445992..fc224ad7e204cc6055085a7d4c1cea5d7a959aa5 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -61,6 +61,11 @@ nav:
           - Nested workflows: ./understanding/workflows/nested.md
           - Observability: ./understanding/workflows/observability.md
           - Unbound syntax: ./understanding/workflows/unbound_functions.md
+      - Structured Data Extraction:
+          - Introduction: ./understanding/extraction/index.md
+          - Using Structured LLMs: ./understanding/extraction/structured_llms.md
+          - Structured Prediction: ./understanding/extraction/structured_prediction.md
+          - Lower-level extraction: ./understanding/extraction/lower_level.md
       - Tracing and Debugging: ./understanding/tracing_and_debugging/tracing_and_debugging.md
       - Evaluating:
           - ./understanding/evaluating/evaluating.md