"Original Sentence: Over the 21st century, AMOC will very likely decline for all SSP \n",
"scenarios but will not involve an abrupt collapse before 2100 (WGI \n",
"AR6 Sections 4.3.2, 9.2.3.1; Fox-Kemper et al., 2021; Lee et al., 2021).\n",
"\n"
]
}
],
...
%% Cell type:markdown id: tags:
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/MetadataReplacementDemo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
%% Cell type:markdown id: tags:
# Metadata Replacement + Node Sentence Window
In this notebook, we use the `SentenceWindowNodeParser` to parse documents into single sentences per node. Each node also contains a "window" with the sentences on either side of the node sentence.
Then, during retrieval, before passing the retrieved sentences to the LLM, the single sentences are replaced with a window containing the surrounding sentences using the `MetadataReplacementNodePostProcessor`.
This is most useful for large documents/indexes, as it helps to retrieve more fine-grained details.
By default, the sentence window is 5 sentences on either side of the original sentence.
In this case, chunk size settings are not used, in favor of following the window settings.
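
To make the parsing step concrete, here is a minimal, library-free sketch of the idea. It is not the `SentenceWindowNodeParser` implementation: the regex sentence splitter, the dict-based node layout, the `build_sentence_windows` helper, and the `"window"` metadata key are all assumptions made for illustration.

``` python
import re


def build_sentence_windows(text, window_size=3):
    """Create one node per sentence and store its neighbours in metadata.

    Hypothetical helper: a naive regex splitter stands in for a real
    sentence tokenizer, and plain dicts stand in for node objects.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    nodes = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append(
            {
                "text": sentence,  # this is what would be embedded
                "metadata": {
                    "original_text": sentence,
                    # assumed key name for the surrounding sentences
                    "window": " ".join(sentences[start:end]),
                },
            }
        )
    return nodes


toy_text = "A is first. B is second. C is third. D is fourth."
nodes = build_sentence_windows(toy_text, window_size=1)
print(nodes[1]["metadata"]["window"])
```

With `window_size=1`, the node for the second sentence carries the first three sentences as its window, while its own text stays a single sentence.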
%% Cell type:code id: tags:
``` python
%load_ext autoreload
%autoreload 2
```
%% Cell type:markdown id: tags:
## Setup
%% Cell type:markdown id: tags:
If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
%% Cell type:code id: tags:
``` python
from llama_index import SimpleDirectoryReader

# Load the IPCC chapter PDF from the working directory
documents = SimpleDirectoryReader(
    input_files=["./IPCC_AR6_WGII_Chapter03.pdf"]
).load_data()
```
%% Cell type:markdown id: tags:
### Extract Nodes
We extract the set of nodes that will be stored in the VectorIndex. This includes both the nodes produced by the sentence window parser and the "base" nodes extracted using the standard parser.
There is low confidence in the quantification of Atlantic Meridional Overturning Circulation (AMOC) changes in the 20th century due to low agreement in quantitative reconstructed and simulated trends. Additionally, direct observational records since the mid-2000s remain too short to determine the relative contributions of internal variability, natural forcing, and anthropogenic forcing to AMOC change. However, it is very likely that AMOC will decline for all SSP scenarios over the 21st century, but it will not involve an abrupt collapse before 2100.
%% Cell type:markdown id: tags:
We can also check the original sentence that was retrieved for each node, as well as the actual window of sentences that was sent to the LLM.
There are concerns surrounding the AMOC (Atlantic Meridional Overturning Circulation). The context information mentions that the AMOC will decline over the 21st century, with high confidence but low confidence for quantitative projections.
%% Cell type:markdown id: tags:
## Analysis
So the `SentenceWindowNodeParser` + `MetadataReplacementNodePostProcessor` combo is the clear winner here. But why?
Embeddings at a sentence level seem to capture more fine-grained details, like the word `AMOC`.
We can also compare the retrieved chunks for each index!
%% Cell type:code id: tags:
``` python
# Print the single sentence each retrieved node was embedded from
for source_node in window_response.source_nodes:
    print(source_node.node.metadata["original_text"])
    print("--------")
```
%% Output
Over the 21st century, AMOC will very likely decline for all SSP
scenarios but will not involve an abrupt collapse before 2100 (WGI
AR6 Sections 4.3.2, 9.2.3.1; Fox-Kemper et al., 2021; Lee et al., 2021).
--------
Direct observational records since the mid-2000s remain too short to
determine the relative contributions of internal variability, natural
forcing and anthropogenic forcing to AMOC change (high confidence)
(WGI AR6 Sections 2.3.3, 9.2.3.1; Fox-Kemper et al., 2021; Gulev et al.,
2021).
--------
%% Cell type:markdown id: tags:
Here, we can see that the sentence window index easily retrieved two nodes that talk about AMOC. Remember, the embeddings are based purely on the original sentence here, but the LLM actually ends up reading the surrounding context as well!
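
The replacement step can be sketched in a few lines. This is a simplified stand-in for what `MetadataReplacementNodePostProcessor` does conceptually, not its actual code: the dict-based nodes, the `replace_with_metadata` helper, the `"window"` key, and the toy strings are assumptions for illustration.

``` python
def replace_with_metadata(nodes, target_key="window"):
    """Return copies of the nodes with text swapped for metadata[target_key].

    Hypothetical helper: nodes are plain dicts here, and a node without the
    target key keeps its original text.
    """
    replaced = []
    for node in nodes:
        new_text = node["metadata"].get(target_key, node["text"])
        replaced.append({**node, "text": new_text})
    return replaced


# Toy retrieved node: matched on one sentence, but carrying a wider window.
retrieved = [
    {
        "text": "Sentence two.",
        "metadata": {
            "original_text": "Sentence two.",
            "window": "Sentence one. Sentence two. Sentence three.",
        },
    }
]

llm_context = replace_with_metadata(retrieved)
print(llm_context[0]["text"])
```

The retrieved node is matched on its single sentence, yet the text handed to the LLM is the full window; the original node is left unmodified.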
%% Cell type:markdown id: tags:
Now, let's try to dissect why the naive vector index failed.
%% Cell type:code id: tags:
``` python
# Check which retrieved base chunks mention AMOC at all
for node in vector_response.source_nodes:
    print("AMOC mentioned?", "AMOC" in node.node.text)
    print("--------")
```
%% Output
AMOC mentioned? False
--------
AMOC mentioned? False
--------
AMOC mentioned? True
--------
AMOC mentioned? False
--------
AMOC mentioned? False
--------
%% Cell type:markdown id: tags:
So the source node at index [2] mentions AMOC, but what did this text actually look like?
changes‘Current Arctic sea ice coverage levels are the
lowest since at least 1850 for both annual mean
and late-summer values (high confidence).’2.3.2.1, 9.3.1 (Fox-Kemper
et al. 2021; Gulev et al.
2021)‘The Arctic will become practically ice-free in
September by the end of the 21st century under
SSP2-4.5, SSP3-7.0 and SSP5-8.5[…](high
confidence).’4.3.2.1, 9.3.1 (Fox-Kemper
et al. 2021; Lee et al.
2021)
Antarctic sea ice
changesThere is no global significant trend in
Antarctic sea ice area from 1979 to 2020 (high
confidence).2.3.2.1, 9.3.2 (Fox-Kemper
et al. 2021; Gulev et al.
2021)There is low confidence in model simulations of
future Antarctic sea ice.9.3.2 (Fox-Kemper et al.
2021)
Ocean chemistry
Changes in salinityThe ‘large-scale, near-surface salinity contrasts
have intensified since at least 1950 […]
(virtually certain).’2.3.3.2, 9.2.2.2
(Fox-Kemper et al. 2021;
Gulev et al. 2021)‘Fresh ocean regions will continue to get fresher
and salty ocean regions will continue to get
saltier in the 21st century (medium confidence).’9.2.2.2 (Fox-Kemper et al.
2021)
Ocean acidificationOcean surface pH has declined globally over the
past four decades (virtually certain).2.3.3.5, 5.3.2.2 (Canadell
et al. 2021; Gulev et al.
2021)Ocean surface pH will continue to decrease
‘through the 21st century, except for the
lower-emission scenarios SSP1-1.9 and SSP1-2.6
[…] (high confidence).’4.3.2.5, 4.5.2.2, 5.3.4.1
(Lee et al. 2021; Canadell
et al. 2021)
Ocean
deoxygenationDeoxygenation has occurred in most open
ocean regions since the mid-20th century (high
confidence).2.3.3.6, 5.3.3.2 (Canadell
et al. 2021; Gulev et al.
2021)Subsurface oxygen content ‘is projected to
transition to historically unprecedented condition
with decline over the 21st century (medium
confidence).’5.3.3.2 (Canadell et al.
2021)
Changes in nutrient
concentrationsNot assessed in WGI Not assessed in WGI
%% Cell type:markdown id: tags:
So AMOC is discussed, but sadly it is in the middle chunk. LLMs often ignore, or make less use of, text in the middle of the retrieved context. A recent paper, ["Lost in the Middle"](https://arxiv.org/abs/2307.03172), discusses this.
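
One commonly suggested mitigation, which is not used in this notebook, is to reorder the retrieved chunks so that the highest-scoring ones sit at the start and end of the context and the weakest land in the middle. The sketch below is only an illustration of that idea: `reorder_for_long_context`, the `(name, score)` tuples, and the scores themselves are hypothetical.

``` python
def reorder_for_long_context(scored_chunks):
    """Place the best-scoring chunks at the edges of the context.

    Hypothetical helper: takes (chunk, score) pairs and returns them so that
    rank 1 is first, rank 2 is last, rank 3 is second, and so on.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    front, back = [], []
    for i, item in enumerate(ranked):
        # Alternate between the front half and the back half.
        (front if i % 2 == 0 else back).append(item)
    return front + back[::-1]


# Made-up retrieval scores for five chunks.
chunks = [("c1", 0.5), ("c2", 0.9), ("c3", 0.7), ("c4", 0.8), ("c5", 0.6)]
reordered = reorder_for_long_context(chunks)
print([name for name, _ in reordered])
```

Here the two strongest chunks (`c2`, `c4`) end up first and last, and the weakest (`c1`) ends up in the middle.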
%% Cell type:markdown id: tags:
## [Optional] Evaluation
Here we evaluate more rigorously how well the sentence window retriever works compared to the base retriever.
We define/load an eval benchmark dataset and then run different evaluations over it.
**WARNING**: This can be *expensive*, especially with GPT-4. Use caution and tune the sample size to fit your budget.
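
One way to keep the cost bounded is to generate questions from a small random sample of nodes rather than from all of them. The following is a minimal sketch of that idea under stated assumptions: the `sample_eval_nodes` helper, the fixed seed, the `candidate_limit` cutoff, and the string stand-ins for nodes are all hypothetical, and the sizes are illustrative.

``` python
import random


def sample_eval_nodes(nodes, num_samples, candidate_limit=None, seed=42):
    """Pick a reproducible random subset of nodes for evaluation.

    Hypothetical helper: `candidate_limit` restricts sampling to the first
    N nodes, e.g. to skip a trailing reference section.
    """
    candidates = nodes if candidate_limit is None else nodes[:candidate_limit]
    rng = random.Random(seed)  # local RNG so the global state is untouched
    return rng.sample(candidates, min(num_samples, len(candidates)))


# Stand-in nodes; in practice these would be the parsed document nodes.
toy_nodes = [f"node-{i}" for i in range(500)]
eval_nodes = sample_eval_nodes(toy_nodes, num_samples=30, candidate_limit=200)
print(len(eval_nodes))
```

Since each sampled node leads to one or more LLM calls during question generation and evaluation, `num_samples` is the main knob for controlling spend.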
%% Cell type:code id: tags:
``` python
from llama_index.evaluation import (
    DatasetGenerator,
    QueryResponseDataset,
)
from llama_index import ServiceContext
from llama_index.llms import OpenAI
import nest_asyncio
import random

# Allow nested event loops so async code can run inside the notebook
nest_asyncio.apply()
```
%% Cell type:code id: tags:
``` python
len(base_nodes)
```
%% Output
428
%% Cell type:code id: tags:
``` python
num_nodes_eval = 30
# there are 428 nodes total. Take the first 200 to generate questions (the back half of the doc is all references)