Commit 09f5671e authored by Ravi Theja, committed by GitHub

Add mistralai large cookbook (#11410)

* Add mistralai large cookbook

* Update documentation and colab link
parent 6d642a08
%% Cell type:markdown id: tags:
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/cookbooks/mistralai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
%% Cell type:markdown id: tags:
# MistralAI Cookbook
MistralAI released the [mistral-large](https://mistral.ai/news/mistral-large/) model, with enhanced capabilities including function calling, reasoning, precise instruction-following, and JSON mode.
This cookbook showcases how to use the `mistral-large` model with llama-index.
%% Cell type:markdown id: tags:
### Setup LLM and Embedding Model
%% Cell type:code id: tags:
``` python
import nest_asyncio

# Allow nested event loops so async calls work inside the notebook.
nest_asyncio.apply()

import os

# Replace the placeholder with your own MistralAI API key.
os.environ["MISTRAL_API_KEY"] = "YOUR MISTRALAI API KEY"

from llama_index.llms.mistralai import MistralAI
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.core import Settings

llm = MistralAI(model="mistral-large", temperature=0.1)
embed_model = MistralAIEmbedding(model_name="mistral-embed")

# Use these models as the global defaults for indexes and query engines.
Settings.llm = llm
Settings.embed_model = embed_model
```
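%% Cell type:markdown id: tags:
Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the key could instead be read from the environment before the models are created. The helper name `resolve_api_key` and its `prompt` flag are hypothetical and not part of llama-index or the MistralAI client.
%% Cell type:code id: tags:
```python
import os
from getpass import getpass


def resolve_api_key(env_var: str = "MISTRAL_API_KEY", prompt: bool = False) -> str:
    """Return the key stored in `env_var` (hypothetical helper, not a library API).

    If the variable is unset and `prompt` is True, ask for the key interactively
    without echoing it. Raise RuntimeError when no key can be found.
    """
    key = os.environ.get(env_var, "").strip()
    if not key and prompt:
        key = getpass(f"Enter {env_var}: ").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key
```
%% Cell type:markdown id: tags:
In a notebook, `os.environ["MISTRAL_API_KEY"] = resolve_api_key(prompt=True)` would then replace the hardcoded assignment.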
%% Cell type:markdown id: tags:
### Download Data
We will use the 2021 10-K SEC filings of Uber and Lyft.
%% Cell type:code id: tags:
``` python
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf' -O './uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf' -O './lyft_2021.pdf'
```
%% Output
--2024-02-27 01:17:30-- https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8000::154, 2606:50c0:8001::154, 2606:50c0:8002::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8000::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1880483 (1.8M) [application/octet-stream]
Saving to: './uber_2021.pdf'
./uber_2021.pdf 100%[===================>] 1.79M 7.16MB/s in 0.3s
2024-02-27 01:17:31 (7.16 MB/s) - './uber_2021.pdf' saved [1880483/1880483]
--2024-02-27 01:17:31-- https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8000::154, 2606:50c0:8001::154, 2606:50c0:8002::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8000::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1440303 (1.4M) [application/octet-stream]
Saving to: './lyft_2021.pdf'
./lyft_2021.pdf 100%[===================>] 1.37M 2.34MB/s in 0.6s
2024-02-27 01:17:32 (2.34 MB/s) - './lyft_2021.pdf' saved [1440303/1440303]
%% Cell type:markdown id: tags:
### Load Data
%% Cell type:code id: tags:
``` python
from llama_index.core import SimpleDirectoryReader
uber_docs = SimpleDirectoryReader(input_files=["./uber_2021.pdf"]).load_data()
lyft_docs = SimpleDirectoryReader(input_files=["./lyft_2021.pdf"]).load_data()
```
%% Cell type:markdown id: tags:
### Build RAG on Uber docs
%% Cell type:code id: tags:
``` python
from llama_index.core import VectorStoreIndex
uber_index = VectorStoreIndex.from_documents(uber_docs)
uber_query_engine = uber_index.as_query_engine(similarity_top_k=5)
response = uber_query_engine.query("What is the revenue of uber in 2021?")
print(response)
```
%% Output
The revenue of Uber in 2021 was $17,455 million.
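%% Cell type:markdown id: tags:
The `similarity_top_k=5` argument above asks the retriever for the five chunks most similar to the query. The sketch below illustrates that idea in plain Python with cosine similarity. It is a toy illustration, not how `VectorStoreIndex` is implemented, and the two-dimensional vectors are made-up stand-ins for real embeddings.
%% Cell type:code id: tags:
```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], chunks: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(chunks)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-d "embeddings"; real embedding vectors have many more dimensions.
chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6], [-1.0, 0.0]]
query_vector = [1.0, 0.0]

best = top_k(query_vector, chunk_vectors, k=2)
print([index for index, _ in best])  # [0, 2]
```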
%% Cell type:markdown id: tags:
### Compare `Uber` and `Lyft` revenue
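We will use `SubQuestionQueryEngine`, which breaks a complex query into sub-questions, sends each one to the relevant tool, and synthesizes a final answer.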
%% Cell type:code id: tags:
``` python
lyft_index = VectorStoreIndex.from_documents(lyft_docs)
lyft_query_engine = lyft_index.as_query_engine(similarity_top_k=5)
```
%% Cell type:code id: tags:
``` python
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_query_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description="Provides information about Lyft financials for year 2021",
        ),
    ),
    QueryEngineTool(
        query_engine=uber_query_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description="Provides information about Uber financials for year 2021",
        ),
    ),
]
sub_question_query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools
)
```
%% Cell type:code id: tags:
``` python
response = await sub_question_query_engine.aquery(
    "Compare revenue growth of Uber and Lyft from 2020 to 2021"
)
print(response)
```
%% Output
Generated 4 sub questions.
[uber_10k] Q: What was the revenue of Uber in 2020
[uber_10k] Q: What was the revenue of Uber in 2021
[lyft_10k] Q: What was the revenue of Lyft in 2020
[lyft_10k] Q: What was the revenue of Lyft in 2021
[lyft_10k] A: The revenue of Lyft in 2021 was $3,208,323.
[uber_10k] A: The revenue of Uber in 2021 was $17,455 million.
[lyft_10k] A: The revenue of Lyft in 2020 was $2,364,681 (in thousands).
[uber_10k] A: The revenue of Uber in 2020 was $11,139 million.
From 2020 to 2021, both Uber and Lyft experienced revenue growth. Uber's revenue increased from $11,139 million in 2020 to $17,455 million in 2021. On the other hand, Lyft's revenue grew from $2,364,681 (in thousands) in 2020 to $3,208,323 in 2021. This indicates that both companies had a positive growth trajectory in their revenues during this period.
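%% Cell type:markdown id: tags:
The synthesized answer reports the revenue figures but stops short of computing the growth rates. As a quick check, the percentages can be worked out from the numbers in the output above. This sketch assumes that Lyft's 2021 figure is in thousands of dollars like its 2020 figure, since the model's answer omits the unit.
%% Cell type:code id: tags:
```python
def growth_pct(previous: float, current: float) -> float:
    """Year-over-year growth as a percentage of the previous value."""
    return (current - previous) / previous * 100.0


# Figures copied from the sub-question answers above.
uber_growth = growth_pct(11_139, 17_455)        # USD millions
lyft_growth = growth_pct(2_364_681, 3_208_323)  # USD thousands (unit assumed for 2021)

print(f"Uber: {uber_growth:.1f}%  Lyft: {lyft_growth:.1f}%")
# Uber: 56.7%  Lyft: 35.7%
```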
%% Cell type:markdown id: tags:
### Route queries between `Uber` and `Lyft`
%% Cell type:code id: tags:
``` python
from llama_index.core import SummaryIndex
summary_index = SummaryIndex.from_documents(uber_docs)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
```
%% Cell type:code id: tags:
``` python
from llama_index.core.tools import QueryEngineTool
lyft_vector_tool = QueryEngineTool.from_defaults(
    query_engine=lyft_query_engine,
    description=(
        "Useful for retrieving specific context from lyft 10k SEC filings of 2021"
    ),
)
uber_vector_tool = QueryEngineTool.from_defaults(
    query_engine=uber_query_engine,
    description=(
        "Useful for retrieving specific context from uber 10k SEC filings of 2021"
    ),
)
```
%% Cell type:code id: tags:
``` python
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
router_query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        lyft_vector_tool,
        uber_vector_tool,
    ],
    verbose=True,
)
```
%% Cell type:code id: tags:
``` python
response = router_query_engine.query("What is the revenue of uber in 2021?")
print(str(response))
```
%% Output
Selecting query engine 1: This choice is relevant because it pertains to Uber's 10k SEC filings of 2021, where the revenue information for the year is likely to be found..
The revenue of Uber in 2021 was $17,455 million.
%% Cell type:code id: tags:
``` python
response = router_query_engine.query(
    "What are the investments made by lyft in 2021?"
)
print(str(response))
```
%% Output
Selecting query engine 0: This choice is most relevant to the question as it pertains to retrieving specific context from Lyft's 10k SEC filings of 2021, where information about Lyft's investments made in 2021 would likely be found..
In 2021, Lyft made several investments to improve and expand their services. They continued to invest in the expansion of their network of Light Vehicles and Lyft Autonomous, which focuses on the deployment and scaling of third-party self-driving technology on the Lyft network. They also invested in their Express Drive program, which provides drivers access to rental cars they can use for ridesharing. Additionally, they made investments in their Driver Centers, Mobile Services, and related partnerships that offer drivers affordable and convenient vehicle maintenance services. Furthermore, they invested in their proprietary technology, including mapping, routing, payments, in-app navigation, matching technologies, and data science to make their network more efficient and seamless to use. They also acquired certain money market deposit accounts and cash in transit from payment processors for credit and debit card transactions. Short-term investments consisted of commercial paper, certificates of deposit, corporate bonds, and term deposits, which mature in 12 months or less. Restricted cash, cash equivalents, and investments consisted primarily of amounts held in separate trust accounts and restricted bank accounts as collateral for insurance purposes and amounts pledged to secure certain letters of credit.
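%% Cell type:markdown id: tags:
In both runs above, the router picks exactly one tool by matching the query against the tool descriptions. `LLMSingleSelector` delegates that choice to the LLM. The sketch below swaps in a much cruder stand-in, plain word overlap, only to show the shape of single selection: a query and a list of descriptions go in, and one index comes out. The function `select_tool` is hypothetical and not part of llama-index.
%% Cell type:code id: tags:
```python
from typing import List


def select_tool(query: str, descriptions: List[str]) -> int:
    """Return the index of the description sharing the most words with the query.

    A keyword-overlap stand-in for an LLM-based selector; ties go to the
    first description.
    """
    query_words = set(query.lower().replace("?", "").split())
    overlaps = [len(query_words & set(d.lower().split())) for d in descriptions]
    return max(range(len(descriptions)), key=lambda i: overlaps[i])


# Tool descriptions copied from the cells above, in the same order as the router.
descriptions = [
    "Useful for retrieving specific context from lyft 10k SEC filings of 2021",
    "Useful for retrieving specific context from uber 10k SEC filings of 2021",
]

print(select_tool("What is the revenue of uber in 2021?", descriptions))  # 1
print(select_tool("What are the investments made by lyft in 2021?", descriptions))  # 0
```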
%% Cell type:markdown id: tags:
### Tools usage
%% Cell type:code id: tags:
``` python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
```
%% Cell type:code id: tags:
``` python
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the resulting integer"""
    return a * b


multiply_tool = FunctionTool.from_defaults(fn=multiply)


def add(a: int, b: int) -> int:
    """Add two integers and return the resulting integer"""
    return a + b


add_tool = FunctionTool.from_defaults(fn=add)

tools = [multiply_tool, add_tool]
agent = ReActAgent.from_tools(tools, llm=llm, verbose=True)
```
%% Cell type:code id: tags:
``` python
response = agent.chat("What is (26 * 2) + 2024?")
print(response)
```
%% Output
Thought: I need to use a tool to multiply 26 and 2.
Action: multiply
Action Input: {'a': 26, 'b': 2}
Observation: 52
Thought: I need to use a tool to add the result of the multiplication to 2024.
Action: add
Action Input: {'a': 52, 'b': 2024}
Observation: 2076
Thought: I can answer without using any more tools.
Answer: The result of (26 * 2) + 2024 is 2076.
The result of (26 * 2) + 2024 is 2076.
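%% Cell type:markdown id: tags:
The trace above alternates between an action (a tool name plus keyword arguments) and an observation (the tool's return value). The sketch below replays those two actions in plain Python to show the dispatch step on its own. It is not the `ReActAgent` implementation: the `TOOLS` registry and `run_actions` helper are hypothetical, and the reasoning that chooses each action is left out because the LLM does that part.
%% Cell type:code id: tags:
```python
from typing import Callable, Dict, List, Tuple


def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the resulting integer"""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the resulting integer"""
    return a + b


# Hypothetical registry mapping the action names in the trace to functions.
TOOLS: Dict[str, Callable[..., int]] = {"multiply": multiply, "add": add}


def run_actions(actions: List[Tuple[str, Dict[str, int]]]) -> List[int]:
    """Call each named tool with its keyword arguments and collect the observations."""
    observations = []
    for name, kwargs in actions:
        observations.append(TOOLS[name](**kwargs))
    return observations


# The two (Action, Action Input) pairs from the agent output above.
trace = [
    ("multiply", {"a": 26, "b": 2}),
    ("add", {"a": 52, "b": 2024}),
]
print(run_actions(trace))  # [52, 2076]
```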
......@@ -184,6 +184,7 @@ maxdepth: 1
maxdepth: 1
---
/examples/llm/mistralai.ipynb
/cookbooks/mistralai.ipynb
```
## MonsterAPI
......