%% Cell type:markdown id:4e11912c tags:
# Function Calling 101: An eCommerce Use Case
%% Cell type:markdown id:75a04a76 tags:
## 1. Introduction to Function Calling
%% Cell type:markdown id:64a7d176 tags:
### 1a. What is function calling and why is it important?
%% Cell type:markdown id:84ab38d9 tags:
Function calling (or tool use) in the context of large language models (LLMs) is the process of an LLM invoking a pre-defined function instead of generating a text response. LLMs are **non-deterministic**, offering flexibility and creativity, but this can lead to inconsistencies and occasional hallucinations, with the training data often being outdated. In contrast, traditional software is **deterministic**, executing tasks precisely as programmed but lacking adaptability. Function calling with LLMs aims to combine the best of both worlds: leveraging the flexibility and creativity of LLMs while ensuring consistent, repeatable actions and reducing hallucinations by utilizing pre-defined functions.
%% Cell type:markdown id:5384f37a tags:
### 1b. What is it doing?
%% Cell type:markdown id:c51bdac3 tags:
Function calling essentially arms your LLM with custom tools to perform specific tasks that a generic LLM might struggle with. During an interaction, the LLM determines which tool to call and what parameters to use, allowing it to execute actions it otherwise couldn’t. This enables the LLM to either perform an action directly or relay the function’s output back to itself, providing more context for a follow-up chat completion. By integrating these custom tools, function calling enhances the LLM’s capabilities and precision, enabling more complex and accurate responses.
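%% Cell type:markdown id:1b2c3d4e tags:
As a schematic of a single tool-use round trip (the names here are illustrative only; the actual Groq API syntax is shown in section 2), the exchange looks like this:
%% Cell type:code id:2c3d4e5f tags:
``` python
import json

# 1. We describe a tool to the model as a JSON schema (no code is sent):
add_tool = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Adds two numbers",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    },
}

# 2. Instead of prose, the model replies with a structured tool call,
#    e.g. name="add" plus a JSON string of arguments:
tool_call_arguments = '{"a": 2, "b": 2}'

# 3. We execute the function ourselves and hand the result back to the
#    model, which then writes the final natural-language answer.
args = json.loads(tool_call_arguments)
result = args["a"] + args["b"]
print(result)  # 4
```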
%% Cell type:markdown id:434d4ed5 tags:
### 1c. What are some use cases?
%% Cell type:markdown id:57b41881 tags:
Function calling with LLMs can be applied to a variety of practical scenarios, significantly enhancing the capabilities of LLMs. Here are some organized and expanded use cases:
**1. Real-Time Information Retrieval:** LLMs can use function calling to access up-to-date information by querying APIs, databases or search tools, like the [Yahoo Finance API](https://finance.yahoo.com/) or [Tavily Search API](https://tavily.com/). This is particularly useful in domains where information changes frequently, or when you want to surface internal data to the user.
**2. Mathematical Calculations:** LLMs often face challenges with precise mathematical computations. By leveraging function calling, these calculations can be offloaded to specialized functions, ensuring accuracy and reliability.
**3. API Integration for Enhanced Functionality:** Function calling can significantly expand the capabilities of an LLM by integrating it with various APIs. This allows the LLM to perform tasks such as booking appointments, managing calendars, handling customer service requests, and more. By leveraging specific APIs, the LLM can process detailed parameters like appointment times, customer names, contact information, and service details, ensuring efficient and accurate task execution.
%% Cell type:markdown id:23825e9a tags:
## 2. Function Calling Implementation with Groq: eCommerce Use Case
%% Cell type:markdown id:cdac7b10 tags:
In this notebook, we'll show how function calling can be used for an eCommerce use case, where our LLM takes on the role of a helpful customer service representative, able to use tools to create orders and get prices on products. We will be interacting as a customer named Tom Testuser.
%% Cell type:markdown id:03f13180 tags:
We will be using [Airtable](https://airtable.com/) as our backend database for this demo, and will use the Airtable API to read and write from `customers`, `products` and `orders` tables. You can view the Airtable base [here](https://airtable.com/appQZ9KdhmjcDVSGx/shrlg9MAetUslmX2Z), but will need to copy it into your own Airtable base (click “copy base” in the upper banner) in order to fully follow along with this guide and build on top of it.
%% Cell type:markdown id:63aadc9e tags:
### 2a. Setup
%% Cell type:markdown id:d5af0b86 tags:
We will be using Meta's Llama 3 70B model for this demo. Note that you will need a Groq API Key to proceed; you can create an account [here](https://console.groq.com/) to generate one for free.
You will also need to create an Airtable account and provision an [Airtable Personal Access Token](https://airtable.com/create/tokens) with `data.record:read` and `data.record:write` scopes. The Airtable Base ID will be in the URL of the base you copy from above.
Finally, our System Message will provide relevant context to the LLM: that it is a customer service assistant for an ecommerce company, and that it is interacting with a customer named Tom Testuser (ID: 10).
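%% Cell type:markdown id:bb20f002 tags:
Before running the setup cell, make sure these credentials are available as environment variables (the values below are placeholders for illustration, not real keys):
%% Cell type:code id:bb20f003 tags:
``` python
import os

# Placeholder values only; substitute your own credentials
os.environ["GROQ_API_KEY"] = "gsk_..."        # from console.groq.com
os.environ["AIRTABLE_API_TOKEN"] = "pat..."   # Airtable Personal Access Token
os.environ["AIRTABLE_BASE_ID"] = "app..."     # from your copied base's URL
```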
%% Cell type:code id:32d7cdcd tags:
``` python
# Setup
import json
import os
import random
import urllib.parse
from datetime import datetime

import requests
from groq import Groq

# Initialize Groq client and model
client = Groq(api_key=os.getenv("GROQ_API_KEY"))
MODEL = "llama3-70b-8192"

# Airtable variables
airtable_api_token = os.environ["AIRTABLE_API_TOKEN"]
airtable_base_id = os.environ["AIRTABLE_BASE_ID"]
```
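%% Cell type:markdown id:cc30f004 tags:
Before wiring up any tools, you can sanity-check the client with a plain chat completion (a quick smoke test, no tools involved):
%% Cell type:code id:cc30f005 tags:
``` python
# Verify the API key and model work before adding tools
test_response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(test_response.choices[0].message.content)
```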
%% Cell type:code id:0db27033 tags:
``` python
SYSTEM_MESSAGE = """
You are a helpful customer service LLM for an ecommerce company that processes orders and retrieves information about products.
You are currently chatting with Tom Testuser, Customer ID: 10
"""
```
%% Cell type:markdown id:c44f7c50-8cd7-43fd-9868-c7b2306a30d7 tags:
### 2b. Tool Creation
%% Cell type:markdown id:53ca84b3-f5d6-4a4e-9a95-7b26dd61a524 tags:
First we must define the functions (tools) that the LLM will have access to. For our use case, we will use the Airtable API to create an order (POST request to the orders table), get product prices (GET request to the products table) and get product ID (GET request to the products table).
We will then compile these tools in a list that can be passed to the LLM. Note that we must provide proper descriptions of the functions and parameters so that they can be called appropriately given the user input:
%% Cell type:code id:64e18dfc tags:
``` python
# Creates an order given a product_id and customer_id
def create_order(product_id, customer_id):
    headers = {
        "Authorization": f"Bearer {airtable_api_token}",
        "Content-Type": "application/json",
    }
    url = f"https://api.airtable.com/v0/{airtable_base_id}/orders"
    order_id = random.randint(1, 100000)  # Randomly assign an order_id
    order_datetime = datetime.utcnow().strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )  # Assign order date as now
    data = {
        "fields": {
            "order_id": order_id,
            "product_id": product_id,
            "customer_id": customer_id,
            "order_date": order_datetime,
        }
    }
    response = requests.post(url, headers=headers, json=data)
    return str(response.json())


# Gets the price for a product, given the name of the product
def get_product_price(product_name):
    headers = {"Authorization": f"Bearer {airtable_api_token}"}
    formula = f"{{name}}='{product_name}'"
    encoded_formula = urllib.parse.quote(formula)
    url = f"https://api.airtable.com/v0/{airtable_base_id}/products?filterByFormula={encoded_formula}"
    response = requests.get(url, headers=headers)
    product_price = response.json()["records"][0]["fields"]["price"]
    return "$" + str(product_price)


# Gets product ID given a product name
def get_product_id(product_name):
    headers = {"Authorization": f"Bearer {airtable_api_token}"}
    formula = f"{{name}}='{product_name}'"
    encoded_formula = urllib.parse.quote(formula)
    url = f"https://api.airtable.com/v0/{airtable_base_id}/products?filterByFormula={encoded_formula}"
    response = requests.get(url, headers=headers)
    product_id = response.json()["records"][0]["fields"]["product_id"]
    return str(product_id)
```
%% Cell type:markdown id:51a7a120 tags:
Below is the structure needed to compile our list of tools so that the LLM can use them. Again, we must provide proper descriptions of the functions and parameters so that they can be called appropriately given the user input:
%% Cell type:code id:b5a12541 tags:
``` python
tools = [
    # First function: create_order
    {
        "type": "function",
        "function": {
            "name": "create_order",
            "description": "Creates an order given a product_id and customer_id. If a product name is provided, you must get the product ID first. After placing the order indicate that it was placed successfully and output the details.",
            "parameters": {
                "type": "object",
                "properties": {
                    "product_id": {
                        "type": "integer",
                        "description": "The ID of the product",
                    },
                    "customer_id": {
                        "type": "integer",
                        "description": "The ID of the customer",
                    },
                },
                "required": ["product_id", "customer_id"],
            },
        },
    },
    # Second function: get_product_price
    {
        "type": "function",
        "function": {
            "name": "get_product_price",
            "description": "Gets the price for a product, given the name of the product. Just return the price, do not do any calculations.",
            "parameters": {
                "type": "object",
                "properties": {
                    "product_name": {
                        "type": "string",
                        "description": "The name of the product (must be title case, i.e. 'Microphone', 'Laptop')",
                    }
                },
                "required": ["product_name"],
            },
        },
    },
    # Third function: get_product_id
    {
        "type": "function",
        "function": {
            "name": "get_product_id",
            "description": "Gets product ID given a product name",
            "parameters": {
                "type": "object",
                "properties": {
                    "product_name": {
                        "type": "string",
                        "description": "The name of the product (must be title case, i.e. 'Microphone', 'Laptop')",
                    }
                },
                "required": ["product_name"],
            },
        },
    },
]
```
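%% Cell type:markdown id:dd40f006 tags:
Since the model only ever sees these schemas (never the Python code), a mismatch between a schema name and its implementation is an easy bug to introduce. One lightweight guard, sketched below, is to assert that every declared tool has a matching function:
%% Cell type:code id:dd40f007 tags:
``` python
# Sanity check: every tool schema should map to a real Python function
implementations = {
    "create_order": create_order,
    "get_product_price": get_product_price,
    "get_product_id": get_product_id,
}
for entry in tools:
    name = entry["function"]["name"]
    assert name in implementations, f"No implementation for tool '{name}'"
print("All tool schemas have matching implementations")
```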
%% Cell type:markdown id:cf6325f3 tags:
### 2c. Simple Function Calling
%% Cell type:markdown id:3b1dd8ba tags:
First, let's start out by just making a simple function call with only one tool. We will ask the customer service LLM to place an order for a product with Product ID 5.
%% Cell type:markdown id:92c77018 tags:
The two key parameters we need to include in our chat completion are `tools=tools` and `tool_choice="auto"`, which provide the model with the available tools we've just defined and tell it to use one if appropriate (`tool_choice="auto"` gives the LLM the option of using any, all or none of the available functions; to mandate a specific function call, we could use `tool_choice={"type": "function", "function": {"name":"create_order"}}`).
When the LLM decides to use a tool, the response is *not* a conversational chat, but a JSON object containing the tool choice and tool parameters. From there, we can execute the LLM-identified tool with the LLM-identified parameters, and feed the response *back* to the LLM for a second request so that it can respond with appropriate context from the tool it just used:
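%% Cell type:markdown id:ee50f008 tags:
For example, to force the model to call `create_order` no matter what the user says, the request would look like this (a sketch, meant to be run after `messages` is defined in the next cell):
%% Cell type:code id:ee50f009 tags:
``` python
# Mandate a specific tool rather than letting the model decide
forced_response = client.chat.completions.create(
    model=MODEL,
    messages=messages,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "create_order"}},
)
print(forced_response.choices[0].message.tool_calls)
```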
%% Cell type:code id:482b2251 tags:
``` python
user_prompt = "Please place an order for Product ID 5"
messages = [
{"role": "system", "content": SYSTEM_MESSAGE},
{
"role": "user",
"content": user_prompt,
},
]
# Step 1: send the conversation and available functions to the model
response = client.chat.completions.create(
model=MODEL,
messages=messages,
tools=tools,
tool_choice="auto", # Let the LLM decide if it should use one of the available tools
max_tokens=4096,
)
response_message = response.choices[0].message
tool_calls = response_message.tool_calls
print("First LLM Call (Tool Use) Response:", response_message)
# Step 2: check if the model wanted to call a function
if tool_calls:
# Step 3: call the function and append the tool call to our list of messages
available_functions = {
"create_order": create_order,
}
messages.append(
{
"role": "assistant",
"tool_calls": [
{
"id": tool_call.id,
"function": {
"name": tool_call.function.name,
"arguments": tool_call.function.arguments,
},
"type": tool_call.type,
}
for tool_call in tool_calls
],
}
)
# Step 4: send the info for each function call and function response to the model
tool_call = tool_calls[0]
function_name = tool_call.function.name
function_to_call = available_functions[function_name]
function_args = json.loads(tool_call.function.arguments)
function_response = function_to_call(
product_id=function_args.get("product_id"),
customer_id=function_args.get("customer_id"),
)
messages.append(
{
"tool_call_id": tool_call.id,
"role": "tool",
"name": function_name,
"content": function_response,
}
) # extend conversation with function response
# Send the result back to the LLM to complete the chat
second_response = client.chat.completions.create(
model=MODEL, messages=messages
) # get a new response from the model where it can see the function response
print("\n\nSecond LLM Call Response:", second_response.choices[0].message.content)
```
%% Output
First LLM Call (Tool Use) Response: ChoiceMessage(content=None, role='assistant', tool_calls=[ChoiceMessageToolCall(id='call_cnyc', function=ChoiceMessageToolCallFunction(arguments='{"customer_id":10,"product_id":5}', name='create_order'), type='function')])
Second LLM Call Response: Your order has been successfully placed!
Order details:
* Order ID: 24255
* Product ID: 5
* Customer ID: 10 (that's you, Tom Testuser!)
* Order Date: 2024-05-31 13:59:03
We'll process your order shortly. You'll receive an email with further updates on your order status. If you have any questions or concerns, feel free to ask!
%% Cell type:markdown id:cb60a037 tags:
Here is the entire message sequence for a simple tool call:
%% Cell type:code id:fce83d48 tags:
``` python
print(json.dumps(messages, indent=2))
```
%% Output
[
  {
    "role": "system",
    "content": "\nYou are a helpful customer service LLM for an ecommerce company that processes orders and retrieves information about products.\nYou are currently chatting with Tom Testuser, Customer ID: 10\n"
  },
  {
    "role": "user",
    "content": "Please place an order for Product ID 5"
  },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_cnyc",
        "function": {
          "name": "create_order",
          "arguments": "{\"customer_id\":10,\"product_id\":5}"
        },
        "type": "function"
      }
    ]
  },
  {
    "tool_call_id": "call_cnyc",
    "role": "tool",
    "name": "create_order",
    "content": "{'id': 'recWasb2AECLJiRj1', 'createdTime': '2024-05-31T13:59:04.000Z', 'fields': {'order_id': 24255, 'product_id': 5, 'customer_id': 10, 'order_date': '2024-05-31T13:59:03.000Z'}}"
  }
]
%% Cell type:markdown id:513fff34 tags:
### 2d. Parallel Tool Use
%% Cell type:markdown id:b50964e8 tags:
If we need multiple function calls that **do not** depend on each other, we can run them in parallel - meaning, multiple function calls will be identified within a single chat request. Here, we are asking for the price of both a Laptop and a Microphone, which requires multiple calls of the `get_product_price` function. Note that in using parallel tool use, *the LLM itself* will decide if it needs to make multiple function calls. So we don't need to make any changes to our chat completion code, but *do* need to be able to iterate over multiple tool calls after the tools are identified.
%% Cell type:markdown id:9e0f5a0e tags:
*Note: parallel tool use is only available for Llama-based models at this time (5/27/2024).*
%% Cell type:code id:5ec93e21 tags:
``` python
user_prompt = "Please get the price for the Laptop and Microphone"
messages = [
{"role": "system", "content": SYSTEM_MESSAGE},
{
"role": "user",
"content": user_prompt,
},
]
# Step 1: send the conversation and available functions to the model
response = client.chat.completions.create(
model=MODEL, messages=messages, tools=tools, tool_choice="auto", max_tokens=4096
)
response_message = response.choices[0].message
tool_calls = response_message.tool_calls
print("First LLM Call (Tool Use) Response:", response_message)
# Step 2: check if the model wanted to call a function
if tool_calls:
# Step 3: call the function and append the tool call to our list of messages
available_functions = {
"get_product_price": get_product_price,
} # only one function in this example, but you can have multiple
messages.append(
{
"role": "assistant",
"tool_calls": [
{
"id": tool_call.id,
"function": {
"name": tool_call.function.name,
"arguments": tool_call.function.arguments,
},
"type": tool_call.type,
}
for tool_call in tool_calls
],
}
)
# Step 4: send the info for each function call and function response to the model
# Iterate over all tool calls
for tool_call in tool_calls:
function_name = tool_call.function.name
function_to_call = available_functions[function_name]
function_args = json.loads(tool_call.function.arguments)
function_response = function_to_call(
product_name=function_args.get("product_name")
)
messages.append(
{
"tool_call_id": tool_call.id,
"role": "tool",
"name": function_name,
"content": function_response,
}
) # extend conversation with function response
second_response = client.chat.completions.create(
model=MODEL, messages=messages
) # get a new response from the model where it can see the function response
print("\n\nSecond LLM Call Response:", second_response.choices[0].message.content)
```
%% Output
First LLM Call (Tool Use) Response: ChoiceMessage(content=None, role='assistant', tool_calls=[ChoiceMessageToolCall(id='call_88r0', function=ChoiceMessageToolCallFunction(arguments='{"product_name":"Laptop"}', name='get_product_price'), type='function'), ChoiceMessageToolCall(id='call_vva6', function=ChoiceMessageToolCallFunction(arguments='{"product_name":"Microphone"}', name='get_product_price'), type='function')])
Second LLM Call Response: So, the price of the Laptop is $753.03 and the price of the Microphone is $276.23. The total comes out to be $1,029.26.
%% Cell type:markdown id:90082fd7 tags:
Here is the entire message sequence for a parallel tool call:
%% Cell type:code id:50d953b7 tags:
``` python
print(json.dumps(messages, indent=2))
```
%% Output
[
  {
    "role": "system",
    "content": "\nYou are a helpful customer service LLM for an ecommerce company that processes orders and retrieves information about products.\nYou are currently chatting with Tom Testuser, Customer ID: 10\n"
  },
  {
    "role": "user",
    "content": "Please get the price for the Laptop and Microphone"
  },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_88r0",
        "function": {
          "name": "get_product_price",
          "arguments": "{\"product_name\":\"Laptop\"}"
        },
        "type": "function"
      },
      {
        "id": "call_vva6",
        "function": {
          "name": "get_product_price",
          "arguments": "{\"product_name\":\"Microphone\"}"
        },
        "type": "function"
      }
    ]
  },
  {
    "tool_call_id": "call_88r0",
    "role": "tool",
    "name": "get_product_price",
    "content": "$753.03"
  },
  {
    "tool_call_id": "call_vva6",
    "role": "tool",
    "name": "get_product_price",
    "content": "$276.23"
  }
]
%% Cell type:markdown id:53959911 tags:
### 2e. Multiple Tool Use
%% Cell type:markdown id:1d6f5a39 tags:
Multiple Tool Use is for when we need to use multiple functions where the input to one of the functions **depends on the output** of another function. Unlike parallel tool use, with multiple tool use the model outputs a single tool call per LLM request, and we then make a separate LLM request to call the next tool. To do this, we'll add a `while` loop that keeps sending LLM requests with our updated message sequence until the model has enough information and no longer needs to call any more tools. (Note that this solution generalizes to simple and parallel tool calling as well.)
%% Cell type:markdown id:946576e9 tags:
In our first example we invoked the `create_order` function by providing the product ID directly; since that is a bit clunky, we will first use the `get_product_id` function to get the product ID associated with the product name, then use that ID to call `create_order`:
%% Cell type:code id:6ea17b01 tags:
``` python
user_prompt = "Please place an order for a Microphone"
messages = [
{"role": "system", "content": SYSTEM_MESSAGE},
{
"role": "user",
"content": user_prompt,
},
]
# Continue to make LLM calls until it no longer decides to use a tool
tool_call_identified = True
while tool_call_identified:
response = client.chat.completions.create(
model=MODEL, messages=messages, tools=tools, tool_choice="auto", max_tokens=4096
)
response_message = response.choices[0].message
tool_calls = response_message.tool_calls
# Step 2: check if the model wanted to call a function
if tool_calls:
print("LLM Call (Tool Use) Response:", response_message)
# Step 3: call the function and append the tool call to our list of messages
available_functions = {
"create_order": create_order,
"get_product_id": get_product_id,
}
messages.append(
{
"role": "assistant",
"tool_calls": [
{
"id": tool_call.id,
"function": {
"name": tool_call.function.name,
"arguments": tool_call.function.arguments,
},
"type": tool_call.type,
}
for tool_call in tool_calls
],
}
)
# Step 4: send the info for each function call and function response to the model
for tool_call in tool_calls:
function_name = tool_call.function.name
function_to_call = available_functions[function_name]
function_args = json.loads(tool_call.function.arguments)
if function_name == "get_product_id":
function_response = function_to_call(
product_name=function_args.get("product_name")
)
elif function_name == "create_order":
function_response = function_to_call(
customer_id=function_args.get("customer_id"),
product_id=function_args.get("product_id"),
)
messages.append(
{
"tool_call_id": tool_call.id,
"role": "tool",
"name": function_name,
"content": function_response,
}
) # extend conversation with function response
else:
print("\n\nFinal LLM Call Response:", response.choices[0].message.content)
tool_call_identified = False
```
%% Output
LLM Call (Tool Use) Response: ChoiceMessage(content=None, role='assistant', tool_calls=[ChoiceMessageToolCall(id='call_6yd2', function=ChoiceMessageToolCallFunction(arguments='{"product_name":"Microphone"}', name='get_product_id'), type='function')])
LLM Call (Tool Use) Response: ChoiceMessage(content=None, role='assistant', tool_calls=[ChoiceMessageToolCall(id='call_mnv6', function=ChoiceMessageToolCallFunction(arguments='{"customer_id":10,"product_id":15}', name='create_order'), type='function')])
Final LLM Call Response: Your order with ID 42351 has been successfully placed! The details are: product ID 15, customer ID 10, and order date 2024-05-31T13:59:40.000Z.
%% Cell type:markdown id:865b15f0 tags:
Here is the entire message sequence for a multiple tool call:
%% Cell type:code id:bda72263 tags:
``` python
print(json.dumps(messages, indent=2))
```
%% Output
[
  {
    "role": "system",
    "content": "\nYou are a helpful customer service LLM for an ecommerce company that processes orders and retrieves information about products.\nYou are currently chatting with Tom Testuser, Customer ID: 10\n"
  },
  {
    "role": "user",
    "content": "Please place an order for a Microphone"
  },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_6yd2",
        "function": {
          "name": "get_product_id",
          "arguments": "{\"product_name\":\"Microphone\"}"
        },
        "type": "function"
      }
    ]
  },
  {
    "tool_call_id": "call_6yd2",
    "role": "tool",
    "name": "get_product_id",
    "content": "15"
  },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_mnv6",
        "function": {
          "name": "create_order",
          "arguments": "{\"customer_id\":10,\"product_id\":15}"
        },
        "type": "function"
      }
    ]
  },
  {
    "tool_call_id": "call_mnv6",
    "role": "tool",
    "name": "create_order",
    "content": "{'id': 'rectr27e5TP1UMREM', 'createdTime': '2024-05-31T13:59:41.000Z', 'fields': {'order_id': 42351, 'product_id': 15, 'customer_id': 10, 'order_date': '2024-05-31T13:59:40.000Z'}}"
  }
]
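%% Cell type:markdown id:ff60f010 tags:
The loop in section 2e generalizes to the simple and parallel cases too. Here is a minimal reusable sketch built from the same pieces as above (it assumes, as in our examples, that each schema's parameter names match the Python function's keyword arguments):
%% Cell type:code id:ff60f011 tags:
``` python
def run_conversation(user_prompt, tools, available_functions, max_rounds=5):
    """Keep calling the model until it stops requesting tools,
    then return its final text response."""
    messages = [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": user_prompt},
    ]
    for _ in range(max_rounds):  # guard against an endless tool loop
        response = client.chat.completions.create(
            model=MODEL, messages=messages, tools=tools, tool_choice="auto"
        )
        response_message = response.choices[0].message
        tool_calls = response_message.tool_calls
        if not tool_calls:
            return response_message.content
        # Record the assistant's tool calls, then execute each one
        messages.append(
            {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": tc.id,
                        "function": {
                            "name": tc.function.name,
                            "arguments": tc.function.arguments,
                        },
                        "type": tc.type,
                    }
                    for tc in tool_calls
                ],
            }
        )
        for tc in tool_calls:
            function_to_call = available_functions[tc.function.name]
            function_args = json.loads(tc.function.arguments)
            messages.append(
                {
                    "tool_call_id": tc.id,
                    "role": "tool",
                    "name": tc.function.name,
                    "content": function_to_call(**function_args),
                }
            )
    return "Stopped after reaching the maximum number of tool rounds."


print(run_conversation(
    "Please place an order for a Laptop",
    tools,
    {"create_order": create_order, "get_product_id": get_product_id},
))
```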
%% Cell type:markdown id:159b38ec tags:
### 2f. Langchain Integration
%% Cell type:markdown id:899ceec7 tags:
Finally, Groq function calling is compatible with [Langchain](https://python.langchain.com/v0.1/docs/modules/tools/) by converting your functions into Langchain tools. Here is an example using the same three functions defined above:
%% Cell type:code id:4f38cece tags:
``` python
from langchain_groq import ChatGroq
llm = ChatGroq(groq_api_key=os.getenv("GROQ_API_KEY"), model=MODEL)
```
%% Cell type:markdown id:84f9d041-a00c-4f03-a8d4-2d1e63f132c2 tags:
When defining Langchain tools, put the function description in a docstring at the beginning of the function:
%% Cell type:code id:9c52872c tags:
``` python
from langchain_core.tools import tool


@tool
def create_order(product_id, customer_id):
    """
    Creates an order given a product_id and customer_id.
    If a product name is provided, you must get the product ID first.
    After placing the order indicate that it was placed successfully and output the details.

    product_id: ID of the product
    customer_id: ID of the customer
    """
    api_token = os.environ["AIRTABLE_API_TOKEN"]
    base_id = os.environ["AIRTABLE_BASE_ID"]
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }
    url = f"https://api.airtable.com/v0/{base_id}/orders"
    order_id = random.randint(1, 100000)  # Randomly assign an order_id
    order_datetime = datetime.utcnow().strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )  # Assign order date as now
    data = {
        "fields": {
            "order_id": order_id,
            "product_id": product_id,
            "customer_id": customer_id,
            "order_date": order_datetime,
        }
    }
    response = requests.post(url, headers=headers, json=data)
    return str(response.json())


@tool
def get_product_price(product_name):
    """
    Gets the price for a product, given the name of the product.
    Just return the price, do not do any calculations.

    product_name: The name of the product (must be title case, i.e. 'Microphone', 'Laptop')
    """
    api_token = os.environ["AIRTABLE_API_TOKEN"]
    base_id = os.environ["AIRTABLE_BASE_ID"]
    headers = {"Authorization": f"Bearer {api_token}"}
    formula = f"{{name}}='{product_name}'"
    encoded_formula = urllib.parse.quote(formula)
    url = f"https://api.airtable.com/v0/{base_id}/products?filterByFormula={encoded_formula}"
    response = requests.get(url, headers=headers)
    product_price = response.json()["records"][0]["fields"]["price"]
    return "$" + str(product_price)


@tool
def get_product_id(product_name):
    """
    Gets product ID given a product name

    product_name: The name of the product (must be title case, i.e. 'Microphone', 'Laptop')
    """
    api_token = os.environ["AIRTABLE_API_TOKEN"]
    base_id = os.environ["AIRTABLE_BASE_ID"]
    headers = {"Authorization": f"Bearer {api_token}"}
    formula = f"{{name}}='{product_name}'"
    encoded_formula = urllib.parse.quote(formula)
    url = f"https://api.airtable.com/v0/{base_id}/products?filterByFormula={encoded_formula}"
    response = requests.get(url, headers=headers)
    product_id = response.json()["records"][0]["fields"]["product_id"]
    return str(product_id)


# Add tools to our LLM
tools = [create_order, get_product_price, get_product_id]
llm_with_tools = llm.bind_tools(tools)
```
%% Cell type:code id:968145b2 tags:
``` python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage
user_prompt = "Please place an order for a Microphone"
print(llm_with_tools.invoke(user_prompt).tool_calls)
```
%% Output
[{'name': 'get_product_id', 'args': {'product_name': 'Microphone'}, 'id': 'call_7f8y'}, {'name': 'create_order', 'args': {'product_id': '{result of get_product_id}', 'customer_id': ''}, 'id': 'call_zt5c'}]
%% Cell type:code id:d245e8ac tags:
``` python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage

available_tools = {
    "create_order": create_order,
    "get_product_price": get_product_price,
    "get_product_id": get_product_id,
}
messages = [SystemMessage(SYSTEM_MESSAGE), HumanMessage(user_prompt)]

# Execute one round of tool calls at a time and re-invoke the model; this
# resolves dependent calls (note the placeholder arguments in the output above)
tool_call_identified = True
while tool_call_identified:
    ai_msg = llm_with_tools.invoke(messages)
    messages.append(ai_msg)
    for tool_call in ai_msg.tool_calls:
        selected_tool = available_tools[tool_call["name"]]
        tool_output = selected_tool.invoke(tool_call["args"])
        messages.append(ToolMessage(tool_output, tool_call_id=tool_call["id"]))
    if len(ai_msg.tool_calls) == 0:
        tool_call_identified = False
        print(ai_msg.content)
```
%% Output
Your order has been placed successfully! Your order ID is 87812.
%% Cell type:markdown id:072150ea-1f44-4428-94ae-695ba94b2f7d tags:
# Retrieval-Augmented Generation for Presidential Speeches using Groq API and Langchain
%% Cell type:markdown id:d7a4fc92-eb9a-4273-8ff6-0fc5b96236d7 tags:
Retrieval-Augmented Generation (RAG) is a widely-used technique that enables us to gather pertinent information from an external data source and provide it to our Large Language Model (LLM). It helps solve two of the biggest limitations of LLMs: knowledge cutoffs, in which information after a certain date or for a specific source is not available to the LLM, and hallucination, in which the LLM makes up an answer to a question it doesn't have the information for. With RAG, we can ensure that the LLM has relevant information to answer the question at hand.
%% Cell type:markdown id:ea1ae66c-a322-467d-b789-f7ce5a636ad7 tags:
In this notebook we will be using [Groq API](https://console.groq.com), [LangChain](https://www.langchain.com/) and [Pinecone](https://www.pinecone.io/) to perform RAG on [presidential speech transcripts](https://millercenter.org/the-presidency/presidential-speeches) from the Miller Center at the University of Virginia. In doing so, we will create vector embeddings for each speech, store them in a vector database, retrieve the most relevant speech excerpts pertaining to the user prompt and include them in context for the LLM.
%% Cell type:markdown id:d7784880-495e-4d7c-a045-d12b7f57b65d tags:
### Setup
%% Cell type:code id:b4679c23-7035-4276-b3d6-95cd89916477 tags:
``` python
import pandas as pd
import numpy as np
from groq import Groq
import os
import pinecone
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import TokenTextSplitter
from langchain.docstore.document import Document
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_pinecone import PineconeVectorStore
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.metrics.pairwise import cosine_similarity
from IPython.display import display, HTML
```
%% Cell type:markdown id:4c18688b-178f-439d-90a4-590f99ade11f tags:
A Groq API Key is required for this demo - you can generate one for free [here](https://console.groq.com/). We will be using Pinecone as our vector database, which also requires an API key (you can create one index for a small project there for free on their Starter plan), but will also show how it works with [Chroma DB](https://www.trychroma.com/), a free open source alternative that stores vector embeddings in memory. We will also use the Llama3 8b model for this demo.
%% Cell type:code id:14fd5b33-360e-4fbe-ad29-11d5f759b0d3 tags:
``` python
groq_api_key = os.getenv('GROQ_API_KEY')
pinecone_api_key = os.getenv('PINECONE_API_KEY')
client = Groq(api_key = groq_api_key)
model = "llama3-8b-8192"
```
%% Cell type:markdown id:469e5b3a-6c5d-49cd-a547-222d45d7a996 tags:
### RAG Basics with One Document
%% Cell type:markdown id:283183cd-ba64-4e98-a0d9-a6165e88494e tags:
The presidential speeches we'll be using are stored in this [.csv file](https://github.com/groq/groq-api-cookbook/blob/main/presidential-speeches-rag/presidential_speeches.csv). Each row of the .csv contains fields for the date, president, party, speech title, speech summary and speech transcript, and includes every recorded presidential speech through the Trump presidency:
%% Cell type:code id:d1017409-cb0e-402b-9c53-c61729296bd2 tags:
``` python
presidential_speeches_df = pd.read_csv('presidential_speeches.csv')
presidential_speeches_df.head()
```
%% Output
Date President Party \
0 1789-04-30 George Washington Unaffiliated
1 1789-10-03 George Washington Unaffiliated
2 1790-01-08 George Washington Unaffiliated
3 1790-12-08 George Washington Unaffiliated
4 1790-12-29 George Washington Unaffiliated
Speech Title \
0 First Inaugural Address
1 Thanksgiving Proclamation
2 First Annual Message to Congress
3 Second Annual Message to Congress
4 Talk to the Chiefs and Counselors of the Senec...
Summary \
0 Washington calls on Congress to avoid local an...
1 At the request of Congress, Washington establi...
2 In a wide ranging speech, President Washington...
3 Washington focuses on commerce in his second a...
4 The President reassures the Seneca Nation that...
Transcript \
0 Fellow Citizens of the Senate and the House of...
1 Whereas it is the duty of all Nations to ackno...
2 Fellow Citizens of the Senate and House of Rep...
3 Fellow citizens of the Senate and House of Rep...
4 I the President of the United States, by my ow...
URL
0 https://millercenter.org/the-presidency/presid...
1 https://millercenter.org/the-presidency/presid...
2 https://millercenter.org/the-presidency/presid...
3 https://millercenter.org/the-presidency/presid...
4 https://millercenter.org/the-presidency/presid...
%% Cell type:markdown id:a9aaabfb-c34d-40f1-a90f-f448a9051130 tags:
To get a better idea of the steps involved in building a RAG system, let's focus on a single speech to start. In honor of his [upcoming Netflix series](https://www.netflix.com/tudum/articles/death-by-lightning-tv-series-adaptation) and his distinction of being the only president to [contribute an original proof of the Pythagorean Theorem](https://maa.org/press/periodicals/convergence/mathematical-treasure-james-a-garfields-proof-of-the-pythagorean-theorem), we'll use James Garfield's Inaugural Address:
%% Cell type:code id:39439748-8652-415d-a5e7-0f421a6ae30a tags:
``` python
garfield_inaugural = presidential_speeches_df.iloc[309].Transcript
#display(HTML(garfield_inaugural))
```
%% Cell type:markdown id:8e1a9811-2fd6-4c99-ba11-a9df2df33ec0 tags:
A challenge with prompting LLMs can be running into the limits of their context window. While this speech is not extremely long and would actually fit in Llama3's context window, it is not good practice to use far more of the context window than you need, so when using RAG we want to split up the text and provide only the relevant parts of it to the LLM. To do so, we first need to tokenize the transcript. We'll use the `sentence-transformers/all-MiniLM-L6-v2` tokenizer with the transformers `AutoTokenizer` class; this will show the number of tokens the model counts in Garfield's Inaugural Address:
%% Cell type:code id:c6057e9f-874e-4d7a-9f3c-e411a9acbb2e tags:
``` python
model_id = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# create the length function
def token_len(text):
    tokens = tokenizer.encode(text)
    return len(tokens)


token_len(garfield_inaugural)
```
%% Output
Token indices sequence length is longer than the specified maximum sequence length for this model (3420 > 512). Running this sequence through the model will result in indexing errors
3420
%% Cell type:markdown id:71de3b6c-e89b-4446-a585-32cf10a1cd8e tags:
(The token-length warning above is expected: we are only using this tokenizer to count tokens, not to run text through the model, so exceeding its 512-token maximum is harmless here.)
Next, we'll split the text into chunks using LangChain's `TokenTextSplitter`. In this example we will set the maximum tokens in a chunk to 450, with a 20-token overlap to reduce the chance that a sentence or concept is split across different chunks.
Note that LangChain uses OpenAI's `tiktoken` tokenizer, so it will count tokens a bit differently than our tokenizer does; adjusting for this, our chunk sizes will come out to around 500 of our tokens.
%% Cell type:code id:20ba719b-9a03-437a-a665-a0bde9ec24cf tags:
``` python
text_splitter = TokenTextSplitter(
    chunk_size=450,  # max of 450 tiktoken tokens per chunk (~500 with our tokenizer)
    chunk_overlap=20,  # overlap of 20 tokens between chunks (to reduce chance of cutting relevant connected text mid-sentence)
)
chunks = text_splitter.split_text(garfield_inaugural)
for chunk in chunks:
    print(token_len(chunk))
```
%% Output
453
455
467
457
457
455
461
368
%% Cell type:markdown id:ce723eea-7e69-48c1-8452-957709d117db tags:
Next, we will embed each chunk into a semantic vector space using the all-MiniLM-L6-v2 model, through LangChain's implementation of Sentence Transformers from [HuggingFace](https://huggingface.co/sentence-transformers). Note that each embedding has a length of 384.
%% Cell type:code id:740cea0e-c568-4522-995b-2bc1b9f1d4d8 tags:
``` python
chunk_embeddings = []
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
for chunk in chunks:
    chunk_embeddings.append(embedding_function.embed_query(chunk))
print(len(chunk_embeddings[0]), chunk_embeddings[0][:20])  # Shows the first 20 values of the 384-dimensional embedding
```
%% Output
384 [-0.041311442852020264, 0.04761345684528351, 0.007975001819431782, -0.030207891017198563, 0.04763732850551605, 0.03253324702382088, 0.012350181117653847, -0.044836871325969696, -0.008013647049665451, 0.015704018995165825, -0.0009443548624403775, 0.11632765829563141, -0.007115611340850592, -0.03356580808758736, -0.043237943202257156, 0.06872360408306122, -0.04552490636706352, -0.07017458975315094, -0.10271692276000977, 0.11116139590740204]
%% Cell type:markdown id:4a52f835-69f8-465d-a06e-fb5e31656b37 tags:
Finally, we will embed our prompt and use cosine similarity to find the most relevant chunk to the question we'd like answered:
%% Cell type:code id:e3315f7c-6523-4aca-a624-2e2076b3e6bf tags:
``` python
user_question = "What were James Garfield's views on civil service reform?"
```
%% Cell type:code id:587efed4-1d6a-402e-9c47-7259fbe898be tags:
``` python
prompt_embeddings = embedding_function.embed_query(user_question)
similarities = cosine_similarity([prompt_embeddings], chunk_embeddings)[0]
closest_similarity_index = np.argmax(similarities)
most_relevant_chunk = chunks[closest_similarity_index]
display(HTML(most_relevant_chunk))
```
%% Output
%% Cell type:markdown id:913592d1-454f-43c1-9255-956c7c37b222 tags:
Now, we can feed the most relevant speech excerpt into our chat completion model so that the LLM can use it to answer our question:
%% Cell type:code id:159ace36-71bf-4af9-9719-83ba1182071f tags:
``` python
# A chat completion function that will use the most relevant excerpt(s) from presidential speeches to answer the user's question
def presidential_speech_chat_completion(client, model, user_question, relevant_excerpts):
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": "You are a presidential historian. Given the user's question and relevant excerpts from presidential speeches, answer the question by including direct quotes from presidential speeches. When using a quote, cite the speech that it was from (ignoring the chunk).",
            },
            {
                "role": "user",
                "content": "User Question: " + user_question + "\n\nRelevant Speech Excerpt(s):\n\n" + relevant_excerpts,
            },
        ],
        model=model,
    )
    response = chat_completion.choices[0].message.content
    return response


presidential_speech_chat_completion(client, model, user_question, most_relevant_chunk)
```
%% Output
'James Garfield, in his inaugural address on March 4, 1881, briefly touched on the subject of civil service reform. He expressed his belief that the civil service could not be placed on a satisfactory basis until it was regulated by law. He also mentioned his intention to ask Congress to fix the tenure of minor offices and prescribe the grounds for removal during the terms for which incumbents had been appointed. He stated that this would be done to protect those with appointing power, incumbents, and to ensure honest and faithful service from executive officers. Garfield believed that offices were created for the service of the Government, not for the benefit of incumbents or their supporters.\n\nSource: Inaugural Address, March 4, 1881.'
%% Cell type:markdown id:3b144390-558b-47a5-9c67-9e3a7b6c8138 tags:
### Using a Vector DB to store and retrieve embeddings for all speeches
%% Cell type:markdown id:23ab35ad-d47b-4bfe-ac9c-5fbf4946f3ae tags:
Now, let's repeat the same process for every speech in our .csv using the same text splitter as above. Note that we will be converting our text to a `Document` object so that it integrates with the vector database, and also prepending the president, date and title to the speech transcript to provide more context to the LLM:
%% Cell type:code id:94d8ba00-4360-4313-a271-272013a74f66 tags:
``` python
documents = []
for index, row in presidential_speeches_df[presidential_speeches_df['Transcript'].notnull()].iterrows():
    chunks = text_splitter.split_text(row.Transcript)
    total_chunks = len(chunks)
    for chunk_num in range(1, total_chunks + 1):
        header = f"Date: {row['Date']}\nPresident: {row['President']}\nSpeech Title: {row['Speech Title']} (chunk {chunk_num} of {total_chunks})\n\n"
        chunk = chunks[chunk_num - 1]
        documents.append(Document(page_content=header + chunk, metadata={"source": "local"}))
print(len(documents))
```
%% Output
10698
%% Cell type:markdown id:eec976bc-5f33-49bc-a61a-5f2ee4a293d6 tags:
I will be using a Pinecone index called `presidential-speeches` for this demo. As mentioned above, you can sign up for Pinecone's Starter plan for free and have access to a single index, which is ideal for a small personal project. You can also use Chroma DB as an open source alternative. Note that either Vector DB will use the same embedding function we've defined above:
%% Cell type:code id:84d9ef15-62b4-4961-80d0-27a558389c8c tags:
``` python
pinecone_index_name = "presidential-speeches"
docsearch = PineconeVectorStore.from_documents(documents, embedding_function, index_name=pinecone_index_name)
### Use Chroma for open source option
#docsearch = Chroma.from_documents(documents, embedding_function)
```
%% Cell type:markdown id:83dcec95-98f3-4d11-bb43-2ab967741067 tags:
Fortunately, all of the manual work we did above to embed text and use cosine similarity to find the most relevant chunk is done under the hood when using a vector database. Now, we can ask our question again, over the entire corpus of presidential speeches.
%% Cell type:code id:f7d6698f-0331-43e7-9a83-2c5b684ec44c tags:
``` python
user_question = "What were James Garfield's views on civil service reform?"
```
%% Cell type:code id:91f88d8d-d2a9-4289-a3b6-0a8415f0e72b tags:
``` python
relevant_docs = docsearch.similarity_search(user_question)
# print results
#display(HTML(relevant_docs[0].page_content))
```
%% Cell type:markdown id:190c1a66-ded4-4340-94e9-0a789704c03d tags:
We will use the three most relevant excerpts in our prompt. Note that even with nearly 1,000 speeches chunked and stored in our vector database, the similarity search still surfaces the same excerpt as when we parsed only Garfield's Inaugural Address:
%% Cell type:code id:77a5b3bc-7e6a-40d8-b012-38bfebeaa641 tags:
``` python
relevant_excerpts = '\n\n------------------------------------------------------\n\n'.join([doc.page_content for doc in relevant_docs[:3]])
display(HTML(relevant_excerpts.replace("\n", "<br>")))
```
%% Output
%% Cell type:code id:9b2fc804-4d5e-4db2-b185-79ff85c36362 tags:
``` python
presidential_speech_chat_completion(client, model, user_question, relevant_excerpts)
```
%% Output
'James Garfield, in his Inaugural Address delivered on March 4, 1881, expressed his views on civil service reform. He believed that the civil service could not be placed on a satisfactory basis until it was regulated by law. He proposed to ask Congress to fix the tenure of the minor offices of the several Executive Departments and prescribe the grounds upon which removals shall be made during the terms for which incumbents have been appointed. He stated, "For the good of the service itself, for the protection of those who are intrusted with the appointing power against the waste of time and obstruction to the public business caused by the inordinate pressure for place, and for the protection of incumbents against intrigue and wrong, I shall at the proper time ask Congress to fix the tenure of the minor offices of the several Executive Departments and prescribe the grounds upon which removals shall be made during the terms for which incumbents have been appointed."\n\nHe also mentioned that he will act within the authority and limitations of the Constitution, invading neither the rights of the States nor the reserved rights of the people, it will be the purpose of my Administration to maintain the authority of the nation in all places within its jurisdiction; to enforce obedience to all the laws of the Union in the interests of the people; to demand rigid economy in all the expenditures of the Government, and to require the honest and faithful service of all executive officers, remembering that the offices were created, not for the benefit of incumbents or their supporters, but for the service of the Government.\n\nIt is also worth noting that Garfield\'s successor, Chester A. Arthur, in his Second Annual Message delivered on December 4, 1882, mentioned that Garfield\'s administration had a higher percentage of removals (22.7%) than the previous four administrations (the ratio of removals to the whole number of appointments was much the same during each of those four years, and ranged from 8.6% to 10%). Arthur states that "In the four months of President Garfield\'s Administration there were 390 appointments and 89 removals, or 22.7 per cent. Precisely the same number of removals (89) has taken place in the fourteen months which have since elapsed, but they constitute only 7.8 per cent of the whole number of appointments (1,119) within that period and less than 2.6 per cent of the entire list of officials (3,459), exclusive of the Army and Navy, which is filled by Presidential appointment. "'
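%% Cell type:markdown id:ab70f012 tags:
The pieces above can be combined into a single helper; here is a minimal sketch reusing the `docsearch` vector store and `presidential_speech_chat_completion` function defined earlier:
%% Cell type:code id:ab70f013 tags:
``` python
def answer_question(user_question, k=3):
    """End-to-end RAG: retrieve the k most relevant chunks, then answer."""
    docs = docsearch.similarity_search(user_question, k=k)
    excerpts = "\n\n---\n\n".join(doc.page_content for doc in docs)
    return presidential_speech_chat_completion(client, model, user_question, excerpts)


print(answer_question("What were Abraham Lincoln's views on slavery?"))
```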
%% Cell type:markdown id:4bc73962-81aa-4ca7-88dd-83793195d382 tags:
# Conclusion
%% Cell type:markdown id:c9b86556-d31d-4896-a342-8eff9d9fb48b tags:
In this notebook we've shown how to implement a RAG system using Groq API, LangChain and Pinecone by embedding, storing and searching over nearly 1,000 speeches from US presidents. By embedding speech transcripts into a vector database and leveraging the power of semantic search, we have demonstrated how to overcome two of the most significant challenges faced by LLMs: the knowledge cutoff and hallucination issues.
You can interact with this RAG application here: https://presidential-speeches-rag.streamlit.app/
@@ -12,6 +12,10 @@ A simple application that allows users to interact with a conversational chatbot
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
You can [fork and run this application on Replit](https://replit.com/@GroqCloud/Chatbot-with-Conversational-Memory-on-LangChain) or run it on the command line with `python main.py`
@@ -12,9 +12,12 @@ The [CrewAI](https://docs.crewai.com/) Machine Learning Assistant is a command l
- **LangChain Integration**: Incorporates LangChain to facilitate natural language processing and enhance the interaction between the user and the machine learning assistant.
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
You can [fork and run this application on Replit](https://replit.com/@GroqCloud/CrewAI-Machine-Learning-Assistant) or run it on the command line with `python main.py`. You can upload a sample .csv to the same directory as `main.py` to give the application a head start on your ML problem. The application will output a Markdown file including python code for your ML use case to the same directory as main.py.
@@ -12,6 +12,10 @@ A simple application that allows users to interact with a conversational chatbot
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
You can [fork and run this application on Replit](https://replit.com/@GroqCloud/Groq-Quickstart-Conversational-Chatbot) or run it on the command line with `python main.py`.
@@ -18,6 +18,10 @@ The function calling in this application is handled by the Groq API, abstracted
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
You can [fork and run this application on Replit](https://replit.com/@GroqCloud/Groqing-the-Stock-Market-Function-Calling-with-Llama3) or run it on the command line with `python main.py`.
@@ -14,4 +14,8 @@ A simple application that allows users to interact with a conversational chatbot
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
@@ -22,8 +22,12 @@ The main script of the application is [main.py](./main.py). Here's a brief overv
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example outside of this Repl. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
You would also need your own [Pinecone](https://www.pinecone.io/) index with presidential speech embeddings to run this code locally. You can create a Pinecone API key and one index for a small project for free on their Starter plan, and visit [this Cookbook post](https://github.com/groq/groq-api-cookbook/blob/dan/replit-conversion/presidential-speeches-rag/presidential-speeches-rag.ipynb) for more info on RAG and a guide to uploading these embeddings to a vector database.
You can [fork and run this application on Replit](https://replit.com/@GroqCloud/Presidential-Speeches-RAG-with-Pinecone) or run it on the command line with `python main.py`.
@@ -38,8 +38,12 @@ A well-crafted system prompt is essential for building a functional Text-to-SQL
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
You can [fork and run this application on Replit](https://replit.com/@GroqCloud/Building-a-Text-to-SQL-app-with-Groqs-JSON-mode) or run it on the command line with `python main.py`.
## Customizing with Your Own Data
@@ -36,8 +36,12 @@ The verified SQL queries and their descriptions are stored in YAML files located
## Usage
<!-- markdown-link-check-disable -->
You will need to store a valid Groq API Key as a secret to proceed with this example. You can generate one for free [here](https://console.groq.com/keys).
<!-- markdown-link-check-enable -->
You can [fork and run this application on Replit](https://replit.com/@GroqCloud/Execute-Verified-SQL-Queries-with-Function-Calling) or run it on the command line with `python main.py`.
## Customizing with Your Own Data