%% Cell type:markdown id: tags:
# Tool Calling 101:
Note: If you are looking for `3.2` Featherlight Model (1B and 3B) instructions, please see the respective notebook; this one covers the 3.1 models.
We briefly introduce the `3.2` models at the end.
Note: The new vision models behave the same as the `3.1` models when you are talking to them without an image.
This is part (1/2) in the tool calling series. This notebook covers the basics of what tool calling is and how to perform it with `Llama 3.1 models`.
Here's what you will learn in this notebook:
- Set up Groq to access the Llama 3.1 70B model
- Avoid common mistakes when performing tool-calling with Llama
- Understand prompt templates for tool calling
- Understand how the tool calls are handled under the hood
- 3.2 Model Tool Calling Format and Behaviour
In Part 2, we will learn how to build a system that can compare two papers for us.
%% Cell type:markdown id: tags:
## What is Tool Calling?
This approach was popularised by the [Gorilla](https://gorilla.cs.berkeley.edu) paper, which showed that Large Language Models can be fine-tuned on API examples to teach them to call an external API.
This is really cool because we can now use an LLM as the "brain" of a system and connect it to external systems to perform actions.
In simpler words, "Llama can order your pizza for you" :)
With the Llama 3.1 release, the models excel at tool calling and support `brave_search`, `wolfram_alpha` and `code_interpreter` out of the box.
However, first let's take a look at a common mistake.
%% Cell type:markdown id: tags:
#### Install and set up groq dependencies
- Install the `groq` package to access Llama model(s)
- Configure our client and authenticate with API Key(s). Note: PLEASE UPDATE YOUR KEY BELOW
%% Cell type:code id: tags:
``` python
#!pip3 install groq
%set_env GROQ_API_KEY=''
```
%% Cell type:code id: tags:
``` python
import os
from groq import Groq
# Create the Groq client
client = Groq(api_key='YOUR_API_KEY')
```
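%% Cell type:markdown id: tags:
If you prefer not to paste the key into a code cell, a minimal alternative sketch (assuming `GROQ_API_KEY` has been exported in your environment, e.g. via the `%set_env` line above without the quotes) is to read it from the environment:
%% Cell type:code id: tags:
``` python
import os
from groq import Groq

# Read the key from the environment instead of hard-coding it in the notebook
api_key = os.environ.get("GROQ_API_KEY")
if not api_key:
    raise ValueError("GROQ_API_KEY is not set - please export it before creating the client")
client = Groq(api_key=api_key)
```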
%% Cell type:markdown id: tags:
## Common Mistake of Tool-Calling: Incorrect Prompt Template
While Llama 3.1 works with tool-calling out of the box, a wrong prompt template can cause unexpected behaviour.
Sometimes, even superheroes need to be reminded of their powers.
Let's first try "forcing a prompt response from the model".
%% Cell type:markdown id: tags:
#### Note: Remember this is the WRONG template, please scroll to the next section to see the right approach if you are in a rushed copy-pasta sprint
This section will show you that the model will not use `brave_search` and `wolfram_alpha` out of the box unless the prompt template is set correctly.
Even if the model is asked to do so!
%% Cell type:code id: tags:
``` python
SYSTEM_PROMPT = """
Cutting Knowledge Date: December 2023
Today Date: 20 August 2024
You are a helpful assistant
"""
```
%% Cell type:code id: tags:
``` python
system_prompt = {}
chat_history = []
def model_chat(user_input: str, sys_prompt=SYSTEM_PROMPT, temperature: float = 0.7, max_tokens: int = 2048):
    chat_history = [
        {
            "role": "system",
            "content": sys_prompt
        }
    ]
    chat_history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model="llama-3.1-70b-versatile",
                                               messages=chat_history,
                                               max_tokens=max_tokens,
                                               temperature=temperature)
    chat_history.append({
        "role": "assistant",
        "content": response.choices[0].message.content
    })
    #print("Assistant:", response.choices[0].message.content)
    return response.choices[0].message.content
```
%% Cell type:markdown id: tags:
#### Asking the model about recent news
Since the prompt template is incorrect, it will answer from its knowledge-cutoff memory.
%% Cell type:code id: tags:
``` python
user_input = """
When is the next elden ring game coming out?
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: Unfortunately, I don't have information on a specific release date for the next Elden Ring game. However, I can tell you that there have been rumors and speculations about a potential sequel or DLC (Downloadable Content) for Elden Ring.
In June 2022, the game's director, Hidetaka Miyazaki, mentioned that FromSoftware, the developer of Elden Ring, was working on "multiple" new projects, but no official announcements have been made since then.
It's also worth noting that FromSoftware has a history of taking their time to develop new games, and the studio is known for its attention to detail and commitment to quality. So, even if there is a new Elden Ring game in development, it's likely that we won't see it anytime soon.
Keep an eye on official announcements from FromSoftware and Bandai Namco, the publisher of Elden Ring, for any updates on a potential sequel or new game in the series.
%% Cell type:markdown id: tags:
#### Asking the model about a Math problem
Again, the model answers from memory instead of making a tool call.
%% Cell type:code id: tags:
``` python
user_input = """
When is the square root of 23131231?
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: To find the square root of 23131231, I'll calculate it for you.
√23131231 ≈ 4813.61
%% Cell type:markdown id: tags:
#### Can we solve this using a reminder prompt?
%% Cell type:code id: tags:
``` python
user_input = """
When is the square root of 23131231?
Can you use a tool to solve the question?
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: I can use a mathematical tool to solve the question.
The square root of 23131231 is:
√23131231 ≈ 4810.51
%% Cell type:markdown id: tags:
Looks like we didn't get the wolfram_alpha call, let's try one more time with a stronger prompt:
%% Cell type:code id: tags:
``` python
user_input = """
When is the square root of 23131231?
Can you use a tool to solve the question?
Remember you have been trained on wolfram_alpha
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: I can use Wolfram Alpha to calculate the square root of 23131231.
According to Wolfram Alpha, the square root of 23131231 is:
√23131231 ≈ 4809.07
%% Cell type:markdown id: tags:
### Official Prompt Template
As you can see, the model doesn't perform tool-calling as expected above. This is because we are not following the recommended prompting format.
The Llama Stack is the go-to approach for using the Llama model family and building applications.
Let's first install the `llama_toolchain` Python package to have the Llama CLI available.
%% Cell type:code id: tags:
``` python
#!pip3 install llama-toolchain
```
%% Cell type:markdown id: tags:
#### Now we can learn about the various prompt formats available
When you run the cell below, you will see the available models; we can then check the details of model-specific prompts.
%% Cell type:code id: tags:
``` python
!llama model prompt-format
```
%% Output
Traceback (most recent call last):
  File "/opt/miniconda3/bin/llama", line 8, in <module>
    sys.exit(main())
    ^^^^^^
  File "/opt/miniconda3/lib/python3.12/site-packages/llama_toolchain/cli/llama.py", line 44, in main
    parser.run(args)
  File "/opt/miniconda3/lib/python3.12/site-packages/llama_toolchain/cli/llama.py", line 38, in run
    args.func(args)
  File "/opt/miniconda3/lib/python3.12/site-packages/llama_toolchain/cli/model/prompt_format.py", line 59, in _run_model_template_cmd
    raise argparse.ArgumentTypeError(
argparse.ArgumentTypeError: llama3_1 is not a valid Model. Choose one from --
Llama3.1-8B
Llama3.1-70B
Llama3.1-405B
Llama3.1-8B-Instruct
Llama3.1-70B-Instruct
Llama3.1-405B-Instruct
Llama3.2-1B
Llama3.2-3B
Llama3.2-1B-Instruct
Llama3.2-3B-Instruct
Llama3.2-11B-Vision
Llama3.2-90B-Vision
Llama3.2-11B-Vision-Instruct
Llama3.2-90B-Vision-Instruct
%% Cell type:code id: tags:
``` python
!llama model prompt-format -m Llama3.1-8B
```
%% Output
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Llama 3.1 - Prompt Formats ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Tokens

Here is a list of special tokens that are supported by Llama 3.1:

 • <|begin_of_text|>: Specifies the start of the prompt
 • <|end_of_text|>: Model will cease to generate more tokens. This token is generated only by the
   base models.
 • <|finetune_right_pad_id|>: This token is used for padding text sequences to the same length in a
   batch.
%% Cell type:markdown id: tags:
## Tool Calling: Using the correct Prompt Template
With the `llama` CLI we have now seen the correct prompt format for the model.
%% Cell type:markdown id: tags:
If everything is set up correctly, the model should now wrap function calls with the `<|python_tag|>` token followed by the actual function call.
This allows you to manage your function calling logic accordingly.
Time to test the theory.
%% Cell type:code id: tags:
``` python
SYSTEM_PROMPT = """
Environment: iPython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 15 September 2024
"""
user_input = """
When is the next Elden ring game coming out?
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: <|python_tag|>brave_search.call(query="Elden Ring sequel release date")
%% Cell type:code id: tags:
``` python
user_input = """
What is the square root of 23131231?
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: <|python_tag|>wolfram_alpha.call(query="square root of 23131231")
%% Cell type:markdown id: tags:
### Using this knowledge in practice
A common misconception about tool calling is that the model can handle the tool call and get your output.
This is NOT TRUE: the actual tool call is something that you have to implement. With this knowledge, let's see how we can handle the tool call ourselves to answer our original question.
%% Cell type:code id: tags:
``` python
#!pip3 install brave-search
```
%% Cell type:code id: tags:
``` python
SYSTEM_PROMPT = """
Environment: iPython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 15 September 2024
"""
user_input = """
What is the square root of 23131231?
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: <|python_tag|>wolfram_alpha.call(query="square root of 23131231")
%% Cell type:code id: tags:
``` python
print(model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
output = model_chat(user_input, sys_prompt=SYSTEM_PROMPT)
```
%% Output
<|python_tag|>wolfram_alpha.call(query="square root of 23131231")
%% Cell type:code id: tags:
``` python
import re
# Extract the function name
fn_name = re.search(r'<\|python_tag\|>(\w+)\.', output).group(1)
# Extract the method
fn_call_method = re.search(r'\.(\w+)\(', output).group(1)
# Extract the arguments
fn_call_args = re.search(r'=\s*([^)]+)', output).group(1)
print(f"Function name: {fn_name}")
print(f"Method: {fn_call_method}")
print(f"Args: {fn_call_args}")
```
%% Output
Function name: wolfram_alpha
Method: call
Args: "square root of 23131231"
%% Cell type:markdown id: tags:
You can implement this in different ways but the idea is the same: the LLM emits an output with the `<|python_tag|>`, which should trigger your tool-calling mechanism.
This logic gets handled in your program, and the tool output is then passed back to the model so it can answer the user.
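%% Cell type:markdown id: tags:
To make that round trip concrete, here is a minimal sketch (not part of the original flow) that executes the parsed call and feeds the result back to the model. The `execute_tool_call` helper is hypothetical, and Python's `math.sqrt` stands in for a real Wolfram Alpha API call; since this version of `model_chat` starts a fresh history on every call, we restate the question and pass the tool output back as a plain user turn for simplicity.
%% Cell type:code id: tags:
``` python
import math

# Hypothetical executor: route the parsed tool name to a local implementation.
# math.sqrt is only a stand-in for calling the real wolfram_alpha API.
def execute_tool_call(fn_name: str, fn_call_args: str) -> str:
    query = fn_call_args.strip('"')
    if fn_name == "wolfram_alpha" and "square root of" in query:
        number = float(query.replace("square root of", "").strip())
        return str(math.sqrt(number))
    return f"No local implementation for tool: {fn_name}"

tool_result = execute_tool_call(fn_name, fn_call_args)

# Hand the tool output back to the model so it can phrase the final answer.
follow_up = model_chat(
    f"The user asked: {user_input.strip()}\n"
    f"The {fn_name} tool returned: {tool_result}\n"
    "Please answer the user's question using this result.",
    sys_prompt=SYSTEM_PROMPT,
)
print("Assistant:", follow_up)
```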
%% Cell type:markdown id: tags:
### Code interpreter
With the correct prompt template, the Llama model can output Python (as well as code in any other language that the model has been trained on).
%% Cell type:code id: tags:
``` python
user_input = """
If I can invest 400$ every month at 5% interest rate, how long would it take me to make a 100k$ in investments?
"""
print("Assistant:", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))
```
%% Output
Assistant: <|python_tag|>import math
# Define the variables
monthly_investment = 400
interest_rate = 0.05
target_amount = 100000
# Calculate the number of months it would take to reach the target amount
months = 0
current_amount = 0
while current_amount < target_amount:
    current_amount += monthly_investment
    current_amount *= 1 + interest_rate / 12  # Compound interest
    months += 1
# Print the result
print(f"It would take {months} months, approximately {months / 12:.2f} years, to reach the target amount of ${target_amount:.2f}.")
%% Cell type:markdown id: tags:
Let's validate the output by running the code produced by the model:
%% Cell type:code id: tags:
``` python
# Define the variables
monthly_investment = 400
interest_rate = 0.05
target_amount = 100000
# Calculate the number of months it would take to reach the target amount
months = 0
current_amount = 0
while current_amount < target_amount:
    current_amount += monthly_investment
    current_amount *= 1 + interest_rate / 12  # Compound interest
    months += 1
# Print the result
print(f"It would take {months} months, approximately {months / 12:.2f} years, to reach the target amount of ${target_amount:.2f}.")
```
%% Output
It would take 172 months, approximately 14.33 years, to reach the target amount of $100000.00.
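%% Cell type:markdown id: tags:
As a quick sanity check (not part of the original notebook), the same answer can be obtained in closed form. The loop above deposits first and then applies one month of interest, which is an annuity-due, so the future value after n months is `FV = P * ((1+i)^n - 1) / i * (1+i)` with monthly rate `i`; solving for n and rounding up should reproduce the 172 months:
%% Cell type:code id: tags:
``` python
import math

monthly_investment = 400
annual_rate = 0.05
target_amount = 100_000

i = annual_rate / 12  # monthly rate, matching the loop above
# Solve FV = P * ((1+i)^n - 1) / i * (1+i) for n
n = math.log(1 + target_amount * i / (monthly_investment * (1 + i))) / math.log(1 + i)
print(f"Closed-form estimate: {math.ceil(n)} months (~{n / 12:.2f} years)")
```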
%% Cell type:markdown id: tags:
### 3.2 Models Custom Tool Prompt Format
%% Cell type:markdown id: tags:
Life is great because the Llama Team writes great docs for us, so we can conveniently copy-pasta examples from there :)
[Here](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2#-tool-calling-(1b/3b)-) are the docs for your reference that we will be using.
Exercise for the reader: Use `llama-toolchain` again to verify, like we did earlier, and then start the prompt engineering for the small Llamas.
%% Cell type:code id: tags:
``` python
function_definitions = """[
    {
        "name": "get_user_info",
        "description": "Retrieve details for a specific user by their unique identifier. Note that the provided function is in Python 3 syntax.",
        "parameters": {
            "type": "dict",
            "required": [
                "user_id"
            ],
            "properties": {
                "user_id": {
                    "type": "integer",
                    "description": "The unique identifier of the user. It is used to fetch the specific user details from the database."
                },
                "special": {
                    "type": "string",
                    "description": "Any special information or parameters that need to be considered while fetching user details.",
                    "default": "none"
                }
            }
        }
    }
]
"""
```
%% Cell type:code id: tags:
``` python
system_prompt = """You are an expert in composing functions. You are given a question and a set of possible functions.
Based on the question, you will need to make one or more function/tool calls to achieve the purpose.
If none of the function can be used, point it out. If the given question lacks the parameters required by the function,
also point it out. You should only return the function call in tools call sections.
If you decide to invoke any of the function(s), you MUST put it in the format of [func_name1(params_name1=params_value1, params_name2=params_value2...), func_name2(params)]\n
You SHOULD NOT include any other text in the response.
Here is a list of functions in JSON format that you can invoke.\n\n{functions}\n""".format(functions=function_definitions)
```
%% Cell type:code id: tags:
``` python
chat_history = []
def model_chat(user_input: str, sys_prompt=system_prompt, temperature: float = 0.7, max_tokens: int = 2048):
    chat_history = [
        {
            "role": "system",
            "content": sys_prompt
        }
    ]
    chat_history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model="llama-3.2-3b-preview",
                                               messages=chat_history,
                                               max_tokens=max_tokens,
                                               temperature=temperature)
    chat_history.append({
        "role": "assistant",
        "content": response.choices[0].message.content
    })
    #print("Assistant:", response.choices[0].message.content)
    return response.choices[0].message.content
```
%% Cell type:markdown id: tags:
Note: We are assuming the following structure for the dataset here:
- Name
- Email
- Age
- Color request
%% Cell type:code id: tags:
``` python
user_input = "Can you retrieve the details for the user with the ID 7890, who has black as their special request?"
print("Assistant:", model_chat(user_input, sys_prompt=system_prompt))
```
%% Output
Assistant: [get_user_info(user_id=7890, special='black')]
%% Cell type:markdown id: tags:
#### Dummy dataset to make sure our model stays happy :)
%% Cell type:code id: tags:
``` python
def get_user_info(user_id: int, special: str = "none") -> dict:
    # This is a mock database of users
    user_database = {
        7890: {"name": "Emma Davis", "email": "emma@example.com", "age": 31},
        1234: {"name": "Liam Wilson", "email": "liam@example.com", "age": 28},
        2345: {"name": "Olivia Chen", "email": "olivia@example.com", "age": 35},
        3456: {"name": "Noah Taylor", "email": "noah@example.com", "age": 42},
        4567: {"name": "Ava Martinez", "email": "ava@example.com", "age": 39},
        5678: {"name": "Ethan Brown", "email": "ethan@example.com", "age": 45},
        6789: {"name": "Sophia Kim", "email": "sophia@example.com", "age": 33},
        8901: {"name": "Mason Lee", "email": "mason@example.com", "age": 29},
        9012: {"name": "Isabella Garcia", "email": "isabella@example.com", "age": 37},
        1357: {"name": "James Johnson", "email": "james@example.com", "age": 41}
    }
    # Check if the user exists in our mock database
    if user_id in user_database:
        user_data = user_database[user_id]
        # Handle the 'special' parameter
        if special != "none":
            user_data["special_info"] = f"Special request: {special}"
        return user_data
    else:
        return {"error": "User not found"}
```
%% Cell type:code id: tags:
``` python
[get_user_info(user_id=7890, special='black')]
```
%% Output
[{'name': 'Emma Davis',
  'email': 'emma@example.com',
  'age': 31,
  'special_info': 'Special request: black'}]
%% Cell type:markdown id: tags:
### Handling Tool-Calling logic for the model
%% Cell type:markdown id: tags:
Hello Regex, my good old friend :)
With regex, we can write a simple way to handle tool calling and return either the model response or the tool call result.
%% Cell type:code id: tags:
``` python
import re
import json
# Assuming you have defined the get_user_info function and system_prompt above
chat_history = []

def process_response(response):
    function_call_pattern = r'\[(.*?)\((.*?)\)\]'
    function_calls = re.findall(function_call_pattern, response)
    if function_calls:
        processed_response = []
        for func_name, args_str in function_calls:
            args_dict = {}
            for arg in args_str.split(','):
                key, value = arg.split('=')
                key = key.strip()
                value = value.strip().strip("'")
                if value.isdigit():
                    value = int(value)
                args_dict[key] = value
            if func_name == 'get_user_info':
                result = get_user_info(**args_dict)
                processed_response.append(f"Function call result: {json.dumps(result, indent=2)}")
            else:
                processed_response.append(f"Unknown function: {func_name}")
        return "\n".join(processed_response)
    else:
        return response

def model_chat(user_input: str, sys_prompt=system_prompt, temperature: float = 0.7, max_tokens: int = 2048):
    global chat_history
    if not chat_history:
        chat_history = [
            {
                "role": "system",
                "content": sys_prompt
            }
        ]
    chat_history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="llama-3.2-3b-preview",
        messages=chat_history,
        max_tokens=max_tokens,
        temperature=temperature
    )
    assistant_response = response.choices[0].message.content
    processed_response = process_response(assistant_response)
    chat_history.append({
        "role": "assistant",
        "content": assistant_response
    })
    return processed_response
```
%% Cell type:code id: tags:
``` python
user_input = "Can you retrieve the details for the user with the ID 7890, who has black as their special request?"
print("Assistant:", model_chat(user_input, sys_prompt=system_prompt))
```
%% Output
Assistant: Function call result: {
  "name": "Emma Davis",
  "email": "emma@example.com",
  "age": 31,
  "special_info": "Special request: black"
}
%% Cell type:code id: tags:
``` python
#fin
```
...
%% Cell type:markdown id: tags:
# Tool Calling 201: Llama to find Differences between two papers
The image below illustrates the demo in this notebook.
**Goal:** Use the `Meta-Llama-3.1-70b` model to find the differences between two papers
- Step 1: Take the user input query
- Step 2: Perform an internet search using the `tavily` API to fetch the arxiv ID(s) based on the user query
Note: `3.1` models support `brave_search`, but this notebook is also aimed at showcasing custom tools.
The above is important because the user query often differs from the paper name and arxiv ID; this will help us with the next step.
- Step 3: Use the web results to extract the arxiv ID(s) of the papers
We will use an 8b model here because who wants to deal with complex regex? That's the main use case of LLM(s), isn't it? :D
- Step 4: Use the `arxiv` API to download the PDF(s) of the papers in the user query (a rough sketch of steps 4 and 5 follows this list)
- Step 5: For ease, we will extract the first 80k words from each PDF and write them to a `.txt` file that we can summarise
- Step 6: Use instances of `Meta-Llama-3.1-8b` to summarise the two PDF(s)
- Step 7: Prompt the `70b` model to get the differences between the two papers being discussed
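%% Cell type:markdown id: tags:
Before defining the real pieces, here is a rough, illustrative sketch of what steps 4 and 5 could look like with the `arxiv` and `PyPDF2` packages. The helper name `process_arxiv_paper` and the 80k-word cap mirror the plan above, but this is only a sketch under those assumptions, not the notebook's final implementation.
%% Cell type:code id: tags:
``` python
import arxiv, PyPDF2

def process_arxiv_paper(arxiv_id: str, max_words: int = 80_000) -> str:
    """Download an arxiv paper by ID and save its first `max_words` words to a txt file."""
    # Step 4: fetch the paper metadata and download the PDF
    search = arxiv.Search(id_list=[arxiv_id])
    paper = next(arxiv.Client().results(search))
    pdf_path = paper.download_pdf(filename=f"{arxiv_id}.pdf")

    # Step 5: extract text page by page and keep only the first max_words words
    reader = PyPDF2.PdfReader(pdf_path)
    text = " ".join(page.extract_text() or "" for page in reader.pages)
    words = text.split()[:max_words]

    txt_path = f"{arxiv_id}.txt"
    with open(txt_path, "w") as f:
        f.write(" ".join(words))
    return txt_path
```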
%% Cell type:markdown id: tags:
## Part 1: Defining the pieces
We will start by describing all the modules from the image above, to make sure our logic works.
In the second half of the notebook, we will write a simple function to take care of the function calling logic.
%% Cell type:markdown id: tags:
#### Install necessary libraries
%% Cell type:code id: tags:
``` python
#!pip3 install groq
#!pip3 install arxiv
#!pip3 install tavily-python
#!pip3 install llama-toolchain
#!pip3 install PyPDF2
```
%% Cell type:markdown id: tags:
#### Necessary imports
%% Cell type:markdown id: tags:
##### Note: PLEASE REPLACE API KEYS BELOW WITH YOUR REAL ONES
%% Cell type:code id: tags:
``` python
import os, arxiv, PyPDF2
from tavily import TavilyClient
from groq import Groq
# Create the Groq client
client = Groq(api_key='YOUR_API_KEY')
tavily_client = TavilyClient(api_key='YOUR_API_KEY')
```
%% Cell type:markdown id: tags:
#### Main LLM thread:
We will use a `MAIN_SYSTEM_PROMPT` and a `main_model_chat_history` to keep track of the discussion, since we are using 4 LLM instances alongside this one.
Note: if you paid attention and noticed that the SYSTEM_PROMPT here is different, thanks for reading closely! It's always a great idea to follow the official recommendations.
However, when it's a matter of writing complex regex, we can bend the rules slightly :D
Note: we will outline the functions here and define them as we go.
%% Cell type:code id: tags:
``` python
MAIN_SYSTEM_PROMPT = """
Environment: iPython
Cutting Knowledge Date: December 2023
Today Date: 15 September 2024
# Tool Instructions
- Always execute python code in messages that you share.
- When looking for real time information use relevant functions if available
You have access to the following functions:
Use the function 'query_for_two_papers' to: Get the internet query results for the arxiv ID of the two papers the user wants to compare
{
    "name": "query_for_two_papers",
    "description": "Internet search the arxiv ID of two papers that user wants to look up",
    "parameters": {
        "paper_1": {
            "param_type": "string",
            "description": "arxiv id of paper_name_1 from user query",
            "required": true
        },
        "paper_2": {
            "param_type": "string",
            "description": "arxiv id of paper_name_2 from user query",
            "required": true
        },
    }
}
Use the function 'get_arxiv_ids' to: Given a dict of websearch queries, use a LLM to return JUST the arxiv ID, which is otherwise harder to extract
{
    "name": "get_arxiv_ids",
    "description": "Use the dictionary returned from query_for_two_papers to ask a LLM to extract the arxiv IDs",
    "parameters": {
        "web_results": {
            "param_type": "dictionary",
            "description": "dictionary of search result for a query from the previous function",
            "required": true
        },
    }
}
Use the function 'process_arxiv_paper' to: Given the arxiv ID from the get_arxiv_ids function, download the paper and save a txt file of it that we can then use for summarising
{
    "name": "process_arxiv_paper",
    "description": "Use arxiv IDs extracted from earlier to be downloaded and saved to txt files",
    "parameters": {
        "arxiv_id": {
            "param_type": "string",
            "description": "arxiv ID of the paper that we want to download and save a txt file of",
            "required": true
        },
    }
}
Use the function 'summarize_text_file' to: Given the txt file name based on the arxiv IDs we are working with from earlier, get a summary of the paper being discussed
{
    "name": "summarize_text_file",
    "description": "Summarise the arxiv paper saved in the txt file",
    "parameters": {
        "file_name": {
            "param_type": "string",
            "description": "Filename to be used to get a summary of",
            "required": true
        },
    }
}
If you choose to call a function ONLY reply in the following format:
<{start_tag}={function_name}>{parameters}{end_tag}
where
start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`
Here is an example,
<function=example_function_name>{"example_name": "example_value"}</function>
Reminder:
- When the user is asking a question that requires your reasoning, DO NOT USE OR FORCE a function call
- Even if you remember the arxiv ID of papers from input, do not put that in the query_for_two_papers function call, pass the internet look up query
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- When returning a function call, don't add anything else to your response
"""
```
%% Cell type:code id: tags:
``` python
main_model_chat_history = [
    {
        "role": "system",
        "content": MAIN_SYSTEM_PROMPT
    }
]
```
%% Cell type:markdown id: tags:
#### Define the `model_chat` instance
We will be using this to handle all user input(s).
%% Cell type:code id: tags:
``` python
def model_chat(user_input: str, temperature: int = 0, max_tokens=2048):
    main_model_chat_history.append({"role": "user", "content": user_input})
    #print(chat_history)
    #print("User: ", user_input)
    response = client.chat.completions.create(model="llama-3.1-70b-versatile",
                                               messages=main_model_chat_history,
                                               max_tokens=max_tokens,
                                               temperature=temperature)
    main_model_chat_history.append({
        "role": "assistant",
        "content": response.choices[0].message.content
    })
    #print("Assistant:", response.choices[0].message.content)
    return response.choices[0].message.content
```
%% Cell type:code id: tags:
``` python
user_input = """
What are the differences between llama 3.1 and BERT?
"""
output = model_chat(user_input, temperature=1)
```
%% Cell type:code id: tags:
``` python
print(output)
```
%% Output
<function=query_for_two_papers>{"paper_1": "Llama", "paper_2": "BERT"}</function>
%% Cell type:markdown id: tags:
If you remember from `Tool_Calling_101.ipynb`, we need a way to extract and manage the tool call based on the response; the system prompt from earlier makes our lives easier when we do this later :)
First, let's validate the logic and define all the functions as we go:
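%% Cell type:markdown id: tags:
As a rough sketch of what that extraction could look like for the `<function=...>` format defined in `MAIN_SYSTEM_PROMPT` (the `parse_function_call` helper is illustrative and assumes the argument payload is valid JSON; the actual dispatcher comes later):
%% Cell type:code id: tags:
``` python
import re, json

def parse_function_call(response: str):
    """Return (function_name, args_dict) if the response is a <function=...> call, else None."""
    match = re.search(r"<function=(\w+)>(.*?)</function>", response, re.DOTALL)
    if not match:
        return None
    fn_name, payload = match.group(1), match.group(2)
    return fn_name, json.loads(payload)

# Example with the call the model produced above:
print(parse_function_call(output))
```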
%% Cell type:markdown id: tags:
#### Tavily API:
We will use the Tavily API to do a web query for the papers based on the model outputs.
%% Cell type:code id: tags:
``` python
def query_for_two_papers(paper_1: str, paper_2: str) -> list:
    return [tavily_client.search(f"arxiv id of {paper_1}"), tavily_client.search(f"arxiv id of {paper_2}")]
```
%% Cell type:code id: tags:
``` python
search_results = query_for_two_papers("llama 3.1", "BERT")
#search_results
```
%% Cell type:code id: tags:
``` python
user_input = f"""
Here are the search results for the first paper, extract the arxiv ID {search_results[0]}
"""
output = model_chat(user_input, temperature=1)
```
%% Cell type:code id: tags:
``` python
print(output)
```
%% Output
<function=get_arxiv_id>{"web_results": "{'query': 'arxiv id of llama 3.1', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': 'TheLlama3HerdofModels - arXiv.org', 'url': 'https://arxiv.org/pdf/2407.21783', 'content': 'arXiv:2407.21783v2 [cs.AI] 15 Aug 2024. Finetuned Multilingual Longcontext Tooluse Release ... The model architecture of Llama 3 is illustrated in Figure1. The development of our Llama 3 language modelscomprisestwomainstages:', 'score': 0.9955835, 'raw_content': None}, {'title': 'NousResearch/Meta-Llama-3.1-8B - Hugging Face', 'url': 'https://huggingface.co/NousResearch/Meta-Llama-3.1-8B', 'content': 'The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available ...', 'score': 0.95379424, 'raw_content': None}, {'title': 'Introducing Llama 3.1: Our most capable models to date - Meta AI', 'url': 'https://ai.meta.com/blog/meta-llama-3-1/', 'content': 'Bringing open intelligence to all, our latest models expand context length to 128K, add support across eight languages, and include Llama 3.1 405B—the first frontier-level open source AI model. Llama 3.1 405B is in a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities that rival the best closed source models.', 'score': 0.9003547, 'raw_content': None}, {'title': 'The Llama 3 Herd of Models | Research - AI at Meta', 'url': 'https://ai.meta.com/research/publications/the-llama-3-herd-of-models/', 'content': 'This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety.', 'score': 0.89460546, 'raw_content': None}, {'title': '[2407.21783] The Llama 3 Herd of Models - arXiv.org', 'url': 'https://arxiv.org/abs/2407.21783', 'content': 'Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive ...', 'score': 0.6841585, 'raw_content': None}], 'response_time': 2.09}"}</function>
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
user_input = f""" user_input = f"""
Here are the search results for the second paper now, extract the arxiv ID {search_results[1]} Here are the search results for the second paper now, extract the arxiv ID {search_results[1]}
""" """
output = model_chat(user_input, temperature=1) output = model_chat(user_input, temperature=1)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
print(output) print(output)
``` ```
%% Output %% Output
<function=get_arxiv_id>{"web_results": "{'query': 'arxiv id of BERT', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': '[2103.11943] BERT: A Review of Applications in Natural Language ...', 'url': 'https://arxiv.org/abs/2103.11943', 'content': 'arXiv:2103.11943 (cs) [Submitted on 22 Mar 2021] BERT: A Review of Applications in Natural Language Processing and Understanding. M. V. Koroteev. In this review, we describe the application of one of the most popular deep learning-based language models - BERT. The paper describes the mechanism of operation of this model, the main areas of its ...', 'score': 0.99411184, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://aclanthology.org/N19-1423/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning ...', 'score': 0.9222025, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://research.google/pubs/bert-pre-training-of-deep-bidirectional-transformers-for-language-understanding/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers.', 'score': 0.87652874, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://arxiv.org/abs/1810.04805', 'content': 'We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned ...', 'score': 0.66115755, 'raw_content': None}, {'title': 'A Primer in BERTology: What We Know About How BERT Works', 'url': 'https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00349/96482/A-Primer-in-BERTology-What-We-Know-About-How-BERT', 'content': 'The issue of model depth must be related to the information flow from the most task-specific layers closer to the classifier (Liu et al., 2019a), to the initial layers which appear to be the most task-invariant (Hao et al., 2019), and where the tokens resemble the input tokens the most (Brunner et al., 2020) For BERT, this has been achieved through experiments with loss functions (Sanh et al., 2019; Jiao et al., 2019), mimicking the activation patterns of individual portions of the teacher network (Sun et al., 2019a), and knowledge transfer at the pre-training (Turc et al., 2019; Jiao et al., 2019; Sun et al., 2020) or fine-tuning stage (Jiao et al., 2019). 
In particular, they were shown to rely on shallow heuristics in natural language inference (McCoy et al., 2019b; Zellers et al., 2019; Jin et al., 2020), reading comprehension (Si et al., 2019; Rogers et al., 2020; Sugawara et al., 2020; Yogatama et al., 2019), argument reasoning comprehension (Niven and Kao, 2019), and text classification (Jin et al., 2020). Several studies explored the possibilities of improving the fine-tuning of BERT:\\nTaking more layers into account: learning a complementary representation of the information in deep and output layers (Yang and Zhao, 2019), using a weighted combination of all layers instead of the final one (Su and Cheng, 2019; Kondratyuk and Straka, 2019), and layer dropout (Kondratyuk and Straka, 2019).\\n For BERT, Clark et al. (2019) observe that most heads in the same layer show similar self-attention patterns (perhaps related to the fact that the output of all self-attention heads in a layer is passed through the same MLP), which explains why Michel et al. (2019) were able to reduce most layers to a single head.\\n', 'score': 0.4248892, 'raw_content': None}], 'response_time': 2.16}"}</function> <function=get_arxiv_id>{"web_results": "{'query': 'arxiv id of BERT', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': '[2103.11943] BERT: A Review of Applications in Natural Language ...', 'url': 'https://arxiv.org/abs/2103.11943', 'content': 'arXiv:2103.11943 (cs) [Submitted on 22 Mar 2021] BERT: A Review of Applications in Natural Language Processing and Understanding. M. V. Koroteev. In this review, we describe the application of one of the most popular deep learning-based language models - BERT. The paper describes the mechanism of operation of this model, the main areas of its ...', 'score': 0.99411184, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://aclanthology.org/N19-1423/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning ...', 'score': 0.9222025, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://research.google/pubs/bert-pre-training-of-deep-bidirectional-transformers-for-language-understanding/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers.', 'score': 0.87652874, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://arxiv.org/abs/1810.04805', 'content': 'We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. 
As a result, the pre-trained BERT model can be fine-tuned ...', 'score': 0.66115755, 'raw_content': None}, {'title': 'A Primer in BERTology: What We Know About How BERT Works', 'url': 'https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00349/96482/A-Primer-in-BERTology-What-We-Know-About-How-BERT', 'content': 'The issue of model depth must be related to the information flow from the most task-specific layers closer to the classifier (Liu et al., 2019a), to the initial layers which appear to be the most task-invariant (Hao et al., 2019), and where the tokens resemble the input tokens the most (Brunner et al., 2020) For BERT, this has been achieved through experiments with loss functions (Sanh et al., 2019; Jiao et al., 2019), mimicking the activation patterns of individual portions of the teacher network (Sun et al., 2019a), and knowledge transfer at the pre-training (Turc et al., 2019; Jiao et al., 2019; Sun et al., 2020) or fine-tuning stage (Jiao et al., 2019). In particular, they were shown to rely on shallow heuristics in natural language inference (McCoy et al., 2019b; Zellers et al., 2019; Jin et al., 2020), reading comprehension (Si et al., 2019; Rogers et al., 2020; Sugawara et al., 2020; Yogatama et al., 2019), argument reasoning comprehension (Niven and Kao, 2019), and text classification (Jin et al., 2020). Several studies explored the possibilities of improving the fine-tuning of BERT:\\nTaking more layers into account: learning a complementary representation of the information in deep and output layers (Yang and Zhao, 2019), using a weighted combination of all layers instead of the final one (Su and Cheng, 2019; Kondratyuk and Straka, 2019), and layer dropout (Kondratyuk and Straka, 2019).\\n For BERT, Clark et al. (2019) observe that most heads in the same layer show similar self-attention patterns (perhaps related to the fact that the output of all self-attention heads in a layer is passed through the same MLP), which explains why Michel et al. (2019) were able to reduce most layers to a single head.\\n', 'score': 0.4248892, 'raw_content': None}], 'response_time': 2.16}"}</function>
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### Extracting Arxiv IDs: #### Extracting Arxiv IDs:
By this point you will have gathered that the author is allergic to writing regex. To work around this, we will simply use an `8b` model instance to extract the `arxiv id` from the search results (an optional regex-based sketch follows the extraction output below):
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def get_arxiv_ids(web_results: dict, temperature: int = 0, max_tokens=512): def get_arxiv_ids(web_results: dict, temperature: int = 0, max_tokens=512):
# Initialize chat history with a specific prompt to extract arXiv IDs # Initialize chat history with a specific prompt to extract arXiv IDs
    arxiv_id_chat_history = [{"role": "system", "content": "Given this input, return the arXiv ID of the paper. The input has the query and web results. DO NOT WRITE ANYTHING ELSE IN YOUR RESPONSE: ONLY THE ARXIV ID, ONCE. The web results will repeat it multiple times; return it a single time, and only where it is actually an arXiv ID."}, {"role": "user", "content": f"Here is the query and results: {web_results}"}]
# Call the model to process the input and extract arXiv IDs # Call the model to process the input and extract arXiv IDs
response = client.chat.completions.create( response = client.chat.completions.create(
model="llama-3.1-8b-instant", # Adjust the model as necessary model="llama-3.1-8b-instant", # Adjust the model as necessary
messages=arxiv_id_chat_history, messages=arxiv_id_chat_history,
max_tokens=max_tokens, max_tokens=max_tokens,
temperature=temperature temperature=temperature
) )
# Append the assistant's response to the chat history # Append the assistant's response to the chat history
arxiv_id_chat_history.append({ arxiv_id_chat_history.append({
"role": "assistant", "role": "assistant",
"content": response.choices[0].message.content "content": response.choices[0].message.content
}) })
# Return the extracted arXiv IDs # Return the extracted arXiv IDs
return response.choices[0].message.content return response.choices[0].message.content
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
print(get_arxiv_ids(search_results[0])) print(get_arxiv_ids(search_results[0]))
print(get_arxiv_ids(search_results[1])) print(get_arxiv_ids(search_results[1]))
``` ```
%% Output %% Output
2407.21783 2407.21783
2103.11943 2103.11943
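%% Cell type:markdown id: tags:
If you would rather not spend an extra LLM call on this step, a plain regex works just as well for the common `YYMM.NNNNN` ID format. The cell below is an optional sketch rather than part of the pipeline used in the rest of this notebook; `get_arxiv_id_regex` is a hypothetical helper that simply returns the first ID-shaped match in the search results.
%% Cell type:code id: tags:
``` python
import re

# Optional sketch: extract the first arXiv-style ID (e.g. 2407.21783) with a regex
# instead of calling the 8b model. This is a heuristic and assumes the ID appears
# somewhere in the search results in the usual YYMM.NNNNN form.
def get_arxiv_id_regex(web_results):
    match = re.search(r"\d{4}\.\d{4,5}", str(web_results))
    return match.group(0) if match else None

# Hypothetical usage, mirroring the cells above:
# print(get_arxiv_id_regex(search_results[0]))
```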
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### Downloading the papers and extracting details: #### Downloading the papers and extracting details:
The Llama 3.1 family of LLM(s) is more than capable of summarising raw text extracted from a PDF. However, we are still bound by the (generous) 128k-token context length, so to stay within it we will keep only the first 20,000 words of each paper.
The functions below handle downloading the PDF(s) and extracting their text:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Function to download PDF using arxiv library # Function to download PDF using arxiv library
def download_pdf(arxiv_id, filename): def download_pdf(arxiv_id, filename):
paper = next(arxiv.Client().results(arxiv.Search(id_list=[arxiv_id]))) paper = next(arxiv.Client().results(arxiv.Search(id_list=[arxiv_id])))
paper.download_pdf(filename=filename) paper.download_pdf(filename=filename)
# Function to convert PDF to text # Function to convert PDF to text
def pdf_to_text(filename): def pdf_to_text(filename):
with open(filename, "rb") as file: with open(filename, "rb") as file:
reader = PyPDF2.PdfReader(file) reader = PyPDF2.PdfReader(file)
text = "" text = ""
for page in reader.pages: for page in reader.pages:
if page.extract_text(): if page.extract_text():
text += page.extract_text() + " " text += page.extract_text() + " "
return text return text
# Function to truncate text to the first 20,000 words
def truncate_text(text, limit=20000): def truncate_text(text, limit=20000):
words = text.split() words = text.split()
truncated = ' '.join(words[:limit]) truncated = ' '.join(words[:limit])
return truncated return truncated
# Main function to process an arXiv ID # Main function to process an arXiv ID
def process_arxiv_paper(arxiv_id): def process_arxiv_paper(arxiv_id):
pdf_filename = f"{arxiv_id}.pdf" pdf_filename = f"{arxiv_id}.pdf"
txt_filename = f"{arxiv_id}.txt" txt_filename = f"{arxiv_id}.txt"
# Download PDF # Download PDF
download_pdf(arxiv_id, pdf_filename) download_pdf(arxiv_id, pdf_filename)
# Convert PDF to text # Convert PDF to text
text = pdf_to_text(pdf_filename) text = pdf_to_text(pdf_filename)
# Truncate text # Truncate text
truncated_text = truncate_text(text) truncated_text = truncate_text(text)
# Save to txt file # Save to txt file
with open(txt_filename, "w", encoding="utf-8") as file: with open(txt_filename, "w", encoding="utf-8") as file:
file.write(truncated_text) file.write(truncated_text)
print(f"Processed text saved to {txt_filename}") print(f"Processed text saved to {txt_filename}")
# Example usage # Example usage
arxiv_id = "2407.21783" arxiv_id = "2407.21783"
process_arxiv_paper(arxiv_id) process_arxiv_paper(arxiv_id)
arxiv_id = "2103.11943" arxiv_id = "2103.11943"
process_arxiv_paper(arxiv_id) process_arxiv_paper(arxiv_id)
``` ```
%% Output %% Output
Processed text saved to 2407.21783.txt Processed text saved to 2407.21783.txt
Processed text saved to 2103.11943.txt Processed text saved to 2103.11943.txt
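%% Cell type:markdown id: tags:
Before handing the truncated text to the model, it can be reassuring to sanity-check that it fits comfortably inside the context window. The cell below is only a rough heuristic (it assumes roughly 1.3 tokens per English word, which varies by tokenizer); `rough_token_estimate` is a hypothetical helper, not part of the original flow.
%% Cell type:code id: tags:
``` python
# Rough sanity check: estimate the token count of each truncated text file.
# Assumption: ~1.3 tokens per English word, which is only an approximation.
def rough_token_estimate(path, tokens_per_word=1.3):
    with open(path, encoding="utf-8") as file:
        n_words = len(file.read().split())
    return int(n_words * tokens_per_word)

for arxiv_id in ("2407.21783", "2103.11943"):
    estimate = rough_token_estimate(f"{arxiv_id}.txt")
    print(f"{arxiv_id}: ~{estimate} tokens (context limit is 128k)")
```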
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### Summarising logic: #### Summarising logic:
We can use an `8b` model instance to summarise our papers:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
SUMMARISER_PROMPT = """ SUMMARISER_PROMPT = """
Cutting Knowledge Date: December 2023 Cutting Knowledge Date: December 2023
Today Date: 15 September 2024 Today Date: 15 September 2024
You are an expert summariser of research papers. Below you will get the text of an arXiv paper as input; your job is to read it carefully and return a concise summary, followed by a few bullet points with the key takeaways.
""" """
def summarize_text_file(file_name: str, temperature: int = 0, max_tokens=2048): def summarize_text_file(file_name: str, temperature: int = 0, max_tokens=2048):
# Read the content of the file # Read the content of the file
with open(file_name, 'r') as file: with open(file_name, 'r') as file:
file_content = file.read() file_content = file.read()
# Initialize chat history # Initialize chat history
chat_history = [{"role": "system", "content": f"{SUMMARISER_PROMPT}"}, {"role": "user", "content": f"Text of the paper: {file_content}"}] chat_history = [{"role": "system", "content": f"{SUMMARISER_PROMPT}"}, {"role": "user", "content": f"Text of the paper: {file_content}"}]
# Generate a summary using the model # Generate a summary using the model
response = client.chat.completions.create( response = client.chat.completions.create(
model="llama-3.1-8b-instant", # You can change the model as needed model="llama-3.1-8b-instant", # You can change the model as needed
messages=chat_history, messages=chat_history,
max_tokens=max_tokens, max_tokens=max_tokens,
temperature=temperature temperature=temperature
) )
# Append the assistant's response to the chat history # Append the assistant's response to the chat history
chat_history.append({ chat_history.append({
"role": "assistant", "role": "assistant",
"content": response.choices[0].message.content "content": response.choices[0].message.content
}) })
# Return the summary # Return the summary
return response.choices[0].message.content return response.choices[0].message.content
``` ```
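%% Cell type:markdown id: tags:
Note that `SUMMARISER_PROMPT` hard-codes the `Today Date` line. If you prefer the prompt to stay current, one option is to build it dynamically, as in this optional sketch (it assumes you are happy to regenerate the system prompt on every run):
%% Cell type:code id: tags:
``` python
from datetime import date

# Optional variant: same summariser prompt, but with today's date filled in at runtime.
DYNAMIC_SUMMARISER_PROMPT = f"""
Cutting Knowledge Date: December 2023
Today Date: {date.today().strftime('%d %B %Y')}

You are an expert summariser of research papers. Below you will get the text of an arXiv paper as input; your job is to read it carefully and return a concise summary, followed by a few bullet points with the key takeaways.
"""
```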
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
paper_1_summary = summarize_text_file("2407.21783.txt") paper_1_summary = summarize_text_file("2407.21783.txt")
print(paper_1_summary) print(paper_1_summary)
``` ```
%% Output %% Output
Summary: Summary:
This paper introduces Llama 3, a new set of foundation models developed by Meta AI. The Llama 3 family consists of models with 8B, 70B, and 405B parameters, capable of handling tasks in multiple languages and modalities. The paper details the pre-training and post-training processes, infrastructure improvements, and evaluations across various benchmarks. Llama 3 demonstrates competitive performance compared to other leading language models, including GPT-4 and Claude 3.5 Sonnet, on a wide range of tasks. The paper also explores multimodal capabilities by integrating vision and speech components, although these are still under development and not ready for release. This paper introduces Llama 3, a new set of foundation models developed by Meta AI. The Llama 3 family consists of models with 8B, 70B, and 405B parameters, capable of handling tasks in multiple languages and modalities. The paper details the pre-training and post-training processes, infrastructure improvements, and evaluations across various benchmarks. Llama 3 demonstrates competitive performance compared to other leading language models, including GPT-4 and Claude 3.5 Sonnet, on a wide range of tasks. The paper also explores multimodal capabilities by integrating vision and speech components, although these are still under development and not ready for release.
Key takeaways: Key takeaways:
Llama 3 includes models with 8B, 70B, and 405B parameters, with the flagship 405B model trained on 15.6T tokens. Llama 3 includes models with 8B, 70B, and 405B parameters, with the flagship 405B model trained on 15.6T tokens.
The models excel in multilingual capabilities, coding, reasoning, and tool usage. The models excel in multilingual capabilities, coding, reasoning, and tool usage.
Llama 3 uses a dense Transformer architecture with minimal modifications, focusing on high-quality data and increased training scale. Llama 3 uses a dense Transformer architecture with minimal modifications, focusing on high-quality data and increased training scale.
The training process involved significant infrastructure improvements to handle large-scale distributed training. The training process involved significant infrastructure improvements to handle large-scale distributed training.
Post-training includes supervised fine-tuning, rejection sampling, and direct preference optimization to align the model with human preferences. Post-training includes supervised fine-tuning, rejection sampling, and direct preference optimization to align the model with human preferences.
Llama 3 demonstrates competitive performance on various benchmarks, including MMLU, coding tasks, and math reasoning. Llama 3 demonstrates competitive performance on various benchmarks, including MMLU, coding tasks, and math reasoning.
The paper presents experiments on integrating vision and speech capabilities using a compositional approach. The paper presents experiments on integrating vision and speech capabilities using a compositional approach.
Extensive safety measures were implemented, including pre-training data filtering, safety fine-tuning, and system-level protections. Extensive safety measures were implemented, including pre-training data filtering, safety fine-tuning, and system-level protections.
The authors are releasing the Llama 3 language models publicly to accelerate research and development in AI. The authors are releasing the Llama 3 language models publicly to accelerate research and development in AI.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
paper_2_summary = summarize_text_file("2103.11943.txt") paper_2_summary = summarize_text_file("2103.11943.txt")
print(paper_2_summary) print(paper_2_summary)
``` ```
%% Output %% Output
BERT is a novel language representation model developed by researchers at Google AI. It stands for Bidirectional Encoder Representations from Transformers and introduces a new approach to pre-training deep bidirectional representations from unlabeled text. Unlike previous models that looked at text sequences either from left-to-right or combined left-to-right and right-to-left training, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. BERT is a novel language representation model developed by researchers at Google AI. It stands for Bidirectional Encoder Representations from Transformers and introduces a new approach to pre-training deep bidirectional representations from unlabeled text. Unlike previous models that looked at text sequences either from left-to-right or combined left-to-right and right-to-left training, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.
The key innovation is the application of bidirectional training of Transformer, a popular attention model, to language modeling. This is achieved through two pre-training tasks: Masked Language Model (MLM) and Next Sentence Prediction (NSP). In MLM, the model attempts to predict masked words in a sentence, allowing it to incorporate context from both directions. NSP trains the model to understand relationships between sentences. The key innovation is the application of bidirectional training of Transformer, a popular attention model, to language modeling. This is achieved through two pre-training tasks: Masked Language Model (MLM) and Next Sentence Prediction (NSP). In MLM, the model attempts to predict masked words in a sentence, allowing it to incorporate context from both directions. NSP trains the model to understand relationships between sentences.
BERT significantly outperformed previous state-of-the-art models on a wide range of NLP tasks, including question answering, natural language inference, and others, without substantial task-specific architecture modifications. The researchers demonstrated the effectiveness of BERT by obtaining new state-of-the-art results on eleven natural language processing tasks. BERT significantly outperformed previous state-of-the-art models on a wide range of NLP tasks, including question answering, natural language inference, and others, without substantial task-specific architecture modifications. The researchers demonstrated the effectiveness of BERT by obtaining new state-of-the-art results on eleven natural language processing tasks.
Key Takeaways: Key Takeaways:
BERT introduces deep bidirectional representations, overcoming limitations of previous unidirectional or shallowly bidirectional models. BERT introduces deep bidirectional representations, overcoming limitations of previous unidirectional or shallowly bidirectional models.
The model uses "masked language modeling" (MLM) for bidirectional training of Transformer. The model uses "masked language modeling" (MLM) for bidirectional training of Transformer.
BERT is pre-trained on two tasks: masked language modeling and next sentence prediction. BERT is pre-trained on two tasks: masked language modeling and next sentence prediction.
It achieves state-of-the-art performance on 11 NLP tasks, including an improvement of 7.7% on the GLUE benchmark. It achieves state-of-the-art performance on 11 NLP tasks, including an improvement of 7.7% on the GLUE benchmark.
BERT's architecture allows for fine-tuning with just one additional output layer, making it versatile for various NLP tasks. BERT's architecture allows for fine-tuning with just one additional output layer, making it versatile for various NLP tasks.
The model demonstrates that deep bidirectional language representation improves language understanding compared to left-to-right or shallow bidirectional approaches. The model demonstrates that deep bidirectional language representation improves language understanding compared to left-to-right or shallow bidirectional approaches.
BERT's performance improves with larger model sizes, even on small-scale tasks. BERT's performance improves with larger model sizes, even on small-scale tasks.
The pre-training of BERT is computationally expensive but fine-tuning is relatively inexpensive. The pre-training of BERT is computationally expensive but fine-tuning is relatively inexpensive.
BERT can be used for both fine-tuning and as a feature-based approach, with competitive results in both scenarios. BERT can be used for both fine-tuning and as a feature-based approach, with competitive results in both scenarios.
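%% Cell type:markdown id: tags:
Since every summary costs an LLM call, you may also want to cache results on disk so re-running the notebook does not repeat the work. This is an optional sketch; the `summaries/` directory and the `cached_summary` helper are hypothetical additions, not part of the original flow.
%% Cell type:code id: tags:
``` python
import os

# Optional sketch: reuse a summary from disk if we have already generated it,
# otherwise call summarize_text_file (defined above) and cache the result.
def cached_summary(arxiv_id, cache_dir="summaries"):
    os.makedirs(cache_dir, exist_ok=True)
    cache_path = os.path.join(cache_dir, f"{arxiv_id}.summary.txt")
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as file:
            return file.read()
    summary = summarize_text_file(f"{arxiv_id}.txt")
    with open(cache_path, "w", encoding="utf-8") as file:
        file.write(summary)
    return summary

# Hypothetical usage:
# paper_1_summary = cached_summary("2407.21783")
```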
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
user_input = f""" user_input = f"""
Here are the summaries of the two papers, look at them closely and tell me the differences of the papers: Paper 1 Summary {paper_1_summary} and Paper 2 Summary {paper_2_summary} Here are the summaries of the two papers, look at them closely and tell me the differences of the papers: Paper 1 Summary {paper_1_summary} and Paper 2 Summary {paper_2_summary}
""" """
output = model_chat(user_input, temperature=1) output = model_chat(user_input, temperature=1)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
print(output) print(output)
``` ```
%% Output %% Output
The two paper summaries are about different language models: Llama 3 and BERT. The two paper summaries are about different language models: Llama 3 and BERT.
The main differences are: The main differences are:
1. Model Type: Llama 3 is a set of foundation models developed by Meta AI, while BERT is a language representation model developed by researchers at Google AI. 1. Model Type: Llama 3 is a set of foundation models developed by Meta AI, while BERT is a language representation model developed by researchers at Google AI.
2. Model Architecture: Llama 3 uses a dense Transformer architecture, while BERT uses a bidirectional Transformer architecture. 2. Model Architecture: Llama 3 uses a dense Transformer architecture, while BERT uses a bidirectional Transformer architecture.
3. Training Process: Llama 3 involves significant infrastructure improvements to handle large-scale distributed training, while BERT uses pre-training tasks such as Masked Language Model (MLM) and Next Sentence Prediction (NSP). 3. Training Process: Llama 3 involves significant infrastructure improvements to handle large-scale distributed training, while BERT uses pre-training tasks such as Masked Language Model (MLM) and Next Sentence Prediction (NSP).
4. Multimodal Capabilities: Llama 3 explores multimodal capabilities by integrating vision and speech components, while BERT focuses on text-based language understanding. 4. Multimodal Capabilities: Llama 3 explores multimodal capabilities by integrating vision and speech components, while BERT focuses on text-based language understanding.
5. Performance: Both models demonstrate competitive performance on various benchmarks, but Llama 3 shows performance on tasks such as multilingual capabilities, coding, reasoning, and tool usage, while BERT excels on NLP tasks such as question answering and natural language inference. 5. Performance: Both models demonstrate competitive performance on various benchmarks, but Llama 3 shows performance on tasks such as multilingual capabilities, coding, reasoning, and tool usage, while BERT excels on NLP tasks such as question answering and natural language inference.
6. Release: Llama 3 is released publicly to accelerate research and development in AI, while BERT is released as a state-of-the-art model for NLP tasks. 6. Release: Llama 3 is released publicly to accelerate research and development in AI, while BERT is released as a state-of-the-art model for NLP tasks.
7. Model Size: Llama 3 has models with 8B, 70B, and 405B parameters, while BERT's model size is not specified in the summary. 7. Model Size: Llama 3 has models with 8B, 70B, and 405B parameters, while BERT's model size is not specified in the summary.
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Part 2: Handle the function calling logic: ## Part 2: Handle the function calling logic:
Now that we have validated an MVP, we can write a simple function to handle the tool-calling logic:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def handle_llm_output(llm_output): def handle_llm_output(llm_output):
# Check if the output starts with "<function=" # Check if the output starts with "<function="
if llm_output.startswith("<function="): if llm_output.startswith("<function="):
return extract_details_and_call_function(llm_output) return extract_details_and_call_function(llm_output)
else: else:
# Output does not start with "<function=", return as is # Output does not start with "<function=", return as is
return llm_output return llm_output
def extract_details_and_call_function(input_string): def extract_details_and_call_function(input_string):
# Extract the function name and parameters # Extract the function name and parameters
prefix = "<function=" prefix = "<function="
suffix = "</function>" suffix = "</function>"
start = input_string.find(prefix) + len(prefix) start = input_string.find(prefix) + len(prefix)
end = input_string.find(suffix) end = input_string.find(suffix)
function_and_params = input_string[start:end] function_and_params = input_string[start:end]
# Split to get function name and parameters # Split to get function name and parameters
function_name, params_json = function_and_params.split(">{") function_name, params_json = function_and_params.split(">{")
function_name = function_name.strip() function_name = function_name.strip()
params_json = "{" + params_json params_json = "{" + params_json
# Convert parameters to dictionary # Convert parameters to dictionary
params = json.loads(params_json) params = json.loads(params_json)
# Call the function dynamically # Call the function dynamically
function_map = { function_map = {
"query_for_two_papers": query_for_two_papers, "query_for_two_papers": query_for_two_papers,
"get_arxiv_id": get_arxiv_ids, "get_arxiv_id": get_arxiv_ids,
"process_arxiv_paper": process_arxiv_paper, "process_arxiv_paper": process_arxiv_paper,
"summarise_text_file": summarize_text_file "summarise_text_file": summarize_text_file
} }
if function_name in function_map: if function_name in function_map:
result = function_map[function_name](**params) result = function_map[function_name](**params)
return result return result
else: else:
return "Function not found" return "Function not found"
# Testing usage # Testing usage
llm_outputs = [ llm_outputs = [
"<function=query_for_two_papers>{\"paper_1\": \"Llama 3.1\", \"paper_2\": \"BERT\"}</function>", "<function=query_for_two_papers>{\"paper_1\": \"Llama 3.1\", \"paper_2\": \"BERT\"}</function>",
"Llama 3.2 models are here too btw!" "Llama 3.2 models are here too btw!"
] ]
for output in llm_outputs: for output in llm_outputs:
result = handle_llm_output(output) result = handle_llm_output(output)
print(result) print(result)
``` ```
%% Output %% Output
[{'query': 'arxiv id of Llama 3.1', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': 'TheLlama3HerdofModels - arXiv.org', 'url': 'https://arxiv.org/pdf/2407.21783', 'content': 'arXiv:2407.21783v2 [cs.AI] 15 Aug 2024. Finetuned Multilingual Longcontext Tooluse Release ... The model architecture of Llama 3 is illustrated in Figure1. The development of our Llama 3 language modelscomprisestwomainstages:', 'score': 0.9961004, 'raw_content': None}, {'title': '[PDF] The Llama 3 Herd of Models - Semantic Scholar', 'url': 'https://www.semanticscholar.org/paper/The-Llama-3-Herd-of-Models-Dubey-Jauhri/6520557cc3bfd198f960cc8cb6151c3474321bd8', 'content': 'DOI: 10.48550/arXiv.2407.21783 Corpus ID: 271571434; The Llama 3 Herd of Models @article{Dubey2024TheL3, title={The Llama 3 Herd of Models}, author={Abhimanyu Dubey and Abhinav Jauhri and Abhinav Pandey and Abhishek Kadian and Ahmad Al-Dahle and Aiesha Letman and Akhil Mathur and Alan Schelten and Amy Yang and Angela Fan and Anirudh Goyal and Anthony Hartshorn and Aobo Yang and Archi Mitra and ...', 'score': 0.9943581, 'raw_content': None}, {'title': 'The Llama 3 Herd of Models | Research - AI at Meta', 'url': 'https://ai.meta.com/research/publications/the-llama-3-herd-of-models/', 'content': 'This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety.', 'score': 0.9320833, 'raw_content': None}, {'title': 'Introducing Llama 3.1: Our most capable models to date - Meta AI', 'url': 'https://ai.meta.com/blog/meta-llama-3-1/', 'content': 'Bringing open intelligence to all, our latest models expand context length to 128K, add support across eight languages, and include Llama 3.1 405B—the first frontier-level open source AI model. Llama 3.1 405B is in a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities that rival the best closed source models.', 'score': 0.8467045, 'raw_content': None}, {'title': '[2407.21783] The Llama 3 Herd of Models - arXiv.org', 'url': 'https://arxiv.org/abs/2407.21783', 'content': 'Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive ...', 'score': 0.68257374, 'raw_content': None}], 'response_time': 1.7}, {'query': 'arxiv id of BERT', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': '[2103.11943] BERT: A Review of Applications in Natural Language ...', 'url': 'https://arxiv.org/abs/2103.11943', 'content': 'arXiv:2103.11943 (cs) [Submitted on 22 Mar 2021] BERT: A Review of Applications in Natural Language Processing and Understanding. M. V. Koroteev. In this review, we describe the application of one of the most popular deep learning-based language models - BERT. 
The paper describes the mechanism of operation of this model, the main areas of its ...', 'score': 0.99411184, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://aclanthology.org/N19-1423/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning ...', 'score': 0.9222025, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://research.google/pubs/bert-pre-training-of-deep-bidirectional-transformers-for-language-understanding/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers.', 'score': 0.87652874, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://arxiv.org/abs/1810.04805', 'content': 'We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned ...', 'score': 0.66115755, 'raw_content': None}, {'title': 'A Primer in BERTology: What We Know About How BERT Works', 'url': 'https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00349/96482/A-Primer-in-BERTology-What-We-Know-About-How-BERT', 'content': 'The issue of model depth must be related to the information flow from the most task-specific layers closer to the classifier (Liu et al., 2019a), to the initial layers which appear to be the most task-invariant (Hao et al., 2019), and where the tokens resemble the input tokens the most (Brunner et al., 2020) For BERT, this has been achieved through experiments with loss functions (Sanh et al., 2019; Jiao et al., 2019), mimicking the activation patterns of individual portions of the teacher network (Sun et al., 2019a), and knowledge transfer at the pre-training (Turc et al., 2019; Jiao et al., 2019; Sun et al., 2020) or fine-tuning stage (Jiao et al., 2019). In particular, they were shown to rely on shallow heuristics in natural language inference (McCoy et al., 2019b; Zellers et al., 2019; Jin et al., 2020), reading comprehension (Si et al., 2019; Rogers et al., 2020; Sugawara et al., 2020; Yogatama et al., 2019), argument reasoning comprehension (Niven and Kao, 2019), and text classification (Jin et al., 2020). Several studies explored the possibilities of improving the fine-tuning of BERT:\nTaking more layers into account: learning a complementary representation of the information in deep and output layers (Yang and Zhao, 2019), using a weighted combination of all layers instead of the final one (Su and Cheng, 2019; Kondratyuk and Straka, 2019), and layer dropout (Kondratyuk and Straka, 2019).\n For BERT, Clark et al. 
(2019) observe that most heads in the same layer show similar self-attention patterns (perhaps related to the fact that the output of all self-attention heads in a layer is passed through the same MLP), which explains why Michel et al. (2019) were able to reduce most layers to a single head.\n', 'score': 0.4250085, 'raw_content': None}], 'response_time': 2.2}] [{'query': 'arxiv id of Llama 3.1', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': 'TheLlama3HerdofModels - arXiv.org', 'url': 'https://arxiv.org/pdf/2407.21783', 'content': 'arXiv:2407.21783v2 [cs.AI] 15 Aug 2024. Finetuned Multilingual Longcontext Tooluse Release ... The model architecture of Llama 3 is illustrated in Figure1. The development of our Llama 3 language modelscomprisestwomainstages:', 'score': 0.9961004, 'raw_content': None}, {'title': '[PDF] The Llama 3 Herd of Models - Semantic Scholar', 'url': 'https://www.semanticscholar.org/paper/The-Llama-3-Herd-of-Models-Dubey-Jauhri/6520557cc3bfd198f960cc8cb6151c3474321bd8', 'content': 'DOI: 10.48550/arXiv.2407.21783 Corpus ID: 271571434; The Llama 3 Herd of Models @article{Dubey2024TheL3, title={The Llama 3 Herd of Models}, author={Abhimanyu Dubey and Abhinav Jauhri and Abhinav Pandey and Abhishek Kadian and Ahmad Al-Dahle and Aiesha Letman and Akhil Mathur and Alan Schelten and Amy Yang and Angela Fan and Anirudh Goyal and Anthony Hartshorn and Aobo Yang and Archi Mitra and ...', 'score': 0.9943581, 'raw_content': None}, {'title': 'The Llama 3 Herd of Models | Research - AI at Meta', 'url': 'https://ai.meta.com/research/publications/the-llama-3-herd-of-models/', 'content': 'This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety.', 'score': 0.9320833, 'raw_content': None}, {'title': 'Introducing Llama 3.1: Our most capable models to date - Meta AI', 'url': 'https://ai.meta.com/blog/meta-llama-3-1/', 'content': 'Bringing open intelligence to all, our latest models expand context length to 128K, add support across eight languages, and include Llama 3.1 405B—the first frontier-level open source AI model. Llama 3.1 405B is in a class of its own, with unmatched flexibility, control, and state-of-the-art capabilities that rival the best closed source models.', 'score': 0.8467045, 'raw_content': None}, {'title': '[2407.21783] The Llama 3 Herd of Models - arXiv.org', 'url': 'https://arxiv.org/abs/2407.21783', 'content': 'Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive ...', 'score': 0.68257374, 'raw_content': None}], 'response_time': 1.7}, {'query': 'arxiv id of BERT', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': '[2103.11943] BERT: A Review of Applications in Natural Language ...', 'url': 'https://arxiv.org/abs/2103.11943', 'content': 'arXiv:2103.11943 (cs) [Submitted on 22 Mar 2021] BERT: A Review of Applications in Natural Language Processing and Understanding. M. V. Koroteev. 
In this review, we describe the application of one of the most popular deep learning-based language models - BERT. The paper describes the mechanism of operation of this model, the main areas of its ...', 'score': 0.99411184, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://aclanthology.org/N19-1423/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning ...', 'score': 0.9222025, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://research.google/pubs/bert-pre-training-of-deep-bidirectional-transformers-for-language-understanding/', 'content': 'Abstract. We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers.', 'score': 0.87652874, 'raw_content': None}, {'title': 'BERT: Pre-training of Deep Bidirectional Transformers for Language ...', 'url': 'https://arxiv.org/abs/1810.04805', 'content': 'We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned ...', 'score': 0.66115755, 'raw_content': None}, {'title': 'A Primer in BERTology: What We Know About How BERT Works', 'url': 'https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00349/96482/A-Primer-in-BERTology-What-We-Know-About-How-BERT', 'content': 'The issue of model depth must be related to the information flow from the most task-specific layers closer to the classifier (Liu et al., 2019a), to the initial layers which appear to be the most task-invariant (Hao et al., 2019), and where the tokens resemble the input tokens the most (Brunner et al., 2020) For BERT, this has been achieved through experiments with loss functions (Sanh et al., 2019; Jiao et al., 2019), mimicking the activation patterns of individual portions of the teacher network (Sun et al., 2019a), and knowledge transfer at the pre-training (Turc et al., 2019; Jiao et al., 2019; Sun et al., 2020) or fine-tuning stage (Jiao et al., 2019). In particular, they were shown to rely on shallow heuristics in natural language inference (McCoy et al., 2019b; Zellers et al., 2019; Jin et al., 2020), reading comprehension (Si et al., 2019; Rogers et al., 2020; Sugawara et al., 2020; Yogatama et al., 2019), argument reasoning comprehension (Niven and Kao, 2019), and text classification (Jin et al., 2020). 
Several studies explored the possibilities of improving the fine-tuning of BERT:\nTaking more layers into account: learning a complementary representation of the information in deep and output layers (Yang and Zhao, 2019), using a weighted combination of all layers instead of the final one (Su and Cheng, 2019; Kondratyuk and Straka, 2019), and layer dropout (Kondratyuk and Straka, 2019).\n For BERT, Clark et al. (2019) observe that most heads in the same layer show similar self-attention patterns (perhaps related to the fact that the output of all self-attention heads in a layer is passed through the same MLP), which explains why Michel et al. (2019) were able to reduce most layers to a single head.\n', 'score': 0.4250085, 'raw_content': None}], 'response_time': 2.2}]
This is a regular output without function call. This is a regular output without function call.
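%% Cell type:markdown id: tags:
To see how the pieces fit together end to end, here is a minimal dispatch-loop sketch: ask the model for a response, execute any `<function=...>` call it emits, and hand the tool output back for the final answer. `run_with_tools` is a hypothetical helper built on the `model_chat` and `handle_llm_output` functions defined above, and it assumes `model_chat` keeps appending to the shared chat history.
%% Cell type:code id: tags:
``` python
# A minimal sketch of the dispatch loop, assuming model_chat and handle_llm_output
# from the cells above are in scope.
def run_with_tools(user_query, temperature=0):
    # Step 1: let the model decide whether to emit a <function=...> tool call
    first_response = model_chat(user_query, temperature=temperature)

    # Step 2: execute the tool call, or pass plain text through unchanged
    tool_result = handle_llm_output(first_response)

    # Step 3: if a tool ran, hand its output back to the model for a final answer
    if first_response.startswith("<function="):
        follow_up = f"Here is the tool output, use it to answer the original question: {tool_result}"
        return model_chat(follow_up, temperature=temperature)
    return tool_result

# Hypothetical usage:
# print(run_with_tools("Find the arxiv papers for Llama 3.1 and BERT"))
```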
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
#fin #fin
``` ```