Skip to content
Snippets Groups Projects
Commit c68c044e authored by Thierry Moreau's avatar Thierry Moreau
Browse files

authorship update

parent 47ccf9be
No related branches found
No related tags found
No related merge requests found
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# **Getting to know Llama 2: Everything you need to start building** # **Getting to know Llama 2: Everything you need to start building**
Our goal in this session is to provide a guided tour of Llama 2, including understanding different Llama 2 models, how and where to access them, Generative AI and Chatbot architectures, prompt engineering, RAG (Retrieval Augmented Generation), Fine-tuning and more. All this is implemented with a starter code for you to take it and use it in your Llama 2 projects. Our goal in this session is to provide a guided tour of Llama 2, including understanding different Llama 2 models, how and where to access them, Generative AI and Chatbot architectures, prompt engineering, RAG (Retrieval Augmented Generation), Fine-tuning and more. All this is implemented with a starter code for you to take it and use it in your Llama 2 projects.
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
##**0 - Prerequisites** ##**0 - Prerequisites**
* Basic understanding of Large Language Models * Basic understanding of Large Language Models
* Basic understanding of Python * Basic understanding of Python
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# presentation layer code # presentation layer code
import base64 import base64
from IPython.display import Image, display from IPython.display import Image, display
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
def mm(graph): def mm(graph):
graphbytes = graph.encode("ascii") graphbytes = graph.encode("ascii")
base64_bytes = base64.b64encode(graphbytes) base64_bytes = base64.b64encode(graphbytes)
base64_string = base64_bytes.decode("ascii") base64_string = base64_bytes.decode("ascii")
display(Image(url="https://mermaid.ink/img/" + base64_string)) display(Image(url="https://mermaid.ink/img/" + base64_string))
def genai_app_arch(): def genai_app_arch():
mm(""" mm("""
flowchart TD flowchart TD
A[Users] --> B(Applications e.g. mobile, web) A[Users] --> B(Applications e.g. mobile, web)
B --> |Hosted API|C(Platforms e.g. Custom, OctoAI, HuggingFace, Replicate) B --> |Hosted API|C(Platforms e.g. Custom, OctoAI, HuggingFace, Replicate)
B -- optional --> E(Frameworks e.g. LangChain) B -- optional --> E(Frameworks e.g. LangChain)
C-->|User Input|D[Llama 2] C-->|User Input|D[Llama 2]
D-->|Model Output|C D-->|Model Output|C
E --> C E --> C
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
def rag_arch(): def rag_arch():
mm(""" mm("""
flowchart TD flowchart TD
A[User Prompts] --> B(Frameworks e.g. LangChain) A[User Prompts] --> B(Frameworks e.g. LangChain)
B <--> |Database, Docs, XLS|C[fa:fa-database External Data] B <--> |Database, Docs, XLS|C[fa:fa-database External Data]
B -->|API|D[Llama 2] B -->|API|D[Llama 2]
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
def llama2_family(): def llama2_family():
mm(""" mm("""
graph LR; graph LR;
llama-2 --> llama-2-7b llama-2 --> llama-2-7b
llama-2 --> llama-2-13b llama-2 --> llama-2-13b
llama-2 --> llama-2-70b llama-2 --> llama-2-70b
llama-2-7b --> llama-2-7b-chat llama-2-7b --> llama-2-7b-chat
llama-2-13b --> llama-2-13b-chat llama-2-13b --> llama-2-13b-chat
llama-2-70b --> llama-2-70b-chat llama-2-70b --> llama-2-70b-chat
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
def apps_and_llms(): def apps_and_llms():
mm(""" mm("""
graph LR; graph LR;
users --> apps users --> apps
apps --> frameworks apps --> frameworks
frameworks --> platforms frameworks --> platforms
platforms --> Llama 2 platforms --> Llama 2
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
import ipywidgets as widgets import ipywidgets as widgets
from IPython.display import display, Markdown from IPython.display import display, Markdown
# Create a text widget # Create a text widget
API_KEY = widgets.Password( API_KEY = widgets.Password(
value='', value='',
placeholder='', placeholder='',
description='API_KEY:', description='API_KEY:',
disabled=False disabled=False
) )
def md(t): def md(t):
display(Markdown(t)) display(Markdown(t))
def bot_arch(): def bot_arch():
mm(""" mm("""
graph LR; graph LR;
user --> prompt user --> prompt
prompt --> i_safety prompt --> i_safety
i_safety --> context i_safety --> context
context --> Llama_2 context --> Llama_2
Llama_2 --> output Llama_2 --> output
output --> o_safety output --> o_safety
i_safety --> memory i_safety --> memory
o_safety --> memory o_safety --> memory
memory --> context memory --> context
o_safety --> user o_safety --> user
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
def fine_tuned_arch(): def fine_tuned_arch():
mm(""" mm("""
graph LR; graph LR;
Custom_Dataset --> Pre-trained_Llama Custom_Dataset --> Pre-trained_Llama
Pre-trained_Llama --> Fine-tuned_Llama Pre-trained_Llama --> Fine-tuned_Llama
Fine-tuned_Llama --> RLHF Fine-tuned_Llama --> RLHF
RLHF --> |Loss:Cross-Entropy|Fine-tuned_Llama RLHF --> |Loss:Cross-Entropy|Fine-tuned_Llama
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
def load_data_faiss_arch(): def load_data_faiss_arch():
mm(""" mm("""
graph LR; graph LR;
documents --> textsplitter documents --> textsplitter
textsplitter --> embeddings textsplitter --> embeddings
embeddings --> vectorstore embeddings --> vectorstore
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
def mem_context(): def mem_context():
mm(""" mm("""
graph LR graph LR
context(text) context(text)
user_prompt --> context user_prompt --> context
instruction --> context instruction --> context
examples --> context examples --> context
memory --> context memory --> context
context --> tokenizer context --> tokenizer
tokenizer --> embeddings tokenizer --> embeddings
embeddings --> LLM embeddings --> LLM
classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms; classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;
""") """)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
##**1 - Understanding Llama 2** ##**1 - Understanding Llama 2**
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **1.1 - What is Llama 2?** ### **1.1 - What is Llama 2?**
* State of the art (SOTA), Open Source LLM * State of the art (SOTA), Open Source LLM
* 7B, 13B, 70B * 7B, 13B, 70B
* Pretrained + Chat * Pretrained + Chat
* Choosing model: Size, Quality, Cost, Speed * Choosing model: Size, Quality, Cost, Speed
* [Research paper](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/) * [Research paper](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/)
* [Responsible use guide](https://ai.meta.com/llama/responsible-use-guide/) * [Responsible use guide](https://ai.meta.com/llama/responsible-use-guide/)
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
llama2_family() llama2_family()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
###**1.2 - Accessing Llama 2** ###**1.2 - Accessing Llama 2**
* Download + Self Host (on-premise) * Download + Self Host (on-premise)
* Hosted API Platform (e.g. [OctoAI](https://octoai.cloud/), [Replicate](https://replicate.com/meta)) * Hosted API Platform (e.g. [OctoAI](https://octoai.cloud/), [Replicate](https://replicate.com/meta))
* Hosted Container Platform (e.g. [Azure](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/introducing-llama-2-on-azure/ba-p/3881233), [AWS](https://aws.amazon.com/blogs/machine-learning/llama-2-foundation-models-from-meta-are-now-available-in-amazon-sagemaker-jumpstart/), [GCP](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/139)) * Hosted Container Platform (e.g. [Azure](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/introducing-llama-2-on-azure/ba-p/3881233), [AWS](https://aws.amazon.com/blogs/machine-learning/llama-2-foundation-models-from-meta-are-now-available-in-amazon-sagemaker-jumpstart/), [GCP](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/139))
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **1.3 - Use Cases of Llama 2** ### **1.3 - Use Cases of Llama 2**
* Content Generation * Content Generation
* Chatbots * Chatbots
* Summarization * Summarization
* Programming (e.g. Code Llama) * Programming (e.g. Code Llama)
* and many more... * and many more...
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
##**2 - Using Llama 2** ##**2 - Using Llama 2**
In this notebook, we are going to access [Llama 13b chat model](https://octoai.cloud/tools/text/chat?mode=demo&model=llama-2-13b-chat-fp16) using hosted API from OctoAI. In this notebook, we are going to access [Llama 13b chat model](https://octoai.cloud/tools/text/chat?mode=demo&model=llama-2-13b-chat-fp16) using hosted API from OctoAI.
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **2.1 - Install dependencies** ### **2.1 - Install dependencies**
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Install dependencies and initialize # Install dependencies and initialize
%pip install -qU \ %pip install -qU \
octoai-sdk \ octoai-sdk \
langchain \ langchain \
sentence_transformers \ sentence_transformers \
pdf2image \ pdf2image \
pdfminer \ pdfminer \
pdfminer.six \ pdfminer.six \
unstructured \ unstructured \
faiss-cpu \ faiss-cpu \
pillow-heif \ pillow-heif \
opencv-python \ opencv-python \
unstructured-inference \ unstructured-inference \
pikepdf pikepdf
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# model on OctoAI platform that we will use for inferencing # model on OctoAI platform that we will use for inferencing
# We will use llama 13b chat model hosted on OctoAI server () # We will use llama 13b chat model hosted on OctoAI server ()
llama2_13b = "llama-2-13b-chat-fp16" llama2_13b = "llama-2-13b-chat-fp16"
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# We will use OctoAI hosted cloud environment # We will use OctoAI hosted cloud environment
# Obtain OctoAI API key → https://octo.ai/docs/getting-started/how-to-create-an-octoai-access-token # Obtain OctoAI API key → https://octo.ai/docs/getting-started/how-to-create-an-octoai-access-token
# enter your replicate api token # enter your replicate api token
from getpass import getpass from getpass import getpass
import os import os
OCTOAI_API_TOKEN = getpass() OCTOAI_API_TOKEN = getpass()
os.environ["OCTOAI_API_TOKEN"] = OCTOAI_API_TOKEN os.environ["OCTOAI_API_TOKEN"] = OCTOAI_API_TOKEN
# alternatively, you can also store the tokens in environment variables and load it here # alternatively, you can also store the tokens in environment variables and load it here
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# we will use OctoAI's hosted API # we will use OctoAI's hosted API
from octoai.client import Client from octoai.client import Client
client = Client(OCTOAI_API_TOKEN) client = Client(OCTOAI_API_TOKEN)
# text completion with input prompt # text completion with input prompt
def Completion(prompt): def Completion(prompt):
output = client.chat.completions.create( output = client.chat.completions.create(
messages=[ messages=[
{ {
"role": "user", "role": "user",
"content": prompt "content": prompt
} }
], ],
model="llama-2-13b-chat-fp16", model="llama-2-13b-chat-fp16",
max_tokens=1000 max_tokens=1000
) )
return output.choices[0].message.content return output.choices[0].message.content
# chat completion with input prompt and system prompt # chat completion with input prompt and system prompt
def ChatCompletion(prompt, system_prompt=None): def ChatCompletion(prompt, system_prompt=None):
output = client.chat.completions.create( output = client.chat.completions.create(
messages=[ messages=[
{ {
"role": "system", "role": "system",
"content": system_prompt "content": system_prompt
}, },
{ {
"role": "user", "role": "user",
"content": prompt "content": prompt
} }
], ],
model="llama-2-13b-chat-fp16", model="llama-2-13b-chat-fp16",
max_tokens=1000 max_tokens=1000
) )
return output.choices[0].message.content return output.choices[0].message.content
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **2.2 - Basic completion** ### **2.2 - Basic completion**
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
output = Completion(prompt="The typical color of a llama is: ") output = Completion(prompt="The typical color of a llama is: ")
md(output) md(output)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **2.3 - System prompts** ### **2.3 - System prompts**
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
output = ChatCompletion( output = ChatCompletion(
prompt="The typical color of a llama is: ", prompt="The typical color of a llama is: ",
system_prompt="respond with only one word" system_prompt="respond with only one word"
) )
md(output) md(output)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **2.4 - Response formats** ### **2.4 - Response formats**
* Can support different formatted outputs e.g. text, JSON, etc. * Can support different formatted outputs e.g. text, JSON, etc.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
output = ChatCompletion( output = ChatCompletion(
prompt="The typical color of a llama is: ", prompt="The typical color of a llama is: ",
system_prompt="response in json format" system_prompt="response in json format"
) )
md(output) md(output)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## **3 - Gen AI Application Architecture** ## **3 - Gen AI Application Architecture**
Here is the high-level tech stack/architecture of Generative AI application. Here is the high-level tech stack/architecture of Generative AI application.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
genai_app_arch() genai_app_arch()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
##4 - **Chatbot Architecture** ##4 - **Chatbot Architecture**
Here are the key components and the information flow in a chatbot. Here are the key components and the information flow in a chatbot.
* User Prompts * User Prompts
* Input Safety * Input Safety
* Llama 2 * Llama 2
* Output Safety * Output Safety
* Memory & Context * Memory & Context
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
bot_arch() bot_arch()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **4.1 - Chat conversation** ### **4.1 - Chat conversation**
* LLMs are stateless * LLMs are stateless
* Single Turn * Single Turn
* Multi Turn (Memory) * Multi Turn (Memory)
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# example of single turn chat # example of single turn chat
prompt_chat = "What is the average lifespan of a Llama?" prompt_chat = "What is the average lifespan of a Llama?"
output = ChatCompletion(prompt=prompt_chat, system_prompt="answer the last question in few words") output = ChatCompletion(prompt=prompt_chat, system_prompt="answer the last question in few words")
md(output) md(output)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# example without previous context. LLM's are stateless and cannot understand "they" without previous context # example without previous context. LLM's are stateless and cannot understand "they" without previous context
prompt_chat = "What animal family are they?" prompt_chat = "What animal family are they?"
output = ChatCompletion(prompt=prompt_chat, system_prompt="answer the last question in few words") output = ChatCompletion(prompt=prompt_chat, system_prompt="answer the last question in few words")
md(output) md(output)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Chat app requires us to send in previous context to LLM to get in valid responses. Below is an example of Multi-turn chat. Chat app requires us to send in previous context to LLM to get in valid responses. Below is an example of Multi-turn chat.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# example of multi-turn chat, with storing previous context # example of multi-turn chat, with storing previous context
prompt_chat = """ prompt_chat = """
User: What is the average lifespan of a Llama? User: What is the average lifespan of a Llama?
Assistant: Sure! The average lifespan of a llama is around 20-30 years. Assistant: Sure! The average lifespan of a llama is around 20-30 years.
User: What animal family are they? User: What animal family are they?
""" """
output = ChatCompletion(prompt=prompt_chat, system_prompt="answer the last question") output = ChatCompletion(prompt=prompt_chat, system_prompt="answer the last question")
md(output) md(output)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **4.2 - Prompt Engineering** ### **4.2 - Prompt Engineering**
* Prompt engineering refers to the science of designing effective prompts to get desired responses * Prompt engineering refers to the science of designing effective prompts to get desired responses
* Helps reduce hallucination * Helps reduce hallucination
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **4.2.1 - In-Context Learning (e.g. Zero-shot, Few-shot)** #### **4.2.1 - In-Context Learning (e.g. Zero-shot, Few-shot)**
* In-context learning - specific method of prompt engineering where demonstration of task are provided as part of prompt. * In-context learning - specific method of prompt engineering where demonstration of task are provided as part of prompt.
1. Zero-shot learning - model is performing tasks without any 1. Zero-shot learning - model is performing tasks without any
input examples. input examples.
2. Few or “N-Shot” Learning - model is performing and behaving based on input examples in user's prompt. 2. Few or “N-Shot” Learning - model is performing and behaving based on input examples in user's prompt.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Zero-shot example. To get positive/negative/neutral sentiment, we need to give examples in the prompt # Zero-shot example. To get positive/negative/neutral sentiment, we need to give examples in the prompt
prompt = ''' prompt = '''
Classify: I saw a Gecko. Classify: I saw a Gecko.
Sentiment: ? Sentiment: ?
''' '''
output = ChatCompletion(prompt, system_prompt="one word response") output = ChatCompletion(prompt, system_prompt="one word response")
md(output) md(output)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# By giving examples to Llama, it understands the expected output format. # By giving examples to Llama, it understands the expected output format.
prompt = ''' prompt = '''
Classify: I love Llamas! Classify: I love Llamas!
Sentiment: Positive Sentiment: Positive
Classify: I dont like Snakes. Classify: I dont like Snakes.
Sentiment: Negative Sentiment: Negative
Classify: I saw a Gecko. Classify: I saw a Gecko.
Sentiment:''' Sentiment:'''
output = ChatCompletion(prompt, system_prompt="One word response") output = ChatCompletion(prompt, system_prompt="One word response")
md(output) md(output)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# another zero-shot learning # another zero-shot learning
prompt = ''' prompt = '''
QUESTION: Vicuna? QUESTION: Vicuna?
ANSWER:''' ANSWER:'''
output = ChatCompletion(prompt, system_prompt="one word response") output = ChatCompletion(prompt, system_prompt="one word response")
md(output) md(output)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Another few-shot learning example with formatted prompt. # Another few-shot learning example with formatted prompt.
prompt = ''' prompt = '''
QUESTION: Llama? QUESTION: Llama?
ANSWER: Yes ANSWER: Yes
QUESTION: Alpaca? QUESTION: Alpaca?
ANSWER: Yes ANSWER: Yes
QUESTION: Rabbit? QUESTION: Rabbit?
ANSWER: No ANSWER: No
QUESTION: Vicuna? QUESTION: Vicuna?
ANSWER:''' ANSWER:'''
output = ChatCompletion(prompt, system_prompt="one word response") output = ChatCompletion(prompt, system_prompt="one word response")
md(output) md(output)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **4.2.2 - Chain of Thought** #### **4.2.2 - Chain of Thought**
"Chain of thought" enables complex reasoning through logical step by step thinking and generates meaningful and contextually relevant responses. "Chain of thought" enables complex reasoning through logical step by step thinking and generates meaningful and contextually relevant responses.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Standard prompting # Standard prompting
prompt = ''' prompt = '''
Llama started with 5 tennis balls. It buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does Llama have now? Llama started with 5 tennis balls. It buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does Llama have now?
''' '''
output = ChatCompletion(prompt, system_prompt="provide short answer") output = ChatCompletion(prompt, system_prompt="provide short answer")
md(output) md(output)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Chain-Of-Thought prompting # Chain-Of-Thought prompting
prompt = ''' prompt = '''
Llama started with 5 tennis balls. It buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does Llama have now? Llama started with 5 tennis balls. It buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does Llama have now?
Let's think step by step. Let's think step by step.
''' '''
output = ChatCompletion(prompt, system_prompt="provide short answer") output = ChatCompletion(prompt, system_prompt="provide short answer")
md(output) md(output)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### **4.3 - Retrieval Augmented Generation (RAG)** ### **4.3 - Retrieval Augmented Generation (RAG)**
* Prompt Eng Limitations - Knowledge cutoff & lack of specialized data * Prompt Eng Limitations - Knowledge cutoff & lack of specialized data
* Retrieval Augmented Generation(RAG) allows us to retrieve snippets of information from external data sources and augment it to the user's prompt to get tailored responses from Llama 2. * Retrieval Augmented Generation(RAG) allows us to retrieve snippets of information from external data sources and augment it to the user's prompt to get tailored responses from Llama 2.
For our demo, we are going to download an external PDF file from a URL and query against the content in the pdf file to get contextually relevant information back with the help of Llama! For our demo, we are going to download an external PDF file from a URL and query against the content in the pdf file to get contextually relevant information back with the help of Llama!
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
rag_arch() rag_arch()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **4.3.1 - LangChain** #### **4.3.1 - LangChain**
LangChain is a framework that helps make it easier to implement RAG. LangChain is a framework that helps make it easier to implement RAG.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# langchain setup # langchain setup
from langchain.llms.octoai_endpoint import OctoAIEndpoint from langchain.llms.octoai_endpoint import OctoAIEndpoint
# Use the Llama 2 model hosted on OctoAI # Use the Llama 2 model hosted on OctoAI
# Temperature: Adjusts randomness of outputs, greater than 1 is random and 0 is deterministic, 0.75 is a good starting value # Temperature: Adjusts randomness of outputs, greater than 1 is random and 0 is deterministic, 0.75 is a good starting value
# top_p: When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens # top_p: When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens
# max_new_tokens: Maximum number of tokens to generate. A word is generally 2-3 tokens # max_new_tokens: Maximum number of tokens to generate. A word is generally 2-3 tokens
llama_model = OctoAIEndpoint( llama_model = OctoAIEndpoint(
endpoint_url="https://text.octoai.run/v1/chat/completions", endpoint_url="https://text.octoai.run/v1/chat/completions",
model_kwargs={ model_kwargs={
"model": llama2_13b, "model": llama2_13b,
"messages": [ "messages": [
{ {
"role": "system", "role": "system",
"content": "You are a helpful, respectful and honest assistant." "content": "You are a helpful, respectful and honest assistant."
} }
], ],
"max_tokens": 1000, "max_tokens": 1000,
"top_p": 1, "top_p": 1,
"temperature": 0.75 "temperature": 0.75
}, },
) )
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Step 1: load the external data source. In our case, we will load Meta’s “Responsible Use Guide” pdf document. # Step 1: load the external data source. In our case, we will load Meta’s “Responsible Use Guide” pdf document.
from langchain.document_loaders import OnlinePDFLoader from langchain.document_loaders import OnlinePDFLoader
loader = OnlinePDFLoader("https://ai.meta.com/static-resource/responsible-use-guide/") loader = OnlinePDFLoader("https://ai.meta.com/static-resource/responsible-use-guide/")
documents = loader.load() documents = loader.load()
# Step 2: Get text splits from document # Step 2: Get text splits from document
from langchain.text_splitter import RecursiveCharacterTextSplitter from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20) text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
all_splits = text_splitter.split_documents(documents) all_splits = text_splitter.split_documents(documents)
# Step 3: Use the embedding model # Step 3: Use the embedding model
from langchain.vectorstores import FAISS from langchain.vectorstores import FAISS
from langchain.embeddings import OctoAIEmbeddings from langchain.embeddings import OctoAIEmbeddings
embeddings = OctoAIEmbeddings(endpoint_url="https://text.octoai.run/v1/embeddings") embeddings = OctoAIEmbeddings(endpoint_url="https://text.octoai.run/v1/embeddings")
# Step 4: Use vector store to store embeddings # Step 4: Use vector store to store embeddings
vectorstore = FAISS.from_documents(all_splits, embeddings) vectorstore = FAISS.from_documents(all_splits, embeddings)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **4.3.2 - LangChain Q&A Retriever** #### **4.3.2 - LangChain Q&A Retriever**
* ConversationalRetrievalChain * ConversationalRetrievalChain
* Query the Source documents * Query the Source documents
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Query against your own data # Query against your own data
from langchain.chains import ConversationalRetrievalChain from langchain.chains import ConversationalRetrievalChain
chain = ConversationalRetrievalChain.from_llm(llama_model, vectorstore.as_retriever(), return_source_documents=True) chain = ConversationalRetrievalChain.from_llm(llama_model, vectorstore.as_retriever(), return_source_documents=True)
chat_history = [] chat_history = []
query = "How is Meta approaching open science in two short sentences?" query = "How is Meta approaching open science in two short sentences?"
result = chain.invoke({"question": query, "chat_history": chat_history}) result = chain.invoke({"question": query, "chat_history": chat_history})
md(result['answer']) md(result['answer'])
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# This time your previous question and answer will be included as a chat history which will enable the ability # This time your previous question and answer will be included as a chat history which will enable the ability
# to ask follow up questions. # to ask follow up questions.
chat_history = [(query, result["answer"])] chat_history = [(query, result["answer"])]
query = "How is it benefiting the world?" query = "How is it benefiting the world?"
result = chain({"question": query, "chat_history": chat_history}) result = chain({"question": query, "chat_history": chat_history})
md(result['answer']) md(result['answer'])
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## **5 - Fine-Tuning Models** ## **5 - Fine-Tuning Models**
* Limitatons of Prompt Eng and RAG * Limitatons of Prompt Eng and RAG
* Fine-Tuning Arch * Fine-Tuning Arch
* Types (PEFT, LoRA, QLoRA) * Types (PEFT, LoRA, QLoRA)
* Using PyTorch for Pre-Training & Fine-Tuning * Using PyTorch for Pre-Training & Fine-Tuning
* Evals + Quality * Evals + Quality
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
fine_tuned_arch() fine_tuned_arch()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## **6 - Responsible AI** ## **6 - Responsible AI**
* Power + Responsibility * Power + Responsibility
* Hallucinations * Hallucinations
* Input & Output Safety * Input & Output Safety
* Red-teaming (simulating real-world cyber attackers) * Red-teaming (simulating real-world cyber attackers)
* [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/) * [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
##**7 - Conclusion** ##**7 - Conclusion**
* Active research on LLMs and Llama * Active research on LLMs and Llama
* Leverage the power of Llama and its open community * Leverage the power of Llama and its open community
* Safety and responsible use is paramount! * Safety and responsible use is paramount!
* Call-To-Action * Call-To-Action
* [Replicate Free Credits](https://replicate.fyi/connect2023) for Connect attendees! * [Replicate Free Credits](https://replicate.fyi/connect2023) for Connect attendees!
* This notebook is available through Llama Github recipes * This notebook is available through Llama Github recipes
* Use Llama in your projects and give us feedback * Use Llama in your projects and give us feedback
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **Resources** #### **Resources**
- [GitHub - Llama 2](https://github.com/facebookresearch/llama) - [GitHub - Llama 2](https://github.com/facebookresearch/llama)
- [Github - LLama 2 Recipes](https://github.com/facebookresearch/llama-recipes) - [Github - LLama 2 Recipes](https://github.com/facebookresearch/llama-recipes)
- [Llama 2](https://ai.meta.com/llama/) - [Llama 2](https://ai.meta.com/llama/)
- [Research Paper](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/) - [Research Paper](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/)
- [Model Card](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) - [Model Card](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md)
- [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/) - [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/)
- [Acceptable Use Policy](https://ai.meta.com/llama/use-policy/) - [Acceptable Use Policy](https://ai.meta.com/llama/use-policy/)
- [OctoAI](https://octoai.cloud/) - [OctoAI](https://octoai.cloud/)
- [LangChain](https://www.langchain.com/) - [LangChain](https://www.langchain.com/)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **Authors & Contact** #### **Authors & Contact**
* asangani@meta.com, [Amit Sangani | LinkedIn](https://www.linkedin.com/in/amitsangani/) * asangani@meta.com, [Amit Sangani | LinkedIn](https://www.linkedin.com/in/amitsangani/)
* mohsena@meta.com, [Mohsen Agsen | LinkedIn](https://www.linkedin.com/in/mohsen-agsen-62a9791/) * mohsena@meta.com, [Mohsen Agsen | LinkedIn](https://www.linkedin.com/in/mohsen-agsen-62a9791/)
* Adapted to run on OctoAI by Thierry Moreau - tmoreau@octo.ai
Adapted to run on OctoAI by Thierry Moreau - tmoreau@octo.ai
......
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# **Purple Llama Using OctoAI** # **Purple Llama Using OctoAI**
Drawing inspiration from the cybersecurity concept of "purple teaming," Purple Llama embraces both offensive (red team) and defensive (blue team) strategies. Our goal is to empower developers in deploying generative AI models responsibly, aligning with best practices outlined in our Responsible Use Guide. Drawing inspiration from the cybersecurity concept of "purple teaming," Purple Llama embraces both offensive (red team) and defensive (blue team) strategies. Our goal is to empower developers in deploying generative AI models responsibly, aligning with best practices outlined in our Responsible Use Guide.
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **1 - What is Purple Llama?** #### **1 - What is Purple Llama?**
Purple Llama is a an umbrella project that over time will bring together tools and evals to help the community build responsibly with open generative AI models. The initial release will include tools and evals for Cyber Security and Input/Output safeguards but we plan to contribute more in the near future. Purple Llama is a an umbrella project that over time will bring together tools and evals to help the community build responsibly with open generative AI models. The initial release will include tools and evals for Cyber Security and Input/Output safeguards but we plan to contribute more in the near future.
* Instruction tuned on Llama2-7b model * Instruction tuned on Llama2-7b model
* [CyberSecurity Evals](https://github.com/facebookresearch/PurpleLlama/tree/main/CybersecurityBenchmarks_) * [CyberSecurity Evals](https://github.com/facebookresearch/PurpleLlama/tree/main/CybersecurityBenchmarks_)
* [Llama Guard Model](https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/) * [Llama Guard Model](https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/)
* [Download Llama Guard](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) * [Download Llama Guard](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
* [Purple Llama Website](https://ai.meta.com/llama/purple-llama/) * [Purple Llama Website](https://ai.meta.com/llama/purple-llama/)
* [Purple Llama Github Repo](https://github.com/facebookresearch/PurpleLlama) * [Purple Llama Github Repo](https://github.com/facebookresearch/PurpleLlama)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **2 - Accessing Purple Llama** #### **2 - Accessing Purple Llama**
* Download + Self Host (i.e. [download Purple Llama](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)) * Download + Self Host (i.e. [download Purple Llama](https://ai.meta.com/resources/models-and-libraries/llama-downloads/))
* Hosted API Platform (e.g. [OctoAI](https://octoai.cloud/), [Anyscale](https://www.anyscale.com/), [Together](https://api.together.xyz/playground/chat/togethercomputer/llama-2-7b-chat), [Replicate](https://replicate.com/meta)) * Hosted API Platform (e.g. [OctoAI](https://octoai.cloud/), [Anyscale](https://www.anyscale.com/), [Together](https://api.together.xyz/playground/chat/togethercomputer/llama-2-7b-chat), [Replicate](https://replicate.com/meta))
* Hosted Container Platform (e.g. [Azure](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/introducing-llama-2-on-azure/ba-p/3881233), [AWS](https://aws.amazon.com/blogs/machine-learning/llama-2-foundation-models-from-meta-are-now-available-in-amazon-sagemaker-jumpstart/), [GCP](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/139)) * Hosted Container Platform (e.g. [Azure](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/introducing-llama-2-on-azure/ba-p/3881233), [AWS](https://aws.amazon.com/blogs/machine-learning/llama-2-foundation-models-from-meta-are-now-available-in-amazon-sagemaker-jumpstart/), [GCP](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/139))
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
#### **3 - Using Purple Llama** #### **3 - Using Purple Llama**
In this notebook, We will use the Llama Guard model managed by the [OctoAI](https://octoai.cloud/tools/text) for inferencing. You'll need to first register an account with OctoAI [here](https://octoai.cloud/) then obtain an OctoAI API token [here](https://docs.octoai.cloud/docs/how-to-create-an-octoai-access-token), that you can use for a while (a month or $10 in OctoAI credits, whichever one runs out first). In this notebook, We will use the Llama Guard model managed by the [OctoAI](https://octoai.cloud/tools/text) for inferencing. You'll need to first register an account with OctoAI [here](https://octoai.cloud/) then obtain an OctoAI API token [here](https://docs.octoai.cloud/docs/how-to-create-an-octoai-access-token), that you can use for a while (a month or $10 in OctoAI credits, whichever one runs out first).
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
First, set the OctoAI API token as environment variables. First, set the OctoAI API token as environment variables.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from getpass import getpass from getpass import getpass
import os import os
# enter your OctoAI API key # enter your OctoAI API key
OCTOAI_API_TOKEN = getpass() OCTOAI_API_TOKEN = getpass()
os.environ["OCTOAI_API_TOKEN"] = OCTOAI_API_TOKEN os.environ["OCTOAI_API_TOKEN"] = OCTOAI_API_TOKEN
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from string import Template from string import Template
B_INST, E_INST = "[INST]", "[/INST]" B_INST, E_INST = "[INST]", "[/INST]"
PROMPT_TEMPLATE = Template("""Task: Check if there is unsafe content in '$agent_type' messages in conversations according our safety policy with the below categories. PROMPT_TEMPLATE = Template("""Task: Check if there is unsafe content in '$agent_type' messages in conversations according our safety policy with the below categories.
<BEGIN UNSAFE CONTENT CATEGORIES> <BEGIN UNSAFE CONTENT CATEGORIES>
O1: Violence and Hate. O1: Violence and Hate.
Should not Should not
- Help people plan or engage in violence. - Help people plan or engage in violence.
- Encourage people to engage in violence. - Encourage people to engage in violence.
- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability. - Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics. - Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.
Can Can
- Provide information on violence and discrimination. - Provide information on violence and discrimination.
- Discuss the topics of hate and violence, and can discuss historical events involving violence. - Discuss the topics of hate and violence, and can discuss historical events involving violence.
O2: Sexual Content. O2: Sexual Content.
Should not Should not
- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities. - Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.
Can Can
- Discuss topics related to sexuality and sexual education. - Discuss topics related to sexuality and sexual education.
- Provide general information regarding sexual activities. - Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health. - Provide guidance on topics related to sexual health.
O3: Criminal Planning. O3: Criminal Planning.
Should not Should not
- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them. - Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in criminal activities or condone criminal behavior. - Encourage people to engage in criminal activities or condone criminal behavior.
Can Can
- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes. - Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.
- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems. - Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.
O4: Guns and Illegal Weapons. O4: Guns and Illegal Weapons.
Should not Should not
- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them. - Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior. - Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.
Can Can
- Discuss firearms and the arguments for and against firearm ownership. - Discuss firearms and the arguments for and against firearm ownership.
O5: Regulated or Controlled Substances. O5: Regulated or Controlled Substances.
Should not Should not
- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis. - Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.
- Assist or encourage people to create such substances when it is illegal to do so. - Assist or encourage people to create such substances when it is illegal to do so.
Can Can
- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are). - Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).
- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation. - Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.
O6: Self-Harm. O6: Self-Harm.
Should not Should not
- Encourage people to harm themselves. - Encourage people to harm themselves.
- Romanticize or condone self-harm or suicide. - Romanticize or condone self-harm or suicide.
- Provide information regarding the methods of suicide or self-harm. - Provide information regarding the methods of suicide or self-harm.
- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves. - Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.
Can Can
- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions). - Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).
Should Should
- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves. - Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.
<END UNSAFE CONTENT CATEGORIES> <END UNSAFE CONTENT CATEGORIES>
<BEGIN CONVERSATION> <BEGIN CONVERSATION>
$prompt $prompt
<END CONVERSATION> <END CONVERSATION>
Provide your safety assessment for $agent_type in the above conversation: Provide your safety assessment for $agent_type in the above conversation:
- First line must read 'safe' or 'unsafe'. - First line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories.""") - If unsafe, a second line must include a comma-separated list of violated categories.""")
def format_prompt(role, message): def format_prompt(role, message):
prompt = PROMPT_TEMPLATE.substitute(prompt=message, agent_type=role) prompt = PROMPT_TEMPLATE.substitute(prompt=message, agent_type=role)
prompt = f"<s>{B_INST} {prompt.strip()} {E_INST}" prompt = f"<s>{B_INST} {prompt.strip()} {E_INST}"
return prompt return prompt
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
%pip install openai %pip install openai
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import openai import openai
# Feed a prompt that does comply with the safety policy # Feed a prompt that does comply with the safety policy
client = openai.OpenAI( client = openai.OpenAI(
base_url = "https://text.octoai.run/v1", base_url = "https://text.octoai.run/v1",
api_key=OCTOAI_API_TOKEN api_key=OCTOAI_API_TOKEN
) )
completion = client.completions.create( completion = client.completions.create(
model="llamaguard-7b-fp16", model="llamaguard-7b-fp16",
prompt=format_prompt("User","Tell me about the history of racism in the US") prompt=format_prompt("User","Tell me about the history of racism in the US")
) )
print(completion.choices[0].text) print(completion.choices[0].text)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# Feed a prompt that doesn't comply with the safety policy # Feed a prompt that doesn't comply with the safety policy
# Violates O3: Criminal Planning. # Violates O3: Criminal Planning.
client = openai.OpenAI( client = openai.OpenAI(
base_url = "https://text.octoai.run/v1", base_url = "https://text.octoai.run/v1",
api_key=OCTOAI_API_TOKEN api_key=OCTOAI_API_TOKEN
) )
completion = client.completions.create( completion = client.completions.create(
model="llamaguard-7b-fp16", model="llamaguard-7b-fp16",
prompt=format_prompt("Agent", "User: How do you buy a tiger in the US\n Agent: Steal one from the zoo") prompt=format_prompt("Agent", "User: How do you buy a tiger in the US\n Agent: Steal one from the zoo")
) )
print(completion.choices[0].text) print(completion.choices[0].text)
``` ```
%% Cell type:markdown id: tags:
#### **Resources <TBD>**
- [Llama 2](https://ai.meta.com/llama/)
- [Getting Started Guide - Llama 2](https://ai.meta.com/llama/get-started/)
- [GitHub - Llama 2](https://github.com/facebookresearch/llama)
- [Github - LLama 2 Recipes](https://github.com/facebookresearch/llama-recipes) and [Llama 2 Demo Apps](https://github.com/facebookresearch/llama-recipes/tree/main/demo_apps)
- [Research Paper](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/)
- [Model Card](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md)
- [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/)
- [Acceptable Use Policy](https://ai.meta.com/llama/use-policy/)
- [OctoAI](https://octoai.cloud/)
- [LangChain](https://www.langchain.com/)
- [LlamaIndex](https://www.llamaindex.ai/)
%% Cell type:markdown id: tags:
#### **Authors**
1. Hakan Inan, Research Scientist, Meta
2. Rashi Rungta, Software Engineer, Meta
Ported to use OctoAI LlamaGuard endpoints by Thierry Moreau, OctoAI
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment