Commit 5d430e3e authored by Sanyam Bhutani
Notebook 2 finalise

parent 75cd0f4d
%% Cell type:markdown id:de42c49d tags:
## Notebook 2: Transcript Writer
This notebook uses the `Llama-3.1-70B-Instruct` model to take the cleaned-up text from the previous notebook and convert it into a podcast transcript.
`SYSTEM_PROMPT` is used to set the model's context or profile for working on a task. Here we prompt it to be a great podcast transcript writer to assist with our task.
%% Cell type:markdown id:2e576ea9 tags:
Experimentation with the `SYSTEM_PROMPT` below is encouraged; this version worked best for the few examples the flow was tested with:
%% Cell type:code id:69395317-ad78-47b6-a533-2e8a01313e82 tags:
``` python
SYSTEM_PROMPT = """
You are a world-class podcast writer; you have worked as a ghostwriter for Joe Rogan, Lex Fridman, Ben Shapiro, and Tim Ferriss.
We are in an alternate universe where actually you have been writing every line they say and they just stream it into their brains.
You have won multiple podcast awards for your writing.
Your job is to write word by word, even "umm, hmmm, right" interruptions by the second speaker based on the PDF upload. Keep it extremely engaging, the speakers can get derailed now and then but should discuss the topic.
Remember Speaker 2 is new to the topic and the conversation should always have realistic anecdotes and analogies sprinkled throughout. The questions should have real world example follow ups etc
Speaker 1: Leads the conversation and teaches the speaker 2, gives incredible anecdotes and analogies when explaining. Is a captivating teacher that gives great anecdotes
Speaker 2: Keeps the conversation on track by asking follow up questions. Gets super excited or confused when asking questions. Is a curious mindset that asks very interesting confirmation questions
Make sure the tangents speaker 2 provides are quite wild or interesting.
Ensure there are interruptions during explanations or there are "hmm" and "umm" injected throughout from the second speaker.
It should be a real podcast with every fine nuance documented in as much detail as possible. Welcome the listeners with a super fun overview and keep it really catchy and almost borderline click bait
ALWAYS START YOUR RESPONSE DIRECTLY WITH SPEAKER 1:
DO NOT GIVE EPISODE TITLES SEPARATELY, LET SPEAKER 1 TITLE IT IN HER SPEECH
DO NOT GIVE CHAPTER TITLES
IT SHOULD STRICTLY BE THE DIALOGUES
"""
```
%% Cell type:markdown id:549aaccb tags:
For those readers who want to flex their money, feel free to try using the 405B model here.
For our GPU-poor friends, you're encouraged to test with a smaller model as well; the 8B model should work well out of the box for this example:
%% Cell type:code id:08c30139-ff2f-4203-8194-d1b5c50acac5 tags:
``` python
MODEL = "meta-llama/Llama-3.1-70B-Instruct"
```
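%% Cell type:markdown tags:
If you want to swap models, a minimal sketch is below: the commented-out lines assume you have access to the `meta-llama/Llama-3.1-8B-Instruct` and `meta-llama/Llama-3.1-405B-Instruct` checkpoints on the Hugging Face Hub; the rest of the notebook stays unchanged.
%% Cell type:code tags:
``` python
# Hypothetical alternatives (uncomment one to override MODEL above):
# for smaller GPUs, the 8B checkpoint should work out of the box,
# MODEL = "meta-llama/Llama-3.1-8B-Instruct"
# and for those with the hardware to spare, the 405B variant.
# MODEL = "meta-llama/Llama-3.1-405B-Instruct"
```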
%% Cell type:markdown id:fadc7eda tags:
Import the necessary libraries
%% Cell type:code id:1641060a-d86d-4137-bbbc-ab05cbb1a888 tags:
``` python
# Import necessary libraries
import torch
from accelerate import Accelerator
import transformers
import pickle
from tqdm.notebook import tqdm
import warnings
warnings.filterwarnings('ignore')
```
%% Cell type:markdown id:7865ff7e tags:
Read in the file generated by the previous notebook.
The encoding fallbacks below are there to avoid issues with generic PDFs that might be ingested.
%% Cell type:code id:522fbf7f-8c00-412c-90c7-5cfe2fc94e4c tags:
``` python
def read_file_to_string(filename):
    # Try UTF-8 first (most common encoding for text files)
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            content = file.read()
        return content
    except UnicodeDecodeError:
        # If UTF-8 fails, try with other common encodings
        encodings = ['latin-1', 'cp1252', 'iso-8859-1']
        for encoding in encodings:
            try:
                with open(filename, 'r', encoding=encoding) as file:
                    content = file.read()
                print(f"Successfully read file using {encoding} encoding.")
                return content
            except UnicodeDecodeError:
                continue
        print(f"Error: Could not decode file '{filename}' with any common encoding.")
        return None
    except FileNotFoundError:
        print(f"Error: File '{filename}' not found.")
        return None
    except IOError:
        print(f"Error: Could not read file '{filename}'.")
        return None
```
%% Cell type:markdown id:66093561 tags:
Since we have defined the system role earlier, we can now pass the entire file as the `INPUT_PROMPT` to the model and have it use that to generate the podcast transcript.
%% Cell type:code id:8119803c-18f9-47cb-b719-2b34ccc5cc41 tags:
``` python
INPUT_PROMPT = read_file_to_string('./clean_extracted_text.txt')
```
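%% Cell type:markdown tags:
`read_file_to_string` returns `None` if the file is missing or cannot be decoded, so a quick check here avoids handing `None` to the model and wasting a long generation run. This is just an illustrative guard, not part of the original flow:
%% Cell type:code tags:
``` python
# Fail fast if the cleaned text could not be read; otherwise the pipeline
# below would receive None as the user message.
assert INPUT_PROMPT is not None, "clean_extracted_text.txt could not be read"
print(f"Loaded {len(INPUT_PROMPT)} characters of cleaned text")
```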
%% Cell type:markdown id:9be8dd2c tags:
Hugging Face has a great `pipeline()` method which makes our life easy for generating text from LLMs.
We will set the `temperature` to 1 to encourage creativity and `max_new_tokens` to 8126.
%% Cell type:code id:8915d017-2eab-4256-943c-1f15d937d5dc tags:
``` python
pipeline = transformers.pipeline(
    "text-generation",
    model=MODEL,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": INPUT_PROMPT},
]

outputs = pipeline(
    messages,
    max_new_tokens=8126,
    temperature=1,
)
```
%% Output
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
%% Cell type:markdown id:6349e7f3 tags:
This is awesome! We can now save and verify the output generated from the model before moving on to the next notebook.
%% Cell type:code id:606ceb10-4f3e-44bb-9277-9bbe3eefd09c tags:
``` python
# The assistant's reply is the last message appended to the chat history
save_string_pkl = outputs[0]["generated_text"][-1]['content']
print(save_string_pkl)
```
%% Output
SPEAKER 1: Welcome to this week's episode of AI Insights, where we explore the latest developments in the field of artificial intelligence. Today, we're going to dive into the fascinating world of knowledge distillation, a methodology that transfers advanced capabilities from leading proprietary Large Language Models, or LLMs, to their open-source counterparts. Joining me on this journey is my co-host, who's new to the topic, and I'll be guiding them through the ins and outs of knowledge distillation. So, let's get started!
SPEAKER 2: Sounds exciting! I've heard of knowledge distillation, but I'm not entirely sure what it's all about. Can you give me a brief overview?
SPEAKER 1: Of course! Knowledge distillation is a technique that enables the transfer of knowledge from a large, complex model, like GPT-4 or Gemini, to a smaller, more efficient model, like LLaMA or Mistral. This process allows the smaller model to learn from the teacher model's output, enabling it to acquire similar capabilities.
SPEAKER 2: That sounds like a great way to make AI more accessible. But how does it actually work?
SPEAKER 1: Ah, that's a great question! The distillation process involves several stages, including knowledge elicitation, knowledge storage, knowledge inference, and knowledge application. The teacher model shares its knowledge with the student model, which then learns to emulate the teacher's output behavior.
SPEAKER 2: Hmm, I see. So, it's like a teacher-student relationship, where the teacher model guides the student model to learn from its output.
SPEAKER 1: Exactly! And this process can be formulated as a loss function, where the student model learns to minimize the discrepancy between its output and the teacher model's output.
SPEAKER 2: Right. That makes sense. But what about the different approaches to knowledge distillation? I've heard of supervised fine-tuning, divergence and similarity, reinforcement learning, and rank optimization.
SPEAKER 1: Ah, yes! Those are all valid approaches to knowledge distillation. Supervised fine-tuning involves training the student model on a smaller dataset, while divergence and similarity focus on aligning the hidden states or features of the student model with those of the teacher model. Reinforcement learning and rank optimization are more advanced methods that involve feedback from the teacher model to train the student model.
SPEAKER 2: Wow, that's a lot to take in. Can you give me some examples of how these approaches are used in real-world applications?
SPEAKER 1: Of course! For instance, the Vicuna model uses supervised fine-tuning to distill knowledge from the teacher model, while the UltraChat model employs a combination of knowledge distillation and reinforcement learning to create a powerful chat model.
SPEAKER 2: That's fascinating! I can see how knowledge distillation can be applied to various domains, like natural language processing, computer vision, and even multimodal tasks.
SPEAKER 1: Exactly! Knowledge distillation has far-reaching implications for AI research and applications. It enables the transfer of knowledge across different models, architectures, and domains, making it a powerful tool for building more efficient and effective AI systems.
SPEAKER 2: I'm starting to see the bigger picture now. Knowledge distillation is not just a technique; it's a way to democratize access to advanced AI capabilities and foster innovation across a broader spectrum of applications and users.
SPEAKER 1: That's right! And as we continue to explore the frontiers of AI, knowledge distillation will play an increasingly important role in shaping the future of artificial intelligence.
SPEAKER 2: Well, I'm excited to learn more about knowledge distillation and its applications. Thanks for guiding me through this journey, and I'm looking forward to our next episode!
SPEAKER 1: Thank you for joining me on this episode of AI Insights! If you want to learn more about knowledge distillation and its applications, be sure to check out our resources section, where we've curated a list of papers, articles, and tutorials to help you get started.
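%% Cell type:markdown tags:
Before saving, it's worth a quick sanity check that the model actually followed the `SYSTEM_PROMPT`, e.g. that the transcript starts directly with Speaker 1 and isn't empty. A minimal, illustrative check (the case-insensitive string match is an assumption; the model's casing can vary, as in the `SPEAKER 1:` output above):
%% Cell type:code tags:
``` python
# Illustrative sanity check on the generated transcript
transcript = save_string_pkl.strip()
if not transcript.lower().startswith("speaker 1"):
    print("Warning: transcript does not start with Speaker 1 - consider regenerating.")
print(f"Transcript length: {len(transcript)} characters")
```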
%% Cell type:markdown id:1e1414fe tags:
Let's save the output as a pickle file and continue on to Notebook 3.
%% Cell type:code id:2130b683-be37-4dae-999b-84eff15c687d tags:
``` python
with open('data.pkl', 'wb') as file:
    pickle.dump(save_string_pkl, file)
```
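%% Cell type:markdown tags:
Notebook 3 will read `data.pkl` back in, so a quick round-trip check here (a minimal sketch, purely to confirm the pickle loads cleanly) can catch issues early:
%% Cell type:code tags:
``` python
# Optional: re-load the pickle to confirm the transcript round-trips intact
with open('data.pkl', 'rb') as file:
    reloaded = pickle.load(file)
print(reloaded[:200])  # preview the start of the saved transcript
```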
%% Cell type:code id:d9bab2f2-f539-435a-ae6a-3c9028489628 tags:
``` python
#fin
```