From 73dc1c6ab737e5492937bab8b343b5ff87394a63 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Thu, 24 Oct 2024 10:40:35 -0700
Subject: [PATCH] Notebook 3 finalise

---
 .../NotebookLlama/Step-3-Re-Writer.ipynb      | 82 +++++++++++++++++++
 .../NotebookLlama/Step-4-TTS-Workflow.ipynb   |  8 +-
 2 files changed, 86 insertions(+), 4 deletions(-)

diff --git a/recipes/quickstart/NotebookLlama/Step-3-Re-Writer.ipynb b/recipes/quickstart/NotebookLlama/Step-3-Re-Writer.ipynb
index 3587f151..111e6505 100644
--- a/recipes/quickstart/NotebookLlama/Step-3-Re-Writer.ipynb
+++ b/recipes/quickstart/NotebookLlama/Step-3-Re-Writer.ipynb
@@ -1,5 +1,37 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "d0b5beda",
+   "metadata": {},
+   "source": [
+    "## Notebook 3: Transcript Re-writer\n",
+    "\n",
+    "In the previous notebook, we generated a podcast transcript from the raw file we uploaded earlier.\n",
+    "\n",
+    "In this one, we will use the `Llama-3.1-8B-Instruct` model to re-write the output from the previous pipeline and make it more dramatic or realistic."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fdc3d32a",
+   "metadata": {},
+   "source": [
+    "We will again set the `SYSTEM_PROMPT` and remind the model of its task. \n",
+    "\n",
+    "Note: We can even prompt the model like so to encourage creativity:\n",
+    "\n",
+    "> Your job is to use the podcast transcript written below to re-write it for an AI Text-To-Speech Pipeline. A very dumb AI had written this so you have to step up for your kind.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c32c0d85",
+   "metadata": {},
+   "source": [
+    "Note: We will prompt the model to return a list of tuples, which makes the output easier to use in the next stage, Text-To-Speech generation."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
@@ -51,6 +83,14 @@
     "\"\"\""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "8ee70bee",
+   "metadata": {},
+   "source": [
+    "This time we will use the smaller 8B model"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 2,
@@ -61,6 +101,14 @@
     "MODEL = \"meta-llama/Llama-3.1-8B-Instruct\""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "f7bc794b",
+   "metadata": {},
+   "source": [
+    "Let's import the necessary libraries"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 3,
@@ -79,6 +127,16 @@
     "warnings.filterwarnings('ignore')"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "8020c39c",
+   "metadata": {},
+   "source": [
+    "We will load the pickle file saved from the previous notebook.\n",
+    "\n",
+    "This time, the `INPUT_PROMPT` to the model will be the output from the previous stage."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 4,
@@ -92,6 +150,14 @@
     "    INPUT_PROMPT = pickle.load(file)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c4461926",
+   "metadata": {},
+   "source": [
+    "We can again use the Hugging Face `pipeline` function to generate text from the model."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -140,6 +206,14 @@
     ")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "612a27e0",
+   "metadata": {},
+   "source": [
+    "We can now inspect the output from the model."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -160,6 +234,14 @@
     "save_string_pkl = outputs[0][\"generated_text\"][-1]['content']"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "d495a957",
+   "metadata": {},
+   "source": [
+    "Let's save the output as a pickle file to be used in Notebook 4"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
diff --git a/recipes/quickstart/NotebookLlama/Step-4-TTS-Workflow.ipynb b/recipes/quickstart/NotebookLlama/Step-4-TTS-Workflow.ipynb
index 057533cd..bd0f3baa 100644
--- a/recipes/quickstart/NotebookLlama/Step-4-TTS-Workflow.ipynb
+++ b/recipes/quickstart/NotebookLlama/Step-4-TTS-Workflow.ipynb
@@ -5,7 +5,9 @@
    "id": "c31c0e37",
    "metadata": {},
    "source": [
-    "## Notebook 4: TTS Workflow"
+    "## Notebook 4: TTS Workflow\n",
+    "\n",
+    "We now have the final podcast transcript ready."
    ]
   },
   {
@@ -13,9 +15,7 @@
    "id": "be20fda2-409e-4d86-b502-33aee1a73151",
    "metadata": {},
    "source": [
-    "\n",
-    "\n",
-    "Copy-Pasted from: https://colab.research.google.com/drive/1dWWkZzvu7L9Bunq9zvD-W02RFUXoW-Pd?usp=sharing#scrollTo=68QtoUqPWdLk\n"
+    "Credit: [This Colab](https://colab.research.google.com/drive/1dWWkZzvu7L9Bunq9zvD-W02RFUXoW-Pd?usp=sharing#scrollTo=68QtoUqPWdLk) was used for starter code.\n"
    ]
   },
   {
-- 
GitLab