diff --git a/recipes/quickstart/NotebookLlama/Step-1 PDF-Pre-Processing-Logic.ipynb b/recipes/quickstart/NotebookLlama/Step-1 PDF-Pre-Processing-Logic.ipynb
index 0c75a78a5633012e5ddfec34db192db9379225e2..6f128be3e810882c6641626bccbbcbff4a210067 100644
--- a/recipes/quickstart/NotebookLlama/Step-1 PDF-Pre-Processing-Logic.ipynb	
+++ b/recipes/quickstart/NotebookLlama/Step-1 PDF-Pre-Processing-Logic.ipynb	
@@ -42,6 +42,16 @@
     "#!pip install rich ipywidgets"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "7b23d509",
+   "metadata": {},
+   "source": [
+    "Assuming you have a PDF uploaded to the same machine, please set its path below. \n",
+    "\n",
+    "Also, if you want to flex your GPU, feel free to switch to a bigger model, although the featherlight models work perfectly well for this task:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 14,
@@ -60,7 +70,6 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Import necessary libraries\n",
     "import PyPDF2\n",
     "from typing import Optional\n",
     "import os\n",
@@ -74,6 +83,14 @@
     "warnings.filterwarnings('ignore')"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "203c22eb",
+   "metadata": {},
+   "source": [
+    "Let's make sure we don't stub our toe later by checking that the file exists:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 9,
@@ -91,6 +108,16 @@
     "    return True"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "5a362ac3",
+   "metadata": {},
+   "source": [
+    "Convert the PDF to a `.txt` file. This simply reads and dumps the contents of the file, capped at a maximum of 100k characters. \n",
+    "\n",
+    "If you're converting your favorite novel into a podcast, you'll need to add extra logic for text that exceeds the Llama models' context length of 128k tokens."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 10,
@@ -144,6 +171,14 @@
     "        return None\n"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "e023397b",
+   "metadata": {},
+   "source": [
+    "A helper function to grab metadata about our PDF:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 11,
@@ -169,6 +204,14 @@
     "        return None"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "6019affc",
+   "metadata": {},
+   "source": [
+    "Finally, we can run our logic to extract the text from the file:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 12,
@@ -268,6 +311,22 @@
     "    print(f\"\\nExtracted text has been saved to {output_file}\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "946d1f59",
+   "metadata": {},
+   "source": [
+    "### Llama Pre-Processing\n",
+    "\n",
+    "Now let's indulge our distaste for writing regex and use an LLM instead:\n",
+    "\n",
+    "At this point, we have a text file extracted from a PDF of a paper. PDF extracts are generally messy due to stray characters, formatting, LaTeX, tables, etc. \n",
+    "\n",
+    "One way to handle this would be regex; instead, we can prompt the featherlight Llama models to clean up the text for us. \n",
+    "\n",
+    "Please try changing the `SYS_PROMPT` below to see what improvements you can make:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 60,
@@ -297,6 +356,16 @@
     "\"\"\""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "fd393fae",
+   "metadata": {},
+   "source": [
+    "Instead of having the model process the entire file at once, as you noticed in the prompt, we will pass it chunks of the file. \n",
+    "\n",
+    "One issue with chunking by character count is that we can cut words in half and lose their meaning, so instead we chunk by words:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 61,
@@ -331,6 +400,14 @@
     "    return chunks"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "5d74223f",
+   "metadata": {},
+   "source": [
+    "Let's load the model and start processing the text chunks:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 62,
@@ -2550,6 +2627,14 @@
     "        out_file.flush()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "31cffe8d",
+   "metadata": {},
+   "source": [
+    "Let's print out the final processed versions to make sure things look good:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 68,
@@ -2617,7 +2702,9 @@
    "id": "1b16ae0e-04cf-4eb9-a369-dee1728b89ce",
    "metadata": {},
    "outputs": [],
-   "source": []
+   "source": [
+    "#fin"
+   ]
   }
  ],
  "metadata": {