From 11addc7a0a54a840f20bde5d352a074864489c86 Mon Sep 17 00:00:00 2001
From: sekyonda <127536312+sekyondaMeta@users.noreply.github.com>
Date: Mon, 16 Oct 2023 15:50:32 -0400
Subject: [PATCH] Update VideoSummary.ipynb

---
 demo_apps/VideoSummary.ipynb | 126 ++++++++++++++++++++++++++++++++---
 1 file changed, 115 insertions(+), 11 deletions(-)

diff --git a/demo_apps/VideoSummary.ipynb b/demo_apps/VideoSummary.ipynb
index edcab0b3..44592b3b 100644
--- a/demo_apps/VideoSummary.ipynb
+++ b/demo_apps/VideoSummary.ipynb
@@ -6,9 +6,24 @@
    "metadata": {},
    "source": [
     "## This demo app shows:\n",
-    "* how to use LangChain's YoutubeLoader to retrieve the caption in a YouTube video;\n",
-    "* how to ask Llama to summarize the content (per the Llama's input size limit) of the video in a naive way using LangChain's stuff method;\n",
-    "* how to bypass the limit of Llama's max input token size by using more sophisticated way using LangChain's map_reduce and refine methods - see [here](https://python.langchain.com/docs/use_cases/summarization) for more info."
+    "* How to use LangChain's YoutubeLoader to retrieve the captions of a YouTube video.\n",
+    "* How to ask Llama to summarize the content of the video (within Llama's input size limit) in a naive way using LangChain's stuff method.\n",
+    "* How to bypass the limit of Llama's max input token size using LangChain's more sophisticated map_reduce and refine methods - see [here](https://python.langchain.com/docs/use_cases/summarization) for more info."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c866f6be",
+   "metadata": {},
+   "source": [
+    "We start by installing the necessary packages:\n",
+    "- [youtube-transcript-api](https://pypi.org/project/youtube-transcript-api/) - an API to get the transcript/subtitles of a YouTube video.\n",
+    "- [langchain](https://python.langchain.com/docs/get_started/introduction) - provides the RAG tools used in this demo.\n",
+    "- [tiktoken](https://github.com/openai/tiktoken) - a BytePair Encoding tokenizer.\n",
+    "- [pytube](https://pytube.io/en/latest/) - a utility for downloading YouTube videos.\n",
+    "\n",
+    "**Note:** This example uses Replicate to host the Llama model. If you have not set up or used Replicate before, we suggest you take a look at the [HelloLlamaCloud](HelloLlamaCloud.ipynb) example for information on how to set up Replicate before continuing with this example.\n",
+    "If you do not want to use Replicate, you will need to make some changes to this notebook as you go along."
    ]
   },
   {
@@ -21,6 +36,14 @@
     "!pip install langchain youtube-transcript-api tiktoken pytube"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "af3069b1",
+   "metadata": {},
+   "source": [
+    "Next we load the YouTube video transcript using the YoutubeLoader."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
@@ -69,6 +92,25 @@
     "len(docs[0].page_content), docs[0].page_content[:300]"
    ]
   },
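The cell that actually builds `docs` falls outside this patch's context lines, so for readers of the patch, here is a minimal sketch of how LangChain's YoutubeLoader (from the LangChain versions current in Oct 2023) is typically used to produce it. The URL is a placeholder, not the video the notebook uses:

```python
from langchain.document_loaders import YoutubeLoader

# Placeholder URL - the video the notebook summarizes is not shown in this hunk.
loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=<video_id>",
    add_video_info=True,  # uses pytube, installed above
)
docs = loader.load()  # a list of Documents; docs[0].page_content holds the transcript
```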
See:\n", + "- [HelloLlamaCloud](HelloLlamaCloud.ipynb) for further information on how to run Llama using Replicate.\n", + "- [HelloLlamaLocal](HelloLlamaLocal.ipynb) for further information on how to run Llama locally." + ] + }, { "cell_type": "code", "execution_count": 4, @@ -76,7 +118,7 @@ "metadata": {}, "outputs": [ { - "name": "stdin", + "name": "stdout", "output_type": "stream", "text": [ " ········\n" @@ -92,6 +134,17 @@ "os.environ[\"REPLICATE_API_TOKEN\"] = REPLICATE_API_TOKEN\n" ] }, + { + "cell_type": "markdown", + "id": "6b911efd", + "metadata": {}, + "source": [ + "Next we call the Llama 2 model from Replicate. In this example we will use the llama 2 13b chat model. You can find more Llama 2 models by searching for them on the [Replicate model explore page](https://replicate.com/explore?query=llama).\n", + "You can add them here in the format: model_name/version\n", + "\n", + "If you using local Llama, just set llm accordingly - see the [HelloLlamaLocal notebook](HelloLlamaLocal.ipynb)" + ] + }, { "cell_type": "code", "execution_count": null, @@ -99,7 +152,7 @@ "metadata": {}, "outputs": [], "source": [ - "# set llm to be Llama2-13b model; if you use local Llama, just set llm accordingly - see the HelloLlamaLocal notebook\n", + "\n", "from langchain.llms import Replicate\n", "\n", "llama2_13b = \"meta/llama-2-13b-chat:f4e2de70d66816a838a89eeeb621910adffb0dd0baba3976c96980970978018d\"\n", @@ -109,6 +162,14 @@ ")" ] }, + { + "cell_type": "markdown", + "id": "8e3baa56", + "metadata": {}, + "source": [ + "Once everything is set up, we prompt Llama 2 to summarize the first 4000 characters of the transcript for us." + ] + }, { "cell_type": "code", "execution_count": 6, @@ -141,6 +202,14 @@ "print(summary)" ] }, + { + "cell_type": "markdown", + "id": "8b684b29", + "metadata": {}, + "source": [ + "Next we try to summarize all the content of the transcript and we should get a `RuntimeError: Your input is too long. Max input length is 4096 tokens, but you supplied 5597 tokens.`." + ] + }, { "cell_type": "code", "execution_count": 7, @@ -174,8 +243,16 @@ "# try to get a summary of the whole content\n", "text = docs[0].page_content\n", "summary = chain.run(text)\n", - "print(summary)\n", - "# and you'll get - RuntimeError: Your input is too long. Max input length is 4096 tokens, but you supplied 5597 tokens." + "print(summary)\n" + ] + }, + { + "cell_type": "markdown", + "id": "1ad1881a", + "metadata": {}, + "source": [ + "\n", + "Let's try some workarounds to see if we can summarize the entire transcript without running into the `RuntimeError`.\n" ] }, { @@ -260,6 +337,15 @@ "chain.run(docs)" ] }, + { + "cell_type": "markdown", + "id": "aecf6328", + "metadata": {}, + "source": [ + "\n", + "Since the transcript is bigger than the model can handle, we can split the transcript into chunks instead and use the [`refine`](https://python.langchain.com/docs/modules/chains/document/refine) `chain_type` iteratively to create an answer." + ] + }, { "cell_type": "code", "execution_count": 10, @@ -321,6 +407,14 @@ "chain.run(split_docs)" ] }, + { + "cell_type": "markdown", + "id": "c3976c92", + "metadata": {}, + "source": [ + "You can also use [`map_reduce`](https://python.langchain.com/docs/modules/chains/document/map_reduce) `chain_type` to implement a map reduce like architecture while summarizing the documents." 
   {
    "cell_type": "code",
    "execution_count": 14,
@@ -400,6 +494,15 @@
     "chain.run(split_docs)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "77d580de",
+   "metadata": {},
+   "source": [
+    "To investigate further, let's turn on LangChain's debug mode to get an idea of how many calls are made to the model and the details of the inputs and outputs.\n",
+    "We will then run our summary using the `stuff` and `refine` `chain_type`s and take a look at the output."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 15,
@@ -559,12 +662,13 @@
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "id": "61ccd0fb-5cdb-43c4-afaf-05bc9f7cf959",
    "metadata": {},
-   "outputs": [],
-   "source": []
+   "source": [
+    "\n",
+    "As you can see, `stuff` fails because it treats all the split documents as one and \"stuffs\" them into a single prompt, which is much larger than Llama 2 can handle, while `refine` iteratively runs over the documents, updating its answer as it goes."
+   ]
  }
 ],
 "metadata": {
-- 
GitLab
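For reference, the debug toggle that the 77d580de cell refers to was a module-level flag in the LangChain releases current at the time of this patch. A minimal sketch of the comparison run the cell describes, reusing `llm` and `split_docs` from the earlier cells:

```python
import langchain
from langchain.chains.summarize import load_summarize_chain

langchain.debug = True  # log every chain/LLM call with its full inputs and outputs

# With debug on, each prompt sent to Replicate is printed, which makes it easy to
# see why chain_type="stuff" blows past the 4096-token window while "refine" does not.
load_summarize_chain(llm, chain_type="stuff").run(split_docs)   # fails: prompt too long
load_summarize_chain(llm, chain_type="refine").run(split_docs)  # one call per chunk
```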