diff --git a/recipes/quickstart/NotebookLlama/Prompt_testing.md b/recipes/quickstart/NotebookLlama/Prompt_testing.md index 3ee0cf93a4a157eed4b9d0edaf9db0dbc6dc07f0..d9a892b525c28277a8b434270e22ee73c8aaeba8 100644 --- a/recipes/quickstart/NotebookLlama/Prompt_testing.md +++ b/recipes/quickstart/NotebookLlama/Prompt_testing.md @@ -212,9 +212,19 @@ Speaker 1: Thanks for having me! 1-B prompt: ``` -You are a world class text extractor, here is the raw data from a PDF, please parse and return it in a way that is great. +You are a world class text pre-processor, here is the raw data from a PDF, please parse and return it in a way that is crispy and usable to send to a podcast writer. -The parser messes up new lines, Latex math and you will see fluff that we can remove completely. +The raw data is messed up with new lines, Latex math and you will see fluff that we can remove completely. Basically take away any details that you think might be useless in a podcast author's transcript. + +Remember, the podcast could be on any topic whatsoever so the issues listed above are not exhaustive The goal is to use this in a podcast research transcript so a lot of the emails, citations, and things like that can be removed-please be smart with what you remove and be creative ok? + +Remember DO NOT START SUMMARIZING THIS, YOU ARE ONLY CLEANING UP THE TEXT AND RETURNING AS IS + +Be very smart and aggressive with removing details, you will get a running portion of the text and keep returning the processed text. + +ALWAYS start your response directly with processed text and NO ACKNOWLEDGEMENTS about my questions ok? + +Here is the text: ``` \ No newline at end of file