diff --git a/docs/examples/node_postprocessor/ColbertRerank.ipynb b/docs/examples/node_postprocessor/ColbertRerank.ipynb
index 4437b903e83e6bcef15c3ececd764cdf0bb9d346..ef067ae7f151a3dbb3bc5fa9b341a9f377c8aa26 100644
--- a/docs/examples/node_postprocessor/ColbertRerank.ipynb
+++ b/docs/examples/node_postprocessor/ColbertRerank.ipynb
@@ -117,7 +117,7 @@
  ")\n",
  "\n",
  "query_engine = index.as_query_engine(\n",
- " similarity_top_k=5,\n",
+ " similarity_top_k=10,\n",
  " node_postprocessors=[colbert_reranker],\n",
  ")\n",
  "response = query_engine.query(\n",
@@ -134,30 +134,108 @@
  "name": "stdout",
  "output_type": "stream",
  "text": [
- "bd5a8323-41bb-4cde-8b2b-2ac69b1e519a\n",
+ "50157136-f221-4468-83e1-44e289f44cd5\n",
  "When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chan\n",
  "reranking score: 0.6470144987106323\n",
- "retrieval score: 0.8309415059604792\n",
+ "retrieval score: 0.8309200279065135\n",
  "**********\n",
- "24c6c722-bfd0-42e0-9e44-663253b79aa2\n",
+ "87f0d691-b631-4b21-8123-8f71d383046b\n",
  "Now that I could write essays again, I wrote a bunch about topics I'd had stacked up. I kept writing essays through 2020\n",
  "reranking score: 0.6377773284912109\n",
- "retrieval score: 0.8053894057548092\n",
+ "retrieval score: 0.8053000783543145\n",
  "**********\n",
- "e572465c-d48c-48ce-9664-99ddf09cdae6\n",
- "Much to my surprise, the time I spent working on this stuff was not wasted after all. After we started Y Combinator, I w\n",
- "reranking score: 0.6206888556480408\n",
- "retrieval score: 0.8091076626532405\n",
+ "10234ad9-46b1-4be5-8034-92392ac242ed\n",
+ "It's not that unprestigious types of work are good per se. But when you find yourself drawn to some kind of work despite\n",
+ "reranking score: 0.6301894187927246\n",
+ "retrieval score: 0.7975032272825491\n",
+ "**********\n",
+ "bc269bc4-49c7-4804-8575-cd6db47d70b8\n",
+ "It was as weird as it sounds. I resumed all my old patterns, except now there were doors where there hadn't been. Now wh\n",
+ "reranking score: 0.6282549500465393\n",
+ "retrieval score: 0.8026253284729862\n",
+ "**********\n",
+ "ebd7e351-64fc-4627-8ddd-2681d1ac33f8\n",
+ "As Jessica and I were walking home from dinner on March 11, at the corner of Garden and Walker streets, these three thre\n",
+ "reranking score: 0.6245909929275513\n",
+ "retrieval score: 0.7965812262372882\n",
+ "**********\n"
+ ]
+ }
+ ],
+ "source": [
+ "for node in response.source_nodes:\n",
+ " print(node.id_)\n",
+ " print(node.node.get_content()[:120])\n",
+ " print(\"reranking score: \", node.score)\n",
+ " print(\"retrieval score: \", node.node.metadata[\"retrieval_score\"])\n",
+ " print(\"**********\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Sam Altman became the second president of Y Combinator after Paul Graham decided to step back from running the organization.\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(response)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "response = query_engine.query(\n",
+ " \"Which schools did Paul attend?\",\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "6942863e-dfc5-4a99-b642-967b99b71343\n",
+ "I didn't want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris g\n",
+ "reranking score: 0.6333063840866089\n",
+ "retrieval score: 0.7964996889742813\n",
+ "**********\n",
+ "477c5de0-8e05-494e-95cc-e221881fb5c1\n",
+ "What I Worked On\n",
+ "\n",
+ "February 2021\n",
+ "\n",
+ "Before college the two main things I worked on, outside of school, were writing and pro\n",
+ "reranking score: 0.5930159091949463\n",
+ "retrieval score: 0.7771872700578062\n",
  "**********\n",
- "576168dd-98ce-43ee-91d4-fef0fb4368d2\n",
+ "0448df5c-7950-483d-bc63-15e9110da3bc\n",
  "[15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from\n",
- "reranking score: 0.6143158674240112\n",
- "retrieval score: 0.8069205604148549\n",
+ "reranking score: 0.5160146951675415\n",
+ "retrieval score: 0.7782554326959897\n",
  "**********\n",
- "d0f00ad3-b162-49d7-a01f-c513c6c068ad\n",
- "Up till that point YC had been controlled by the original LLC we four had started. But we wanted YC to last for a long t\n",
- "reranking score: 0.5917402505874634\n",
- "retrieval score: 0.8230686425302381\n",
+ "83af8efd-e992-4fd3-ada4-3c4c6f9971a1\n",
+ "Much to my surprise, the time I spent working on this stuff was not wasted after all. After we started Y Combinator, I w\n",
+ "reranking score: 0.5005874633789062\n",
+ "retrieval score: 0.7800375923908894\n",
+ "**********\n",
+ "bc269bc4-49c7-4804-8575-cd6db47d70b8\n",
+ "It was as weird as it sounds. I resumed all my old patterns, except now there were doors where there hadn't been. Now wh\n",
+ "reranking score: 0.4977223873138428\n",
+ "retrieval score: 0.782688582042514\n",
  "**********\n"
  ]
  }
@@ -170,6 +248,23 @@
  " print(\"retrieval score: \", node.node.metadata[\"retrieval_score\"])\n",
  " print(\"**********\")"
  ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Paul attended Cornell University for his graduate studies and later applied to RISD (Rhode Island School of Design) in the US.\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(response)"
+ ]
  }
  ],
  "metadata": {