diff --git a/docs/encoders/aurelio-bm25.ipynb b/docs/encoders/aurelio-bm25.ipynb index 75a6bc70f237ac104313af1bb20c0ce18ddd2311..e74f1e1b4fa0ae6dc5b0ec99ec6e8fbf28306cb4 100644 --- a/docs/encoders/aurelio-bm25.ipynb +++ b/docs/encoders/aurelio-bm25.ipynb @@ -18,7 +18,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The 3rd generation embedding models from OpenAI (`text-embedding-3-small` and `text-embedding-3-large`) can both be used with our `OpenAIEncoder` and usage is primarily the same as with the 2nd generation `text-embedding-ada-002`. However, there is a new `dimensions` parameter — which we will discuss below." + "Hybrid indexes combine both sparse and dense encodings to produce more accurate results. The dense encoder allows us to search based on semantic meaning, while the sparse encoder allows us to search based on text matches. Merging both dense and sparse results into a single hybrid retrieval step allows us to push performance beyond what either dense-only or sparse-only retrieval could achieve." ] }, { @@ -32,7 +32,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We start by installing semantic-router. Support for the new `dimensions` parameter was added in `semantic-router==0.0.19` and `openai==1.10.0`." + "We start by installing semantic-router. Support for the new `AurelioSparseEncoder` was added in `semantic-router==0.1.0`." ] }, { @@ -41,7 +41,7 @@ "metadata": {}, "outputs": [], "source": [ - "!pip install -qU \"semantic-router==0.1.0.dev2\"" + "!pip install -qU semantic-router==0.1.0" ] }, { @@ -112,7 +112,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now we initialize our embedding model, we will use the `-3-large` model alongside a `dimensions` value of `256`. This will produce _tiny_ 256-dimensional vectors that — according to OpenAI — outperform the 1536-dimensional vectors produced by `text-embedding-ada-002`." + "Now we initialize our embedding models. 
We are going to use a hybrid index, which requires both a dense and a sparse encoder. For the sparse encoder we will use the pretrained `bm25` model from the Aurelio Platform, and for the dense encoder we will use OpenAI's `text-embedding-3-small`.\n", + "\n", + "To get an API key, we head to the [Aurelio Platform](https://platform.aurelio.ai/settings/api-keys)." ] }, { @@ -121,19 +123,30 @@ "metadata": {}, "outputs": [], "source": [ + "import os\n", + "from getpass import getpass\n", "from semantic_router.encoders.aurelio import AurelioSparseEncoder\n", "\n", + "os.environ[\"AURELIO_API_KEY\"] = os.getenv(\"AURELIO_API_KEY\") or getpass(\n", + " \"Enter Aurelio API Key: \"\n", + ")\n", + "\n", "sparse_encoder = AurelioSparseEncoder(name=\"bm25\")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Sparse encoders return dictionaries containing the indices and values of the non-zero elements in the sparse matrix." + ] + }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ - "import os\n", - "from getpass import getpass\n", "from semantic_router.encoders import OpenAIEncoder\n", "\n", "os.environ[\"OPENAI_API_KEY\"] = os.getenv(\"OPENAI_API_KEY\") or getpass(\n", @@ -141,7 +154,7 @@ ")\n", "\n", "encoder = OpenAIEncoder(\n", - " name=\"text-embedding-3-large\", score_threshold=0.5, dimensions=256\n", + " name=\"text-embedding-3-small\", score_threshold=0.3\n", ")" ] }, @@ -149,7 +162,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We will specify our index:" + "We now have both our sparse and dense encoders. To use them together we need to initialize an index that supports hybrid search, such as the `HybridLocalIndex` or `PineconeIndex`." ] }, { @@ -167,7 +180,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now we define the `RouteLayer`. 
When called, the route layer will consume text (a query) and output the category (`Route`) it belongs to — to initialize a `RouteLayer` we need our `encoder` model and a list of `routes`." + "Now we define the `HybridRouter`. When called, the router will consume text (a query) and output the category (`Route`) it belongs to. To initialize a `HybridRouter` we need an `encoder`, a `sparse_encoder`, our `routes`, and the hybrid `index` we just defined." ] }, { @@ -179,10 +192,8 @@ "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "2024-11-24 12:25:32 - semantic_router.utils.logger - INFO - hybrid.py:157 - add() - Encoding route politics\n", - "2024-11-24 12:25:32 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", - "2024-11-24 12:25:33 - semantic_router.utils.logger - INFO - hybrid.py:157 - add() - Encoding route chitchat\n", - "2024-11-24 12:25:33 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" + "2024-11-24 18:12:29 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", + "2024-11-24 18:12:31 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] } ], @@ -201,7 +212,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We can check the dimensionality of our vectors by looking at the `index` attribute of the `RouteLayer`." + "We can check the dimensionality and number of dense vector records by looking at the `index` attribute of the `HybridRouter`." 
] }, { @@ -212,7 +223,7 @@ { "data": { "text/plain": [ - "(11, 256)" + "(11, 1536)" ] }, "execution_count": 8, @@ -224,6 +235,13 @@ "router.index.index.shape" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To find the number of sparse vector records we can look at the `sparse_index` attribute of the `HybridRouter`." + ] + }, { "cell_type": "code", "execution_count": 9, @@ -260,13 +278,13 @@ "name": "stderr", "output_type": "stream", "text": [ - "2024-11-24 12:25:37 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" + "2024-11-24 18:12:34 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "data": { "text/plain": [ - "RouteChoice(name='politics', function_call=None, similarity_score=1.2995813276471633)" + "RouteChoice(name='politics', function_call=None, similarity_score=1.2837569119668175)" ] }, "execution_count": 10, @@ -287,13 +305,13 @@ "name": "stderr", "output_type": "stream", "text": [ - "2024-11-24 12:25:38 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" + "2024-11-24 18:12:36 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "data": { "text/plain": [ - "RouteChoice(name='chitchat', function_call=None, similarity_score=1.8563758628277611)" + "RouteChoice(name='chitchat', function_call=None, similarity_score=1.856375862827761)" ] }, "execution_count": 11, @@ -321,7 +339,7 @@ "name": "stderr", "output_type": "stream", "text": [ - "2024-11-24 12:25:41 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" + "2024-11-24 18:12:37 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP 
Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { diff --git a/semantic_router/routers/hybrid.py b/semantic_router/routers/hybrid.py index 5ecede72437ad61b6fd948dcb282f916b283c2a2..0b9271a74614fd037f114452b1755ba77867aed2 100644 --- a/semantic_router/routers/hybrid.py +++ b/semantic_router/routers/hybrid.py @@ -154,7 +154,6 @@ class HybridRouter(BaseRouter): route_names = [route.name] * len(route.utterances) # create embeddings for all routes - logger.info(f"Encoding route {route.name}") dense_embeds, sparse_embeds = self._encode(route.utterances) self.index.add( embeddings=dense_embeds,