[Open in Colab](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/07-scaling-and-pinecone.ipynb) [Open in nbviewer](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/07-scaling-and-pinecone.ipynb)
%% Cell type:markdown id: tags:
# Scaling to Many Routes and Using Pinecone
%% Cell type:markdown id: tags:
Semantic Router can be used with many hundreds, thousands, or even more routes. At very large scales it can be useful to use a vector database to store and search through your route vector space. Although we do not demonstrate _very large_ scale in this notebook, we will demonstrate more routes than usual, and we will also see how to use the `PineconeIndex` for potential scalability and route persistence beyond our local machines.
The routes used here come from the `aurelio-ai/generic-routes` dataset on the Hugging Face Hub. The first record looks like this:
{'name': 'politics',
'utterances': ["isn't politics the best thing ever",
"why don't you tell me about your political opinions",
"don't you just love the presidentdon't you just hate the president",
"they're going to destroy this country!",
'they will save the country!'],
'description': None,
'function_schema': None,
'llm': None,
'score_threshold': 0.82}
%% Cell type:markdown id: tags:
We transform these into `Route` objects like so:
%% Cell type:code id: tags:
``` python
from semantic_router import Route

# Unpack each dataset record into a Route object
routes = [Route(**data[i]) for i in range(len(data))]
routes[0]
```
%% Output
Route(name='politics', utterances=["isn't politics the best thing ever", "why don't you tell me about your political opinions", "don't you just love the presidentdon't you just hate the president", "they're going to destroy this country!", 'they will save the country!'], description=None, function_schema=None, llm=None, score_threshold=0.82)
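%% Cell type:markdown id: tags:
To make the `Route(**data[i])` pattern concrete, here is a minimal sketch using a stand-in dataclass. `RouteSketch` is a hypothetical class invented for illustration: it only mirrors the field names printed in the output above and is not the library's `Route` implementation. The record is a shortened copy of the dataset row shown earlier.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class RouteSketch:
    """Hypothetical stand-in mirroring the fields printed for `Route` above."""

    name: str
    utterances: List[str]
    description: Optional[str] = None
    function_schema: Optional[Any] = None
    llm: Optional[Any] = None
    score_threshold: Optional[float] = None


# One record shaped like the dataset rows shown earlier (utterances shortened).
record = {
    "name": "politics",
    "utterances": [
        "isn't politics the best thing ever",
        "they will save the country!",
    ],
    "description": None,
    "function_schema": None,
    "llm": None,
    "score_threshold": 0.82,
}

# `**record` unpacks the dict so that each key becomes a keyword argument.
route = RouteSketch(**record)
print(route.name, len(route.utterances), route.score_threshold)
```

Because the dictionary keys match the constructor's parameter names, no per-field mapping code is needed. A record with an unexpected key would raise a `TypeError` in this sketch.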
%% Cell type:markdown id: tags:
Next we initialize an `encoder`. We will use a simple `HuggingFaceEncoder`, although popular encoder APIs such as `CohereEncoder` and `OpenAIEncoder` work as well.
%% Cell type:markdown id: tags:
We query the route layer, which here contains only static routes:
%% Cell type:code id: tags:
``` python
rl("how's the weather today?").name
```
%% Output
'chitchat'
%% Cell type:markdown id: tags:
_If you see a warning about no classification being found, wait a moment and run the above cell again._
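%% Cell type:markdown id: tags:
As a rough mental model of what a call like `rl("how's the weather today?")` does, the sketch below embeds the query, compares it with every stored utterance, and returns the best-matching route only if the similarity clears a threshold. Everything in it is an assumption made for illustration: `toy_embed` is a hypothetical bag-of-words encoder rather than a real embedding model, and `route_query` is not the library's scoring logic, which may aggregate scores differently.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional


def toy_embed(text: str) -> Counter:
    """Hypothetical encoder: a bag-of-words count vector (not a real model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route_query(
    query: str, routes: Dict[str, List[str]], threshold: float
) -> Optional[str]:
    """Return the route of the most similar utterance, or None below threshold."""
    query_vec = toy_embed(query)
    best_name, best_score = None, 0.0
    for name, utterances in routes.items():
        for utterance in utterances:
            score = cosine(query_vec, toy_embed(utterance))
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Toy routes built from utterances that appear in this notebook.
toy_routes = {
    "chitchat": ["how's the weather today?", "how are things going?"],
    "jokes": ["tell me a joke", "make me laugh"],
}

print(route_query("how's the weather today?", toy_routes, threshold=0.5))
print(route_query("quarterly tax filing deadlines", toy_routes, threshold=0.5))
```

The first call returns `'chitchat'` because the query matches a stored utterance exactly. The second returns `None` because nothing clears the threshold, which is the role `score_threshold` plays on a route. This sketch scans every utterance linearly, and that scan is the part a vector database takes over at scale.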
%% Cell type:markdown id: tags:
## Loading Index From Previous Initialization
%% Cell type:markdown id: tags:
Because we're using Pinecone, our route index can now persist and be accessed from different locations simply by connecting to the pre-existing index. By default this index uses the identifier `"semantic-router--index"`, which is the index we'll be loading here, but we can change the name via the `index_name` parameter if preferred.
First, let's delete our old route layer, `index`, and `routes`.
%% Cell type:code id: tags:
``` python
del rl, index, routes
```
%% Cell type:markdown id: tags:
Let's load our index first. `"index"` is the default value of `index_name`, so we don't need to specify this parameter, but we do so below for demonstration purposes.
%% Cell type:code id: tags:
``` python
index = PineconeIndex(index_name="index")
```
%% Cell type:markdown id: tags:
We load the pre-existing routes from this index like so:
%% Cell type:code id: tags:
``` python
index.get_routes()
```
%% Output
[('fitness_tips', 'suggest a workout routine'),
('daily_inspiration', 'give me an inspirational quote'),
('creative_writing_and_literature', 'how can I improve my writing skills?'),
('chitchat', "let's go to the chippy"),
('astronomy_and_space_exploration',
'what are some interesting facts about the universe?'),
('chitchat', 'the weather is horrendous'),
('cloud_computing', 'AWS vs Azure vs Google Cloud'),
('chitchat', 'how are things going?'),
('educational_facts', 'tell me an interesting fact'),
('chitchat', "how's the weather today?"),
('ethical_considerations_in_tech', 'ethical hacking and its importance'),
('compliments', 'say something nice about me'),
('food_and_recipes', "what's your favorite food?"),
Route(name='jokes', utterances=['tell me a joke', 'make me laugh', 'know any good jokes?'], description=None, function_schema=None, llm=None, score_threshold=None)
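%% Cell type:markdown id: tags:
The `get_routes()` output above is a flat list of `(route_name, utterance)` pairs, whereas a `Route` holds one name and a list of utterances. One way to bridge the two, sketched here under the assumption that the pairs are plain tuples as printed, is to group the utterances by route name. `group_by_route` is a hypothetical helper, and the sample pairs are copied from the output above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A few (route_name, utterance) pairs copied from the `get_routes()` output above.
pairs: List[Tuple[str, str]] = [
    ("fitness_tips", "suggest a workout routine"),
    ("chitchat", "let's go to the chippy"),
    ("chitchat", "the weather is horrendous"),
    ("chitchat", "how are things going?"),
    ("compliments", "say something nice about me"),
]


def group_by_route(pairs: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Hypothetical helper: collect utterances under their route name."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name, utterance in pairs:
        grouped[name].append(utterance)
    # Convert back to a plain dict so missing keys raise KeyError as usual.
    return dict(grouped)


routes_dict = group_by_route(pairs)
print(routes_dict["chitchat"])
```

Each `name` and utterance list in `routes_dict` has the shape of the `name` and `utterances` fields shown in the `Route` output above, so the grouped entries could be used to rebuild route objects.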
%% Cell type:markdown id: tags:
And test it again:
%% Cell type:code id: tags:
``` python
rl("say something to make me laugh").name
```
%% Output
'jokes'
%% Cell type:code id: tags:
``` python
rl("tell me something amusing").name
```
%% Output
'jokes'
%% Cell type:code id: tags:
``` python
rl("it's raining cats and dogs today").name
```
%% Output
'chitchat'
%% Cell type:markdown id: tags:
Perfect, our routes loaded from our `PineconeIndex` are working as expected! As mentioned, we can use the `PineconeIndex` for persistence and high-scale use cases, for example where we might have hundreds of thousands of utterances, or even millions.