Commit 17436c35 authored by theanupllm

adding notebook

parent d36be55f
%% Cell type:markdown id: tags:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/examples/pinecone-and-scaling.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/examples/pinecone-and-scaling.ipynb)
%% Cell type:markdown id: tags:
# Scaling to Many Routes and Using Pinecone
%% Cell type:markdown id: tags:
Semantic Router can be used with hundreds, thousands, or even more routes. At very large scales it can be useful to use a vector database to store and search through your route vector space. Although we do not demonstrate _very large_ scale in this notebook, we will use more routes than usual and see how to use the `PineconeIndex` for scalability and for route persistence beyond our local machine.
%% Cell type:markdown id: tags:
## Installing the Library
%% Cell type:code id: tags:
``` python
# !pip install -qU \
# "semantic-router[local, pinecone]==0.0.22" \
# datasets==2.17.0
```
%% Cell type:markdown id: tags:
## Downloading Routes
%% Cell type:code id: tags:
``` python
from datasets import load_dataset
data = load_dataset("aurelio-ai/generic-routes", split="train")
data
```
%% Output
Dataset({
    features: ['name', 'utterances', 'description', 'function_schema', 'llm', 'score_threshold'],
    num_rows: 50
})
%% Cell type:markdown id: tags:
Each row in this dataset is a single route:
%% Cell type:code id: tags:
``` python
data[0]
```
%% Output
{'name': 'politics',
'utterances': ["isn't politics the best thing ever",
"why don't you tell me about your political opinions",
"don't you just love the presidentdon't you just hate the president",
"they're going to destroy this country!",
'they will save the country!'],
'description': None,
'function_schema': None,
'llm': None,
'score_threshold': 0.82}
%% Cell type:markdown id: tags:
We transform these into `Route` objects like so:
%% Cell type:code id: tags:
``` python
from semantic_router import Route
routes = [Route(**data[i]) for i in range(len(data))]
routes[0]
```
%% Output
Route(name='politics', utterances=["isn't politics the best thing ever", "why don't you tell me about your political opinions", "don't you just love the presidentdon't you just hate the president", "they're going to destroy this country!", 'they will save the country!'], description=None, function_schema=None, llm=None, score_threshold=0.82)
%% Cell type:markdown id: tags:
Next we initialize an `encoder`. We will use a simple `HuggingFaceEncoder`, but we could also use popular encoder APIs like `CohereEncoder` and `OpenAIEncoder`.
%% Cell type:code id: tags:
``` python
from semantic_router.encoders import HuggingFaceEncoder
encoder = HuggingFaceEncoder()
```
%% Cell type:markdown id: tags:
Now we initialize our `PineconeIndex`. All it requires is a [Pinecone API key](https://app.pinecone.io) (you do need to be using Pinecone Serverless).
%% Cell type:code id: tags:
``` python
import os
from getpass import getpass
from semantic_router.index.pinecone import PineconeIndex
os.environ["PINECONE_API_KEY"] = os.environ.get("PINECONE_API_KEY") or getpass(
    "Enter Pinecone API key: "
)
index = PineconeIndex(index_name="index", namespace="namespace")
```
%% Cell type:code id: tags:
``` python
from semantic_router import RouteLayer
rl = RouteLayer(encoder=encoder, routes=routes, index=index)
```
%% Output
2024-04-15 01:56:59 INFO semantic_router.utils.logger local
%% Cell type:markdown id: tags:
We run the route layer, which here contains only static routes:
%% Cell type:code id: tags:
``` python
rl("how's the weather today?").name
```
%% Output
'chitchat'
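%% Cell type:markdown id: tags:
To give some intuition for what happens when a query is routed, here is a minimal sketch of similarity-based routing with a score threshold. It is a simplified illustration, not the library's implementation: the three-dimensional vectors are invented toy values, `toy_index` and `toy_route` are hypothetical names, and the real route layer may score matches differently (for example, by aggregating several top matches per route).
%% Cell type:code id: tags:
```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings", invented for illustration only.
toy_index = [
    ("chitchat", [0.9, 0.1, 0.0]),
    ("chitchat", [0.8, 0.2, 0.1]),
    ("politics", [0.0, 0.9, 0.3]),
    ("jokes", [0.1, 0.2, 0.9]),
]


def toy_route(query_vector, index, threshold=0.82):
    """Return the name of the most similar utterance's route, or None if below threshold."""
    best_name, best_score = None, -1.0
    for name, vector in index:
        score = cosine_similarity(query_vector, vector)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


print(toy_route([0.85, 0.15, 0.05], toy_index))  # close to the 'chitchat' vectors
print(toy_route([0.5, 0.5, 0.5], toy_index))     # not close enough to any route
```
%% Cell type:markdown id: tags:
The threshold plays the same role as the `score_threshold` field seen on the routes earlier: a query that is not similar enough to any stored utterance gets no route at all.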
%% Cell type:markdown id: tags:
_If you see a warning about no classification being found, wait a moment and run the above cell again._
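%% Cell type:markdown id: tags:
Instead of re-running the cell by hand, the wait-and-retry step could be wrapped in a small helper. This is only a sketch: `route_with_retry`, `attempts`, and `delay_seconds` are hypothetical names, and it assumes that a missed classification shows up as a result whose `.name` is `None`.
%% Cell type:code id: tags:
```python
import time


def route_with_retry(route_fn, query, attempts=3, delay_seconds=2.0):
    """Call route_fn(query) until it returns a named route or attempts run out.

    route_fn is assumed to return an object with a `.name` attribute that is
    None when no route matched.
    """
    result = None
    for attempt in range(attempts):
        result = route_fn(query)
        if result.name is not None:
            return result
        # Pause before the next try, but not after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return result
```
%% Cell type:markdown id: tags:
With the route layer above, this would be called as `route_with_retry(rl, "how's the weather today?").name`.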
%% Cell type:markdown id: tags:
## Loading Index From Previous Initialization
%% Cell type:markdown id: tags:
Because we're using Pinecone, our route index can now persist and be accessed from different locations by simply connecting to the pre-existing index. By default this index uses the identifier `"semantic-router--index"`. This is the index we'll be loading here, but we can change the name via the `index_name` parameter if preferred.
First, let's delete our old route layer, `index`, and `routes`.
%% Cell type:code id: tags:
``` python
del rl, index, routes
```
%% Cell type:markdown id: tags:
Let's load our index first. As mentioned, `"index"` is the default index name and `"namespace"` is the default namespace name, so we don't need to specify these parameters, but we do so below for demonstrative purposes.
%% Cell type:code id: tags:
``` python
index = PineconeIndex(index_name="index", namespace="namespace")
```
%% Cell type:markdown id: tags:
We load the pre-existing routes from this index like so:
%% Cell type:code id: tags:
``` python
index.get_routes()
```
%% Output
[('cybersecurity_best_practices',
'introduction to ethical hacking for developers'),
('coding_standards_and_conventions', 'JavaScript coding conventions'),
('chitchat', 'lovely weather today'),
('cybersecurity_best_practices', 'securing your web applications'),
('gaming_and_esports', 'what are the popular games right now?'),
('compliments', 'say something nice about me'),
('jokes', 'know any good jokes?'),
('data_structures_and_algorithms', 'algorithms every developer should know'),
('data_structures_and_algorithms', 'basic data structures for beginners'),
('interview_preparation', 'how to prepare for a coding interview'),
('career_advice', 'suggest some career development tips'),
('creative_writing_and_literature', 'how can I improve my writing skills?'),
('art_and_culture', 'tell me about your favorite artist'),
('language_learning', 'suggest ways to learn a new language'),
('fitness_tips', 'suggest a workout routine'),
('mental_health_support', 'what are ways to improve mental health?'),
('daily_inspiration', 'I need some inspiration for today'),
('compliments', 'give me a compliment'),
('gardening_and_horticulture', 'suggest some easy-care indoor plants'),
('career_advice_in_tech',
'how to build a portfolio for software development'),
('astronomy_and_space_exploration', 'tell me about the latest space mission'),
('hobbies_and_interests', 'suggest me a hobby'),
('creative_writing_and_literature', 'what are some tips for storytelling?'),
('art_and_culture', "what's an interesting cultural tradition?"),
('ethical_considerations_in_tech',
'the role of ethics in artificial intelligence'),
('development_tools', 'using Docker in development'),
('career_advice', 'how can I improve my resume?'),
('jokes', 'tell me a joke'),
('environmental_awareness', 'how can I be more eco-friendly?'),
('cloud_computing', 'introduction to cloud storage options'),
('cloud_computing', 'AWS vs Azure vs Google Cloud'),
('mindfulness_and_wellness', 'tell me about mindfulness'),
('frameworks_and_libraries', 'introduction to Django for web development'),
('career_advice', 'what are the emerging career fields?'),
('daily_inspiration', 'share something uplifting'),
('gaming_and_esports', 'suggest a good game for beginners'),
('chitchat', 'how are things going?'),
('historical_events', 'share an interesting piece of medieval history'),
('mental_health_support', 'share some self-care practices'),
('environmental_awareness', 'what are some ways to save the planet?'),
('mental_health_support', 'how can I manage stress?'),
('language_learning', 'how can I improve my Spanish?'),
('career_advice_in_tech', 'navigating career growth in tech'),
('cloud_computing', 'best practices for cloud security'),
('fitness_tips', 'give me a fitness tip'),
('machine_learning_in_development',
'how to start with machine learning in Python'),
('cybersecurity_best_practices', 'common security vulnerabilities to avoid'),
('chitchat', "how's the weather today?"),
('fitness_tips', 'how can I stay active at home?'),
('gardening_and_horticulture', 'how do I start a vegetable garden?'),
('interview_preparation', 'common programming interview questions'),
('environmental_awareness', 'tell me about sustainability practices'),
('art_and_culture', 'suggest some must-visit museums'),
('development_tools', 'recommendations for Python IDEs'),
('development_tools', 'best Git clients for macOS'),
('coding_standards_and_conventions', 'maintaining consistency in codebase'),
('ethical_considerations_in_tech', 'ethical hacking and its importance'),
('data_structures_and_algorithms', 'complexity analysis of algorithms'),
('food_and_recipes', 'tell me about a dish from your country'),
('coding_standards_and_conventions', 'why coding standards matter'),
('career_advice_in_tech', 'tips for landing your first tech job'),
('chitchat', "let's go to the chippy"),
('historical_events', 'tell me about a significant historical event'),
('food_and_recipes', "what's your favorite food?"),
('hobbies_and_interests', "I'm looking for a new pastime"),
('astronomy_and_space_exploration', 'how can I stargaze effectively?'),
('food_and_recipes', 'suggest a recipe for dinner'),
('debugging_tips', 'tips for debugging asynchronous code'),
('best_practices', 'how to write clean code in Python'),
('jokes', 'make me laugh'),
('language_syntax', 'how do closures work in JavaScript?'),
('book_recommendations', 'suggest a good book to read'),
('language_syntax', 'explain the syntax of Python functions'),
('best_practices', 'what are the best practices for REST API design?'),
('book_recommendations', 'I need a book recommendation'),
('astronomy_and_space_exploration',
'what are some interesting facts about the universe?'),
('gaming_and_esports', 'tell me about upcoming esports events'),
('interview_preparation', 'tips for technical interviews'),
('debugging_tips', 'best tools for JavaScript debugging'),
('mindfulness_and_wellness', 'give me a wellness tip'),
('debugging_tips', 'how do I debug segmentation faults in C++?'),
('historical_events', 'who was a notable figure in ancient history?'),
('language_learning',
'what are some effective language learning techniques?'),
('compliments', 'I need some positive vibes'),
('frameworks_and_libraries',
"what's the difference between React and Angular?"),
('chitchat', 'the weather is horrendous'),
('educational_facts', 'do you know any historical trivia?'),
('machine_learning_in_development', 'using TensorFlow for beginners'),
('educational_facts', 'share a science fact'),
('gardening_and_horticulture',
'what are some tips for sustainable gardening?'),
('daily_inspiration', 'give me an inspirational quote'),
('book_recommendations', "what's your favorite book?"),
('best_practices', 'best practices for error handling in JavaScript'),
('educational_facts', 'tell me an interesting fact'),
('machine_learning_in_development',
'machine learning model deployment best practices'),
('frameworks_and_libraries', 'best Python libraries for data analysis'),
('hobbies_and_interests', 'what are your interests?'),
('creative_writing_and_literature', 'suggest some classic literature'),
('language_syntax', 'what are the new features in Java 15?'),
('ethical_considerations_in_tech', 'privacy concerns in app development'),
('software_architecture', 'differences between MVC and MVVM'),
('motivation', 'give me a motivational quote'),
('open_source_contributions', 'how to start contributing to open source'),
('tech_trends', "what's new in technology?"),
('software_architecture', 'explain microservices architecture'),
('personal_questions', 'what do you like to do for fun?'),
('movie_suggestions', 'suggest a good movie for tonight'),
('programming_challenges', 'suggest a coding challenge for beginners'),
('movie_suggestions', 'recommend a movie'),
('web_development_trends', "what's new in front-end development?"),
('politics', "they're going to destroy this country!"),
('web_development_trends', 'emerging back-end technologies'),
('personal_questions', 'do you have any hobbies?'),
('version_control_systems', 'best practices for branching in Git'),
('science_and_innovation', 'what are the latest scientific discoveries?'),
('motivation', 'inspire me'),
('mobile_app_development', 'optimizing performance in mobile apps'),
('mobile_app_development', 'Kotlin vs Swift for mobile development'),
('programming_challenges', 'where can I find algorithmic puzzles?'),
('open_source_contributions', 'best practices for open-source contributors'),
('tech_trends', 'what are the emerging tech trends?'),
('travel_stories', 'tell me about your favorite travel destination'),
('pet_care_advice', 'what should I know about keeping a pet rabbit?'),
('philosophical_questions', 'what are your thoughts on free will?'),
('pet_care_advice', 'how can I train my dog?'),
('personal_questions', "what's your favorite color?"),
('travel_stories', 'share a travel story'),
('music_discovery', 'recommend songs for a workout playlist'),
('science_and_innovation', 'tell me about a recent innovation'),
('pet_care_advice', 'suggest some tips for cat care'),
('travel_stories', "what's the most interesting place you've visited?"),
('open_source_contributions', 'finding projects that accept contributions'),
('version_control_systems', 'how to revert a commit in Git'),
('music_discovery', 'who are the top artists right now?'),
('philosophical_questions', 'what is the meaning of life?'),
('philosophical_questions', 'do you believe in fate?'),
('version_control_systems', 'introduction to SVN for beginners'),
('tech_trends', 'tell me about the latest gadgets'),
('music_discovery', 'suggest some new music'),
('mobile_app_development',
'best tools for cross-platform mobile development'),
('politics', 'they will save the country!'),
('project_management_in_tech', 'tools for managing tech projects'),
('motivation', 'I need some motivation'),
('politics',
"don't you just love the presidentdon't you just hate the president"),
('project_management_in_tech', 'how to lead a development team'),
('politics', "why don't you tell me about your political opinions"),
('politics', "isn't politics the best thing ever"),
('web_development_trends', 'the future of web development'),
('mindfulness_and_wellness', 'how can I relax?'),
('programming_challenges', 'programming tasks to improve problem-solving'),
('movie_suggestions', "what's your favorite movie?"),
('software_architecture', 'introduction to domain-driven design'),
('science_and_innovation', 'how does AI impact our daily lives?'),
('project_management_in_tech', 'agile vs waterfall project management')]
%% Cell type:markdown id: tags:
We will transform these into a dictionary format that we can use to initialize our `Route` objects.
%% Cell type:code id: tags:
``` python
# Group the (route name, utterance) pairs by route name
routes_dict = {}
for route, utterance in index.get_routes():
    if route not in routes_dict:
        routes_dict[route] = []
    routes_dict[route].append(utterance)
```
%% Cell type:code id: tags:
``` python
routes_dict
```
%% Output
{'fitness_tips': ['give me a fitness tip',
'suggest a workout routine',
'how can I stay active at home?'],
'language_learning': ['how can I improve my Spanish?',
'what are some effective language learning techniques?',
'suggest ways to learn a new language'],
'frameworks_and_libraries': ['introduction to Django for web development',
"what's the difference between React and Angular?",
'best Python libraries for data analysis'],
'book_recommendations': ["what's your favorite book?",
'suggest a good book to read',
'I need a book recommendation'],
'coding_standards_and_conventions': ['JavaScript coding conventions',
'maintaining consistency in codebase',
'why coding standards matter'],
'historical_events': ['share an interesting piece of medieval history',
'who was a notable figure in ancient history?',
'tell me about a significant historical event'],
'ethical_considerations_in_tech': ['ethical hacking and its importance',
'the role of ethics in artificial intelligence',
'privacy concerns in app development'],
'jokes': ['make me laugh', 'tell me a joke', 'know any good jokes?'],
'chitchat': ["let's go to the chippy",
'how are things going?',
'the weather is horrendous',
"how's the weather today?",
'lovely weather today'],
'environmental_awareness': ['how can I be more eco-friendly?',
'tell me about sustainability practices',
'what are some ways to save the planet?'],
'gardening_and_horticulture': ['suggest some easy-care indoor plants',
'how do I start a vegetable garden?',
'what are some tips for sustainable gardening?'],
'data_structures_and_algorithms': ['basic data structures for beginners',
'complexity analysis of algorithms',
'algorithms every developer should know'],
'educational_facts': ['share a science fact',
'do you know any historical trivia?',
'tell me an interesting fact'],
'art_and_culture': ['tell me about your favorite artist',
'suggest some must-visit museums',
"what's an interesting cultural tradition?"],
'mindfulness_and_wellness': ['give me a wellness tip',
'tell me about mindfulness',
'how can I relax?'],
'cybersecurity_best_practices': ['common security vulnerabilities to avoid',
'introduction to ethical hacking for developers',
'securing your web applications'],
'development_tools': ['using Docker in development',
'recommendations for Python IDEs',
'best Git clients for macOS'],
'gaming_and_esports': ['suggest a good game for beginners',
'what are the popular games right now?',
'tell me about upcoming esports events'],
'interview_preparation': ['tips for technical interviews',
'common programming interview questions',
'how to prepare for a coding interview'],
'daily_inspiration': ['give me an inspirational quote',
'I need some inspiration for today',
'share something uplifting'],
'cloud_computing': ['best practices for cloud security',
'AWS vs Azure vs Google Cloud',
'introduction to cloud storage options'],
'machine_learning_in_development': ['how to start with machine learning in Python',
'machine learning model deployment best practices',
'using TensorFlow for beginners'],
'mental_health_support': ['what are ways to improve mental health?',
'share some self-care practices',
'how can I manage stress?'],
'astronomy_and_space_exploration': ['how can I stargaze effectively?',
'tell me about the latest space mission',
'what are some interesting facts about the universe?'],
'career_advice_in_tech': ['how to build a portfolio for software development',
'tips for landing your first tech job',
'navigating career growth in tech'],
'hobbies_and_interests': ['what are your interests?',
'suggest me a hobby',
"I'm looking for a new pastime"],
'debugging_tips': ['how do I debug segmentation faults in C++?',
'best tools for JavaScript debugging',
'tips for debugging asynchronous code'],
'career_advice': ['suggest some career development tips',
'what are the emerging career fields?',
'how can I improve my resume?'],
'compliments': ['give me a compliment',
'say something nice about me',
'I need some positive vibes'],
'best_practices': ['how to write clean code in Python',
'best practices for error handling in JavaScript',
'what are the best practices for REST API design?'],
'food_and_recipes': ["what's your favorite food?",
'suggest a recipe for dinner',
'tell me about a dish from your country'],
'creative_writing_and_literature': ['suggest some classic literature',
'what are some tips for storytelling?',
'how can I improve my writing skills?'],
'language_syntax': ['explain the syntax of Python functions',
'how do closures work in JavaScript?',
'what are the new features in Java 15?'],
'travel_stories': ["what's the most interesting place you've visited?",
'share a travel story',
'tell me about your favorite travel destination'],
'philosophical_questions': ['do you believe in fate?',
'what are your thoughts on free will?',
'what is the meaning of life?'],
'programming_challenges': ['where can I find algorithmic puzzles?',
'suggest a coding challenge for beginners',
'programming tasks to improve problem-solving'],
'personal_questions': ['what do you like to do for fun?',
"what's your favorite color?",
'do you have any hobbies?'],
'movie_suggestions': ['recommend a movie',
"what's your favorite movie?",
'suggest a good movie for tonight'],
'science_and_innovation': ['what are the latest scientific discoveries?',
'how does AI impact our daily lives?',
'tell me about a recent innovation'],
'music_discovery': ['suggest some new music',
'who are the top artists right now?',
'recommend songs for a workout playlist'],
'web_development_trends': ['emerging back-end technologies',
"what's new in front-end development?",
'the future of web development'],
'tech_trends': ['tell me about the latest gadgets',
"what's new in technology?",
'what are the emerging tech trends?'],
'open_source_contributions': ['best practices for open-source contributors',
'finding projects that accept contributions',
'how to start contributing to open source'],
'mobile_app_development': ['Kotlin vs Swift for mobile development',
'optimizing performance in mobile apps',
'best tools for cross-platform mobile development'],
'politics': ["isn't politics the best thing ever",
"don't you just love the presidentdon't you just hate the president",
'they will save the country!',
"why don't you tell me about your political opinions",
"they're going to destroy this country!"],
'motivation': ['give me a motivational quote',
'I need some motivation',
'inspire me'],
'pet_care_advice': ['what should I know about keeping a pet rabbit?',
'suggest some tips for cat care',
'how can I train my dog?'],
'version_control_systems': ['best practices for branching in Git',
'introduction to SVN for beginners',
'how to revert a commit in Git'],
'software_architecture': ['differences between MVC and MVVM',
'introduction to domain-driven design',
'explain microservices architecture'],
'project_management_in_tech': ['agile vs waterfall project management',
'how to lead a development team',
'tools for managing tech projects']}
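%% Cell type:markdown id: tags:
The same grouping can be written a little more compactly with `collections.defaultdict` from the standard library. The sketch below uses a handful of pairs copied from the `get_routes()` output above so that it runs without a Pinecone connection; `pairs` and `routes_dict_sketch` are names chosen for illustration.
%% Cell type:code id: tags:
```python
from collections import defaultdict

# A few (route_name, utterance) pairs copied from the get_routes() output.
pairs = [
    ("chitchat", "lovely weather today"),
    ("jokes", "know any good jokes?"),
    ("chitchat", "how are things going?"),
    ("jokes", "tell me a joke"),
]

# defaultdict(list) creates the empty list on first access,
# so no explicit "if route not in ..." check is needed.
grouped = defaultdict(list)
for route_name, utterance in pairs:
    grouped[route_name].append(utterance)

# Convert back to a plain dict once grouping is done.
routes_dict_sketch = dict(grouped)
print(routes_dict_sketch)
```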
%% Cell type:markdown id: tags:
Now we transform these into a list of `Route` objects.
%% Cell type:code id: tags:
``` python
routes = [
    Route(name=route, utterances=utterances)
    for route, utterances in routes_dict.items()
]
routes[0]
```
%% Output
Route(name='food_and_recipes', utterances=["what's your favorite food?", 'suggest a recipe for dinner', 'tell me about a dish from your country'], description=None, function_schema=None, llm=None, score_threshold=None)
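%% Cell type:markdown id: tags:
Note that the rebuilt route shows `score_threshold=None`, whereas the `politics` route loaded from the dataset carried `0.82`: the pairs returned by `get_routes()` contain only route names and utterances. If the thresholds matter, one option is to keep them separately and merge them back in by route name. The sketch below uses plain dictionaries so that it runs on its own; `thresholds`, `rebuilt`, and `with_thresholds` are hypothetical names, and the utterance lists are shortened.
%% Cell type:code id: tags:
```python
# Thresholds keyed by route name, as they might be collected from the original
# dataset rows (0.82 is the value shown for 'politics' earlier).
thresholds = {"politics": 0.82}

# Route definitions rebuilt from the index carry only names and utterances.
rebuilt = [
    {"name": "politics", "utterances": ["they will save the country!"]},
    {"name": "jokes", "utterances": ["tell me a joke"]},
]


def with_thresholds(route_kwargs, thresholds):
    """Return copies of the route kwargs with score_threshold filled in where known."""
    return [
        {**kwargs, "score_threshold": thresholds.get(kwargs["name"])}
        for kwargs in route_kwargs
    ]


restored = with_thresholds(rebuilt, thresholds)
print(restored)
```
%% Cell type:markdown id: tags:
Each resulting dictionary could then be unpacked into a route with `Route(**kwargs)`, in the same way the dataset rows were earlier.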
%% Cell type:markdown id: tags:
Now we reinitialize our `RouteLayer`:
%% Cell type:code id: tags:
``` python
from semantic_router import RouteLayer
rl = RouteLayer(encoder=encoder, routes=routes, index=index)
```
%% Output
2024-04-15 01:57:19 INFO semantic_router.utils.logger local
%% Cell type:markdown id: tags:
And test it again:
%% Cell type:code id: tags:
``` python
rl("say something to make me laugh").name
```
%% Output
'jokes'
%% Cell type:code id: tags:
``` python
rl("tell me something amusing").name
```
%% Output
'jokes'
%% Cell type:code id: tags:
``` python
rl("it's raining cats and dogs today").name
```
%% Output
'chitchat'
%% Cell type:code id: tags:
``` python
# delete index
index.delete_index()
```
%% Cell type:markdown id: tags:
Perfect, our routes loaded from our `PineconeIndex` are working as expected! As mentioned, we can use the `PineconeIndex` for persistence and for high-scale use cases, for example where we might have hundreds of thousands of utterances, or even millions.
%% Cell type:markdown id: tags:
---