[Open in Colab](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/08-multi-modal.ipynb) [Open in nbviewer](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/08-multi-modal.ipynb)

# Multi-Modal Routes

The Semantic Router library can also be used to detect specific images or videos. In this walkthrough we demonstrate this by detecting **N**ot **S**hrek **F**or **W**ork (NSFW) and **S**hrek **F**or **W**ork (SFW) images.

## Getting Started

We start by installing the library:

In [None]:
!pip install -qU \
    "semantic-router[local]==0.0.25" \
    datasets==2.17.0

Next, we download a multi-modal dataset. We'll be using the `aurelio-ai/shrek-detection` dataset from Hugging Face.

In [None]:
from datasets import load_dataset

data = load_dataset("aurelio-ai/shrek-detection", split="train", trust_remote_code=True)
data[3]["image"]

We split the images into two lists according to the `is_shrek` label:

In [None]:
shrek_pics = [d["image"] for d in data if d["is_shrek"]]
not_shrek_pics = [d["image"] for d in data if not d["is_shrek"]]
print(f"We have {len(shrek_pics)} shrek pics, and {len(not_shrek_pics)} not shrek pics")

We start by defining a `Route` whose `utterances` are the examples that should trigger it. Routes are usually defined with example phrases; here we pass the Shrek images instead.

In [None]:
from semantic_router import Route

shrek = Route(
    name="shrek",
    utterances=shrek_pics,
)

Let's define another for good measure:

In [None]:
not_shrek = Route(
    name="not_shrek",
    utterances=not_shrek_pics,
)

routes = [shrek, not_shrek]

Now we initialize our embedding model:

In [None]:
from semantic_router.encoders.clip import CLIPEncoder

encoder = CLIPEncoder()

Now we define the `RouteLayer`. When called, the route layer will consume text (a query) and output the category (`Route`) it belongs to. To initialize a `RouteLayer` we need our `encoder` model and a list of `routes`.

In [None]:
from semantic_router.layer import RouteLayer

rl = RouteLayer(encoder=encoder, routes=routes)
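To make the routing step concrete, here is a minimal sketch of similarity-based routing in plain Python. It is an illustration under stated assumptions, not the library's implementation: the `route` function, the toy 3-dimensional vectors, the best-example scoring, and the `threshold` value are all hypothetical stand-ins for the embeddings an encoder would produce and for whatever scoring `RouteLayer` actually applies.

```python
import math
from typing import Dict, List, Optional

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def route(query: Vector, routes: Dict[str, List[Vector]], threshold: float) -> Optional[str]:
    """Return the best-matching route name, or None if no score reaches the threshold."""
    best_name, best_score = None, threshold
    for name, examples in routes.items():
        # Score a route by its most similar example (an assumption of this sketch).
        score = max(cosine_similarity(query, example) for example in examples)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy embeddings standing in for encoded images (hypothetical values).
toy_routes = {
    "shrek": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "not_shrek": [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]],
}

print(route([1.0, 0.0, 0.0], toy_routes, threshold=0.7))  # closest to the "shrek" examples
print(route([0.0, 1.0, 0.0], toy_routes, threshold=0.7))  # no route is similar enough
```

The second query is far from every example, so the sketch returns `None` rather than forcing a match. A shared embedding space is what lets a text query land on a route that was defined with images.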

Now we can test it with _text_ to see if we hit the routes that we defined with images:

In [None]:
rl("don't you love politics?")

In [None]:
rl("shrek")

In [None]:
rl("dwayne the rock johnson")

Everything is being classified accurately. Let's pull in some images that the router hasn't seen before and see if we can classify them as NSFW or SFW.

In [None]:
test_data = load_dataset(
    "aurelio-ai/shrek-detection", split="test", trust_remote_code=True
)
test_data

In this case, we return `None` because no matches were identified.
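One way to check the router against the held-out images is to compare each predicted route name with the `is_shrek` label. The sketch below assumes that calling the route layer on an image returns an object with a `name` attribute (or `None` when nothing matches), mirroring the text queries above. `accuracy`, `FakeChoice`, `fake_router`, and the toy data are hypothetical stand-ins so the snippet runs without downloading the dataset or the CLIP model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Mapping, Optional


@dataclass
class FakeChoice:
    """Hypothetical stand-in for the router's result object."""
    name: Optional[str]


def accuracy(router: Callable[[Any], Any], examples: Iterable[Mapping[str, Any]]) -> float:
    """Fraction of examples whose predicted route agrees with the `is_shrek` label."""
    correct = total = 0
    for example in examples:
        choice = router(example["image"])
        # A missing result or a missing name both count as "no route matched".
        predicted = getattr(choice, "name", None)
        expected = "shrek" if example["is_shrek"] else "not_shrek"
        correct += predicted == expected
        total += 1
    return correct / total if total else 0.0


def fake_router(image: str) -> FakeChoice:
    """Toy classifier: strings stand in for images (an assumption of this sketch)."""
    return FakeChoice(name="shrek" if "green" in image else "not_shrek")


toy_test_data = [
    {"image": "green ogre", "is_shrek": True},
    {"image": "grey donkey", "is_shrek": False},
    {"image": "green frog", "is_shrek": False},
    {"image": "swamp at dusk", "is_shrek": True},
]

print(accuracy(fake_router, toy_test_data))
```

With the real objects, the equivalent call would be `accuracy(rl, test_data)`, provided the installed version of the route layer accepts images as input.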