Topic splitters are implemented in `semantic-router/splitters`.
They allow a set of utterances to be automatically grouped (clustered) into unlabelled topics.
Splitters are also integrated with `Conversation` objects, so a conversation can be progressively split by topic as it evolves. This benefits routing, because earlier messages in a conversation topic can provide useful context when determining routes. Using every utterance in the latest conversation topic, rather than only the last utterance, supplies this additional context and allows correct routes to be chosen more reliably.
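To make the idea of grouping utterances into unlabelled topics concrete, here is a minimal sketch. It is not the algorithm used in `semantic-router/splitters`: the helpers `bow_vector`, `cosine` and `split_into_topics` are hypothetical, word counts stand in for real embeddings, and the threshold of `0.2` is an arbitrary assumption.

```python
import math
import re
from collections import Counter


def bow_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_into_topics(utterances: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Start a new topic when an utterance is dissimilar to the current topic so far."""
    topics: list[list[str]] = []
    for utterance in utterances:
        if topics:
            current = bow_vector(" ".join(topics[-1]))
            if cosine(current, bow_vector(utterance)) >= threshold:
                topics[-1].append(utterance)  # continue the current topic
                continue
        topics.append([utterance])  # open a new, unlabelled topic
    return topics


utterances = [
    "my computer keeps crashing",
    "the computer is crashing when the software starts",
    "i would like a refund",
    "a refund is fine",
]
topics = split_into_topics(utterances)
for number, topic in enumerate(topics, start=1):
    print(f"Topic {number}: {topic}")
```

In this toy run the two crash-related utterances share a topic and the two refund-related utterances form a second one. A real splitter would compare dense embeddings from an encoder rather than word counts.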
%% Cell type:markdown id: tags:
## Example: IT Support Dialogue
%% Cell type:markdown id: tags:
### Setup
%% Cell type:markdown id: tags:
First, we import the necessary classes and initialize the conversation with dialogue.
%% Cell type:code id: tags:
``` python
from semantic_router.text import Conversation
from semantic_router.schema import Message

# Initialize the Conversation
conversation = Conversation()

# Define the IT support dialogue
messages = [
    Message(role="user", content="Hi, there, please can you confirm your full name"),
    Message(role="user", content="Hi, my name is John Doe."),
    Message(role="bot", content="Okay, how can I help you today?"),
    Message(role="user", content="My computer keeps crashing"),
    Message(role="bot", content="Okay, is our software running when the computer crashes."),
    Message(role="user", content="Yeah, v.3.11.2 is running when it crashes."),
]

# Add messages to the conversation
conversation.add_new_messages(messages)
```
%% Output
Topic 2: - user: Hi, there, please can you confirm your full name
Topic 2: - user: Hi, my name is John Doe.
Topic 3: - bot: Okay, how can I help you today?
Topic 4: - user: My computer keeps crashing
Topic 4: - bot: Okay, is our software running when the computer crashes.
Topic 4: - user: Yeah, v.3.11.2 is running when it crashes.
%% Cell type:markdown id: tags:
Notice that the last message says "Yeah, v.3.11.2 is running when it crashes."
This might be routed correctly by the semantic-router, particularly if the route is quite generic and intended for "software" and/or "crashes".
However, as an illustrative example, what if the routes were:
Route A: "Software Crashes - v3.11"
Route B: "Computer Crashes - v3.11"
If only the last utterance were used, Route A would likely be chosen, since that message mentions only the software version. However, if every utterance from the last topic (Topic 4) were concatenated and sent to the semantic-router, the additional context (the user's computer keeps crashing) would most likely result in Route B being chosen.
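As a rough sketch of that concatenation step, the snippet below builds a single query string from every utterance in the latest topic. The `(topic ID, utterance)` tuple layout and the `latest_topic_query` helper are assumptions made for illustration; they are not the return type of `conversation.split_by_topic()` or any other semantic-router API.

```python
def latest_topic_query(labelled: list[tuple[int, str]]) -> str:
    """Concatenate every utterance that shares the topic ID of the final utterance."""
    if not labelled:
        return ""
    latest_id = labelled[-1][0]
    return " ".join(text for topic_id, text in labelled if topic_id == latest_id)


# Topic assignments copied from the output shown earlier (hypothetical layout).
labelled_utterances = [
    (2, "user: Hi, there, please can you confirm your full name"),
    (2, "user: Hi, my name is John Doe."),
    (3, "bot: Okay, how can I help you today?"),
    (4, "user: My computer keeps crashing"),
    (4, "bot: Okay, is our software running when the computer crashes."),
    (4, "user: Yeah, v.3.11.2 is running when it crashes."),
]

query = latest_topic_query(labelled_utterances)
print(query)
```

The resulting string contains all three Topic 4 utterances, so whatever router receives it sees "My computer keeps crashing" alongside the version number.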
%% Cell type:markdown id: tags:
### Topic Splitting After Topic Continuation
%% Cell type:markdown id: tags:
Note that topics can be continued even after `conversation.split_by_topic()` has already been run.
%% Cell type:markdown id: tags:
Add some new messages.
%% Cell type:code id: tags:
``` python
# Define the next part of the IT support dialogue
messages = [
    Message(role="user", content="What do the system logs say, right before the crash?"),
    Message(role="user", content="I'll check soon, but first let's talk refund."),
    Message(role="bot", content="Okay let me sort out a refund."),
]

# Add messages to the conversation
conversation.add_new_messages(messages)
```
%% Output
Topic 2: - user: Hi, there, please can you confirm your full name
Topic 2: - user: Hi, my name is John Doe.
Topic 3: - bot: Okay, how can I help you today?
Topic 4: - user: My computer keeps crashing
Topic 4: - bot: Okay, is our software running when the computer crashes.
Topic 4: - user: Yeah, v.3.11.2 is running when it crashes.
Topic 4: - user: What do the system logs say, right before the crash?
Topic 5: - user: I'll check soon, but first let's talk refund.
Topic 5: - bot: Okay let me sort out a refund.
%% Cell type:markdown id: tags:
As you can see, we:
1) Added the first six messages, as seen above, to the `Conversation`.
2) Ran the Topic Splitter.
3) Added the last three messages to the `Conversation`.
4) Ran the Topic Splitter again.
Although "user: Yeah, v.3.11.2 is running when it crashes." and "user: What do the system logs say, right before the crash?" were added separately, and although the splitter was run twice (once before the second of these utterances was added, and once after), both utterances were assigned to the same topic, `Topic 4`.
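
One way such topic continuation could work is sketched below. This is a toy model, not the semantic-router implementation: `ToyConversation`, its `add_messages` and `split` methods, the word-overlap (Jaccard) similarity, and the `0.15` threshold are all assumptions chosen for illustration. The key idea is that already-labelled messages keep their topic IDs, and each newly added message is compared against the most recent topic.

```python
import re


def words(text: str) -> set[str]:
    """Lowercase word set, used as a toy similarity signal."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ToyConversation:
    """Hypothetical stand-in that labels only messages added since the last split."""

    def __init__(self, threshold: float = 0.15):
        self.threshold = threshold
        self.labelled: list[tuple[int, str]] = []  # (topic ID, utterance)
        self.pending: list[str] = []  # messages not yet assigned to a topic

    def add_messages(self, messages: list[str]) -> None:
        self.pending.extend(messages)

    def split(self) -> list[tuple[int, str]]:
        for message in self.pending:
            if self.labelled:
                last_id = self.labelled[-1][0]
                # All words seen so far in the most recent topic.
                topic_words = set().union(
                    *(words(t) for i, t in self.labelled if i == last_id)
                )
                message_words = words(message)
                overlap = len(topic_words & message_words) / len(topic_words | message_words)
                topic_id = last_id if overlap >= self.threshold else last_id + 1
            else:
                topic_id = 1
            self.labelled.append((topic_id, message))
        self.pending = []
        return self.labelled


conversation = ToyConversation()
conversation.add_messages([
    "my computer keeps crashing",
    "the computer keeps crashing when the software runs",
])
first_ids = [topic_id for topic_id, _ in conversation.split()]

conversation.add_messages([
    "what do the logs say when the computer is crashing",
    "i want a refund",
])
second_ids = [topic_id for topic_id, _ in conversation.split()]
print(first_ids, second_ids)
```

In this sketch the third message joins the existing crash topic even though it arrived after the first split, while the refund message opens a new topic.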