"Then the query engine is ready for Semantic Search or Retrieval Augmented Generation:\n",
"Then the query engine is ready for Semantic Search or Retrieval Augmented Generation with Milvus 2.3 documents:\n",
"\n",
"\n",
"- **Retrieve** (Semantic search powered by Zilliz Cloud Pipeline's Index):"
"- **Retrieve** (Semantic search powered by Zilliz Cloud Pipeline's Index):"
]
]
...
@@ -163,18 +165,19 @@
"execution_count": null,
"execution_count": null,
"id": "8ab92af7",
"id": "8ab92af7",
"metadata": {},
"metadata": {},
"outputs": [],
"outputs": [
"source": [
{
"question = \"Can users delete entities by complex boolean expressions?\"\n",
"name": "stdout",
"query_engine_with_filters.retrieve(question)"
"output_type": "stream",
]
"text": [
},
"[NodeWithScore(node=TextNode(id_='446268394525283746', embedding=None, metadata={'version': '2.3'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='c3254bc65319b52914d6e68fbce69161fcf0e2998e4619287a8560258a2fe53d', text='Delete Entities\\nThis topic describes how to delete entities in Milvus.\\nMilvus supports deleting entities by primary key or complex boolean expressions. Deleting entities by primary key is much faster and lighter than deleting them by complex boolean expressions. This is because Milvus executes queries first when deleting data by complex boolean expressions.\\nDeleted entities can still be retrieved immediately after the deletion if the consistency level is set lower than Strong.\\nEntities deleted beyond the pre-specified span of time for Time Travel cannot be retrieved again.\\nFrequent deletion operations will impact the system performance.\\nBefore deleting entities by comlpex boolean expressions, make sure the collection has been loaded.\\nDeleting entities by complex boolean expressions is not an atomic operation. Therefore, if it fails halfway through, some data may still be deleted.\\nDeleting entities by complex boolean expressions is supported only when the consistency is set to Bounded. For details, see Consistency.\\nPrepare boolean expression\\nPrepare the boolean expression that filters the entities to delete.\\nMilvus supports deleting entities by primary key or complex boolean expressions. 
For more information on expression rules and supported operators, see Boolean Expression Rules.\\nSimple boolean expression\\nUse a simple expression to filter data with primary key values of 0 and 1:\\npython\\nexpr = \"book_id in [0,1]\"\\nComplex boolean expression', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n'), score=0.8668166995048523), NodeWithScore(node=TextNode(id_='446268394525283747', embedding=None, metadata={'version': '2.3'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='3ec8b3a992fb72d081145b7859c70453dc9d71be714c0f5f99ad2b2c2cb1f7ea', text='To filter entities that meet specific conditions, define complex boolean expressions.\\nFilter entities whose word_count is greater than or equal to 11000:\\npython\\nexpr = \"word_count >= 11000\"\\nFilter entities whose book_name is not Unknown:\\npython\\nexpr = \"book_name != Unknown\"\\nFilter entities whose primary key values are greater than 5 and word_count is smaller than or equal to 9999:\\npython\\nexpr = \"book_id > 5 && word_count <= 9999\"\\nDelete entities\\nDelete the entities with the boolean expression you created. Milvus returns the ID list of the deleted entities.\\npython\\nfrom pymilvus import Collection\\ncollection = Collection(\"book\") # Get an existing collection.\\ncollection.delete(expr)\\nParameter Description\\nexpr Boolean expression that specifies the entities to delete.\\npartition_name (optional) Name of the partition to delete entities from.\\nUpsert Entities\\nThis topic describes how to upsert entities in Milvus.\\nUpserting is a combination of insert and delete operations. 
In the context of a Milvus vector database, an upsert is a data-level operation that will overwrite an existing entity if a specified field already exists in a collection, and insert a new entity if the specified value doesn’t already exist.\\nThe following example upserts 3,000 rows of randomly generated data as the example data. When performing upsert operations, it\\'s important to note that the operation may compromise performance. This is because the operation involves deleting data during execution.\\nPrepare data', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n'), score=0.841397762298584), NodeWithScore(node=TextNode(id_='446268394525283749', embedding=None, metadata={'version': '2.3'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='42656e32ce6baa2897419b8bae612412db94f1d570ab1702f2ae6c5557f248a9', text='When data is upserted into Milvus it is updated and inserted into segments. Segments have to reach a certain size to be sealed and indexed. Unsealed segments will be searched brute force. In order to avoid this with any remainder data, it is best to call flush(). The flush() call will seal any remaining segments and send them for indexing. It is important to only call this method at the end of an upsert session. Calling it too often will cause fragmented data that will need to be cleaned later on.\\nLimits\\nUpdating primary key fields is not supported by upsert().\\nupsert() is not applicable and an error can occur if autoID is set to True for primary key fields.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n'), score=0.793336033821106)]\n"
{
]
"cell_type": "markdown",
}
"id": "a503d6e0",
],
"metadata": {},
"source": [
"source": [
"> The query engine with filters retrieves only text nodes with version 2.3."
"question = \"Can users delete entities by filtering non-primary fields?\"\n",
"Yes, users can delete entities by filtering non-primary fields. Milvus supports deleting entities by complex boolean expressions, which can include conditions based on non-primary fields. Users can define complex boolean expressions to filter entities based on specific conditions and then delete those entities using the expression.\n"
<a href="https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/examples/managed/vectaraDemo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
%% Cell type:markdown id:db0855d0 tags:
# Managed Index with Zilliz Cloud Pipeline
[Zilliz Cloud Pipelines](https://docs.zilliz.com/docs/pipelines) is a service that transforms unstructured data into vector embeddings stored in a vector database, ready for semantic search.
## Setup
1. Install llama-index
%% Cell type:code id:6019e01a tags:
``` python
# ! pip install llama-index
```
%% Cell type:markdown id:3fc94b2f tags:
2. Set up your [OpenAI](https://platform.openai.com) & [Zilliz Cloud](https://cloud.zilliz.com/) accounts
%% Cell type:code id:c2d1c538 tags:
``` python
from getpass import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API Key:")
ZILLIZ_CLUSTER_ID = getpass("Enter your Zilliz Cluster ID:")
ZILLIZ_TOKEN = getpass("Enter your Zilliz Token:")
```
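For non-interactive runs (for example a scheduled job), prompting with `getpass` can be inconvenient. A minimal sketch, assuming the same values may already be exported as environment variables; the helper name `read_secret` is hypothetical and is not part of llama-index or Zilliz Cloud:

```python
import os
from getpass import getpass


def read_secret(env_name: str, prompt: str) -> str:
    """Return the stripped value of env_name if it is set and non-empty, else prompt."""
    # Hypothetical helper for illustration; not a llama-index or Zilliz API.
    value = os.environ.get(env_name, "").strip()
    if value:
        return value
    # Fall back to an interactive prompt that does not echo the input.
    return getpass(prompt)
```

With this helper, a call such as `read_secret("ZILLIZ_TOKEN", "Enter your Zilliz Token:")` only prompts when the variable is unset; using the environment key `ZILLIZ_TOKEN` is likewise an assumption.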
%% Cell type:markdown id:5d3c5b5f tags:
## Indexing documents
### From Signed URL
Zilliz Cloud Pipeline can ingest and automatically index a document given a presigned URL.
> The query engine with filters retrieves only text nodes with version 2.3.
[NodeWithScore(node=TextNode(id_='446268394525283746', embedding=None, metadata={'version': '2.3'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='c3254bc65319b52914d6e68fbce69161fcf0e2998e4619287a8560258a2fe53d', text='Delete Entities\nThis topic describes how to delete entities in Milvus.\nMilvus supports deleting entities by primary key or complex boolean expressions. Deleting entities by primary key is much faster and lighter than deleting them by complex boolean expressions. This is because Milvus executes queries first when deleting data by complex boolean expressions.\nDeleted entities can still be retrieved immediately after the deletion if the consistency level is set lower than Strong.\nEntities deleted beyond the pre-specified span of time for Time Travel cannot be retrieved again.\nFrequent deletion operations will impact the system performance.\nBefore deleting entities by comlpex boolean expressions, make sure the collection has been loaded.\nDeleting entities by complex boolean expressions is not an atomic operation. Therefore, if it fails halfway through, some data may still be deleted.\nDeleting entities by complex boolean expressions is supported only when the consistency is set to Bounded. For details, see Consistency.\nPrepare boolean expression\nPrepare the boolean expression that filters the entities to delete.\nMilvus supports deleting entities by primary key or complex boolean expressions. 
For more information on expression rules and supported operators, see Boolean Expression Rules.\nSimple boolean expression\nUse a simple expression to filter data with primary key values of 0 and 1:\npython\nexpr = "book_id in [0,1]"\nComplex boolean expression', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.8668166995048523), NodeWithScore(node=TextNode(id_='446268394525283747', embedding=None, metadata={'version': '2.3'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='3ec8b3a992fb72d081145b7859c70453dc9d71be714c0f5f99ad2b2c2cb1f7ea', text='To filter entities that meet specific conditions, define complex boolean expressions.\nFilter entities whose word_count is greater than or equal to 11000:\npython\nexpr = "word_count >= 11000"\nFilter entities whose book_name is not Unknown:\npython\nexpr = "book_name != Unknown"\nFilter entities whose primary key values are greater than 5 and word_count is smaller than or equal to 9999:\npython\nexpr = "book_id > 5 && word_count <= 9999"\nDelete entities\nDelete the entities with the boolean expression you created. Milvus returns the ID list of the deleted entities.\npython\nfrom pymilvus import Collection\ncollection = Collection("book") # Get an existing collection.\ncollection.delete(expr)\nParameter Description\nexpr Boolean expression that specifies the entities to delete.\npartition_name (optional) Name of the partition to delete entities from.\nUpsert Entities\nThis topic describes how to upsert entities in Milvus.\nUpserting is a combination of insert and delete operations. 
In the context of a Milvus vector database, an upsert is a data-level operation that will overwrite an existing entity if a specified field already exists in a collection, and insert a new entity if the specified value doesn’t already exist.\nThe following example upserts 3,000 rows of randomly generated data as the example data. When performing upsert operations, it\'s important to note that the operation may compromise performance. This is because the operation involves deleting data during execution.\nPrepare data', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.841397762298584), NodeWithScore(node=TextNode(id_='446268394525283749', embedding=None, metadata={'version': '2.3'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='42656e32ce6baa2897419b8bae612412db94f1d570ab1702f2ae6c5557f248a9', text='When data is upserted into Milvus it is updated and inserted into segments. Segments have to reach a certain size to be sealed and indexed. Unsealed segments will be searched brute force. In order to avoid this with any remainder data, it is best to call flush(). The flush() call will seal any remaining segments and send them for indexing. It is important to only call this method at the end of an upsert session. Calling it too often will cause fragmented data that will need to be cleaned later on.\nLimits\nUpdating primary key fields is not supported by upsert().\nupsert() is not applicable and an error can occur if autoID is set to True for primary key fields.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.793336033821106)]
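The output above contains only nodes whose metadata is `{'version': '2.3'}`, each with a similarity score, ordered from highest to lowest. As a rough illustration of that behaviour, the sketch below applies an exact-match metadata filter and then ranks the remaining nodes by cosine similarity. `ToyNode`, `cosine`, and `retrieve` are hypothetical stand-ins with made-up two-dimensional embeddings; this is not how llama-index or Zilliz Cloud Pipelines implement filtering internally.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class ToyNode:
    """Stand-in for a retrieved text node; not llama-index's TextNode."""
    text: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, nodes, filters, top_k=3):
    """Keep nodes matching every filter exactly, then rank them by similarity."""
    matching = [
        n for n in nodes
        if all(n.metadata.get(k) == v for k, v in filters.items())
    ]
    scored = [(cosine(query_vec, n.embedding), n) for n in matching]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data: embeddings are invented for the example.
nodes = [
    ToyNode("Delete entities (2.3 docs)", [0.9, 0.1], {"version": "2.3"}),
    ToyNode("Delete entities (2.1 docs)", [1.0, 0.0], {"version": "2.1"}),
    ToyNode("Upsert entities (2.3 docs)", [0.2, 0.8], {"version": "2.3"}),
]
results = retrieve([1.0, 0.0], nodes, filters={"version": "2.3"})
for score, node in results:
    print(f"{score:.3f}  {node.metadata['version']}  {node.text}")
```

The 2.1 node is the closest match to the query vector, yet it never appears in `results` because the filter removes it before ranking.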
%% Cell type:markdown id:e91c5896 tags:
- **Query** (RAG powered by Zilliz Cloud Pipeline's Index & OpenAI's LLM):
Yes, users can delete entities by filtering non-primary fields. Milvus supports deleting entities by complex boolean expressions, which can include conditions based on non-primary fields. Users can define complex boolean expressions to filter entities based on specific conditions and then delete those entities using the expression.
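
The **Query** step combines retrieval with generation: text from the retrieved nodes is passed to the LLM as context for the question. A rough sketch of the prompt-assembly part only, with no LLM call; the function name `build_rag_prompt` and the template wording are illustrative assumptions and do not reproduce llama-index's actual prompt templates.

```python
def build_rag_prompt(question: str, contexts: list[str]) -> str:
    """Join retrieved passages and the question into a single prompt string."""
    # Hypothetical template for illustration only.
    context_block = "\n---\n".join(c.strip() for c in contexts)
    return (
        "Context information is below.\n"
        f"{context_block}\n"
        "Given the context and no prior knowledge, answer the question.\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Passages taken from the retrieved text shown earlier.
prompt = build_rag_prompt(
    "Can users delete entities by filtering non-primary fields?",
    [
        "Milvus supports deleting entities by primary key or complex boolean expressions.",
        'expr = "word_count >= 11000"',
    ],
)
print(prompt)
```

Because the retriever only returns nodes that pass the metadata filter, the context in such a prompt would be limited to the version 2.3 documents.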