Back to Agentscope

MilvusLite Vector Store

examples/functionality/vector_store/milvus_lite/README.md

1.0.193.1 KB
Original Source

MilvusLite Vector Store

This example demonstrates how to use MilvusLiteStore for vector storage and semantic search in AgentScope. It includes four test scenarios covering CRUD operations, metadata filtering, document chunking, and distance metrics.

Quick Start

Install agentscope first, and then the MilvusLite dependency:

bash
# In MacOS/Linux
pip install pymilvus\[milvus_lite\]

# In Windows
pip install pymilvus[milvus_lite]

Run the example script, which showcases adding, searching with/without filters in MilvusLite vector store:

bash
python milvuslite_store.py

Note: The script creates .db files in the current directory. You can delete them after testing.

Usage

Initialize Store

python
from agentscope.rag import MilvusLiteStore

store = MilvusLiteStore(
    uri="./milvus_test.db",
    collection_name="test_collection",
    dimensions=768,              # Match your embedding model
    distance="COSINE",           # COSINE, L2, or IP
)

Add Documents

python
from agentscope.rag import Document, DocMetadata
from agentscope.message import TextBlock

doc = Document(
    metadata=DocMetadata(
        content=TextBlock(type="text", text="Your document text"),
        doc_id="doc_1",
        chunk_id=0,
        total_chunks=1,
    ),
    embedding=[0.1, 0.2, ...],  # Your embedding vector
)

await store.add([doc])
python
results = await store.search(
    query_embedding=[0.15, 0.25, ...],
    limit=5,
    score_threshold=0.9,                # Optional
    filter='doc_id like "prefix%"',     # Optional
)

Delete

python
await store.delete(filter_expr='doc_id == "doc_1"')

Distance Metrics

MetricDescriptionBest For
COSINECosine similarityText embeddings (recommended)
L2Euclidean distanceSpatial data
IPInner ProductRecommendation systems

Filter Expressions

python
# Exact match
filter='doc_id == "doc_1"'

# Pattern matching
filter='doc_id like "prefix%"'

# Numeric and logical operators
filter='chunk_id >= 0 and total_chunks > 1'

Advanced Usage

Access Underlying Client

python
client = store.get_client()
stats = client.get_collection_stats(collection_name="test_collection")

Document Metadata

  • content: Text content (TextBlock)
  • doc_id: Unique document identifier
  • chunk_id: Chunk position (0-indexed)
  • total_chunks: Total chunks in document

FAQ

What embedding dimension should I use? Match your embedding model's output dimension (e.g., 768 for BERT, 1536 for OpenAI ada-002).

Can I change the distance metric after creation? No, create a new collection with the desired metric.

How do I delete the database? Delete the .db file specified in the uri parameter.

Is this suitable for production? MilvusLite works well for development and small-scale applications. For production at scale, consider Milvus standalone or cluster mode.

References