docs/examples/retrieval_qdrant.ipynb
<a href="https://colab.research.google.com/github/docling-project/docling/blob/main/docs/examples/retrieval_qdrant.ipynb" target="_parent">Open in Colab</a>
| Step | Tech | Execution |
|---|---|---|
| Embedding | FastEmbed | 💻 Local |
| Vector store | Qdrant | 💻 Local |
This example demonstrates using Docling with Qdrant to perform a hybrid search across your documents using dense and sparse vectors.
We'll chunk the documents using Docling before adding them to a Qdrant collection. By limiting the length of the chunks, we can preserve the meaning in each vector embedding.
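If needed, the chunk size can also be capped explicitly. Here is a minimal sketch, assuming HybridChunker's optional tokenizer and max_tokens parameters (the values below are illustrative and not used in the rest of this example):
from docling.chunking import HybridChunker

# Illustrative configuration: align the tokenizer with the embedding model
# used later and cap each chunk at 256 tokens.
chunker = HybridChunker(
    tokenizer="sentence-transformers/all-MiniLM-L6-v2",
    max_tokens=256,
)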
Let's start by installing the dependencies. You can use the fastembed-gpu package instead of fastembed if you've got the hardware to support it.
%pip install --no-warn-conflicts -q qdrant-client docling fastembed
Let's import all the classes we'll be working with.
from qdrant_client import QdrantClient
from docling.chunking import HybridChunker
from docling.datamodel.base_models import InputFormat
from docling.document_converter import DocumentConverter
COLLECTION_NAME = "docling"
doc_converter = DocumentConverter(allowed_formats=[InputFormat.HTML])
client = QdrantClient(location=":memory:")
# The :memory: mode is a Python imitation of Qdrant's APIs for prototyping and CI.
# For production deployments, use the Docker image: docker run -p 6333:6333 qdrant/qdrant
# client = QdrantClient(location="http://localhost:6333")
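# Register a dense and a sparse embedding model so that add() and query()
# run in hybrid mode, with FastEmbed computing the embeddings locally.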
client.set_model("sentence-transformers/all-MiniLM-L6-v2")
client.set_sparse_model("Qdrant/bm25")
We can now download and chunk the document using Docling. For demonstration, we'll use an article about chunking strategies :)
result = doc_converter.convert(
"https://www.sagacify.com/news/a-guide-to-chunking-strategies-for-retrieval-augmented-generation-rag"
)
documents, metadatas = [], []
for chunk in HybridChunker().chunk(result.document):
documents.append(chunk.text)
metadatas.append(chunk.meta.export_json_dict())
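As a quick sanity check before uploading, we can peek at what the chunker produced (this uses only the variables defined above):
print(f"{len(documents)} chunks")
print(documents[0])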
Let's now upload the documents to Qdrant. The add() method batches the documents and uses FastEmbed to generate vector embeddings on our machine; it also creates the collection if it doesn't exist yet.
_ = client.add(
collection_name=COLLECTION_NAME,
documents=documents,
metadata=metadatas,
batch_size=64,
)
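We can now query the collection. Because both a dense and a sparse model are set, query() performs a hybrid search across both vector types.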
points = client.query(
collection_name=COLLECTION_NAME,
query_text="Can I split documents?",
limit=10,
)
for i, point in enumerate(points):
print(f"=== {i} ===")
print(point.document)
print()
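Each result also exposes a relevance score and the Docling metadata we stored per chunk. A minimal sketch, assuming the headings field that Docling's chunk metadata exports:
for point in points[:3]:
    # QueryResponse objects carry the payload we uploaded as metadata.
    print(point.score, point.metadata.get("headings"))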