docs/mintlify/cloud/search-api/batch-operations.mdx
Pass a list of Search objects to execute them in a single request. Each search operates independently and returns its own results.
<CodeGroup> ```python Python from chromadb import Search, K, Knnsearches = [ # Search 1: Recent articles (Search() .where((K("type") == "article") & (K("year") >= 2024)) .rank(Knn(query="machine learning applications")) .limit(5) .select(K.DOCUMENT, K.SCORE, "title")),
# Search 2: Papers by specific authors
(Search()
.where(K("author").is_in(["Smith", "Jones"]))
.rank(Knn(query="neural network research"))
.limit(10)
.select(K.DOCUMENT, K.SCORE, "title", "author")),
# Search 3: Featured content (no ranking)
Search()
.where(K("status") == "featured")
.limit(20)
.select("title", "date")
]
results = collection.search(searches)
```typescript TypeScript
import { Search, K, Knn } from 'chromadb';
// Execute multiple searches in one call
const searches = [
// Search 1: Recent articles
new Search()
.where(K("type").eq("article").and(K("year").gte(2024)))
.rank(Knn({ query: "machine learning applications" }))
.limit(5)
.select(K.DOCUMENT, K.SCORE, "title"),
// Search 2: Papers by specific authors
new Search()
.where(K("author").isIn(["Smith", "Jones"]))
.rank(Knn({ query: "neural network research" }))
.limit(10)
.select(K.DOCUMENT, K.SCORE, "title", "author"),
// Search 3: Featured content (no ranking)
new Search()
.where(K("status").eq("featured"))
.limit(20)
.select("title", "date")
];
// Execute all searches in one request
const results = await collection.search(searches);
use chroma::types::{Key, QueryVector, RankExpr, SearchPayload};
let searches = vec![
SearchPayload::default()
.r#where(Key::field("type").eq("article") & Key::field("year").gte(2024))
.rank(RankExpr::Knn {
query: QueryVector::Dense(vec![0.1, 0.2, 0.3]),
key: Key::Embedding,
limit: 16,
default: None,
return_rank: false,
})
.limit(Some(5), 0)
.select([Key::Document, Key::Score, Key::field("title")]),
SearchPayload::default()
.r#where(Key::field("author").is_in(["Smith", "Jones"]))
.rank(RankExpr::Knn {
query: QueryVector::Dense(vec![0.2, 0.3, 0.4]),
key: Key::Embedding,
limit: 16,
default: None,
return_rank: false,
})
.limit(Some(10), 0)
.select([Key::Document, Key::Score, Key::field("title"), Key::field("author")]),
SearchPayload::default()
.r#where(Key::field("status").eq("featured"))
.limit(Some(20), 0)
.select([Key::field("title"), Key::field("date")]),
];
let results = collection.search(searches).await?;
Results from batch operations maintain the same order as your searches. Each search's results are accessed by its index.
Each field in the SearchResult maintains a list where each index corresponds to a search:
results.ids[i] - IDs from search at index iresults.documents[i] - Documents from search at index i (if selected)results.embeddings[i] - Embeddings from search at index i (if selected)results.metadatas[i] - Metadata from search at index i (if selected)results.scores[i] - Scores from search at index i (if ranking was used)ids_1 = results.ids[0] # IDs from search1 ids_2 = results.ids[1] # IDs from search2 ids_3 = results.ids[2] # IDs from search3
all_rows = results.rows() # Returns list of lists rows_1 = all_rows[0] # Rows from search1 rows_2 = all_rows[1] # Rows from search2 rows_3 = all_rows[2] # Rows from search3
for search_index, rows in enumerate(all_rows): print(f"Results from search {search_index + 1}:") for row in rows: print(f" - {row['id']}: {row.get('metadata', {}).get('title', 'N/A')}")
```typescript TypeScript
// Batch search returns multiple result sets
const results = await collection.search([search1, search2, search3]);
// Access results by index
const ids1 = results.ids[0]; // IDs from search1
const ids2 = results.ids[1]; // IDs from search2
const ids3 = results.ids[2]; // IDs from search3
// Using rows() for easier processing
const allRows = results.rows(); // Returns list of lists
const rows1 = allRows[0]; // Rows from search1
const rows2 = allRows[1]; // Rows from search2
const rows3 = allRows[2]; // Rows from search3
// Process each search's results
for (const [searchIndex, rows] of allRows.entries()) {
console.log(`Results from search ${searchIndex + 1}:`);
for (const row of rows) {
console.log(` - ${row.id}: ${row.metadata?.title ?? 'N/A'}`);
}
}
let results = collection.search(vec![search1, search2, search3]).await?;
let ids_1 = &results.ids[0]; // IDs from search1
let ids_2 = &results.ids[1]; // IDs from search2
let ids_3 = &results.ids[2]; // IDs from search3
Test multiple query variations to find the most relevant results.
<CodeGroup> ```python Python # Compare different query variations query_variations = [ "machine learning", "machine learning algorithms and applications", "modern machine learning techniques" ]searches = [ Search() .rank(Knn(query=q)) .limit(10) .select(K.DOCUMENT, K.SCORE, "title") for q in query_variations ]
results = collection.search(searches)
for i, query_name in enumerate(["Original", "Expanded", "Refined"]): print(f"{query_name} Query Top Result:") if results.scores[i]: print(f" Score: {results.scores[i][0]:.3f}")
```typescript TypeScript
// Compare different query variations
const queryVariations = [
"machine learning",
"machine learning algorithms and applications",
"modern machine learning techniques"
];
const searches = queryVariations.map(q =>
new Search()
.rank(Knn({ query: q }))
.limit(10)
.select(K.DOCUMENT, K.SCORE, "title")
);
const results = await collection.search(searches);
// Compare top results from each variation
["Original", "Expanded", "Refined"].forEach((queryName, i) => {
console.log(`${queryName} Query Top Result:`);
if (results.scores[i] && results.scores[i].length > 0) {
console.log(` Score: ${results.scores[i][0].toFixed(3)}`);
}
});
Compare different ranking approaches on the same query.
<CodeGroup> ```python Python # Test different ranking strategies searches = [ # Strategy A: Pure KNN Search() .rank(Knn(query="artificial intelligence")) .limit(10) .select(K.SCORE, "title"),# Strategy B: Weighted KNN
Search()
.rank(Knn(query="artificial intelligence") * 0.8 + 0.2)
.limit(10)
.select(K.SCORE, "title"),
# Strategy C: Hybrid with RRF
Search()
.rank(Rrf([
Knn(query="artificial intelligence", return_rank=True),
Knn(query="artificial intelligence", key="sparse_embedding", return_rank=True)
]))
.limit(10)
.select(K.SCORE, "title")
]
results = collection.search(searches)
```typescript TypeScript
// Test different ranking strategies
const searches = [
// Strategy A: Pure KNN
new Search()
.rank(Knn({ query: "artificial intelligence" }))
.limit(10)
.select(K.SCORE, "title"),
// Strategy B: Weighted KNN
new Search()
.rank(Knn({ query: "artificial intelligence" }).multiply(0.8).add(0.2))
.limit(10)
.select(K.SCORE, "title"),
// Strategy C: Hybrid with RRF
new Search()
.rank(Rrf({
ranks: [
Knn({ query: "artificial intelligence", returnRank: true }),
Knn({ query: "artificial intelligence", key: "sparse_embedding", returnRank: true })
]
}))
.limit(10)
.select(K.SCORE, "title")
];
const results = await collection.search(searches);
Apply different filters to explore different subsets of your data.
<CodeGroup> ```python Python # Different category filters categories = ["technology", "science", "business"]searches = [ Search() .where(K("category") == category) .rank(Knn(query="artificial intelligence")) .limit(5) .select("title", "category", K.SCORE) for category in categories ]
results = collection.search(searches)
```typescript TypeScript
// Different category filters
const categories = ["technology", "science", "business"];
const searches = categories.map(category =>
new Search()
.where(K("category").eq(category))
.rank(Knn({ query: "artificial intelligence" }))
.limit(5)
.select("title", "category", K.SCORE)
);
const results = await collection.search(searches);
Batch operations are significantly faster than running searches sequentially:
<CodeGroup> ```python Python # Sequential execution (slow) results = [] for search in searches: result = collection.search(search) # Separate API call each time results.append(result)results = collection.search(searches) # Single API call for all
```typescript TypeScript
// Sequential execution (slow)
const results = [];
for (const search of searches) {
const result = await collection.search(search); // Separate API call each time
results.push(result);
}
// Batch execution (fast)
const results2 = await collection.search(searches); // Single API call for all
Batch operations reduce network overhead and enable server-side parallelization, often providing 3-10x speedup depending on the number and complexity of searches.
Passing an empty list returns an empty result.
For Chroma Cloud users, batch operations may be subject to quota limits on the total number of searches per request.
Different searches can select different fields - each search's results will contain only its requested fields.
<CodeGroup> ```python Python searches = [ Search().limit(5).select(K.DOCUMENT), # Only documents Search().limit(5).select(K.SCORE, "title"), # Scores and title Search().limit(5).select_all() # Everything ]results = collection.search(searches)
```typescript TypeScript
const searches = [
new Search().limit(5).select(K.DOCUMENT), // Only documents
new Search().limit(5).select(K.SCORE, "title"), // Scores and title
new Search().limit(5).selectAll() // Everything
];
const results = await collection.search(searches);
// results.documents[0] will have values
// results.documents[1] will be null (not selected)
// results.documents[2] will have values
Here's a practical example using batch operations to find and compare relevant documents across different categories:
<CodeGroup> ```python Python from chromadb import Search, K, Knndef compare_category_relevance(collection, query_text, categories): """Find top results in each category for the same query"""
# Build searches for each category
searches = [
Search()
.where(K("category") == cat)
.rank(Knn(query=query_text))
.limit(3)
.select(K.DOCUMENT, K.SCORE, "title", "category")
for cat in categories
]
# Execute batch search
results = collection.search(searches)
all_rows = results.rows()
# Process and display results
for cat_index, category in enumerate(categories):
print(f"\nTop results in {category}:")
rows = all_rows[cat_index]
if not rows:
print(" No results found")
continue
for i, row in enumerate(rows, 1):
title = row.get('metadata', {}).get('title', 'Untitled')
score = row.get('score', 0)
preview = row.get('document', '')[:100]
print(f" {i}. {title}")
print(f" Score: {score:.3f}")
print(f" Preview: {preview}...")
categories = ["technology", "science", "business", "health"] query_text = "artificial intelligence applications"
compare_category_relevance(collection, query_text, categories)
```typescript TypeScript
import { Search, K, Knn, type Collection } from 'chromadb';
async function compareCategoryRelevance(
collection: Collection,
queryText: string,
categories: string[]
) {
// Find top results in each category for the same query
// Build searches for each category
const searches = categories.map(cat =>
new Search()
.where(K("category").eq(cat))
.rank(Knn({ query: queryText }))
.limit(3)
.select(K.DOCUMENT, K.SCORE, "title", "category")
);
// Execute batch search
const results = await collection.search(searches);
const allRows = results.rows();
// Process and display results
for (const [catIndex, category] of categories.entries()) {
console.log(`\nTop results in ${category}:`);
const rows = allRows[catIndex];
if (!rows || rows.length === 0) {
console.log(" No results found");
continue;
}
for (const [i, row] of rows.entries()) {
const title = row.metadata?.title ?? 'Untitled';
const score = row.score ?? 0;
const preview = row.document?.substring(0, 100) ?? '';
console.log(` ${i+1}. ${title}`);
console.log(` Score: ${score.toFixed(3)}`);
console.log(` Preview: ${preview}...`);
}
}
}
// Usage
const categories = ["technology", "science", "business", "health"];
const queryText = "artificial intelligence applications";
await compareCategoryRelevance(collection, queryText, categories);
Example output:
Top results in technology:
1. AI in Software Development
Score: 0.234
Preview: The integration of artificial intelligence in modern software development has revolutionized...
2. Machine Learning Frameworks
Score: 0.312
Preview: Popular frameworks for building AI applications include TensorFlow, PyTorch, and...
Top results in science:
1. Neural Networks Research
Score: 0.289
Preview: Recent advances in neural network architectures have enabled breakthrough applications...
select_all() can consume significant memoryrows() method for easier result processing in batch operations