Migration Guide

import { Callout } from '/snippets/callout.mdx';

<Callout> The `query()` and `get()` methods will continue to be supported, so migration to the Search API is optional. </Callout>

Parameter Mapping

<Callout> The Search API is available in Chroma Cloud. This guide uses dictionary syntax for minimal migration effort. </Callout>

query() Parameters

| Legacy `query()` | Search API | Notes |
| --- | --- | --- |
| `query_embeddings` | `rank={"$knn": {"query": ...}}` | Can use text or embeddings |
| `query_texts` | `rank={"$knn": {"query": "text"}}` | Text queries now supported |
| `query_images` | Not yet supported | Image queries coming in future release |
| `query_uris` | Not yet supported | URI queries coming in future release |
| `n_results` | `limit` | Direct mapping |
| `ids` | `where={"#id": {"$in": [...]}}` | Filter by IDs |
| `where` | `where` | Same syntax |
| `where_document` | `where={"#document": {...}}` | Use `#document` field |
| `include` | `select` | See field mapping below |
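
The mapping above can be sketched as a small translation helper. This is only an illustration: `legacy_query_to_search_args` is a hypothetical function, not part of `chromadb`, and it handles a single query rather than a batch. It assumes the resulting dictionary can be passed as keyword arguments to `Search(...)`, as in the examples in this guide.

```python
from typing import Any, Optional


def legacy_query_to_search_args(
    query_text: Optional[str] = None,
    query_embedding: Optional[list] = None,
    n_results: int = 10,
    ids: Optional[list] = None,
    where: Optional[dict] = None,
    where_document: Optional[dict] = None,
) -> dict:
    """Hypothetical helper: translate legacy query() arguments for one query
    into rank/where/limit values following the table above."""
    if (query_text is None) == (query_embedding is None):
        raise ValueError("pass exactly one of query_text or query_embedding")
    query: Any = query_text if query_text is not None else query_embedding

    # Gather filter clauses: `where` is unchanged, `ids` and `where_document`
    # move under the #id and #document fields.
    clauses = []
    if where:
        clauses.append(where)
    if ids:
        clauses.append({"#id": {"$in": list(ids)}})
    if where_document:
        clauses.append({"#document": where_document})

    args = {"rank": {"$knn": {"query": query}}, "limit": n_results}
    if len(clauses) == 1:
        args["where"] = clauses[0]
    elif len(clauses) > 1:
        # Several clauses are combined with $and
        args["where"] = {"$and": clauses}
    return args
```

Under that assumption, the output would be used as `collection.search(Search(**args))`.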

get() Parameters

| Legacy `get()` | Search API | Notes |
| --- | --- | --- |
| `ids` | `where={"#id": {"$in": [...]}}` | Filter by IDs |
| `where` | `where` | Same syntax |
| `where_document` | `where={"#document": {...}}` | Use `#document` field |
| `limit` | `limit` | Direct mapping |
| `offset` | `limit={"offset": ...}` | Part of limit dict |
| `include` | `select` | See field mapping below |

Include/Select Field Mapping

| Legacy `include` | Search API `select` | Description |
| --- | --- | --- |
| `"ids"` | Always included | IDs are always returned |
| `"documents"` | `"#document"` | Document content |
| `"metadatas"` | `"#metadata"` | All metadata fields |
| `"embeddings"` | `"#embedding"` | Vector embeddings |
| `"distances"` | `"#score"` | Distance/score from query |
| `"uris"` | `"#uri"` | Document URIs |

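As a rough sketch, the field mapping can be written as a lookup table. `INCLUDE_TO_SELECT` and `include_to_select` are hypothetical names used for illustration only. The helper returns the Search API field names as a plain list; how that list is handed to `select` is not covered in this guide and is left to the caller.

```python
# Legacy include value -> Search API field name (from the table above)
INCLUDE_TO_SELECT = {
    "documents": "#document",
    "metadatas": "#metadata",
    "embeddings": "#embedding",
    "distances": "#score",
    "uris": "#uri",
}


def include_to_select(include):
    """Hypothetical helper: translate a legacy include list into field names."""
    fields = []
    for name in include:
        if name == "ids":
            continue  # IDs are always returned, so nothing to select
        if name not in INCLUDE_TO_SELECT:
            raise ValueError(f"no Search API field listed for include value {name!r}")
        field = INCLUDE_TO_SELECT[name]
        if field not in fields:  # drop duplicates, keep first-seen order
            fields.append(field)
    return fields
```
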
Examples

<CodeGroup>
```python Python
# Legacy API
results = collection.query(
    query_embeddings=[[0.1, 0.2, 0.3]],
    n_results=10
)

# Search API - with text query
from chromadb import Search

results = collection.search(
    Search(
        rank={"$knn": {"query": "machine learning"}},
        limit=10
    )
)
```

```typescript TypeScript
// Legacy API
const results = await collection.query({
  queryEmbeddings: [[0.1, 0.2, 0.3]],
  nResults: 10
});

// Search API - with text query
import { Search } from 'chromadb';

const results2 = await collection.search(
  new Search({
    rank: { $knn: { query: "machine learning" } },
    limit: 10
  })
);
```

```rust Rust
use chroma::types::{QueryVector, RankExpr, SearchPayload};

// Legacy API
let results = collection
    .query(vec![vec![0.1, 0.2, 0.3]], Some(10), None, None, None)
    .await?;

// Search API - with embedding query
let results2 = collection
    .search(vec![SearchPayload::default()
        .rank(RankExpr::Knn {
            query: QueryVector::Dense(vec![0.1, 0.2, 0.3]),
            key: chroma::types::Key::Embedding,
            limit: 10,
            default: None,
            return_rank: false,
        })
        .limit(Some(10), 0)])
    .await?;
```
</CodeGroup>

Document Filtering

<CodeGroup>
```python Python
# Legacy API
results = collection.query(
    query_embeddings=[[0.1, 0.2, 0.3]],
    n_results=5,
    where_document={"$contains": "quantum"}
)

# Search API
results = collection.search(
    Search(
        rank={"$knn": {"query": "quantum computing"}},
        where={"#document": {"$contains": "quantum"}},
        limit=5
    )
)
```

```typescript TypeScript
// Legacy API
const results = await collection.query({
  queryEmbeddings: [[0.1, 0.2, 0.3]],
  nResults: 5,
  whereDocument: { $contains: "quantum" }
});

// Search API
const results2 = await collection.search(
  new Search({
    rank: { $knn: { query: "quantum computing" } },
    where: { "#document": { $contains: "quantum" } },
    limit: 5
  })
);
```
</CodeGroup>

Combined Filters

<CodeGroup>
```python Python
# Legacy API
results = collection.query(
    query_embeddings=[[0.1, 0.2, 0.3]],
    n_results=10,
    where={"category": "science"},
    where_document={"$contains": "quantum"}
)

# Search API - combine filters with $and
results = collection.search(
    Search(
        where={"$and": [
            {"category": "science"},
            {"#document": {"$contains": "quantum"}}
        ]},
        rank={"$knn": {"query": "quantum physics"}},
        limit=10
    )
)
```

```typescript TypeScript
// Legacy API
const results = await collection.query({
  queryEmbeddings: [[0.1, 0.2, 0.3]],
  nResults: 10,
  where: { category: "science" },
  whereDocument: { $contains: "quantum" }
});

// Search API - combine filters with $and
const results2 = await collection.search(
  new Search({
    where: {
      $and: [
        { category: "science" },
        { "#document": { $contains: "quantum" } }
      ]
    },
    rank: { $knn: { query: "quantum physics" } },
    limit: 10
  })
);
```
</CodeGroup>

Get by IDs

<CodeGroup>
```python Python
# Legacy API
results = collection.get(
    ids=["id1", "id2", "id3"]
)

# Search API
results = collection.search(
    Search(
        where={"#id": {"$in": ["id1", "id2", "id3"]}}
    )
)
```

```typescript TypeScript
// Legacy API
const results = await collection.get({
  ids: ["id1", "id2", "id3"]
});

// Search API
const results2 = await collection.search(
  new Search({
    where: { "#id": { $in: ["id1", "id2", "id3"] } }
  })
);
```
</CodeGroup>

Pagination

<CodeGroup>
```python Python
# Legacy API
results = collection.get(
    where={"status": "active"},
    limit=100,
    offset=50
)

# Search API
results = collection.search(
    Search(
        where={"status": "active"},
        limit={"limit": 100, "offset": 50}
    )
)
```

```typescript TypeScript
// Legacy API
const results = await collection.get({
  where: { status: "active" },
  limit: 100,
  offset: 50
});

// Search API
const results2 = await collection.search(
  new Search({
    where: { status: "active" },
    limit: { limit: 100, offset: 50 }
  })
);
```
</CodeGroup>
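
To walk through a result set page by page, the `limit` dictionaries can be generated up front. The sketch below is a plain-Python illustration: `page_limits` is a hypothetical helper, and it assumes the total number of records to fetch is already known.

```python
from typing import Dict, Iterator


def page_limits(total: int, page_size: int) -> Iterator[Dict[str, int]]:
    """Hypothetical helper: yield one {"limit", "offset"} dict per page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total, page_size):
        # The last page may be shorter than page_size
        yield {"limit": min(page_size, total - offset), "offset": offset}
```

Each yielded dictionary has the same shape as the `limit` argument in the Search API example above, so it could be passed as `Search(where=..., limit=page)`.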

Key Differences

Text Queries Now Supported

The Search API accepts text queries directly and converts them to embeddings automatically, using the collection's configured embedding function.

<CodeGroup>
```python Python
# Legacy API
collection.query(query_texts=["search text"])

# Search API - direct text query
collection.search(Search(rank={"$knn": {"query": "search text"}}))
```

```typescript TypeScript
// Legacy API
await collection.query({ queryTexts: ["search text"] });

// Search API - direct text query
await collection.search(
  new Search({ rank: { $knn: { query: "search text" } } })
);
```
</CodeGroup>

New Capabilities

  • Advanced filtering - Complex logical expressions
  • Custom ranking - Combine and transform ranking expressions
  • Hybrid search - RRF for combining multiple strategies
  • Selective fields - Return only needed fields
  • Flexible batch operations - Different parameters per search in batch

Flexible Batch Operations

The Search API allows different parameters for each search in a batch:

<CodeGroup>
```python Python
# Legacy - same parameters for all queries
results = collection.query(
    query_embeddings=[emb1, emb2, emb3],
    n_results=10,
    where={"category": "science"}  # Same filter for all
)

# Search API - different parameters per search
searches = [
    Search(rank={"$knn": {"query": "machine learning"}}, limit=10, where={"category": "science"}),
    Search(rank={"$knn": {"query": "neural networks"}}, limit=5, where={"category": "tech"}),
    Search(rank={"$knn": {"query": "artificial intelligence"}}, limit=20)  # No filter
]
results = collection.search(searches)
```

```typescript TypeScript
// Legacy - same parameters for all queries
const results = await collection.query({
  queryEmbeddings: [emb1, emb2, emb3],
  nResults: 10,
  where: { category: "science" }  // Same filter for all
});

// Search API - different parameters per search
const searches = [
  new Search({ rank: { $knn: { query: "machine learning" } }, limit: 10, where: { category: "science" } }),
  new Search({ rank: { $knn: { query: "neural networks" } }, limit: 5, where: { category: "tech" } }),
  new Search({ rank: { $knn: { query: "artificial intelligence" } }, limit: 20 })  // No filter
];
const results2 = await collection.search(searches);
```
</CodeGroup>
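
When migrating an existing batched `query()` call, one possible first step is to expand the shared parameters into one specification per query and then adjust individual entries. The sketch below uses plain dictionaries. `fan_out_searches` is a hypothetical helper, and it assumes each dictionary can later be turned into a search with `Search(**spec)`.

```python
from typing import Any, Dict, List, Optional


def fan_out_searches(
    query_texts: List[str],
    n_results: int = 10,
    where: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: expand a legacy batch (shared parameters)
    into one independent search specification per query text."""
    specs = []
    for text in query_texts:
        spec: Dict[str, Any] = {"rank": {"$knn": {"query": text}}, "limit": n_results}
        if where is not None:
            # Shallow copy so replacing one spec's filter keys leaves the others alone
            spec["where"] = dict(where)
        specs.append(spec)
    return specs


specs = fan_out_searches(
    ["machine learning", "neural networks"],
    n_results=10,
    where={"category": "science"},
)
# Each entry can now diverge from the shared defaults
specs[1]["limit"] = 5
specs[1]["where"]["category"] = "tech"
```
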

Migration Tips

  • Start with simple queries before complex ones
  • Test both APIs in parallel during migration
  • Use batch operations to reduce API calls
  • Text queries are now supported - use them directly in the Search API
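
For the parallel-testing tip, one simple check is to compare the IDs each API returns for the same query. The sketch below works on plain lists of IDs. `compare_id_lists` is a hypothetical helper, and extracting the ID list from each API's result object is left to the caller because result shapes are not covered in this guide.

```python
from typing import Any, Dict, List


def compare_id_lists(legacy_ids: List[str], search_ids: List[str]) -> Dict[str, Any]:
    """Hypothetical helper: summarize how two ranked ID lists differ."""
    legacy_set, search_set = set(legacy_ids), set(search_ids)
    union = legacy_set | search_set
    return {
        # True only when both APIs returned the same IDs in the same order
        "same_order": legacy_ids == search_ids,
        # Jaccard overlap: shared IDs divided by all distinct IDs
        "overlap": len(legacy_set & search_set) / len(union) if union else 1.0,
        "only_legacy": sorted(legacy_set - search_set),
        "only_search": sorted(search_set - legacy_set),
    }
```

A low overlap does not by itself indicate a bug, since the two calls may embed or rank the query differently, but it flags queries worth inspecting by hand.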

Next Steps