docs/documentation/aggregates/metrics/cardinality.mdx
The cardinality aggregation estimates the number of distinct values in a field.
<CodeGroup>

```sql SQL
SELECT pdb.agg('{"cardinality": {"field": "rating"}}')
FROM mock_items
WHERE id @@@ pdb.all();
```

```typescript Drizzle
import { search } from "@paradedb/drizzle-paradedb";

await db
  .select({
    agg: search.agg({ cardinality: { field: "rating" } }),
  })
  .from(mockItems)
  .where(search.all(mockItems.id));
```

```python Django
from paradedb import Agg, All, ParadeDB

MockItem.objects.filter(
    id=ParadeDB(All())
).aggregate(agg=Agg('{"cardinality": {"field": "rating"}}'))
```

```python SQLAlchemy
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import pdb, search

stmt = (
    select(pdb.agg({"cardinality": {"field": "rating"}}))
    .select_from(MockItem)
    .where(search.all(MockItem.id))
)

with Session(engine) as session:
    session.execute(stmt).all()
```

```ruby Rails
MockItem.search(:id)
  .match_all
  .facets_agg(agg: { cardinality: { field: "rating" } })
```

```csharp Entity Framework
await dbContext
    .MockItems.Where(item => EF.Functions.All(item.Id))
    .Select(item => EF.Functions.Agg(new { cardinality = new { field = "rating" } }))
    .ToListAsync();
```

</CodeGroup>
```
      agg
----------------
 {"value": 5.0}
(1 row)
```
Unlike an exact `COUNT(DISTINCT ...)` in SQL, which returns a precise count but can be computationally expensive on large datasets, the cardinality aggregation uses the HyperLogLog++ algorithm to
closely approximate the number of distinct values.
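
To make the trade-off concrete, the following is a minimal Python sketch of the basic HyperLogLog idea. It is a simplified illustration only: it is not HyperLogLog++ and not the implementation used by ParadeDB or Tantivy. The function name `hll_estimate` and the precision parameter `p` are hypothetical names chosen for this example. An exact count must remember every distinct value it has seen, whereas the sketch keeps a fixed number of small registers regardless of input size.

```python
import hashlib
import math


def hll_estimate(values, p=12):
    """Estimate the number of distinct items with a basic HyperLogLog sketch.

    Illustrative only: `p` (hypothetical parameter) sets 2**p registers.
    """
    m = 1 << p                      # number of registers
    registers = [0] * m
    bits = 64 - p                   # hash bits left after choosing a register

    for v in values:
        # Derive a 64-bit hash from the value's string form.
        h = int.from_bytes(hashlib.sha1(str(v).encode()).digest()[:8], "big")
        idx = h >> bits             # top p bits select a register
        rest = h & ((1 << bits) - 1)
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits
        # (bits + 1 when all remaining bits are zero).
        rank = bits - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)

    # Bias-correction constant from the HyperLogLog paper, valid for m >= 128.
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting while many
    # registers are still empty.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)
    return raw


# Exact counting stores every distinct value; the sketch stores only m registers.
ratings = [1, 2, 3, 4, 5] * 100
exact = len(set(ratings))
approx = hll_estimate(ratings)
print(exact, round(approx))
```

The estimate is a floating-point number, which is consistent with the aggregation returning a value such as `5.0` rather than an integer. With 2^12 registers the expected relative error of the basic algorithm is roughly 1.04 / sqrt(4096), or about 1.6%.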
See the Tantivy documentation for all available options.