docs/documentation/filtering.mdx
Adding filters to text search is as simple as using PostgreSQL's built-in WHERE clauses and operators.
For instance, the following query filters out results that do not meet rating > 2.
from paradedb import Match, ParadeDB
MockItem.objects.filter(
description=ParadeDB(Match('running shoes', operator='OR')),
rating__gt=2
).values('description', 'rating', 'category')
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import search
stmt = (
select(MockItem.description, MockItem.rating, MockItem.category)
.where(search.match_any(MockItem.description, "running shoes"), MockItem.rating > 2)
)
with Session(engine) as session:
session.execute(stmt).all()
MockItem.search(:description)
.matching_any("running shoes")
.where(rating: 3..)
.select(:description, :rating, :category)
While not required, filtering performance over non-text columns can be improved by including them in the BM25 index.
When these columns are part of the index, WHERE clauses that reference them can be pushed down into the index scan itself.
This can result in faster query execution over large datasets.
For example, if rating and created_at are frequently used in filters, they can be added to the BM25 index during index creation:
CREATE INDEX search_idx ON mock_items
USING bm25(id, description, rating, created_at)
WITH (key_field = 'id');
Filter pushdown is currently supported for the following combinations of types and operators:
| Operator | Left Operand Type | Right Operand Type | Example |
|---|---|---|---|
=, <, >, <=, >=, <>, BETWEEN | int2 | int2 | WHERE rating = 2 |
int4 | int4 | ||
int8 | int8 | ||
int2 | int4 | ||
int2 | int8 | ||
int4 | int8 | ||
float4 | float4 | ||
float8 | float8 | ||
float4 | float8 | ||
numeric | numeric | WHERE price = 99.99 | |
date | date | ||
time | time | ||
timetz | timetz | ||
timestamp | timestamp | ||
timestamptz | timestamptz | ||
uuid | uuid | ||
= | bool | bool | WHERE in_stock = true |
IN, ANY, ALL | bool | bool[] | WHERE rating IN (1,2,3) |
int2 | int2[] | ||
int4 | int4[] | ||
int8 | int8[] | ||
int2 | int4[] | ||
int2 | int8[] | ||
int4 | int8[] | ||
float4 | float4[] | ||
float8 | float8[] | ||
float4 | float8[] | ||
date | date[] | ||
timetz | timetz[] | ||
timestamp | timestamp[] | ||
timestamptz | timestamptz[] | ||
uuid | uuid[] | ||
IS, IS NOT | bool | bool | WHERE in_stock IS true |
IS NULL, IS NOT NULL | bool | WHERE rating IS NOT NULL | |
int2 | |||
int4 | |||
int8 | |||
int2 | |||
int2 | |||
int4 | |||
float4 | |||
float8 | |||
float4 | |||
date | |||
time | |||
timetz | |||
timestamp | |||
timestamptz | |||
uuid |
Suppose we have a text filter that looks for an exact string match like category = 'Footwear':
from paradedb import ParadeDB, Term
MockItem.objects.filter(
description=ParadeDB(Term('shoes')),
category='Footwear'
).values('description', 'rating', 'category')
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import search
stmt = (
select(MockItem.description, MockItem.rating, MockItem.category)
.where(search.term(MockItem.description, "shoes"), MockItem.category == "Footwear")
)
with Session(engine) as session:
session.execute(stmt).all()
MockItem.search(:description)
.term("shoes")
.where(category: "Footwear")
.select(:description, :rating, :category)
To push down the category = 'Footwear' filter, category must be indexed using the literal tokenizer:
CREATE INDEX search_idx ON mock_items
USING bm25(id, description, (category::pdb.literal))
WITH (key_field = 'id');
Pushdown of set filters over text fields also requires the literal tokenizer:
<CodeGroup> ```sql SQL SELECT description, rating, category FROM mock_items WHERE description @@@ 'shoes' AND category IN ('Footwear', 'Apparel'); ```from paradedb import ParadeDB, Term
MockItem.objects.filter(
description=ParadeDB(Term('shoes')),
category__in=['Footwear', 'Apparel']
).values('description', 'rating', 'category')
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import search
stmt = (
select(MockItem.description, MockItem.rating, MockItem.category)
.where(search.term(MockItem.description, "shoes"), MockItem.category.in_(["Footwear", "Apparel"]))
)
with Session(engine) as session:
session.execute(stmt).all()
MockItem.search(:description)
.term("shoes")
.where(category: ["Footwear", "Apparel"])
.select(:description, :rating, :category)