documentation/query/functions/window-functions/reference.md
This page provides detailed documentation for each window function. For an introduction to window functions and how they work, see the Overview. For syntax details on the OVER clause, see OVER Clause Syntax.
These functions respect the frame clause and calculate values over the specified window frame.
Calculates the average of values over the window frame. Supports standard arithmetic average, Exponential Moving Average (EMA), and Volume-Weighted Exponential Moving Average (VWEMA).
Syntax:
-- Standard average
avg(value) OVER (window_definition)
-- Exponential Moving Average (EMA)
avg(value, kind, param) OVER (window_definition)
-- Volume-Weighted Exponential Moving Average (VWEMA)
avg(value, kind, param, volume) OVER (window_definition)
Arguments:
value: Numeric column (short, int, long, float, double) to calculate the average ofkind (EMA/VWEMA): Smoothing mode - 'alpha', 'period', or a time unit ('second', 'minute', 'hour', 'day', 'week')param (EMA/VWEMA): Parameter for the smoothing mode (see below)volume (VWEMA only): Numeric column representing volume weightsReturn value:
double - The average of value for rows in the window frameDescription:
avg() operates on the window defined by PARTITION BY, ORDER BY, and frame specification. It respects the frame clause, calculating a separate average for each row based on its corresponding window.
Use avg() as a window function when you need to compare individual values against their surrounding context. Common use cases include:
Example:
SELECT
symbol,
price,
timestamp,
avg(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS moving_avg
FROM trades
WHERE timestamp IN '[$today]';
The EMA variant applies exponential smoothing, giving more weight to recent values. It supports three smoothing modes:
| Mode | kind | param | Description |
|---|---|---|---|
| Direct alpha | 'alpha' | 0 < α ≤ 1 | Use smoothing factor directly |
| Period-based | 'period' | N | N-period EMA where α = 2 / (N + 1) |
| Time-weighted | 'second', 'minute', 'hour', 'day', 'week' | τ (tau) | Time-weighted decay where α = 1 - exp(-Δt / τ) |
EMA formula:
EMA = α × current_value + (1 - α) × previous_EMA
Examples:
SELECT
symbol,
price,
timestamp,
avg(price, 'alpha', 0.2) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS ema_alpha
FROM trades
WHERE timestamp IN '[$today]';
SELECT
symbol,
price,
timestamp,
avg(price, 'period', 10) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS ema_10
FROM trades
WHERE timestamp IN '[$today]';
SELECT
symbol,
price,
timestamp,
avg(price, 'minute', 5) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS ema_5min
FROM trades
WHERE timestamp IN '[$today]';
:::note EMA behavior
TIMESTAMP and TIMESTAMP_NS precisionVWEMA combines exponential smoothing with volume weighting, useful for financial analysis where trading volume affects price significance.
VWEMA formula:
numerator = α × price × volume + (1 - α) × prev_numerator
denominator = α × volume + (1 - α) × prev_denominator
VWEMA = numerator / denominator
For time-weighted mode: α = 1 - exp(-Δt / τ)
Examples:
SELECT
symbol,
price,
timestamp,
avg(price, 'alpha', 0.1, amount) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS vwema_alpha
FROM trades
WHERE timestamp IN '[$today]';
SELECT
symbol,
price,
timestamp,
avg(price, 'period', 10, amount) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS vwema_10
FROM trades
WHERE timestamp IN '[$today]';
SELECT
symbol,
price,
timestamp,
avg(price, 'hour', 1, amount) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS vwema_1h
FROM trades
WHERE timestamp IN '[$today]';
:::note VWEMA behavior
Counts rows or non-null values over the window frame.
Syntax:
count(*) OVER (window_definition)
count(value) OVER (window_definition)
Arguments:
*: Counts all rows in the framevalue: Counts non-null values onlyReturn value:
long - Number of rows or non-null values in the window frameDescription:
Use count() as a window function when you need to track frequency or density of events over time. Common use cases include:
Example:
SELECT
symbol,
timestamp,
count(*) OVER (
PARTITION BY symbol
ORDER BY timestamp
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW
) AS trades_last_second
FROM trades
WHERE timestamp IN '[$today]';
Calculates the sum of values over the window frame. Commonly used for running totals.
Syntax:
sum(value) OVER (window_definition)
Arguments:
value: Numeric column (short, int, long, float, double)Return value:
double - The sum of value for rows in the window frameDescription:
Use sum() as a window function when you need to track accumulation or totals over a sequence. Common use cases include:
Example:
SELECT
symbol,
amount,
timestamp,
sum(amount) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_amount
FROM trades
WHERE timestamp IN '[$today]';
Calculates the sum of values over the window frame using the Kahan summation algorithm for improved floating-point precision. This is particularly useful when summing many floating-point values where standard summation might accumulate rounding errors.
Syntax:
ksum(value) OVER (window_definition)
Arguments:
value: Numeric column (short, int, long, float, double) to sumReturn value:
double - The sum of value for rows in the window frame with improved precisionDescription:
ksum() uses the Kahan summation algorithm which maintains a running compensation for lost low-order bits. This is useful when summing many floating-point numbers where the standard sum() function might accumulate significant rounding errors.
Use ksum() instead of sum() when precision matters. Common use cases include:
Example:
SELECT
symbol,
price,
timestamp,
ksum(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_price
FROM trades
WHERE timestamp IN '[$today]';
SELECT
symbol,
price,
timestamp,
ksum(price) OVER (
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS rolling_sum
FROM trades
WHERE timestamp IN '[$today]';
Returns the minimum value within the window frame.
Syntax:
min(value) OVER (window_definition)
Arguments:
value: Numeric column (short, int, long, float, double)Return value:
Description:
Use min() as a window function when you need to track lowest values over a range. Common use cases include:
Example:
SELECT
symbol,
price,
timestamp,
min(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS lowest_price
FROM trades
WHERE timestamp IN '[$today]';
Returns the maximum value within the window frame.
Syntax:
max(value) OVER (window_definition)
Arguments:
value: Numeric column (short, int, long, float, double)Return value:
Description:
Use max() as a window function when you need to track highest values over a range. Common use cases include:
Example:
SELECT
symbol,
price,
timestamp,
max(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS highest_price
FROM trades
WHERE timestamp IN '[$today]';
Returns the first value in the window frame. Supports IGNORE NULLS clause.
Syntax:
first_value(value) [(IGNORE|RESPECT) NULLS]
OVER ([PARTITION BY partition_expression]
[ORDER BY sort_expression]
[frame_clause])
Arguments:
value: Column or expression to get value fromIGNORE NULLS (optional): Skip null valuesRESPECT NULLS (default): Include null valuesReturn value:
IGNORE NULLS)Description:
Use first_value() when you need to reference the starting point of a sequence. Common use cases include:
IGNORE NULLS to carry forward the last known value when data is sparseExample:
SELECT
symbol,
price,
timestamp,
first_value(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS first_price,
first_value(price) IGNORE NULLS OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS first_non_null_price
FROM trades
WHERE timestamp IN '[$today]';
Returns the last value in the window frame. Supports IGNORE NULLS clause.
Syntax:
last_value(value) [(IGNORE|RESPECT) NULLS]
OVER ([PARTITION BY partition_expression]
[ORDER BY sort_expression]
[frame_clause])
Arguments:
value: Column or expression to get value fromIGNORE NULLS (optional): Skip null valuesRESPECT NULLS (default): Include null valuesReturn value:
IGNORE NULLS)Description:
Use last_value() when you need to reference the most recent or ending value in a sequence. Common use cases include:
IGNORE NULLS to get the most recent non-null value for sparse dataFrame behavior:
ORDER BY or frame clause: default is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGORDER BY but no frame clause: default is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROWExample:
SELECT
timestamp,
price,
last_value(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS last_price,
last_value(price) IGNORE NULLS OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS last_non_null_price
FROM trades
WHERE timestamp IN '[$today]';
This example:
last_price)last_non_null_price)RESPECT NULLS (default) and IGNORE NULLS behaviorThese functions assign ranks or row numbers. They ignore the frame clause and operate on the entire partition.
Assigns a unique sequential number to each row within its partition, starting at 1.
Syntax:
row_number() OVER (window_definition)
Arguments:
Return value:
long type)Description:
row_number() assigns unique numbers even when rows have equal values in the ORDER BY column. The assignment among equal values is non-deterministic.
Use row_number() when you need unique sequential identifiers within groups. Common use cases include:
row_number() = 1)Example:
SELECT
symbol,
price,
timestamp,
row_number() OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS trade_number
FROM trades
WHERE timestamp IN '[$today]';
Assigns ranks within a partition. Rows with equal values get the same rank, with gaps in the sequence.
Syntax:
rank() OVER (window_definition)
Arguments:
Return value:
long type)Description:
With rank(), if two rows tie for rank 2, the next row gets rank 4 (not 3). The rank equals the row_number of the first row in its peer group.
Use rank() when ties should share a rank and you want gaps to reflect the true position. Common use cases include:
Example:
SELECT
symbol,
price,
timestamp,
rank() OVER (
PARTITION BY symbol
ORDER BY price DESC
) AS price_rank
FROM trades
WHERE timestamp IN '[$today]';
Assigns ranks within a partition. Rows with equal values get the same rank, with no gaps in the sequence.
Syntax:
dense_rank() OVER (window_definition)
Arguments:
Return value:
long type)Description:
Unlike rank(), dense_rank() produces consecutive rank numbers. If two rows tie for rank 2, the next row gets rank 3.
Use dense_rank() when you need consecutive rank values without gaps. Common use cases include:
Example:
SELECT
symbol,
price,
timestamp,
dense_rank() OVER (
PARTITION BY symbol
ORDER BY price DESC
) AS price_rank
FROM trades
WHERE timestamp IN '[$today]';
Returns the relative rank of the current row within its partition as a value between 0 and 1.
Syntax:
percent_rank() OVER (window_definition)
Arguments:
Return value:
double between 0 and 1Description:
percent_rank() calculates the relative rank using the formula:
(rank - 1) / (total_rows - 1)
Where rank is the value that would be returned by rank() for the same row. This produces values from 0 (first row) to 1 (last row).
Use percent_rank() when you need to understand where a value falls within a distribution as a percentage rather than an absolute position. Common use cases include:
percent_rank < 0.1)Special cases:
ORDER BY, all rows are peers with rank 1, so percent_rank returns 0 for all rowsExample:
SELECT
symbol,
price,
timestamp,
percent_rank() OVER (
PARTITION BY symbol
ORDER BY price DESC
) AS price_percentile
FROM trades
WHERE timestamp IN '[$today]';
SELECT
symbol,
price,
timestamp,
rank() OVER (ORDER BY price DESC) AS rank,
percent_rank() OVER (ORDER BY price DESC) AS percent_rank
FROM trades
WHERE timestamp IN '[$today]'
AND symbol = 'BTC-USDT';
| symbol | price | rank | percent_rank |
|---|---|---|---|
| BTC-USDT | 105 | 1 | 0.0 |
| BTC-USDT | 103 | 2 | 0.25 |
| BTC-USDT | 101 | 3 | 0.5 |
| BTC-USDT | 101 | 3 | 0.5 |
| BTC-USDT | 99 | 5 | 1.0 |
In this example, percent_rank shows where each price falls relative to others: 0.0 means highest price, 1.0 means lowest, and 0.5 means middle of the distribution. Tied values (both 101) receive the same percent rank.
The following table shows how all four ranking functions behave on the same data, ordered by price descending:
| price | row_number | rank | dense_rank | percent_rank |
|---|---|---|---|---|
| 105 | 1 | 1 | 1 | 0.0 |
| 103 | 2 | 2 | 2 | 0.25 |
| 101 | 3 | 3 | 3 | 0.5 |
| 101 | 4 | 3 | 3 | 0.5 |
| 99 | 5 | 5 | 4 | 1.0 |
Key differences:
row_number(): Always unique, sequential (1, 2, 3, 4, 5). Ties get different numbers (non-deterministic order).rank(): Ties share the same rank, with gaps after ties (1, 2, 3, 3, 5). The gap reflects the true position.dense_rank(): Ties share the same rank, no gaps (1, 2, 3, 3, 4). Consecutive integers only.percent_rank(): Relative position as 0-1 value. Ties share the same value. Formula: (rank - 1) / (total_rows - 1).These functions access values from other rows relative to the current row. They ignore frame clauses.
Accesses data from a previous row without a self-join.
Syntax:
lag(value [, offset [, default]]) [(IGNORE|RESPECT) NULLS]
OVER ([PARTITION BY partition_expression] [ORDER BY sort_expression])
Arguments:
value: Column or expression to retrieveoffset (optional): Number of rows back. Default is 1default (optional): Value when offset exceeds partition bounds. Default is NULLIGNORE NULLS (optional): Skip null values when counting offsetRESPECT NULLS (default): Include null values in offset countingReturn value:
Description:
Use lag() when you need to compare the current row with previous values. Common use cases include:
Behavior:
offset is 0, returns current row valueROWS/RANGE) are ignoredORDER BY, uses table scan orderExample:
SELECT
timestamp,
price,
lag(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS previous_price,
lag(price, 2, 0.0) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS price_two_rows_back
FROM trades
WHERE timestamp IN '[$today]';
This example:
previous_price)price_two_rows_back)Accesses data from a subsequent row without a self-join.
Syntax:
lead(value [, offset [, default]]) [(IGNORE|RESPECT) NULLS]
OVER ([PARTITION BY partition_expression] [ORDER BY sort_expression])
Arguments:
value: Column or expression to retrieveoffset (optional): Number of rows forward. Default is 1default (optional): Value when offset exceeds partition bounds. Default is NULLIGNORE NULLS (optional): Skip null values when counting offsetRESPECT NULLS (default): Include null values in offset countingReturn value:
Description:
Use lead() when you need to look ahead to future values in the sequence. Common use cases include:
Behavior:
offset is 0, returns current row valueROWS/RANGE) are ignoredORDER BY, uses table scan orderExample:
SELECT
timestamp,
price,
lead(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS next_price,
lead(price, 2, 0.0) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS price_after_next
FROM trades
WHERE timestamp IN '[$today]';
This example:
next_price)price_after_next)DECLARE @best_bid := bids[1,1]
SELECT
timestamp,
symbol,
@best_bid AS best_bid,
avg(@best_bid) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS bid_moving_avg
FROM market_data
WHERE timestamp IN '[$today]';
DECLARE
@best_bid := bids[1,1],
@volume_l1 := bids[2,1]
SELECT
timestamp, symbol,
@best_bid AS bid_price_l1,
@volume_l1 AS bid_volume_l1,
sum(@volume_l1) OVER (
PARTITION BY symbol ORDER BY timestamp
ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
) AS bid_volume_l1_5rows
FROM market_data
WHERE timestamp IN '[$today]';
DECLARE
@best_bid := bids[1,1],
@volume_l1 := bids[2,1]
SELECT
timestamp,
sum(@volume_l1) OVER (
ORDER BY timestamp
RANGE BETWEEN '1' MINUTE PRECEDING AND CURRENT ROW
) AS bid_volume_1min
FROM market_data
WHERE timestamp IN '[$today]' AND symbol = 'GBPUSD';
This example uses a named window to avoid repeating the same window definition:
SELECT
timestamp,
symbol,
COUNT(*) OVER w AS updates_per_min,
COUNT(CASE WHEN side = 'buy' THEN 1 END) OVER w AS buys_per_minute,
COUNT(CASE WHEN side = 'sell' THEN 1 END) OVER w AS sells_per_minute
FROM trades
WHERE timestamp IN '[$today]' AND symbol = 'BTC-USDT'
WINDOW w AS (ORDER BY timestamp RANGE BETWEEN 60000000 PRECEDING AND CURRENT ROW);
ORDER BY clause outside the OVER clause to ensure consistent ordering.row_number, rank, dense_rank, percent_rank) and offset functions (lag, lead) ignore frame specifications.RANGE frames with timestamp columns.avg, sum, ksum, count, min, max) support numeric types: short, int, long, float, double. The decimal type is not supported.ntile() and cume_dist() are not currently supported.