doc/user/content/transform-data/idiomatic-materialize-sql/first-value.md
The "first value in each group" query pattern returns the first value, according to some ordering, in each group.
{{< callout >}}
{{< idiomatic-sql/materialize-window-functions >}}
{{</ callout >}}
Idiomatic Materialize SQL: To find the first value in each group, use MIN() or MAX() aggregate function in a subquery.
<table> <thead> <tr> <th></th> <th></th> </tr> </thead> <tbody> <tr> <td><blue>Materialize SQL</blue></td> <td class="copyableCode">Use a subquery that uses the MIN() or MAX() aggregate function.
<div style="background-color: var(--code-block)">SELECT tableA.fieldA, tableA.fieldB, minmax.Z
FROM tableA,
(SELECT fieldA,
MIN(fieldZ),
MAX(fieldZ)
FROM tableA
GROUP BY fieldA) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;
<red>Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...)
window function for first value within groups
queries.</red>
-- Anti-pattern. Avoid. --
SELECT fieldA, fieldB,
FIRST_VALUE(fieldZ) OVER (PARTITION BY fieldA ORDER BY ...),
FIRST_VALUE(fieldZ) OVER (PARTITION BY fieldA ORDER BY ... DESC)
FROM tableA
ORDER BY fieldA, ...;
To further improve the memory usage of the idiomatic Materialize SQL, you can
specify a AGGREGATE INPUT GROUP SIZE query hint in
the idiomatic Materialize SQL.
SELECT tableA.fieldA, tableA.fieldB, minmax.Z
FROM tableA,
(SELECT fieldA,
MIN(fieldZ),
MAX(fieldZ)
FROM tableA
GROUP BY fieldA
OPTIONS (AGGREGATE INPUT GROUP SIZE = ...)
) minmax
WHERE tableA.fieldA = minmax.fieldA
ORDER BY fieldA ... ;
For more information on setting AGGREGATE INPUT GROUP SIZE, see
Optimization.
{{< note >}}
The example data can be found in the Appendix.
{{</ note >}}
Using idiomatic Materialize SQL, the following example finds the lowest item
price in each order and calculates the difference between the price of each item
in the order and the lowest price. The example uses a subquery that groups by
the order_id and selects MIN(price) to find the lowest price (i.e., first
value if ordered by ascending price values).
SELECT o.order_id, minmax.lowest_price, o.item, o.price,
o.price - minmax.lowest_price AS diff_lowest_price
FROM orders_view o,
(SELECT order_id,
MIN(price) AS lowest_price
FROM orders_view
GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
<red>Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...)
window function for first value within groups queries.</red>
-- Anti-pattern --
SELECT order_id,
FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price) AS lowest_price,
item,
price,
price - FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price) AS diff_lowest_price
FROM orders_view
ORDER BY order_id, item;
Using idiomatic Materialize SQL, the following example finds the highest item
price in each order and calculates the difference between the price of each item
in the order and the highest price. The example uses a subquery that groups by
the order_id and selects MAX(price) to find the highest price (i.e., first
value if ordered by descending price values).
SELECT o.order_id, minmax.highest_price, o.item, o.price,
o.price - minmax.highest_price AS diff_highest_price
FROM orders_view o,
(SELECT order_id,
MAX(price) AS highest_price
FROM orders_view
GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
<red>Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...)
window function for first value within groups
queries.</red>
-- Anti-pattern --
SELECT order_id,
FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price DESC) AS highest_price,
item,
price,
price - FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price DESC) AS diff_highest_price
FROM orders_view
ORDER BY order_id, item;
Using idiomatic Materialize SQL, the following example finds the lowest and the
highest item price in each order and calculates the difference between each item
in the order and these prices. The example uses a subquery that groups by the
order_id and selects MIN(price) as the lowest price (i.e., first
value if ordered by price values) and MAX(price) as the
highest price (i.e., first
value if ordered by descending price values)
SELECT o.order_id, minmax.lowest_price, minmax.highest_price, o.item, o.price,
o.price - minmax.lowest_price AS diff_lowest_price,
o.price - minmax.highest_price AS diff_highest_price
FROM orders_view o,
(SELECT order_id,
MIN(price) AS lowest_price,
MAX(price) AS highest_price
FROM orders_view
GROUP BY order_id) minmax
WHERE o.order_id = minmax.order_id
ORDER BY o.order_id, o.item;
<red>Avoid the use of FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...)
window function for first value within groups
queries.</red>
-- Anti-pattern --
SELECT order_id,
FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price) AS lowest_price,
FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price DESC) AS highest_price,
item,
price,
price - FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price) AS diff_lowest_price,
price - FIRST_VALUE(price)
OVER (PARTITION BY order_id ORDER BY price DESC) AS diff_highest_price
FROM orders_view
ORDER BY order_id, item;