web/book/src/reference/stdlib/distinct.md
PRQL doesn't have a specific distinct keyword. Instead, duplicate tuples in a
relation can be removed by using group and take 1:
```prql
from employees
select department
group employees.* (
  take 1
)
```
This also works with a wildcard:
```prql
from employees
group employees.* (take 1)
```
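To see what this deduplication does to actual rows, here is a minimal Python sketch using the standard-library sqlite3 module. The sample rows are made up for illustration, and the SELECT DISTINCT queries are an assumption about the general shape of SQL these pipelines correspond to, not verbatim compiler output.

```python
import sqlite3

# Hypothetical sample data; only the table name comes from the examples above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, department TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "Sales", 31), ("Bob", "Sales", 45), ("Cy", "IT", 28), ("Cy", "IT", 28)],
)

# Grouping by every selected column and keeping one row per group
# behaves like SELECT DISTINCT over those columns.
departments = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT department FROM employees")
)

# The wildcard form deduplicates whole rows instead:
# the two identical ("Cy", "IT", 28) rows collapse into one.
whole_rows = conn.execute("SELECT DISTINCT * FROM employees").fetchall()
```

Here `departments` holds each department once, and `whole_rows` keeps every row that differs from the others in at least one column.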
To select a single row from each group, group can be combined with sort and take:
```prql
# youngest employee from each department
from employees
group department (
  sort age
  take 1
)
```
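The sort-then-take pattern can be sketched in SQL terms as "number the rows in each partition, keep the first". The following Python example runs such a query through sqlite3. The sample rows and the `rn` alias are hypothetical, and the query is hand-written for illustration rather than copied from the compiler. It assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, department TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "Sales", 31), ("Bob", "Sales", 45), ("Cy", "IT", 28), ("Dee", "IT", 52)],
)

# Number the rows within each department from youngest to oldest,
# then keep only the first row of each partition.
query = """
SELECT first_name, department, age
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY department ORDER BY age) AS rn
    FROM employees
)
WHERE rn = 1
ORDER BY department
"""
youngest = conn.execute(query).fetchall()
```

With this data, `youngest` contains Cy for IT and Ann for Sales.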
Note that the compiler can't always emit DISTINCT: when the columns in the group
aren't all the available columns, it needs to use a window function instead:
```prql
from employees
group {first_name, last_name} (take 1)
```
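A small experiment shows why DISTINCT is not enough here. The Python sketch below uses sqlite3 with made-up rows. The window-function query is an illustrative approximation, not the compiler's exact output, and the `department` column and `rn` alias are assumptions added for the demonstration.

```python
import sqlite3

# Hypothetical sample data: "Ann Lee" appears twice with different departments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "Lee", "Sales"), ("Ann", "Lee", "IT"), ("Bob", "Ray", "IT")],
)

# DISTINCT compares whole rows, so both "Ann Lee" rows survive.
distinct_rows = conn.execute("SELECT DISTINCT * FROM employees").fetchall()

# A window function keeps one full row per (first_name, last_name) pair.
# Without an ORDER BY in the window, which row is kept is unspecified.
query = """
SELECT first_name, last_name, department
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY first_name, last_name) AS rn
    FROM employees
)
WHERE rn <= 1
"""
one_per_name = conn.execute(query).fetchall()
names = sorted((first, last) for first, last, _ in one_per_name)
```

DISTINCT leaves all three rows in place because they differ in `department`, while the partitioned query returns exactly one row for each name pair.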