content/shared/influxdb3-query-guides/sql/aggregate-select.md
An SQL query that aggregates data includes the following clauses:
{{< req type="key" >}}
SELECT: Specify fields, tags, and calculations to output from a
table or use the wildcard alias (*) to select all fields and tags
from a table.FROM: Specify the table to query data from.WHERE: Only return rows that meets the specified conditions--for example,
the time is within a time range, a tag has a specific value, or a field value
is above or below a specified threshold.GROUP BY: Group data that have the same values for specified columns and
expressions (for example, an aggregate function result).[!Note] For simplicity, the verb, "aggregate," in this guide refers to applying both aggregate and selector functions to a dataset.
Learn how to apply aggregate operations to your queried data:
Both aggregate and selector functions return a single row from each SQL group.
For example, if you GROUP BY room and perform an aggregate operation
in your SELECT clause, results include an aggregate value for each unique
value of room.
Use aggregate functions to aggregate values in a specified column for each group and return a single row per group containing the aggregate value.
<a href="/influxdb3/version/reference/sql/functions/aggregate/" class="btn small">View SQL aggregate functions</a>
SELECT AVG(co) from home
Use selector functions to "select" a value from a specified column. The available selector functions are designed to work with time series data.
<a href="/influxdb3/version/reference/sql/functions/selector/" class="btn small">View SQL selector functions</a>
Each selector function returns a Rust struct (similar to a JSON object)
representing a single time and value from the specified column in the each group.
What time and value get returned depend on the logic in the selector function.
For example, selector_first returns the value of specified column in the first
row of the group. selector_max returns the maximum value of the specified
column in the group.
The struct returned from a selector function has two properties:
time value in the selected row{time: 2023-01-01T00:00:00Z, value: 72.1}
Each selector function has two arguments:
In your SELECT statement, execute a selector function and use bracket notation
to reference properties of the returned struct to
populate the column value:
SELECT
selector_first(temp, time)['time'] AS time,
selector_first(temp, time)['value'] AS temp,
room
FROM home
GROUP BY room
[!Note]
Sample data
The following examples use the Home sensor sample data. To run the example queries and return results, write the sample data to your {{% product-name %}} database before running the example queries.
To aggregate all queried values in a specified column:
SELECT statement.GROUP BY clause to leave your data ungrouped.SELECT avg(co) AS 'average co' from home
{{< expand-wrapper >}} {{% expand "View example results" %}}
| average co |
|---|
| 5.269230769230769 |
| {{% /expand %}} |
| {{< /expand-wrapper >}} |
To apply aggregate or selector functions to grouped data:
SELECT statement.SELECT statement.GROUP BY clause with a comma-delimited list of columns and
expressions to group by.Keep the following in mind when using GROUP BY:
GROUP BY can use column aliases that are defined in the SELECT clause.GROUP BY won't use an aliased value if the alias is the same as the original
column name. GROUP BY will use the original value of the column, not the
transformed, aliased value.SELECT
room,
avg(temp) AS 'average temp'
FROM home
GROUP BY room
{{< expand-wrapper >}} {{% expand "View example results" %}}
| room | average temp |
|---|---|
| Living Room | 22.16923076923077 |
| Kitchen | 22.623076923076926 |
| {{% /expand %}} | |
| {{< /expand-wrapper >}} |
A common use case when querying time series is downsampling data by applying aggregates to time-based groups. To group and aggregate data into time-based groups:
In your SELECT clause:
Use the DATE_BIN function
to calculate time intervals and output a column that contains the start of
the interval nearest to the time timestamp in each row--for example, the
following clause calculates two-hour intervals (originating at the Unix epoch)
and returns a new time column that contains the start of the interval
nearest to home.time:
SELECT
DATE_BIN(INTERVAL '2 hours', time) AS time
FROM home
...
Given a time value
{{% influxdb/custom-timestamps-span %}}2022-01-01T13:00:50.000Z{{% /influxdb/custom-timestamps-span %}},
the output time column contains
{{% influxdb/custom-timestamps-span %}}2022-01-01T12:00:00.000Z{{% /influxdb/custom-timestamps-span %}}.
In your GROUP BY clause:
DATE_BIN(...) column ordinal reference (1).
This lets you group by the transformed time value and maintain the time
column name.room) that are specified in the
SELECT clause and aren't used in a selector function.SELECT
DATE_BIN(INTERVAL '2 hours', time) AS time
...
GROUP BY 1, room
...
To reference the DATE_BIN(...) result column by name in the GROUP BY
clause, assign an alias other than "time" in the SELECT clause--for example:
SELECT
DATE_BIN(INTERVAL '2 hours', time) AS _time
FROM home
...
GROUP BY _time, room
Include an ORDER BY clause with columns to sort by.
The following example retrieves unique combinations of time intervals and rooms with their minimum, maximum, and average temperatures:
SELECT
DATE_BIN(INTERVAL '2 hours', time) AS time,
room,
selector_max(temp, time)['value'] AS 'max temp',
selector_min(temp, time)['value'] AS 'min temp',
avg(temp) AS 'average temp'
FROM home
GROUP BY 1, room
ORDER BY room, 1
{{< expand-wrapper >}} {{% expand "View example results" %}} {{% influxdb/custom-timestamps %}}
| time | room | max temp | min temp | average temp |
|---|---|---|---|---|
| 2022-01-01T08:00:00Z | Kitchen | 23 | 21 | 22 |
| 2022-01-01T10:00:00Z | Kitchen | 22.7 | 22.4 | 22.549999999999997 |
| 2022-01-01T12:00:00Z | Kitchen | 22.8 | 22.5 | 22.65 |
| 2022-01-01T14:00:00Z | Kitchen | 22.8 | 22.7 | 22.75 |
| 2022-01-01T16:00:00Z | Kitchen | 22.7 | 22.4 | 22.549999999999997 |
| 2022-01-01T18:00:00Z | Kitchen | 23.3 | 23.1 | 23.200000000000003 |
| 2022-01-01T20:00:00Z | Kitchen | 22.7 | 22.7 | 22.7 |
| 2022-01-01T08:00:00Z | Living Room | 21.4 | 21.1 | 21.25 |
| 2022-01-01T10:00:00Z | Living Room | 22.2 | 21.8 | 22 |
| 2022-01-01T12:00:00Z | Living Room | 22.4 | 22.2 | 22.299999999999997 |
| 2022-01-01T14:00:00Z | Living Room | 22.3 | 22.3 | 22.3 |
| 2022-01-01T16:00:00Z | Living Room | 22.6 | 22.4 | 22.5 |
| 2022-01-01T18:00:00Z | Living Room | 22.8 | 22.5 | 22.65 |
| 2022-01-01T20:00:00Z | Living Room | 22.2 | 22.2 | 22.2 |
| {{% /influxdb/custom-timestamps %}} | ||||
| {{% /expand %}} | ||||
| {{< /expand-wrapper >}} |
[!Note]
GROUP BY time
In the
GROUP BYclause, the name "time" always refers to thetimecolumn in the source table. If you want to reference a calculated time column by name, use an alias different from "time" or use the column ordinal--for example:{{< code-tabs-wrapper >}} {{% code-tabs %}} Column alias Column ordinal {{% /code-tabs %}} {{% code-tab-content %}}
SELECT
DATE_BIN(INTERVAL '2 hours', time) AS _time,
room,
selector_max(temp, time)['value'] AS 'max temp',
selector_min(temp, time)['value'] AS 'min temp',
avg(temp) AS 'average temp'
FROM home
GROUP BY _time, room
ORDER BY room, _time
{{% /code-tab-content %}} {{% code-tab-content %}}
SELECT
DATE_BIN(INTERVAL '2 hours', time) AS time,
room,
selector_max(temp, time)['value'] AS 'max temp',
selector_min(temp, time)['value'] AS 'min temp',
avg(temp) AS 'average temp'
FROM home
GROUP BY 1, room
ORDER BY room, 1
{{% /code-tab-content %}} {{< /code-tabs-wrapper >}}
To query data based on values after an aggregate operation, include a HAVING
clause with defined predicate conditions such as a value threshold.
Predicates in the WHERE clause are applied before data is aggregated.
Predicates in the HAVING clause are applied after data is aggregated.
SELECT
room,
avg(co) AS 'average co'
FROM home
GROUP BY room
HAVING "average co" > 5
{{< expand-wrapper >}} {{% expand "View example results" %}}
| room | average co |
|---|---|
| Kitchen | 6.6923076923076925 |
| {{% /expand %}} | |
| {{< /expand-wrapper >}} |