doc/user/content/sql/select/join.md
JOIN lets you combine two or more table expressions into a single table
expression.
Much like an RDBMS, Materialize can join together any two table expressions (in our case, either sources or views) into a single table expression.
Materialize has much broader support for JOIN than most streaming platforms,
i.e. we support all types of SQL joins in all of the conditions you would
expect.
{{% include-syntax file="examples/select_join" example="syntax" %}}
Note: It's possible to join together table expressions as inner joins without using this clause whatsoever, e.g. SELECT cols... FROM t1, t2 WHERE t1.x = t2.x GROUP BY cols...
Unlike most other streaming platforms, JOINs in Materialize have very few, if
any, restrictions. For example, Materialize:
Instead, JOINs work over the available history of both streams, which
ultimately provides an experience more similar to an RDBMS than other streaming
platforms.
LATERAL subqueriesTo permit subqueries on the right-hand side of a JOIN to access the columns
defined by the left-hand side, declare the subquery as LATERAL. Normally, a
subquery only has access to the columns within its own context.
Table function invocations always have implicit access to the columns defined by
the left-hand side of the join, so declaring them as LATERAL is a permitted
no-op.
When a join contains a LATERAL cross-reference, the right-hand relation is
recomputed for each row in the left-hand relation, then joined to the
left-hand row according to the usual rules of the selected join type.
{{< warning >}}
LATERAL subqueries can be very expensive to compute. For best results, do not
materialize a view containing a LATERAL subquery without first inspecting the
plan via the EXPLAIN PLAN statement. In many common patterns
involving LATERAL joins, Materialize can optimize away the join entirely.
{{< /warning >}}
As a simple example, the following query uses LATERAL to count from 1 to x
for all the values of x in xs.
SELECT * FROM
(VALUES (1), (3)) xs (x)
CROSS JOIN LATERAL generate_series(1, x) y;
x | y
---+---
1 | 1
3 | 1
3 | 2
3 | 3
For a real-world example of a LATERAL subquery, see the Top-K by group
idiom.
For these examples, we'll use a small data set:
Employees
id | name
----+--------
1 | Frank
2 | Arjun
3 | Nikhil
4 | Cuong
Managers
id | name | manages
----+-------+---------
1 | Arjun | 4
2 | Cuong | 3
3 | Frank |
In this table:
Arjun and Frank do not have managers.Frank is a manager but has no reports.Inner joins return all tuples from both tables where the join condition is valid.
SELECT
employees."name" AS employee,
managers."name" AS manager
FROM employees
INNER JOIN managers ON employees.id = managers.manages;
employee | manager
----------+---------
Cuong | Arjun
Nikhil | Cuong
Left outer joins (also known as left joins) return all tuples from the
left-hand-side table, and all tuples from the right-hand-side table that match
the join condition. Tuples on from the left-hand table that are not joined with
a tuple from the right-hand table contain NULL wherever the right-hand table
is referenced.
SELECT
employees."name" AS employee,
managers."name" AS manager
FROM employees
LEFT OUTER JOIN managers ON employees.id = managers.manages;
employee | manager
----------+---------
Cuong | Arjun
Nikhil | Cuong
Arjun |
Frank |
Right outer joins (also known as right joins) are simply the right-hand-side equivalent of left outer joins.
Right outer joins return all tuples from the right-hand-side table, and all
tuples from the left-hand-side table that match the join condition. Tuples on
from the right-hand table that are not joined with a tuple from the left-hand
table contain NULL wherever the left-hand table is referenced.
SELECT
employees."name" AS employee,
managers."name" AS manager
FROM employees
RIGHT OUTER JOIN managers ON employees.id = managers.manages;
employee | manager
----------+---------
Cuong | Arjun
Nikhil | Cuong
| Frank
Full outer joins perform both a left outer join and a right outer join. They return all tuples from both tables, and join them together where the join conditions are met.
Tuples that are not joined with the other table contain NULL wherever the
other table is referenced.
SELECT
employees."name" AS employee,
managers."name" AS manager
FROM employees
FULL OUTER JOIN managers ON employees.id = managers.manages;
employee | manager
----------+---------
Cuong | Arjun
Nikhil | Cuong
| Frank
Arjun |
Frank |
Cross joins return the Cartesian product of the two tables, i.e. all combinations of tuples from the left-hand table combined with tuples from the right-hand table.
Our example dataset doesn't have a meaningful cross-join query, but the above diagram shows how cross joins form the Cartesian product.