doc/developer/design/20210707_qgm_sql_high_level_representation.md
This document proposes to replace the current HirRelationExpr enum with a new representation
based on the Query Graph Model representation first introduced in
H. Pirahesh et al.
This new representation would allow us to introduce a new normalization stage before lowering the query into a dataflow, before important concepts for the normalization of SQL queries, such as outer joins and subquery boundaries, are blurred. The following two diagrams show the current view/query compilation pipeline and the one proposed in this document, which adds normalization at the SQL level:
LATERAL joins (#6875)Existing issues that should be addressed sooner rather than later:
In QGM, a query, as its name implies, is represented as a graph. This graph will be represented
by a Model struct, that will be declared as follows:
/// A Query Graph Model instance represents a SQL query.
struct Model {
/// The ID of the box representing the entry-point of the query.
top_box: BoxId,
/// All boxes in the query graph model.
boxes: HashMap<BoxId, Box<RefCell<QueryBox>>>,
/// Used for asigning unique IDs to query boxes.
next_box_id: Boxid,
/// All quantifiers in the query graph model.
quantifiers: HashMap<QuantifierId, Box<RefCell<Quantifier>>>,
/// Used for assigning unique IDs to quantifiers.
next_quantifier_id: QuantifierId,
}
The graph has a top level box, which is the entry point of the query. In this proposal, all boxes and quantifiers are owned by the model and are referenced by their unique identifiers.
Boxes represent high-level conceptual operators, ie. they don't correspond to any execution strategy. Each box has a set of input quantifiers, which describe the semantics of how the underlying boxes will be accessed.
The following snippet contains the definition of the different types of operators and quantifiers. Since all type of boxes and quantifiers have some elements in common, both boxes and quantifiers are not directly represented as enum, but as struct with an enum type, containing the per-type specific members.
type QuantifierId = usize;
type BoxId = usize;
type QuantifierSet = BTreeSet<QuantifierId>;
/// A semantic operator within a Query Graph.
struct QueryBox {
/// uniquely identifies the box within the model
id: BoxId,
/// the type of the box
box_type: BoxType,
/// the projection of the box
columns: Vec<Column>,
/// the input quantifiers of the box
quantifiers: QuantifierSet,
/// quantifiers ranging over this box
ranging_quantifiers: QuantifierSet,
/// list of unique keys exposed by this box. Each unique key is made by
/// a list of column positions. Must be re-computed every time the box
/// is modified.
unique_keys: Vec<Vec<usize>>,
/// whether this box must enforce the uniqueness of its output, it is
/// guaranteed by structure of the box or it must preserve duplicated
/// rows from its input boxes.
distinct: DistinctOperation,
}
enum BoxType {
BaseTable(BaseTable),
Except,
Grouping(Grouping),
Intersect,
OuterJoin(OuterJoin),
Select(Select),
TableFunction(TableFunction),
Union,
Values(Values),
}
enum DistinctOperation {
/// DISTINCTness is required but guaranteed by the internal structure
/// of the box.
Guaranteed,
/// The box must enforce DISTINCT
Enforce,
/// DISTINCTness is not required
Preserve,
}
struct Quantifier {
/// uniquely identifiers the quantifier within the model
id: QuantifierId,
/// the type of the quantifier
quantifier_type: QuantifierType,
/// the input box of this quantifier
input_box: BoxId,
/// the box that owns this quantifier
parent_box: BoxId,
/// alias for name resolution purposes
alias: Option<Ident>,
}
/// Type of quantifiers (with their abbreviations)
enum QuantifierType {
All, // 'All'
Any, // 'Any'
Existential, // 'E'
Foreach, // 'F'
PreservedForeach, // 'P'
Scalar, // 'S'
}
Note that input quantifiers are logically owned by a single box, but there may be several quantifiers ranging over the same box. That is the case, for example, for base tables, views or CTEs, either explicit CTEs used in the query or discovered via some query transformation.
As shown above, there aren't many different types of operators, since QGM is meant to be a representation for query normalization. The set of operators listed above is very close to the one suggested in #692.
The core operator is represented by the Select box, which represents a whole query block (sub-block).
struct Select {
predicates: Vec<Box<Expr>>,
order_key: Option<Vec<Box<Expr>>>,
limit: Option<Expr>,
offset: Option<Expr>,
}
There a few subtle constraints that are not explicit in the representation above:
BaseTable, TableFunction and Values cannot have input quantifiers.Union, Except and Intersect can only have input quantifiers of type Foreach.Select, Union, Except and Intersect boxes.All, Any, Existential and Scalar) are only allowed in Select boxes.Grouping must have a single input quantifier of type Foreach ranging over a Select box.Grouping box is always ranged-over by a Select box.OuterJoin must have at least an input quantifier of type PreservedForeach and at most one quantifier of type
Foreach. An OuterJoin with all PreservedForeach input quantifiers represents a FULL OUTER JOIN.
Note: temporarily during the generation of the query model we could allow subquery quantifiers in OuterJoin boxes for
subqueries in the ON-clause of the outer join, but should push down the subquery to the non-preserving side.
Note 2: In QGM there is no distinction between a LEFT JOIN and a RIGHT JOIN, since that's a concept that belongs
only in the AST.PreservedForeach quantifiers can only exist in OuterJoin boxes.Some of the constraints above are just conventions for making query transformation easier due to having to cover fewer cases. The rest are just constructions that don't make sense semantically speaking.
We could make some of these constraints explicit by making the type system enforce them, by, for example
moving the quantifiers set to each of the BoxType that can/must have input quantifiers, which are all of
them but BaseTable, TableFunction and Values.
All boxes have an ordered projection, represented as a vector of columns, defined as:
struct Column {
expr: Expr,
alias: Option<String>,
}
Columns have two representations in QGM: base columns and column references. Base columns are only allowed in expressions
contained in data source operators, specifically in the projection of boxes of type BaseTable and TableFunction.
enum Expr {
// ...
ColumnReference(ColumnReference),
BaseColumn(usize),
}
struct ColumnReference {
quantifier_id: QuantifierId,
position: usize,
}
ColumnReference is used everywhere else. A ColumnReference may either reference a quantifier of the same
box that owns the containing expression or a quantifier from some parent box.
The underlying expression behind a column reference can be obtained via a dereference method, whose implementation
could be as follows:
impl ColumnReference {
fn dereference<'a>(&self, model: &'a Model) -> &'a Expr {
let input_box = model
.quantifiers
.get(&self.quantifier_id)
.unwrap()
.input_box;
&model.boxes.get(&input_box).unwrap().columns[self.position].expr
}
}
Since this proposal uses identifiers instead of pointers, most methods in the implementation Expr will need to
receive a reference to the model as a parameter. For example, a method for determining whether an expression is
nullable or not may need to dereference a column reference, for which it needs access to the model:
impl Expr {
fn nullable(&self, model: &Model) -> bool {
match self {
...
Expr::ColumnReference(c) => c.dereference(model).nullable(model),
}
}
}
As shown above, the query graph already contains almost all the information needed for name resolution. Since the query graph is built in a bottom-up manner, we can use the input quantifiers for resolving names within the current part of the query being processed.
It is important to restate the constraint mentioned above: all column references in expressions within each box must only point to quantifiers either within the same box or within an ancestor box through a chain of unique children (correlation).
The only information missing in the query graph needed for name resolution purposes is what quantifiers can be
used for name resolution within each box/context. That information is stored in a new NameResolutionContext
defined as:
struct NameResolutionContext<'a> {
owner_box: Option<BoxId>,
quantifiers: Vec<QuantifierId>,
ctes: Option<HashMap<String, BoxId>>,
parent_context: Option<&'a NameResolutionContext<'a>>,
sibling_context: Option<&'a NameResolutionContext<'a>>,
is_lateral: bool,
}
parent_context is a linked list that points to the context of the outer query (the context of its FROM clause)
and, hence, it is only set within a context that is meant to see the outer query (for example, a subquery).
sibling_context is a linked list that connects all the intermediate joins from the current one to the comma-join
of the FROM clause. This linked list is only meant to be traversed when is_lateral is true. From within a lateral
join operand, we have to traverse the list of sibling_contexts first, ie. move laterally, and them up.
All the contexts in the sibling_context share the same parent context, ie. the same outer query.
quantifiers contains leaf quantifiers of the join, the ones that are used for name resolution.
A new NameResolutionContext is created for each intermediate join within the join tree, which leaves quantifiers
are merged into the context of the parent join after processing the intermeidate join, unless it is a nested
join with an alias. In that case, the quantifier that is added in the parent context is the one ranging over
the sub-join. This distinction is needed to properly support this case:
materialize=> select * from a, (b inner join c on true) where b.b > 1;
a | b | c
---+---+---
(0 rows)
materialize=> select * from a, (b inner join c on true) z where b.b > 1;
ERROR: WHERE clause error: column "b.b" does not exist
materialize=>
When a colum name is resolved against any of the leaf quantifiers in the current context, the resulting column
reference must be lifted through the projection of all the intermediate boxes until reaching a quantifier
in the current box (the one referenced in NameResolutionContext::owner_box in the current context).
A lateral join operand must see all quantifiers at its left in all the intermediate joins up to the comma join. In the following example query there is a lateral join operand in the rightmost position of the join tree, that must see the symbols from the tables in the left-hand side of all the intermediate joins up to the comma-join (included).
Box16 represents the lateral join operand. When processing its content, the context stack looks like this:
Context 16 {
owner_box = 16
quantifiers = {10}
sibling_context = None
parent_context = Some(Context 13)
is_lateral = false
}
Context 13 {
owner_box = 13
quantifiers = {9}
sibling_context = Some(Context 9)
parent_context = None
is_lateral = true
}
Context 9 {
owner_box = 9
quantifiers = {7}
sibling_context = Some(Context 5)
parent_context = None
is_lateral = false
}
Context 5 {
owner_box = 5
quantifiers = {5}
sibling_context = Some(Context 0)
parent_context = None
is_lateral = false
}
Context 0 {
owner_box = 0
quantifiers = {1, 3}
sibling_context = None
parent_context = None
is_lateral = false
}
After processing the join tree in the FROM clause, we end up with the following Context 0 for processing the expressions in the projection and ORDER BY:
Context 0 {
quantifiers = {1(a), 3(b), 5(c), 7(d), 9(f), 12(<anonymous>)}
sibling_context = None
parent_context = None
is_lateral = false
}
Note that Boxes 8 and 12 are pass-through boxes created by the prototype every time there is a join within parenthesis.
SelectMerge rewrite will take care of them. Boxes 2, 4, 7, 11, 15 and 18 are just there for the column aliases.
Expressions within subqueries must see the symbols visible from the context the subquery is in. To support that
NameResolutionContext will contain an optional reference to a parent NameResolutionContext, that is passed
down for planning the subquery, so that if a name cannot be resolved against the context of the FROM clause of
the subquery, we go through the chain of parent contexts until we find a symbol that matches in one of them.
GROUP BY queriesA SELECT query is a grouping query if any of the following conditions is met:
GROUP BY clause.HAVING clause,FROM clause, either directly in the projection or within a subquery
(see #3720)That means that in order to determine whether a query is a grouping query or not, we must inspect the projection
of the query first. For this reason, after having processed the FROM clause and the WHERE clause, we will
process the projection of the query, resolving names agains the NameResolutionContext for the join of the
query, and then will traverse the resulting expressions trying to find Aggregate expressions within
non-Grouping boxes, which would make the current query being process a grouping one.
If any Aggregate expression is found, we will put a Grouping box on top of the Select box containing
the join in the FROM clause, and will try to lift all expressions in the projection through it, following
the rules listed below.
Symbols in the GROUP BY clause will be resolved as well against the NameResolutionContext representing
the scope exposed by the FROM clause, but then lifted through the projection of the Select box representing
the join that feeds the Grouping box created for the GROUP BY clause.
Symbols in the HAVING clause and in the projection of the GROUP BY are also resolved against the
NameResolutionContext of the FROM clause, but then lifted twice: once through the Select box representing
the join that feeds the Grouping box and then through the Grouping box itself (since the projection
of a GROUP BY and the predicates in the HAVING clause belong in Select box on top of the Grouping box).
Lifting expressions through a grouping box is a bit special:
Grouping box can only contain: columns references from the input quantifier that are
present in the grouping key, references to columns from the input quantifier that functionally depend on a column
in the grouping key, or aggregate expressions, which parameters must be column references from the input quantifier.Expression planning will then be divided in two stages: name resolution against the scope of the FROM clause
and expression lifting through the Grouping (only for grouping queries).
A NameResolutionContext instance will be created for storing the processed CTEs. When resolving an unqualified
table name, we will first traverse the list of active contexts, ie. through the list made by parent_context,
until matching CTE is found. Otherwise, we will use the system catalog.
The DISTINCT handling in the QGM paper cite above is a bit messy, so the solution proposed here differs a bit
of the one described in the paper.
As shown in the definition of QueryBox above, each box has a DistinctOperation flag, that indicates
whether the box must enforce the uniqueness of its output rows, the uniqueness is already guaranteed by the
underlying structure of the box, or the duplicated rows must be preserved in the output of the box.
Each box also exposes the sets of its columns that make unique keys on its output. A BaseTable instance
will contain a unique key for every unique indexes on the table. A Select with DistinctOperation::Enforce
or DistinctOperation::Guaranteed will have at least a unique key containing all the columns in its
projection. A Grouping box will have at least a unique key containing all the columns in the grouping key.
This information must be re-computed every time the underlying structure of the box is modified.
GROUP BY and DISTINCT have a different representation in QGM, where the latter is always preferred since
it makes query flattening easier. Because of that, a normalization rule will convert any Grouping box without
aggregate functions into a Select box with DistinctOperation::Enforce.
DistinctOperation::Enforce vs. DistinctOperation::GuaranteedThe following example illustrates the difference between DistinctOperation::Enforce and
DistinctOperation::Guaranteed. Consider the following scenario:
create table a(a ... primary key, ...);
create table b(b ... primary key, ...);
select * from a, b from (select distinct a from a) x, b where a = b;
After parsing this query we end up with a Select box ranging over BaseTable A with DistinctOperation::Enforce
projecting only column A, and then another Select box (the top-level one) ranging over that one and BaseTable B,
containing the join predicate.
One of the main goals of the normalization process is to reduce the number of boxes and flatten the query as much as possible.
Given two nested Select boxes, it will always try to merge them. However, in general DistinctOperation::Enforce
prevents a child Select box into a parent one, unless certain conditions that ensure the semantic correctness of the
transformation are met. One of these sufficient conditions is when the internal structure of the nested Select box
already guarantees the uniqueness of its output, which is indicated by DistinctOperation::Guaranteed flag. A
semantic transformation will analyze the internal structure of any Select box with DistinctOperation::Enforce
and will determine whether it already guarantees the uniqueness of its output.
Going back to the example above, the semantic transformation mentioned will kick in for the nested Select box, allowing
its merge into the top-level one. The result is a single Select box ranging over both base tables, which is the
representation for the following equivalent but simpler version of the query above:
select * from a, b where a = b;
However, following with the same example, column A wasn't a primary/unique key of table A, the distinctness of
the nested Select box would not be guaranteed and hence, it could not be merged into the top-level one.
Some normalization transformations are better/easier done with a representation at a higher level than our current
MirRelationExpr representation. Specially those around SQL-specific concepts such as outer joins that are
lost during lowering. Several examples of this are #6932,
#6987 or
#6988, but the list of unsupported cases that are
hard to support at the moment is much longer.
Consider the following two equivalent queries:
materialize=> explain select * from t1, lateral (select count(*) from t2 group by t1.f1);
Optimized Plan
----------------------------------------------
%0 = +
| Get materialize.public.t1 (u254) +
+
%1 = +
| Get materialize.public.t1 (u254) +
| Distinct group=(#0) +
| ArrangeBy () +
+
%2 = +
| Get materialize.public.t2 (u256) +
+
%3 = +
| Join %1 %2 +
| | implementation = Differential %2 %1.() +
| | demand = (#0) +
| Reduce group=(#0) +
| | agg count(true) +
| ArrangeBy (#0) +
+
%4 = +
| Join %0 %3 (= #0 #2) +
| | implementation = Differential %0 %3.(#0)+
| | demand = (#0, #1, #3) +
| Project (#0, #1, #3) +
(1 row)
materialize=> explain select * from t1, (select count(*) from t2);
Optimized Plan
--------------------------------------------
%0 = Let l0 = +
| Get materialize.public.t2 (u256) +
| Reduce group=() +
| | agg count(true) +
+
%1 = +
| Get materialize.public.t1 (u254) +
+
%2 = +
| Get %0 (l0) +
| Negate +
| Project () +
+
%3 = +
| Constant () +
+
%4 = +
| Union %2 %3 +
| Map 0 +
+
%5 = +
| Union %0 %4 +
| ArrangeBy () +
+
%6 = +
| Join %1 %5 +
| | implementation = Differential %1 %5.()+
| | demand = (#0..#2) +
(1 row)
materialize=>
Ideally, any two semantically equivalent queries should result in the same execution plan: the most optimal one
for obtaining/computing the desired results. However, after lowering the first query above, we are not able to
detect that the grouping key is constant wrt the input of the aggregation and can then be removed, reducing the
complexity of the resulting dataflow as shown in the plan for the second query, where the transformation has been
manually applied. A RemoveConstantKeys normalization rule applied to the query graph could easily detect
any grouping key item that is constant wrt the input of the Grouping box and remove it.
Another example, among the long list of them, is the one below:
create table a(a, b); create table b(c, d); create unique index bc on b(c);
select a from a where a = (select d from b where c = a.b);
select a from a, b where a = d and c = b;
A ScalarToForeach normalization rule could convert the Scalar subquery quantifier into a Foreach
quantifier since the subquery contains an equality predicate on a unique key of its input (table b) and
the NULL value returned when the subquery is empty is rejected by the equality predicate.
With a small set of normalization transformations applied on the high level representation of the query, before
lowering it, we could easily fix many cases like the ones listed above. These transformations could be used to
decorrelate some, if not all, cases where the query can be expressed with equivalent non-correlated SQL. The big
hammer used in lowering.rs will then be used for everything else that cannot be expressed with valid SQL in
a decorrelated manner (for example, a correlated lateral non-preserving side of an outer join cannot be decorrelated
in SQL since no predicate can cross the non-preserving boundary).
A recursive CTE is a CTE with a union with two branches where one of the branches references the CTE and the other one doesn't (representing the base case). This can be easily supported by the proposed implementation. Circular memory ownership issues are avoided by making the model own all the nodes.
Any traversal of the query graph must keep a set/bitset of visited boxes since the same box can be ranged over by several quantifiers. The same set/bitset will prevent infinite loops/stack overflows when traversing a recursive query.
The normalization transformations will be mostly tested using datadriven unit tests that will take raw SQL
queries as their input and a list of transformations to be applied, and will dump the content of the graph after
applying each of them. The query graph will be dumped as a dot graph which, once rendered, looks like the
ones in the appendix with examples.
The dot graph generator will also be very handy for debugging transformations. Also, for better debuggability
the rules themselves should not drive the traversal of the graph, but just make explicit the type of traversal
they require. The traversal of the graph will be driven by some external code, that could be instrumentalized for
dumping the full graph before and after applying the rule to every node of the graph. That way, by just patching
one method, we could follow step by step the evolution of a query graph as it goes through the different
transformations.
The lowering code will have to be modified so that it takes a Query Graph as its input.
When lowering a Select box we will first build the join within that box. To do so, we will first build a
dependency graph where each quantifier is connected to the quantifiers it is correlated with. We will lower
the quantifiers in dependency order, starting with the uncorrelated ones. When lowering
a correlated quantifier, we will apply its plan to the result of the sub-join of all the quantifiers it is
connected to, resulting in simpler dataflow plans for lateral joins.
The dependency graph will be built by traversing the sub-graph of the input quantifiers collecting the column references from the current context.
Note that the lowering code will no longer produce binary joins exclusively.
After lowering the join, a Map operator will be added with the non-column expressions in the projection
of the box. Then, a Projection will be added. If the box must enforce distinctness, the corresponding
Reduce operator will be added. If the box has LIMIT/OFFSET/ORDER BY a TopK operator will
then be added.
View inlining could be implemented as a normalization rule where only non-materialized views are inlined.
We could also do the opposite: find query patterns that match existing materialized views. Furthermore, we could even have a view compaction step where we load all the views in the system into the same model and find common sub-expressions within it, to identify intermediate results worth materializing.
MirRelationExpr into a normalization-friendly representation with explicit outer join operator
and fewer number of operators.The first milestone will consist in an end-to-end prototype with support for a simple select a from a, including
lowering the query graph to a dataflow and dot graph generator.
The second milestone will add full support for joins, including lateral joins, and subqueries, including correlated ones.
The following milestones will add support for GROUP BY, UNION and the rest of the SQL constructs currently
supported.
In parallel, after the second milestone, work could be done on introducing query normalization before lowering,
starting with the core SelMerge rule described in the paper.
EXPLAIN RAW PLANtransform crateThis section includes examples of how some queries look like in QGM. This visual representation will be generated from the representation described in the previous section. Note that having a visual way of representing the query is very helpful during query transformation development/troubleshooting.
In this visual representation, predicates referencing columns from 1 or 2 quantifiers are represented as edges connecting the quantifiers involved in the predicate.
SELECT *GROUP BYGROUP BY + HAVINGNote that the having filter is just a regular predicate on the Select box ranging over the Grouping box.
GROUP BY vs. DISTINCTAs mentioned above, GROUP BY and DISTINCT use different representation in QGM. The following diagram
shows a simple DISTINCT query.
As shown below, the equivalent GROUP BY query is automatically converted into the query graph above
during the normalization process.
Note that the inner join above is semantically equivalent to the comma join in the previous example. Boxes 1 and 2 represent the binary inner joins in the query, but they can be squashed into box 0, without altering the results of the query. In fact, the normalization step will simplify this query leaving it exactly as the one in the example above:
Note that in QGM there is no join direction, so left and right joins have the same exact representation. Only the type of the quantifiers change its order.
Representing outer joins as explicit boxes goes against the goal of removing any sense of direction or join order from the query graph. For example, the following two queries are semantically equivalent but they lead to different query models where the join ordering is implied. We could add a normalization rule that always pushes inner joins through the preserving side of an outer join, assuming that it is always better to do inner joins first.
However, we would still have the explicit join ordering issue if both joins in the query above where outer joins.
To fix that issue, normalization will lift the preserving quantifier of an outer join to the parent Select
making it a Foreach quantifier, leaving the OuterJoin box with just a Foreach quantifier, which
will represent a special potentially-correlated operator that produces a NULL row if none of the rows
produced by its input matches the predicate. With this normalization, the query mentioned above will look as
follows:
A CROSS JOIN can be represented as a Select box with no predicates as shown below:
Quantifiers 2 and 3 are ranging over the same box, which represents the CTE. Box 2 doesn't alter the results of
box 0, but just adds aliases for the columns, for name resolution purposes. Normalization will get rid of all
the intermediate Select boxes, leaving the query as follows:
A LATERAL join is just a join where one of its operands is correlated with the remaining ones, ie. a sub-graph
containing column references from quantifiers belonging in the parent context. For instance, in the following
example quantifier 4 is correlated within box 0, since its sub-graph references a column from quantifier 0 which
belongs in box 0. This correlation is made explicit by the edge going from Q1 in box 2 to Q0 in box 0.
We will see later how we could decorrelate a query like that via transformations of the query model.
NATURAL joinsNATURAL joins don't have an explicit representation in QGM since, like LATERAL, it is a name resolution concept
that doesn't make sense anymore after it.
EXISTS and IN SELECTEXISTS and IN SELECT subqueries are represented via Existential quantifiers. In fact, EXISTS subqueries
are represented as 1 IN (SELECT 1 FROM (<exists subquery>)) as shown in the second example below.
Given that the two queries above are equivalent, the normalization process should normalize both to the same representation.
Scalar subqueries are represented via Scalar quantifiers as shown above. These quantifiers can be converted into
regular Foreach quantifiers iff the inner subquery is guaranteed to always return one row at most and the NULL
value returned when the subquery is empty is ignored. Otherwise, no predicate can cross the boundary of a Scalar
quantifier.
ANY/ALL subqueriesANY/ALL subqueries are represented via Any and All quantifier types respectively.
VALUESIn the example above, an extra box is added for the simple purpose of storing the column aliases. This extra box
is merged into the top-level one during the normalization process by the SelMerge rule described in the paper.
UNION