doc/developer/101-query-compilation.md
%%{init: {"flowchart": {"defaultRenderer": "elk"} } }%%
flowchart LR
SQL@{ shape: doc, label: "SQL" } --> IR
IR --> dataflow@{ shape: docs }
subgraph IR["intermediate languages"]
direction LR
HIR@{ shape: doc } --> MIR@{ shape: docs } --> LIR@{ shape: docs }
MIR -. optimizations .-> MIR
end
classDef purple fill:#472F85
class SQL,HIR,MIR,LIR,dataflow purple
Representations:

* SQL: source language.
* AST: a parsed version of a SQL query.
* HIR: high-level intermediate representation.
* MIR: mid-level intermediate representation.
* LIR: low-level intermediate representation.
* TDO: target language (timely & differential operators).

Transformations in the compile-time lifecycle of a dataflow:

* SQL ⇒ AST.
* AST ⇒ AST.
  * `CatalogItemType` lists the kinds of objects that can be resolved against the catalog.
* AST ⇒ HIR.
  * `TopK` is converted to a `RowSetFinishing` at this point.
  * `EXPLAIN RAW PLAN` returns the result of transformations up to this point.
* HIR ⇒ HIR.
* HIR ⇒ MIR.
  * SELECT subqueries with more than one return value.
  * `EXPLAIN DECORRELATED PLAN` returns the result of transformations up to this point.
* MIR ⇒ MIR.
  * `SHOW INDEXES IN <view>`).
  * `LinearOperators`
  * `EXPLAIN OPTIMIZED PLAN` returns the result of transformations up to this point.
* MIR ⇒ LIR.
  * `EXPLAIN PHYSICAL PLAN` returns the result of transformations up to this point.
* LIR ⇒ TDO.

For a one-off query, we run all the transformations until the MIR stage. Then we
determine whether we need to serve the query on the "slow path", that is,
creating a temporary dataflow and then deleting it. If we don't need to serve
the query on the "slow path", then we can skip the MIR ⇒ LIR and the LIR ⇒ TDO steps.
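
The one-off query flow described above can be sketched as a small model. This is only an illustration of which transformation steps run on each path. The names `PIPELINE`, `SLOW_PATH_ONLY`, and `steps_for_one_off_query` are hypothetical and do not correspond to identifiers in the codebase.

```python
# Illustrative model only: all names here are hypothetical, not real APIs.

# The transformation steps, in order, as (source, target) representation pairs.
PIPELINE = [
    ("SQL", "AST"),
    ("AST", "AST"),
    ("AST", "HIR"),
    ("HIR", "HIR"),
    ("HIR", "MIR"),
    ("MIR", "MIR"),
    ("MIR", "LIR"),
    ("LIR", "TDO"),
]

# Steps that are only needed when a temporary dataflow must be created.
SLOW_PATH_ONLY = {("MIR", "LIR"), ("LIR", "TDO")}


def steps_for_one_off_query(needs_slow_path: bool) -> list[tuple[str, str]]:
    """Return the transformation steps that run for a one-off query."""
    if needs_slow_path:
        # Slow path: the full pipeline runs and a temporary dataflow is built.
        return list(PIPELINE)
    # Fast path: everything up to and including MIR => MIR runs,
    # and the MIR => LIR and LIR => TDO steps are skipped.
    return [step for step in PIPELINE if step not in SLOW_PATH_ONLY]


if __name__ == "__main__":
    for slow in (False, True):
        path = "slow" if slow else "fast"
        steps = ", ".join(f"{src} => {dst}" for src, dst in steps_for_one_off_query(slow))
        print(f"{path} path: {steps}")
```

In this model the decision is a single boolean. The real check of whether a query needs the slow path happens after the MIR stage and is not modeled here.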
Existing "fast paths" include:
Currently, the optimization team is mostly concerned with the HIR ⇒ MIR and MIR ⇒ MIR stages.
test/sqllogictest/sqlite) to be merged.