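For any lazy query Polars has both:

- a non-optimized plan, with the steps as we wrote them in the code
- an optimized plan, with the changes made by the query optimizer

We can inspect both the non-optimized and optimized query plans by visualizing them or by printing them as text.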
<div style="display:none">
```python exec="on" result="text" session="user-guide/lazy/query_plan"
--8<-- "python/user-guide/lazy/query_plan.py:setup"
```
</div>

Below we consider the following query:
{{code_block('user-guide/lazy/query_plan','plan',[])}}
--8<-- "python/user-guide/lazy/query_plan.py:plan"
First we visualize the non-optimized plan by setting `optimized=False`.
{{code_block('user-guide/lazy/query_plan','showplan',['show_graph'])}}
--8<-- "python/user-guide/lazy/query_plan.py:createplan"
The query plan visualization should be read from bottom to top. In the visualization:

- `sigma` stands for `SELECTION` and indicates any filter conditions
- `pi` stands for `PROJECTION` and indicates choosing a subset of columns

We can also print the non-optimized plan with `explain(optimized=False)`.
{{code_block('user-guide/lazy/query_plan','describe',['explain'])}}
--8<-- "python/user-guide/lazy/query_plan.py:describe"
FILTER [(col("comment_karma")) > (0)] FROM WITH_COLUMNS:
[col("name").str.uppercase()]
CSV SCAN data/reddit.csv
PROJECT */6 COLUMNS
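The printed plan should also be read from bottom to top. This non-optimized plan is roughly equivalent to the following steps:

- read from the `data/reddit.csv` file
- read all 6 columns (`PROJECT */6 COLUMNS`)
- transform the `name` column to uppercase
- apply a filter on the `comment_karma` column

Now we visualize the optimized plan with `show_graph`.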
{{code_block('user-guide/lazy/query_plan','show',['show_graph'])}}
--8<-- "python/user-guide/lazy/query_plan.py:createplan2"
We can also print the optimized plan with `explain`.
{{code_block('user-guide/lazy/query_plan','optimized',['explain'])}}
WITH_COLUMNS:
[col("name").str.uppercase()]
CSV SCAN data/reddit.csv
PROJECT */6 COLUMNS
SELECTION: [(col("comment_karma")) > (0)]
The optimized plan is to:

- read the data from `data/reddit.csv`
- apply the filter on the `comment_karma` column while the CSV is being read line-by-line
- transform the `name` column to uppercase

In this case the query optimizer has identified that the filter can be applied while the CSV is read from disk, rather than reading the whole file into memory and then applying the filter. This optimization is called Predicate Pushdown.
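To make the idea behind Predicate Pushdown concrete, here is a plain-Python illustration using only the standard library. It is not how Polars implements the optimization; the sample data and the function names are invented for the example. It contrasts loading every row before filtering with filtering each row as it is read:

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = """name,comment_karma
alice,5
bob,0
carol,12
dave,-3
"""


def filter_after_load(text):
    """Read every row into memory first, then apply the filter."""
    rows = list(csv.DictReader(io.StringIO(text)))  # all rows are materialized
    kept = [r for r in rows if int(r["comment_karma"]) > 0]
    return kept, len(rows)  # second value: number of rows held in memory


def filter_while_reading(text):
    """Apply the filter to each row as it is read (the predicate is pushed down)."""
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        if int(row["comment_karma"]) > 0:
            kept.append(row)  # rows failing the filter are never stored
    return kept, len(kept)


late, late_held = filter_after_load(CSV_TEXT)
early, early_held = filter_while_reading(CSV_TEXT)

print([r["name"] for r in early])  # names of the rows that pass the filter
print(late_held, early_held)       # rows held in memory by each approach
```

Both approaches return the same rows, but the second never stores the rows that fail the filter, which is the saving the optimizer is after when it moves the filter into the CSV scan.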