docs/en/using_starrocks/query_feedback.md
import Experimental from '../_assets/commonMarkdown/_experimental.mdx'
This topic introduces the Query Feedback feature, its application scenarios, and how to optimize query plans based on feedback from execution statistics using Query Plan Advisor.
StarRocks supports the Query Feedback feature from v3.4.0 onwards.
Query Feedback is a framework and a critical component of Cost-based Optimizer (CBO). It records the execution statistics during query execution, and reuses it in subsequent queries with similar query plans to help CBO generate optimized query plans. CBO optimizes query plans based on estimated statistics, so when statistical information is outdated or inaccurate, it may select inefficient query plans (bad plans), such as broadcasting a large table or misordering the left and right tables. These bad plans can cause query execution timeouts, excessive resource consumption, or even system crashes.
The workflow of Query Feedback-based plan optimization consists of three stages:
InputRows and OutputRows) of PlanNode in each query plan.Controlled by system variable enable_plan_advisor (Default: true), Query Plan Advisor is enabled by default for slow queries, that is, queries with execution time exceeding the threshold defined in the FE configuration item slow_query_analyze_threshold (Default: 5 seconds).
In addition, you can manually analyze a specific query or enable automatic analysis for all queries executed.
You can manually analyze a specific query statement even if its execution time does not exceed slow_query_analyze_threshold.
ALTER PLAN ADVISOR ADD <query_statement>
Example:
ALTER PLAN ADVISOR ADD SELECT COUNT(*) FROM (
SLECT * FROM c1_skew_left_over t1
JOIN (SELECT * FROM c1_skew_left_over WHERE c1 = 'c') t2
ON t1.c2 = t2.c2 WHERE t1.c1 > 'c' ) t;
To enable automatic analysis for all queries, you need to set the system variable enable_plan_analyzer (Default: false) to true`.
SET enable_plan_analyzer = true;
Each FE maintains its own record of tuning guides. You can use the following statement to view the tuning guides generated for the entitled queries on the current FE:
SHOW PLAN ADVISOR
Execute EXPLAIN against the query statement. In the EXPLAIN string, the message Plan had been tuned by Plan Advisor indicates that a tuning guide has been applied to the corresponding query.
Example:
EXPLAIN SELECT COUNT(*) FROM (
SLECT * FROM c1_skew_left_over t1
JOIN (SELECT * FROM c1_skew_left_over WHERE c1 = 'c') t2
ON t1.c2 = t2.c2 WHERE t1.c1 > 'c' ) t;
+-----------------------------------------------------------------------------------------------+
| Explain String |
+-----------------------------------------------------------------------------------------------+
| Plan had been tuned by Plan Advisor. |
| Original query id:8e010cf4-b178-11ef-8aa4-8a5075cec65e |
| Original time cost: 148 ms |
| 1: LeftChildEstimationErrorTuningGuide |
| Reason: left child statistics of JoinNode 5 had been overestimated. |
| Advice: Adjust the distribution join execution type and join plan to improve the performance. |
| |
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:9: count |
| PARTITION: UNPARTITIONED
You can delete the tuning guide of a specific query based on the query ID returned from SHOW PLAN ADVISOR.
ALTER PLAN ADVISOR DROP <query_id>
Example:
ALTER PLAN ADVISOR DROP "8e010cf4-b178-11ef-8aa4-8a5075cec65e";
To clear all tuning guides on the current FE, execute the following statement:
TRUNCATE PLAN ADVISOR
Currently, Query Feedback is primarily used to optimize the following scenarios:
pre_aggregation mode to maximize data aggregation in the first phaseThe tuning guides are mainly based on metrics Runtime Exec Node Input/Output Rows and FE statistics estimated rows. Since the current tuning thresholds are relatively conservative, you are encouraged to leverage Query Feedback to check for potential performance improvements if issues are observed in the Query Profile or EXPLAIN string.
Below are three common user cases.
Original Bad Plan:
small left table inner join large table (broadcast)
Optimized Plan:
large left table inner join small right table (broadcast)
Cause The issue might be caused by outdated or missing statistics, which leads the Cost-Based Optimizer (CBO) to generate an incorrect plan based on inaccurate data.
Solution During query execution, the system compares the input/output rows and statistics estimated rows of Left Child and Right Child, generating tuning guides. Upon re-execution, the system automatically adjusts the Join order.
Original Bad Plan:
large left table1 inner join large right table2 (broadcast)
Optimized Plan:
large left table1 (shuffle) inner join large right table2 (shuffle)
Cause The issue may result from data skew. When the right table has many partitions and one of them contains disproportionately large amounts of data, the system may incorrectly estimate the row count of right table after predicates are applied.
Solution During query execution, the system compares the input/output rows and statistics estimated rows of Left Child and Right Child, generating tuning guides. After optimization, the Join method is adjusted from Broadcast Join to Shuffle Join.
Symptom For data with good aggregation potential, the auto mode of First-phase aggregation may only aggregate a small amount of local data, missing the opportunity for performance gains.
Solution During query execution, the system collects Input/Output Rows for both local and global aggregations. Based on historical data, it evaluates the potential of the aggregation columns. If the potential is significant, the system enforces the use of pre_aggregation mode in local aggregations, maximizing data aggregation in the first phase and improving overall query performance.