docs/design/2022-12-06-support-late-materialization.md
TiFlash is the key component that makes TiDB essentially an Hybrid Transactional/Analytical Processing (HTAP) database.
Operating on positions opens a number of different query processing strategies centered around so-called "materialization". Roughly speaking, materialization is the process of switching positions back into values. The literature presents two main strategies: early and late materialization. The basic idea of late materialization is to operate on positions and defer tuple reconstruction for as long as possible. Refer to A Comprehensive Study of Late Materialization Strategies for a Disk-Based Column-Store.
There are two kinds of late materialization:
Late materialization in selection
Late materialization in join (#Fig.2)
Only the late materialization in selection will be considered this time.
Late materialization can help improve performance of some AP queries. Like SELECT * FROM t WHERE col_str LIKE '%PingCAP%' The normal execution processes are as follows:
col_str LIKE '%PingCAP%'With late materialization, the execution processes can be:
col_str LIKE '%PingCAP%'And for more complex queries, like SELECT * FROM t WHERE col_str1 LIKE '%PingCAP%' AND REGEXP_LIKE(col_str2, 'new\\*.\\*line'), late materialization, the execution processes can be:
col_str1 LIKE '%PingCAP%'REGEXP_LIKE(col_str2, 'new\\*.\\*line')Note that, REGEXP_LIKE(col_str2, 'new\\*.\\*line') is a heavy cost filter condition, filter out some data by col_str LIKE '%PingCAP%' first can reduce lots of useless calculation.
Therefore, when late materialization is enabled, we can not only reduce the number of data packs to be read but also reduce some useless calculation. If the selectivity of the pushed down filter is considerable, the performance of the query can be highly improved.
But TiFlash still does not support late materialization now. So, we are making a proposal to support it.
In brief, we will divide the filter conditions of selection(whose child is tablescan) into two parts:
We will decide which filter conditions will be pushed down to tablescan in TiDB planner. We should consider the following factors:
The algorithm is:
func selectPushDownConditions(conditions []expression.Expression) []expression.Expression {
selectedConds := make([]expression.Expression, 0, len(conditions))
selectedColumnCount = 0
// selectivity = (row count after filter) / (row count before filter), smaller is better
// income = (1 - selectivity) * restColumnCount, greater is better
// We use income is avoid the following case:
// selectivity is 0.7, restColumnCount is 10, income is 3
// after push down a predicate, selectivity becomes 0.6, restColumnCount becomes 7, income becomes 2.8, which is smaller than 3
// but the predicate is not pushed down, which is not expected
selectedSelectivity := 1.0
selectedIncome := 0.0
// Group them by the column, sort the conditions by the selectivity of the group.
// input: [cond1_on_col1, cond2_on_col2, cond3_on_col2, cond4_on_col1, cond5_on_col3]
// - group: [[cond1_on_col1, cond4_on_col1], [cond2_on_col2, cond3_on_col2], [cond5_on_col3]]
// - sort: [[cond2_on_col2, cond3_on_col2], [cond1_on_col1, cond4_on_col1], [cond5_on_col3]]
// where group on col2 has the highest selectivity, and group on col1 has the second highest selectivity.
// output: [[cond2_on_col2, cond3_on_col2], [cond1_on_col1, cond4_on_col1], [cond5_on_col3]]
sortedConds = groupByColumnSortBySelectivity(conditions)
for _, condGroup := range sorted_conds {
mergedConds := append(selectedConds, condGroup...)
// If the group contains heavy cost functions, skip it.
// heavy cost functions: json functions, ...(to be discussed)
if withHeavyCostFunction(condGroup) {
continue
}
// If the group contains too many columns, skip it.
// too many columns: more than 25% of total number of columns to be read, ...(to be discussed)
if isTooManyColumns(condGroup) {
continue
}
// Calculate the selectivity of the merged filter conditions.
selectivity := physicalTableScan.tblColHists.Selectivity(mergedConds)
colCnt := columnCount(mergedConds)
income := (1 - selectivity) * (float64(totalColumnCount) - float64(colCnt))
// If selectedColumnCount does not change,
// or the income increase larger than the threshold after pushing down the group, push down it.
if colCnt == selectedColumnCount || income > selectedIncome {
selectedConds = mergedConds
selectedColumnCount = colCnt
selectedIncome = income
selectedSelectivity = selectivity
}
// If the selectivity improvement is small enough, break the loop
// to reduce the cost of calculating selectivity.
if selectedSelectivity-selectivity < selectivityImproveThreshold {
break
}
}
return selectedConds
}
Since we need statistics to calculate the selectivity of the filter conditions, so this algorithm should be executed in postOptimize phase.
Obviously, beacuse the selectivity of the filter conditions is accurate, and the algorithm is not optimal, we can not guarantee that the pushed down filter conditions are the best. In order to patch a workaround for performance degradation, we will add a variable tidb_enable_late_materialization, and set @@tidb_enable_late_materialization=OFF to disable optimizer to push down filter conditions.
Therefore, it should work like this:
# assume that function FUNC(col_str) is heavy cost function.
mysql> SET @@tidb_enable_late_materialization=ON; EXPLAIN SELECT RegionID, CounterID, TraficSourceID, SearchEngineID, AdvEngineID FROM hits WHERE URL LIKE "%google%" AND FUNC(SearchPhrase) > 0;
+------------------------------+-------------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------+-------------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------+
| TableReader_15 | 79997997.60 | root | | data:ExchangeSender_14 |
| └─ExchangeSender_14 | 79997997.60 | mpp[tiflash] | | ExchangeType: PassThrough |
| └─Projection_5 | 79997997.60 | mpp[tiflash] | | hits.hits.regionid, hits.hits.counterid, hits.hits.traficsourceid, hits.hits.searchengineid, hits.hits.advengineid |
| └─Selection_13 | 79997997.60 | mpp[tiflash] | | gt(func(hits.hits.searchphrase), 0) |
| └─TableFullScan_12 | 99997497.00 | mpp[tiflash] | table:hits | pushed down filter: like(hits.hits.url, "%google%", 92), keep order:false |
+------------------------------+-------------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
# disable late materialization
mysql> SET @@tidb_enable_late_materialization=OFF; EXPLAIN SELECT RegionID, CounterID, TraficSourceID, SearchEngineID, AdvEngineID FROM hits WHERE URL LIKE "%google%" AND FUNC(SearchPhrase) > 0;
+------------------------------+-------------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------+-------------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------+
| TableReader_15 | 79997997.60 | root | | data:ExchangeSender_14 |
| └─ExchangeSender_14 | 79997997.60 | mpp[tiflash] | | ExchangeType: PassThrough |
| └─Projection_5 | 79997997.60 | mpp[tiflash] | | hits.hits.regionid, hits.hits.counterid, hits.hits.traficsourceid, hits.hits.searchengineid, hits.hits.advengineid |
| └─Selection_13 | 79997997.60 | mpp[tiflash] | | gt(func(hits.hits.searchphrase), 0), like(hits.hits.url, "%google%", 92) |
| └─TableFullScan_12 | 99997497.00 | mpp[tiflash] | table:hits | keep order:false |
+------------------------------+-------------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
The Process of execution of tablescan should be changed to:
There is no optimal algorithm to determine which filter conditions should be pushed down. We can only try to implement a good solution. So, the performance of some AP queries may degrade.
We will try to make sure most AP queries would not go with performance degradation. And we can add add a variable tidb_enable_late_materialization to decide whether to enable late materialization or not.