docs/design/2023-03-17-non-prepared-plan-cache.md
TP queries can get lots of performance benefits from Plan Cache, but the current TiDB Plan Cache can only support Prepare/Execute Statements. Most of our users are still using general(non-prepared) queries to access TiDB instead of Prepare/Execute. So if Plan Cache can support general queries, these users can gain some performance benefits as well. Some other mature Databases have already supported this feature: 1) Oracle and 2) OceanBase.
As the left part of the above diagram show, Plan Cache can be split into 2 parts logically:
And more specifically, the backend can be abstracted as a function GetPlan(SQL, AST, Params), and it works as below:
(AST, Params) to the optimizer to generate a new plan, return this plan, and put this plan into the cache; (cache miss)To support Non-Prepared Plan Cache, logically we can re-use the whole backend, and just update the frontend a little:
(SQL, AST, Params);Currently, TiDB only caches one plan for a parameterized query.
For example, the queries SELECT * FROM t WHERE a < 1 and SELECT * FROM t WHERE a < 100000 share the same parameterized form, SELECT * FROM t WHERE a < ?, and thus share the same plan.
Due to the preceding risks and the fact that the execution plan cache only provides significant benefits for simple queries (if a query is complex and takes a long time to execute, using the execution plan cache might not be very helpful), TiDB has strict restrictions on the scope of non-prepared plan cache. The restrictions are as follows:
Scan, Selection, or Projection operators are supported, such as SELECT * FROM t WHERE a < 10 AND b in (1, 2).Agg, Limit, Window, or Sort are not supported.LIKE is not supported, such as c LIKE 'c%'.+ operation is not supported, such as a+1 < 2.JSON, ENUM, SET, or BIT type are not supported, such as SELECT * FROM t WHERE json_col = '{}'.NULL values are not supported, such as SELECT * FROM t WHERE a is NULL.SELECT * FROM t WHERE a in (1, 2, 3, ... 51).SELECT * FROM INFORMATION_SCHEMA.COLUMNS, where COLUMNS is a TiDB memory table.We'll solve these limitations step by step later on.
Prepared Plan Cache and Non-Prepared Plan Cache are totally separated, cached plans in them do not affect each other.
We implement a cacheable checker to check whether a query can be supported by Non-Prepared Plan Cache(cacheable_checker). If a query is not supported, we can just fall back to the normal optimization code path.
Plan Cache uses parameterized SQL as the key, so we need a way that can parameterize any SQL fast.
For example, parameterize select * from t where a<10 to select * from t where a<?;
Currently, we implement this on AST:
AST and replace all constant values with ?;AST.Restore to get the parameterized SQL;Introduce 2 variables below to control Non-Prepared Plan Cache behaviors:
tidb_enable_non_prepared_plan_cache: indicate whether to enable this feature;tidb_non_prepared_plan_cache_size: indicate the cache size;