docs/en/administration/sql_digest.md
This topic introduces the SQL Digest feature of StarRocks. This feature is supported from v3.3.6 onwards.
SQL Digest is a fingerprint generated by historical SQL statements with parameters removed. It helps cluster SQL statements with the same structure but different parameters.
Common use cases of SQL Digest include:
In StarRocks, SQL Digests are mainly recorded through audit logs fe.audit.log. For example, execute the following two SQL statements:
SELECT count(*) FROM lineorder WHERE lo_orderdate > '19920101';
SELECT count(*) FROM lineorder WHERE lo_orderdate > '19920202';
Two same Digest will be generated in fe.audit.log:
Digest=f58bb71850d112014f773717830e7f77
Digest=f58bb71850d112014f773717830e7f77
To enable this feature, you must set the FE configuration item enable_sql_digest to true.
Execute the following statement to enable it dynamically:
ADMIN SET FRONTEND CONFIG ('enable_sql_digest'='true');
To enable it permanently, you must add enable_sql_digest = true to the FE configuration file fe.conf and restart FE.
After enabling this feature, you can install the AuditLoader plugin to perform statistical analysis on SQL statements.
SELECT * FROM starrocks_audit_db__.starrocks_audit_tbl__
WHERE digest = '<Digest>'
LIMIT 1;
SELECT
date_trunc('day', `timestamp`) query_date,
count(*),
sum(queryTime),
sum(scanRows),
sum(cpuCostNs),
sum(memCostBytes)
FROM starrocks_audit_db__.starrocks_audit_tbl__
WHERE digest = '<Digest>'
GROUP BY query_date
ORDER BY query_date
DESC LIMIT 30;
SELECT avg(queryTime), min(queryTime), max(queryTime), stddev(queryTime)
FROM starrocks_audit_db__.starrocks_audit_tbl__
WHERE digest = '<Digest>';
WITH top_sql AS (
SELECT digest, sum(queryTime)
FROM starrocks_audit_db__.starrocks_audit_tbl__
GROUP BY digest
ORDER BY sum(queryTime)
DESC LIMIT 10
)
SELECT * FROM starrocks_audit_db__.starrocks_audit_tbl__
WHERE digest IN (SELECT digest FROM top_sql);
WHERE a = 1 and WHERE a = 2 will have the same Digest.IN (1,2,3) and IN (1,2) will have the same Digest.LIMIT N clauses will be normalized. For example, similar SQL statements with LIMIT 10 and LIMIT 30 will have the same Digest.