docs/design/2021-04-21-unify-log-library.md
There are heterogeneous logging libraries in PingCAP's Go projects. These different logging libraries hurt development efficiency and even the user configuration experience.
It is necessary to unify those heterogeneous logging libraries.
Except for slow query logs, all other logs must satisfy the unified-log-format RFC standard.
However, in practice, the log formats turned out to be inconsistent:
- tidb_stderr is configured with text format, but the log is in json format.
- tiflash_cluster_manager.
- pd_stderr will emit both text and json logs with duplicate content (but with a few subtle differences in timestamps).

There must be something wrong with the engineering of the code above, and it must be changed. The cost of changing it is not small.
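The mixed-format symptom described above can be detected mechanically. The following is a minimal sketch, not part of the original design: `classifyLine` and `countFormats` are hypothetical helpers, the sample lines are made up, and the bracket check is only a rough heuristic for text-format lines rather than a validator for the unified-log-format RFC.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// classifyLine reports whether a log line looks like JSON, like bracketed
// text, or neither. The bracket check is a heuristic only (assumption: text
// lines start with "["); it does not validate the unified log format.
func classifyLine(line string) string {
	s := strings.TrimSpace(line)
	switch {
	case s == "":
		return "empty"
	case strings.HasPrefix(s, "{") && json.Valid([]byte(s)):
		return "json"
	case strings.HasPrefix(s, "["):
		return "text"
	default:
		return "unknown"
	}
}

// countFormats tallies how many lines of each kind appear in a log.
func countFormats(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, l := range lines {
		counts[classifyLine(l)]++
	}
	return counts
}

func main() {
	// Hypothetical sample lines, invented for illustration.
	lines := []string{
		`[2021/04/21 10:00:00.000 +08:00] [INFO] [main.go:1] ["server started"]`,
		`{"level":"info","ts":"2021-04-21T10:00:00.001+08:00","msg":"server started"}`,
	}
	counts := countFormats(lines)
	fmt.Println("json:", counts["json"], "text:", counts["text"])
	if counts["json"] > 0 && counts["text"] > 0 {
		fmt.Println("mixed formats detected")
	}
}
```

A check like this could run over a captured stderr file to confirm that a component emits only one format after the refactor.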
Rationale: for the long term, we should maintain code quality, even if that means sacrificing delivery speed when necessary.
Implementation plan:
1. pingcap/tidb first. For dependent parts, we write dummy code to satisfy them.
2. pingcap/br: remove the dependency on pingcap/tidb/util/logutil, and clear the dummy code of pingcap/tidb.
3. tikv/pd.

After the implementation, pingcap/tidb, pingcap/br, and tikv/pd all depend directly on pingcap/log and do not depend on any other log libraries (including std/log) or on each other.
The following rationales are organized by GitHub repositories.
As the common logging library at PingCAP, pingcap/log does the following things:
Log library dependencies:
For historical reasons, TiDB has two third-party logging libraries, logrus and pingcap/log. pingcap/log is a wrapper of zap.
Logs of TiDB can be divided into two types, slow query logs and the other logs. As mentioned above, these two types of logs are emitted through two different logging libraries, which results in separate configurations for the two types of logs and requires writing additional configuration conversion code.
TiDB-specific logging logic is written inside util/logutil/log.go, e.g., logger initialization, logger configuration, and so on.
Note that this file is one of the main culprits of circular dependencies. The following briefly describes the key logic in util/logutil/log.go: two init methods and four log handlers.
The init method of logrus may initialize two logrus handlers.
First, the standard log handler (the package-level handler) must be initialized: InitLogger sets up the standard logger according to the configuration.
Then, it checks whether the configuration enables the slow query log and, if so, creates a log handler specific to slow queries.
These two handlers are used in the following places:
- cmd/importer/parser.go, which uses the standard logger by logrus.
- logrus, code in executor/adapter.go.

pingcap/log is a wrapper around zap, so the two terms are used interchangeably below.
Similar to logrus, the init method of zap, func InitZapLogger(cfg *LogConfig) error, may initialize two zap handlers.
InitZapLogger's logic is very similar to logrus' above.
In main.go there is a bunch of grpc logger initialization code, which is not in util/logutil/log.go.
The NewLoggerV2 method creates a Go native logger handler and is only used in grpc.
PD is similar to TiDB in that it also relies on logrus and pingcap/log, but with an additional layer of capnslog as a proxy.
Log library dependencies:
The standard logger is then passed down to the etcd (via the capnslog proxy), grpc, and raft components as the log handler for these packages.
There is only one logrus handler inside the entire PD codebase.
Only the etcd, grpc, and raft components use the logrus handler.
The logrus initialization is located in pkg/logutil/log.go. Here is the code.
There is only one zap log handler inside the entire PD codebase, and its initialization is inlined in cmd/pd-server/main.go.
The logic is simple: create a new handler based on the configuration and replace the global handler at the pingcap/log package level.
Most of the logging logic in PD will use the global zap handler.
TiDB depends on BR, which in turn depends on TiDB's util/logutil/log.go, constituting a circular dependency.
Not only is it a circular dependency, it also happens to depend on the log component. This creates a considerable obstacle for the refactor.
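To make the circular dependency concrete, the sketch below models repositories as a directed graph and checks it for cycles with a depth-first search. The graphs are a simplified model restricted to the logging-related edges mentioned in this document; `hasCycle` is a hypothetical helper, not an existing tool.

```go
package main

import "fmt"

// hasCycle reports whether the directed graph deps contains a cycle,
// using depth-first search with three-color marking.
func hasCycle(deps map[string][]string) bool {
	const (
		white = iota // unvisited
		gray         // on the current DFS path
		black        // fully explored
	)
	color := make(map[string]int)
	var visit func(n string) bool
	visit = func(n string) bool {
		color[n] = gray
		for _, m := range deps[n] {
			if color[m] == gray {
				return true // back edge: m is already on the current path
			}
			if color[m] == white && visit(m) {
				return true
			}
		}
		color[n] = black
		return false
	}
	for n := range deps {
		if color[n] == white && visit(n) {
			return true
		}
	}
	return false
}

func main() {
	// Simplified model: BR reaches back into TiDB through util/logutil.
	before := map[string][]string{
		"pingcap/tidb": {"pingcap/br", "pingcap/log"},
		"pingcap/br":   {"pingcap/tidb", "pingcap/log"},
	}
	// Target state for the logging edges: both depend only on pingcap/log.
	after := map[string][]string{
		"pingcap/tidb": {"pingcap/br", "pingcap/log"},
		"pingcap/br":   {"pingcap/log"},
	}
	fmt.Println("before:", hasCycle(before)) // before: true
	fmt.Println("after:", hasCycle(after))   // after: false
}
```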
The following code is from pkg/lightning/log/log.go, which calls TiDB's InitLogger and then pingcap/log's InitLogger.
BR also relies on TiDB's slow log, which it initializes in the main function as SlowQueryLogger.
BR also calls TiDB's InitLogger twice in two places.
BR also creates two different zap handlers in two places, one of which is not used.
The problematic code is not listed here.
To refactor TiDB's logging functionality, you must first change BR to remove the dependency on TiDB log, then let TiDB depend on the new version of BR, and finally refactor TiDB's logging.
The refactoring must remain compatible with the historical logging logic.
This is guaranteed by unit testing.
See meta issue: https://github.com/pingcap/tidb/issues/24190.
Mainly unit testing.