docs/design/2022-07-20-session-manager.md
This document proposes the design of a new TiDB component called Session Manager. It keeps client connections alive while the TiDB server upgrades, restarts, scales in, or scales out.
Applications generally connect to TiDB through a connection pool to reduce the overhead of creating connections. Connections in the pool are kept alive, so TiDB has to disconnect the client connections during shutdown. This causes reconnections and QPS jitter on the application side whenever the TiDB cluster performs a rolling upgrade, restarts, or scales in. As a result, database administrators often operate TiDB clusters when QPS is at its lowest, typically in the middle of the night, which is painful.
Besides, in the TiDB Cloud Dev Tier, TiDB needs to be upgraded transparently once the latest version is ready. The current situation makes it impossible to upgrade TiDB without affecting users.
Therefore, we propose a new TiDB component called Session Manager. Applications or load balancers connect to the Session Manager instead of TiDB. The Session Manager keeps the session states of current connections and automatically redirects sessions to live TiDB instances when a TiDB instance goes down.
In the cloud, applications typically connect to the Network Load Balancer (NLB), which balances the traffic to the TiDB cluster. Session Manager is placed between the NLB and the TiDB cluster.
The NLB balances the traffic to the Session Manager, and the Session Manager balances the traffic to the TiDB cluster. Most of the time, Session Manager only forwards messages between the NLB and the TiDB instances.
The Session Manager also needs to be highly available. An easy way is to deploy multiple isolated Session Manager instances, but that is painful to maintain: for example, to modify a configuration, a user has to connect to the proxies one by one. What we need is a Session Manager cluster.
Client addresses should be recorded in slow logs, audit logs, TiDB logs, and the processlist to enable users to check the source of requests. Besides, users may configure different privileges for different client IPs. However, from the viewpoint of TiDB, the client address is the address of the Session Manager. Some proxies use the Proxy Protocol to pass the real client address to the server, and TiDB also supports the Proxy Protocol. Session Manager will likewise use the Proxy Protocol in the handshake phase.
Traditional proxies require users to configure the addresses of TiDB instances. When the TiDB cluster scales out, scales in, or switches to another TiDB cluster, the user needs to reconfigure it in the proxies.
A Session Manager instance is deployed independently of other TiDB components. To connect to the TiDB cluster, the PD addresses must be passed to the Session Manager before startup. PD embeds an etcd server that stores the addresses of all instances in the cluster. The Session Manager watches the etcd keys to detect new TiDB instances, just like TiDB instances themselves do.
The Session Manager should also do a health check on each TiDB instance to ensure it is alive, and migrate the backend connections to other TiDB instances if it is down. The health check is achieved by trying to connect to the MySQL protocol port, just like other proxies do.
Session Manager reacts differently depending on how a TiDB instance goes offline:

- When a TiDB instance needs to be shut down gracefully due to scale-in, upgrading, or restarting, it stops accepting new connections. The health check from the Session Manager fails, so the Session Manager no longer routes new connections to the instance. However, it still waits for the ongoing queries to finish, since the instance is still alive.
- When a TiDB instance quits accidentally, the ongoing queries fail immediately and the Session Manager redirects the connections.
When the Session Manager migrates a session, it needs to authenticate with the new TiDB server.
It's unsafe to store user passwords in the Session Manager, so we use token-based authentication instead.
Two new configuration items, security.session-token-signing-cert and security.session-token-signing-key, specify the signing certificate and key. The certificates on all the servers are the same, so a token signed by one server can be verified by another. To ensure security, TiDB also needs to enforce additional guarantees on the token, such as a short lifetime.
A MySQL connection is stateful. TiDB maintains a session state for each connection, including session variables, transaction states, and prepared statements. If the Session Manager redirects the frontend query from one backend connection to another without restoring the session state in the new connection, an error may occur.
The basic workflow is as follows: the Session Manager queries the session states from the old TiDB instance and restores them on the new one. TiDB supports two new statements for this purpose:

- SHOW SESSION_STATES, whose result is a JSON document describing the session states.
- SET SESSION_STATES '{...}', whose argument is exactly the result of SHOW SESSION_STATES.

Session states include session variables, prepared statements, and other connection-scoped data.
Transactions are hard to restore, so Session Manager doesn't support restoring a transaction. It must wait until the current transaction finishes, or until the TiDB instance exits due to the shutdown timeout. To know whether the session has an active transaction, Session Manager needs to track the transaction status, which it can do by parsing the status flags in the response packets.
Similarly, Session Manager doesn't support restoring a result set. If the client reads with a cursor, Session Manager must wait until all the data is fetched. It can parse the request and response packets to know whether a prepared statement is using a cursor and whether all the data has been fetched.
Besides, there are some other limitations:

- For long-running statements such as ADD INDEX and LOAD DATA, TiDB probably won't wait until they finish before shutting down. In this case, the client will be disconnected.

Static configurations, such as the port, are read before the startup of the Session Manager and cannot be changed online. These configurations can be set by command-line parameters.
For dynamic configurations, restarting Session Manager to apply them is unacceptable because Session Manager is supposed to be always online. These configurations can be overwritten at any time and take effect across the whole cluster. They can be stored on an etcd server deployed on the same machines as the Session Manager, and each Session Manager instance watches the etcd keys to pick up configuration changes in time.
Session Manager provides an HTTP API to update dynamic configurations online, just as the other components do.
Session Manager is one of the products in the TiDB ecosystem, so it's reasonable to integrate Session Manager with Grafana and TiDB-Dashboard.
Like the other components, Session Manager also reports metrics to Prometheus. The metrics include but are not limited to:
TiDB-Dashboard should be able to fetch the logs and profiling data of each Session Manager instance.
To troubleshoot the Session Manager, Session Manager provides an HTTP API to fetch instance-scoped or global-scoped data, such as:
Session Manager is supposed to be simple and stable enough that it rarely needs upgrading.
However, we can never guarantee that Session Manager is bug-free, so it still needs to support rolling upgrade. During an upgrade, the client connections will inevitably be disconnected.
Session Manager connects to the MySQL protocol port of TiDB servers, so it must be compatible with the MySQL protocol.
Session Manager is an essential component of the query path, so it's very important to ensure its stability.
We have lots of cases to test, including:
Traditional SQL proxies typically maintain the session states themselves, rather than relying on the backend SQL servers. They parse every response packet, and sometimes every request packet, to incrementally update the session states.
This is also possible for Session Manager. MySQL supports the CLIENT_SESSION_TRACK capability, which can also be used for session migration. The MySQL server can send human-readable state information and some predefined session states in the OK packet whenever the session states change.
The most significant advantage of this method is that Session Manager can support failover. Since Session Manager holds all the up-to-date session states, it can migrate sessions at any time, even if a TiDB instance fails accidentally.
However, this method has some drawbacks:

- The state type in the OK packet is encoded as int<1>, while TiDB has tens of state types, some of which are TiDB-specific. We cannot extend the state types, because doing so would break forward compatibility if MySQL adds more state types in the future.
- Session states are only safe to capture at transaction boundaries, such as after a COMMIT statement or an auto-commit DML statement.

The most attractive scenario of routing client connections is multi-tenancy.
These are some scenarios where multi-tenancy is useful:
In this architecture, the NLB is not aware of tenants. Each TiDB instance belongs to only one tenant to isolate resources. Thus, it is the Session Manager's responsibility to route sessions to the right tenant's TiDB instances.
Session Manager can distinguish tenants by the server name that the client sends via SNI (Server Name Indication) during the TLS handshake.