doc/development/database/load_balancing.md
With database load balancing, read-only queries can be distributed across multiple PostgreSQL nodes to increase performance.
This document provides a technical overview of how database load balancing is implemented in GitLab Rails and Sidekiq.
A few Ruby classes are involved in the load balancing process. All of them are
in the namespace Gitlab::Database::LoadBalancing:
- Host
- LoadBalancer
- ConnectionProxy
- Session

Each workload begins with a new instance of Gitlab::Database::LoadBalancing::Session.
The Session keeps track of the database operations that have been performed. It then
determines if the workload requires a connection to either the primary host or a replica host.
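The bookkeeping described above can be illustrated with a minimal sketch. The `SketchSession` class and its `connection_type` helper are hypothetical names made up for this example and are not the real `Gitlab::Database::LoadBalancing::Session`. The sketch also assumes that once a write has been recorded, the rest of the workload stays on the primary.

```ruby
# Simplified, hypothetical stand-in for a per-workload session.
# This is NOT the real Gitlab::Database::LoadBalancing::Session.
class SketchSession
  def initialize
    @performed_write = false
    @use_primary = false
  end

  # Record that the workload has written to the database.
  def write!
    @performed_write = true
  end

  # Explicitly pin the rest of the workload to the primary.
  def use_primary!
    @use_primary = true
  end

  def performed_write?
    @performed_write
  end

  # Assumption: after a write, later reads stay on the primary
  # so that they can see the data that was just written.
  def use_primary?
    @use_primary || @performed_write
  end

  # Hypothetical helper: the connection type a proxy would request
  # for a query, given whether that query is a write.
  def connection_type(query_is_write)
    write! if query_is_write
    use_primary? ? :read_write : :read
  end
end

session = SketchSession.new
first  = session.connection_type(false) # read before any write
second = session.connection_type(true)  # the write itself
third  = session.connection_type(false) # read after the write
puts [first, second, third].inspect
```

In this sketch the first read can go to a replica, while the write and every query after it request a `read_write` connection.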
When the workload requires a database connection through ActiveRecord,
ConnectionProxy first redirects the connection request to LoadBalancer.
ConnectionProxy requests either a read or read_write connection from the LoadBalancer
depending on a few criteria:
- Session has recorded a write operation previously.
- use_primary
- ignore_writes
- use_replicas_for_read_queries
- fallback_to_replicas_for_ambiguous_queries

LoadBalancer then yields the requested connection from the respective database connection pool.
It yields either:
- A read_write connection from the primary's connection pool.
- A read connection from the replicas' connection pools.

When responding to a request for a read connection, LoadBalancer
first attempts to load balance the connection across the replica hosts.
It looks for the next online replica host and yields a connection from the host's connection pool.
A replica host is considered online if it is up-to-date with the primary, based on
either the replication lag size or time. The thresholds for these requirements are configurable.
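The replica selection described above could look roughly like the following sketch. `SketchHost`, `SketchLoadBalancer`, and the threshold values are all hypothetical and chosen for illustration. They are not GitLab's classes or defaults. The sketch assumes a replica counts as online when either of the two lag measures is within its threshold, and that hosts are visited in round-robin order.

```ruby
# Hypothetical replica record. In a real system the lag values
# would come from queries against the primary and the replica.
SketchHost = Struct.new(:name, :lag_bytes, :lag_seconds)

class SketchLoadBalancer
  # Threshold values are supplied by the caller (illustrative only).
  def initialize(hosts, max_lag_bytes:, max_lag_seconds:)
    @hosts = hosts
    @max_lag_bytes = max_lag_bytes
    @max_lag_seconds = max_lag_seconds
    @index = 0
  end

  # Assumption: a replica is online when either lag measure
  # (size or time) is within its configured threshold.
  def online?(host)
    host.lag_bytes <= @max_lag_bytes || host.lag_seconds <= @max_lag_seconds
  end

  # Walk the host list round-robin, starting after the last host
  # tried, and return the first online replica. Returns nil when
  # no replica qualifies, leaving the fallback to the caller.
  def next_online_host
    @hosts.size.times do
      host = @hosts[@index]
      @index = (@index + 1) % @hosts.size
      return host if online?(host)
    end
    nil
  end
end

hosts = [
  SketchHost.new("replica-a", 0, 0),
  SketchHost.new("replica-b", 10_000_000, 120), # too far behind on both measures
  SketchHost.new("replica-c", 100, 1)
]
balancer = SketchLoadBalancer.new(hosts, max_lag_bytes: 8_000_000, max_lag_seconds: 60)
picks = Array.new(4) { balancer.next_online_host.name }
puts picks.inspect
```

Here `replica-b` is skipped on every pass because both of its lag values exceed the thresholds, so read connections alternate between `replica-a` and `replica-c`.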
When rolling out changes via feature flag, consider deploying exclusively to Sidekiq pods initially to minimize risk.
Why Sidekiq-first deployment:
Implementation example:
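```ruby
# Take the new code path only when the flag is on and the process is Sidekiq.
if feature_flag_enabled? && Gitlab::Runtime.sidekiq?
  new_changes
else
  # All other runtimes keep the existing behavior.
  existing_changes
end
```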