docs/rfcs/20220209_limit_connections_rate.md
Similar to KIP-612, adapted to Redpanda's thread-per-core model.
New configuration properties (to start, we use a global configuration for the connection rate):
When an accepted connection exceeds one of the configured bounds, it is blocked before anything is read from or written to the socket.
If both configuration properties are unset, then no action is taken: this preserves the existing unlimited behavior.
We will track connection rate on each shard independently (without cross-core communication).
To limit the connection rate we will use a token bucket algorithm. Each bucket is a semaphore that counts connections in the current second. For overrides we will use a hash table from ip -> semaphore.
struct connection_rate { ss::semaphore current_rate; ss::lowres_clock::time_point last_update_time; };
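The lookup from a client ip to its bucket can be sketched in plain C++ as follows. This is a minimal illustration, not the proposed implementation: `bucket_sketch`, `rate_table_sketch` and `bucket_for` are hypothetical names, a plain counter stands in for `ss::semaphore`, and the rule that a per-ip override takes precedence over the global limit is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for connection_rate: a plain counter replaces
// ss::semaphore so the sketch does not depend on Seastar.
struct bucket_sketch {
    int64_t available;
};

class rate_table_sketch {
public:
    rate_table_sketch(
      std::optional<int64_t> global_rate,
      const std::unordered_map<std::string, int64_t>& override_rates) {
        if (global_rate) {
            assert(*global_rate > 0);
            _global = bucket_sketch{*global_rate};
        }
        for (const auto& [ip, rate] : override_rates) {
            assert(rate > 0);
            _overrides.emplace(ip, bucket_sketch{rate});
        }
    }

    // Returns the bucket that governs this client ip, or nullptr when no
    // limit is configured. Assumption: an override wins over the global limit.
    bucket_sketch* bucket_for(const std::string& ip) {
        if (auto it = _overrides.find(ip); it != _overrides.end()) {
            return &it->second;
        }
        return _global ? &*_global : nullptr;
    }

private:
    std::optional<bucket_sketch> _global;
    std::unordered_map<std::string, bucket_sketch> _overrides;
};
```

Returning `nullptr` when nothing is configured mirrors the "no action is taken" behavior for unset properties.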
For a new connection we will do the following:
Problem: if the maximum number of connections has been reached and new connections must block until the next second, who will signal them to unblock? To avoid this situation, we will spawn a background fiber if available_units is 0 after lock. The background fiber will call signal(kafka_max_connection_creation_rate) in the next second after it is spawned, if needed (that is, if there are still zero available tokens).
We can run a background fiber that adds kafka_max_connection_creation_rate tokens to the semaphore every second.
Note: each signal(kafka_max_connection_creation_rate) should also update last_update_time.
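The refill-on-demand idea can be sketched without Seastar as shown below. The assumptions are stated loudly: `connection_rate_sketch` is a hypothetical class, a plain counter replaces `ss::semaphore`, `std::chrono::steady_clock` replaces the Seastar clock, the caller passes the current time explicitly, a refill resets the bucket to full rather than adding tokens, and `try_acquire` returns `false` where the real design would block the connection.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

class connection_rate_sketch {
public:
    using clock = std::chrono::steady_clock;

    connection_rate_sketch(int64_t max_rate, clock::time_point now)
      : _max_rate(max_rate)
      , _available(max_rate)
      , _last_update_time(now) {
        assert(max_rate > 0);
    }

    // Returns true if the connection may proceed immediately, false if it
    // would have to wait for the next refill.
    bool try_acquire(clock::time_point now) {
        refill_if_needed(now);
        if (_available == 0) {
            return false;
        }
        --_available;
        return true;
    }

    int64_t available() const { return _available; }

private:
    // Plays the role of signal(kafka_max_connection_creation_rate): once a
    // full second has elapsed, refill the bucket and record when that happened.
    void refill_if_needed(clock::time_point now) {
        if (now - _last_update_time >= std::chrono::seconds(1)) {
            _available = _max_rate;
            _last_update_time = now;
        }
    }

    int64_t _max_rate;
    int64_t _available;
    clock::time_point _last_update_time;
};
```

Passing the time in as a parameter keeps the sketch deterministic; in this variant every accepted connection pays for one time comparison.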
We have net::server for the different types of TCP connection (rpc_kafka, internal_rpc, etc.).
The idea is to add new settings to server_configuration (settings for the connection rate). On server::start it will initialize an internal structure with the connection_rate info.
In server::accept before
ssx::spawn_with_gate(_conn_gate, [this, conn]() mutable { return apply_proto(_proto.get(), resources(this, conn)); });
We will check the current connection rate.
If we need to spawn the background fiber (to replenish tokens in the semaphore), we will spawn it before
ssx::spawn_with_gate(_conn_gate, [this, conn]() mutable { return apply_proto(_proto.get(), resources(this, conn)); });
Changes to the configuration properties affect all outstanding
connection_rate objects: the kafka_max_connection_creation_rate attribute must be recalculated.
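One possible way to recalculate a live bucket when the limit changes is sketched below. The policy is an assumption, not something this RFC specifies: the available tokens are shifted by the difference between the new and old limits and clamped so they never go negative or exceed the new limit. `resizable_bucket_sketch` and `set_max_rate` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct resizable_bucket_sketch {
    int64_t max_rate;
    int64_t available;

    // Apply a new limit. Assumed policy: shift the available tokens by the
    // change in the limit, clamped to the range [0, new_rate].
    void set_max_rate(int64_t new_rate) {
        assert(new_rate > 0);
        available = std::clamp<int64_t>(
          available + (new_rate - max_rate), 0, new_rate);
        max_rate = new_rate;
    }
};
```

With a real semaphore, raising the limit would correspond to signalling the difference and lowering it to consuming the difference.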
We should report the current rate on each shard.
overrides property
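The limit is enforced on each shard independently, so the lowest effective rate the user can set is the minimum per-shard rate multiplied by the number of shards. Users should take this into account when choosing a value.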
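The arithmetic is simple enough to state as a small helper. `node_wide_rate` is a hypothetical name used only for illustration, and the sketch assumes every shard enforces the same configured value.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on new connections per second for the whole node when each
// shard enforces per_shard_rate on its own.
inline int64_t node_wide_rate(int64_t per_shard_rate, int64_t shard_count) {
    assert(per_shard_rate > 0 && shard_count > 0);
    return per_shard_rate * shard_count;
}
```

For example, a configured rate of 1 on a node with 16 shards still admits up to 16 new connections per second across the node.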
A client may exhaust server resources by opening a very large number of connections in one second. This is okay as long as all clients are well-behaved and the system is carefully sized for its workload, neither of which is reliably true in real-life deployments.
We can run a background fiber every second to refresh the token count in the semaphore. In this case we avoid reading and comparing the time for each connection, and only the fiber calls signal().
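A minimal sketch of this variant, under the same simplifications as before: `ticked_bucket_sketch` is a hypothetical class, a plain counter replaces the semaphore, and `on_tick` stands for whatever the background fiber would do once per second.

```cpp
#include <cassert>
#include <cstdint>

class ticked_bucket_sketch {
public:
    explicit ticked_bucket_sketch(int64_t max_rate)
      : _max_rate(max_rate)
      , _available(max_rate) {
        assert(max_rate > 0);
    }

    // Accept path: no clock access, only a counter check.
    bool try_acquire() {
        if (_available == 0) {
            return false;
        }
        --_available;
        return true;
    }

    // Called once per second by the (hypothetical) background fiber.
    void on_tick() { _available = _max_rate; }

private:
    int64_t _max_rate;
    int64_t _available;
};
```

The trade-off is that the fiber runs every second even when no connections arrive.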
It will add more complexity to share information about the current rate with each core. We would also have to shard the information about overrides and map each ip to a core in order to store the rate per ip.