docs/architecture.md
Puma is a threaded Ruby HTTP application server processing requests across a TCP and/or UNIX socket.
Puma processes (there can be one or many) accept connections from the socket via
a thread (in the Reactor class). The connection,
once fully buffered and read, moves into the todo list, where an available
thread will pick it up (in the ThreadPool
class).
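The flow above can be sketched in plain Ruby. This is a simplified illustration, not Puma's implementation: `SlowConnection`, `serve`, and the `:stop` marker are hypothetical names, and a single thread stands in for the reactor while a small set of threads stands in for the thread pool.

```ruby
# Hypothetical stand-in for a client connection whose request arrives in chunks.
SlowConnection = Struct.new(:id, :chunks) do
  # Simulates reading until the whole request has been received.
  def read_fully
    chunks.join
  end
end

# Hypothetical helper: buffers every connection, then lets pool threads handle it.
def serve(connections, pool_size: 2)
  todo    = Queue.new   # fully buffered requests waiting for a thread
  results = Queue.new

  # "Reactor" thread: buffers each request completely before queueing it.
  reactor = Thread.new do
    connections.each { |conn| todo << [conn.id, conn.read_fully] }
    pool_size.times { todo << :stop }   # one stop marker per pool thread
  end

  # "Thread pool": each thread pops buffered requests off the todo list.
  pool = Array.new(pool_size) do
    Thread.new do
      while (job = todo.pop) != :stop
        id, request = job
        results << [id, "handled #{request.bytesize} bytes"]
      end
    end
  end

  [reactor, *pool].each(&:join)
  Array.new(results.size) { results.pop }.sort
end
```

The point of the split is that pool threads only ever see requests that are already complete, so a slow client costs time in the buffering thread rather than in a thread that could be running application code.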
Puma works in two main modes: cluster and single. In single mode, only one Puma
process boots. In cluster mode, a master process is booted, which prepares
(and may boot) the application and then uses the fork() system call to create
one or more child processes. These child processes all listen to the same
socket. The master process does not listen to the socket or process requests.
Its purpose is primarily to listen for and handle UNIX signals and, when needed,
kill or boot child processes.
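As a rough illustration of the cluster idea, the sketch below binds a listening socket once in a parent process and forks children that accept on the inherited socket. It is not Puma's code: `boot_cluster` is a hypothetical helper, each child serves exactly one connection and exits, and it assumes a platform where `fork` is available.

```ruby
require "socket"

# Hypothetical helper: returns [listener, port, child_pids].
def boot_cluster(worker_count)
  listener = TCPServer.new("127.0.0.1", 0)  # bound once, before forking
  port = listener.addr[1]

  pids = Array.new(worker_count) do
    fork do
      # Each child inherits the listening socket and accepts on it directly.
      client = listener.accept
      client.puts(Process.pid)
      client.close
      exit!(0)  # skip at_exit handlers inherited from the parent
    end
  end

  # The parent never calls accept; it only keeps track of its children.
  [listener, port, pids]
end
```

A client connecting to the returned port receives the PID of whichever child accepted the connection, which is never the parent's PID. The parent should call `Process.wait` on each child PID once the children have finished.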
We sometimes call child processes (or Puma processes in single mode)
"workers", and we sometimes call the threads created by Puma's
ThreadPool "worker threads".
Upon startup, Puma listens on a TCP or UNIX socket. The backlog of this socket is capped by the
net.core.somaxconn sysctl value.
The backlog determines the size of the queue for unaccepted connections. If
the backlog is full, the operating system does not accept new connections.
This socket backlog is distinct from the backlog of work as reported by
Puma.stats or the control server. The backlog that Puma.stats refers to
represents the number of connections in the process' todo set waiting for
a thread from the ThreadPool.
A separate thread (created by the Reactor class) reads and buffers requests from the
socket.
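To make the socket backlog cap concrete, here is a minimal sketch using Ruby's standard socket library. It assumes Linux, where a requested listen backlog larger than net.core.somaxconn is silently reduced to that value. The helper names `effective_backlog` and `read_somaxconn` and the requested value of 1024 are illustrative choices, not something this document specifies.

```ruby
require "socket"

# On Linux, the kernel caps a requested listen backlog at net.core.somaxconn.
def effective_backlog(requested, somaxconn)
  [requested, somaxconn].min
end

# Returns the current somaxconn value, or nil where /proc is unavailable.
def read_somaxconn(path = "/proc/sys/net/core/somaxconn")
  File.readable?(path) ? File.read(path).to_i : nil
end

server = TCPServer.new("127.0.0.1", 0)  # port 0: let the OS pick a free port
server.listen(1024)                     # requested backlog (illustrative value)

limit = read_somaxconn
puts "effective backlog: #{effective_backlog(1024, limit)}" if limit
server.close
```

This number describes the kernel's queue of connections that have not been accepted yet. It says nothing about the backlog reported by Puma.stats, which counts connections already accepted and waiting for a thread.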
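Puma exposes the time spent waiting for the request body to the Rack application as env['puma.request_body_wait']
(milliseconds).
A worker thread processes the request by calling the configured Rack
application. The Rack application generates the HTTP response.
queue_requests
The queue_requests option is true by default, enabling the separate reactor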
thread used to buffer requests as described above.
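As a small illustration of the env['puma.request_body_wait'] value mentioned earlier, a Rack application could read it like this. `BodyWaitReporter` is a hypothetical name, and the fallback of 0 for a missing key is an assumption made for the example rather than documented Puma behavior.

```ruby
# Hypothetical Rack app: reports the request-body wait time found in the env.
BodyWaitReporter = lambda do |env|
  # Fall back to 0 when the key is absent (assumption for this sketch).
  wait_ms = env.fetch("puma.request_body_wait", 0)
  [200, { "content-type" => "text/plain" }, ["body wait: #{wait_ms} ms\n"]]
end
```

Calling the lambda with a hand-built env hash is enough to try it outside a server, since a Rack application is just an object that responds to `call`.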
If set to false, this buffer will not be used for connections while waiting
for the request to arrive.
In this mode, when a connection is accepted, it is added to the "todo" queue immediately, and a worker thread synchronously does any waiting necessary to read the HTTP request from the socket.
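For reference, turning the option off in a Puma configuration file might look like the fragment below. The file path is the conventional location and is an assumption here. The fragment only makes sense when loaded by Puma, not as a standalone script.

```ruby
# config/puma.rb (fragment; path assumed)
# Disable the separate request-buffering thread described above.
queue_requests false
```

With this setting, a slow client keeps a worker thread busy for as long as its request takes to arrive, so it is typically worth considering only when something in front of Puma already buffers requests.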