website/src/docs/fusion/v16/performance-tuning.md
The Fusion gateway proxies every GraphQL operation to one or more subgraphs over HTTP. The defaults work well out of the box, but high-throughput or latency-sensitive deployments can benefit from tuning the transport layer.
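This page covers HTTP client configuration, HTTP/2, request deduplication, concurrency limits, and execution timeouts.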
Fusion uses a named HttpClient to communicate with subgraphs. The default client name is "fusion", and you configure it through the standard IHttpClientFactory pattern. This gives you full control over connection behavior, timeouts, and message handlers.
A baseline Program.cs that registers the named client:
var builder = WebApplication.CreateBuilder(args);
// 1. Register the named HTTP client for subgraph communication
builder.Services.AddHttpClient("fusion");
// 2. Configure the Fusion gateway
builder
    .AddGraphQLGateway()
    .AddFileSystemConfiguration("./gateway.far");
var app = builder.Build();
app.MapGraphQLHttp();
app.Run();
"fusion": the client the gateway uses to call subgraphs. Any handler configuration you add here applies to all subgraph requests.

HTTP/2 multiplexes multiple requests over a single TCP connection, which reduces connection overhead when the gateway sends many concurrent requests to a subgraph. This is especially beneficial when subgraphs are behind a load balancer that supports HTTP/2.
When your subgraphs use TLS (HTTPS), HTTP/2 is negotiated automatically via ALPN. Set EnableMultipleHttp2Connections to true so the gateway can open additional HTTP/2 connections when a single connection's stream limit is reached:
builder.Services
    .AddHttpClient("fusion")
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        EnableMultipleHttp2Connections = true,
    });
No additional version configuration is needed. .NET negotiates HTTP/2 over TLS by default.
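The same handler is also the place to tune connection-pool behavior. The following is a sketch, not a recommended configuration: the properties are standard .NET SocketsHttpHandler settings, but every value shown is an illustrative assumption to validate under your own load.

```csharp
builder.Services
    .AddHttpClient("fusion")
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        // Open extra HTTP/2 connections once a connection's stream limit is reached.
        EnableMultipleHttp2Connections = true,

        // Assumed values: recycle pooled connections periodically and drop idle ones.
        PooledConnectionLifetime = TimeSpan.FromMinutes(2),
        PooledConnectionIdleTimeout = TimeSpan.FromMinutes(1),

        // Assumed value: fail fast when a subgraph cannot be reached.
        ConnectTimeout = TimeSpan.FromSeconds(5),

        // Assumed values: HTTP/2 keep-alive pings while requests are in flight.
        KeepAlivePingDelay = TimeSpan.FromSeconds(30),
        KeepAlivePingTimeout = TimeSpan.FromSeconds(10),
        KeepAlivePingPolicy = HttpKeepAlivePingPolicy.WithActiveRequests,
    });
```

A bounded PooledConnectionLifetime is mainly useful when subgraph addresses change behind a DNS name, for example during a rolling deployment, because connections are re-established periodically and pick up the new endpoints.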
In many Kubernetes deployments, services communicate over plaintext HTTP inside the cluster. HTTP/2 cleartext (h2c) requires explicit opt-in because .NET defaults to HTTP/1.1 for unencrypted connections.
To force HTTP/2 without TLS, set DefaultRequestVersion and DefaultVersionPolicy on the HttpClient:
using System.Net; // HttpVersion lives in System.Net

builder.Services
    .AddHttpClient("fusion", httpClient =>
    {
        // Request HTTP/2 and never fall back to HTTP/1.1
        httpClient.DefaultRequestVersion = HttpVersion.Version20;
        httpClient.DefaultVersionPolicy = HttpVersionPolicy.RequestVersionExact;
    })
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        EnableMultipleHttp2Connections = true,
    });
The subgraph must also be configured to accept HTTP/2 over cleartext. By default, Kestrel only listens on HTTP/1.1 for non-TLS endpoints. Enable h2c in each subgraph's Program.cs:
using Microsoft.AspNetCore.Server.Kestrel.Core; // HttpProtocols

builder.WebHost.ConfigureKestrel(options =>
{
    options.ListenAnyIP(5001, listenOptions =>
    {
        listenOptions.Protocols = HttpProtocols.Http1AndHttp2;
    });
});
If you are unsure whether your infrastructure supports HTTP/2 cleartext end-to-end, HTTP/1.1 works well for most internal deployments. Switch to HTTP/2 only when you have confirmed support on both the gateway and all subgraphs.
When multiple identical query requests are in flight to the same subgraph at the same time, request deduplication ensures only one HTTP request is actually sent. The first request becomes the "leader" and executes normally. Subsequent identical requests become "followers" that wait for the leader's response. Each caller receives an independent copy of the result.
Deduplication is most effective when many callers send the same query, with the same hash header values, at the same time.
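The deduplication hash includes the request body, URL, and the values of configurable hash headers. By default, the Authorization and Cookie headers are included in the hash. This means two requests are deduplicated only if they carry the same values for these headers, so a response is never shared between callers with different credentials.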
Add the request deduplication message handler with .AddRequestDeduplication() to the named HTTP client builder:
builder.Services
    .AddHttpClient("fusion")
    .AddRequestDeduplication();
By default, the Authorization and Cookie headers are included in the deduplication hash, which covers most setups. If you need additional headers to be part of the hash, for instance a tenant identifier in a multi-tenant application, add them to HashHeaders:
builder.Services
    .AddHttpClient("fusion")
    .AddRequestDeduplication(options =>
    {
        options.HashHeaders = ["Authorization", "Cookie", "X-Tenant-Id"];
    });
For service-to-service communication where the gateway does not receive cookies, you can remove Cookie from the hash:
builder.Services
    .AddHttpClient("fusion")
    .AddRequestDeduplication(options =>
    {
        options.HashHeaders = ["Authorization"];
    });
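Because deduplication is added to the same client builder as the handler configuration, both can live in one registration. A minimal sketch, assuming a service-to-service setup that hashes on Authorization plus the X-Tenant-Id header from the multi-tenant example; the header choice is illustrative, not a recommendation:

```csharp
builder.Services
    .AddHttpClient("fusion")
    // Transport tuning for every subgraph request.
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        EnableMultipleHttp2Connections = true,
    })
    // Identical in-flight queries share one HTTP request, but only when
    // these header values match as well.
    .AddRequestDeduplication(options =>
    {
        options.HashHeaders = ["Authorization", "X-Tenant-Id"];
    });
```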
Only query operations are deduplicated. Mutations and subscriptions are never deduplicated.
The gateway limits the number of simultaneous executions it processes using a concurrency gate. An execution is the work of running a single GraphQL operation end-to-end. Each query or mutation counts as one execution, and each event a subscription emits counts as one execution while its selection set runs. Capping concurrency keeps the gateway operating in its optimal throughput range. Too much work competing for the same resources (thread pool, memory, connections) can reduce overall throughput rather than increase it.
The default limit is 64 concurrent executions, a value calibrated for small containers with 1 to 4 CPUs. Depending on your CPU count and typical operation cost, you may want to increase or decrease this value to find the optimal throughput for your hardware. The limit does not reject work: executions beyond the limit are queued until a slot becomes available.
Subscriptions participate in this limit like any other operation. Each event the gateway processes consumes a slot. Idle subscriptions between events cost nothing.
Set the limit through ModifyServerOptions on the gateway builder:
builder
    .AddGraphQLGateway()
    .AddFileSystemConfiguration("./gateway.far")
    .ModifyServerOptions(options =>
    {
        options.MaxConcurrentExecutions = 128;
    });
You can override this limit for a specific HTTP endpoint using WithOptions:
app.MapGraphQLHttp()
    .WithOptions(options =>
    {
        options.MaxConcurrentExecutions = 256;
    });
Start with the default of 64 and adjust based on your workload. If you expect many long-lived subscriptions that fire frequent events, factor them into your sizing, because they contend for the same slots as queries and mutations. Set MaxConcurrentExecutions to null to disable the limit entirely.
Every execution is bounded by the ExecutionTimeout option (default 30 seconds). This applies uniformly to queries, mutations, subscription handshakes, and each subscription event. The budget covers both the time an execution spends waiting for a concurrency slot and the time it spends running. When the budget is exceeded, the execution is cancelled and the caller receives a clean timeout error.
ExecutionTimeout is the single setting that controls cancellation for every execution. Configure it with ModifyRequestOptions:
builder
    .AddGraphQLGateway()
    .ModifyRequestOptions(o => o.ExecutionTimeout = TimeSpan.FromSeconds(10));
If executions routinely time out at the gate, that is a signal to scale out or raise MaxConcurrentExecutions. Increasing ExecutionTimeout only defers the problem.
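Since both settings usually need adjusting per environment, it can help to read them from configuration rather than hard-coding them. This is a sketch under assumptions: the keys Gateway:MaxConcurrentExecutions and Gateway:ExecutionTimeoutSeconds are hypothetical names chosen for this example, the fallbacks mirror the defaults described above, and it assumes the gateway builder methods can be called as separate statements on the returned builder.

```csharp
var builder = WebApplication.CreateBuilder(args);

// Hypothetical configuration keys; fall back to the documented defaults.
var maxExecutions =
    builder.Configuration.GetValue<int?>("Gateway:MaxConcurrentExecutions") ?? 64;
var timeoutSeconds =
    builder.Configuration.GetValue<double?>("Gateway:ExecutionTimeoutSeconds") ?? 30;

var gateway = builder
    .AddGraphQLGateway()
    .AddFileSystemConfiguration("./gateway.far");

// Cap on simultaneous executions; work beyond the cap waits at the gate.
gateway.ModifyServerOptions(options =>
{
    options.MaxConcurrentExecutions = maxExecutions;
});

// Budget that covers both the wait for a slot and the execution itself.
gateway.ModifyRequestOptions(options =>
{
    options.ExecutionTimeout = TimeSpan.FromSeconds(timeoutSeconds);
});
```

With the standard ASP.NET Core configuration providers, these values could then be supplied through appsettings.json or environment variables such as Gateway__MaxConcurrentExecutions=128, which makes it easier to compare limits under load without rebuilding the gateway.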