import SupportOptions from "@/components/SupportOptions";
import NextStep from "@/components/NextStep";
import Alert from "@/components/DocsAlert";
import Link from "next/link";
import Image from "next/image";

Firezone is a distributed system with many moving parts, but some parts are especially critical to the integrity of the entire system: authentication, policy evaluation, connection establishment, DNS resolution, and high availability.

These are explained in more detail below.

Firezone authenticates users using two primary methods:
The authentication process for each is similar. Both methods begin the authentication process at your Firezone account's sign in page: `https://app.firezone.dev/<your-account>`.

However, the OIDC flow redirects the user to the identity provider for authentication before the final redirect back to Firezone.
Here's how the authentication flow works:
<Link
  target="_blank"
  href="/images/kb/architecture/critical-sequences/authentication.svg"
>
  <Image
    src="/images/kb/architecture/critical-sequences/authentication.svg"
    alt="Firezone authentication sequence diagram"
    width={1200}
    height={1200}
  />
</Link>

1. The User clicks `Sign in` from the Client.
1. The Client generates `state` and `nonce` values. These are used to prevent certain kinds of forgery and injection attacks.
1. The Client opens `https://app.firezone.dev/<your-account>` containing the `nonce` and `state` parameters.
1. After the User authenticates, Firezone issues a `token` created from the `nonce` parameter and other information.
1. Firezone redirects to `firezone-fd0020211111://handle_client_sign_in_callback` with the `token` and `state` parameters from the initial request.
1. The Client verifies that the `state` parameter matches what it originally sent. This prevents other applications from injecting tokens into the Client's callback handler.
1. The Client stores the `token` in a platform-specific secure storage mechanism, for example Keychain on macOS and iOS.
1. The Client reads the `token` and uses it to authenticate with the control plane API.

Policy evaluation is the process the Policy Engine uses to decide whether to allow or deny a connection request from a Client to a Resource.

If the request is allowed, connection setup information is sent to the Client and the appropriate Gateway. If the request is denied, it's logged and then dropped. This ensures that Clients are only connected to Gateways that are serving Resources the User is allowed to access.
<Alert color="info">
  Connections in Firezone are **always** default-deny. Policies must be created
  to allow access.
</Alert>

Here's how the process works:

<Link
  target="_blank"
  href="/images/kb/architecture/critical-sequences/policy-evaluation.svg"
>
  <Image
    src="/images/kb/architecture/critical-sequences/policy-evaluation.svg"
    alt="Firezone policy evaluation sequence diagram"
    width={1200}
    height={1200}
  />
</Link>

Since the Client only receives WireGuard keys and NAT traversal information when a connection is allowed, it's not possible for a Client to exchange packets with the Gateway until explicitly allowed by the Policy Engine.

This means Gateways remain invisible to the outside world, helping to protect against classes of attacks that perimeter-based models may be susceptible to, such as DDoS attacks.
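
The default-deny behavior described above can be illustrated with a minimal sketch. The `Policy` record, the group and resource names, and the shape of the returned setup information are all hypothetical; this is not Firezone's Policy Engine, only one way to express "allow only on an explicit match".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record: members of `group` may access `resource`."""
    group: str
    resource: str


def evaluate(user_groups: set, resource: str, policies: list) -> bool:
    """Return True only if some policy explicitly allows the request.

    With no matching policy the expression is False, which is the
    default-deny behavior.
    """
    return any(
        p.resource == resource and p.group in user_groups
        for p in policies
    )


def handle_request(user_groups: set, resource: str, policies: list) -> dict:
    """Sketch of the two outcomes: send setup information, or deny."""
    if evaluate(user_groups, resource, policies):
        # Placeholder for the WireGuard keys and NAT traversal information.
        return {"allowed": True, "setup": f"connection-setup-for:{resource}"}
    # A denied request never yields setup information.
    return {"allowed": False, "setup": None}
```

Because a denied request returns no setup information at all, the caller in this sketch has nothing it could use to reach a Gateway.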
The "Policy evaluation" section above touches on this topic briefly in step 6. This section describes in more detail how the connection is established.
Firezone supports NAT traversal, which means that neither the Client nor the Gateway needs to be exposed to the public Internet. Instead of having to open and forward a port on the NAT device, direct connections are established via a technique called "hole-punching". If that fails, the connection falls back to using a TURN server. We operate TURN servers in every region offered by Azure to minimize the overhead, regardless of where you are in the world.
To establish connections, Firezone implements the Interactive Connectivity Establishment (ICE) RFC. ICE is essentially an algorithm where two peers that would like to connect to each other first perform what is called "candidate gathering". They then test these candidates and nominate the best one.
Once ICE is finished, the nominated candidate pair is used to handshake a WireGuard session, which then allows encrypted packets to be sent back and forth.
Candidates are socket addresses a peer can send data from and receive data on.
Once we have gathered all relevant candidates, connecting to another peer is as simple as exchanging and testing them. As we don't have a direct connection yet, this step is done via the Firezone control plane API. ICE then forms an NxM matrix of all candidates and starts testing them for connectivity. Testing a so-called candidate pair boils down to sending a UDP packet to the remote. If we receive an answer back for a certain candidate pair, the test is successful. After 12 attempts without a response, we consider the candidate pair failed. All successful candidate pairs are then ranked by priority such that direct connections are considered better than those involving a TURN server. The best candidate pair is then "nominated" and declared as the result of the ICE algorithm.
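
The pairing, testing, and nomination steps above can be sketched as follows. This is a simplified illustration, not Firezone's implementation: the `Candidate` type, the `probe` callback (standing in for sending a UDP packet and waiting for an answer), and the three-level `KIND_RANK` ordering are assumptions made for the example. Real ICE computes numeric priorities per candidate pair.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Optional, Tuple

MAX_ATTEMPTS = 12  # attempts before a pair is considered failed (from the text)

# Illustrative ranking only: lower is better, direct kinds beat relayed ones.
KIND_RANK = {"host": 0, "srflx": 1, "relay": 2}


@dataclass(frozen=True)
class Candidate:
    kind: str  # "host", "srflx" (server-reflexive) or "relay"
    addr: str  # socket address, e.g. "35.10.10.10:52625"


Pair = Tuple[Candidate, Candidate]


def check(pair: Pair, probe: Callable[[Pair], bool]) -> bool:
    """Probe a pair up to MAX_ATTEMPTS times; succeed on the first answer."""
    return any(probe(pair) for _ in range(MAX_ATTEMPTS))


def nominate(local, remote, probe) -> Optional[Pair]:
    """Form the N x M matrix, test every pair, and pick the best survivor."""
    pairs = list(product(local, remote))
    working = [p for p in pairs if check(p, probe)]
    if not working:
        return None
    # A pair is only as direct as its least direct side.
    return min(working, key=lambda p: max(KIND_RANK[p[0].kind], KIND_RANK[p[1].kind]))
```

With this ranking, a pair that involves a relay candidate is nominated only when every pair made of host and server-reflexive candidates has failed its checks.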
WireGuard itself does not establish any connections and instead just represents a state machine that manages key rotation, encryption, and decryption of packets.
With the "nominated" candidate pair as the output of the ICE algorithm, we now have a pair of socket addresses for exchanging UDP packets between Client and Gateway. WireGuard's handshake requires prior knowledge of the remote's public key. Similar to the candidates, these keys have also been exchanged between Client and Gateway via the Firezone control plane API. Using these public keys, Client and Gateway exchange secret session keys via Diffie-Hellman. These session keys are then used to encrypt packets and are rotated every 2 minutes.
The somewhat magical aspect of hole-punching actually happens entirely implicitly as part of this process. Most NAT devices are inherently stateful in that they remember the source port of an outgoing packet and allow packets arriving at the same port back in. Both peers in the above algorithm form the same NxM matrix of candidates and hence send packets to the same socket address the other one is sending from.
For example, assume a Client's public IP is `35.10.10.10` and the Gateway's public IP is `40.10.10.10`. Locally, both Firezone Clients and Firezone Gateways are listening on port `52625` by default. As part of the candidate gathering, they will resolve their respective public IP. The NxM matrix will therefore include a pair of `35.10.10.10:52625 <> 40.10.10.10:52625` on both sides.

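1. The Client sends a UDP packet from port `52625` towards `40.10.10.10:52625`.
1. The Gateway sends a UDP packet from port `52625` towards `35.10.10.10:52625`.
1. The Client's NAT device sets the source address of the outgoing packet to `35.10.10.10:52625`. It also registers a new "connection" on this port.
1. The Gateway's NAT device sets the source address of the outgoing packet to `40.10.10.10:52625`. It also registers a new "connection" on this port.
1. The Client's packet arrives at the Gateway's NAT device. A lookup of the packet's source address (`35.10.10.10:52625`) yields the connection created in step 4.
1. The Gateway's packet arrives at the Client's NAT device. A lookup of the packet's source address (`40.10.10.10:52625`) yields the connection created in step 3.

Due to differing latencies, the timing of these steps can vary in practice. That isn't an issue, though, as UDP is by design stateless and the next packet will simply be allowed through.
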
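
The exchange described above can be simulated with a toy model of a stateful NAT. The `Nat` class is hypothetical and assumes a port-preserving NAT, as in the example addresses; real NAT devices vary in how they map ports and how long they keep state.

```python
class Nat:
    """Toy stateful NAT that remembers which remotes each port has sent to."""

    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self.connections = set()  # entries are (local port, remote address)

    def send(self, port: int, remote: str) -> str:
        """Handle an outgoing packet.

        Registers a "connection" for (port, remote) and returns the source
        address the packet carries after translation.
        """
        self.connections.add((port, remote))
        return f"{self.public_ip}:{port}"

    def accepts(self, port: int, source: str) -> bool:
        """Let an incoming packet through only if we already sent to its source."""
        return (port, source) in self.connections


client_nat = Nat("35.10.10.10")
gateway_nat = Nat("40.10.10.10")

# The Client sends first; the Gateway's NAT has no state yet and drops it.
client_src = client_nat.send(52625, "40.10.10.10:52625")
dropped_early = not gateway_nat.accepts(52625, client_src)

# Once the Gateway has sent its own packet, both directions are open.
gateway_src = gateway_nat.send(52625, "35.10.10.10:52625")
open_both_ways = (
    gateway_nat.accepts(52625, client_src)
    and client_nat.accepts(52625, gateway_src)
)
```

The early packet is lost in this run, which mirrors the timing note above: nothing needs to be retransmitted at the NAT level, because the next packet finds the connection entry in place.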
Relayed connections, i.e. those involving a TURN server, work in a very similar way and are in fact transparent to the WireGuard handshake and the encrypted packets. For relayed connections, the output of ICE is still a candidate pair, except that the "local" side of the pair is the allocated socket on the TURN server. In other words, if a relayed candidate pair is nominated, then none of the candidate pairs involving host and server-reflexive candidates have been successful and thus a relay candidate turned out to be the one with the highest priority.
To send data through a TURN server, the sender constructs a channel-data packet. This packet is a lightweight wrapper consisting of a 4-byte header that identifies a previously allocated channel. Channel allocation is handled as part of the TURN client-server protocol. Each channel is bound to a single remote peer, and all packets sent on that channel are forwarded to that peer. When the TURN server receives a channel-data packet, it removes the header and forwards the remaining payload to the designated peer. To improve their efficiency and throughput, Firezone's TURN servers make use of eBPF and the eXpress Data Path (XDP) to implement this routing of channel-data packets directly in the kernel, specifically in the network card driver, even before the packet gets parsed.
To the receiver of the packet, this process is transparent. The receiver simply sends and receives UDP traffic to and from an IP and port and, outside of applying heuristics based on e.g. IP location, cannot tell whether that socket belongs to a relay or to the remote peer directly.
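
To make the channel-data wrapper concrete, here is a minimal sketch of building and unwrapping such a packet. The header layout (a 2-byte channel number followed by a 2-byte payload length, both in network byte order) and the channel number range follow the TURN specification. The `relay` function and the `bindings` table are hypothetical stand-ins for the server's forwarding step, not Firezone's eBPF code.

```python
import struct

# 4-byte header: channel number, then payload length (network byte order).
HEADER = struct.Struct("!HH")


def encode_channel_data(channel: int, payload: bytes) -> bytes:
    """Wrap a payload in a channel-data header."""
    if not 0x4000 <= channel <= 0x4FFF:  # channel range per RFC 8656
        raise ValueError("channel number outside the TURN channel range")
    return HEADER.pack(channel, len(payload)) + payload


def decode_channel_data(packet: bytes):
    """Split a channel-data packet into (channel, payload)."""
    channel, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated channel-data packet")
    return channel, payload


def relay(packet: bytes, bindings: dict):
    """Strip the header and return (peer address, payload) for forwarding."""
    channel, payload = decode_channel_data(packet)
    return bindings[channel], payload
```

The payload itself is never inspected, which is part of why this forwarding step lends itself to being done very early in the packet path.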
Secure DNS resolution is a critical function in most organizations.
Firezone employs a unique, granular approach to split DNS to ensure traffic intended only for DNS-based Resources is routed through Firezone, leaving other traffic untouched -- even when resolved IP addresses overlap.
To achieve this, Firezone embeds a tiny, in-memory DNS resolver in each Client that intercepts all DNS queries on the system.
When the resolver sees a query that doesn't match a known Resource, it operates in pass-through mode, forwarding the query to the system's default resolvers or configured upstream resolvers in your account.
If the query matches a Resource, however, the following happens:
1. The resolver generates an IP from the range `100.96.0.0/11` or `fd00:2021:1111:8000::/107` and stores an internal mapping of this IP to the DNS name originally queried. The IP is returned to the application that made the query.

This is why you'll see DNS-based Resources resolve to IPs such as `100.96.0.1` while the Client is signed in:

```text
> nslookup github.com
Server:         100.100.111.1
Address:        100.100.111.1#53

Non-authoritative answer:
Name:    github.com
Address: 100.96.0.1
```

Notice in the above process that at no point does the Client's system resolver see the actual IP address of the Resource. This ensures that your DNS data remains private and secure.
For a deeper dive into how (and why) DNS works this way in Firezone, see the How DNS works in Firezone article.
This is a common source of confusion among new Firezone users, so it's helpful to explain why Firezone uses mapped IPs for DNS Resources instead of simply using the actual resolved IP.
Consider the case where two DNS Resources resolve to the same IP address, such as when Name-based virtual hosting is used to host two web applications on the same server:
- `gitlab.company.com` resolves to IP `172.16.0.5`
- `jenkins.company.com` also resolves to IP `172.16.0.5`

Remember that routing happens at the IP level. We can't independently route packets for the same IP to two different places. If Firezone used the Resource's actual IP address to route packets, the User would be able to access `jenkins.company.com` if they were granted access only to `gitlab.company.com`.

Using mapped IPs allows Firezone to securely route DNS Resources no matter how many other services share the same IP address.
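
The mapped-IP idea can be sketched in a few lines. The `MappedResolver` class and its sequential allocation strategy are assumptions made for illustration. The address range is the IPv4 range mentioned earlier, but how Firezone actually picks addresses within it is not specified here.

```python
import ipaddress


class MappedResolver:
    """Toy resolver that hands out one mapped IP per DNS name."""

    def __init__(self, cidr: str = "100.96.0.0/11"):
        # Iterator over the usable addresses in the range, consumed in order.
        self._free = iter(ipaddress.ip_network(cidr).hosts())
        self._by_name = {}  # DNS name -> mapped IP
        self._by_ip = {}    # mapped IP -> DNS name

    def resolve(self, name: str):
        """Return the mapped IP for a name, allocating one on first use."""
        if name not in self._by_name:
            ip = next(self._free)
            self._by_name[name] = ip
            self._by_ip[ip] = name
        return self._by_name[name]

    def name_for(self, ip: str):
        """Look up which DNS name a mapped IP stands for, if any."""
        return self._by_ip.get(ipaddress.ip_address(ip))


resolver = MappedResolver()
gitlab_ip = resolver.resolve("gitlab.company.com")
jenkins_ip = resolver.resolve("jenkins.company.com")
```

Both names share `172.16.0.5` on the real network, yet in this sketch they receive different mapped IPs. Packets can therefore be routed, and access decided, per DNS name rather than per real address.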
Firezone was designed from the ground up to support high availability requirements. This is achieved through a combination of load balancing and automatic failover, described below.
When a Client wants to connect to a Resource, Firezone automatically selects a healthy Gateway in the Site to handle the request based on the Client's geolocated IP address. The system calculates the geographic distance to each available Gateway and selects the one that is closest to the Client's location. This ensures optimal performance with the lowest possible latency.
The Client maintains the connection to that Gateway until either the Client disconnects or the Gateway becomes unhealthy.
This effectively shards Client connections across all Gateways in a Site, achieving higher overall throughput than otherwise possible with a single Gateway.
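
The selection rule above (closest healthy Gateway) might look like the following sketch. The `Gateway` record, the coordinates, and the use of the haversine great-circle formula are assumptions for illustration. The text only states that a geographic distance is calculated, not which formula is used.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import Optional

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Gateway:
    name: str
    lat: float
    lon: float
    healthy: bool = True


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def select_gateway(client_lat: float, client_lon: float, gateways) -> Optional[Gateway]:
    """Pick the healthy Gateway closest to the Client's geolocated position."""
    healthy = [g for g in gateways if g.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda g: haversine_km(client_lat, client_lon, g.lat, g.lon))
```

Filtering on health before ranking by distance means that an unhealthy Gateway is never chosen, however close it is, so the same function also covers the case where the nearest Gateway has failed.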
Two or more Gateways deployed within a Site provide automatic failover in the event of a Gateway failure.
Here's how it works:
By using two independent health checks in the portal and the Client, Firezone ensures that temporary network issues between the Client and portal do not interrupt existing connections to healthy Gateways.
{(<NextStep href="/kb/architecture/security-controls">Next: Security controls</NextStep>)}
<SupportOptions />