The Evolution of Microservices

June 8, 2026 • Asim Aslam

Every era of distributed systems has solved the problem the previous era created. Microservices solved the coordination cost of the monolith and created a distributed-systems problem. Containers and orchestration solved the deployment problem and created an operational one. The service mesh solved the cross-cutting problem and pushed complexity into the platform. Each step was a response to a concrete engineering constraint, not a trend.

Follow that chain to the end, and the interface the industry converged on — a named, typed, discoverable, independently deployable unit — turns out to be almost exactly the interface a language model needs to call a tool. That convergence is the technical reason agents are the next era.

The monolith and the cost of coordination

A monolith is one deployable, one process, one database, one release cadence. In-process calls, a single transaction boundary, no network in the path — technically efficient. Its limits are organisational, not technical.

As the number of engineers grows, the cost of coordinating changes to a single artifact grows faster than linearly. Everyone shares a build, a test suite, a deploy. A change in one corner can block a release in another. This is Conway's law stated as a constraint: a system's structure ends up mirroring the communication structure of the org that builds it, and a single shared artifact forces a single shared communication channel. Microservices were the response: let teams own and release their part independently. The decomposition was organisational first and technical second.

Microservices and the distributed-systems tax

The moment you split a process across the network, you inherit the network. Calls that were function invocations become RPCs that can be slow, reordered, duplicated, or simply not return. You now have partial failure — the defining property of a distributed system — and with it: service discovery (where is payments right now?), load balancing across instances, retries with backoff, idempotency, timeouts, circuit breaking to stop cascading failure, and distributed tracing because no single stack trace spans the request anymore.
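
Two items from that list, retries with backoff and idempotency, can be sketched together. This is a minimal illustration, not a production client: `FlakyPayments`, `charge`, and `call_with_retry` are hypothetical names invented here. The simulated service does the work on the first call but loses the response, which is the case where a retry without an idempotency key would charge twice.

```python
import random
import time
import uuid


class FlakyPayments:
    """Hypothetical payments service whose first reply is lost in transit."""

    def __init__(self):
        self.processed = {}  # idempotency key -> stored result
        self.charges = 0     # how many times money actually moved
        self.calls = 0       # how many requests arrived

    def charge(self, key, amount):
        self.calls += 1
        if key not in self.processed:  # dedupe on the idempotency key
            self.charges += 1
            self.processed[key] = {"status": "ok", "amount": amount}
        if self.calls == 1:
            # The work happened, but the caller never sees the response.
            raise TimeoutError("response lost")
        return self.processed[key]


def call_with_retry(fn, attempts=4, base_delay=0.01, max_delay=0.1):
    """Retry fn on TimeoutError with capped exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, delay))


service = FlakyPayments()
key = str(uuid.uuid4())  # one key per logical operation, reused on every retry
result = call_with_retry(lambda: service.charge(key, 42))
```

The request reaches the service twice, but the charge is applied once, because the retry carries the same key as the original attempt.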

The first wave of answers was libraries. Netflix open-sourced Eureka (discovery), Ribbon (client-side load balancing), and Hystrix (circuit breaking); Twitter built Finagle. The defining trait of this era: the distributed-systems concerns lived inside your application process, as code you imported. As a result, every language needed its own copy, and every service carried the weight.
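
To make "code you imported" concrete, here is a rough sketch of the kind of circuit breaker such a library provided. It is not Hystrix's API or implementation; the `CircuitBreaker` class, its parameters, and the injectable `clock` are assumptions made for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock        # injectable so the example needs no real waiting
        self.failures = 0
        self.opened_at = None     # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip, or re-trip after a failed trial
            raise
        self.failures = 0
        self.opened_at = None     # success closes the circuit
        return result


def broken():
    raise ConnectionError("payments unreachable")


now = [0.0]  # fake clock, advanced by hand
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])

outcomes = []
for _ in range(3):
    try:
        breaker.call(broken)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")
```

After two real failures the third call is rejected without touching the network. Every service in every language had to carry logic like this in its own process.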

Containers and orchestration

Two innovations made "independently deployable" actually cheap.

Docker (2013) standardised the unit of deployment as an immutable image — application plus dependencies, built once, run anywhere, isolated with namespaces and cgroups. It killed "works on my machine" and made a service a reproducible artifact rather than a deploy procedure.

Kubernetes (2014) standardised operation. Its core idea is declarative reconciliation: you describe desired state, and a control loop continuously drives the system toward it — scheduling, restarting, scaling, rolling out. Operating hundreds of independently deployable services became tractable, because lifecycle became the platform's job, not yours. The unit the platform scheduled was a container exposing declared ports — a named thing with an interface.
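
Declarative reconciliation is easier to see in code than in prose. The sketch below is a toy, not how Kubernetes is implemented: `reconcile`, `apply_actions`, and the replica-count dictionaries are hypothetical, and a real controller runs continuously against observed cluster state rather than stopping once it converges.

```python
def reconcile(desired, actual):
    """One pass of a control loop: list the actions that move actual toward desired."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    for name, have in actual.items():
        if name not in desired and have > 0:
            actions.append(("stop", name, have))  # nothing declares it: remove it
    return actions


def apply_actions(actual, actions):
    """Pretend to start or stop instances by adjusting the observed counts."""
    for verb, name, count in actions:
        delta = count if verb == "start" else -count
        actual[name] = actual.get(name, 0) + delta


desired = {"payments": 3, "search": 2}   # what the operator declared
actual = {"payments": 1, "legacy": 1}    # what is running right now

passes = 0
while True:
    actions = reconcile(desired, actual)
    if not actions:
        break                            # converged: nothing left to do
    apply_actions(actual, actions)
    passes += 1
```

The operator never says "start two payments instances." They state the target, and the loop works out the difference on every pass.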

The service mesh

By 2016 the pattern was clear: discovery, load balancing, retries, circuit breaking, mTLS, and telemetry are cross-cutting and language-agnostic. Reimplementing them as a library in every language is waste. So move them out of the process entirely.

The mechanism was the sidecar proxy — Envoy, out of Lyft — deployed next to each service, intercepting all traffic, with a central control plane (Istio) configuring it. The application stopped needing resilience libraries; the mesh did discovery and routing at L4/L7, transparently. Technically this was a clean decoupling of operational concerns from business logic. Practically, it hollowed out the library era: the value that lived in Netflix OSS migrated into infrastructure, and the service got thinner.

The correction

Around 2020 the trade-offs drew a harder look. Microservices have a real cost: serialization, network latency, eventual consistency, and the cognitive load of debugging across process boundaries. If your org isn't large enough to need independent deployability, you pay the distributed-systems tax and get little of the team-autonomy benefit. In 2023, Amazon's Prime Video team published a case study of a workload they moved from orchestrated distributed components to a single process, cutting cost by about 90%, because for a tight, high-throughput loop the serialization and orchestration overhead dwarfed the work.

The lesson wasn't "microservices were wrong." It was that "micro" was always a distraction. The thing worth paying for was independent deployability and clear ownership; granularity is a workload decision, and the cost is real.

What endured

Strip away the runtime churn — images, schedulers, sidecars — and the same primitive sits underneath every era: a service is a named, network-addressable, typed, independently deployable unit. It announces itself (registration), it's locatable (discovery), it exposes a contract (typed endpoints with schemas), and it can be deployed and scaled on its own.
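
That primitive fits in a few lines. The following is a deliberately small in-memory sketch, assuming hypothetical `Endpoint`, `Service`, and `Registry` types and a made-up address; real registries add health checks, TTLs, and watches.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    name: str
    description: str
    request: dict    # field name -> type name (the typed contract)
    response: dict


@dataclass
class Service:
    name: str        # the stable name callers use
    address: str     # where this instance can be reached
    endpoints: list = field(default_factory=list)


class Registry:
    """Toy registry: services announce themselves, callers look them up by name."""

    def __init__(self):
        self._services = {}

    def register(self, service):
        self._services.setdefault(service.name, []).append(service)

    def lookup(self, name):
        return list(self._services.get(name, []))


registry = Registry()
registry.register(Service(
    name="payments",
    address="10.0.0.12:8080",  # illustrative address
    endpoints=[Endpoint(
        name="Charge",
        description="Charge a card for an amount in cents.",
        request={"amount_cents": "int", "card_token": "str"},
        response={"charge_id": "str"},
    )],
))

instances = registry.lookup("payments")
```

Name, location, and contract are all there. Everything else in the stack is machinery for operating on records like this one.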

This shape never changed, and that's not an accident. Each new runtime layer needed something with exactly this shape to operate on. Kubernetes schedules units with declared interfaces. The mesh routes to named endpoints. Tracing correlates typed calls. The industry kept rebuilding the runtime and kept requiring the same unit, because the unit was the stable interface every layer agreed on.

The caller changes

For fifteen years, the consumer of that typed interface was deterministic code: another service, a gateway, a client. The contract was machine-to-machine on a fixed integration written ahead of time.

Then language models learned to call functions reliably. Given a set of capabilities, each described by a name, a purpose, and a typed parameter schema, a model can choose which to invoke, produce well-formed arguments, and chain calls toward a goal stated in natural language.

What an LLM needs in order to use a capability is specific: a name, a description of what it does, and a typed input/output contract. That is the definition of a service endpoint. A registration is a tool definition. Service discovery is tool discovery. The Model Context Protocol (Anthropic, 2024) is, stripped down, a discovery-and-invocation protocol for model-callable capabilities — service discovery and RPC, with a model on the other end. No one designed the microservices interface for models, but it already provides what they need.
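
The mapping from endpoint to tool is close to mechanical. Here is a sketch under stated assumptions: the endpoint dictionary, `TYPE_MAP`, and `to_tool_definition` are hypothetical, and the output follows the general name, description, JSON Schema shape that tool-calling interfaces use rather than the exact wire format of any one protocol.

```python
import json

# Map the service's own type names onto JSON Schema primitive types.
TYPE_MAP = {"int": "integer", "str": "string", "float": "number", "bool": "boolean"}


def to_tool_definition(service_name, endpoint):
    """Turn a registered endpoint into a model-facing tool description."""
    properties = {
        field_name: {"type": TYPE_MAP[type_name]}
        for field_name, type_name in endpoint["request"].items()
    }
    return {
        "name": f"{service_name}_{endpoint['name']}".lower(),
        "description": endpoint["description"],
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": sorted(properties),  # assume every request field is required
        },
    }


charge = {
    "name": "Charge",
    "description": "Charge a card for an amount in cents.",
    "request": {"amount_cents": "int", "card_token": "str"},
}

tool = to_tool_definition("payments", charge)
print(json.dumps(tool, indent=2))
```

Nothing in the conversion needed information the registration did not already carry.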

So the shift is not a new architecture. It is a change of caller: from a deterministic program that was integrated in advance, to a probabilistic reasoner that decides at runtime which typed capabilities to compose, and in what order, from intent.

Why agents are the future

The hard part of distributed systems was never building a capability. It was everything between capabilities — the integration and orchestration. Composing services into a workflow meant writing the glue: sagas, choreography, retries-with-meaning, the orchestration code that encodes "do A, then B, and if C fails compensate." That glue is where most distributed-systems effort and most distributed-systems bugs live.
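
For a sense of what that glue looks like, here is a stripped-down saga runner. It is a sketch, not a workflow engine: `run_saga` and the order steps are hypothetical, everything runs in one process, and it assumes compensations themselves never fail, which real systems cannot assume.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    On the first failure, undo the completed steps in reverse order.
    """
    done = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for prev, undo in reversed(done):
                undo()
                log.append(f"compensated:{prev}")
            return False, log
        log.append(f"done:{name}")
        done.append((name, compensate))
    return True, log


state = {"reserved": False, "charged": False}


def reserve():
    state["reserved"] = True


def release():
    state["reserved"] = False


def charge():
    state["charged"] = True


def refund():
    state["charged"] = False


def ship():
    raise RuntimeError("carrier unavailable")


ok, log = run_saga([
    ("reserve", reserve, release),
    ("charge", charge, refund),
    ("ship", ship, lambda: None),
])
```

Every workflow needs its own hand-written version of that step list, and every ordering mistake in it is a bug waiting for a partial failure.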

An agent attacks exactly that layer. Given the available tools and a goal, it can compose them dynamically — read the contracts, plan a sequence, call, observe, adapt — without that sequence being written ahead of time. The orchestration shifts from code you author to a decision the model makes against typed capabilities. The interface to software moves from fixed APIs invoked by code to capabilities invoked by intent. It changes where the work is: from writing capabilities and their glue, to exposing capabilities cleanly and letting a reasoner compose them.
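
The loop itself is small. In the sketch below, `stub_model` is a deterministic stand-in for a language model so the example runs without any external service; it, `run_agent`, and the two tools are assumptions made for illustration. A real model would choose the next call from the tool descriptions and the history, not from hard-coded branches.

```python
def get_balance(account):
    return {"account": account, "balance_cents": 1500}


def charge(account, amount_cents):
    return {"account": account, "charged_cents": amount_cents}


TOOLS = {"get_balance": get_balance, "charge": charge}


def stub_model(goal, history):
    """Stand-in for a language model: return the next tool call or a final answer."""
    if not history:
        return {"tool": "get_balance", "args": {"account": goal["account"]}}
    last = history[-1]
    if last["tool"] == "get_balance":
        if last["result"]["balance_cents"] >= goal["amount_cents"]:
            return {"tool": "charge", "args": dict(goal)}
        return {"final": "insufficient funds"}
    return {"final": "charged"}


def run_agent(goal, model, tools, max_steps=5):
    """Plan, call, observe, repeat, until the model finishes or the budget runs out."""
    history = []
    for _ in range(max_steps):
        decision = model(goal, history)
        if "final" in decision:
            return decision["final"], history
        result = tools[decision["tool"]](**decision["args"])
        history.append({"tool": decision["tool"], "args": decision["args"], "result": result})
    return "stopped: step budget exhausted", history


answer, trace = run_agent({"account": "acct_1", "amount_cents": 1200}, stub_model, TOOLS)
poor_answer, poor_trace = run_agent({"account": "acct_1", "amount_cents": 2000}, stub_model, TOOLS)
```

The same loop produces two different call sequences from two different goals. No one wrote "check the balance, then charge" as a workflow; the sequence falls out of the decisions made at each step.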

The caveats are technical. Agents are non-deterministic, higher-latency, and more expensive per call than a function invocation, and they need guardrails — stopping conditions, approval gates, sandboxing — precisely because they decide at runtime. So agents do not replace deterministic services or workflows; they sit on top of them. When the path is known, you still want fixed code. The substrate is unchanged: you still need well-defined, typed, discoverable, independently deployable capabilities. Agents don't make that substrate obsolete — they make it the most valuable layer in the stack, because a reasoner is only as good as the capabilities it can call.
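
One of those guardrails, the approval gate, can be sketched as a wrapper around the tools themselves. The names here (`guarded`, `ApprovalDenied`, the refund policy) are hypothetical, and the approval callback is a simple threshold where a real system might page a human.

```python
class ApprovalDenied(Exception):
    """Raised when a sensitive tool call is not approved."""


def guarded(tools, needs_approval, approve):
    """Wrap tools so calls to sensitive ones pass through an approval callback first."""
    def wrap(name, fn):
        def call(**args):
            if name in needs_approval and not approve(name, args):
                raise ApprovalDenied(f"{name} denied")
            return fn(**args)
        return call
    return {name: wrap(name, fn) for name, fn in tools.items()}


tools = {
    "read_balance": lambda account: 1500,
    "refund": lambda account, amount_cents: "refunded",
}


def approve(name, args):
    # Illustrative policy: refunds up to 1000 cents go through automatically.
    return args.get("amount_cents", 0) <= 1000


safe = guarded(tools, needs_approval={"refund"}, approve=approve)

small = safe["refund"](account="acct_1", amount_cents=500)
try:
    safe["refund"](account="acct_1", amount_cents=5000)
    denied = False
except ApprovalDenied:
    denied = True
```

The gate lives at the capability boundary, so it holds no matter what sequence the reasoner decides on.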

So agents are the next era, not a replacement for what came before. The architecture the last fifteen years produced was not built for language models; it converged, era by era, on the exact interface they require to act. Each wave changed the runtime and left the unit intact, because the unit was what the next wave needed. The only thing new in this wave is the caller: it reasons about which units to invoke, instead of being wired to them in advance.