Service Mesh: You Probably Don't Need One

I evaluated Istio and Linkerd for our microservices at the fintech startup. My conclusion: most teams are buying complexity they haven't earned yet.

Everyone at KubeCon this year was talking about service meshes as if they were the next mandatory layer in your stack. Istio, Linkerd, Consul Connect. Sidecar proxies that magically handle retries, mTLS, traffic splitting, observability. I spent two weeks evaluating these tools for the fintech startup and walked away unconvinced.

Here’s what a mesh actually does: it jams a proxy next to every service instance, routes all traffic through it, and lets a control plane push config to those proxies. You get consistent retries, mutual TLS without app changes, and automatic metrics collection. Sounds great on a slide deck.

The reality is uglier. Every sidecar eats memory and CPU: tens of megabytes per pod, plus a steady CPU cost for proxying, multiplied by every pod you run. At the fintech startup, we run enough services that this cost isn’t trivial, and we’re not even that big. For Dropbyke – a side project with maybe eight services – it would be absurd. You’re adding latency on every single request too. An extra hop through the proxy. Maybe a few milliseconds. Doesn’t sound like much until your p99 latency budget is already tight and you just burned a chunk of it on infrastructure plumbing.
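To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it is an assumption I picked for illustration (50 MiB and 100 millicores per sidecar, 2 ms added per service-to-service call, 200 pods, a 50 ms p99 budget); none of them are measurements from our cluster or from any particular mesh. Swap in your own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SidecarProfile:
    """Per-proxy cost. All values are assumptions to replace with measurements."""
    memory_mib: float       # resident memory per sidecar
    cpu_millicores: float   # steady-state CPU per sidecar
    latency_ms: float       # latency added per service-to-service call


def mesh_overhead(pods: int, call_depth: int, profile: SidecarProfile) -> dict:
    """Estimate cluster-wide sidecar cost and the latency added to one request.

    call_depth is the number of service-to-service calls made in sequence
    to serve a single user request.
    """
    return {
        "memory_gib": pods * profile.memory_mib / 1024,
        "cpu_cores": pods * profile.cpu_millicores / 1000,
        "added_latency_ms": call_depth * profile.latency_ms,
    }


if __name__ == "__main__":
    # Hypothetical inputs, not benchmarks.
    profile = SidecarProfile(memory_mib=50, cpu_millicores=100, latency_ms=2)
    cost = mesh_overhead(pods=200, call_depth=3, profile=profile)

    p99_budget_ms = 50
    share = cost["added_latency_ms"] / p99_budget_ms

    print(f"memory: {cost['memory_gib']:.1f} GiB, cpu: {cost['cpu_cores']:.1f} cores")
    print(f"latency: +{cost['added_latency_ms']:.0f} ms "
          f"({share:.0%} of a {p99_budget_ms} ms p99 budget)")
```

The point isn’t the exact totals. It’s that the cost scales with pod count and call depth, so it is worth running this with your own numbers before committing.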

Then there’s the operational weight. A mesh isn’t something you install and forget. It’s a new control plane. New failure modes. When a request fails, is it the app? The mesh policy? The proxy? Good luck debugging that at 2am when your team is already stretched thin keeping Kubernetes itself stable.

I keep asking one question: what specific problem are you solving that you can’t solve with something simpler?

Need retries and circuit breaking? A small client library handles that. Need edge traffic control? An API gateway. Need service-to-service encryption? Application-level TLS works fine if you own the services. Need observability? Structured logging and an APM tool get you surprisingly far.
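To show how small "a small client library" can be, here is a minimal sketch of retries with backoff plus a circuit breaker, using only the Python standard library. The names (`CircuitBreaker`, `retry`, `CircuitOpenError`) and the default thresholds are mine, chosen for illustration. This is not what we run in production, and a real version would want per-endpoint state, thread safety, and a narrower definition of which errors are retryable.

```python
import random
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are rejected without being tried."""


class CircuitBreaker:
    """Opens after consecutive failures, then allows a trial call after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let this one call through as a trial.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func, retrying failures with exponential backoff and jitter.

    An open circuit is not retried: hammering a tripped breaker defeats its purpose.
    """
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
```

Usage would look something like `retry(lambda: breaker.call(fetch_balance, account_id))`, where `fetch_balance` is a hypothetical function that makes the outbound request. It’s a few dozen lines you can read, test, and debug in-process, which is the trade I’m arguing for.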

A mesh starts making sense when you have dozens of services, a complex communication graph, and genuine policy drift you can’t manage any other way. Or when security compliance demands uniform mTLS and you realistically can’t retrofit every app. Those are real use cases. But most teams I talk to have 10 to 15 services and are adopting a mesh because it feels like the right thing to do. That’s cargo culting.

If you’re dead set on it, Linkerd is the saner choice right now. Smaller footprint, narrower scope, less to break. Istio is ambitious but heavy. Consul Connect makes sense if you’re already deep in the HashiCorp ecosystem. But honestly? Start without any of them. Add the mesh when the pain is specific and measurable. Not before.

Keep it simple. You can always add complexity later. Removing it is the hard part.