Multi-Region Architecture: What I Wish Someone Had Told Me


At the fintech startup, we serve real-time financial data to users across Europe. Here's what I've learned from evaluating a move to multi-region – the patterns that work, the ones that burn you, and when you should even bother.

Quick take

Multi-region buys you latency and resilience. It costs you sanity. Know which one you’re trading before you start.


At the fintech startup we serve real-time financial news and data to users spread across the UK, Germany, the Nordics, and increasingly beyond. Our backend runs in a single AWS region right now. It works. But I’ve spent the last few months seriously evaluating what a multi-region setup would look like for us, and I want to lay out what I’ve found – both the architecture patterns and the ugly operational reality nobody warns you about.

Physics doesn’t care about your SLA

A request from London to eu-west-1 is fast. A request from Singapore to eu-west-1 isn’t. We’re talking 200-300ms of raw network latency before your application code even runs. Stack a few API calls on top of that, add a database query, and suddenly the page takes two seconds to load. Users don’t file bug reports for slow pages. They just leave.

I measured this myself with a cheap VPS in each continent. London to Dublin: ~10ms. Tokyo to Dublin: ~240ms. That’s not a rounding error. That’s a completely different product experience.
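The way sequential round trips stack is worth making concrete. A rough sketch, using the RTTs measured above – the call count and per-request server time are illustrative numbers, not from any real trace:

```python
# Back-of-the-envelope estimate of page load time when a page makes
# several *dependent* requests, each paying one full round trip plus
# some server-side processing time.

def page_load_estimate(rtt_ms: float, sequential_calls: int,
                       server_ms: float = 50.0) -> float:
    """Wall-clock estimate: each dependent call pays RTT + server time."""
    return sequential_calls * (rtt_ms + server_ms)

# Same page, same code – only the user's distance to the region changes.
london = page_load_estimate(rtt_ms=10, sequential_calls=4)   # 240 ms
tokyo = page_load_estimate(rtt_ms=240, sequential_calls=4)   # 1160 ms
```

Four dependent calls turn a 10ms link into a snappy page and a 240ms link into a second-plus wait – which is the "completely different product experience" in numbers.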

Three patterns, three sets of problems

Active-passive is the one everyone starts with. Your primary region handles all traffic. A standby region sits there with replicated data, waiting. You get disaster recovery. You don’t get latency improvement for remote users. The tricky part is failover – if you haven’t rehearsed it, your “30-second failover” is actually a 45-minute scramble at 3am with someone SSHing into the wrong box.

Active-active is the dream. Multiple regions serving traffic simultaneously, users routed to the nearest one. Fast everywhere. But you’ve just signed up for distributed consensus problems. Two users updating the same record in two regions at the same time? Now you need conflict resolution. Last-write-wins is easy to implement and will silently eat data. Domain-specific merge logic is correct and will take you months to build. Pick your pain.

Follow-the-sun (or read-local, write-central) is what I keep coming back to for our use case. Reads hit the nearest region. Writes go to a single primary. For a read-heavy system like ours – users consuming financial content far more than creating it – this is the sweet spot. Remote users still eat write latency, but writes are a small fraction of our traffic. The tradeoff is acceptable.
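The routing rule for read-local, write-central fits in a few lines. A minimal sketch – the region names and endpoint map are hypothetical placeholders:

```python
# Read-local, write-central routing: reads go to the caller's nearest
# region, every write goes to the single primary so mutations have one
# source of truth. Endpoints here are made up for illustration.

PRIMARY = "eu-west-1"
ENDPOINTS = {
    "eu-west-1": "https://api-eu.example.com",
    "ap-southeast-1": "https://api-ap.example.com",
}

def route(operation: str, nearest_region: str) -> str:
    if operation == "read":
        # Fall back to the primary if we have no presence nearby.
        return ENDPOINTS.get(nearest_region, ENDPOINTS[PRIMARY])
    # Writes always pay the trip to the primary region.
    return ENDPOINTS[PRIMARY]
```

For a read-heavy workload, almost every request takes the fast branch; only the write path eats the cross-region latency.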

Data is where it gets ugly

Every multi-region discussion eventually becomes a data discussion. The CAP theorem isn’t just academic. It’s the thing that makes your on-call engineer cry.

Async replication is the pragmatic choice. Your writes are fast because they don’t wait for the replica. But a user who just updated their watchlist and immediately reads it back might see stale data. For financial content, that’s bad. For user preferences, it’s annoying but survivable. You have to know your data well enough to make that call per table, sometimes per column.
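That per-table call can be made explicit in code rather than tribal knowledge. A sketch – the table names and policy choices are illustrative examples of the kind of decision described above:

```python
# Per-table consistency policy: serve reads from the local async replica
# only where staleness is acceptable. Table names are hypothetical.

REPLICA_SAFE = {
    "articles": True,           # published content: seconds of staleness is fine
    "user_preferences": True,   # stale is annoying but survivable
    "watchlists": False,        # read-your-writes matters here
    "account_balances": False,  # never serve stale financial state
}

def choose_datastore(table: str) -> str:
    # Unknown tables default to the primary – the safe, slower choice.
    return "local_replica" if REPLICA_SAFE.get(table, False) else "primary"
```

Writing the policy down like this also gives you something to review when a new table is added, instead of rediscovering the staleness question in an incident.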

Sync replication gives you consistency. It also means every write waits for a cross-region round trip. One slow region drags the whole system down. I’ve seen this kill write throughput in practice. Unless your write volume is very low and your tolerance for tail latency is very high, avoid it.

The approach I like best is regional data affinity. A UK user’s data lives in the EU region. Period. We only replicate the things that truly need to be everywhere – shared reference data, global config, that sort of thing. This sidesteps most consistency headaches because you’re not actually doing multi-master for user data. You’re doing single-master per user, distributed across regions. Much simpler failure modes.
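The affinity rule itself is simple to express. A sketch, with hypothetical country-to-region mappings and table names:

```python
# Regional data affinity: each user's data has exactly one home region
# (single-master per user); only shared reference data is replicated
# everywhere. All names below are placeholders.

HOME_REGION = {
    "GB": "eu-west-1",
    "DE": "eu-central-1",
    "SG": "ap-southeast-1",
}
GLOBAL_TABLES = {"instruments", "market_calendars", "feature_flags"}

def region_for(table: str, user_country: str) -> str:
    if table in GLOBAL_TABLES:
        return "replicated-everywhere"  # shared reference data
    # Writes and authoritative reads for this user's rows go here, period.
    return HOME_REGION.get(user_country, "eu-west-1")
```

Because no user row ever has two masters, there is no merge logic to write – the hard multi-master problem is confined to the small set of global tables.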

Getting users to the right place

DNS geo-routing is the obvious first step. Route53 does it, Cloudflare does it, everyone does it. It works well enough. The catch is DNS caching – TTLs are suggestions, not commands. ISPs will cache your records for longer than you want, which means failover via DNS is slower than you'd hope. Minutes, not seconds.
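On Route53, geo-routing means one record set per region, distinguished by SetIdentifier and a GeoLocation block. A sketch of the change batch you would submit via boto3's `change_resource_record_sets` – the hostname and IPs are placeholders (documentation addresses), and the zone ID is omitted:

```python
# Builds a Route53 ChangeBatch for geolocation routing: EU users resolve
# to one IP, Asian users to another. Hostname and IPs are placeholders.

def geo_record(continent: str, set_id: str, ip: str, ttl: int = 60) -> dict:
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            "SetIdentifier": set_id,           # must be unique per record set
            "GeoLocation": {"ContinentCode": continent},
            "TTL": ttl,                        # keep short; ISPs may ignore it anyway
            "ResourceRecords": [{"Value": ip}],
        },
    }

change_batch = {"Changes": [
    geo_record("EU", "eu", "203.0.113.10"),
    geo_record("AS", "ap", "203.0.113.20"),
]}
```

Even with a 60-second TTL, plan for failover to take minutes: resolvers that honor the TTL still hold the old answer until it expires, and plenty don't honor it at all.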

Anycast is faster for failover but harder to operate. You need BGP-level control. Most startups, us included, don’t have the network engineering chops for this. It’s the right answer at scale. It’s premature complexity at our stage.

CDNs solve the easy part (static assets, edge caching) and punt on the hard part (dynamic requests still need to reach an origin). Worth doing regardless of your multi-region strategy, but don’t confuse a CDN with actual multi-region architecture. They’re complementary, not interchangeable.

The operational tax nobody budgets for

Here’s where most multi-region proposals die, and honestly they should.

Deployments get harder. You can’t just kubectl apply and walk away. You need to decide: do you deploy to all regions simultaneously? Serially? Use one region as a canary? We’d probably do serial deployment with the secondary region first, watch it for 15 minutes, then roll to primary. That means every deploy takes longer. Every rollback is more complex. Your CI/CD pipeline just doubled in scope.
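The serial rollout described above can be sketched as a small driver: non-primary regions first, a soak period, primary last, and stop the moment a region looks unhealthy. The region names, soak length, and the health-check hook are placeholders:

```python
# Serial multi-region rollout: deploy to secondaries first, soak, then
# the primary. The push step and health check are placeholders for your
# real CI/CD hooks.

from typing import Callable, List

def rollout_plan(regions: List[str], primary: str) -> List[str]:
    """Order regions so the primary always goes last."""
    return [r for r in regions if r != primary] + [primary]

def deploy(regions: List[str], primary: str,
           healthy: Callable[[str], bool],
           soak_minutes: int = 15) -> List[str]:
    """Returns the regions actually deployed; stops at the first failure."""
    done: List[str] = []
    for region in rollout_plan(regions, primary):
        # push_artifacts(region); wait(soak_minutes)  # hypothetical hooks
        if not healthy(region):
            # Halt here: the primary is still running the old build.
            return done
        done.append(region)
    return done
```

The payoff of primary-last ordering is that a bad build burns a secondary region, not the region carrying most of your traffic – at the cost of every deploy taking at least one soak period longer.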

Monitoring has to be per-region AND global. You need to know that latency is high in ap-southeast-1 specifically, not just that “p99 latency is elevated somewhere.” If your monitoring runs in the same region as your primary, congratulations – when that region dies, so does your ability to see that it died.

Failover isn’t a feature you ship once. It’s a muscle you exercise. We do game days at the fintech startup for our current single-region setup, and even those surface surprises. A multi-region failover you’ve never tested is a multi-region failover that doesn’t work. Full stop. The first time you practice it, something will break that you didn’t expect. Better to find that on a Tuesday afternoon than during an actual incident.

Should you actually do this?

Honest answer: most teams shouldn’t. Not yet.

If your users are in one geography and your uptime requirements are met by a single region with good practices (multi-AZ, proper backups, tested restores), multi-region is complexity you’re borrowing against future needs. That debt has interest.

For us at the fintech startup, the calculus is shifting. We’re getting real users in Asia. GDPR already constrains where we store certain data. And our customers are making trading decisions based on our content – downtime costs them money, not just patience. We’ll probably go follow-the-sun within the next year. But I’m going in with my eyes open about what it costs.

Build your system so migration is possible. Abstract your data layer. Don’t hardcode region assumptions. Use infrastructure-as-code so spinning up a new region isn’t a six-week project. Then wait until the numbers actually justify the move.

The worst multi-region architectures I’ve seen were built by teams who wanted the résumé bullet point. The best were built by teams who had no choice.