Every serverless tutorial shows you a single function returning “hello world.” That’s not how any of this works in production.
At the fintech startup we process a ton of financial news events through Lambda. Thousands of sources, multiple enrichment steps, fan-out to different consumers. The function itself is the easy part. The hard part is everything between the functions – how they connect, how they fail, and what happens when they do at 3am.
One function, one job
We settled on one Lambda per endpoint for our API layer. No mega-functions. Keeps deployments small and blast radius tiny. When something breaks, you know exactly where to look.
Shared logic goes into layers. Connection pooling matters more than you’d think – our Postgres instance was getting hammered by cold starts spinning up fresh connections. We moved the connection outside the handler so it persists across warm invocations. Simple fix, huge difference.
import psycopg2

# Created once at module load, outside the handler, so the connection
# survives across warm invocations instead of reconnecting every time.
conn = psycopg2.connect(...)

def handler(event, context):
    with conn.cursor() as cursor:
        ...
Events are the real power
The API stuff is fine, but serverless really clicks for event-driven work. A new article lands in our pipeline, and it fans out: one function runs NLP, another tags financial entities, another pushes to user feeds. Each one independent. Each one replaceable. None of them waiting on the others.
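A minimal sketch of that fan-out property. The handler names (run_nlp, tag_entities, push_to_feeds) are made up for illustration; in production each one is its own Lambda behind SNS or EventBridge, but plain functions make the isolation visible: each consumer runs independently, and one failing doesn't block the rest.

```python
# Hypothetical consumers; each stands in for an independent Lambda.
def run_nlp(article):
    return {"summary": article["body"][:50]}

def tag_entities(article):
    # Toy heuristic: treat all-caps words as tickers.
    return {"tickers": [w for w in article["body"].split() if w.isupper()]}

def push_to_feeds(article):
    return {"pushed": True}

CONSUMERS = [run_nlp, tag_entities, push_to_feeds]

def fan_out(article):
    """Run every consumer; a failure in one never blocks the others."""
    results = {}
    for consumer in CONSUMERS:
        try:
            results[consumer.__name__] = consumer(article)
        except Exception as exc:
            results[consumer.__name__] = {"error": str(exc)}
    return results
```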
This is where you have to make a call: choreography or orchestration. Choreography (each service listens and reacts) is great until you have fifteen services and nobody can trace a flow end-to-end. Step Functions add overhead but give you a visible state machine. For our multi-step enrichment pipeline, orchestration won. For simple event reactions, choreography is still the right call.
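To make the orchestration side concrete, here's a hypothetical Step Functions definition for a three-step enrichment pipeline, written as a Python dict for readability. The state names and Lambda ARNs are invented for illustration; the keys (StartAt, Type, Resource, Retry, Next, End) are standard Amazon States Language.

```python
# Hypothetical state machine: every step, retry, and transition is
# declared up front, which is exactly the traceability choreography loses.
enrichment_pipeline = {
    "Comment": "Sequential enrichment with explicit, inspectable state",
    "StartAt": "RunNLP",
    "States": {
        "RunNLP": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:run-nlp",
            "Retry": [{"ErrorEquals": ["States.TaskFailed"], "MaxAttempts": 2}],
            "Next": "TagEntities",
        },
        "TagEntities": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:tag-entities",
            "Next": "PushToFeeds",
        },
        "PushToFeeds": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:push-to-feeds",
            "End": True,
        },
    },
}
```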
The stuff that bites you
Cold starts. Everyone talks about them; few people actually measure them per endpoint. Do that first, before throwing money at provisioned concurrency.
Webhook idempotency. External services will retry. Your function will get called twice. If you’re not handling that, you will corrupt data. We learned this one the hard way with a payment webhook that processed the same charge three times.
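The fix is boring on purpose: record the event ID before doing the work, and short-circuit duplicates. A sketch, with an in-memory set standing in for a durable store (in production you'd want something like a DynamoDB conditional put keyed on the event ID, since Lambda memory doesn't survive across containers); charge_customer and the IDs are hypothetical names.

```python
# In-memory stand-in for a durable idempotency store.
_processed = set()

def charge_customer(event_id, amount):
    """Process a webhook at most once per event ID."""
    if event_id in _processed:
        # The provider retried; we've already handled this one.
        return "duplicate-ignored"
    _processed.add(event_id)
    # ... actually apply the charge exactly once ...
    return "charged"
```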
Dead-letter queues. Set them up on day one, not after your first silent failure eats events for six hours.
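For SQS-triggered functions, pair the DLQ with partial batch responses so one poisoned record doesn't force the whole batch to retry. A sketch, assuming ReportBatchItemFailures is enabled on the event source mapping; process and its payload shape are hypothetical.

```python
import json

def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            payload = json.loads(record["body"])
            process(payload)  # hypothetical business logic
        except Exception:
            # Report only this record as failed. After max receives,
            # SQS moves it to the dead-letter queue instead of
            # re-driving the whole batch forever.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

def process(payload):
    if payload.get("poison"):
        raise ValueError("bad record")
```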
The point
Stop thinking about individual functions. Think about the flow. Where does data enter, how does it move, where does it land when something goes wrong? Get that right and serverless is genuinely great infrastructure. Get it wrong and you’ve built a distributed monolith that’s harder to debug than the thing you replaced.