What I Learned Building Our Platform Team This Year

Looking back at 2017, the single best engineering decision we made at the fintech startup was pulling infrastructure work out of the product teams and giving it a home. Not a big, formal reorg. Just two of us, initially, saying: we’re going to own CI/CD, deployment, and monitoring so the rest of the team can stop reinventing it every sprint.

That was March. By December, we had a small platform team that actually worked. Getting there was messy.

The problem was obvious

We had product engineers writing their own deployment scripts. Plural. Each service had its own way of getting to production. Some used shell scripts checked into the repo. One team had a guy who just SSH’d into the box. Logging was inconsistent. Monitoring was whatever someone remembered to set up. When something broke at 2 AM, diagnosing it meant guessing which service did what and where its logs lived.

This is fine when you have three services. We had more than three services.

What we actually built

I kept the scope ruthless. Four areas, nothing else.

Developer experience. CI/CD pipelines, a shared staging environment, deployment tooling that didn’t require tribal knowledge. This was the first thing we tackled because it was the loudest pain point.

Runtime infrastructure. Databases, caches, message queues. Standardized. Documented. Not six different Postgres configurations floating around.

Observability. Centralized logging and monitoring. One place to look when things go wrong. This alone saved us hours per incident.

Security defaults. Secrets management that wasn’t “put it in an environment variable and hope.” Auth tooling that product teams could plug into without reading a novel.

Treat it like a product or it dies

Here is the thing I didn’t expect. Building the platform was the easy part. Getting people to use it? That was the actual job.

We had product engineers with muscle memory. They knew their janky deploy scripts. They didn’t trust our new pipeline. Fair enough. So I stopped thinking of the platform as infrastructure and started thinking of it as a product. Our users were the product teams. If they didn’t adopt what we built, we failed. Full stop.

This meant sitting with teams. Watching them work. Asking dumb questions like “why did you just do that manually?” and then going back and automating whatever that was. It meant writing docs that were actually useful, not docs that checked a box. It meant having a Slack channel where people got answers fast.

Self-service was non-negotiable. If a team needed a new database, they should get it in minutes. Not file a ticket. Not wait for me to wake up. The moment we became a ticket queue, we became the bottleneck we were trying to eliminate.

How we built the team

I started alone, then pulled in one more person with a strong ops background. That was the right call. You need someone who has been paged at 3 AM and someone who knows how to build tooling that doesn’t feel like punishment to use. Pure infra people build things that work but that nobody can figure out how to use. Pure app developers build things that look nice but fall over under load.

We stayed small. Two people, then three by Q4. Attached to engineering leadership, sitting next to the product teams. Physically close. That proximity matters more than any process. When you overhear someone complaining about deploys, you fix deploys. When you’re in a different building, you build things nobody asked for.

Golden paths over mandates

We never mandated anything. I hate mandates. Instead, we built golden paths — a service template, a default pipeline config, a monitoring setup that worked out of the box. You could ignore it all and do your own thing. But the golden path was so much easier that nobody bothered.

That’s the trick. Make the right thing the easy thing. Don’t write policies. Write code that makes the policy unnecessary.

We exposed platform capabilities through simple APIs. Want a database? Here’s a YAML spec, submit it, done. Want to deploy? Push to main. Want logs? They are already where you expect them. A database request looked like this:

apiVersion: platform.example.com/v1
kind: Database
metadata:
  name: billing-db
spec:
  engine: postgresql
  size: small

Nothing fancy. Deliberately boring. Boring infrastructure is good infrastructure.
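
To make the "submit it, done" part concrete, here is a minimal sketch of the kind of validation that could sit behind a spec like the one above. It is illustrative Python, not the actual platform code. The names parse_database_spec, DatabaseRequest, SUPPORTED_ENGINES and SUPPORTED_SIZES are hypothetical, and the allowed values are only the ones that appear in the example spec. It takes the spec as an already-parsed dict so it needs nothing outside the standard library.

```python
from dataclasses import dataclass

# Assumption: these sets are hypothetical. The example spec only shows
# engine "postgresql" and size "small", so that is all this sketch accepts.
SUPPORTED_API_VERSION = "platform.example.com/v1"
SUPPORTED_ENGINES = {"postgresql"}
SUPPORTED_SIZES = {"small"}


@dataclass(frozen=True)
class DatabaseRequest:
    """A validated, typed view of a Database spec."""
    name: str
    engine: str
    size: str


def parse_database_spec(doc: dict) -> DatabaseRequest:
    """Validate an already-parsed spec and return a DatabaseRequest.

    Raises ValueError with a readable message on the first problem found.
    """
    api_version = doc.get("apiVersion")
    if api_version != SUPPORTED_API_VERSION:
        raise ValueError(f"unsupported apiVersion: {api_version!r}")

    kind = doc.get("kind")
    if kind != "Database":
        raise ValueError(f"expected kind 'Database', got {kind!r}")

    name = (doc.get("metadata") or {}).get("name")
    if not name:
        raise ValueError("metadata.name is required")

    spec = doc.get("spec") or {}
    engine = spec.get("engine")
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")

    size = spec.get("size")
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size: {size!r}")

    return DatabaseRequest(name=name, engine=engine, size=size)


if __name__ == "__main__":
    request = parse_database_spec({
        "apiVersion": "platform.example.com/v1",
        "kind": "Database",
        "metadata": {"name": "billing-db"},
        "spec": {"engine": "postgresql", "size": "small"},
    })
    print(request)
```

The design choice worth copying is failing fast: a bad request gets rejected in seconds with a message the requester can act on, instead of turning into a ticket.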

What I would do differently

I waited too long to write documentation. The first three months, everything lived in my head and in Slack threads. That doesn’t scale. Even with three product teams, it doesn’t scale. Write the docs early, even if they are rough.

I also underestimated how much time support would eat. By September, I was spending most of my week answering questions and helping teams migrate. Almost no time left to build new capabilities. We fixed this by setting explicit office hours and protecting two full days per week for building. Should have done that from day one.

Where this goes next

We’re a small company. This setup works for our size. If the fintech startup doubles its engineering team, the platform group will need to split into focused sub-teams — developer experience, infrastructure, security. But not yet. Right now the right move is staying lean and staying close to the people we serve.

A platform team is a product team that happens to build infrastructure. The moment you forget the product part, you’re just an ops team with a fancier name. Keep your users close, build for self-service, and make the right path the easy path. That’s the whole playbook.