Kubernetes Requests and Limits: Lessons From Getting It Wrong
CPU is compressible. Memory is not. That one sentence explains 80% of Kubernetes resource problems.
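The practical consequence: a container that exceeds its CPU limit gets throttled and keeps running, while one that exceeds its memory limit gets OOM-killed. A minimal pod spec illustrating the asymmetry (names and values are placeholders, not from any of the posts below):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app         # placeholder name
spec:
  containers:
  - name: app
    image: example/app:1.0  # placeholder image
    resources:
      requests:
        cpu: 250m           # used for scheduling and CPU fair-share weight
        memory: 256Mi       # used for scheduling and eviction ordering
      limits:
        cpu: 500m           # exceed this -> throttled (compressible)
        memory: 256Mi       # exceed this -> OOM-killed (not compressible)
```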
Kubernetes coverage in this archive spans 22 posts from October 2016 to September 2022 and treats reliability, delivery speed, and cost discipline as one system, not three separate concerns. The strongest adjacent threads are devops, infrastructure, and containers.
Service meshes solve real problems at real scale. But most teams adopt them before the problems exist. Here's how to decide honestly.
Kubernetes defaults are built for getting things running, not for keeping attackers out. A layered hardening walkthrough covering pods, RBAC, network policies, secrets, and the control plane.
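One concrete layer from a hardening walkthrough like this is network policy: by default, every pod can reach every other pod. A namespace-wide default-deny is the usual starting point (a generic sketch, not the post's exact policy; the namespace is a placeholder):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production   # placeholder namespace
spec:
  podSelector: {}         # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress
  - Egress                # all traffic denied until explicit allows are added
```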
Most Kubernetes clusters are 40-60% over-provisioned. Here's how I help teams cut their bills without sacrificing reliability.
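A common first step in cutting over-provisioning is to compare requests against observed usage before touching anything. One way is a VerticalPodAutoscaler in recommendation-only mode, so it reports right-sized values without evicting pods (a sketch; the target names are placeholders):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa            # placeholder name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api              # placeholder deployment
  updatePolicy:
    updateMode: "Off"      # only emit recommendations, never restart pods
```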
How we wired GitOps and canary rollouts together at Decloud, and why the combination changed how I think about deployments.
Image scanning tells you what's in the box. Runtime security tells you what the box is doing. Here's how we lock down containers at Decloud with seccomp, network policies, Falco, and paranoia earned from NATO work.
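In that spirit, a locked-down container securityContext with the runtime's default seccomp profile and all capabilities dropped might look like this (a generic sketch, not the Decloud configuration; names are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: locked-down        # placeholder name
spec:
  containers:
  - name: app
    image: example/app:1.0 # placeholder image
    securityContext:
      runAsNonRoot: true
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]      # add back only what the process actually needs
      seccompProfile:
        type: RuntimeDefault  # filter syscalls with the runtime's default profile
```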
Lessons from building production operators at Decloud: the reconciliation loop, controller-runtime patterns, and the mistakes that cost us sleep.
Most K8s clusters I audit are either wildly overprovisioned or one bad deploy away from eviction storms. Here's how I set requests, limits, and guardrails.
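The guardrails half usually means namespace-level defaults and ceilings, so a deploy that forgets its requests can't trigger an eviction storm. A LimitRange sketch (namespace and values are placeholders):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: team-a        # placeholder namespace
spec:
  limits:
  - type: Container
    defaultRequest:        # applied when a container omits requests
      cpu: 100m
      memory: 128Mi
    default:               # applied when a container omits limits
      cpu: 500m
      memory: 256Mi
    max:                   # hard per-container ceiling
      cpu: "2"
      memory: 1Gi
```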
The adoption debate is over. 2020 is about operating Kubernetes well -- managed control planes, GitOps by default, policy enforcement, and being honest about what's overhyped.
Every team says they want zero downtime. Few want to do the boring work that actually gets them there. Here's what that boring work looks like.
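Much of that boring work lives in the deployment spec itself: readiness gating, a drain delay, and a rollout that never drops below full capacity. A generic sketch (image, paths, and timings are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                # placeholder name
spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxUnavailable: 0    # keep full capacity during the rollout
      maxSurge: 1
  selector:
    matchLabels: {app: web}
  template:
    metadata:
      labels: {app: web}
    spec:
      containers:
      - name: web
        image: example/web:1.0                 # placeholder image
        readinessProbe:
          httpGet: {path: /healthz, port: 8080}  # placeholder endpoint
          periodSeconds: 5
        lifecycle:
          preStop:
            exec:
              command: ["sleep", "10"]  # let the pod leave rotation before SIGTERM
```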
Kubernetes defaults optimize for fast adoption, not safety. A hardening checklist drawn from running clusters at the fintech startup, Dropbyke, and early Decloud work.
How I moved three teams off ad-hoc kubectl deployments and onto Git-driven infrastructure -- with code examples, repo layouts, and the mistakes I made along the way.
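Git-driven deployment can take several shapes; an Argo CD Application is one common one, pointing the cluster at a path in a repo and letting the controller converge on it (a sketch only; names, URLs, and paths are placeholders):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments           # placeholder name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/infra.git   # placeholder repo
    targetRevision: main
    path: apps/payments/overlays/prod        # placeholder path
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true          # delete resources removed from Git
      selfHeal: true       # revert manual drift back to the Git state
```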
Most Kubernetes outages come from skipping the basics. Here's the checklist I use after running clusters at the fintech startup and now at Decloud.
A personal look back at 2018 -- from GDPR scrambles at the fintech startup to Google for Startups Seoul, Spectre/Meltdown fallout, and the infrastructure shifts that defined the year.
My honest take on evaluating Istio at the fintech startup — what it actually gives you, what it costs you, and why most teams should think twice before adopting it.
Eight months after my first container security post, an update on what moved at the fintech startup and in the ecosystem — PodSecurityPolicy, image signing, and the shift from scratch to real.
Operators are the hot thing in the Kubernetes world right now. They're genuinely useful — but the hype is outpacing the reality for most teams.
Year two of running Kubernetes at the fintech startup. The panic is gone. Now it's networking, resource tuning, and all the operational grunt work nobody blogs about.
Containers give you process isolation, not a security boundary. I break down how we hardened images, locked down runtimes, and segmented networks at the fintech startup — plus the stuff nobody warns you about.
I evaluated Istio and Linkerd for our microservices at the fintech startup. My conclusion: most teams are buying complexity they haven't earned yet.
After a year of running Kubernetes in production, the wins are real but the sharp edges drew blood first. Here's what paid off, what bit us, and what I'd do differently.
A side-by-side comparison of Swarm, Kubernetes, and Mesos based on running all three in evaluation at Dropbyke. Kubernetes is going to win, but the operational tax is real.