Kubernetes Ships Insecure by Default. Here's What to Do About It.

| 5 min read |
kubernetes security infrastructure hardening

Kubernetes defaults optimize for fast adoption, not safety. A hardening checklist drawn from running clusters at a fintech startup, at Dropbyke, and in early Decloud work.

During my NATO Cyber Defense work, I learned a principle that stuck: attackers only need one open door. You need to lock all of them. Kubernetes, out of the box, leaves most of them wide open.

I ran into this the hard way at a fintech startup. We spun up our first production cluster in a hurry – tight deadlines, the usual. Anonymous auth on the API server? Enabled. etcd encrypted at rest? No. NetworkPolicy? We didn’t even know it existed yet. We were one exposed port away from giving our financial data to anyone with kubectl and an internet connection.

That experience made me paranoid in a productive way. By the time I was setting up infrastructure at Dropbyke and later at Decloud during the EF batch, I had a hardening checklist I ran through before any cluster touched real traffic. This is that checklist, updated for early 2019.

Quick take

Kubernetes defaults are built for demos, not production. Lock down the API server, encrypt etcd, enforce RBAC and PodSecurityPolicy, add NetworkPolicy, and stop trusting your container images. If you haven’t done these things, your cluster is one bad deploy away from a breach.


Lock Down the Control Plane

API Server

Three flags you should set immediately:

--anonymous-auth=false
--authorization-mode=Node,RBAC
--audit-log-path=/var/log/kubernetes/audit.log

Disable anonymous access. Require RBAC. Turn on audit logging so you can actually answer “who did what” during an incident. Keep the API server on a private network – if it’s reachable from the public internet, you’ve already lost.

Enable NodeRestriction and PodSecurityPolicy admission controllers. These are your baseline. Without them, any pod can request privileges it shouldn’t have.
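On Kubernetes 1.10 and later, both are enabled with a single API server flag (older releases used --admission-control instead):

--enable-admission-plugins=NodeRestriction,PodSecurityPolicy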

etcd

etcd holds your cluster state and your Secrets in plaintext by default. Treat it like a credentials database, because that’s what it is.

Run it on control plane nodes only. Firewall it from pod networks. Require TLS for both peer and client connections. And encrypt Secrets at rest:

apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-key>
  - identity: {}

The identity provider at the end is a fallback for reading old unencrypted secrets. Remove it once everything is re-encrypted.
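The API server picks up this file via a flag – the path here is just an example, and on clusters older than 1.13 the flag was --experimental-encryption-provider-config:

--encryption-provider-config=/etc/kubernetes/encryption-config.yaml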

Controller Manager and Scheduler

Run these only on control plane nodes. Enable leader election. Give each controller its own service account with tight permissions. The principle of least privilege applies to cluster internals too.
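For kube-controller-manager, that translates to flags like these – --use-service-account-credentials is what gives each controller loop its own identity, so you can bind tight RBAC rules to it instead of one shared credential:

--leader-elect=true
--use-service-account-credentials=true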

Harden the Nodes

Host OS

Minimal OS. Patched. SSH access restricted and ideally behind MFA. Only required ports open. Disable swap. Don’t relax kernel defaults for legacy workloads – if something needs a relaxed seccomp profile, make it an exception, not a default.

Kubelet

The kubelet is the most powerful thing on each node, and it ships with the door unlocked:

--anonymous-auth=false
--authorization-mode=Webhook
--read-only-port=0

Anonymous kubelet access is how you get remote code execution on every node in your cluster. Disable it. Use webhook authorization. Kill the read-only port.

RBAC: Less Is More

Use namespace-scoped Role and RoleBinding by default. ClusterRole and ClusterRoleBinding should be rare and audited.

Never grant cluster-admin to human users. Never grant it to CI. I’ve seen CI pipelines with cluster-admin running untrusted code from pull requests. That’s not a deployment pipeline, that’s a breach pipeline.

One service account per workload. Disable auto-mounting on the default service account. That token sitting in every pod by default is a lateral movement gift.
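A minimal sketch of that pattern – the namespace, names, and permissions are made up for illustration:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: payments-api
  namespace: payments
automountServiceAccountToken: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: payments-api-role
  namespace: payments
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-api-binding
  namespace: payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: payments-api-role
subjects:
- kind: ServiceAccount
  name: payments-api
  namespace: payments

With automountServiceAccountToken: false on the ServiceAccount, a pod only gets a token if its spec opts in explicitly.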

Workload Security

PodSecurityPolicy

In 2019, PSP is the built-in way to prevent pods from running as root, mounting host paths, or using host networking. Use it. Block privileged containers by default. Grant exceptions only with documented justification.
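A restrictive baseline policy might look like this sketch – the seLinux, runAsUser, supplementalGroups, and fsGroup stanzas are required fields, and the allowed volume types should be trimmed to what your workloads actually use:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities: ["ALL"]
  hostNetwork: false
  hostPID: false
  hostIPC: false
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: MustRunAs
    ranges: [{min: 1, max: 65535}]
  fsGroup:
    rule: MustRunAs
    ranges: [{min: 1, max: 65535}]
  volumes: ["configMap", "secret", "emptyDir", "persistentVolumeClaim"]

Remember that a PSP only takes effect if the requesting service account is authorized to use it, so pair it with a matching RBAC rule.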

Security Contexts

Every workload should run with:

  • Non-root user
  • All capabilities dropped
  • allowPrivilegeEscalation: false
  • Read-only root filesystem when possible

These four lines stop a surprising number of container escape techniques.
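As a container-level securityContext, that comes out to the following – the UID is an arbitrary non-zero example:

securityContext:
  runAsNonRoot: true
  runAsUser: 10001
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]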

Resource Limits

Set CPU and memory requests and limits on everything. This isn’t just a reliability thing. Without limits, a compromised pod can starve the node and take down every other workload on it. Crypto miners love clusters without resource limits.
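On a container spec that looks like this sketch – the numbers are placeholders; size them from observed usage, not guesses:

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi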

Network Isolation

Default-deny NetworkPolicy for both ingress and egress. Then add explicit allow rules. If you haven’t done this, every pod can talk to every other pod. That’s not a cluster, that’s a flat network with a fancy orchestrator.
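A default-deny policy covering both directions, scoped to one namespace (the namespace name is an example) – the empty podSelector matches every pod in that namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]

Apply one of these per namespace, then layer explicit allow policies on top. Note this only works if your CNI plugin (Calico, Cilium, and similar) actually enforces NetworkPolicy.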

Keep control plane components on separate subnets from application pods. The API server shouldn’t be reachable from your web frontend pod.

Secrets Management

Encrypt at rest. Scope access through RBAC. Never put secrets in images, ConfigMaps, or environment variables you can see with kubectl describe pod.

Mount secrets as read-only files. Rotate them. For anything high-value, use Vault or your cloud provider’s KMS and inject at runtime. Kubernetes Secrets are base64-encoded, not encrypted – a distinction that matters.
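A fragment of a pod spec showing the read-only file mount – names and paths are illustrative:

volumes:
- name: api-credentials
  secret:
    secretName: api-credentials
    defaultMode: 0400
containers:
- name: app
  volumeMounts:
  - name: api-credentials
    mountPath: /etc/credentials
    readOnly: true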

Image Supply Chain

At the fintech startup we once deployed a container image pulled from Docker Hub that hadn’t been updated in 18 months. It had three known CVEs, one critical. We found out during a penetration test, not before.

Limit deployments to trusted registries. Pin images by digest, not tag – latest isn’t a version. Scan images in CI with Clair or Anchore. Block deploys with high-severity findings. Enable Docker Content Trust for workloads where provenance matters.
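Pinning by digest looks like this in a pod spec – the registry, image name, and digest are placeholders:

containers:
- name: app
  # <digest> stands in for the full sha256 your registry reports for this image
  image: registry.example.com/payments-api@sha256:<digest>

Unlike a tag, a digest is immutable: the same digest always resolves to the same image bytes, so nobody can silently swap what "latest" points at underneath you.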

Audit and Monitoring

Collect audit logs and ship them somewhere the cluster can’t delete them. Alert on:

  • ClusterRoleBinding creation
  • exec into pods
  • Changes to PodSecurityPolicy
  • Any access to Secrets
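A minimal audit Policy covering those alerts, passed to the API server via --audit-policy-file – a sketch, and note that Secrets are logged at Metadata level on purpose, since RequestResponse would write their contents into the log:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# record who touched Secrets, but never their contents
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]
# record full request bodies for RBAC changes
- level: RequestResponse
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["clusterrolebindings", "rolebindings"]
# everything else: metadata only
- level: Metadata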

Falco adds runtime visibility at the container level. It’s worth the setup time.

The Short Version

If you’re running Kubernetes in production and haven’t gone through a hardening pass, do it this week. Not next sprint. This week. The defaults are designed for getting started fast, and they succeed at that. They aren’t designed to keep your data safe.

Hardening isn’t one tool or one YAML file. It’s the sum of careful defaults, strict access control, and actually reviewing what’s running in your cluster. Do the boring work. Your future incident response self will thank you.