The Throughput Engineer: Why Headcount Is a Lagging Metric
Headcount is a lagging metric. The best engineering organizations measure throughput: decision speed, defect containment, and constraint removal.
Law Zava
I engineer reliable infrastructure, reduce platform cost at scale, and lead technical teams to outship their peers.
Binary size, tail latency, memory predictability. When the runtime is the bottleneck, I reach for Rust, Zig, or C++ — not as a default, but when the numbers justify it.
ScyllaDB, Cassandra, and the operational reality of global state. Failure semantics, consistency trade-offs, and making distributed databases behave predictably under real load.
Small teams outperforming large ones. Clear intent, fast feedback loops, and async-first coordination over meetings and headcount.
Data residency, zero-trust architecture, and AI systems that satisfy regulators without crippling the product. Designed in from the start, not bolted on.
Reliability and cost discipline aren't at odds — they're the same engineering problem. Teams that understand their hardware, shrink their runtime dependencies, and make failure modes explicit end up with systems that are both cheaper and more reliable.
The best engineering organizations run on clear intent and fast feedback, not process overhead. I've seen five people with the right operating model outship fifty without one.
Most AI agent failures are infrastructure failures, not model failures. Legacy networking, flat trust boundaries, and missing circuit breakers are the real reliability bottleneck.
Structured red-teaming is a practical reliability discipline for distributed databases. Most catastrophic failures are compound scenarios nobody practiced, not black swans.