Writing
Long-form field notes for CEOs, founders, and technical leaders working through AI under real constraints: ownership, reliability, governance, cost, decision latency, and production reality.
The canonical reading path below is the clearest entry point into the current operating-model thesis.
Each post aims to answer five questions:
- What is the core claim?
- Why does it matter economically?
- What operating model makes it work?
- Where does it fail?
- What language should a leadership team reuse?
// Canonical reading
- No. 01 Build the System the Model Cannot Break
  AI-NATIVE OPERATING MODEL · AI-OPERATING-MODEL
  An AI-native company is not the one that adopts the model fastest; it is the one whose operating model the model cannot break.
- No. 02 The Throughput Engineer: Why Headcount Is a Lagging Metric
  THROUGHPUT CULTURE · AI-OPERATING-MODEL
  Headcount is a lagging metric; the real throughput ceiling is how fast an organization can decide.
- No. 03 The CTO Communication Protocol: Aligning Engineers, Executives, and Investors in AI Programs
  CTO COMMUNICATION PROTOCOL · AI-OPERATING-MODEL
  AI programs fail when leadership communication stays ad hoc instead of becoming an operating protocol.
- No. 04 Why Most AI Platform Teams Become the New Bottleneck
  PLATFORM BOTTLENECKS · AI-OPERATING-MODEL
  A central AI platform team becomes a liability when every workflow improvement has to wait in its queue.
Recent
- Technical Leadership in the AI Era (It’s About Throughput, Not Trends)
  A pragmatic view of technical leadership in mid-2026: anchor decisions in throughput, verification, and operability rather than chasing the latest autonomous agent framework.
- Stop Building Internal AI Tools No One Uses
  Internal AI tools fail when teams optimize for launch instead of habit formation, trust, and workflow fit.
- Build the System the Model Cannot Break
  A manifesto for building AI-native organizations. Twelve tenets across strategy, architecture, economics, and people, and the only test that matters in year two.
- AI Governance Without Bureaucracy
  Effective AI governance is tighter defaults, clearer ownership, and faster escalation, not more committees.
- The Board Deck Is Lying: How to Measure AI Progress Without Theater
  Most AI progress reporting confuses activity with value. Executive measurement should collapse around adoption, reliability, margin, and delivery speed.
- The 2026 AI Build vs. Buy Calculus (It’s Just Operational Cost)
  By mid-2026, AI build vs. buy has nothing to do with novelty. It is a ruthless mathematical calculation of telemetry, context freshness, and infrastructure lock-in.
- Margin, Risk, and Speed: The Three Numbers That Should Drive AI Strategy
  Most AI strategy becomes clearer when leadership stops tracking novelty and starts forcing every decision through three numbers.
- AI Production Governance: A Maturity Model
  By mid-April 2026, the gap between teams shipping stable AI features and teams shipping chaos isn't tools; it's production governance. Here is how mature teams evaluate, deploy, and roll back.
- Why Most Enterprise AI Architecture Fails in Year One
  In 2026, enterprise AI isn't failing because models are bad. It is failing because organizations are building brittle demos instead of bounded, operable systems.
- AI Capital Allocation: What Great CTOs Stop Funding First
  Strong AI strategy starts with a kill list. If a project cannot defend margin, risk, or speed, it should not survive the next budget meeting.
- AI Strategy: The CTO Perspective (It's Just Data Infrastructure)
  A CTO's AI strategy in mid-2026 is brutally simple: it is not about chasing models. It is about building resilient data infrastructure, setting operational boundaries, and measuring throughput.
- Sovereign Systems: Building for a World Where Data Privacy Is Non-Optional
  Privacy is an architecture constraint, not a feature toggle. Teams that build sovereignty into their systems early avoid painful retrofits and close enterprise deals faster.
- AI Agent Operations and the Networking Bottleneck: Why AI Agents Fail on Legacy Infrastructure
  Most AI agent failures are infrastructure failures, not model failures. Legacy networking, flat trust boundaries, and missing circuit breakers are the real reliability bottleneck.
- De-Risking the Black Swan: Red-Teaming Distributed Databases Before Production
  Structured red-teaming is a practical reliability discipline for distributed databases. Most catastrophic failures are compound scenarios nobody practiced, not black swans.
- Beyond Cloud-Heavy Architecture: Why Agentic Systems Need Local-First, Hardware-Aware Design
  Local-first, hardware-aware architecture is becoming the default for high-reliability AI systems. The cloud-heavy pattern costs too much and fails too unpredictably for agentic workloads.
Archive
2026 9 posts
AI Startup Landscape 2026
AI Security: Evolving Threats and Defenses
AI Team Structures 2026: Central, Embedded, and Hybrid Models
AI Inference Cost Trends 2026: Model Pricing and Token Costs
AI Regulation Is Here. Stop Acting Surprised.
AI-Native Architecture Patterns 2026: Production Guide
Building Reliable AI Agents in Go
AI Video Applications in Practice
What I Actually Expect from AI in 2026
2025 26 posts
2025: The Year AI Stopped Being Special
AI in 2025: The Year It Became Boring (Finally)
Scaling AI in the Enterprise Is a Management Problem
AI Incidents Don't Look Like Outages. That's the Problem.
AI Technical Debt Is Eating Your Team Alive (And You Can't Even See It)
AI Doesn't Make Your Team Faster. Shared Infrastructure Does.
Measuring AI ROI Without Lying to Yourself
AI Privacy Is a Plumbing Problem, Not a Policy Problem
AI Pair Programming: It's a Junior Dev, Not a Wizard
Running AI Locally: A Practical Guide for Teams Who Care About Control
AI Workflow Automation: Decisions Are Cheap, Actions Are Expensive
AI Docs That Don't Lie to Your Users
Your AI Metrics Are Measuring the Wrong Thing
Stop Fine-Tuning Models You Haven't Bothered to Prompt Properly
AI Customer Support That Doesn't Make People Hate You
Your AI Pipeline Is Just ETL With Extra Steps (And That's Fine)
Agent Orchestration: Four Patterns, Honest Tradeoffs
AI Security: Same Principles, New Attack Surface
Testing AI Where It Actually Runs
Your AI System Looks Healthy. It Is Not.
MCP in Practice: Building Tool Servers in Go
AI Governance That Does Not Suck
Video Understanding AI: What Actually Works
AI Code Review Is Mostly Noise
Reasoning Models in Production: A Practical Guide
AI in 2025: The Year Discipline Wins
2024 30 posts
2025 Will Reward the Boring Teams
2024: The Year AI Got Boring (In a Good Way)
Your AI Infrastructure Is Not Special
Your AI Team Problem Is Not Technical
Picking an AI Model for Production (Late 2024)
AI Safety Is Just Production Engineering
Agent Patterns That Survive Production
AI Cost Benchmarking: What Your Bill Actually Tells You
RAG Retrieval That Actually Works
Let AI Write Your First Draft, Not Your Docs
AI-Assisted Code Migration: What Actually Works
How I Actually Test LLM Features
The Best Model Is the Smallest One That Works
Stop Stuffing Your Context Window
Function Calling Patterns That Survive Production
Claude 3.5 Sonnet Analysis: Cost, Coding, and Model Routing
AI Compliance Without the Theater
Why Your Enterprise AI Pilot Is Stuck
Building Voice AI That People Actually Use
GPT-4o Changed the Interface, Not the Hard Part
LLM Structured Output in Go: JSON Schema, Validation, Retries
Most AI Developer Tools Are Not Worth Adopting Yet
Agentic Workflows: From Demo Magic to Production Reality
LLM Prompt Caching in Go: Cut Costs Without Breaking Things
Why I Run Multiple Models in Production
Claude 3 First Impressions: Three Models, One Decision Framework
LLM Evaluation: Stop Shipping on Vibes
Architecting AI-Native Applications (Without the Delusion)
Stop Paying OpenAI to Test Your Prompts
AI Engineering Is Its Own Discipline Now
2023 30 posts
2023: The Year Everything Changed (and I Barely Kept Up)
Your AI Infrastructure Is Not Ready for Scale. Neither Is Mine.
Multimodal AI: Five Use Cases That Actually Work (and Three That Do Not)
Two Weeks With the Assistants API: What I Like, What I Hate
OpenAI DevDay Happened and I Have Opinions
I Tracked My AI-Assisted Coding for Three Months. Here Are the Numbers.
LLM Security: A Field Guide for People Who Ship Things
Responsible AI Is Just Risk Management. Treat It That Way.
AI Technical Debt Is Eating Your Codebase (You Just Cannot See It Yet)
Agent Architecture Patterns That Actually Work in Production
Stop Starting With the Model: AI Product Strategy That Works
LLM Observability: Your Existing Monitoring Is Not Enough
What I Learned Building AI Features Into a Fintech Product
Your LLM Bill Is Your Own Fault
Embedding Models Compared: Retrieval Quality, Cost, and Latency
Most AI Startups Are Wrappers. That's the Problem.
Building Semantic Search in Go: From Embeddings to Production
Restructuring Engineering Orgs After Layoffs
AI Code Review: What It Actually Catches (And What It Misses)
Fine-Tuning vs. Prompting: A Decision Framework
LangChain Is the New ORM: Convenient Until It Is Not
RAG Patterns That Actually Work in Production
Vector Databases: What They Actually Are and When You Need One
Claude vs GPT: A User's Honest Take
AI Safety Is Just Security Engineering With Extra Steps
My First Week Building with GPT-4
Leading Engineering Teams When Nobody Knows What Is Next
Prompt Engineering Is Not Engineering
LLM Integration Patterns That Actually Survive Production
AI in Production Is Just Engineering. Treat It That Way.
2022 30 posts
2022: The Year the Music Stopped
Your Cloud Bill Is Not a Mystery
Resilient Teams Are Boring Teams
Five Days With ChatGPT
My Honest Take on GitHub Copilot After Six Months
Infrastructure as Code Patterns That Actually Scale
Watching Layoffs From the Inside
Platform Engineering: DevOps Grew Up
Monorepo vs. Polyrepo: A Practical Decision Guide
Engineering Metrics That Actually Matter
You Do Not Need a FinOps Team
Testing Microservices Without Losing Your Mind
Kubernetes Requests and Limits: Lessons From Getting It Wrong
Go Concurrency Patterns I Use in Every Service
Caching: The Easy Part Is Adding It, the Hard Part Is Everything Else
When to Go Async (And When to Resist the Urge)
Container Scanning Without the Security Theater
Rate Limiting: The Boring Feature That Saves You at 3 AM
Your Engineering Docs Are Probably Useless
Distributed Systems Patterns I Keep Reaching For
TypeScript: A Go Developer's Honest Take
PostgreSQL Performance: Measure First, Tune Second
OAuth Tokens: Why They Keep Getting Stolen and How to Stop It
You Probably Don't Need a Service Mesh
Your Onboarding Is Broken. Here's the Fix.
API Versioning: Pick One and Stop Overthinking It
Zero-Downtime Database Migrations Without the Drama
Hardening Kubernetes: The Stuff That Actually Matters
DORA Metrics: Stop Ruining a Good Idea
What Log4j Actually Taught Us
2021 31 posts
2021: The Year Everything We Ignored Caught Fire
The AWS us-east-1 Outage Was Predictable. Your Architecture Was Not Ready.
Log4j Is on Fire. Here's What to Do Right Now.
Terraform at Scale: What Changed Since 2019
What a 3 AM Outage Taught Me About Incident Management
OpenTelemetry in Late 2021: What's Ready and What's Not
Stop Renaming Your Ops Team to SRE
Most Platform Teams Are Building the Wrong Thing
Event Sourcing in Practice: What I Learned Building Financial Event Pipelines
Your Kubernetes Bill Is Lying to You
GraphQL Federation: I'm Still Skeptical
Most 'Technical Debt' Is Just Decisions You Disagree With Now
Feature Flags at Scale: What Nobody Warns You About
Zero Trust Architecture: What It Actually Looks Like
Database Reliability Engineering: What I've Learned the Hard Way
WebAssembly Beyond the Browser: A 2021 Progress Report
Most Teams Should Just Use Postgres
GitHub Copilot: First Impressions From a Go Developer
Observability-Driven Development Is Just Instrumenting Your Code
Embracing Remote Work: Benefits, Dangers, and Overcoming Challenges
API Gateway Patterns That Actually Work
Data Engineering Patterns: Batch vs. CDC vs. Streaming
Hybrid Work Is Harder Than Full Remote
DevSecOps in Practice: What I Actually Implement
Multi-Cloud Is Mostly a Marketing Strategy
Most Teams Are Not Ready for MLOps
Developer Portals: The Thing Nobody Wants to Build But Everyone Needs
Rust for Cloud Services: A Go Developer's Honest Take
GitOps + Progressive Delivery: How We Stopped Gambling on Deploys
eBPF Is Interesting. I Am Not Sold Yet.
Your Software Supply Chain Is Probably a Mess
2020 31 posts
2020: The Year That Broke the Playbook
SolarWinds Got Owned. Your Build Pipeline Might Be Next.
Your Container Image Scan Passed. Now What?
Apple Silicon Won't Replace Your Servers (Yet)
Your VPN Is a Liability. Here's What Replaces It.
Platform Engineering Is Just DevOps With a Rebrand
API Gateways: Build, Buy, or Regret
What Actually Works for Distributed Teams (Six Months In)
Observability for Small Distributed Teams (What Actually Works)
Most Developer Productivity Metrics Are Management Theater
GraphQL Federation Is Probably Not For You
I Wrote Six Kubernetes Operators. Here's What Actually Matters.
The GitHub Actions Patterns I Actually Use in Production
Event-Driven Architecture: What I Got Wrong and What Survived
Serverless vs Containers: Where the Math Stops Working
Most Chaos Engineering Is Theater
Stop Guessing Your Kubernetes Resource Limits
What I Actually Changed About Engineering Interviews Over Zoom
gRPC Patterns That Actually Work in Production
Your VPN Was Never a Security Architecture
State Of Linux Usability 2020
Your Cloud Security Is Falling Apart Right Now
Your Team Isn't Remote. It's Just on Zoom.
Your Business Continuity Plan Is Corporate Theater
Your Video Infrastructure Isn't Ready for What's Coming
Your Team Just Went Remote. Here's What to Do Right Now.
Wasm Outside the Browser: Real Promise, Real Gaps
Comparing Infrastructure Testing Approaches: What Actually Catches Bugs
I Tried Every API Versioning Strategy. Here's the One I Actually Use.
Database Replication Patterns That Actually Matter
My Kubernetes Predictions for 2020 (Most of Yours Are Wrong)
2019 25 posts
2019: The Year I Quit, Built, and Started Over
Your Cloud Bill Is a Design Document
Most Edge Computing Projects Are Premature Optimization
How I Build CLI Tools in Go (And Why I Stopped Overthinking It)
Zero Downtime Deploys Are a Team Habit, Not a Tool
Your Onboarding Is Broken and Everyone Knows It
Your Terraform Monolith Will Break. Here's How to Fix It Before It Does.
Message Queues: The Patterns Nobody Tells You About Until 3 AM
Your Load Tests Are Lying to You
Internal Platforms vs. Ad-Hoc Tooling: Which Developer Experience Actually Wins
Data Mesh Is an Org Chart Fix, Not a Tech One
Your Incident Response Plan Is Useless Until Someone Bleeds
Your Monolith Is Probably Fine
You Probably Don't Need Multi-Region
Your Staging Environment Is Lying to You
Your SLOs Are Probably Useless (Here's How to Fix Them)
Design for Failure or It Will Design Your Weekend
Kubernetes Ships Insecure by Default. Here's What to Do About It.
Your Cloud Bill Is Lying to You: A Cost Optimization Comparison
The PostgreSQL Tuning Playbook I Actually Use
Your Internal Platform Is Probably a Liability
Your API Is a Contract You Can't Take Back
GitOps: Stop SSHing Into Production
Migrating to TypeScript Without Losing Your Mind
The Boring Kubernetes Checklist That Actually Keeps Production Alive
2018 27 posts
2018: The Year Tech Got Humbled
Async Job Processing: Patterns That Saved Us at a Fintech Startup
How We Track and Prioritize Tech Debt at a Fintech Startup
Istio: Powerful, Painful, and Probably More Than You Need
What I Learned Scaling an Engineering Team
IaC Patterns That Actually Work
API Rate Limiting: What Actually Works
What I Learned About Code Reviews the Hard Way
What Building Distributed Systems at a Fintech Startup Taught Me About Failure
Serverless: What Works, What Doesn't, and What Will Bite You
Container Security in 2018: What Actually Changed
Database Sharding: You Probably Don't Need It Yet
Securing Microservices: What Actually Works
Why Monitoring Wasn't Enough and How We Built Observability at a Fintech Startup
Making Go Services Fast: What Actually Matters
GraphQL in Production Is Harder Than They Tell You
GDPR Week One: What Actually Happened
GDPR for Engineers: What We Actually Built at a Fintech Startup
SRE Principles Are Great. The Cargo-Culting Is Not.
Stop Wasting Everyone's Time in Technical Interviews
Kubernetes Operators: Powerful, but Overhyped
Event Sourcing in Practice: What I Got Right and Wrong
A Go Developer Looks at Rust for Backend Work
Zero Trust Is Not a Product. Here's How We Actually Built It.
Machine Learning for Backend Engineers: What Actually Matters
Two Years of Kubernetes in Production — The Boring Parts Are the Hard Parts
Spectre and Meltdown Broke My Weekend
2017 25 posts
What I Learned Building Our Platform Team This Year
Stop Trying to Fix All Your Tech Debt
Async by Default: Reducing Decision Latency in Distributed Engineering Teams
Your Containers Aren't Secure. Here's What to Actually Do About It.
Service Mesh: You Probably Don't Need One
Stop Counting Code Reviews and Start Reading Them
Your Incident Process Will Break at 15 People. Here's What to Do.
Engineering Manager vs Tech Lead: What's Actually Different
Multi-Region Architecture: What I Wish Someone Had Told Me
Your Startup Doesn't Need a Security Team. It Needs a Security Champion.
Pitching Infrastructure to People Who Don't Care About Infrastructure
You Don't Need to Be Netflix to Break Things on Purpose
Stop Guessing: How I Fix Slow Databases
Stop Doing Security Reviews by Hand
Your Cloud Bill Is Lying to You
Leading Without a Title — What Actually Works
Serverless Patterns That Actually Work in Production
API Versioning: What Actually Works and What Doesn't
WannaCry Hit. Here's What It Actually Exposed.
How I Build Data Pipelines That Actually Survive Production
Why We Went Event-Driven (and What Nearly Broke)
Monitoring Is Not Enough
GDPR Is an Engineering Problem, Not a Legal One
GraphQL vs REST: Pick the Boring One
A Year Running Kubernetes in Production — What Actually Happened
2016 26 posts
2016: The Year I Stopped Fighting Infrastructure
Securing APIs: Authentication and Authorization Patterns
Why We Deleted 42 Grafana Panels
Building Effective Engineering Teams
Why We Chose Go for Our Backend Services
The Economics of State: Why Scaling Up Beats Sharding (Until It Doesn't)
The CTO's Guide to Technical Due Diligence
Container Orchestration: Docker Swarm vs Kubernetes vs Mesos
Building a Security-First Engineering Culture
Why Every Developer Should Understand Networking
Log Aggregation at Scale: ELK vs Alternatives
Database Migrations Without Downtime
Hiring Engineers When You Can't Compete on Salary
Building Resilient Systems: Lessons from Production Failures
The Real Cost of Running Your Own Servers in 2016
Why I Moved Our Infrastructure to Terraform
Continuous Deployment Without the Chaos
Security Incident Response for Startups
API Design Principles That Stand the Test of Time
Ansible Won Because It's the Simplest
Postgres vs MySQL in 2016: A Practical Comparison
AWS Lambda: When Serverless Makes Sense (And When It Doesn't)
Building a DevOps Culture from Scratch
The True Cost of Technical Debt
Docker in Production: What We Learned Running Containers at Dropbyke
Why Microservices Aren't Always the Answer