The AI Strategy Stack: What Boards Mistake for Moats

June 30, 2026 · 4 min read

Most AI moat claims are distribution theater; durable moats come from routing economics, proprietary workflow data, and operational reliability.

The strongest argument against this whole essay is short: foundation models keep getting better, so whatever gap your proprietary data closes this quarter, the next base model closes for free. If that were fully true, no data moat in AI would be worth funding. It is half true. The half it gets wrong is the half boards keep paying for.

Start with what the strategy deck stacks up as defensible. Three layers, usually. The first is the model itself: rented, and your competitor can rent the same one. The second is the scaffolding on top of it: prompt library, routing logic, eval harness, all shaped around one vendor’s behavior, and all of it breaks the morning you change providers. If the moat disappears when the vendor changes, it was never a moat. It was a dependency. That clears two of the three layers off the slide. Swap the provider in your head; whatever still works the next morning is the only candidate worth the word.

What survives is the third layer, and it is the one boards cannot tell apart from its imitation: data your own operation produces by running. Here is the mechanism, and the exact place it breaks.

Take support automation. The model drafts a resolution; a human approves before it ships. Every rejection or rewrite captures a labeled triple: the input, the output the model produced, the output the human accepted. Not a log line. A graded example of where your model was wrong and what right looked like, on your tickets, in your domain.
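For concreteness, the triple can be sketched as a small record. This is a minimal illustration, not a schema from any real system: the `Correction` type, the `capture` helper, and the `request_class` field are all hypothetical names, and the rule "an unchanged draft is not a correction" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Correction:
    """One graded example: input, model output, human-accepted output."""
    request_class: str   # hypothetical label, e.g. "refund_dispute"
    ticket_text: str     # the input
    model_draft: str     # the output the model produced
    accepted_text: str   # the output the human accepted


def capture(request_class: str, ticket_text: str,
            model_draft: str, accepted_text: str) -> Optional[Correction]:
    """Return a Correction only when the reviewer changed the draft.

    Assumption: a draft approved as-is carries no correction signal,
    so it yields None rather than a record.
    """
    if accepted_text.strip() == model_draft.strip():
        return None
    return Correction(request_class, ticket_text, model_draft, accepted_text)
```

The point of keeping all three fields together is that the draft and the accepted text only mean something as a pair; stored apart, they are the log line again.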

Now the load-bearing step, the one most decks wave through. How does that triple make a cheaper model tier handle a class it used to escalate? Two mechanisms, two different bills.

Retrieval, the few-shot route: index the accepted exemplars and inject the nearest ones into the prompt at inference. Cheap to stand up, live the moment you index a correction, but it taxes every call in tokens and latency, and the lift is capped because you are renting the base model’s in-context learning.
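A rough sketch of the retrieval route follows. It assumes exemplars are stored as (ticket, accepted resolution) pairs and uses word-overlap (Jaccard) similarity as a stand-in for whatever embedding index a real system would use; `nearest_exemplars` and `build_prompt` are hypothetical names, and the prompt layout is illustrative only.

```python
from typing import List, Tuple

Exemplar = Tuple[str, str]  # (ticket text, human-accepted resolution)


def _tokens(text: str) -> set:
    return set(text.lower().split())


def nearest_exemplars(query: str, exemplars: List[Exemplar], k: int = 2) -> List[Exemplar]:
    """Rank exemplars by Jaccard word overlap with the query; keep the top k.

    Assumption: word overlap stands in for an embedding similarity search.
    """
    q = _tokens(query)

    def score(ex: Exemplar) -> float:
        t = _tokens(ex[0])
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    return sorted(exemplars, key=score, reverse=True)[:k]


def build_prompt(query: str, exemplars: List[Exemplar], k: int = 2) -> str:
    """Inject the nearest accepted exemplars ahead of the new ticket."""
    shots = nearest_exemplars(query, exemplars, k)
    parts = [f"Ticket: {t}\nResolution: {r}" for t, r in shots]
    parts.append(f"Ticket: {query}\nResolution:")
    return "\n\n".join(parts)
```

Every exemplar injected this way is extra prompt text on every call, which is where the token and latency tax comes from.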

Distillation, the fine-tune route: train the small model on the correction set. Latency stays flat, the behavior is baked in, per-call cost drops, but you pay a training-and-eval cycle up front and re-pay it on every base-model upgrade. Which tier absorbs which class is a cost decision, not a model-quality one: retrieval for the long tail, distillation for the high-volume classes once they stop drifting.
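The cost decision can be sketched as a break-even calculation. This is a deliberately crude model under stated assumptions: the only per-call difference between the routes is the retrieval overhead, the training-and-eval cost is paid once per base-model cycle, and drift is ignored entirely. The function names and any numbers fed to them are hypothetical.

```python
def break_even_calls(training_cost: float, retrieval_overhead_per_call: float) -> float:
    """Calls per base-model cycle at which distillation matches retrieval.

    Assumptions: training_cost is the full training-and-eval bill for one
    cycle; retrieval_overhead_per_call is the extra token cost that
    injected exemplars add to each call.
    """
    if retrieval_overhead_per_call <= 0:
        raise ValueError("with no per-call overhead, distillation never pays back")
    return training_cost / retrieval_overhead_per_call


def cheaper_route(calls_per_cycle: float, training_cost: float,
                  retrieval_overhead_per_call: float) -> str:
    """Pick the cheaper route for one request class over one cycle."""
    threshold = break_even_calls(training_cost, retrieval_overhead_per_call)
    return "distillation" if calls_per_cycle > threshold else "retrieval"
```

With a made-up 5,000 training bill and 0.005 of overhead per call, the break-even sits near a million calls per cycle: a long-tail class never gets there, a high-volume one clears it quickly. A class that is still drifting would force retraining inside the cycle, which this sketch does not price in.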

Either way, one number tells you whether you have a compounding loop or only a logging pipeline: escalation rate on a single request class, quarter over quarter. Falling and sustained while quality holds is the loop compounding. Flat is a warehouse with a dashboard bolted to it. A logging pipeline and a compounding loop look identical in the architecture diagram and behave nothing alike in the P&L.
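One way to make that test mechanical is sketched below. The thresholds are assumptions, not figures from this essay: "sustained" is read as at least two consecutive quarterly drops, `min_drop` is the smallest fall that counts as real, and "quality holds" means no quarter dips more than `quality_tolerance` below the first. The quality score itself is whatever the team already trusts.

```python
from typing import Sequence


def escalation_rate(escalated: int, total: int) -> float:
    """Share of requests in one class that were escalated in one quarter."""
    if total <= 0:
        raise ValueError("total must be positive")
    return escalated / total


def loop_is_compounding(rates: Sequence[float], quality: Sequence[float],
                        min_drop: float = 0.01,
                        quality_tolerance: float = 0.01) -> bool:
    """True when escalation falls every quarter while quality holds.

    rates and quality are per-quarter series for ONE request class,
    oldest first. All thresholds are hypothetical defaults.
    """
    if len(rates) < 3 or len(rates) != len(quality):
        return False  # need two consecutive improvements to call it sustained
    falling = all(prev - cur >= min_drop for prev, cur in zip(rates, rates[1:]))
    quality_holds = all(q >= quality[0] - quality_tolerance for q in quality[1:])
    return falling and quality_holds
```

A series like 0.40, 0.33, 0.27 with steady quality passes; 0.40, 0.40, 0.39 does not, and neither does a falling rate bought with a drop in quality.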

I cannot hand you a rival’s P&L to prove the good case, and any deck that shows you a clean before-and-after percentage is selling the illustration as the evidence. The honest test is one you run on your own numbers: name the class, name the two quarters it improved, name why a competitor on the same vendor cannot reproduce it. The answer to the last one is never the model. It is the system around it that turns each rejection into an exemplar only you hold.

Then the part the optimistic version omits: this asset depreciates. When the next base model ships, it absorbs the easy classes for free, for everyone and not only for you, and that compresses the set of failures where your corrections still move the number. Your edge is only ever the residual: corrections that are illegible outside your context, such as your product’s quirks, your contractual edge cases, your policy language. The vendor will productize the capture loop; they already sell feedback buttons and fine-tuning APIs. What they cannot aggregate is a residual that means nothing without your business wrapped around it. The loop compounds only while you generate domain-specific corrections faster than a better base model erases the generic ones.