// Topic
Quality
Definition
Quality coverage in this archive spans 7 posts from Nov 2017 to Mar 2026 and focuses on practical engineering craft: interfaces, testing, and maintainable implementation details. The most closely related topics are AI, testing, and code review. Recurring title motifs include AI, code, evaluation, and testing.
Working claims
- The through-line is clarity first: simple designs that survive change beat clever abstractions.
- Early posts focus on code and code review, while newer posts focus on AI and evaluation as constraints shifted.
- This topic repeatedly intersects with AI, testing, and code review, so design choices here rarely stand alone.
How to apply this
- Keep interfaces small, automate regressions early, and make operational assumptions explicit in code.
- Start with the newest post to calibrate current constraints, then backtrack to older entries for first principles.
- When boundary questions appear, cross-read the AI and testing topics before committing to implementation details.
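As a rough illustration of the first point above (small interfaces, early regression tests, operational assumptions stated in code), here is a minimal Python sketch. Every name in it (`Scorer`, `Limits`, `LengthScorer`, `accept`) and both default values are hypothetical and chosen only for the example; none of them come from the posts in this archive.

```python
from dataclasses import dataclass
from typing import Protocol


class Scorer(Protocol):
    """Small interface: one method, so fakes and replacements stay cheap."""

    def score(self, text: str) -> float: ...


@dataclass(frozen=True)
class Limits:
    """Operational assumptions stated in code instead of tribal knowledge."""

    max_chars: int = 2000   # hypothetical input cap
    min_score: float = 0.5  # hypothetical acceptance threshold

    def __post_init__(self) -> None:
        # Fail at construction time if an assumption is nonsensical.
        if self.max_chars <= 0:
            raise ValueError("max_chars must be positive")
        if not 0.0 <= self.min_score <= 1.0:
            raise ValueError("min_score must be within [0, 1]")


class LengthScorer:
    """Toy scorer for illustration only: longer text scores higher, capped at 1.0."""

    def score(self, text: str) -> float:
        return min(len(text) / 10.0, 1.0)


def accept(text: str, scorer: Scorer, limits: Limits = Limits()) -> bool:
    """Reject oversized input first, then apply the score threshold."""
    if len(text) > limits.max_chars:
        return False
    return scorer.score(text) >= limits.min_score


def test_accept_regressions() -> None:
    """Pin boundary behavior so later changes cannot shift it silently."""
    scorer = LengthScorer()
    assert accept("hello", scorer)          # 5 chars scores 0.5, meets threshold
    assert not accept("hi", scorer)         # 2 chars scores 0.2, below threshold
    assert not accept("x" * 2001, scorer)   # one char over the input cap


if __name__ == "__main__":
    test_accept_regressions()
    print("ok")
```

The single-method protocol keeps the seam narrow enough that a test double is a few lines, and the frozen `Limits` object puts the thresholds where a reviewer will see them change in a diff.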
Where teams get burned
- Abstracting before usage patterns are stable enough to justify indirection.
- Treating style consistency as optional until quality and velocity both degrade.
- Applying guidance written in 2017 to 2026 problems without revisiting assumptions that changed along the way.
Suggested reading path
- Start here (current state): AI Production Governance: A Maturity Model
- Then read (operating middle): LLM Evaluation: Stop Shipping on Vibes
- Finish with (foundational context): Stop Counting Code Reviews and Start Reading Them
Related posts
- AI Production Governance: A Maturity Model
- Testing AI Where It Actually Runs
- AI Code Review Is Mostly Noise
- LLM Evaluation: Stop Shipping on Vibes
- AI Code Review: What It Actually Catches (And What It Misses)
- Testing Microservices Without Losing Your Mind
- Stop Counting Code Reviews and Start Reading Them
References
6 posts
Testing AI Where It Actually Runs
Offline evals are necessary but not sufficient. Here's how I test AI features in production with shadow mode, canaries, and rollback automation -- with Go code.
AI Code Review Is Mostly Noise
I've been running AI code review on real PRs for months. It catches some real bugs. It also generates a staggering amount of useless commentary.
LLM Evaluation: Stop Shipping on Vibes
Your LLM feature looks great in demos and breaks in production. Here is how to build an evaluation loop that catches regressions before your users do.
AI Code Review: What It Actually Catches (And What It Misses)
After three months of using AI-assisted code review across multiple projects, here's what actually works and what's just noise.
Testing Microservices Without Losing Your Mind
Microservices fail at the seams. A layered test strategy that keeps feedback fast and catches integration issues before production.
Stop Counting Code Reviews and Start Reading Them
Most code reviews are theater. Here's what actually makes them worth the time.