// Topic
Data
Definition
Data coverage in this archive spans 3 posts from Jul 2019 to Sep 2025 and centers on data correctness and operability under real production constraints. The strongest adjacent threads are AI, privacy, and security. Recurring title motifs include AI, privacy, plumbing, and policy.
Working claims
- The common theme is that schema, ownership, and query shape drive most downstream outcomes.
- Across 2019 to 2025, the posts consistently favor disciplined execution over hype cycles.
- This topic repeatedly intersects with AI, privacy, and security, so design choices here rarely stand alone.
How to apply this
- Define freshness, correctness, and latency targets before choosing storage or pipeline patterns.
- Start with the newest post to calibrate current constraints, then backtrack to older entries for first principles.
- When boundary questions appear, cross-read the AI and privacy topics before committing to implementation details.
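One way to make the first step concrete is to write the targets down as data before any storage or pipeline decision. The sketch below is illustrative only: `DataTargets`, `unmet_targets`, and every threshold value are hypothetical names and numbers, not something taken from the posts.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DataTargets:
    """Hypothetical targets for one dataset; names and numbers are illustrative."""
    max_staleness: timedelta    # freshness: oldest acceptable data age
    max_mismatch_rate: float    # correctness: tolerated fraction of rows that disagree with the source
    max_p95_latency_ms: float   # latency: 95th-percentile query time budget


def unmet_targets(targets, staleness, mismatch_rate, p95_latency_ms):
    """Return the names of the targets that the observed values violate."""
    failures = []
    if staleness > targets.max_staleness:
        failures.append("freshness")
    if mismatch_rate > targets.max_mismatch_rate:
        failures.append("correctness")
    if p95_latency_ms > targets.max_p95_latency_ms:
        failures.append("latency")
    return failures


# Assumed example values for an "orders" dataset.
orders_targets = DataTargets(
    max_staleness=timedelta(minutes=15),
    max_mismatch_rate=0.001,
    max_p95_latency_ms=200.0,
)

# Data that is 40 minutes old misses only the freshness target.
print(unmet_targets(orders_targets, timedelta(minutes=40), 0.0005, 120.0))  # ['freshness']
```

Writing targets as a small, frozen record keeps them reviewable and lets candidate designs be checked against the same numbers.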
Where teams get burned
- Scaling pipelines before locking down source-of-truth and reconciliation behavior.
- Optimizing single queries while ignoring data model drift and access patterns.
- Applying guidance from older posts (the archive runs from 2019 to 2025) without revisiting assumptions that have since changed.
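For the first pitfall, "reconciliation behavior" can be as simple as a keyed comparison between the source of truth and a derived copy. This is a minimal sketch under assumed conditions: both sides fit in memory as dicts keyed by record id, and the `reconcile` function and sample records are hypothetical.

```python
def reconcile(source, derived):
    """Compare a derived copy against the source of truth, keyed by record id.

    Both arguments are dicts mapping id -> record value.
    """
    missing = sorted(k for k in source if k not in derived)
    extra = sorted(k for k in derived if k not in source)
    mismatched = sorted(
        k for k in source if k in derived and source[k] != derived[k]
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}


# Assumed sample data: the warehouse copy has drifted from the source.
source_of_truth = {1: "alice@example.com", 2: "bob@example.com", 3: "carol@example.com"}
warehouse_copy = {1: "alice@example.com", 2: "bob@old.example.com", 4: "dave@example.com"}

report = reconcile(source_of_truth, warehouse_copy)
print(report)  # {'missing': [3], 'extra': [4], 'mismatched': [2]}
```

Deciding what happens to each bucket (backfill, delete, overwrite) is the part worth locking down before scaling the pipeline.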
Suggested reading path
- Start here (current state): AI Privacy Is a Plumbing Problem, Not a Policy Problem
- Then read (operating middle): Your AI Pipeline Is Just ETL With Extra Steps (And That’s Fine)
- Finish with (foundational context): Data Mesh Is an Org Chart Fix, Not a Tech One
Related posts
- AI Privacy Is a Plumbing Problem, Not a Policy Problem
- Your AI Pipeline Is Just ETL With Extra Steps (And That’s Fine)
- Data Mesh Is an Org Chart Fix, Not a Tech One
References
3 posts
- AI Privacy Is a Plumbing Problem, Not a Policy Problem
Privacy in AI systems fails in the implementation details -- what gets logged, who can replay prompts, how long artifacts linger. Treat it as infrastructure, not a compliance checkbox.
- Your AI Pipeline Is Just ETL With Extra Steps (And That’s Fine)
AI data pipelines aren't some new paradigm. They're ETL with a retrieval layer bolted on. The discipline that makes them work is the same discipline that has always made pipelines work: detect change, chunk intelligently, keep indexes fresh.
- Data Mesh Is an Org Chart Fix, Not a Tech One
Most data problems are ownership problems. Data mesh gets that right. But adopting it as an architecture diagram exercise misses the point entirely.
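The "detect change ... keep indexes fresh" discipline named in the ETL summary above can be sketched with content hashes. This is an assumption-laden illustration, not the post's method: `content_hash`, `plan_reindex`, and the in-memory dicts stand in for whatever document store and index are actually in use.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(documents, indexed_hashes):
    """Return (to_index, to_delete) for keeping an index in step with its source.

    documents: dict of doc id -> current text
    indexed_hashes: dict of doc id -> hash recorded at the last indexing run
    """
    # New or changed documents: no stored hash, or the stored hash differs.
    to_index = sorted(
        doc_id
        for doc_id, text in documents.items()
        if indexed_hashes.get(doc_id) != content_hash(text)
    )
    # Documents that were indexed before but no longer exist at the source.
    to_delete = sorted(doc_id for doc_id in indexed_hashes if doc_id not in documents)
    return to_index, to_delete


# Assumed sample state: "b" changed, "c" was removed, "d" is new.
indexed = {
    "a": content_hash("hello"),
    "b": content_hash("old text"),
    "c": content_hash("gone"),
}
current = {"a": "hello", "b": "new text", "d": "brand new"}

print(plan_reindex(current, indexed))  # (['b', 'd'], ['c'])
```

Only changed or new documents are re-chunked and re-embedded, and stale entries are removed rather than left to linger in the index.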