AI Security: Evolving Threats and Defenses
As of late February 2026, AI security is defined by adaptive attacks and layered, operational defenses.
This archive spans 27 posts from Feb 2016 to Jul 2026 and treats AI security as a production discipline: evaluation loops, tool boundaries, escalation paths, and cost control. The strongest adjacent threads are ai, llm, and infrastructure. Recurring title motifs include ai, production, engineering, and kubernetes.
As of late January 2026, AI-native architecture is a stable discipline with repeatable patterns for delivery, safety, and change management.
Video AI is practical for scoped workflows. This post covers what works, how to design for reliability, and where human review still matters.
Your AI system can return 200 OK and still be wrong, unsafe, or confidently hallucinating. Here's how to detect, contain, and learn from AI incidents -- drawing from the same IR principles that work for traditional systems.
The trick to AI workflow automation is simple: let the model decide, let deterministic code act, and never confuse the two.
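The decide/act split can be sketched in a few lines of Go. This is a minimal illustration, not the post's implementation; the names (`Decision`, `Execute`, `allowedActions`) and the two sample actions are assumptions made up for the example. The point is that the model only ever proposes an action from a fixed menu, and deterministic code validates and executes it.

```go
package main

import (
	"errors"
	"fmt"
)

// Decision is what the model returns: a proposed action, never an executed one.
// (Illustrative type; not from the post.)
type Decision struct {
	Action string
	Target string
}

// allowedActions maps action names to deterministic handlers. The model can
// only pick from this menu; it cannot invent new actions.
var allowedActions = map[string]func(target string) (string, error){
	"refund":   func(t string) (string, error) { return "refund issued for " + t, nil },
	"escalate": func(t string) (string, error) { return "escalated " + t, nil },
}

// Execute is the deterministic side: validate the decision, then act.
func Execute(d Decision) (string, error) {
	handler, ok := allowedActions[d.Action]
	if !ok {
		return "", errors.New("action not in allowlist: " + d.Action)
	}
	if d.Target == "" {
		return "", errors.New("missing target")
	}
	return handler(d.Target)
}

func main() {
	out, err := Execute(Decision{Action: "refund", Target: "order-42"})
	fmt.Println(out, err)
	_, err = Execute(Decision{Action: "drop_tables", Target: "prod"})
	fmt.Println(err) // the model asked for something outside the menu; nothing ran
}
```

The allowlist is the boundary: everything inside it is code you wrote and tested, and the model's influence stops at choosing among entries.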
Most AI support systems are built to deflect tickets. The ones that actually work are built around escalation, grounding, and the simple idea that customers aren't idiots.
AI systems are exposed APIs with real blast radius. The threats are injection, leakage, and tool misuse. The defenses are the same ones we've always needed -- just applied to a new surface.
Offline evals are necessary but not sufficient. Here's how I test AI features in production with shadow mode, canaries, and rollback automation -- with Go code.
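Shadow mode is the least risky of those three, and its core is small enough to sketch here. This is an assumed shape, not the post's code: `ShadowServe` runs a candidate model in parallel with the primary, records any disagreement for offline review, and only ever returns the primary's answer to the caller.

```go
package main

import (
	"fmt"
	"sync"
)

// Handler stands in for any model call. (Illustrative; not from the post.)
type Handler func(input string) string

// ShadowServe serves the primary's answer while running the candidate in
// parallel and recording mismatches for offline comparison.
func ShadowServe(primary, candidate Handler, input string, record func(primaryOut, candidateOut string)) string {
	var candidateOut string
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		candidateOut = candidate(input)
	}()
	primaryOut := primary(input)
	wg.Wait()
	if primaryOut != candidateOut {
		record(primaryOut, candidateOut)
	}
	return primaryOut // user traffic only ever sees the primary
}

func main() {
	mismatches := 0
	out := ShadowServe(
		func(s string) string { return "v1:" + s },
		func(s string) string { return "v2:" + s },
		"hello",
		func(a, b string) { mismatches++ },
	)
	fmt.Println(out, mismatches)
}
```

In a real service the `record` callback would write to a log or metrics pipeline, and exact string equality would be replaced by a task-appropriate comparison, but the safety property is the same: the candidate can never affect what users see.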
Traditional monitoring will tell you your AI service is up. It won't tell you it's returning confident garbage. Here's what observability actually looks like for AI.
Reasoning models are powerful but expensive and slow. Here's how I integrate them in Go services with routing, async patterns, and cost controls that actually work.
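A routing decision like that can be reduced to a small pure function. This sketch is an assumption about the shape of such a router, not the post's policy: the tier names, the token threshold, and the flat per-call cost are all invented for illustration.

```go
package main

import "fmt"

// Route picks a model tier from request traits and remaining budget.
// Tier names, the 4000-token threshold, and the assumed 50-cent worst-case
// cost of a reasoning call are illustrative, not the post's actual numbers.
func Route(promptTokens int, needsReasoning bool, budgetCentsLeft int) string {
	const reasoningCostCents = 50
	if needsReasoning && budgetCentsLeft >= reasoningCostCents {
		return "reasoning-model"
	}
	// Budget exhausted or no reasoning needed: degrade to cheaper tiers.
	if promptTokens > 4000 {
		return "large-context-model"
	}
	return "fast-cheap-model"
}

func main() {
	fmt.Println(Route(500, true, 100))  // budget allows the expensive tier
	fmt.Println(Route(500, true, 10))   // degrades rather than overspending
	fmt.Println(Route(8000, false, 100))
}
```

Keeping the router a pure function of its inputs makes the policy trivially unit-testable, which matters once routing decisions start carrying real cost.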
AI infrastructure at scale is just infrastructure. The same boring patterns -- gateways, caching, circuit breakers, budget enforcement -- solve the same boring problems.
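Budget enforcement is the least glamorous item on that list and also the easiest to sketch. The cents-based accounting below is an assumption for illustration, not the post's implementation: a mutex-guarded counter that rejects a call before it spends money, rather than discovering the overrun on the invoice.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Budget is a simple spend limiter. (Illustrative; real systems would
// persist spend and reconcile against provider billing.)
type Budget struct {
	mu         sync.Mutex
	limitCents int
	spentCents int
}

// Reserve admits a call only if its worst-case cost still fits the limit.
func (b *Budget) Reserve(costCents int) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.spentCents+costCents > b.limitCents {
		return errors.New("budget exceeded")
	}
	b.spentCents += costCents
	return nil
}

func main() {
	b := &Budget{limitCents: 100}
	fmt.Println(b.Reserve(60)) // fits
	fmt.Println(b.Reserve(60)) // would exceed the limit; rejected up front
}
```

The same check-before-spend shape generalizes: reserve against worst-case cost, then release the unused portion once the actual token count is known.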
AI safety in production isn't a research problem. It's defense in depth, the same way cyber defense works -- layered controls, assumed breach, observable boundaries.
Function calling is how LLMs touch real systems. Treat tools like APIs, arguments like untrusted input, and permissions like the model is an intern with root access.
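Treating arguments as untrusted input looks exactly like validating a request body. This is a hedged sketch of that idea in Go; the tool (`DeleteFileArgs`), the sandbox prefix, and the validator name are all hypothetical. Strict decoding rejects fields the schema doesn't define, and domain checks run before the tool ever executes.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// DeleteFileArgs is a hypothetical tool's argument schema.
type DeleteFileArgs struct {
	Path string `json:"path"`
}

// ValidateDeleteFile treats model-supplied JSON like any untrusted request
// body: strict decode, then domain checks, before the tool runs.
func ValidateDeleteFile(raw []byte) (DeleteFileArgs, error) {
	var args DeleteFileArgs
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields() // reject arguments the schema doesn't define
	if err := dec.Decode(&args); err != nil {
		return args, err
	}
	if strings.Contains(args.Path, "..") || !strings.HasPrefix(args.Path, "/tmp/scratch/") {
		return args, errors.New("path outside permitted directory")
	}
	return args, nil
}

func main() {
	_, err := ValidateDeleteFile([]byte(`{"path":"/etc/passwd"}`))
	fmt.Println(err) // rejected before any filesystem call
	ok, err := ValidateDeleteFile([]byte(`{"path":"/tmp/scratch/report.txt"}`))
	fmt.Println(ok.Path, err)
}
```

The intern-with-root-access framing falls out naturally: the model can ask for anything, but the validator only grants what a narrowly scoped role would get.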
AI agents that can take actions are fundamentally different from chatbots. The engineering bar must match the blast radius.
Betting on a single model provider is like having a single database with no failover. Here is why multi-model is the only sane production strategy.
AI engineering is not ML research with a product hat. It is the discipline of making models behave in production -- and it demands its own skill set.
Traditional monitoring tells you the service is up. It doesn't tell you the model started confidently returning garbage last Tuesday. Here's how to actually observe LLM systems.
ChatGPT changed expectations overnight, but shipping AI features that actually work is an engineering problem, not a model problem.
Staging never catches the real bugs. Here's how I learned to test in production without burning everything down.
Most Kubernetes outages come from skipping the basics. Here's the checklist I use after running clusters at the fintech startup and now at Decloud.
After a year running GraphQL at the fintech startup, here's what the conference talks leave out.
Year two of running Kubernetes at the fintech startup. The panic is gone. Now it's networking, resource tuning, and all the operational grunt work nobody blogs about.
After a year of running Kubernetes in production, the wins are real but the sharp edges drew blood first. Here's what paid off, what bit us, and what I'd do differently.
Most teams monitor too much and alert on the wrong things. Five metrics are enough to run a startup backend.
Production incidents show where architecture bends and where it breaks. These lessons focus on designing for failure, limiting blast radius, and making recovery routine.
Running Docker in production at Dropbyke forced us to get serious about image builds, container networking, log aggregation, and security. Here is what actually worked.