Reliability
Posts and projects tagged "Reliability".
Posts
Agentic Workflows That Actually Work
How to build production agentic workflows with retry logic, audit trails, and human-in-the-loop checkpoints that survive real-world failure modes.
The Update That Crashed WordPress
Lessons from shipping a crash to millions of WordPress iOS users: why relentless testing, small increments, and graceful failure modes matter.
Building Systems That Survive Contact With Humans
Practical patterns for building reliable systems that handle incomplete inputs, confused users, and the failure modes that happen in production.