Letting an AI agent write to production safely

Reads are the easy half. An agent that looks things up and drafts a reply can’t do much harm. The moment it can write — change a record, move money, send a message — the stakes change. A wrong action stops being a typo on a screen and becomes an outage, a double charge, a deleted row.

This note is about that half: how to let an agent change production data without it becoming the cause of your next incident. (For the broader posture, including treating an agent as untrusted, prompt injection, and reads, start with “when an AI agent uses your software.”)

Make every write idempotent

Agents retry. They retry on timeouts, on errors, on their own second-guessing, far more than a person ever would. So the first rule of agent writes is that doing the same thing twice must be harmless. Give each write an idempotency key that the agent supplies and reuses on every retry of the same action. The system should treat a repeat with the same key as a no-op and return the original result. “Charge this order,” run three times, charges once. Without this, every retry is a fresh chance to duplicate, double-bill, or corrupt.
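
To make the key idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: PaymentService, charge_order, and the key format are illustrative names, and the in-memory dictionary stands in for what would have to be durable storage with an atomic check-and-store (a unique constraint, for example). It also does not handle a key being reused with different parameters, which a real service would probably reject.

```python
from dataclasses import dataclass, field


@dataclass
class ChargeResult:
    order_id: str
    amount_cents: int
    charge_number: int  # which real charge this was, for illustration only


@dataclass
class PaymentService:
    """Hypothetical in-memory service; a real one would persist keys durably."""

    charges_made: int = 0
    _results: dict[str, ChargeResult] = field(default_factory=dict)

    def charge_order(self, idempotency_key: str, order_id: str, amount_cents: int) -> ChargeResult:
        # A repeat with a known key is a no-op that returns the original result.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        # First time we see this key: perform the charge and remember the outcome.
        self.charges_made += 1
        result = ChargeResult(order_id, amount_cents, self.charges_made)
        self._results[idempotency_key] = result
        return result


service = PaymentService()
# The agent retries the same action three times with the same key.
results = [service.charge_order("order-1001-charge", "order-1001", 2500) for _ in range(3)]
```

After the three calls, service.charges_made is 1 and all three results are the same object: the retries were absorbed rather than repeated.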

Propose, don’t execute

The safest write is one the agent doesn’t actually commit. Split a consequential action into two steps: the agent proposes the change — fully formed, validated, ready — and a separate step commits it. That gate can be a person for the big stuff, or an automatic check for the routine: limits, policy, a sanity test. The agent never holds the commit button. This is the highest-leverage move on the list. It turns “the agent did something irreversible” into “the agent suggested something, and we approved it.”
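
One way to sketch the propose/commit split, assuming a simple in-memory queue. ProposalQueue, Proposal, and issue_refund are hypothetical names invented for this example; the point is only that the agent is handed propose, while commit belongs to the gate.

```python
import uuid
from dataclasses import dataclass
from typing import Callable


@dataclass
class Proposal:
    proposal_id: str
    action: str
    params: dict


class ProposalQueue:
    """Hypothetical staging area: the agent proposes, a separate step commits."""

    def __init__(self, executors: dict[str, Callable[[dict], str]]):
        self._executors = executors
        self._pending: dict[str, Proposal] = {}

    def propose(self, action: str, params: dict) -> Proposal:
        # Validate up front so a pending proposal is fully formed.
        if action not in self._executors:
            raise ValueError(f"unknown action: {action}")
        proposal = Proposal(str(uuid.uuid4()), action, dict(params))
        self._pending[proposal.proposal_id] = proposal
        return proposal

    def commit(self, proposal_id: str) -> str:
        # Called only by the gate (a person or an automatic check), never the agent.
        proposal = self._pending.pop(proposal_id)
        return self._executors[proposal.action](proposal.params)

    def reject(self, proposal_id: str) -> None:
        # Dropping the proposal means nothing was ever executed.
        self._pending.pop(proposal_id)


refunds_issued: list[dict] = []


def issue_refund(params: dict) -> str:
    # Stand-in for the real side effect.
    refunds_issued.append(params)
    return f"refunded {params['amount_cents']} cents on {params['order_id']}"


queue = ProposalQueue({"issue_refund": issue_refund})
proposal = queue.propose("issue_refund", {"order_id": "order-1001", "amount_cents": 2500})
executed_before_approval = len(refunds_issued)  # nothing has run yet
outcome = queue.commit(proposal.proposal_id)    # the gate approves
```

Because commit removes the proposal from the pending set, a second commit of the same proposal raises KeyError instead of running the action again.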

Decide by blast radius

Not every write needs a human. Sort actions by what a mistake costs. Updating a draft? Let it through. Issuing a refund, changing a price, emailing a customer? Gate it. The goal isn’t to slow everything down — it’s to spend your scrutiny where being wrong is expensive, and stay fast where it’s cheap. That’s the same instinct as enterprise-grade work at startup pace: careful on the load-bearing calls, quick on the reversible ones.
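
The sorting can be as plain as a lookup table. The sketch below uses the examples from this section; the action names, the Gate enum, and gate_for are hypothetical, and sending unknown actions to the strict path is an assumption of this example, not a rule stated above.

```python
from enum import Enum


class Gate(Enum):
    ALLOW = "allow"                        # cheap to get wrong: let it through
    REQUIRE_APPROVAL = "require_approval"  # expensive to get wrong: gate it


# Hypothetical action names, classified by what a mistake would cost.
GATES: dict[str, Gate] = {
    "update_draft": Gate.ALLOW,
    "issue_refund": Gate.REQUIRE_APPROVAL,
    "change_price": Gate.REQUIRE_APPROVAL,
    "email_customer": Gate.REQUIRE_APPROVAL,
}


def gate_for(action: str) -> Gate:
    # Assumption: anything not explicitly classified takes the strict path.
    return GATES.get(action, Gate.REQUIRE_APPROVAL)
```

Keeping the table explicit means adding a new agent action forces someone to decide which side of the line it falls on.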

Always be able to undo

Assume the agent will get one wrong, and design so you can take it back. Prefer reversible operations: soft-deletes over hard-deletes, status changes over destructive edits, compensating actions (“refund”) over rewriting history. If an action genuinely can’t be undone, that’s exactly the kind that should require approval — or shouldn’t be in the agent’s hands at all.
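
As an illustration of soft-deletes over hard-deletes, here is a small sketch. Record, RecordStore, and the deleted_at field are hypothetical; in a real system this would more likely be a nullable timestamp column that ordinary queries filter on.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Record:
    record_id: str
    body: str
    deleted_at: Optional[datetime] = None  # None means the record is live


class RecordStore:
    """Hypothetical store where deletion is a status change, not a removal."""

    def __init__(self) -> None:
        self._rows: dict[str, Record] = {}

    def add(self, record: Record) -> None:
        self._rows[record.record_id] = record

    def soft_delete(self, record_id: str) -> None:
        # Mark the row as deleted; the data itself stays put.
        self._rows[record_id].deleted_at = datetime.now(timezone.utc)

    def restore(self, record_id: str) -> None:
        # Undo is just clearing the marker.
        self._rows[record_id].deleted_at = None

    def visible(self) -> list[Record]:
        return [r for r in self._rows.values() if r.deleted_at is None]


store = RecordStore()
store.add(Record("note-1", "first draft"))
store.add(Record("note-2", "second draft"))
store.soft_delete("note-1")
visible_after_delete = [r.record_id for r in store.visible()]
store.restore("note-1")
visible_after_restore = [r.record_id for r in store.visible()]
```

The agent's "delete" hides note-1 from normal reads, and a single restore call brings it back with its content intact.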

Cap the damage

A person makes a handful of changes an hour. An agent in a loop can make thousands. Put a budget on writes — per minute, per run, per resource — and stop when it’s hit. A runaway loop should trip a limit, not drain an account. Rate limits aren’t just for reads.
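
A write budget can be sketched as a sliding-window counter. WriteBudget and WriteBudgetExceeded are hypothetical names, and the limit of 3 writes per 60 seconds is chosen only to keep the demo short. The clock is injectable so the example can run deterministically; by default it uses time.monotonic.

```python
import time
from collections import deque
from typing import Callable


class WriteBudgetExceeded(Exception):
    """Raised when the agent has used up its writes for the current window."""


class WriteBudget:
    def __init__(self, max_writes: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_writes = max_writes
        self.window_seconds = window_seconds
        self._clock = clock
        self._stamps: deque[float] = deque()  # times of recent writes, oldest first

    def spend(self) -> None:
        now = self._clock()
        # Forget writes that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_writes:
            raise WriteBudgetExceeded(
                f"{self.max_writes} writes per {self.window_seconds}s already used"
            )
        self._stamps.append(now)


# A fake clock keeps the demo deterministic.
fake_now = [0.0]
budget = WriteBudget(max_writes=3, window_seconds=60.0, clock=lambda: fake_now[0])

completed = 0
try:
    for _ in range(1000):  # a runaway loop
        budget.spend()     # call this before every write
        completed += 1
except WriteBudgetExceeded:
    pass  # the loop trips the limit instead of running to completion
```

Here the loop stops after 3 writes. The same class could be instantiated per run or per resource to cover the other budgets mentioned above.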

Log it like a ledger

When the thing making decisions isn’t predictable, you need to be able to explain it afterwards. Record every write the way a bank records a transaction: what the agent knew, what it proposed, who or what approved it, what changed, and the result. Sooner or later someone asks “why did it do that?” — and “the AI decided to” is not an answer anyone accepts. If you can’t reconstruct it, you shouldn’t have shipped it.
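
The fields listed above map naturally onto a record type. The sketch below is one possible shape, not a prescribed schema: LedgerEntry, Ledger, and every field name are hypothetical, and the in-memory list stands in for an append-only store.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LedgerEntry:
    timestamp: str
    context: dict     # what the agent knew
    proposal: dict    # what it proposed
    approved_by: str  # who or what approved it
    change: dict      # what changed
    result: str       # the outcome


class Ledger:
    """Hypothetical append-only log: entries are added, never edited."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def record(self, entry: LedgerEntry) -> None:
        # Serialize immediately so later mutation of the inputs cannot alter history.
        self._lines.append(json.dumps(asdict(entry), sort_keys=True))

    def entries(self) -> list[dict]:
        return [json.loads(line) for line in self._lines]


ledger = Ledger()
ledger.record(LedgerEntry(
    timestamp=datetime.now(timezone.utc).isoformat(),
    context={"ticket": "customer reports a duplicate charge on order-1001"},
    proposal={"action": "issue_refund", "order_id": "order-1001", "amount_cents": 2500},
    approved_by="human:support-lead",
    change={"order-1001.refunded_cents": {"before": 0, "after": 2500}},
    result="ok",
))
```

With entries like this, answering "why did it do that?" is a matter of reading back one line rather than guessing.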

The shape of it

None of this is exotic. Idempotency, staged approval, reversibility, budgets, audit — these are the habits good engineers already use whenever something untrusted touches an important system. An agent is just the most relentless untrusted input you’ve ever had. Give it a narrow set of writes. Make each one safe to retry, gate the consequential ones, and keep the receipts. Then let it move fast inside those limits. That’s how we wire AI into systems that matter: freely where it’s cheap to be wrong, on a short leash where it isn’t.

Related pattern: Approval-gated AI agent — the reusable shape behind this approach.