PAT-02 · Architecture pattern
Approval-gated AI agent
Let an AI agent act on real systems — on a short leash.
What it is
A pattern for letting an AI agent take actions in a real system without giving it broad access. The agent can call only a small, fixed set of named actions: reads are rate-limited, writes are proposed and then approved, every action is logged, and dangerous operations simply do not exist in its toolset.
When to use it
When you want an agent to do things, not just answer — against a system where a wrong action costs money, data, or trust. If the blast radius of a mistake is real, gate it.
If the agent only reads public, low-stakes data, this is heavier than you need.
System shape
The agent thinks freely but acts on a short leash: a narrow, named toolset; least privilege; human-or-automatic approval before any consequential write; idempotent and reversible operations; and a full audit trail.
Failure modes
- Broad access — “here’s an API key, figure it out” — that is how you get a 3am incident.
- Trusting what the agent reads — hidden instructions in a document can redirect it (prompt injection).
- Non-idempotent writes — the agent retries, and a customer is charged twice.
- No audit log — you cannot answer “why did it do that?”
- No human gate on consequential actions — confident, wrong, and irreversible.
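To make the double-charge failure concrete, here is a minimal sketch of an idempotent write, assuming the caller attaches an idempotency key to each logical operation. `ChargeLedger` and its methods are hypothetical names for illustration, and the in-memory dict stands in for whatever durable store a real payment backend would use.

```python
from dataclasses import dataclass, field


@dataclass
class ChargeLedger:
    """Hypothetical in-memory stand-in for a payment backend."""

    charges: dict = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_cents: int) -> int:
        # A repeated key returns the recorded result instead of charging again,
        # so an agent retry cannot bill the customer twice.
        if idempotency_key in self.charges:
            return self.charges[idempotency_key]
        self.charges[idempotency_key] = amount_cents
        return amount_cents

    def total_charged(self) -> int:
        return sum(self.charges.values())


ledger = ChargeLedger()
ledger.charge("order-17", 500)
ledger.charge("order-17", 500)  # retry with the same key: no second charge
```

The key has to identify the intent (one order, one charge), not the attempt, or retries would get fresh keys and the protection would be lost.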
How we build it
We define the action menu explicitly — each action named, typed, single-purpose — and default to read-only.
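One way to picture such an action menu is the sketch below. The `Action` type, the two example handlers, and `lookup` are all hypothetical; the point is only that every action is named, single-purpose, and read-only unless explicitly marked otherwise.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    name: str
    handler: Callable[..., object]
    writes: bool = False  # read-only unless explicitly declared a write


def get_order_status(order_id: str) -> str:
    # Hypothetical read action.
    return f"order {order_id}: shipped"


def refund_order(order_id: str, amount_cents: int) -> str:
    # Hypothetical write action; in the full pattern it would sit behind approval.
    return f"refunded {amount_cents} cents on order {order_id}"


ACTIONS = {
    action.name: action
    for action in (
        Action("get_order_status", get_order_status),
        Action("refund_order", refund_order, writes=True),
    )
}


def lookup(name: str) -> Action:
    # Anything not on the menu does not exist as far as the agent is concerned.
    if name not in ACTIONS:
        raise KeyError(f"unknown action: {name}")
    return ACTIONS[name]
```

Because the menu is a closed dictionary, a request for something like `drop_database` fails at lookup rather than reaching any backend.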
Writes are proposed and held for approval by a person or an automatic check. Each write is made idempotent and reversible, and is logged with enough context to answer “why did it do that?” The rule: let it think freely, act on a short leash. It is nothing-ships-unreviewed, applied to machines.
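The propose, approve, execute, log flow might look roughly like this. It is a sketch under stated assumptions, not a definitive implementation: `Proposal`, `ApprovalGate`, and their method names are hypothetical, the audit log is an in-memory list of JSON lines, and the approver could equally be a person or an automatic check calling `decide`.

```python
import json
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class Proposal:
    action: str
    args: dict
    reason: str  # the agent's stated rationale, kept for the audit trail
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"


class ApprovalGate:
    def __init__(self):
        self.pending = {}
        self.audit_log = []  # assumption: a real system would use durable storage

    def _log(self, event, proposal, **extra):
        entry = {"ts": time.time(), "event": event, "id": proposal.id,
                 "action": proposal.action, "args": proposal.args,
                 "reason": proposal.reason, **extra}
        self.audit_log.append(json.dumps(entry))

    def propose(self, action, args, reason):
        # The agent can only ask; nothing runs at this point.
        proposal = Proposal(action, args, reason)
        self.pending[proposal.id] = proposal
        self._log("proposed", proposal)
        return proposal

    def decide(self, proposal_id, approved, decided_by):
        # Called by a person or an automatic check.
        proposal = self.pending.pop(proposal_id)
        proposal.status = "approved" if approved else "rejected"
        self._log(proposal.status, proposal, decided_by=decided_by)
        return proposal

    def execute(self, proposal, handler):
        # Only approved proposals run, and each runs at most once.
        if proposal.status != "approved":
            raise PermissionError(f"proposal {proposal.id} is {proposal.status}")
        result = handler(**proposal.args)
        proposal.status = "executed"
        self._log("executed", proposal, result=str(result))
        return result
```

Each log entry carries the action, its arguments, the agent's stated reason, and who decided, which is the context needed to reconstruct why a write happened.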
Related
- Service
- AI integration