PAT-02 · Architecture pattern

Approval-gated AI agent

Let an AI agent act on real systems — on a short leash.

What it is

A pattern for letting an AI agent take actions in a real system without giving it broad access. The agent can only call a small, fixed set of named actions; reads are rate-limited, writes are proposed then approved, every action is logged, and dangerous operations simply do not exist in its toolset.

When to use it

When you want an agent to do things, not just answer — against a system where a wrong action costs money, data, or trust. If the blast radius of a mistake is real, gate it.

If the agent only reads public, low-stakes data, this is heavier than you need.

System shape

An AI agent calls a narrow, fixed action menu. Read actions are rate-limited and go straight to the system; write actions are proposed and must pass an approval gate (human or automatic) before reaching the system of record; every action is recorded in an audit log. Dangerous actions (delete, drop, raw SQL) are absent from the menu.
PAT-02 · system shape

The agent thinks freely but acts on a short leash: a narrow, named toolset; least privilege; human-or-automatic approval before any consequential write; idempotent and reversible operations; and a full audit trail.
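The rate limit on reads can be sketched as a small sliding-window limiter. This is a minimal illustration, not a production design: the names `RateLimiter`, `guarded_read` and `get_order` are hypothetical, the state lives in memory, and a real deployment would likely keep counters per agent in shared storage.

```python
import time
from collections import deque
from typing import Callable


class RateLimiter:
    """Allow at most `limit` calls per `window_s` seconds (sliding window)."""

    def __init__(self, limit: int, window_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps = deque()  # timestamps of recent allowed calls

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


def guarded_read(limiter: RateLimiter, read_fn: Callable[..., object], *args):
    """Run a read action only if the limiter has budget left."""
    if not limiter.allow():
        raise RuntimeError("read rate limit exceeded")
    return read_fn(*args)


# Hypothetical read action, for illustration only.
def get_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}
```

Refusing the call outright, rather than queueing it, keeps a looping agent from building up a backlog of reads against the system of record.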

Failure modes

  • Broad access — “here’s an API key, figure it out” — that is how you get a 3am incident.
  • Trusting what the agent reads — hidden instructions in a document can redirect it (prompt injection).
  • Non-idempotent writes — the agent retries, and a customer is charged twice.
  • No audit log — you cannot answer “why did it do that?”
  • No human gate on consequential actions — confident, wrong, and irreversible.
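The double-charge failure above is usually handled with an idempotency key: each logical write carries a key, and a retry with the same key replays the stored result instead of repeating the side effect. The sketch below assumes an in-memory store and hypothetical names (`IdempotentExecutor`, `charge_customer`, the key format); a real system would need a durable store and an atomic check-and-set.

```python
from typing import Callable


class IdempotentExecutor:
    """Run each write at most once per key; replay the stored result on retry."""

    def __init__(self) -> None:
        self._results = {}  # idempotency key -> result of the first run

    def run(self, key: str, write: Callable[[], object]) -> object:
        if key in self._results:
            return self._results[key]  # retry: no second side effect
        result = write()
        self._results[key] = result
        return result


# Hypothetical write with a visible side effect, for illustration only.
charges = []


def charge_customer() -> str:
    charges.append(1999)  # amount in cents
    return f"charge-{len(charges)}"


executor = IdempotentExecutor()
first = executor.run("order-42:charge", charge_customer)
retry = executor.run("order-42:charge", charge_customer)  # the agent retries
```

After both calls, `charges` holds a single entry and `retry` equals `first`, so the customer is charged once.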

How we build it

We define the action menu explicitly — each action named, typed, single-purpose — and default to read-only.
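One way to picture such a menu is a small registry in which every action has a name, declared parameter types, and a kind that defaults to read. This is a sketch under assumptions: `Action`, `ActionMenu`, `Kind`, and the two example handlers are hypothetical names, not part of any particular agent framework.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Kind(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class Action:
    name: str
    handler: Callable[..., object]
    params: dict               # parameter name -> expected type
    kind: Kind = Kind.READ     # default to read-only


class ActionMenu:
    """A fixed set of named actions; anything not registered cannot be called."""

    def __init__(self, actions: list) -> None:
        self._actions = {a.name: a for a in actions}

    def names(self) -> list:
        return sorted(self._actions)

    def get(self, name: str) -> Action:
        if name not in self._actions:
            raise PermissionError(f"action not in menu: {name}")
        return self._actions[name]

    def validate(self, name: str, args: dict) -> Action:
        """Look up an action and check its arguments against the declared types."""
        action = self.get(name)
        if set(args) != set(action.params):
            raise TypeError(f"{name} expects {sorted(action.params)}")
        for key, expected in action.params.items():
            if not isinstance(args[key], expected):
                raise TypeError(f"{name}.{key} must be {expected.__name__}")
        return action


# Hypothetical handlers, for illustration only.
def get_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}


def refund_order(order_id: str, amount_cents: int) -> dict:
    return {"order_id": order_id, "refunded_cents": amount_cents}


MENU = ActionMenu([
    Action("get_order", get_order, {"order_id": str}),
    Action("refund_order", refund_order,
           {"order_id": str, "amount_cents": int}, kind=Kind.WRITE),
])
```

There is no `delete_order` or `run_sql` entry to misuse: a request for an unregistered name fails at lookup, before any handler runs.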

Writes are proposed and held for approval by a person or an automatic check. They are made idempotent and reversible, and logged with enough context to answer “why did it do that?”. The rule: let it think freely, act on a short leash. It is nothing-ships-unreviewed, applied to machines.
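The propose-then-approve flow can be sketched as a gate that holds each write until someone (or something) decides on it, and logs both steps. Everything here is illustrative: `Proposal`, `ApprovalGate`, `refund_order` and the log fields are assumed names, the approver could equally be an automatic check, and a real audit log would go to append-only storage rather than a Python list.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Proposal:
    action: str
    args: dict
    reason: str  # the agent's stated "why", kept for the audit trail
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"  # pending -> approved | rejected


class ApprovalGate:
    def __init__(self, writes: dict) -> None:
        self._writes = writes   # action name -> handler
        self._pending = {}      # proposal id -> Proposal
        self.audit_log = []     # one JSON line per event

    def _log(self, event: str, p: Proposal, **extra) -> None:
        self.audit_log.append(json.dumps(
            {"event": event, "proposal": p.id, "action": p.action,
             "args": p.args, "reason": p.reason, **extra}))

    def propose(self, action: str, args: dict, reason: str) -> Proposal:
        if action not in self._writes:
            raise PermissionError(f"action not in menu: {action}")
        p = Proposal(action, args, reason)
        self._pending[p.id] = p
        self._log("proposed", p)
        return p  # nothing has touched the system yet

    def decide(self, proposal_id: str, approver: str, approve: bool):
        p = self._pending.pop(proposal_id)  # KeyError if unknown or already decided
        if not approve:
            p.status = "rejected"
            self._log("rejected", p, approver=approver)
            return None
        p.status = "approved"
        result = self._writes[p.action](**p.args)
        self._log("executed", p, approver=approver)
        return result


# Hypothetical write handler, for illustration only.
refunds = []


def refund_order(order_id: str, amount_cents: int) -> str:
    refunds.append((order_id, amount_cents))
    return "refunded"


gate = ApprovalGate({"refund_order": refund_order})
p = gate.propose("refund_order", {"order_id": "A1", "amount_cents": 500},
                 reason="customer reported a duplicate charge")
result = gate.decide(p.id, approver="alice", approve=True)
```

Because a decided proposal is removed from the pending set, the same proposal cannot be executed twice, and each log line carries the action, arguments, reason, and approver.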

Next

Have a system like this to build?