When an AI agent uses your software

AI agents can now do more than answer questions. They can take actions — look things up, fill in forms, call your APIs, change data. That’s genuinely useful. It’s also the riskiest kind of software to point at a real system, because unlike a person, an agent doesn’t slow down, doesn’t second-guess itself, and never gets tired.

The good news: you don’t need a new way of building software to handle this. You need the same safety habits good engineers already use whenever something untrusted touches an important system — just applied more strictly, because this time the thing on the other end acts on its own.

An agent behaves worse than any user you’ve had

A person using your software waits for a page to load and gives up when they’re confused. An agent does neither, and it differs from a human user in three ways that matter:

It repeats itself. Where a person tries once and stops, an agent retries, branches off, and keeps going — sending far more requests than any human would.
It can be confidently wrong. It can take a completely wrong action while sounding perfectly reasonable about it, and it won’t hesitate.
It can be tricked by what it reads. If an agent reads a document or a web page, hidden instructions in that text can change what it does. Assume anything it reads might be trying to manipulate it — attackers already do this. (It has a name: prompt injection.)

None of that makes agents unusable. It just means you should treat an agent as untrusted by default, and only let it do a small, well-defined set of things.

Give it specific actions, not the keys

The tempting shortcut is to hand the agent broad access: “here’s an API key and the database, figure it out.” Don’t. That’s how you end up with a 3am incident.

Instead, give the agent a short, fixed menu of actions you’ve defined and approved — each one named, with clear inputs, doing exactly one thing. The agent can use those actions and nothing else.

// The agent can only call these. It never touches the database directly.
const actions = {
  searchPolicies: { input: PolicyQuery, access: "read"  /* ... */ },
  getQuote:       { input: QuoteInput,  access: "read"  /* ... */ },
  // Notice what's missing: no "run any SQL", no "delete a policy".
  // If a dangerous action doesn't exist, the agent can't take it.
};

The shorter that menu, the less can go wrong. You’re not making the agent any less clever — you’re limiting how much damage it can do.
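
One way the menu idea could be enforced in code is a single entry point that looks up the requested action by name and refuses anything not on the list. The sketch below is illustrative only: the `callAction` function, the `{ text: string }` input shape, and the stand-in lookup are assumptions made for the example, not part of any real system.

```typescript
// Hypothetical sketch: names, input shapes, and handlers are illustrative assumptions.
type Access = "read" | "write";

interface ActionDef {
  access: Access;
  // Each handler validates its own raw input, then does exactly one thing.
  handle: (rawInput: unknown) => unknown;
}

// Assumed input shape for searchPolicies: { text: string }.
function parsePolicyQuery(rawInput: unknown): { text: string } {
  if (typeof rawInput !== "object" || rawInput === null) {
    throw new Error("input must be an object");
  }
  const text = (rawInput as Record<string, unknown>)["text"];
  if (typeof text !== "string" || text.length === 0 || text.length > 200) {
    throw new Error("text must be a string of 1 to 200 characters");
  }
  return { text };
}

// A Map (rather than a plain object) means inherited names such as
// "constructor" can never be mistaken for an action.
const actions = new Map<string, ActionDef>([
  [
    "searchPolicies",
    {
      access: "read",
      handle: (rawInput) => {
        const query = parsePolicyQuery(rawInput);
        // Stand-in for a real, read-only lookup.
        return { matches: [`policy matching "${query.text}"`] };
      },
    },
  ],
]);

// The only door the agent gets: unknown names and malformed inputs are rejected.
function callAction(name: string, rawInput: unknown): unknown {
  const action = actions.get(name);
  if (action === undefined) {
    throw new Error(`Unknown action: ${name}`);
  }
  return action.handle(rawInput);
}
```

With this shape, a request for something like `"runSql"` fails at the lookup, before any handler runs, and a request with the wrong input shape fails at validation.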

Reading is safe. Writing is where it gets dangerous.

Most of what an agent should do is read: look things up, summarise, draft a reply. Reading is easy to allow — just put a limit on how often it can do it, because an agent in a loop can fire off thousands of requests in the time a person makes one.
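
A limit on reads could be sketched as a token bucket: each request spends a token, and tokens refill at a steady rate. This is one common technique among several, and the class name, capacity, and refill rate below are assumptions chosen for illustration.

```typescript
// Hypothetical sketch of a token-bucket limiter; the numbers are illustrative.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;
  private tokens: number;
  private lastRefillMs: number;

  // `now` is injectable so the limiter can be exercised with a fake clock.
  constructor(capacity: number, refillPerSecond: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity;
    this.lastRefillMs = now();
  }

  // Returns true if the request may proceed, false if the agent must wait.
  tryTake(): boolean {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) {
      return false;
    }
    this.tokens -= 1;
    return true;
  }
}
```

A bucket created with `new TokenBucket(2, 1)` would allow a burst of two reads and then roughly one more per second. In a real deployment the bucket would probably be kept per agent or per session, and shared across servers rather than held in memory.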

Changing things is the dangerous part. A wrong change to an important system is exactly the kind of outage you work hard to avoid. So for anything that writes or changes data:

Make repeats harmless. Agents retry automatically, so design it so that doing the same action twice doesn’t charge a customer twice or create a duplicate.
Ask before the big stuff. For anything consequential, have the agent propose the change and get a person — or an automatic check — to approve it before it actually happens.
Make it reversible. If the agent can do something, you should be able to undo it.
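
“Make repeats harmless” is often done with an idempotency key: the caller attaches a unique key to each intended change, and a repeat with the same key returns the first result instead of acting again. The sketch below assumes a hypothetical charge operation and keeps results in memory; a real system would need durable storage and an atomic check-and-set.

```typescript
// Hypothetical sketch: an in-memory idempotent charge operation.
interface ChargeResult {
  chargeId: string;
  amountCents: number;
}

class IdempotentCharges {
  private readonly results = new Map<string, ChargeResult>();
  private chargesMade = 0;

  charge(idempotencyKey: string, amountCents: number): ChargeResult {
    const previous = this.results.get(idempotencyKey);
    if (previous !== undefined) {
      // A retry must describe the same change; anything else is a caller bug.
      if (previous.amountCents !== amountCents) {
        throw new Error(`Key ${idempotencyKey} was already used with a different amount`);
      }
      return previous; // Retry: return the first result, charge nothing.
    }
    this.chargesMade += 1;
    const result: ChargeResult = { chargeId: `ch_${this.chargesMade}`, amountCents };
    this.results.set(idempotencyKey, result);
    return result;
  }

  // How many charges actually happened, regardless of how many calls were made.
  get count(): number {
    return this.chargesMade;
  }
}
```

If the agent retries `charge("order-17", 500)` three times, the customer is charged once and every call gets the same `chargeId` back.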

A simple rule: let the agent think freely, but act on a short leash.
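
The “propose, then approve” step might look something like the queue below: the agent can only record what it wants to do, and the change runs when a reviewer (or an automated check) approves it. The class, the method names, and the `cancelPolicy` action mentioned afterwards are hypothetical.

```typescript
// Hypothetical sketch of a propose-then-approve gate for consequential actions.
type ProposalStatus = "pending" | "approved" | "rejected";

interface Proposal {
  id: number;
  action: string;
  input: unknown;
  status: ProposalStatus;
}

class ApprovalQueue {
  private readonly proposals = new Map<number, Proposal>();
  private nextId = 1;

  // Called by the agent: records the intent and changes nothing.
  propose(action: string, input: unknown): number {
    const id = this.nextId++;
    this.proposals.set(id, { id, action, input, status: "pending" });
    return id;
  }

  // Called by a reviewer or an automated check: runs the change exactly once.
  approve(id: number, execute: (proposal: Proposal) => void): void {
    const proposal = this.takePending(id);
    proposal.status = "approved";
    execute(proposal);
  }

  reject(id: number): void {
    this.takePending(id).status = "rejected";
  }

  private takePending(id: number): Proposal {
    const proposal = this.proposals.get(id);
    if (proposal === undefined) {
      throw new Error(`No such proposal: ${id}`);
    }
    if (proposal.status !== "pending") {
      throw new Error(`Proposal ${id} is already ${proposal.status}`);
    }
    return proposal;
  }
}
```

Here the agent would call `propose("cancelPolicy", { policyId: "P-1" })` and get back an id; nothing happens until someone calls `approve` with that id, and a second approval of the same proposal is refused.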

Be ready to answer “why did it do that?”

Sooner or later, someone — a customer, a regulator, or your own engineer at 3am — will ask why the agent did something. “The AI decided to” is not an answer anyone accepts.

So record every action the agent takes, the way a bank records a transaction: what information it had, which action it took, with what inputs, and what happened. When the thing making decisions isn’t predictable, being able to explain it afterwards isn’t optional. If you can’t explain what it did, you shouldn’t have shipped it.
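
As a rough sketch of what such a record could contain, the wrapper below logs the context, the action, the inputs, and the outcome for every call, whether it succeeds or fails. The field names and the in-memory `auditLog` array are assumptions for illustration; a production log would be append-only and stored somewhere durable.

```typescript
// Hypothetical sketch: one audit record per agent action, success or failure.
interface AuditRecord {
  timestamp: string;
  context: string; // What information the agent had when it acted.
  action: string;
  input: unknown;
  outcome: "ok" | "error";
  detail: unknown; // The result on success, the error message on failure.
}

const auditLog: AuditRecord[] = [];

function runAudited<T>(context: string, action: string, input: unknown, run: () => T): T {
  const base = { timestamp: new Date().toISOString(), context, action, input };
  try {
    const result = run();
    auditLog.push({ ...base, outcome: "ok", detail: result });
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    auditLog.push({ ...base, outcome: "error", detail: message });
    throw err; // The failure is recorded, then surfaced as usual.
  }
}
```

Wrapping every action call in something like `runAudited` means the answer to “why did it do that?” starts from a record rather than a guess.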

It’s still just good engineering

The reassuring part: none of this is exotic. An AI agent is just another user of your system, and the habits that keep any system safe — clear limits, least access, rate limits, the ability to undo, good logging — are exactly the habits that make it safe to put an agent in front of something that matters.

So don’t give an agent the keys to everything. Give it a few safe, specific actions, watch everything it does, and let it be brilliant inside the limits you set. We draw the system before we build it — and that matters even more when part of the system can think for itself.
