Agentic AI: Common Sense Will Be Applied

The sentence every leader says when they greenlight an agent. It has no subject. This series is about the empty chair it leaves behind.

Somewhere in your company a meeting is happening right now. Someone is proposing to hand a decision to an AI agent. A refund. A subscription cancellation. A disputed charge. The room is nervous, so the proposer reaches for the sentence that ends every one of these meetings:

"Don't worry. Common sense will be applied."

Everyone nods. The decision gets signed off. The agent ships.

Here is the problem with that sentence. It has no subject. Read it again. Common sense will be applied by whom? The agent does not have common sense. The proposer has left the room. And the human who used to make this call is being removed, because removing them is the entire point of the project. "Common sense will be applied" is a passive sentence draped over an empty chair.

Figure: "[Who?] will apply common sense." No subject. The chair we just emptied.
A passive sentence covering an empty chair. The judgment is assumed. The owner is missing.

This matters now because agents have crossed a line. For two years they suggested, and a human clicked. Now they act: they issue the refund, send the email, close the ticket, move the money. The safety net of a human between the model and the customer is quietly being removed, one workflow at a time, and "common sense will be applied" is the phrase that waves it through.

So over six parts I will argue one thing. The real question of agentic AI in customer experience is not whether an agent can make a decision. Often it can, and often better than we admit. The real question is who is accountable when it does. And every honest answer leads to one uncomfortable number almost no leadership team is willing to write down.

I will get to that number. First, why "common sense" is the wrong word.

What we actually mean by common sense

When a leader says an agent will use common sense, they mean something specific even if they cannot name it. They mean the agent will know when a case is weird. It will notice that a refund larger than the original order is suspicious. It will sense that a loyal customer of eight years asking for one small exception deserves a different answer than a brand-new account does. It will feel the difference between a routine request and a trap being set.

Watch a good service agent for an afternoon and you see this constantly. They read a one-line message and infer the customer is anxious, not angry. They notice the order history and waive a fee no policy told them to waive. They escalate a case that looks fine on paper because something about it does not add up. None of that is written down anywhere. It is judgment built from experience, and it runs quietly underneath every decision they make. We never had to specify it, because a person was always there to supply it for free.

Agentic AI removes the person. So the judgment that used to be free now has to be built, bought, or done without. "Common sense will be applied" is the sound of a team choosing the third option without noticing they made a choice at all.

The question everyone asks, and the one that matters

The debate about agents almost always starts in the wrong place. It starts with "can the agent decide correctly?" That question feels important, and it is mostly a distraction, because the answer is "yes, most of the time." And "most of the time" is exactly the trap.

The capability question
Can the agent decide correctly?
The accountability question
Who is accountable when it does?
Capability you can buy from a model provider. Accountability you cannot.

An agent that is right 99 percent of the time, run across a high volume of cases, is still wrong hundreds of times a week. Each of those wrong cases lands on a real customer.

The 1%. At 50,000 cases a week, a 99%-correct agent is confidently wrong about 500 times. The dashboard still reads 99% green. Each gold dot in the figure is a person.
The volume that justified automation is the same volume that scales the mistakes.
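The arithmetic behind that figure can be checked in a few lines. This is a minimal sketch: the function name is illustrative, and it assumes errors are spread evenly across cases, which real traffic may not honor.

```python
def expected_wrong_cases(cases_per_week: int, accuracy: float) -> int:
    """Expected number of wrong decisions per week at a given accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    # Round to absorb floating-point noise in (1 - accuracy).
    return round(cases_per_week * (1.0 - accuracy))


# The example from the text: 50,000 cases a week at 99% accuracy.
print(expected_wrong_cases(50_000, 0.99))   # 500

# Doubling the volume doubles the mistakes; the dashboard still reads 99%.
print(expected_wrong_cases(100_000, 0.99))  # 1000
```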

Look at a single one of those gold dots. Take an ordinary billing dispute.

The case
A long-time customer's free trial auto-converted to a $180 annual plan. They emailed to cancel two hours after the charge, having forgotten the trial was ending.
The rule
Policy says annual plans are non-refundable once billed. The charge is valid. The agent applies the rule and declines the refund.
The result
Correct by policy. Wrong by every human measure. A human rep would refund the $180 in seconds to keep a customer worth far more over time. The agent could not see the lifetime value, the goodwill, or the one-star review forming. It felt no doubt, because it had none to feel.

Now the part that makes this genuinely hard. The two cases below arrive at the agent looking identical: same product, same policy, same words. A human would treat them completely differently. The agent cannot tell them apart, and decides both with the same flat confidence.

Identical to the agent · opposite right answers
Case A
First-time customer. One charge, one polite request, no history. Genuinely forgot to cancel.
A human refunds it. Goodwill, tiny cost.
Case B
Sixth "forgotten" trial this year across three accounts. Same wording every time. Working the policy.
A human declines it. This is a pattern, not an accident.
Common sense is mostly the ability to tell these two apart. The rule cannot. The agent will not.
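One way to see why the two cases collide is to look at what each decision-maker is given as input. The sketch below is illustrative only: the field names, the policy label, and the thresholds are assumptions, not anything the article specifies. It is not offered as a fix, only as a picture of the context a human rep reads and a rule applied to the request alone never receives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundRequest:
    product: str
    policy: str
    message: str


@dataclass(frozen=True)
class CustomerHistory:
    # Hypothetical fields; the article does not define a data model.
    prior_forgotten_trial_claims: int
    linked_accounts: int


def rule_only_decision(request: RefundRequest) -> str:
    """Sees only the request, so Case A and Case B are the same input."""
    return "decline" if request.policy == "annual_non_refundable" else "refund"


def history_aware_decision(request: RefundRequest, history: CustomerHistory) -> str:
    """Adds the context a human rep would glance at. Thresholds are illustrative."""
    looks_like_a_pattern = (
        history.prior_forgotten_trial_claims >= 2 or history.linked_accounts > 1
    )
    return "decline" if looks_like_a_pattern else "refund"


# Same product, same policy, same words.
request = RefundRequest(
    product="annual_plan",
    policy="annual_non_refundable",
    message="I forgot my trial was ending. Please refund.",
)
case_a = CustomerHistory(prior_forgotten_trial_claims=0, linked_accounts=1)
case_b = CustomerHistory(prior_forgotten_trial_claims=5, linked_accounts=3)

print(rule_only_decision(request))              # decline, for both cases
print(history_aware_decision(request, case_a))  # refund
print(history_aware_decision(request, case_b))  # decline
```

Note that the second function still needs someone to choose the thresholds, which is the accountability question again under a different name.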

The question that decides whether you can live with that is not about the 99 percent. It is about the one percent. Who owns it, who sees it, who answers for it, and what it costs when it fails in a way no dashboard caught. That is an accountability question, not a capability question. Capability you can buy. Accountability has to sit on a named human by design, or it sits nowhere, which in practice means it sits on your brand.

Where the human goes. It does not leave.

Here is the through-line of the whole series, stated up front so you can watch me try to break it and fail.

You cannot delete the human from a consequential decision. You can only relocate them.

The principle this series keeps proving
Conservation of Judgment. In any consequential decision, the human judgment never disappears. Automation only moves it: out of the routine flow, onto the hard cases and the parameters above them.

Every architecture in this series, and some are genuinely good, does the same thing. It moves the human.

Before
Human in every decision

High volume, low leverage. Every case touched, including the easy ones.

After
Human at the edges and the parameters

The human agent at the edges: the hard, novel, irreversible cases.

The manager at the parameters: the rules the whole system runs under.

Less volume, far more consequential work. The chair moved. It did not vanish.

Picture the senior rep who used to approve two hundred refunds a day. In the new design she approves almost none, because the system resolves the easy ones without her. Instead she owns the dozen genuine exceptions the machine flagged, she watches the drift dashboard for the day the refund rate jumps three points overnight, and she audits a random sample to catch the cases that looked normal and were not. Same person. Far fewer cases. Far more leverage. The skill in agentic CX is not removing humans. It is deciding precisely where the remaining humans should stand.

What this series will do

Six parts. Each one is a single move in the argument.

Intro: Common sense will be applied. Why the sentence has no subject, and why the real question is accountability, not capability. You are reading it.
Part 1: The four bands. When an agent can genuinely own a decision: high volume, low variance, reversible, rule-correct almost always, with a human owning the exceptions.
Part 2: Stop if unclear, and the supervisor trap. Why "the agent will escalate when confused" fails. Models are confidently wrong, so they stop on the easy cases and sail through the dangerous ones.
Part 3: The human who stopped reading. How to keep a human in the loop without breeding a rubber stamp. The counterintuitive fix: send humans fewer cases, not more.
Part 4: Two managers who disagree on purpose. Money against experience, as two supervisors pulling opposite ways. Route on their disagreement instead of forcing them to agree.
Part 5: The one number you cannot escape. Budgets, end-of-period stinginess, and the punchline: how many euros is one unhappy customer worth. Where the humans finally belong.

The argument, end to end. Each part is one move.

The number nobody writes down

Here is where we end, so you know what you are signing up for.

Every design move in this series will try to engineer away the discomfort of letting an agent decide. Each one fails in the same elegant way: it takes the hard question and gives it a nicer name.

Single supervisor. Dual supervisor. Budgets. Error budgets.
All four lead down to the number no architecture can hide: how many euros is one NPS point worth?
Four mechanisms. One decision underneath, wearing four costumes.

And here is the quiet truth. You have already set that number. Your refund policy encodes it: a thirty-day limit that declines a loyal customer on day thirty-one is a statement that their goodwill is worth less than the refund. Your escalation rules encode it. Your staffing encodes it. The price of an unhappy customer is already priced into a dozen decisions across your business. You simply have never had to say it in one sentence.

Agentic AI changes that. The most uncomfortable feature of handing decisions to an agent is that it forces you to write the number down, because an agent cannot act on a figure that lives only in a manager's gut. It needs the threshold, the budget, the rule. Implicit judgment has to become explicit instruction.
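Here is a minimal sketch of what "explicit instruction" might look like once the number is written down. Every name and value is a placeholder assumption, including the owner field, the euro amounts, and the decision rule itself. The point is only that the agent needs each of them as a concrete value with a named person attached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundParameters:
    owner: str                        # named human accountable for these values
    refund_window_days: int           # e.g. the thirty-day limit in the text
    unhappy_customer_cost_eur: float  # the number nobody writes down (placeholder)


def decide_refund(days_since_charge: int, refund_amount_eur: float,
                  params: RefundParameters) -> str:
    """Refund inside the window, or when refusing would cost more than refunding."""
    if days_since_charge <= params.refund_window_days:
        return "refund"
    if params.unhappy_customer_cost_eur > refund_amount_eur:
        return "refund"
    return "decline"


# Hypothetical values. Changing one number changes the answer on day 31.
generous = RefundParameters("name-of-accountable-manager", 30, 250.0)
strict = RefundParameters("name-of-accountable-manager", 30, 100.0)

print(decide_refund(31, 180.0, generous))  # refund
print(decide_refund(31, 180.0, strict))    # decline
```

A flat thirty-day rule with no second check behaves like the strict configuration for every customer, which is the sense in which the policy already encodes the number.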

Most organizations will not write it down. It is politically ugly to put on paper. So they say "common sense will be applied" instead, ship the agent anyway, and hope the empty chair holds.

This series is about not doing that.

Next — Part 1: The four bands of safe autonomy