The Operational Middle: Where Agentic AI Actually Works

Everyone is arguing about agentic AI. Some people want it to write code. Some want it to book travel. Some want it running customer support. The debate is loudest at the extremes: either agents will replace entire departments or they are chatbots with extra steps.

I think both sides are looking in the wrong place. The most valuable use of agentic AI isn't at the top of the organization, where strategic decisions still need human judgment. It isn't at the bottom either, where routine transactions are already automated to death. It's in the middle — the layer of work between strategy and transaction that most organizations don't even have a clean name for.

I've spent most of the last several months building for that layer. Here's what I've learned about where agentic AI actually works.

The Chatbot Trap

There's a reason every company's first AI project is a chatbot. Chatbots are legible. You can demo them. You can point to a screen and say "look, it answered the question." Executives understand them. Budgets get approved.

But a chatbot's job is to answer, not to decide. The moment you ask a chatbot to take an action — commit to a price, release a purchase order, reroute a shipment — you discover it's built for a different kind of work. You can shim action-taking into a chatbot interface, but what you've actually built at that point isn't a chatbot anymore. It's an agent. And chatbots are not the right design pattern for agents.

Chatbot energy has absorbed a huge portion of corporate AI investment. That's fine as long as you remember it's the 10% easy case, not the 90% valuable one.

The Transaction Trap

On the opposite end of the spectrum, there's transaction automation. ERP systems, API integrations, RPA bots — a whole industry of software that has gotten very good at moving data from one place to another. If your purchase order needs to be created, sent to a vendor, acknowledged, and receipted, there is software that does exactly that. Has done for thirty years.

Some of the louder voices in AI say agents will "replace ERPs." They won't. ERP systems encode decades of learned operational knowledge about how a business actually runs, and the routine transactions they handle are not what needs intelligence. Those transactions need reliability, audit trails, and predictable costs. Asking an LLM to handle a 2,000-line purchase order acknowledgment is substituting cleverness for dependability.

Routine transactions are a solved problem. Agents don't make them better. They make them expensive.

What's in the Middle

So where does agentic AI actually earn its keep? In the layer between strategy and transaction — what I've started calling the operational middle.

The operational middle is where humans spend most of their day. Think about what a supply planner actually does on a Tuesday morning. They pull last week's demand signals. They check which suppliers are behind on acknowledgments. They compare the ML forecast against the planner-adjusted numbers. They look at inventory positions across three regional warehouses. They notice that one part number has drifted outside its policy band. They calculate a new safety stock. They decide whether to bump up the next order or wait. They write a note to the procurement team.
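
To make one slice of that Tuesday morning concrete, here is a minimal sketch of the policy-band check and the safety stock recalculation. It assumes the textbook formula (z-score times the standard deviation of daily demand times the square root of lead time), which may not be the formula any given planning team uses. The function names, the band limits, and the sample numbers are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, stdev


def safety_stock(daily_demand, lead_time_days, service_level=0.95):
    """Textbook safety stock: z * sigma_daily * sqrt(lead time).

    Assumes demand is roughly normal and lead time is fixed; real
    planning policies are often more involved.
    """
    z = NormalDist().inv_cdf(service_level)  # z-score for the target service level
    sigma = stdev(daily_demand)              # sample std dev of daily demand
    return z * sigma * sqrt(lead_time_days)


def outside_policy_band(on_hand, band_low, band_high):
    """True when the inventory position has drifted outside its policy band."""
    return not (band_low <= on_hand <= band_high)


# Hypothetical numbers for a single part number.
demand = [40, 55, 38, 61, 47, 52, 44]
on_hand = 95
if outside_policy_band(on_hand, band_low=120, band_high=400):
    target = safety_stock(demand, lead_time_days=9)
    print(f"Out of band; recalculated safety stock is about {target:.0f} units")
```

The arithmetic is the easy part. Deciding whether to bump the next order or wait is where the planner's judgment comes in.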

None of that is strategic. None of it is routine either. It's multi-step, judgment-heavy work that requires pulling from several systems, reasoning about the state, and making a decision within a fuzzy policy. It's the kind of work that an experienced planner does well, a junior planner does poorly, and a spreadsheet cannot do at all.

This is where agentic AI pays off. Not because agents are smarter than planners (they're not) but because the work is the right shape for agents. It's repeatable enough to encode. It's rules-ish enough to define guardrails around. It's complex enough that a simple script doesn't cut it. And it's voluminous enough that automating the repeatable 80% frees the planner to focus on the judgment-heavy 20%.

If you can name this kind of work in your organization — the daily, weekly, monthly cycles that every experienced operator does in their head — you've found your operational middle.

Where It's Not a Fit

Two places where I'd argue against reaching for agents:

Upstream, at the strategy layer. Should we enter a new market? Should we consolidate suppliers? Should we change our service-level commitments? These decisions require organizational context that agents don't have, stakeholder judgment they cannot exercise, and accountability that cannot be delegated. An executive who says "the agent recommended we exit Europe" and points at a CI/CD log as the decision trail is not a good scene. Strategy needs humans.

Downstream, at the transaction layer. If a process is already a clean state machine — order received, payment verified, invoice generated, shipment booked — don't put an agent in front of it. You'll add latency, non-determinism, and cost. The existing system works because it is boring. Boring is a feature.
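
A clean state machine of this kind fits in a few lines, which is part of the argument: there is nothing left for a model to decide. The state names come from the example above. The transition table and the `advance` helper are illustrative assumptions, not a description of any real order system.

```python
from enum import Enum


class OrderState(Enum):
    RECEIVED = "order received"
    PAID = "payment verified"
    INVOICED = "invoice generated"
    BOOKED = "shipment booked"


# Each state has exactly one legal successor, so the process is fully deterministic.
NEXT_STATE = {
    OrderState.RECEIVED: OrderState.PAID,
    OrderState.PAID: OrderState.INVOICED,
    OrderState.INVOICED: OrderState.BOOKED,
}


def advance(state):
    """Move an order to its next state, or fail loudly if it is already done."""
    if state not in NEXT_STATE:
        raise ValueError(f"no transition out of {state.name}")
    return NEXT_STATE[state]


state = OrderState.RECEIVED
while state in NEXT_STATE:
    state = advance(state)
print(state.value)  # prints "shipment booked"
```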

The operational middle is specifically where work is not routine enough to be pure transaction but also not strategic enough to require a human. That's a narrower band than the AI hype suggests, but it's a deep one — almost every knowledge-work job has hours of operational-middle work every day.

One Big LLM Loop Is Not the Architecture

When people build their first agentic system, there's a natural temptation to wire everything up to a single large-context LLM call. Give the model access to all the tools. Let it reason its way through the whole workflow. Hand it the keys. This works surprisingly well for small, linear tasks. It does not work for the operational middle.

The operational middle has structure. Weekly demand planning isn't "figure out what to do" — it's "pull the signals, update the forecast, check against policy, generate recommendations, hand to planner for review." That's a workflow. It has a start state. It has conditional branches. It has handoffs to humans at specific points. It has points where work must wait — sometimes for days — before the next step fires.
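
As a rough illustration, the weekly planning cycle above can be written down as an explicit sequence of named steps with a conditional branch and a human handoff. This is plain Python standing in for a real workflow engine. The step functions, the 15% policy band, and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlanningRun:
    """State carried through one weekly planning cycle (hypothetical shape)."""
    signals: list = field(default_factory=list)
    forecast: float = 0.0
    in_policy: bool = True
    recommendation: str = ""
    status: str = "started"


def pull_signals(run):
    run.signals = [120, 150, 148, 170]  # stand-in for last week's demand signals


def update_forecast(run):
    run.forecast = sum(run.signals) / len(run.signals)


def check_policy(run, planned=125.0, band=0.15):
    # In policy when the forecast is within +/- band of the planned number.
    run.in_policy = abs(run.forecast - planned) <= band * planned


def generate_recommendation(run):
    run.recommendation = "hold order" if run.in_policy else "raise next order"


def run_weekly_planning():
    run = PlanningRun()
    for step in (pull_signals, update_forecast, check_policy, generate_recommendation):
        step(run)
    # Conditional branch: only out-of-policy results wait on a human.
    run.status = "done" if run.in_policy else "waiting_for_planner"
    return run


result = run_weekly_planning()
print(result.status, "|", result.recommendation)
```

In a durable engine the `waiting_for_planner` state would be persisted, possibly for days, rather than held in memory as it is here.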

This is what BPMN (Business Process Model and Notation) was designed for, and it's having a quiet comeback as the backbone for agentic systems. BPMN gives you a declarative description of the process: here is the sequence, here are the decision points, here are the human tasks. A durable execution engine like Temporal then runs the process, surviving restarts, retries, long waits, and partial failures in a way a naive LLM loop cannot. Temporal workflows are written as code, so in practice the BPMN model is mapped onto a workflow definition rather than executed directly.

Inside that structured workflow, individual agents handle the bounded cognitive tasks: classify this input, draft this recommendation, evaluate this policy. Each agent has a narrow job with clear inputs and outputs. The agents don't know about the workflow; the workflow doesn't know about the agents' internals. And the integrations between agents and the outside world (fetching ERP data, reading telemetry, writing back decisions) are mediated through tool servers in the style of MCP (Model Context Protocol) that keep boundaries and auditability clean.
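
One way to picture that separation is an agent that is just a function from a typed input to a typed output, reaching the outside world only through a narrow tool interface. The sketch below uses a rule in place of a model call and an in-memory stub in place of a real tool server. `InventoryTool`, `StubInventoryTool`, and the field names are hypothetical, and nothing here is the MCP wire protocol itself.

```python
from dataclasses import dataclass
from typing import Protocol


class InventoryTool(Protocol):
    """The only door the agent has to the outside world (hypothetical interface)."""
    def on_hand(self, sku: str) -> int: ...


@dataclass(frozen=True)
class ReorderInput:
    sku: str
    reorder_point: int


@dataclass(frozen=True)
class ReorderOutput:
    sku: str
    action: str      # "reorder" or "hold"
    rationale: str


def reorder_agent(request: ReorderInput, tool: InventoryTool) -> ReorderOutput:
    """Bounded task: one SKU in, one recommendation out.

    A real agent would call a model here; a rule keeps the sketch runnable.
    """
    stock = tool.on_hand(request.sku)
    action = "reorder" if stock < request.reorder_point else "hold"
    rationale = f"on hand {stock} vs reorder point {request.reorder_point}"
    return ReorderOutput(request.sku, action, rationale)


class StubInventoryTool:
    """In-memory stand-in for a tool server, logging every call for audit."""
    def __init__(self, stock):
        self.stock = stock
        self.audit_log = []

    def on_hand(self, sku: str) -> int:
        self.audit_log.append(("on_hand", sku))
        return self.stock[sku]


tool = StubInventoryTool({"PN-1001": 40})
decision = reorder_agent(ReorderInput("PN-1001", reorder_point=60), tool)
print(decision.action, tool.audit_log)
```

Because the agent only sees `ReorderInput` and the tool interface, the workflow around it can change without the agent noticing, and every external call leaves a record.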

The architecture that works for the operational middle is this: BPMN for the workflow, agents for the bounded cognitive tasks, MCP for the integrations, Temporal for durability. It's more moving parts than a single LLM call, but it matches the shape of the work. And the parts are modular enough that you can evolve them independently — swap out the forecasting model without touching the workflow, refine the guardrails without rebuilding the integrations.

Guardrails Are Contracts, Not Chains

The word "guardrails" gets used loosely. In practice there are two very different things people mean by it.

The first is preventing bad behavior: the agent shouldn't hallucinate, shouldn't say offensive things, shouldn't leak the system prompt. These are important, but they're mostly the model vendor's problem to solve through training and red-teaming. You inherit them by picking a good model.

The second — the one that actually matters when you're building for the operational middle — is defining what the agent is allowed to decide. The maximum dollar value of an order it can release. The SKU categories it can reroute. The confidence threshold below which it escalates. The temporal window it can operate in. These are not safety guardrails. They are contract boundaries.

A good contract does two things at once. It tells the agent what not to do — which is the thing most discussions focus on. But it also tells the agent what it is expected to do without asking for approval. A vague contract that says "escalate anything unusual" produces an agent that escalates everything and does nothing. An overly narrow contract produces an agent that cannot be trusted with anything meaningful. The sweet spot is specific enough to enable confident autonomous action and broad enough that the agent actually saves planner time.
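
A contract of this kind can be expressed as data plus one function that turns a proposed action into a verdict. The sketch below is an assumption about shape, not a prescribed schema: the field names, limits, and SKU categories are invented, the temporal window is modeled as a simple range of hours, and the split between "refuse" (out of scope entirely) and "escalate" (in scope, but a human decides) is one possible design choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    max_order_value: float          # largest order the agent may release alone
    allowed_categories: frozenset   # SKU categories it may act on
    min_confidence: float           # below this, escalate
    active_hours: range             # temporal window, as hours of the day


@dataclass(frozen=True)
class ProposedAction:
    order_value: float
    category: str
    confidence: float
    hour: int


def evaluate(contract, action):
    """Return "refuse", "escalate", or "act" for a proposed action."""
    # Hard boundaries: things the agent must never do on its own authority.
    if action.category not in contract.allowed_categories:
        return "refuse"
    if action.hour not in contract.active_hours:
        return "refuse"
    # Soft boundaries: in scope, but a human should make the call.
    if action.order_value > contract.max_order_value:
        return "escalate"
    if action.confidence < contract.min_confidence:
        return "escalate"
    # Everything else is work the agent is expected to do without asking.
    return "act"


contract = Contract(25_000.0, frozenset({"fasteners", "packaging"}), 0.8, range(6, 20))
print(evaluate(contract, ProposedAction(4_200.0, "fasteners", 0.93, hour=10)))    # act
print(evaluate(contract, ProposedAction(80_000.0, "fasteners", 0.93, hour=10)))   # escalate
print(evaluate(contract, ProposedAction(4_200.0, "electronics", 0.93, hour=10)))  # refuse
```

The final `return "act"` is the half of the contract that tends to get forgotten: it is the explicit statement of what the agent should do without asking.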

Designing those contracts is the deepest work of building agentic systems — and, not coincidentally, it's the work that most resembles what experienced operators already do. Translating the implicit policies that a senior planner applies in their head into explicit thresholds that an agent can apply in code is exactly the kind of knowledge transfer that organizations have always struggled with. Now the agent is the student, and the contract is the curriculum.

How to Find Your Operational Middle

If you're trying to figure out where agentic AI fits in your organization, my advice is to start not with the technology but with a work inventory. Sit with an experienced planner, analyst, or operator for a full day. Write down every decision they make. You'll see three rough buckets:

Decisions where they had to stop and really think. These are strategy. Leave them to humans.
Decisions where they clicked through screens without looking. These are transactions. Automate them through normal means if they aren't already.
Decisions where they combined information from several places, applied a judgment, and moved on — usually saying "I do this every week" out loud. These are your operational middle.

The third bucket is almost always bigger than anyone expects. A senior operator in a well-run organization will have dozens of these routines — each of them a candidate for an agent with a clear contract and a BPMN-shaped workflow around it.

Once you have the inventory, pick the one that (a) is most repeated, (b) is most tolerant of an occasional escalation, and (c) has the clearest existing policy. That's your first build. Start narrow. Prove the contract works. Iterate the guardrails until the agent's decisions and the human's decisions align 90%+ of the time. Then move to the next routine.
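
The 90% alignment target can be measured by replaying the same cases through the agent and comparing against what the planner actually decided. A minimal sketch, assuming decisions are logged as simple labels keyed by case ID; the case IDs and labels are made up.

```python
def agreement_rate(agent_decisions, human_decisions):
    """Fraction of shared cases where agent and human made the same call."""
    shared = agent_decisions.keys() & human_decisions.keys()
    if not shared:
        raise ValueError("no overlapping cases to compare")
    matches = sum(agent_decisions[c] == human_decisions[c] for c in shared)
    return matches / len(shared)


# Hypothetical replay of ten past cases.
human = {f"case-{i}": "hold" for i in range(10)}
human["case-3"] = "reorder"
human["case-7"] = "reorder"
agent = dict(human)
agent["case-7"] = "hold"  # the one disagreement

rate = agreement_rate(agent, human)
print(f"{rate:.0%} agreement")  # prints "90% agreement"
```

Reviewing the disagreeing cases is likely to be what tells you which threshold in the contract to adjust.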

Find Your Operational Middle First

Every time I see a new company launch with an "agentic AI platform," the first question I ask is: what work is it doing? If the answer is "chatbot, but smarter," I lose interest. If the answer is "automating transactions," I know they will be outrun by dedicated ERP vendors. But if the answer is "taking on the daily routines that planners and analysts grind through" — the layer of repeatable, multi-step, judgment-heavy work that every organization has — I pay attention.

Agentic AI isn't a magic wand you wave at an entire operation. It's a precision tool for a specific kind of work. Find your operational middle first. Design the agents for it second. Everything else is a distraction.