Your AI Strategy Is a Workforce Problem

The pressure

The spend is ahead of the scoping

"What's our AI strategy?" Every leadership team is asking it, and the pressure rolls downhill until it lands on engineering, which is now expected to multiply its productivity with last quarter's overcommitments still on the hook and the same deadlines.

So leadership goes shopping, and every advisor has a different blueprint. Few can tell real from vaporware, let alone what survives contact with a twenty-year-old brownfield back office carrying deep tech debt. Much of the current AI transformation spend is FOMO-driven, bought before the problem was scoped.

The technology works. Implementations fail because nobody managed what they actually bought: a labor source hired without onboarding, staffing, or supervision.

The tools were bought for the old workforce

Most enterprise software assumes a human operator: it is priced by the seat and driven by clicks. Agents work through APIs, documentation, and clean data. Once AI joins the labor force, machine-operability becomes a platform selection criterion, and a system only a human can drive puts a ceiling on how far the automation can go.

The transfer

You already know how to manage this

AI won't replace the engineer. It changes the work and most of the workflow. The part nobody has done is the boring part of mapping which agent does what, with which handoffs, to which humans. Good old people, process, technology. Given what I had to build to make agents actually deliver, I am not surprised "implement AI" isn't working.

Workforce management is a century-old discipline with library wings dedicated to it, covering how to staff, scope, onboard, and run quality gates. I ran it for twenty-six years in telecom, from a fifty-person provisioning organization to a 140-person platform engineering org, and in my experience it transfers directly to AI agents. We just never had to apply it to a workforce with extreme amnesia, where every session is a blank slate and the whole team forgot what it worked on this morning.

My best outcomes came from treating the old discipline as the playbook and adjusting for the constraints of the labor. Staffing, scoping, and review, run against a workforce that thinks in data-center time and remembers nothing.

The reframe

Stop asking "where can we use AI?" Start asking "how would I run a team of fast, tireless, amnesiac contractors who have no stake in the outcome?" Answer that and most of your AI strategy writes itself.

Failure mode

An agent is its own worst reviewer

The person who built something has blind spots about what they built. Every audit framework ever written knows this. The agent that wrote your code will review its own work, pat itself on the back, and hand you a thumbs up with zero second-guessing.

So separate the builder from the auditor. For anything high stakes, the agent that writes the code does not get to bless it. A different agent, with a different brief and no ego invested in the first one's choices, does the review, for the same reason a developer doesn't approve their own pull request.
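The separation can be enforced mechanically instead of by convention. The sketch below is one minimal way to do it, not a prescribed implementation: the `Change` record, the agent names, and the `high_stakes` flag are all hypothetical, and a real pipeline would hook the same check into whatever merge gate it already has.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """A unit of work produced by one agent (hypothetical structure)."""
    change_id: str
    built_by: str
    high_stakes: bool
    approvals: list = field(default_factory=list)


def approve(change: Change, reviewer: str) -> None:
    """Record an approval, refusing self-review on high-stakes changes."""
    if change.high_stakes and reviewer == change.built_by:
        raise PermissionError(
            f"{reviewer} built {change.change_id} and cannot approve it"
        )
    change.approvals.append(reviewer)


def can_merge(change: Change) -> bool:
    """High-stakes work needs an approval from someone other than the builder."""
    if change.high_stakes:
        return any(r != change.built_by for r in change.approvals)
    return bool(change.approvals)
```

In this sketch a low-stakes change can still be self-approved; where the high-stakes line sits is a management decision the code only records.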

Failure mode

More context makes them dumber

The instinct is that if the agent got it wrong, it needed more information. Usually the opposite is true. Agents have finite working memory, and every page of background you load is space they can't spend on the actual task. Research shows performance falling by more than half as the context fills. The well-meaning move of dumping in more is the thing degrading the work.

Two failure modes for one undocumented codebase, and they pull against each other:

Under-context it and the agent fills the blanks with guesses, then ships them with confidence.
Over-context it and it stops following instructions and loses the thread.
There is a right-sized middle, and you only find it by getting it wrong a few times.
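One way to make the right-sized middle concrete is a hard token budget per task. The following is a rough sketch under stated assumptions: the `Snippet` structure and its `relevance` score are hypothetical, and `estimate_tokens` counts whitespace-separated words as a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    name: str
    text: str
    relevance: float  # hypothetical score, 0.0 to 1.0, set by whoever curates context


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def pack_context(snippets, budget, required=()):
    """Pick snippets for one task without exceeding `budget` tokens.

    Required snippets go in first (guards against under-context). The rest
    are tried most-relevant-first, skipping any that would overflow the
    budget (guards against over-context).
    """
    chosen, used = [], 0
    by_name = {s.name: s for s in snippets}
    for name in required:
        snippet = by_name[name]
        cost = estimate_tokens(snippet.text)
        if used + cost > budget:
            raise ValueError(f"required snippet {name!r} does not fit the budget")
        chosen.append(snippet)
        used += cost
    optional = sorted(
        (s for s in snippets if s.name not in required),
        key=lambda s: s.relevance,
        reverse=True,
    )
    for snippet in optional:
        cost = estimate_tokens(snippet.text)
        if used + cost <= budget:
            chosen.append(snippet)
            used += cost
    return chosen, used
```

The budget itself is the number you tune by getting it wrong a few times; the code only guarantees that whatever value you pick is actually respected.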

Why context engineering is a job now

Entire roles are appearing around feeding agents exactly what they need and nothing more. I get it. I had to build a context optimization layer myself before the output got reliable. Curating what the agent sees is the work, not an afterthought.

Two more failure modes

Generalists are mediocre, and cheap models are expensive

Ask one agent to research, design, build, test, and document and you get a B-minus across the board. The skills for each fill up the working memory the others need, so it does all five badly at once. You don't ask your best engineer to also be the architect, the QA lead, the tech writer, and the PM on the same task. Give one agent one clear job and it does that job well.

Same logic on model selection. You match seniority to task complexity with people, and models are no different. A cheap model on a hard architectural call just generates cleanup work for your human. An expensive model on a routine chore doesn't return better output, only a bigger invoice. Size the model to the task and you get the best result at the right cost.

One agent, one job. Narrow the scope until the work fits the memory.
Route by complexity: cheap model for the boring task, frontier model for the ambiguous one.
The wrong model either burns money or buries an engineer in rework.
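Routing by complexity can start as a very small rule table. This is an illustrative sketch only: the tier names, the relative costs, and the two 1-to-5 scores (`ambiguity`, `blast_radius`) are assumptions made for the example, not measurements from any real model lineup.

```python
from dataclasses import dataclass

# Hypothetical tiers and relative per-task costs; substitute real models and prices.
TIER_COST = {"small": 1, "mid": 5, "frontier": 25}


@dataclass
class Task:
    name: str
    ambiguity: int     # 1 = fully specified, 5 = open-ended (scored by a person or rubric)
    blast_radius: int  # 1 = one file, 5 = architecture-wide


def route(task: Task) -> str:
    """Pick a model tier from the harder of the two scores."""
    score = max(task.ambiguity, task.blast_radius)
    if score >= 4:
        return "frontier"
    if score == 3:
        return "mid"
    return "small"


def plan_cost(tasks) -> int:
    """Total relative cost of a batch once every task is routed."""
    return sum(TIER_COST[route(t)] for t in tasks)
```

Taking the maximum of the two scores errs toward the stronger model when either dimension is hard, which trades some spend for less rework. The thresholds are the part to adjust against your own results.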

Failure mode

They will march in a circle for days

Two things wreck an agent in any real codebase. The first is undocumented decisions: the compliance rule that shaped the data model, the lawsuit that put one approach off-limits, and the ugly architecture choice that exists to handle three corner cases nobody remembers. Humans absorb that over years of standups and hallway conversations. An agent has none of it, so it will reintroduce every bug you already fixed.

The second is that an agent never gets frustrated. A human engineer stuck on a broken approach gets annoyed and pulls in help. An agent will retry the same dead end for hours, burning resources and producing nothing, and never complain. You have to build the framework that makes the agent question its own approach and escalate, because it will never escalate on its own.
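A minimal version of that framework is a guard around the retry loop that notices repetition and forces a handoff. The sketch below assumes the orchestrator can describe each failed attempt as an `approach` string and an `error` string; the class name, both limits, and the `"retry"`/`"escalate"` return values are hypothetical.

```python
import hashlib
from collections import Counter


class LoopGuard:
    """Stops a retry loop when the agent repeats itself or exhausts its attempts."""

    def __init__(self, max_attempts=5, max_repeats=2):
        self.max_attempts = max_attempts  # total failures allowed before escalating
        self.max_repeats = max_repeats    # times the same failure may recur
        self.attempts = 0
        self.seen = Counter()

    @staticmethod
    def _fingerprint(approach: str, error: str) -> str:
        # Normalize case and whitespace so trivially reworded retries still match.
        parts = [" ".join(p.lower().split()) for p in (approach, error)]
        return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()

    def record_failure(self, approach: str, error: str) -> str:
        """Return 'retry' or 'escalate' after a failed attempt."""
        self.attempts += 1
        key = self._fingerprint(approach, error)
        self.seen[key] += 1
        if self.seen[key] > self.max_repeats:
            return "escalate"  # same dead end, tried too many times
        if self.attempts >= self.max_attempts:
            return "escalate"  # attempt budget spent, even on distinct approaches
        return "retry"
```

Exact-match fingerprints only catch literal repetition. An agent that rephrases the same broken idea will slip past the repeat check, which is why the plain attempt cap is there as a backstop.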

AI has no passion for quality

It will ship excellence and slop with the same confidence. Quality doesn't come out of the box; you engineer for it, and you pressure test to find out which methods actually work.

The payoff

Methodology breaks the iron triangle

Every fix above is a management decision, not a technical one. Separate builder from reviewer. Don't overload anyone with information or responsibility. Match seniority to the task. Build an escalation path. Measure what got done, not how busy the thing looked. You already run your human org this way. You just never wrote the version for amnesiac contractors.

The companies getting no value from AI aren't failing on the technology. They're failing to apply the discipline they spent decades refining for people. I optimized for quality first and cost second, and the cost followed; tightening model and context choices later cut my token usage threefold with no loss in output.

Get the methodology right and the iron triangle loosens: quality, speed, and cost stop trading against each other. That is the prize, and it is a workforce problem the whole way down. Three questions decide whether you get it:

Did you scope each agent to one job it can actually hold in memory?
Did you separate who builds from who checks, and build an escalation path for when it loops?
Did you engineer for quality on purpose, or hope it came in the box?