March 31, 2026

Your AI Strategy Is a Workforce Problem

Most AI strategies fail for a reason no vendor will tell you: you bought a labor source and never managed it like one.

The pressure

Everyone is buying shovels

"What's our AI strategy?" Every leadership team is asking it, and the pressure rolls downhill until it lands on engineering. The C-suite wants 30x productivity. The consultants are confidently telling your boss what your job looks like in two years. Last quarter's overcommitments are still on the hook. We gave you Copilot, why aren't your teams 30x faster? Same deadlines. The beatings will continue until AI improves morale.

So leadership goes shopping, and every consultant has a different blueprint. Nobody can tell real from vaporware, let alone what survives contact with a 20-year-old brownfield back office with tech debt stacked to the ceiling. There is a gold rush in AI transformation, and most of the spend is FOMO. Companies are buying shovels and pickaxes without knowing whether the problem calls for a shovel or a box of dynamite.

Here is the verdict after building this for real: the technology works. The implementations fail because nobody managed the thing they bought. You didn't buy a tool. You hired a labor source, and then you skipped onboarding.

The transfer

You already know how to manage this

AI won't replace the engineer. It changes the work and most of the workflow. The part nobody has done is the boring part: mapping which agent does what, with which handoffs, to which humans. Good old people, process, technology. Given what I had to build to make agents actually deliver, I am not surprised "implement AI" isn't working.

Workforce management is a century-old discipline with library wings dedicated to it. How to staff, scope, onboard, and run quality gates. That discipline transfers directly to AI agents. Most engineering leaders already know it cold. We just never had to apply it to a workforce with extreme amnesia, where every session is a blank slate and the whole team forgot what it worked on this morning.

My best outcomes came from treating the old discipline as the playbook and adjusting for the constraints of the labor. Not from a clever prompt. From staffing and scoping and review, run against a workforce that happens to think in data-center time and remember nothing.

The reframe

Stop asking "where can we use AI?" Start asking "how would I run a team of fast, tireless, amnesiac contractors who have no stake in the outcome?" Answer that and most of your AI strategy writes itself.

Failure mode

An agent is its own worst reviewer

The person who built something has blind spots about what they built. Every audit framework ever written knows this. The agent that wrote your code will review its own work, pat itself on the back, and hand you a thumbs up with zero second-guessing.

So separate the builder from the auditor. For anything high stakes, the agent that writes the code does not get to bless it. A different agent, with a different brief and no ego invested in the first one's choices, does the review. This isn't a nice-to-have. It's the same reason you don't let a developer approve their own pull request.
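The builder/auditor split can be enforced as a rule rather than left as a habit. Here is a minimal sketch, assuming each piece of agent work is tracked as a record with an author and a list of approvals. The names `Change`, `approve`, and `can_merge` are hypothetical, made up for illustration, and not from any particular framework.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """A unit of agent-produced work awaiting review (hypothetical shape)."""
    author: str
    high_stakes: bool
    approvals: list[str] = field(default_factory=list)


def approve(change: Change, reviewer: str) -> None:
    """Record an approval, refusing any attempt at self-review."""
    if reviewer == change.author:
        raise PermissionError(f"{reviewer} cannot approve its own work")
    change.approvals.append(reviewer)


def can_merge(change: Change) -> bool:
    """High-stakes changes need at least one independent approval."""
    if not change.high_stakes:
        return True
    return len(change.approvals) >= 1


change = Change(author="builder-agent", high_stakes=True)
approve(change, "auditor-agent")   # a different agent with a different brief
print(can_merge(change))
```

The point of the sketch is that the check lives in the process, not in the agent's good intentions: the builder never gets a code path that lets it bless its own output.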

Failure mode

More context makes them dumber

The instinct is that if the agent got it wrong, it needed more information. Usually the opposite is true. Call it subtraction by addition. Agents have finite working memory, and every page of background you load is space they can't spend on the actual task. Research shows performance falling by more than half as the context fills. The well-meaning move of dumping in more is the thing degrading the work.

Two failure modes for one undocumented codebase, and they pull against each other:

  • Under-context it and the agent fills the blanks with guesses, then ships them with confidence.
  • Over-context it and it stops following instructions and loses the thread.

There is a right-sized middle, and you only find it by getting it wrong a few times.
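One way to picture the right-sized middle is a hard budget: rank the candidate background by relevance and stop loading when the budget is spent. The sketch below is an illustration under stated assumptions, not the context layer described in this article. The function name `pack_context` is hypothetical, the relevance scores are assumed to come from somewhere else, and token cost is approximated by word count, which a real tokenizer would replace.

```python
def pack_context(snippets, budget_tokens):
    """Greedily keep the most relevant snippets that fit inside the budget.

    snippets: list of (relevance, text) pairs; higher relevance wins.
    Token cost is approximated by whitespace word count (an assumption).
    """
    chosen = []
    used = 0
    for relevance, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = len(text.split())
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return chosen, used


candidates = [
    (0.9, "billing tables are append-only"),
    (0.2, "history of the office move in 2011 and related notes"),
    (0.8, "refunds go through the ledger service"),
]
context, used = pack_context(candidates, budget_tokens=12)
print(context, used)
```

With a budget of 12, the two high-relevance notes fit and the low-relevance history is left out. The budget itself is the number you tune by getting it wrong a few times.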

Why context engineering is a job now

Entire roles are appearing around feeding agents exactly what they need and nothing more. I get it. I had to build a context optimization layer myself before the output got reliable. Curating what the agent sees is the work, not an afterthought.

Two more failure modes

Generalists are mediocre, and cheap models are expensive

Ask one agent to research, design, build, test, and document and you get a B-minus across the board. The skills for each fill up the working memory the others need, so it does all five badly at once. You don't ask your best engineer to also be the architect, the QA lead, the tech writer, and the PM on the same task. Give one agent one clear job and it does that job well.

Same logic on model selection. You match seniority to task complexity with people, and models are no different. A cheap model on a hard architectural call just generates cleanup work for your human. An expensive model on a routine chore doesn't return better output, only a bigger invoice. Size the model to the task and you get the best result at the right cost.

  • One agent, one job. Narrow the scope until the work fits the memory.
  • Route by complexity: cheap model for the boring task, frontier model for the ambiguous one.
  • The wrong model isn't a small mistake. It either burns money or buries an engineer in rework.
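Routing by complexity can be as plain as a lookup table. This is a minimal sketch: the tier names (`small-model`, `mid-model`, `frontier-model`) are placeholders rather than real products, and deciding which bucket a task belongs in is assumed to happen upstream, by a human or a triage step.

```python
from enum import Enum


class Complexity(Enum):
    ROUTINE = 1      # boring, well-specified chores
    MODERATE = 2     # some judgment required
    AMBIGUOUS = 3    # architectural or open-ended calls


# Hypothetical tier names; substitute whatever models you actually run.
MODEL_FOR = {
    Complexity.ROUTINE: "small-model",
    Complexity.MODERATE: "mid-model",
    Complexity.AMBIGUOUS: "frontier-model",
}


def route(task_complexity: Complexity) -> str:
    """Return the model tier sized to the task."""
    return MODEL_FOR[task_complexity]


print(route(Complexity.ROUTINE))
print(route(Complexity.AMBIGUOUS))
```

The table is the staffing decision written down. Once it exists, a mismatch between task and model is a visible policy choice instead of an accident.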

Failure mode

They will march in a circle for days

Two things wreck an agent in every codebase. The first is undocumented decisions: the compliance rule that shaped the data model, the lawsuit that put one approach off-limits, the ugly architecture choice that exists to handle three corner cases nobody remembers. Humans absorb that over years of standups and hallway conversations. An agent has none of it, so it will cheerfully reintroduce every bug you already fixed. Failing to learn from history makes repeating it inevitable, and your agent has no history.

The second is that an agent never gets frustrated. A human engineer stuck on a broken approach gets annoyed and pulls in help. An agent will retry the same dead end for hours, burning resources and producing nothing, and never complain. He worked really hard, Grandma. So does a washing machine. You have to build the framework that makes the agent question its own approach and escalate, because it will never escalate on its own.
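The escalation framework does not need to be elaborate to be useful. A minimal sketch, assuming the harness around the agent can describe each failed attempt as an approach plus an error: count repeats, and raise a flag to a human when the same dead end shows up too often or the total attempts run out. `LoopGuard`, `EscalationNeeded`, and the thresholds are all hypothetical choices for illustration.

```python
from collections import Counter


class EscalationNeeded(Exception):
    """Raised when the agent should stop and hand the problem to a human."""


class LoopGuard:
    def __init__(self, max_repeats=3, max_attempts=10):
        self.max_repeats = max_repeats    # same failure seen this many times
        self.max_attempts = max_attempts  # total failed attempts allowed
        self.failures = Counter()
        self.attempts = 0

    def record_failure(self, approach: str, error: str) -> None:
        """Log one failed attempt; escalate if the agent is going in circles."""
        self.attempts += 1
        key = (approach, error)
        self.failures[key] += 1
        if self.failures[key] >= self.max_repeats:
            raise EscalationNeeded(
                f"'{approach}' failed {self.failures[key]} times with: {error}"
            )
        if self.attempts >= self.max_attempts:
            raise EscalationNeeded(f"{self.attempts} failed attempts, no progress")


guard = LoopGuard(max_repeats=3)
try:
    for _ in range(5):
        guard.record_failure("patch the parser", "TypeError")
except EscalationNeeded as reason:
    print("Escalating to a human:", reason)
```

The guard supplies the frustration the agent lacks. It stops on the third identical failure instead of the three-hundredth.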

AI has no passion for quality

It will ship excellence and slop with the exact same confidence and no hurt feelings. Either way you get speed, and speed without quality is just faster waste. You don't get quality out of the box. You have to engineer for it, and you have to pressure test to find out which methods actually work.

The payoff

Methodology breaks the iron triangle

Here is the point. Every fix above is a management decision, not a technical one. Separate builder from reviewer. Don't overload anyone with information or responsibility. Match seniority to the task. Build an escalation path. Measure what got done, not how busy the thing looked. You already run your human org this way. You just never wrote the version for amnesiac contractors.

The companies getting no value from AI aren't failing on the technology. They're failing because nobody told them to apply the same discipline they spent decades refining for people. I optimized for quality first and cost second, and the cost followed: tightening model and context choices later cut my token usage to a third with no loss in output.

Get the methodology right and you break the iron triangle. Good, fast, and cheap stop being a pick-two. That is the prize, and it is a workforce problem the whole way down. The three questions that decide whether you win:

  • Did you scope each agent to one job it can actually hold in memory?
  • Did you separate who builds from who checks, and decide who escalates when it loops?
  • Did you engineer for quality on purpose, or hope it came in the box?