When Your Judgment Is the Product: Applied AI for Strategy and Policy Firms

For a strategy or policy firm, the asset isn't a process — it's the judgment of the people in the room. That makes "productize it so it scales" a dangerous instruction, because the obvious way to scale judgment is to dilute it. Applied AI offers a different path: scale everything around the judgment so the judgment itself can go further.

Eli Wood

June 24, 2026 4 min read

The problem: you can't clone the people

Some firms sell a process you could hand to anyone. Strategy and policy firms don't. What clients are buying is judgment — a small number of people who have seen enough that they know which lever moves the system, which coalition holds, which recommendation will survive contact with reality. The community around the firm is part of the product too: the network, the trust, the shared context that makes the advice land.

This is a wonderful business and a terrible thing to scale, because the usual scaling playbook is "systematize it and hire against the system." Do that with judgment and you get juniors running a template they don't understand, producing advice that looks like the firm's work and isn't. The thing that was scarce and valuable becomes common and thin. You've grown headcount and shrunk the product.

So the leaders sit on a ceiling: demand they can't meet without diluting the work, and a community they can't serve more deeply without more of their own scarce hours.

The insight: scale the work around the judgment, not the judgment

The reframe that unlocks this is to stop trying to productize the judgment and instead productize everything that surrounds it.

Separate the two engines. The rules engine is the enormous amount of work that precedes and follows a judgment but isn't the judgment: gathering the landscape, summarizing what's known, drafting the first eighty percent, keeping the community's shared context current and findable, surfacing the right precedent at the right moment. This is repetitive, knowable, and exactly what applied AI carries well.

The judgment engine is the irreducible call that the firm's people make — the synthesis, the read, the recommendation. That stays human, full stop. Being wrong here is expensive in the way that ends client relationships, so a person decides, every time.

When you split it this way, AI doesn't replace the expert — it removes the hours of preparation and follow-through that currently cap how many judgments the expert can render. The senior person walks into the decision already holding a synthesized landscape instead of spending three days building it. Their judgment goes further because they spend more of their time exercising it and less of it assembling the inputs.

The same logic serves the community. An applied-AI layer can make the firm's accumulated knowledge searchable, can answer the routine questions members ask, can surface who in the network has the relevant context — deepening the community's value without spending the principals' hours on every interaction.

The path: start where the expensive hours are repetitive

Start where judgment is expensive and the surrounding work is repetitive — that overlap is where AI buys the most senior time with the least risk.

Keep the human in the loop on every consequential output, and earn trust with explainability. An expert will not put their name on a synthesis they can't trace. So the system has to show its sources, show its reasoning, and make it trivial for the principal to verify and correct. A draft they can interrogate, they'll use; a confident summary they can't, they'll throw away — and rightly.
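One way to picture "show its sources" is a draft stored as a list of claims, each carrying the sources behind it, so anything untraceable is flagged before a principal reads it. The sketch below is only an illustration under assumptions: the `Source` and `Claim` types, their fields, and both helper functions are hypothetical names invented here, not part of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    # Hypothetical fields: a human-readable title and a locator
    # (URL, file path, page reference) the principal can open to verify.
    title: str
    locator: str


@dataclass
class Claim:
    # One sentence-level assertion in the draft, plus whatever backs it.
    text: str
    sources: list[Source] = field(default_factory=list)


def untraced_claims(claims: list[Claim]) -> list[Claim]:
    """Return the claims that cite no source, so a reviewer sees them first."""
    return [c for c in claims if not c.sources]


def render_for_review(claims: list[Claim]) -> str:
    """Render the draft with sources inline and unsourced claims marked."""
    lines = []
    for i, claim in enumerate(claims, start=1):
        if claim.sources:
            refs = "; ".join(f"{s.title} ({s.locator})" for s in claim.sources)
        else:
            refs = "NO SOURCE: verify before use"
        lines.append(f"{i}. {claim.text} [{refs}]")
    return "\n".join(lines)
```

The design choice is that an unsourced claim is never hidden or dropped; it is surfaced, so the decision about whether it stands remains with the person whose name goes on the work.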

A concrete way to start in two days:

  • Day one — audit where the senior hours go. Track one engagement and split every hour into "assembling inputs" versus "rendering judgment." Most firms are shocked at the ratio. The assembly hours are your automation target; the judgment hours are the product you're protecting.
  • Day two — automate one assembly task, with sources visible. Pick the most repeated prep task — landscape summary, precedent pull, first-draft memo — and have AI produce it with every claim traceable to a source. Hand it to a principal and watch what they change. Their edits are the judgment made visible, and they tell you exactly where the line between the two engines really sits.
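The day-one audit is simple arithmetic, and it can be sketched in a few lines. This is a minimal illustration, assuming each logged entry is a `(task, category, hours)` tuple; the function names and the sample numbers are made up for the example and are not data from any real engagement.

```python
from collections import defaultdict

ASSEMBLY = "assembling inputs"
JUDGMENT = "rendering judgment"


def split_hours(entries):
    """Total the logged hours per category for one engagement."""
    totals = defaultdict(float)
    for _task, category, hours in entries:
        if category not in (ASSEMBLY, JUDGMENT):
            raise ValueError(f"unknown category: {category!r}")
        totals[category] += hours
    return dict(totals)


def assembly_share(entries):
    """Fraction of all logged hours spent assembling inputs."""
    totals = split_hours(entries)
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return totals.get(ASSEMBLY, 0.0) / total


def top_assembly_tasks(entries, n=3):
    """Assembly tasks ranked by hours: candidates for the day-two automation."""
    by_task = defaultdict(float)
    for task, category, hours in entries:
        if category == ASSEMBLY:
            by_task[task] += hours
    return sorted(by_task.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Illustrative, made-up log for a single engagement.
log = [
    ("landscape summary", ASSEMBLY, 6.0),
    ("precedent pull", ASSEMBLY, 3.0),
    ("first-draft memo", ASSEMBLY, 5.0),
    ("recommendation", JUDGMENT, 4.0),
    ("landscape summary", ASSEMBLY, 2.0),
]

print(f"assembly share: {assembly_share(log):.0%}")
print(top_assembly_tasks(log, n=2))
```

The highest-ranked assembly task is a reasonable first pick for day two, and the judgment hours are left out of the ranking entirely because they are not an automation target.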

At the end of two days you've freed senior hours without touching the senior call — and you've proven to the people whose judgment is the product that the system makes them more themselves, not less.

The firms that scale well in this era won't be the ones that turned their experts into a template. They'll be the ones that used applied AI to give their experts more room to do the one thing that can't be automated.

Black Flag Design builds applied-AI products for firms whose judgment is the product. If you're trying to scale expertise without diluting it, spend two days with us — we call it a Foundation Sprint.

About the author

Eli Wood

CEO, Black Flag Design

Eli Wood leads Black Flag Design, a creative technology company focused on shipping ambitious digital products, AI systems, and design-forward software with a direct point of view on how technology changes work.
