The Sub-Agent Skill Trap

Spawning a sub-agent looks like the obvious move for any multi-step task. Half the time it is the wrong one, and the signals are quieter than you think.

Eli Wood

April 23, 2026 · 6 min read

At Black Flag Design (BFD) we run Claude Code across a lot of parallel workstreams. Five clients, dozens of repos, teams of sub-agents running at once on some days. When we first got comfortable with the Agent tool and with teammate spawning, we leaned on it for almost everything multi-step. Fan out the work, synthesize at the end, move on.

Then we started auditing what the sub-agents actually did. About half the time, the fan-out made things slower, noisier, or less correct than if the main agent had just done the work itself. Sub-agents are a real tool. They are also a trap if you reach for them reflexively. Here is the shape of the trap and how we tell when we have fallen into it.

1. Sub-agents are for independent work, not sequential work

The cleanest win for a sub-agent is when two or more branches of work are genuinely independent: different repos, different questions, different clients. Our BFD ops team is a good example. Five teammates, each owning a different client's transcripts. No one blocks anyone else. The synthesis happens at the end, and the savings are real.

The trap shows up when the branches only look independent. If step two depends on what step one finds, you have not parallelized anything. You have just added a round trip. The main agent ends up waiting on the first sub-agent, then spinning up the second, then reconciling. That is a slower serial loop with extra context boundaries bolted on.

What to do: before you spawn, ask whether the branches would still make sense if you shuffled the order. If the answer is no, keep the work in one agent.
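
One way to make the shuffle test concrete is to write down what each branch needs and what it produces, then check whether any branch is waiting on another. The Python sketch below is illustrative only: `Branch`, `order_dependent_pairs`, and `safe_to_fan_out` are hypothetical names invented for this example, not part of Claude Code or the Agent tool, and the branch descriptions are made up.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Branch:
    """One candidate unit of work (hypothetical structure for illustration)."""
    name: str
    needs: frozenset = field(default_factory=frozenset)     # inputs it must have first
    produces: frozenset = field(default_factory=frozenset)  # outputs it creates


def order_dependent_pairs(branches):
    """Return (upstream, downstream) name pairs where one branch needs another's output."""
    pairs = []
    for up in branches:
        for down in branches:
            if up is not down and up.produces & down.needs:
                pairs.append((up.name, down.name))
    return pairs


def safe_to_fan_out(branches):
    """Branches pass the shuffle test only if no branch waits on another."""
    return not order_dependent_pairs(branches)


# Two client workstreams that never read each other's output.
clients = [
    Branch("client_a_transcripts", produces=frozenset({"summary_a"})),
    Branch("client_b_transcripts", produces=frozenset({"summary_b"})),
]

# Two steps that only look parallel: the patch needs the failing test first.
pipeline = [
    Branch("find_failing_test", produces=frozenset({"failing_test"})),
    Branch("patch_bug", needs=frozenset({"failing_test"}), produces=frozenset({"patch"})),
]

print(safe_to_fan_out(clients))         # True
print(safe_to_fan_out(pipeline))        # False
print(order_dependent_pairs(pipeline))  # [('find_failing_test', 'patch_bug')]
```

The check is only as good as the needs and outputs you write down, but writing them down is usually enough to expose a hidden sequence.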

2. Context fragmentation is the quiet cost

Every sub-agent starts cold. It has your prompt and whatever files it can read, but it does not have the three exchanges you just had with the main agent. The nuance from those exchanges, the false starts you already ruled out, the pattern you noticed in a diff, all of it stays with the parent.

We have watched sub-agents re-derive things we already knew, contradict decisions the parent had already made, and "discover" problems that were already solved two turns ago. The usual fix is to stuff more context into the sub-agent's prompt, which inflates your token bill and still misses the implicit state.

What to do: treat every sub-agent spawn as a fresh teammate who just walked in. If explaining the context costs more than doing the task yourself, do the task yourself.

3. The "synthesis delegated" anti-pattern

This is the one we fall into most often. You spawn three research sub-agents, get three findings back, and then ask a fourth sub-agent to synthesize. It feels tidy. It is usually worse than synthesizing in the main thread.

Synthesis is where the judgment lives. It is where you decide which finding is load-bearing, which is noise, and how the pieces map to the original ask. A synthesis sub-agent cannot hold that thread because it never had it. The output reads like a survey paper: everything mentioned, nothing prioritized.

What to do: delegate gathering, never delegate synthesis. Pull the findings back into the main agent and do the judgment work where the original question lives.

4. Coordination overhead can exceed the speedup

Our BFD ops team has a recurring set of failure modes that we eventually wrote into memory: git lock contention, wrong output paths, and teammates ignoring shutdown requests. Each one was cheap to fix in isolation. In aggregate they turned a "fast" five-teammate run into a slower, messier version of what one careful pass would have produced.

The rule of thumb we use now: if coordinating the sub-agents (briefing, watching, reconciling, cleaning up after) takes more than about a third of the total work, the fan-out was probably not worth it. That threshold is forgiving. We still cross it more often than we would like.
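
The one-third rule is simple arithmetic, and a rough version can be written down. This sketch assumes "total work" means task effort plus coordination effort, measured in whatever unit you track (minutes here); that reading, the function names, and the sample numbers are assumptions made for illustration.

```python
def coordination_share(work_minutes, coordination_minutes):
    """Fraction of total effort spent briefing, watching, reconciling, and cleaning up."""
    total = work_minutes + coordination_minutes
    if total <= 0:
        raise ValueError("total effort must be positive")
    return coordination_minutes / total


def fan_out_worth_it(work_minutes, coordination_minutes, threshold=1 / 3):
    """Apply the rule of thumb: coordination above about a third means probably not."""
    return coordination_share(work_minutes, coordination_minutes) <= threshold


# Hypothetical run: 40 minutes of real work, 25 minutes of coordination.
print(round(coordination_share(40, 25), 2))  # 0.38
print(fan_out_worth_it(40, 25))              # False

# Hypothetical run: 50 minutes of real work, 10 minutes of coordination.
print(fan_out_worth_it(50, 10))              # True
```

The estimates going in will be rough. The point is to estimate the coordination side at all before spawning.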

What to do: before spawning, sketch the coordination plan, not just the work plan. If the coordination plan has more steps than the work plan, simplify or do it yourself.

5. Sub-agents shine when they protect the main context

Where we keep coming back to sub-agents, happily, is when the work produces a lot of intermediate output that the main agent does not need to remember. A large grep across an unfamiliar repo. A broad exploration of a meeting archive. Anything where we want a summary back and do not want the raw scrollback eating into the main context window.

In those cases the sub-agent is not really "parallelism"; it is "context laundering." It reads a lot, returns a little, and the main agent stays sharp. The year of trails we wrote about in "What a Year of Claude Code Trails Tells You About Your Team" was full of runs where this single use case was the whole value.

What to do: think of sub-agents less as "another worker" and more as "a filter between heavy reading and the main thread." When that framing fits, spawn. When it does not, stay in the main agent.
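
The filter framing has the same shape as a function that reads everything and returns a small digest. The sketch below is a plain-Python stand-in for that shape, not how a Claude Code sub-agent is implemented or invoked: `summarize_matches`, its parameters, and the digest fields are all hypothetical.

```python
import re
from collections import Counter
from pathlib import Path


def summarize_matches(root, pattern, max_examples=3):
    """Scan every readable text file under root, but hand back only a small digest."""
    regex = re.compile(pattern)
    hits_per_file = Counter()
    examples = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for line_no, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits_per_file[str(path)] += 1
                if len(examples) < max_examples:
                    examples.append(f"{path}:{line_no}: {line.strip()}")
    # The caller sees counts and a few samples, never the raw scrollback.
    return {
        "total_hits": sum(hits_per_file.values()),
        "top_files": hits_per_file.most_common(3),
        "examples": examples,
    }
```

Calling `summarize_matches("some/repo", r"TODO")` would walk the whole tree and return a dictionary with three small fields. That ratio of input read to output returned is the thing to look for before spawning.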

6. Signals that you have fallen into the trap

The tells are quiet. None of them look like an error message. All of them have bitten us.

  • The main agent spends more turns reconciling sub-agent output than a single agent would have spent doing the work.
  • Sub-agents contradict each other, and the parent has to pick a winner without enough context to pick well.
  • The final synthesis reads like a list, not a judgment.
  • You find yourself writing longer and longer prompts for each spawn, trying to smuggle in context the parent already has.
  • You shipped the result, but could not clearly say why the sub-agent path was faster than the direct one.

Any one of these is a signal. Two or more is a pattern. When we see the pattern, we roll the next similar task back into a single agent and compare. The comparison is usually humbling.

What to do: keep a short internal note of the last few sub-agent runs and whether they actually paid off. Trust that running tally more than the feeling that "this task seems parallel."
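
The running tally can be as small as a list of notes. This is a minimal sketch, assuming one note per sub-agent run; the signal names paraphrase the five tells above, and `RunNote`, `verdict`, and `payoff_rate` are hypothetical names, as are the sample tasks.

```python
from dataclasses import dataclass

# Short labels for the five tells listed above (our paraphrase).
SIGNALS = (
    "reconciliation_exceeded_direct_work",
    "sub_agents_contradicted_each_other",
    "synthesis_read_like_a_list",
    "prompts_kept_growing",
    "could_not_explain_speedup",
)


@dataclass
class RunNote:
    """One line in the tally: what we ran, which tells showed up, whether it paid off."""
    task: str
    signals: tuple = ()
    paid_off: bool = False

    def verdict(self):
        """One tell is a signal; two or more is a pattern."""
        count = len(set(self.signals) & set(SIGNALS))
        if count == 0:
            return "clean"
        if count == 1:
            return "signal"
        return "pattern"


def payoff_rate(notes, last_n=5):
    """Share of the most recent runs that actually paid off, or None with no history."""
    recent = notes[-last_n:]
    if not recent:
        return None
    return sum(note.paid_off for note in recent) / len(recent)


# Hypothetical history of four runs.
notes = [
    RunNote("archive exploration", paid_off=True),
    RunNote("three-way research", signals=("synthesis_read_like_a_list",)),
    RunNote("multi-repo refactor",
            signals=("prompts_kept_growing", "could_not_explain_speedup")),
    RunNote("bug hunt", signals=("sub_agents_contradicted_each_other",)),
]

print([note.verdict() for note in notes])  # ['clean', 'signal', 'pattern', 'signal']
print(payoff_rate(notes))                  # 0.25
```

A text file works just as well. What matters is that the record exists before the next "this seems parallel" feeling arrives.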


The one-paragraph version

Sub-agents are a real tool, but they are only faster when the work is genuinely independent, the context cost of a fresh spawn is low, and the synthesis stays with the parent. Too often when we reach for a sub-agent, we are actually doing sequential work with extra round trips, losing context across boundaries, or delegating the judgment that should have stayed in the main thread. The fix is not to stop using sub-agents. It is to use them where they shine (independent branches, parallel exploration, protecting the main context from heavy reads) and to notice the quiet signals when they are costing you more than they are saving.

If your next instinct is to spawn a sub-agent, pause for ten seconds and ask whether this is a fan-out task or a single-thread task. That pause saves a lot of wasted turns.

About the author

Eli Wood

CEO, Black Flag Design

Eli Wood leads Black Flag Design, a creative technology company focused on shipping ambitious digital products, AI systems, and design-forward software with a direct point of view on how technology changes work.
