At Black Flag Design we run Claude Code across a lot of parallel workstreams. Five clients, dozens of repos, teams of sub-agents running at once on some days. When we first got comfortable with the Agent tool and with teammate spawning, we leaned on it for almost everything multi-step. Fan out the work, synthesize at the end, move on.
Then we started auditing what the sub-agents actually did. About half the time, the fan-out made things slower, noisier, or less correct than if the main agent had just done the work itself. Sub-agents are a real tool. They are also a trap if you reach for them reflexively. Here is the shape of the trap and how we tell when we have fallen into it.
1. Sub-agents are for independent work, not sequential work

The cleanest win for a sub-agent is when two or more branches of work are genuinely independent: different repos, different questions, different clients. Our BFD ops team is a good example. Five teammates, each owning a different client's transcripts. No one blocks anyone else. The synthesis happens at the end, and the savings are real.
The trap shows up when the branches only look independent. If step two depends on what step one finds, you have not parallelized anything. You have just added a round trip. The main agent ends up waiting on the first sub-agent, then spinning up the second, then reconciling. That is a slower serial loop with extra context boundaries bolted on.
What to do: before you spawn, ask whether the branches would still make sense if you shuffled the order. If the answer is no, keep the work in one agent.
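The shuffle test can be sketched as a small dependency check. This is only an illustration, assuming you can write down what each branch needs before you spawn; the function name and the branch names are hypothetical.

```python
def order_independent(branches: dict[str, set[str]]) -> bool:
    """Return True if no branch needs the output of another branch in the set."""
    names = set(branches)
    # Inputs that are not themselves branches (files, specs) do not count.
    return all(not (needs & names) for needs in branches.values())

# Hypothetical: three client audits that never read each other's output.
independent = {"client_a": set(), "client_b": set(), "client_c": set()}
# Hypothetical: the fix cannot start until the diagnosis has finished.
sequential = {"diagnose": set(), "fix": {"diagnose"}}

print(order_independent(independent))  # True
print(order_independent(sequential))   # False
```

If the check fails, the branches only look parallel, and the work belongs in one agent.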
2. Context fragmentation is the quiet cost
Every sub-agent starts cold. It has your prompt and whatever files it can read, but it does not have the three exchanges you just had with the main agent. The nuance from those exchanges, the false starts you already ruled out, the pattern you noticed in a diff, all of it stays with the parent.
We have watched sub-agents re-derive things we already knew, contradict decisions the parent had already made, and "discover" problems that were already solved two turns ago. The usual fix is to stuff more context into the sub-agent's prompt, which inflates your token bill and still misses the implicit state.
What to do: treat every sub-agent spawn as a fresh teammate who just walked in. If explaining the context costs more than doing the task yourself, do the task yourself.
3. The "synthesis delegated" anti-pattern
This is the one we fall into most often. You spawn three research sub-agents, get three findings back, and then ask a fourth sub-agent to synthesize. It feels tidy. It is usually worse than synthesizing in the main thread.
Synthesis is where the judgment lives. It is where you decide which finding is load-bearing, which is noise, and how the pieces map to the original ask. A synthesis sub-agent cannot hold that thread because it never had it. The output reads like a survey paper: everything mentioned, nothing prioritized.
What to do: delegate gathering, never delegate synthesis. Pull the findings back into the main agent and do the judgment work where the original question lives.
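One way to picture the shape of this, as a rough sketch rather than a real integration: the gathering fans out, and the results land back in the function that still holds the original ask. Here gather_finding is a hypothetical stand-in for a sub-agent call, not an actual Claude Code API.

```python
import asyncio

async def gather_finding(question: str) -> str:
    """Hypothetical stand-in for one research sub-agent call."""
    await asyncio.sleep(0)  # placeholder for the real, slow reading work
    return f"raw finding for {question!r}"

async def main_agent(original_ask: str, questions: list[str]) -> dict:
    # Fan out only the gathering; gather() returns results in input order.
    findings = await asyncio.gather(*(gather_finding(q) for q in questions))
    # No fourth "synthesizer" spawn: the findings come back to the agent
    # that still holds the original ask, and the judgment happens here.
    return {"ask": original_ask, "findings": list(findings)}

result = asyncio.run(main_agent("why is checkout slow?", ["db", "cache", "frontend"]))
print(len(result["findings"]))  # 3
```

The point of the structure is what is missing from it: there is no second layer of delegation between the findings and the original question.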
4. Coordination overhead can exceed the speedup

Our BFD ops team has a set of recurring failure modes that we eventually wrote into memory: git lock contention, wrong output paths, teammates ignoring shutdown requests. Each one was cheap to fix in isolation. In aggregate they turned a "fast" five-teammate run into a slower, messier version of what one careful pass would have produced.
The rule of thumb we use now: if coordinating the sub-agents (briefing, watching, reconciling, cleaning up after) takes more than about a third of the total work, the fan-out was probably not worth it. That threshold is forgiving. We still cross it more often than we would like.
What to do: before spawning, sketch the coordination plan, not just the work plan. If the coordination plan has more steps than the work plan, simplify or do it yourself.
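The one-third rule of thumb above is simple enough to write down. A minimal sketch, assuming you can roughly estimate minutes spent on coordination versus the work itself; the function names and the sample numbers are hypothetical.

```python
def coordination_share(coordination_minutes: float, work_minutes: float) -> float:
    """Fraction of the total effort spent briefing, watching, reconciling, cleaning up."""
    total = coordination_minutes + work_minutes
    return coordination_minutes / total if total else 0.0

def fan_out_worth_it(coordination_minutes: float, work_minutes: float,
                     threshold: float = 1 / 3) -> bool:
    """Apply the rule of thumb: above roughly a third, fan-out probably did not pay."""
    return coordination_share(coordination_minutes, work_minutes) <= threshold

print(fan_out_worth_it(20, 70))  # True: about 22% of the effort was coordination
print(fan_out_worth_it(40, 50))  # False: about 44% of the effort was coordination
```

The estimates will be rough, and the threshold is a judgment call, not a measured constant.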
5. Sub-agents shine when they protect the main context
Where we keep coming back to sub-agents, happily, is when the work produces a lot of intermediate output that the main agent does not need to remember. A large grep across an unfamiliar repo. A broad exploration of a meeting archive. Anything where we want a summary back and do not want the raw scrollback eating into the main context window.
In those cases the sub-agent is not really "parallelism"; it is "context laundering". It reads a lot, returns a little, and the main agent stays sharp. The year of trails we wrote about was full of runs where this single use case was the whole value.
What to do: think of sub-agents less as "another worker" and more as "a filter between heavy reading and the main thread." When that framing fits, spawn. When it does not, stay in the main agent.
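The filter framing can be illustrated with a toy version of the large grep. This is a sketch under assumptions: filtered_report and the sample lines are hypothetical, and a real sub-agent would return prose rather than a dict. What matters is the ratio of what gets read to what gets returned.

```python
import re

def filtered_report(lines: list[str], pattern: str, max_examples: int = 3) -> dict:
    """Scan everything, but hand back only counts and a few example hits."""
    rx = re.compile(pattern)
    hits = [(i, line.strip()) for i, line in enumerate(lines, start=1) if rx.search(line)]
    return {"scanned": len(lines), "matches": len(hits), "examples": hits[:max_examples]}

# Hypothetical file contents standing in for a large, unfamiliar repo.
lines = [
    "def a():",
    "    # TODO: fix",
    "    pass",
    "# TODO: remove",
    "x = 1",
    "# TODO: later",
    "# TODO: again",
]
report = filtered_report(lines, r"TODO")
print(report["scanned"], report["matches"], len(report["examples"]))  # 7 4 3
```

The raw matches stay on the sub-agent's side of the boundary; the main thread only ever sees the short report.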
6. Signals that you have fallen into the trap
The tells are quiet. None of them look like an error message. All of them have bitten us.
- The main agent spends more turns reconciling sub-agent output than a single agent would have spent doing the work.
- Sub-agents contradict each other, and the parent has to pick a winner without enough context to pick well.
- The final synthesis reads like a list, not a judgment.
- You find yourself writing longer and longer prompts for each spawn, trying to smuggle in context the parent already has.
- You shipped the result, but could not clearly say why the sub-agent path was faster than the direct one.
Any one of these is a signal. Two or more is a pattern. When we see the pattern, we roll the next similar task back into a single agent and compare. The comparison is usually humbling.
What to do: keep a short internal note of the last few sub-agent runs and whether they actually paid off. Trust that running tally more than the feeling that "this task seems parallel."
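The running tally does not need to be fancy. A minimal sketch, assuming one record per sub-agent run; the signal labels are our own hypothetical shorthand for the five tells above, and the sample runs are made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical short labels for the five tells listed above.
SIGNALS = ("reconcile_heavy", "contradictions", "list_like_synthesis",
           "prompt_bloat", "unclear_speedup")

@dataclass
class SubAgentRun:
    task: str
    signals: set[str] = field(default_factory=set)
    paid_off: bool = False

def verdict(run: SubAgentRun) -> str:
    """One tell is a signal; two or more is a pattern."""
    count = len(run.signals)
    if count == 0:
        return "clean"
    return "signal" if count == 1 else "pattern"

def payoff_rate(runs: list[SubAgentRun]) -> float:
    """Share of recorded runs where the fan-out actually paid off."""
    return sum(r.paid_off for r in runs) / len(runs) if runs else 0.0

runs = [
    SubAgentRun("repo-wide grep", paid_off=True),
    SubAgentRun("three-way research", {"contradictions", "prompt_bloat"}),
]
print([verdict(r) for r in runs])  # ['clean', 'pattern']
print(payoff_rate(runs))           # 0.5
```

A text file with the same columns would do just as well. The value is in looking at the tally before the next spawn.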
The one-paragraph version
Sub-agents are a real tool, but they are only faster when the work is genuinely independent, the context cost of a fresh spawn is low, and the synthesis stays with the parent. When a fan-out disappoints, it is usually because we were actually doing sequential work with extra round trips, losing context across boundaries, or delegating the judgment that should have stayed in the main thread. The fix is not to stop using sub-agents. It is to use them where they shine (independent branches, parallel exploration, protecting the main context from heavy reads) and to notice the quiet signals when they are costing you more than they are saving.
If your next instinct is to spawn a sub-agent, pause for ten seconds and ask whether this is a fan-out task or a single-thread task. That pause is worth a lot of wasted turns.