Cohort-Based Programs at Scale: Where AI Helps and Where Mentorship Can't Be Automated

Cohort programs work because of intensity and intimacy. Scaling them with AI is tempting and easy to get wrong. The answer is to automate the machinery of delivery and leave the mentorship, the final decision on each match, and the community to people.

Eli Wood

June 24, 2026 4 min read

The thing that makes cohorts work doesn't obviously scale

A cohort-based program is a particular kind of magic. A group goes through something hard together, on a schedule, with mentors and a community around them. It works because of intensity and intimacy — the shared timeline, the mentor who knows your situation, the peers who become a network. That's also precisely why it resists scaling. Double the cohort or run three at once and the per-participant attention thins out, the mentorship gets shallower, and the alumni community fragments into a dormant mailing list. The program quietly becomes a course.

Applied AI is the obvious lever, and the obvious way to pull it is wrong. If you point a model at the parts that make a cohort a cohort — the mentorship, the personal feedback, the sense of being known — you scale the logistics and gut the value. The work is to be precise: automate the machinery of running a program, and protect the human core that the program exists to deliver.

Three places AI earns its keep — and the line in each

Delivery. Running a cohort is a logistics monster: scheduling, reminders, content sequencing, tracking who's behind, surfacing who needs a nudge, summarizing each session so the next builds on it. This is repetitive and mostly deterministic — a rules-and-operations engine. Automate it without hesitation. Nobody's relationship with the program is improved by a human manually sending reminder emails; that human should be mentoring instead.
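To make "tracking who's behind, surfacing who needs a nudge" concrete, here is a minimal sketch of that kind of rule in Python. The field names (`modules_done`, `last_active`) and the thresholds are assumptions for illustration, not a description of any particular program's data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Participant:
    name: str
    modules_done: int   # hypothetical progress counter
    last_active: date   # hypothetical last-login or last-submission date


def nudge_reasons(p, expected_modules, today, max_lag=1, max_idle_days=7):
    """Return the reasons a participant should get a nudge (empty list = on track)."""
    reasons = []
    lag = expected_modules - p.modules_done
    if lag > max_lag:
        reasons.append(f"{lag} modules behind")
    idle = (today - p.last_active).days
    if idle > max_idle_days:
        reasons.append(f"inactive for {idle} days")
    return reasons


def nudge_list(participants, expected_modules, today):
    """Map each flagged participant to the plain-language reasons for the flag."""
    flagged = {}
    for p in participants:
        reasons = nudge_reasons(p, expected_modules, today)
        if reasons:
            flagged[p.name] = reasons
    return flagged
```

Each flag carries its reason, so whoever sees the list knows why a name is on it. Sending the reminder can be automated; deciding that someone needs a conversation rather than a reminder stays with staff.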

Mentorship matching. Pairing mentors and participants is where judgment shows up. The deterministic slice — availability, domain, capacity — is rules; encode it and let it filter. The fuzzy slice — who will actually connect, who needs which kind of mentor — is where a model that reads both profiles and proposes matches with a sentence of reasoning saves a program director real hours. But a human commits every pairing, because a bad match burns two people's time and trust, and that cost is high and slow to recover. Automate the shortlist; keep the human on the decision. Mentorship itself — the actual relationship — is never the model's job. The model's job is to set up the human and then get out of the way.
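A rough sketch of that split, assuming simple profile fields that are invented here for illustration. The hard constraints are a plain filter. The score is only a placeholder (shared-interest overlap) standing in for wherever a model's judgment would go, and nothing becomes a pairing until a named person confirms it.

```python
from dataclasses import dataclass, field


@dataclass
class Mentor:
    name: str
    domains: set
    slots: set       # availability labels, e.g. {"tue-pm"} (hypothetical format)
    capacity: int    # mentee places still open
    interests: set = field(default_factory=set)


@dataclass
class Participant:
    name: str
    domain: str
    slots: set
    interests: set = field(default_factory=set)


def eligible(m, p):
    """Deterministic slice: capacity, domain, overlapping availability."""
    return m.capacity > 0 and p.domain in m.domains and bool(m.slots & p.slots)


def shortlist(p, mentors, top_n=3):
    """Rank eligible mentors; every proposal carries a one-line rationale."""
    proposals = []
    for m in mentors:
        if not eligible(m, p):
            continue
        shared = sorted(m.interests & p.interests)
        proposals.append({
            "mentor": m.name,
            "participant": p.name,
            "score": len(shared),  # placeholder for the model's fuzzy judgment
            "rationale": "shared interests: " + (", ".join(shared) or "none"),
            "status": "proposed",  # the system never sets "confirmed" itself
        })
    proposals.sort(key=lambda x: (-x["score"], x["mentor"]))
    return proposals[:top_n]


def confirm(proposal, approved_by):
    """Commit a pairing only when a named human approves it."""
    if not approved_by:
        raise ValueError("a pairing needs a human approver")
    return {**proposal, "status": "confirmed", "approved_by": approved_by}
```

The design choice that matters is that `shortlist` can only ever produce proposals. The one path to a confirmed pairing runs through `confirm`, which refuses to proceed without an approver.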

Alumni community. The hardest thing to scale is the network that's supposed to outlive the cohort. AI helps by noticing — spotting when two alumni should meet because their situations now overlap, flagging who's gone quiet, drafting the introduction. It helps by preparing the human who runs the community to spend their attention where it matters. It does not help by auto-sending "personalized" messages that aren't; people can tell, and faking the connection is how you kill it. Use the model to surface the opportunity and draft the outreach; let a person decide and send.
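The "noticing" half can be sketched with very little machinery. The example below assumes two hypothetical inputs, a last-seen date per alum and a set of current topics per alum, and returns suggestions only. Nothing in it sends a message.

```python
from datetime import date
from itertools import combinations


def quiet_alumni(last_seen, today, quiet_after_days=90):
    """Names of alumni with no recorded activity for more than quiet_after_days."""
    return sorted(
        name for name, seen in last_seen.items()
        if (today - seen).days > quiet_after_days
    )


def intro_candidates(topics, min_shared=2):
    """Pairs whose current topics overlap enough to be worth an introduction."""
    suggestions = []
    for a, b in combinations(sorted(topics), 2):
        shared = sorted(topics[a] & topics[b])
        if len(shared) >= min_shared:
            suggestions.append({
                "pair": (a, b),
                "reason": "both working on " + ", ".join(shared),
                "send": False,  # a person reviews, edits, and sends
            })
    return suggestions
```

A drafted introduction could be attached to each suggestion, but the `send` flag stays false until the community lead decides the introduction is worth making.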

The through-line is the same in all three: separate the operations engine from the judgment, automate where being wrong is cheap and the work repeats, and keep a human wherever a mistake costs trust. And make every suggestion explainable — a match, a flag, an introduction that shows its reasoning — because that's how program staff learn to trust the system with the routine and reserve themselves for the relationships.

A two-day start

Don't try to instrument the whole program. Pick the one lifecycle stage that consumes the most staff judgment per cohort — usually mentorship matching. Write the deterministic constraints as rules. Stand up a thin judgment layer that reads participant and mentor profiles, applies the rules, and returns ranked pairings each with a one-line rationale. Route every pairing to a human to confirm. Run it against a past cohort where you know how the matches turned out.
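One way to score the run against a past cohort is a simple hit rate: for each participant, did the mentor who turned out to be a good match appear in the proposed shortlist? This is a sketch under assumptions. The `known_good` mapping is a hypothetical record of which past pairings worked, and hit rate is one possible measure, not the only one.

```python
def hit_rate_at_k(shortlists, known_good, k=3):
    """Share of participants whose known-good mentor appears in their top-k shortlist.

    shortlists: participant name -> ranked list of proposed mentor names
    known_good: participant name -> mentor from the past cohort that worked out
    """
    scored = [p for p in known_good if p in shortlists]
    if not scored:
        return 0.0
    hits = sum(1 for p in scored if known_good[p] in shortlists[p][:k])
    return hits / len(scored)
```

The misses are the useful part. Reading them next to the director's original reasoning shows where the ranking falls short of human judgment.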

In two days you'll learn how much of the matching the model can carry, where its judgment falls short of your director's, and what a confirm-don't-decide workflow feels like. That's the pattern you extend to delivery and alumni next: automate the machinery, keep humans on the mentorship and the community, and make the system show its work so the people running the program trust it with everything that isn't a relationship.

Black Flag Design builds applied-AI products for programs whose value is human and whose logistics are killing them. If you run cohorts and feel the attention thinning as you grow, spend two days with us — we call it a Foundation Sprint.

About the author

Eli Wood

CEO, Black Flag Design

Eli Wood leads Black Flag Design, a creative technology company focused on shipping ambitious digital products, AI systems, and design-forward software with a direct point of view on how technology changes work.
