The shortage tempts you toward the wrong fix
The hardest constraint in education right now is not content and not technology. It is people. There are not enough great teachers to put one in front of every classroom that needs one, and the gap is widening in exactly the subjects and the places where the need is most acute. When the scarce resource is human expertise, every spreadsheet points to the same seductive idea: replace the expert with a model and the math finally works.
It's the wrong fix, and not for sentimental reasons. The thing that makes a great teacher great is precisely the part that doesn't reduce to content delivery — reading a confused face, knowing which student needs a push and which needs a minute, deciding in real time how to reframe an idea that just didn't land. A model can deliver a lesson. It cannot yet be accountable for a child's understanding. Replace the expert and you scale the easy half of teaching while quietly dropping the half that was the whole point.
Scale the attention, not the authority
The useful frame is to separate the rules engine from the judgment engine. A great teacher spends an enormous share of their day on work that is real but repetitive: re-explaining the same prerequisite, grading the same misconception thirty times, assembling who-needs-what before they can teach anyone anything. None of that requires their judgment — it consumes the time they'd otherwise spend on the judgment. That's the rules engine, and it should run without them.
The judgment engine is the irreplaceable part: the live read of a specific student, the call about how to respond to this misunderstanding right now. Applied AI's job is to clear the repetitive work off the expert's plate so their scarce attention lands where the stakes are highest — and to prepare them for those moments by arriving with the picture already assembled: who's stuck, on what, and what they've already tried. You are not cloning the teacher. You are cloning the conditions under which their attention is most valuable, and then pointing that attention at more students than one human could otherwise reach.
This requires keeping the human firmly in the loop wherever being wrong is costly, and with a child's understanding, it usually is. The model handles volume and preparation; the teacher makes the calls that shape a kid. And everything the model does has to be explainable, because a teacher being asked to cover more students will only trust a system whose reasoning they can see and override. A black box that quietly mediates between an expert and thirty children is not leverage; it's liability. Start where the work is repetitive and the expert's time is expensive (the re-teaching, the triage, the prep) and make the model's work legible enough that the expert stays genuinely in charge.
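To make the split concrete, here is a minimal sketch of what a legible triage step might look like. It is an illustration under stated assumptions, not a description of any real product: the `Attempt` records, the misconception tags, the route names, and the `max_auto_misses` threshold are all hypothetical. The point it tries to show is that every routing decision carries plain-language reasons and the teacher can override any of them.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attempt:
    student: str
    skill: str
    correct: bool
    # Hypothetical tag from an upstream grading step; None means "unrecognized error".
    misconception: Optional[str] = None


@dataclass
class TriageItem:
    student: str
    skill: str
    route: str     # "auto_practice" or "teacher_review" (assumed route names)
    reasons: list  # plain-language reasoning the teacher can inspect
    tried: list    # what the student has already gotten wrong


def triage(attempts, max_auto_misses=2):
    """Group attempts per student and skill, then route each struggling pair."""
    grouped = defaultdict(list)
    for a in attempts:
        grouped[(a.student, a.skill)].append(a)

    items = []
    for (student, skill), group in grouped.items():
        misses = [a for a in group if not a.correct]
        if not misses:
            continue  # nothing to triage for this student and skill
        tags = {a.misconception for a in misses}
        tried = [a.misconception or "untagged error" for a in misses]
        reasons = [f"{len(misses)} of {len(group)} attempts on '{skill}' were wrong"]
        if None in tags:
            route = "teacher_review"
            reasons.append("at least one error matched no known misconception")
        elif len(tags) > 1:
            route = "teacher_review"
            reasons.append("errors span several misconceptions: " + ", ".join(sorted(tags)))
        elif len(misses) > max_auto_misses:
            route = "teacher_review"
            reasons.append(f"more than {max_auto_misses} misses on the same misconception")
        else:
            route = "auto_practice"
            reasons.append(f"single known misconception: {next(iter(tags))}")
        items.append(TriageItem(student, skill, route, reasons, tried))
    return items


def override(item, route, note):
    """The teacher stays in charge: any route can be changed, and the change is recorded."""
    item.route = route
    item.reasons.append(f"teacher override: {note}")
    return item
```

Anything ambiguous defaults to the teacher. Only the narrow, repetitive case (one known misconception, few misses) is handled without them, and even that can be pulled back with `override`.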
A two-day starting point
The trap is to aim at a virtual teacher that does everything and is trusted with nothing. The fix is to find the single most repetitive thing your best people do — the re-explanation they give every week, the triage they do before every block — and build one capability that does exactly that, shows its reasoning, and routes every real judgment call back to the expert with the context assembled.
In two days you can have that running alongside a real teacher and measure the two things that matter: how much of their scarce attention you freed, and whether it landed where the stakes were highest. You'll learn precisely which part of expertise is repetitive enough to offload and which part must stay human. That boundary is what lets one expert cover more ground without dropping the half that made them worth scaling. Get it right on one task and you have the pattern for the next ten: automate the repetitive, concentrate the expert's judgment, make every step something they can see and steer.
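One rough way to put numbers on that measurement is sketched below. It assumes you can log a teacher's weekly minutes per task before and during the pilot. The task names, the figures in the usage note, and the `attention_report` helper are all made up for illustration.

```python
def attention_report(baseline, pilot, repetitive, high_stakes):
    """Compare teacher minutes per task before and during a pilot.

    baseline, pilot: dicts mapping task name -> teacher minutes per week.
    repetitive, high_stakes: lists of task names in each category (assumed labels).
    """
    # Minutes no longer spent on repetitive work.
    freed = sum(baseline.get(t, 0) - pilot.get(t, 0) for t in repetitive)
    # Extra minutes now spent on high-stakes, judgment-heavy work.
    gained = sum(pilot.get(t, 0) - baseline.get(t, 0) for t in high_stakes)
    # Share of the freed time that actually landed on high-stakes work.
    landed = gained / freed if freed > 0 else 0.0
    return {
        "minutes_freed": freed,
        "minutes_gained_high_stakes": gained,
        "share_landed": landed,
    }
```

With invented figures, a baseline of 120 minutes regrading, 90 re-explaining, and 60 one-on-one, against a pilot week of 40, 50, and 150, the report shows 120 minutes freed, 90 of them moved to one-on-one time, for a share of 0.75. A low share would suggest the freed attention is leaking somewhere other than where the stakes are highest.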
Black Flag Design builds applied-AI products that extend scarce human expertise instead of replacing it. If your constraint is great people and not enough of them, spend two days with us finding the line — we call it a Foundation Sprint.