When being misread costs people money: applied AI in community-centered banking

Inclusion-focused fintech doesn't fail at automation. It fails at judgment — the moment a model decides who looks creditworthy and who looks suspicious. The fix is architectural, not just better data.

Keith Pattison

June 24, 2026 · 4 min read

The problem isn't speed, it's who the model has never seen

Most banking AI is sold as an efficiency story: approve faster, flag fraud faster, route tickets faster. For institutions serving Black and Latino communities, speed is rarely the bottleneck. The bottleneck is that the historical data these systems learn from encodes decades of who was already served well — and who was misread, redlined, or simply absent from the record.

When a model trained on that history scores a thin-file applicant, a cash-heavy small business, or a multigenerational household, it doesn't fail loudly. It fails quietly and confidently. It returns a clean number that looks like math but is really a memory of exclusion. The customer experiences this as a denial, a hold, or a fraud flag they can't appeal. For a community where a single misread can mean a missed payroll or a lost deposit, being wrong is not a rounding error. It's the whole relationship.

That reframes the work. The job is not to automate decisions. The job is to be careful, explainable, and correctable exactly where the cost of error lands hardest.

Insight: most of this is a judgment problem wearing an automation costume

The systems that get institutions in trouble blur two very different things: rules and judgment.

A rules engine answers questions that have a defensible right answer: does this transaction exceed a regulatory threshold, is this field missing, is this account dormant. These are deterministic, auditable, and safe to automate fully. You should automate them aggressively, because every minute staff spend on them is a minute not spent with a customer.

A judgment engine answers questions where reasonable people disagree and the cost of being wrong is borne by someone who can't afford it — is this applicant creditworthy given an unconventional income, is this pattern fraud or just how this community actually moves money. These look like the same kind of question as the rules, which is the trap. They are not. Judgment calls need a human in the loop, not because the model is weak, but because being wrong here is expensive and the person affected deserves a path to be heard.

The single most useful architectural move is to split these two engines into separate components with an explicit boundary between them. When the rules engine and the judgment engine are tangled together, you can't audit either one, you can't explain a decision to a customer, and you can't tell a regulator where a human stands behind the call. Pull them apart and you get something rarer in fintech: a system you can actually defend, line by line, to the person it just affected.
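To make the separation concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Transaction` fields, the threshold values, and the stand-in logic inside `judgment_engine` are hypothetical, not a description of any real system. The point is only the shape: the rules engine returns deterministic findings, while the judgment engine can return nothing stronger than a proposal that still needs a human.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold for illustration only. Real values come from
# your compliance team, not from this sketch.
REPORTING_THRESHOLD = 10_000.00


@dataclass
class Transaction:
    amount: float
    account_dormant: bool
    memo: Optional[str]


def rules_engine(tx: Transaction) -> list[str]:
    """Deterministic checks: the same input always yields the same findings."""
    findings = []
    if tx.amount > REPORTING_THRESHOLD:
        findings.append("exceeds_reporting_threshold")
    if tx.memo is None:
        findings.append("missing_memo_field")
    if tx.account_dormant:
        findings.append("account_dormant")
    return findings


@dataclass
class JudgmentProposal:
    question: str
    model_suggestion: str
    # The type itself records that the model alone never finalizes this call.
    needs_human_review: bool = True


def judgment_engine(tx: Transaction) -> JudgmentProposal:
    """Stand-in for a model call. It proposes; it does not decide."""
    # Placeholder logic (an assumption): a real system would call a model here.
    suggestion = "looks_unusual" if tx.amount > 5_000 else "looks_typical"
    return JudgmentProposal(
        question="Is this pattern fraud or normal for this customer?",
        model_suggestion=suggestion,
    )
```

Because the two functions share no state and return different types, each can be audited on its own, and nothing downstream can mistake a proposal for a finding.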

Insight: trust is earned in the explanation, not the accuracy

There's a quiet assumption that a more accurate model earns more trust. In communities that have been misread by institutions before, that's backwards. Trust is earned when a person can understand why a decision was made and contest it if it's wrong. A 96%-accurate black box that can't explain itself is, for these customers, just a faster version of the thing that failed them.

Explainability isn't a compliance checkbox bolted on at the end. It's the product. The model should surface the two or three factors that drove a decision in language a person actually uses, and the human reviewer should be able to override with a reason that gets logged. That log is not bureaucracy — it's how the institution learns where its model is systematically misreading the people it exists to serve.
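One way this could look in code, as a rough sketch: assume the model (or an attribution method layered on top of it) hands back a signed contribution per input factor. The factor names, the plain-language wording, and the `ReviewLog` class below are all hypothetical. The sketch picks the factors with the largest absolute contribution, translates them, and counts which factors keep appearing in decisions a human reversed.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical mapping from internal factor names to plain language.
PLAIN_LANGUAGE = {
    "months_of_history": "how long we have records of your account activity",
    "cash_deposit_share": "how much of your income arrives as cash",
    "recent_overdrafts": "overdrafts in the last few months",
}


def top_factors(contributions: dict[str, float], n: int = 3) -> list[str]:
    """Names of the n factors with the largest absolute contribution."""
    ranked = sorted(contributions, key=lambda k: abs(contributions[k]), reverse=True)
    return ranked[:n]


def explain(contributions: dict[str, float], n: int = 3) -> list[str]:
    """Same factors, phrased the way a customer would say them."""
    return [PLAIN_LANGUAGE.get(name, name) for name in top_factors(contributions, n)]


@dataclass
class ReviewLog:
    entries: list[dict] = field(default_factory=list)

    def record(self, case_id: str, factors: list[str],
               model_outcome: str, human_outcome: str, reason: str) -> None:
        self.entries.append({
            "case_id": case_id,
            "factors": factors,
            "model_outcome": model_outcome,
            "human_outcome": human_outcome,
            "reason": reason,
        })

    def overridden_factors(self) -> Counter:
        """Count how often each factor drove a decision a human reversed."""
        counts: Counter = Counter()
        for entry in self.entries:
            if entry["model_outcome"] != entry["human_outcome"]:
                counts.update(entry["factors"])
        return counts
```

If one factor dominates `overridden_factors()` over time, that is a signal worth investigating: the model may be systematically misreading that factor for the people the institution serves.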

Path: a two-day starting point

You don't need a year-long platform rebuild to act on this. You need two days and one real decision flow — pick the one where a wrong answer hurts a customer most, usually credit or fraud.

Day one: map the flow and sort every automated step into two buckets — rules (defensible right answer) or judgment (reasonable disagreement, costly to be wrong). Be honest about the ones currently automated that actually belong in the judgment bucket. For each judgment step, write down what a customer would need to hear to understand and contest the outcome. You'll usually find two or three places where a confident model is quietly making calls no human ever reviews.
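The day-one inventory can live in a spreadsheet, but a small script makes the gaps hard to ignore. The sketch below is illustrative only: the `Step` fields and the example step names are assumptions, not a prescribed schema. It flags judgment steps that run automatically with no human review, and judgment steps that still lack a customer-facing explanation.

```python
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    RULES = "rules"        # defensible right answer
    JUDGMENT = "judgment"  # reasonable disagreement, costly to be wrong


@dataclass
class Step:
    name: str
    bucket: Bucket
    automated: bool
    human_reviewed: bool
    # What a customer would need to hear to understand and contest the outcome.
    customer_explanation: str = ""


def unreviewed_judgment_calls(steps: list[Step]) -> list[str]:
    """Judgment steps the model decides with no human ever looking."""
    return [
        s.name for s in steps
        if s.bucket is Bucket.JUDGMENT and s.automated and not s.human_reviewed
    ]


def missing_explanations(steps: list[Step]) -> list[str]:
    """Judgment steps with no written customer-facing explanation yet."""
    return [
        s.name for s in steps
        if s.bucket is Bucket.JUDGMENT and not s.customer_explanation.strip()
    ]


# Hypothetical flow for illustration.
flow = [
    Step("threshold_check", Bucket.RULES, automated=True, human_reviewed=False),
    Step("thin_file_score", Bucket.JUDGMENT, automated=True, human_reviewed=False),
    Step("cash_pattern_flag", Bucket.JUDGMENT, automated=True, human_reviewed=True,
         customer_explanation="Which deposits looked unusual and how to send documentation."),
]
print(unreviewed_judgment_calls(flow))
print(missing_explanations(flow))
```

Rules steps are deliberately ignored by both checks: running unattended is what they are for.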

Day two: pick the single judgment step where being wrong is most expensive and most repetitive — that intersection is where applied AI pays for itself. Stand up a thin human-in-the-loop checkpoint there: the model proposes and explains, a human confirms or overrides with a logged reason. Don't optimize accuracy yet. Just make the decision visible, explainable, and correctable. That checkpoint is the seed of a system you can grow with confidence, because you started where the judgment was expensive instead of where the automation was easy.
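A sketch of both halves of day two, under stated assumptions. Scoring the "expensive and repetitive" intersection as cost per error times monthly volume is one simple choice, not the only one, and the cost figures are estimates you would supply yourself. The `Reviewer` callback is a hypothetical stand-in for whatever review screen or queue a human actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JudgmentStep:
    name: str
    cost_per_error: float  # rough estimate, in dollars, of one wrong call
    monthly_volume: int    # how often the step runs


def pick_first_checkpoint(steps: list[JudgmentStep]) -> JudgmentStep:
    """Pick the step where cost and repetition intersect (cost x volume)."""
    return max(steps, key=lambda s: s.cost_per_error * s.monthly_volume)


@dataclass
class Decision:
    outcome: str
    decided_by: str
    reason: str


# A reviewer sees the proposed outcome and its explanation, then returns
# None to confirm, or (new_outcome, reason) to override.
Reviewer = Callable[[str, list[str]], Optional[tuple[str, str]]]


def checkpoint(proposed: str, explanation: list[str], reviewer: Reviewer) -> Decision:
    """The model proposes and explains; a human confirms or overrides."""
    verdict = reviewer(proposed, explanation)
    if verdict is None:
        return Decision(proposed, "human_confirmed", "confirmed model proposal")
    new_outcome, reason = verdict
    if not reason.strip():
        # An override without a reason teaches the institution nothing.
        raise ValueError("An override needs a written reason.")
    return Decision(new_outcome, "human_override", reason)
```

Note that nothing here tunes the model. The checkpoint only guarantees that every outcome on this step carries a human's name and, when the human disagrees, a reason that can be logged.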

The institutions that win in community-centered finance won't be the ones with the fastest models. They'll be the ones whose customers can see the reasoning, trust the answer, and fix it when it's wrong.

Black Flag Design builds applied-AI products for institutions whose customers can't afford to be misread. If this is your world, spend two days with us — we call it a Foundation Sprint.

About the author

Keith Pattison

Founder, Black Flag Design

Keith leads Black Flag Design, a studio that ships production-ready software with AI-assisted development. He writes about the disciplines — small scope, weekly evidence, and human oversight — that keep AI-built systems reliable in the real world.
