Design Tokens as Prompt Anchors

We shipped the first cut of our CMS image generator last week, and the first batch of generated images looked nothing like the pen-and-ink illustrations that live on our blog today. I had written the style preset with language like "hand-drawn whiteboard sketch" and "muted red accent." The model delivered exactly what that text asked for: a chalkboard-grade composition of stick figures with a random red block. Correct by the letter. Completely wrong for the brand.

The fix was not a better adjective. It was the same move you use everywhere else in design. Stop naming things with vibes. Start naming things with tokens.

Adjectives are vibes. Tokens are contracts.

"Muted red" could be a hundred reds. "Cream paper" could be anything between off-white and a yellowing page of a paperback. "Hand-drawn" could be marker on a whiteboard, chalk on a sidewalk, pencil on a napkin, or a pen-and-ink editorial illustration on heavy stock. If you hand the model the adjective, the model picks. That pick is almost never what your brand actually uses.

Our design system does not have a color called "muted red." It has bf-crimson at #E04848. It does not have "cream paper." The illustrator works on paper that scans at approximately #F5EFE5. The ink is not "near-black," it is Obsidian at #171717. These are not opinions; they are the tokens that every hand-rendered sketch already on the site obeys. If a new image is going to live alongside them, it has to obey the same tokens.

Pen-and-ink sketch of two open sketchbook pages: the left shows five labeled color swatches (OBSIDIAN, CRIMSON, PAPER, MUTED, BORDER), the right shows a small scene drawn using only those colors

Tokens are the source of truth on the left. Every image generated using the preset has to read as if it obeys them, the same way every component on the site does.

The preset as a contract

What we ended up with, for the bfd-hand-drawn preset, was not a description. It was a spec. A few of its load-bearing lines:

Paper background: a soft warm cream, approximately #F5EFE5, with subtle paper grain texture.
All line work rendered in pure black ink, #171717 (the BFD Obsidian token) using a fine-tip pen.
One single accent color used sparingly (maybe 5% of the image) for the most critical signal element: a muted saturated red, #E04848 (the BFD Crimson token). Absolutely no other color.
Absolutely forbidden: whiteboard markers, stick figures, chalkboard aesthetic, watercolor wash, photorealism, 3D rendering, gradients, green, blue, yellow, or any color other than black ink and the #E04848 accent.

Three things working together. Hex codes give the model a concrete anchor it cannot paraphrase. Named token references tie the image back to the rest of the design system so a future update flows everywhere. Forbidden-aesthetic clauses explicitly exclude the failure modes we already saw — the model has to go somewhere, and the explicit "not here" list tightens the cone.

The subject prompt changes per image. The preset does not. That is the whole point.

The one trick the model tried to pull

The first version of this pattern had a subtle failure mode. The model started rendering the hex codes as drawn labels inside the image — a small piece of text reading -E04844 floated next to a database cylinder in one of our technical diagrams. The model read our spec, saw a string that looked like an object label, and dutifully put it in the scene.

The fix is a universal tail appended to every styled prompt:

Do not draw as labels in the image: any hex color codes, any design-token names (bf-crimson, bf-paper, bf-text), any CSS variable names, any references to Montserrat or JetBrains Mono, any other spec language from these instructions. Those are internal guidance, not content. The only text drawn inside the image is the subject-specific labels the caller has explicitly requested.

That clause sits at the end of every preset injection. Since it landed, no hex codes or token names have leaked into a rendered image.

Do not trust the alt text. Look at the actual images.

The other thing I did wrong on the first pass was anchor on my own alt text. Our existing posts describe their covers with lines like "whiteboard sketch: three parallel lanes, each with its own figure exploring a different doorway." I read that description, wrote a preset around "whiteboard sketch," and got exactly the wrong thing.

The actual image on disk is not a whiteboard sketch. It is a detailed pen-and-ink illustration with cross-hatching, perspective, figures with faces, and a single Crimson accent on a TASK label. It was my own alt text that misled me — prose describes content, not technique.

The discipline we now run: every time a preset gets authored or tuned, we open the actual existing brand imagery on disk, generate against it, and compare side by side. If the generated output's line weight, crosshatching density, or color temperature drifts from the reference, the preset is wrong — not the prompt. Fix the preset.

The short version

Adjectives drift. Tokens hold.
Name the exact hex codes in the preset. Name the token identifier too, so a system-wide color change flows.
Explicitly forbid the wrong aesthetics, not just affirm the right one. The model has to go somewhere.
Add a universal guardrail so the model does not render your spec tokens as drawn labels.
Compare every generated image to the actual reference images on disk. Alt text is not a review.

If your existing brand has a design-system file, your AI image preset is half-written already. Name the tokens. Forbid everything else. Ship.

About the author

Eli Wood

CEO, Black Flag Design

Eli Wood leads Black Flag Design, a creative technology company focused on shipping ambitious digital products, AI systems, and design-forward software with a direct point of view on how technology changes work.

LinkedIn Website

Adjectives are vibes. Tokens are contracts.

The preset as a contract

The one trick the model tried to pull

Do not trust the alt text. Look at the actual images.

The short version

More from the journal

The Agent Stays Up Late, Not Me

What a Year of Claude Code Trails Tells You About Your Team

The Black Flag Playbook: Six Principles for Shipping with AI