Agentic design system: how to build a component library AI agents can actually use

May 7, 2026
Dianne Alter

What is an agentic design system?

An agentic design system is a component library structured so AI agents can use it without guessing. Each component is shipped with machine-readable metadata that defines its purpose, variants, tokens, relationships, and explicit anti-patterns — what the agent must never do. The result is a single source of truth that humans browse visually and agents query programmatically, so the same button means the same thing in Figma, in code, and in Claude Code.

The traditional design system was built around one assumption: a designer or developer would read it, interpret it, and apply judgment. An agentic design system removes the judgment step. Every "use this here" or "don't use this there" gets written down — because the agent can't infer it.

In practice, an agentic component bundles:

  • The implementation (the actual React or Vue or whatever code)
  • A metadata file describing purpose, variants, when-to-use, and when-not-to-use
  • Token definitions in CSS
  • Storybook stories so the team can still see it visually
  • Tests that enforce correct usage

It's the same component you'd ship anyway, plus a layer the agent can read.

Agentic design system vs. traditional design system

A traditional design system documents components for humans — Figma libraries, Storybook pages, written guidelines. An agentic design system adds a structured metadata layer (JSON, YAML, or typed schema) that an AI agent can parse directly, with explicit fields for relationships, anti-patterns, and intent.

Agentic design system vs. design tokens

Design tokens are a piece of an agentic system, not the whole thing. Tokens give the agent the right colors and spacing values, but they don't tell it which component to reach for or when not to use it. An agentic system layers component-level reasoning on top of token-level primitives.

Agentic design system vs. Figma MCP

Figma MCP exposes your Figma file to an agent — it can see the layers, variables, and frames. That's a great starting point, but Figma alone doesn't capture intent ("don't put two primary buttons side by side") or relationships ("this lives inside a form, not a navbar"). An agentic design system fills the gaps Figma can't.


Why a human-readable design system fails AI agents

When a developer opens your Storybook, they bring context: they know what the product does, they've seen primary buttons before, they understand that two CTAs side by side looks weird. An agent brings none of that. It pattern-matches on whatever's most visible — usually whatever it saw most often during training — and ships that.

That's how you end up with components in your codebase that look right but quietly drift from your system. New variants appear. Spacing values get rounded. Disabled states get reinvented because the agent didn't realize one already existed.

The cost compounds. Every drift is a place where your design system stops being a system and becomes a vibe. Once that happens, the speed advantage of using AI to build features evaporates — you're back to manually fixing what the agent got wrong.

Done right, an agentic design system flips this:

  • Speed compounds instead of decays. The agent pulls components correctly the first time, so you ship features in hours instead of days. We've measured ~10x throughput on feature work once a client's system is properly structured.
  • Your design system stays a system. Drift stops happening because the agent has no excuse to invent — every decision is already documented.
  • Designers and engineers stop translating. The handoff between Figma and code collapses, because the metadata is the handoff.

The three pillars of an agentic component

Every component in an agentic system needs three things. Miss one and the agent starts guessing again.

1. Props. The properties that define the component's states and variants — primary, secondary, disabled, loading, size, and so on. These map directly to what you've already built in Figma. If your Figma component has five states, your metadata has five states. No translation, no interpretation.

2. Relationships. What the agent must understand before placing the component. Is it a child of a form? A toolbar? A dialog? Where is it most often used? What can it not live next to? This is the layer humans figure out from context — and the layer agents can't.

3. Tokens. The design tokens the component consumes. Tokens have always mattered for design systems, but in an agentic system they become load-bearing. Good tokens are written in English the agent can reason about — emphasis, default, subtle, core-grey-200 — not arbitrary names like color-1 or brandBlue.
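For a button, that token layer might read like this. The names follow the intent-based convention; the values and file shape are illustrative, not a real palette:

```css
/* button.tokens.css — intent-named tokens the agent can reason about.
   Names and hex values are illustrative. */
:root {
  --color-emphasis: #1d4ed8;      /* primary CTAs; one per intent */
  --color-default: #374151;       /* standard interactive elements */
  --color-subtle: #9ca3af;        /* de-emphasized, secondary actions */
  --color-core-grey-200: #e5e7eb; /* hover background on subtle items */
}
```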

On top of those three pillars, every component's metadata needs to capture four decisions:

| Decision | What it answers |
| --- | --- |
| State and implied tokens | Primary, hover, pressed, disabled — and which token each maps to |
| Variants | Appearance, size, density — the axes the component can flex along |
| Accessibility | How the component behaves for screen readers, keyboard, focus |
| Purpose and anti-patterns | What this component is for, and explicitly what it should never be used for |

That last one is the most underrated. There's a quote I keep coming back to from Andrej Karpathy: "the hottest new programming language is English." When you're writing metadata for an agent, you're programming in English. And the most powerful instruction in English is telling the agent what not to do.


What an agentic component looks like in practice

Here's the file structure I use for every component. One folder, six files.

Example: the button component

When we set this up for a client recently — a B2B SaaS company shipping a new dashboard — we started with the simplest possible component: the button. Five states, two sizes, four variants. Nothing fancy. The point isn't to start with something complex; the point is to lock the shape of every component file before you scale.

The problem we were solving: Their existing system worked beautifully in Figma, but the agent was reinventing buttons every time it generated a feature. New shades of grey, new padding values, new "almost-but-not-quite" variants. Each one looked fine in isolation and looked wrong in aggregate.

What we shipped per component:

  • button.tsx — the actual implementation, no surprises
  • button.meta.json — the metadata file describing purpose, variants, props, relationships, tokens, common patterns, and anti-patterns
  • button.tokens.css — token definitions specific to the button
  • button.stories.tsx — Storybook stories for visual review
  • button.test.tsx — tests enforcing correct token consumption and state coverage
  • index.ts — the export

The metadata file is where the magic lives. For the button, it included things like:

  • category: atom
  • purpose: "Interactive trigger for a single decisive action. The most common interactive primitive — use exactly one per intent. Variant signals hierarchy."
  • variants: primary | secondary | minimal | destructive with a one-line reason for each
  • commonPatterns: ["dialog confirm/cancel", "form submit", "toolbar action"]
  • antiPatterns: ["two primary buttons side by side", "buttons used for navigation", "destructive variant without a confirm step"]
  • tokens: { background: "color.action.primary", text: "color.text.on-primary", spacing: "spacing.button" }
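Put together, a button.meta.json along those lines might read like this (shape and wording illustrative, not the client's actual file):

```json
{
  "name": "Button",
  "category": "atom",
  "purpose": "Interactive trigger for a single decisive action. Use exactly one per intent. Variant signals hierarchy.",
  "variants": {
    "primary": "The one action we want the user to take",
    "secondary": "Supporting action placed next to a primary",
    "minimal": "Low-emphasis action inside dense UI",
    "destructive": "Irreversible action; requires a confirm step"
  },
  "props": { "size": ["small", "medium"], "disabled": "boolean", "loading": "boolean" },
  "relationships": {
    "commonParents": ["form", "dialog", "toolbar"],
    "avoidNextTo": ["another primary button"]
  },
  "commonPatterns": ["dialog confirm/cancel", "form submit", "toolbar action"],
  "antiPatterns": [
    "two primary buttons side by side",
    "buttons used for navigation",
    "destructive variant without a confirm step"
  ],
  "tokens": {
    "background": "color.action.primary",
    "text": "color.text.on-primary",
    "spacing": "spacing.button"
  }
}
```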

The result after rolling this pattern across ~20 components: the agent stopped inventing. Feature work that previously took two days of design plus three days of dev was getting prototyped end-to-end in a single afternoon. The team stopped having "wait, where did this variant come from" Slack threads. And the design system started behaving like a system again.


How to build an agentic design system

This is the build order I use. It works whether you're starting fresh or adding the agentic layer to an existing system.

1. Create a sibling package on a branch

Don't refactor your live design system. Make a branch, create a sibling package next to your existing one (something like ui-next/), and build the new system there. When it's working and tested, you can switch the product over. This lets you experiment without breaking what already ships.

2. Define your metadata schema before you build a single component

Decide the shape of your metadata file first. What fields will every component have? What's required, what's optional, what's the format? If you skip this step you'll end up with twenty components that each describe themselves slightly differently, which is exactly the inconsistency you're trying to eliminate.
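One way to pin the shape down is a typed schema plus a small validator that rejects metadata an agent couldn't act on. A sketch in TypeScript; the field names are assumptions, not a standard:

```typescript
// component-meta.ts — an illustrative metadata schema; adapt the fields to your system.
interface ComponentMeta {
  name: string;
  category: "atom" | "molecule" | "organism";
  purpose: string;                  // required: what the component is for
  variants: Record<string, string>; // variant name mapped to a one-line reason
  commonPatterns: string[];
  antiPatterns: string[];           // required: what the agent must never do
  tokens: Record<string, string>;   // slot mapped to a token path like "color.action.primary"
  relationships?: { commonParents?: string[]; avoidNextTo?: string[] };
}

// Reject metadata that would leave the agent guessing.
function validateMeta(meta: ComponentMeta): string[] {
  const errors: string[] = [];
  if (meta.purpose.trim().length < 20) errors.push("purpose is too vague");
  if (meta.antiPatterns.length === 0) errors.push("at least one anti-pattern required");
  for (const [slot, path] of Object.entries(meta.tokens)) {
    if (!/^[a-z-]+(\.[a-z-]+)+$/.test(path)) {
      errors.push(`token "${slot}" is not a dot-path: ${path}`);
    }
  }
  return errors;
}
```

Running every component's meta file through a check like this in CI is what keeps twenty components from each describing themselves slightly differently.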

There's an excellent open-source skill for Claude Code called AI Component Metadata (built by Chris Carini — npx claude-skill ai-component-metadata) that scaffolds this for you and asks the right questions to make sure every component is described consistently. I use it as the starting point and customize from there.

3. Audit your Figma for AI-readiness

Before you generate any code, your Figma needs to be structured for an agent. Two things to check:

Variables should describe intent, not implementation. A bad variable is primary, secondary, tertiary — these are positional and tell the agent nothing about when to use which. A good variable is emphasis, default, subtle — these describe the role the color plays, which is the same vocabulary the agent will use when reasoning about hierarchy.

Every token needs a description. Open each variable in Figma and write a one-line description of when it should be used. "Hover state on items with subtle emphasis." "Active background for primary CTAs." This metadata travels with the design and gives the agent the context it needs to pick correctly.

| Bad variable | Good variable |
| --- | --- |
| primary, secondary, tertiary | emphasis, default, subtle |
| blue-1, blue-2, blue-3 | core-grey-200, core-grey-300 |
| No description | "Hover state on items, subtle raising" |

4. Build one component end-to-end before scaling

Pick the simplest component in your system — usually a button — and take it through the full process: Figma audit → metadata file → implementation → tokens → Storybook story → tests. Don't move to the second component until the first one is right.

This is where Storybook earns its keep. It stays the visual source of truth for the team while the metadata becomes the source of truth for the agent. Both views, same components, no drift.

5. Read the metadata. Then read it again.

The first pass of generated metadata will be 80% right and 20% generic. Common patterns will be reasonable but not specific to your product. Anti-patterns will list the obvious ones but miss the ones that bite your team. You have to go in, read every line, and add the things only you know.

For our client, the generic anti-patterns the agent generated were "don't use two primary buttons side by side" — fine, true, useful. But the specific anti-patterns we added were the ones that mattered: "never use a destructive button in onboarding flows," "never use a minimal variant inside a card header," "loading state must show after 200ms, not immediately." Those are the rules the agent could not have inferred. They're also the rules that make the difference between output that's close and output that's correct.

6. Turn the process into a skill

Once you've built five or six components and the workflow feels solid, encode it as a Claude Code skill. From that point forward, building a new component is a single command. The skill handles the schema, the file structure, the Figma audit prompts, and the metadata template — your team just reviews and customizes.

This is the part where speed compounds. The first component takes a day. The fifth takes an hour. The twentieth takes ten minutes.
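Concretely, a skill is just the process written down in English plus scaffolding rules. Assuming Claude Code's SKILL.md convention, a sketch might look like this (the steps are illustrative, not the actual skill we ship):

```markdown
---
name: new-component
description: Scaffold a new agentic component with metadata, tokens, stories, and tests
---

When asked to build a new component:

1. Use the button folder as the reference shape: implementation, meta.json,
   tokens.css, stories, tests, and an index export.
2. Fill the metadata schema fields from the Figma variables and their
   descriptions; never invent token names.
3. Draft purpose, variants, common patterns, and generic anti-patterns, then
   stop and ask the team for the product-specific anti-patterns only they know.
```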


Tools and resources

A short, honest list of what I actually use when setting these up:

  • Claude Code — the agent layer. The whole reason this metadata matters is so an agent like Claude Code can read it.
  • Figma MCP — connects your Figma file directly to the agent so it can pull components, variables, and structure without you copy-pasting.
  • Storybook — still the best framework for keeping a component library visually browsable and testable. Even more valuable now as the human-facing view of an agentic system.
  • AI Component Metadata skill — Chris Carini's Claude Code skill that scaffolds the metadata layer. Find it via npx claude-skill and search for ai-component-metadata.
  • Context7 plugin — pulls up-to-date documentation for any open-source library your agent is touching, so it's not guessing from training data.

If your team is spending more time fixing what the agent generated than shipping the next thing, the problem is almost never the agent. It's that your design system isn't talking back to it.

If you want the metadata schema, file structure, and skill setup we use with our clients, I'm sharing all of it inside the TDP community — plus monthly live working sessions where we build through this stuff together. Join the TDP community!


Dianne Alter
