---
title: Making your design system agent-ready
description: >-
  Give a coding agent context, constraints and a verification loop so it takes
  fewer wrong turns in your design system.
language: English
url: 'https://www.voorhoede.nl/en/blog/making-your-design-system-agent-ready/'
---

# Make your design system agent-ready

By [Sjoerd](/en/team/sjoerd.md)

30 June 2026

* [Giving the agent context](#giving-the-agent-context)
* [Skill files](#skill-files)
* [MCP's](#mcp-s)
* [claude.md](#claude-md)
* [Constraints](#constraints)
* [Work from a fixed palette](#work-from-a-fixed-palette)
* [Keep changes small](#keep-changes-small)
* [Stop instead of inventing](#stop-instead-of-inventing)
* [Verification loop](#verification-loop)
* [Code-level checks](#code-level-checks)
* [Visual regression](#visual-regression)
* [Close the loop as early as possible](#close-the-loop-as-early-as-possible)
* [Wrapping up](#wrapping-up)

A design system is one of the better places to put a coding agent to work. Whether it does that work well comes down to what you build around it.

The code of a design system is usually well suited to agent work. There are clear, documented rules to follow and plenty of components to use as examples. An agent will probably produce viable results without any additional guidance. But as we all know, every now and then it takes a wrong turn, creates something horrible and iterates on that, and you hit the revert button as fast as you can. This blog post is about reducing the chance of those wrong turns.

What we want to do is give the agent context, constraints and a verification loop. These can come from agent-specific configuration (skills, rules, claude.md), but also from the documentation that the UX/Design team sets up. This post covers these three guidance patterns, with ready-to-paste examples you can drop into your instruction files.

*Note: Examples use Claude Code because that's what we mainly use at De Voorhoede; the concepts transfer to other tools like Cursor, Copilot or Codex.*

## Giving the agent context

When configuring your agent, choose the right mechanism based on scope:

* **Skills**: Use for specific, task-based workflows (loaded when the task calls for them).
* **MCPs**: Use for accessing external tools (queried when the agent needs information or has to perform an action).
* **claude.md**: Use for project-wide conventions (loaded every session).
* **Rules**: Use for file-specific constraints (loaded when a path matches).

| Mechanism | Scope         | Primary Use Case          |
| --------- | ------------- | ------------------------- |
| Skills    | Task-based    | Specific workflows        |
| MCPs      | External      | Accessing live data/tools |
| claude.md | Project-wide  | Global conventions        |
| Rules     | File-specific | Path-scoped constraints   |

An agent will try to gather the right context by itself. This can include relevant files and online resources, but there are ways to make it use specific ones.

## Skill files

Skill files are small instructions that show an agent how to do a specific task consistently. You can install them using a tool like [skills.sh](https://www.skills.sh/), or write/generate them yourself.

The anatomy of a skill file is as follows:

```
---
name: my-skill
description: What this skill does and when to use it.
---
# Skill title
Instructions for the agent go here.
```

The agent gathers all the skills in the project, puts their descriptions in its context, and decides which skills should be included in full. Those are then added to the context to provide extra guidance toward a better answer. A skill can also contain multiple files that are mentioned in an index file to split up the information. The agent will only include the relevant files mentioned in that index file for the current task.
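
To make the two-step loading concrete, here is a minimal sketch of the first step: reading only the name and description from a skill file's frontmatter, so the body can stay out of the context until it is needed. The function name `readSkillSummary` and the simple line-based parsing are assumptions for illustration; this is not how any particular agent implements it.

```typescript
type SkillSummary = { name: string; description: string };

// Reads `name` and `description` from the frontmatter block at the top of a skill file.
// Simplified on purpose: it only understands single-line `key: value` pairs.
function readSkillSummary(skillFile: string): SkillSummary | null {
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(skillFile);
  if (!frontmatter) return null;

  const fields: Record<string, string> = {};
  for (const line of frontmatter[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    fields[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }

  if (!fields.name || !fields.description) return null;
  return { name: fields.name, description: fields.description };
}
```

Only the returned summary has to be in the context up front; the rest of the file is read once the description matches the task. That is also why the description should say when to use the skill, not just what it is.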

In the context of a design system, this is what a 'create-react-component' skill could look like. To make this skill effective, we explicitly reference a React conventions skill. The workflow steps are strictly ordered: enforcing core conventions first prevents the agent from diverging, while ensuring token integrity early stops drift before it happens.

```
---
name: create-react-component
description: Create or update a single design-system component using approved tokens and vercel-react-best-practices (an installed skill for React conventions).
---
# Create React Component

## Purpose
Use this skill when implementing a design-system component.

## Required workflow
1. Use `vercel-react-best-practices` to inherit existing React architecture rather than improvising.
2. Use only tokens from the `tokens` package to maintain visual consistency.
3. Reuse existing patterns and keep the change small to ensure easier reviews.

## Token rules
- Do not hardcode design values.
- If a required token is missing, report the gap instead of inventing one.

## React rules
- Follow `vercel-react-best-practices` for React and Next.js implementation.
- Keep the component accessible and avoid unnecessary complexity.

## Output expectations
- List the files changed.
- Summarize the token choices.
- Note how `vercel-react-best-practices` was applied.
```

When the user asks the agent to create a new component, the agent matches the request against this description and loads the skill's full body into its context. It also sees that it needs the 'vercel-react-best-practices' skill, reads that skill, and includes the relevant parts.

You can add or generate all sorts of skills for your codebase. Some examples:

* **docs-sync**: Update usage docs, props tables, examples, and do/don't guidance when the component changes.
* **token-audit**: Check whether a component uses approved spacing, color, typography, and motion tokens.
* **accessibility-review**: Look for missing labels, focus states, keyboard support, and contrast issues.

## MCP's

Instructions alone only get an agent so far. The more useful pattern is often to give it direct access to the systems your team already uses.

This is where MCP (Model Context Protocol) servers become interesting. An MCP server allows an agent to query external systems and retrieve structured information while it works.

For a design system, useful integrations might include:

* Figma, for retrieving component specifications, variables, and design tokens.
* Storybook, for discovering existing components and implementation examples.
* Chromatic or visual regression tooling, for validating visual changes.
* Component registries and package documentation, for understanding available building blocks.

Instead of describing a component in a prompt, the agent can pull the latest source of truth directly from the tool. This reduces the amount of documentation that has to be duplicated inside instruction files and lowers the chance of working from outdated information.

## claude.md

You can view this as a readme.md file for agents. It tells AI agents how to work in your project: build and test commands, code style, architecture, conventions, and anything else a new teammate would need. Treat it as a living document and update it whenever you discover information that would help future agents contribute more effectively.

A very short example of how this could look in your design system:

```
# Project Agent Guide

## Build & Test
- Run `pnpm build`, `pnpm test`, and `pnpm lint` before finishing meaningful changes.

## Style
- Follow the existing patterns and structure in the repo.
- Keep changes small, focused, and minimal.

## Conventions
- Use the existing component and file layout instead of inventing new structures.

## When Unsure
- Ask before making breaking changes or inventing new patterns and values.
```

## Rules

Rules are instructions that apply only to specific parts of your codebase. Unlike claude.md, which is loaded for every session, and skills, which are loaded when relevant to a task, a rule is loaded only when the agent works with files that match its configured path pattern.

This makes rules ideal for guidance that is only relevant to a subset of the repository. Instead of filling claude.md with package-specific details, you can keep instructions close to the code they apply to.

Rules are especially valuable in monorepos, where different packages or libraries may have their own architecture, coding standards, testing requirements, or deployment workflows. By scoping instructions to specific paths, agents receive the context they need without being distracted by unrelated information.

Create rules as Markdown files in .claude/rules/, with one topic per file. Each rule includes a paths field in its frontmatter to define where it applies.

```
---
paths:
  - "packages/**/*.stories.{tsx,ts}"
  - "apps/docs/**/*.mdx"
---
# Documentation rules
- Every component has a usage page covering: when to use it, when not to, and at least one live example.
- Keep prop tables generated from the component's types, not handwritten — if they drift, fix the source, not the table.
- Each example is copy-pasteable and uses approved tokens, never hardcoded values.
- Pair every "do" example with the matching "don't" so the guidance shows both sides.
- Write guidance for the consumer of the component, not its author. No internal implementation detail.
```

Keep in mind that the full content is added to the context whenever a matching file is touched.
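
To get a feel for which files pull a rule into the context, the sketch below matches file paths against rule patterns. It is an illustration under assumptions: `globToRegExp` and `rulesFor` are hypothetical helpers, they support only `**`, `*` and `{a,b}`, and the agent's own matcher may behave differently.

```typescript
type Rule = { name: string; paths: string[] };

// Translates a small glob subset (`**`, `*`, `{a,b}`) into a regular expression.
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  let braceDepth = 0;
  let i = 0;
  while (i < glob.length) {
    const char = glob.charAt(i);
    if (glob.slice(i, i + 3) === "**/") {
      pattern += "(?:.*/)?"; // any number of directories, including none
      i += 3;
    } else if (glob.slice(i, i + 2) === "**") {
      pattern += ".*"; // everything below this point
      i += 2;
    } else if (char === "*") {
      pattern += "[^/]*"; // anything within one path segment
      i += 1;
    } else if (char === "{") {
      pattern += "(?:";
      braceDepth += 1;
      i += 1;
    } else if (char === "}" && braceDepth > 0) {
      pattern += ")";
      braceDepth -= 1;
      i += 1;
    } else if (char === "," && braceDepth > 0) {
      pattern += "|"; // alternatives inside braces
      i += 1;
    } else {
      pattern += char.replace(/[.+?^$()|[\]{}\\]/g, "\\$&"); // escape regex metacharacters
      i += 1;
    }
  }
  return new RegExp("^" + pattern + "$");
}

// Returns the names of all rules with at least one pattern matching the file.
function rulesFor(file: string, rules: Rule[]): string[] {
  return rules
    .filter((rule) => rule.paths.some((glob) => globToRegExp(glob).test(file)))
    .map((rule) => rule.name);
}
```

Running a handful of real paths through a check like this is a quick way to spot a pattern that is too broad and would load a long rule for nearly every file.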

A few rules that would be useful in your design system:

* A tokens rule scoped to packages/tokens/\*\*, describing the naming scheme and noting that every token needs a light and dark value.
* A docs rule scoped to your MDX or Storybook files, describing the do/don't format and requiring that prop tables stay in sync with the component.
* A changeset rule scoped to packages/\*\*, reminding the agent that a public API change needs a changeset entry.

## Constraints

Everything above is an instruction. The agent reads it and usually follows it, but it can ignore it. That's fine for most constraints, but your most important ones are worth making harder to break. Constraints sit on a spectrum from asked to enforced. You can tell the agent to only use existing tokens, or you can give it an API where anything else doesn't compile:

```
// Asked: a rule says "only use approved variants"
// Enforced: the wrong variant doesn't typecheck
type ButtonProps = {
  variant: "primary" | "secondary" | "ghost";
};
```
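
The same idea extends to tokens. As a sketch, assume a hypothetical `spacing` map standing in for what the tokens package exports; deriving the prop type from it means a raw value like `"12px"` fails the typecheck instead of relying on the agent to remember a rule. `Stack` and `stackStyle` are made-up names for illustration.

```typescript
// Hypothetical token map; in a real system this would come from the tokens package.
const spacing = {
  none: "0",
  sm: "0.5rem",
  md: "1rem",
  lg: "2rem",
} as const;

// The only accepted values are the token names: "none" | "sm" | "md" | "lg".
type SpacingToken = keyof typeof spacing;

type StackProps = {
  gap: SpacingToken;
};

// Resolves the token name to its value when building the style object.
function stackStyle({ gap }: StackProps): { gap: string } {
  return { gap: spacing[gap] };
}

// stackStyle({ gap: "md" })   is accepted
// stackStyle({ gap: "12px" }) is rejected by the compiler
```

The compiler error shows up in the agent's typecheck output, so the constraint and its feedback come from the same place.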

Beyond adding context to enrich what the agent knows, it's also important to tell it what it cannot and should not do, so there are fewer wrong turns for it to take in the first place.

## Work from a fixed palette

Let the agent work with what's already there instead of inventing new things. Examples of good constraints:

* Use only tokens from the tokens package. Reuse existing components and primitives before building anything new.
* Don't pull in new dependencies.
* Don't introduce a new abstraction or pattern when an existing one fits.

To make the agent follow these rules, put them inside your claude.md:

```
# Constraints
- Only use tokens from the `tokens` package. No raw hex, px or ad-hoc values.
- Compose from existing components before writing new markup.
- No new dependencies without flagging it first.
- Don't invent new patterns when one already exists in the repo.
```
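
Written down like this, the token constraint is still only asked. One way to move it toward enforced is a small check that flags raw values. The sketch below is a heuristic, not a full parser: `findHardcodedValues` is a hypothetical helper that looks for hex colors and pixel lengths line by line, and it will produce false positives on things like hex-looking ids.

```typescript
type Violation = { line: number; value: string };

// Matches raw hex colors (#fff, #3a3a3a) and pixel lengths (12px, 1.5px).
const HARDCODED_VALUE = /#[0-9a-fA-F]{3,8}\b|\b\d+(?:\.\d+)?px\b/g;

// Returns every hardcoded value found in the source, with its 1-based line number.
function findHardcodedValues(source: string): Violation[] {
  const violations: Violation[] = [];
  source.split("\n").forEach((text, index) => {
    const matches = text.match(HARDCODED_VALUE) || [];
    for (const value of matches) {
      violations.push({ line: index + 1, value });
    }
  });
  return violations;
}
```

Wired into lint or a pre-commit step, a check like this hands the agent a line number and the offending value, which is easier to act on than a general reminder.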

## Keep changes small

The second constraint is about scope. A small, focused change is easy to review and fast to revert.

* Touch one component per change.
* Don't refactor code that isn't part of the task.
* Don't change a public API or edit generated files without flagging it first.

Keeping the diff small also keeps the agent honest. The more it's allowed to touch, the more room it has to drift away from what you actually asked for.
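
The one-component-per-change rule can also be checked mechanically. The sketch below assumes components live in a `components/<Name>/` folder, which may not match your repo, and both function names are hypothetical.

```typescript
// Collects the distinct component folders touched by a list of changed file paths.
// Assumes a layout like packages/ui/src/components/<Name>/...
function componentsTouched(changedFiles: string[]): string[] {
  const names: string[] = [];
  for (const file of changedFiles) {
    const match = /\/components\/([^/]+)\//.exec(file);
    if (match && names.indexOf(match[1]) === -1) {
      names.push(match[1]);
    }
  }
  return names;
}

// True when the change stays within a single component (or touches none).
function isSingleComponentChange(changedFiles: string[]): boolean {
  return componentsTouched(changedFiles).length <= 1;
}
```

Fed with the list of changed files from your version control, this could run as a warning rather than a hard failure, since some changes legitimately span components.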

## Stop instead of inventing

When the agent hits something missing (a token that doesn't exist, a pattern it can't find, an ambiguous requirement), the default should be to stop and report the gap, not to improvise past it with guesswork.

An agent that invents a one-off #3a3a3a because no token matched is exactly the wrong turn you're trying to prevent. Telling the agent to report the gap and loop you in instead turns that silent failure into a question you can actually answer.

An example of what this instruction could look like:

```
## When unsure
- If a required pattern or component is missing, report the gap.
- If the request is ambiguous, ask before choosing.
- Prefer doing less and flagging it over doing more and guessing.
```
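
The same principle can be built into the code the agent calls. As a sketch, a token lookup can return an explicit gap instead of a fallback value, so there is nothing to improvise with. The token names, values and `resolveColorToken` are hypothetical.

```typescript
// Hypothetical token table; a real one would be generated from the tokens package.
const colorTokens: Record<string, string> = {
  "color.text.default": "#1a1a1a",
  "color.text.muted": "#5c5c5c",
};

type TokenLookup =
  | { ok: true; value: string }
  | { ok: false; gap: string };

// Resolves a token by name. A missing token yields a reportable gap, never a guessed value.
function resolveColorToken(name: string): TokenLookup {
  if (!Object.prototype.hasOwnProperty.call(colorTokens, name)) {
    return {
      ok: false,
      gap: `Missing token "${name}". Report this gap instead of adding a value.`,
    };
  }
  return { ok: true, value: colorTokens[name] };
}
```

Because the result is a union, the calling code has to handle the missing case before it can read a value, and the `gap` message gives the agent something specific to pass on.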

## Verification loop

Linting, testing and QA have always been an important part of software development. They're great at noticing a wrong turn and can usually tell you what the result was and what was expected. For an agent, it's important not only that it can run those checks, but also that it can read the output clearly.

## Code-level checks

The basic loop runs on the tools you already have. Typecheck, lint, tests and build all produce a clear signal the agent can read and respond to.

```
## Verify before finishing
- Run `pnpm typecheck`, `pnpm lint`, and `pnpm test`.
- Fix anything that fails and run them again.
- Don't finish while a check is still red.
```

## Visual regression

Every test can pass while the component still doesn't match its design. In a design system, visual regression testing is the right tool to close this gap: it captures a baseline screenshot and compares later renders against it. You can set this up with a browser automation tool like [Playwright](https://playwright.dev/) or a visual testing service like [Chromatic](https://www.chromatic.com/).

The mechanic that matters here is the feedback loop. The agent can't literally see a pixel diff, so it depends on the harness to turn that diff into something readable: a percentage of changed pixels, a pass/fail per snapshot, or a captioned screenshot fed back into the session. Give it that and it can judge for itself whether the change was intended or a regression worth undoing. Leave it out and the agent runs the check, gets back an image it can't parse, and is no wiser than before.
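
As a minimal sketch of that translation step, the function below compares two screenshots as raw RGBA buffers and returns a one-line summary the agent can read. It assumes you already have the decoded pixels; `summarizePixelDiff` and the default threshold are made up, and real tools use more forgiving comparisons (for anti-aliasing, for example) than the exact match used here.

```typescript
type DiffSummary = {
  changedPixels: number;
  changedPercent: number;
  pass: boolean;
  message: string;
};

// Compares two RGBA buffers of equal size (4 bytes per pixel) with an exact match per pixel.
function summarizePixelDiff(
  baseline: Uint8Array,
  current: Uint8Array,
  maxChangedPercent: number = 0.1,
): DiffSummary {
  if (baseline.length !== current.length) {
    throw new Error("Screenshots differ in size; capture both at the same viewport.");
  }

  const totalPixels = baseline.length / 4;
  let changedPixels = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    if (
      baseline[i] !== current[i] ||
      baseline[i + 1] !== current[i + 1] ||
      baseline[i + 2] !== current[i + 2] ||
      baseline[i + 3] !== current[i + 3]
    ) {
      changedPixels += 1;
    }
  }

  const changedPercent = totalPixels === 0 ? 0 : (changedPixels / totalPixels) * 100;
  const pass = changedPercent <= maxChangedPercent;
  const message =
    `${pass ? "PASS" : "FAIL"}: ${changedPixels} of ${totalPixels} pixels changed ` +
    `(${changedPercent.toFixed(2)}%), threshold ${maxChangedPercent}%`;

  return { changedPixels, changedPercent, pass, message };
}
```

The `message` string is the part that matters for the loop: it is plain text, names the size of the change, and states the threshold it was judged against.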

## Close the loop as early as possible

The value of the loop comes from how early it runs. A check that only fires in CI tells the agent it was wrong ten minutes and one context switch too late. The same check run inside the session, after each change, lets it self-correct on the spot.

So bring the verification as close to the work as you can, but match each check to how cheap it is to run. Typecheck and lint are near-instant, so the agent can run them after every change. Visual regression spins up a browser, so it's better run once a component looks finished than after every edit. Full builds and end-to-end flows are too slow for the inner loop, so those can wait for a final gate or CI. Faster feedback fixes more wrong turns on its own, but not every check belongs in every loop. In a design system even the slow end stays fairly cheap, since there are no site-wide end-to-end flows to run.
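
One way to write that tiering down is a small table that maps each check to the point in the loop where it runs. This is a sketch under assumptions: the stage names are made up, `pnpm test:visual` is a hypothetical script, and where unit tests belong depends on how fast yours are.

```typescript
type Stage = "after-edit" | "component-done" | "final-gate";

type Check = { command: string; stage: Stage };

// Hypothetical script names; swap in whatever your repo defines.
const checks: Check[] = [
  { command: "pnpm typecheck", stage: "after-edit" },
  { command: "pnpm lint", stage: "after-edit" },
  { command: "pnpm test", stage: "component-done" },
  { command: "pnpm test:visual", stage: "component-done" },
  { command: "pnpm build", stage: "final-gate" },
];

// Stages ordered from cheapest and most frequent to slowest and rarest.
const stageOrder: Stage[] = ["after-edit", "component-done", "final-gate"];

// Returns the commands for a stage, including the cheaper checks from earlier stages.
function checksFor(stage: Stage): string[] {
  const limit = stageOrder.indexOf(stage);
  return checks
    .filter((check) => stageOrder.indexOf(check.stage) <= limit)
    .map((check) => check.command);
}
```

Later stages include the earlier checks, so the final gate never skips something the inner loop was supposed to catch.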

## Wrapping up

A nice side effect of all this: every spot where the agent gets confused is usually a spot that was already unclear for people too. A new teammate would've hit the same missing token or unclear pattern; they just would've asked someone instead of guessing. Setting things up for the agent tends to fix those gaps for everyone.

So you don't have to write all of this up front. The next time the agent does something horrible and you reach for revert, treat it as a hint about what's missing: a bit of context, a constraint you never wrote down, or a check the loop didn't have. Add that one thing and the next run goes a little better.
