Use Vivarium from your AI agent
// GUIDE · AI AGENT
Drive the Vivarium MCP server from Claude Code, Cursor, or Cline.
Vivarium ships an MCP (Model Context Protocol) server with five tools that let an agent search the catalogue, read verdict snapshots, and scaffold branch-fix verification end-to-end. No HTML scraping of the docs site required.
// 0 · WHAT THIS GUIDE COVERS
Wiring an AI agent to Vivarium's catalogue via MCP.
The Vivarium MCP server (@aletheia-works/vivarium-mcp) speaks Model Context Protocol over stdio and exposes five tools: list_recipes, get_recipe, lookup_verdict, match_error, and verify_branch_fix. An agent can browse the catalogue, fetch metadata for a given recipe, pull deployed Layer 2 / 3 verdict snapshots, and scaffold the AI-slop verification loop end-to-end.
The first JSR / npm publish is intentionally on hold. Vivarium overall and the MCP server's feature surface are both judged not yet finished — pushing v1 now would crystallise an interface still under iteration. The maintainer will announce timing separately. Until then, the working install path is local clone.
// 1 · BUILD LOCALLY
clone → bun install → bun run build.
git clone https://github.com/aletheia-works/vivarium.git
cd vivarium/packages/mcp-server
bun install
bun run build
# → produces dist/index.js (the entry point your client will spawn)
Note the absolute path to dist/index.js. The MCP client config will spawn it via node.
// 2 · REGISTER WITH YOUR MCP CLIENT
Same shape everywhere: command + args.
Every MCP client takes essentially the same JSON snippet, built from a command and an args array:
{
  "mcpServers": {
    "vivarium": {
      "command": "node",
      "args": ["/abs/path/to/vivarium/packages/mcp-server/dist/index.js"]
    }
  }
}
What differs is where you put it:
Claude Code: claude mcp add vivarium node /abs/path/to/...dist/index.js registers it interactively. Equivalent: drop the JSON above into the mcpServers key of ~/.claude.json.
Cursor: drop the JSON into ~/.cursor/mcp.json (or .cursor/mcp.json at the project root for project-scoped registration).
Cline: VS Code → Cline sidebar → MCP Servers → Configure MCP Servers opens a JSON editor. Add the snippet there.
Continue: translate the command + args into the YAML mcpServers section of ~/.continue/config.yaml. Same fields, just YAML syntax.
The authoritative client list is at modelcontextprotocol.io/clients. Per-client config paths drift over time; consult upstream when anything looks off.
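As a rough illustration of the command + args shape, the sketch below builds the same mcpServers entry programmatically and rejects anything that is not an absolute path. The names buildVivariumConfig and isAbsolutePath are hypothetical helpers invented for this example, not part of Vivarium or of any MCP client.

```typescript
// Hypothetical helper (not part of Vivarium): builds the mcpServers entry
// shown in this section and refuses paths a client may fail to expand.

interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers: { [name: string]: McpServerEntry };
}

// True for POSIX absolute paths ("/...") and Windows drive paths ("C:\...").
// "~/..." and "$HOME/..." are deliberately treated as non-absolute.
function isAbsolutePath(p: string): boolean {
  return p.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(p);
}

function buildVivariumConfig(entryPath: string): McpClientConfig {
  if (!isAbsolutePath(entryPath)) {
    throw new Error(`Expected an absolute path to dist/index.js, got: ${entryPath}`);
  }
  return {
    mcpServers: {
      vivarium: { command: "node", args: [entryPath] },
    },
  };
}

// Print the snippet for pasting into a JSON-based client config.
console.log(
  JSON.stringify(
    buildVivariumConfig("/abs/path/to/vivarium/packages/mcp-server/dist/index.js"),
    null,
    2,
  ),
);
```

Validating the path up front is a design choice for this sketch: it surfaces a bad path at config time rather than as a silent spawn failure inside the client.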
// 3 · TOOLS
All five go through the same stdio transport.
list_recipes(layer?, project?, q?): filtered enumeration of the catalogue. layer is the integer 1/2/3, project matches the upstream project (e.g. "pandas"), q is a substring search across slug, project, and title.
get_recipe(slug): full metadata for one recipe (title, project, issue, page URL, verdict snapshot URL, GitHub source URL). Returns { found: false, error } on unknown slug.
lookup_verdict(slug): Layer 1 returns { kind: "live", page_url, note } (the verdict is computed in-browser at view time). Layer 2 / 3 returns { kind: "snapshot", snapshot: { verdict, exit_code, image_digest, stdout, stderr_tail, ... } }.
match_error(text, limit?): score the catalogue against a pasted error message or stack trace by mechanical token overlap. No LLM, no fuzzy matching: exact token hits against symptom / tags / project / slug, weighted per source.
verify_branch_fix(slug, fix_url? | fix_source?): scaffolding helper for the AI-slop verification loop (NOT an execution engine). Layer 1 returns Path A: a recipe-page compare_url with the fix pre-loaded via ?fix_url= or ?fix=<base64url>. Layer 2 / 3 returns Path B: a /repro/compare deep-link plus the gh workflow run branch-fix-verdict.yml command the contributor runs. Full walkthrough: Verify a branch-fix.
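The two lookup_verdict result shapes can be modelled as a discriminated union on kind. The sketch below is one possible client-side reading of them, not the server's own types: the field types (string, number) are assumptions, the fields the guide elides with "..." are left out, and summarizeVerdict is a hypothetical helper.

```typescript
// Assumed shapes, inferred from the field names listed for lookup_verdict.
// Field types are guesses; elided snapshot fields are omitted.

interface LiveVerdict {
  kind: "live";
  page_url: string;
  note: string;
}

interface SnapshotVerdict {
  kind: "snapshot";
  snapshot: {
    verdict: string;
    exit_code: number;
    image_digest: string;
    stdout: string;
    stderr_tail: string;
  };
}

type VerdictResult = LiveVerdict | SnapshotVerdict;

// Hypothetical helper: turns either result into a one-line summary.
function summarizeVerdict(result: VerdictResult): string {
  if (result.kind === "live") {
    // Layer 1: there is no stored verdict, only a page to open.
    return `Live verdict: open ${result.page_url} in a browser (${result.note})`;
  }
  // Layer 2 / 3: the snapshot carries the verdict directly.
  const s = result.snapshot;
  return `Snapshot verdict: ${s.verdict} (exit code ${s.exit_code}, image ${s.image_digest})`;
}
```

Branching on kind keeps the Layer 1 case explicit, so a caller cannot accidentally read snapshot fields that a live result does not have.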
// 4 · SAMPLE PROMPTS
Three patterns that come up most.
"Use match_error to find Vivarium recipes matching this stack trace, top 5: ...paste..." → the agent calls match_error and shows ranked hits. Drill into one with get_recipe.
"What does lookup_verdict say about pandas-56679?" → Layer 1 returns a kind: "live" URL the agent can offer for a browser visit; Layer 2 / 3 returns the snapshot's verdict / exit code directly.
"Use list_recipes to enumerate all Layer 2 recipes" → calls with { layer: 2 } and returns only the Docker layer.
The match_error scoring is bit-identical to that of the error → recipe matcher page, so MCP and UI return the same ranked candidates.
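For intuition only, mechanical token-overlap scoring of the kind match_error is described as using might look like the sketch below. This is not Vivarium's implementation: the tokenizer, the weights in SOURCE_WEIGHTS, the default limit of 5, and the RecipeFields shape are all assumptions made up for illustration, so its scores will not match the real matcher.

```typescript
// Illustrative only: every name and number here is hypothetical.

interface RecipeFields {
  slug: string;
  project: string;
  symptom: string;
  tags: string[];
}

// Hypothetical per-source weights; the real values are not documented here.
const SOURCE_WEIGHTS = { symptom: 3, tags: 2, project: 2, slug: 1 };

// Lowercase, split on anything that is not a letter, digit, or underscore,
// and drop duplicates.
function tokenize(text: string): string[] {
  const unique: string[] = [];
  for (const token of text.toLowerCase().split(/[^a-z0-9_]+/)) {
    if (token.length > 0 && unique.indexOf(token) === -1) {
      unique.push(token);
    }
  }
  return unique;
}

// Number of distinct tokens in sourceText that appear exactly in the query.
function countHits(queryTokens: string[], sourceText: string): number {
  let hits = 0;
  for (const token of tokenize(sourceText)) {
    if (queryTokens.indexOf(token) !== -1) {
      hits += 1;
    }
  }
  return hits;
}

function scoreRecipe(errorText: string, recipe: RecipeFields): number {
  const query = tokenize(errorText);
  return (
    SOURCE_WEIGHTS.symptom * countHits(query, recipe.symptom) +
    SOURCE_WEIGHTS.tags * countHits(query, recipe.tags.join(" ")) +
    SOURCE_WEIGHTS.project * countHits(query, recipe.project) +
    SOURCE_WEIGHTS.slug * countHits(query, recipe.slug)
  );
}

// Highest score first; recipes with no hits are dropped.
function rankRecipes(errorText: string, recipes: RecipeFields[], limit: number = 5): RecipeFields[] {
  return recipes
    .map((recipe) => ({ recipe: recipe, score: scoreRecipe(errorText, recipe) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.recipe);
}
```

Because the score is a plain weighted count of exact token hits, the same input always yields the same ranking, which is the property that lets two front ends agree on results.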
// 5 · COMMON SNAGS
Most setups stall on one of these.
No dist/index.js. Means bun run build wasn't run. Run bun install && bun run build in packages/mcp-server/ once.
Some clients can't expand the home-directory shortcut or relative paths. Use a fully-qualified absolute path: /Users/<you>/code/vivarium/.../dist/index.js, not $HOME/code/vivarium/....
Recipes seem stale. The server caches https://aletheia-works.github.io/vivarium/api/recipes.json with a 5-minute TTL and falls back to a build-time snapshot offline. Restart the MCP server to force a refetch.
"Layer 1 verdict isn't a snapshot." By design: Layer 1's site of truth is the browser at view time (Contract v1). The agent should treat kind: "live" as "open this URL for the human to verify."
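The stale-recipes behaviour (5-minute TTL, build-time snapshot when offline, refetch on restart) is consistent with a small in-memory cache. The sketch below shows that pattern under stated assumptions. TtlCache is a hypothetical class, the loader is synchronous for brevity where a real fetch would be asynchronous, and whether the actual server prefers a stale cached value over the snapshot on failure is not stated in this guide.

```typescript
// Hypothetical in-memory TTL cache with an offline fallback.
// Not Vivarium's code; a sketch of the pattern the snag describes.

class TtlCache<T> {
  private readonly ttlMs: number;
  private readonly fallback: T;
  private value: T | undefined = undefined;
  private fetchedAt: number = 0;

  constructor(ttlMs: number, fallback: T) {
    this.ttlMs = ttlMs;
    this.fallback = fallback;
  }

  // now is passed in (milliseconds) so the logic is easy to test.
  // load is assumed to throw when the network is unavailable.
  get(now: number, load: () => T): T {
    if (this.value !== undefined && now - this.fetchedAt < this.ttlMs) {
      return this.value; // still fresh: no refetch
    }
    try {
      const fresh = load();
      this.value = fresh;
      this.fetchedAt = now;
      return fresh;
    } catch (_err) {
      // Assumption: on any load failure, serve the build-time snapshot.
      return this.fallback;
    }
  }
}

// Five minutes, matching the TTL mentioned in this guide.
const RECIPES_TTL_MS = 5 * 60 * 1000;
```

Since the cached value lives only in process memory, restarting the process empties it, which is why a restart forces a refetch.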
// 6 · WHAT'S NEXT
Lines branching out from this surface.
Integrate with your own repo sets up the verdict-watch side from your own CI. The two sides — agent + CI — close the loop.
Confirming "did my candidate fix flip reproduced → unreproduced?" through the agent in one shot is wired up via verify_branch_fix (Phase 7 B3). Full walkthrough: Verify a branch-fix.
If a step here didn't work, that's a bug in this guide — file an issue with your client name, OS, and the step number you got stuck on.