First five minutes
// GUIDE · FIRST 5 MINUTES
Open the site, run a recipe, read the verdict.
The shortest path through Vivarium for someone arriving cold. No install, no account, no terminal — one browser tab.
// 0 · WHAT YOU'LL DO
Three steps.
Click "Reproductions" in the top nav, or jump straight to /repro/. Layer 1 (in-browser WebAssembly) recipes are listed there.
Pick any recipe. A real upstream bug opens on its own page. pandas, NumPy, CPython, Ruby, PHP, or Rust regex: any one works.
The badge at the top settles from pending to either reproduced or unreproduced. "Reproduced = the bug still reproduces". That single sentence is enough to start.
The verdict literals point in a counter-intuitive direction; the verdict glossary entry explains why. A useful mental model: "reproduced is red, unreproduced is green."
// 1 · OPEN THE GALLERY
Every recipe satisfies the same Contract v1.
The reproduction gallery lists every recipe. Each card maps to one upstream bug and shows the project, issue number, layer, current verdict, and when that verdict was captured.
Facets filter the list: language (python / ruby / php / rust), symptom, severity, and tag. To find a recipe from a concrete error message, use "match an error to a recipe", which scores recipes by lexical overlap with the message and returns ranked candidates.
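To make "lexical overlap" concrete, here is a minimal sketch of one way such a ranking could work. It uses Jaccard similarity over lowercase tokens, which is an assumption chosen for illustration; the scoring Vivarium actually uses is not specified on this page. The function names, recipe ids, and recipe texts below are all hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def rank_recipes(error_message: str, recipes: dict[str, str]) -> list[tuple[str, float]]:
    """Rank recipes by Jaccard overlap between the error message and each recipe's text."""
    query = tokenize(error_message)
    scored = []
    for recipe_id, description in recipes.items():
        tokens = tokenize(description)
        union = query | tokens
        # Jaccard similarity: shared tokens divided by all distinct tokens.
        score = len(query & tokens) / len(union) if union else 0.0
        scored.append((recipe_id, score))
    # Highest score first; ties are broken by recipe id for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical recipe texts, invented for this example only.
RECIPES = {
    "example-a": "TypeError: unsupported operand type for dtype object",
    "example-b": "regex panics on empty pattern",
}

ranked = rank_recipes("TypeError: unsupported operand", RECIPES)
for recipe_id, score in ranked:
    print(f"{recipe_id}: {score:.2f}")
```

A set-based score like this ignores word order and frequency, which keeps it cheap and predictable but means a long, generic error message can match several recipes about equally well. That is why a ranked list of candidates is more useful than a single answer.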
// 2 · RUN A RECIPE
One click. The browser does the rest.
Walk through it once with pandas-56679 as the running
example. Hitting "Open" on the card pops a new tab at the recipe's
page, /repro/pandas/56679/.
On load the page fetches Pyodide (CPython compiled to WebAssembly)
and pandas from the jsDelivr CDN. The verdict badge sits at
pending with "Loading runtime…" while that happens.
Once loaded, the embedded reproduction script runs and the verdict settles. The first load takes anywhere from a few seconds to a few tens of seconds, depending on connection and cache state; subsequent visits reuse the cached runtime and open in milliseconds to about a second.
Layer 1 means everything happens inside the browser, so you don't need Python or pandas installed locally. The three-layer model is described in architecture.
// 3 · READ THE VERDICT
Three values. reproduced / unreproduced / pending.
reproduced means the upstream bug still reproduces.
unreproduced means it does not — typically because a
newer runtime build resolved the issue, or because the conditions
shifted. pending is the in-flight state while the
script is running or the runtime is still loading.
Open pandas-56679 and the verdict will settle to
reproduced or unreproduced depending on
what version of pandas Pyodide currently ships. Underneath the
badge you'll see the data the verdict was decided from — the
series_dtype vs df_dtype comparison.
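The three verdict values can be sketched as a small decision function. This is an illustration only: it assumes that a mismatch between the two dtype strings is what counts as the bug reproducing, which this page does not state, and the recipe itself may define the condition differently. The `Verdict` enum and `decide_verdict` are hypothetical names, and the dtype strings in the usage lines are placeholders.

```python
from enum import Enum
from typing import Optional


class Verdict(str, Enum):
    PENDING = "pending"
    REPRODUCED = "reproduced"
    UNREPRODUCED = "unreproduced"


def decide_verdict(series_dtype: Optional[str], df_dtype: Optional[str]) -> Verdict:
    """Map the two observed dtype strings to one of the three verdict values."""
    # No data yet: the runtime is still loading or the script is still running.
    if series_dtype is None or df_dtype is None:
        return Verdict.PENDING
    # ASSUMPTION (not confirmed by the guide): a dtype mismatch is the bug's symptom.
    if series_dtype != df_dtype:
        return Verdict.REPRODUCED
    return Verdict.UNREPRODUCED


print(decide_verdict(None, None).value)
print(decide_verdict("int64", "float64").value)
print(decide_verdict("int64", "int64").value)
```

Whatever the real condition is, the useful property is the same: the verdict is a pure function of the data shown under the badge, so a reader can check it by eye.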
Why "reproduced / unreproduced" instead of "pass / fail": pass and fail flip meaning between teams. Vivarium pins the verb to one thing — did the bug reproduce. Spec: contract v1 verdict semantics.
// 4 · READ THE EVIDENCE
Not just the badge — the basis is on the same page.
Below the verdict badge sits an "Evidence" section bundling the reproduction script's run: stdout, stderr, exit code, duration in milliseconds. Contract v1 revision 2 (added in Phase 6 R.1) made the basis for a verdict readable on-page rather than buried in a separate file.
For pandas-56679, that includes the literal series_dtype
and df_dtype strings, the boolean for whether they
match, and the running pandas / Python versions — all serialised
as JSON. That bundle is the evidence.
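As a rough local analogue of that bundle, the sketch below runs a script in a subprocess and collects the same four fields. On the site the script runs inside Pyodide in the browser, not in a subprocess, so this shows only the shape of the data. The key names (`stdout`, `stderr`, `exit_code`, `duration_ms`) and the `match` field are assumptions made for illustration, not the Contract v1 schema, and the dtype values are placeholders.

```python
import json
import subprocess
import sys
import time


def collect_evidence(script: str) -> dict:
    """Run a Python script in a subprocess and bundle what it produced."""
    started = time.perf_counter()
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
    )
    duration_ms = round((time.perf_counter() - started) * 1000)
    return {
        "stdout": result.stdout,
        "stderr": result.stderr,
        "exit_code": result.returncode,
        "duration_ms": duration_ms,
    }


# Stand-in script: prints a JSON object shaped like the pandas-56679 data.
# The values are placeholders, not real output from the recipe.
SCRIPT = (
    "import json; "
    "print(json.dumps({'series_dtype': 'int64', 'df_dtype': 'int64', 'match': True}))"
)

evidence = collect_evidence(SCRIPT)
data = json.loads(evidence["stdout"])
print(json.dumps(evidence, indent=2))
```

Keeping the raw stdout next to the parsed comparison means a reader can re-derive the verdict from the page alone, without trusting the badge.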
// 5 · WHERE TO GO NEXT
Entry points by intent.
The glossary (/guide/glossary) pins down every word Vivarium uses with a fixed meaning — recipe, verdict, evidence, Layer 1/2/3, manifest, contract.
The compare page (/repro/compare) accepts before/after verdict bundles and shows whether reproduced flipped to unreproduced. Built for verifying AI-agent-authored patches.
The "Integrate with your own repo" guide (/guide/integrate-with-your-repo) walks through the integration end to end. Authoritative spec: Manifest v1 (/spec/manifest-v1).
The "Use Vivarium from your AI agent" guide (/guide/use-from-ai-agent) walks through Claude Code / Cursor / Cline / Continue setup, with sample prompts for the four tools (list_recipes, get_recipe, lookup_verdict, match_error). The first JSR / npm publish is intentionally on hold; the working path today is a local clone.
Read overview (/overview), architecture (/architecture), and roadmap (/roadmap) in that order to fix the skeleton in your head.
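For the compare page mentioned above, the before/after check can be sketched as follows. This is a hypothetical illustration: it assumes each bundle is a dict with a `verdict` key, and the labels `fixed`, `regressed`, `unchanged`, and `inconclusive` are invented here, not taken from the compare page.

```python
def classify_change(before: dict, after: dict) -> str:
    """Describe how the verdict moved between two bundles."""
    old, new = before["verdict"], after["verdict"]
    # A pending verdict on either side means the run never settled.
    if "pending" in (old, new):
        return "inconclusive"
    if old == "reproduced" and new == "unreproduced":
        return "fixed"
    if old == "unreproduced" and new == "reproduced":
        return "regressed"
    return "unchanged"


print(classify_change({"verdict": "reproduced"}, {"verdict": "unreproduced"}))
```

For a patch written by an AI agent, the outcome to look for is the first transition: the bug reproduced before the patch and no longer reproduces after it.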
If any step on this page didn't work, that's a bug in getting-started — not a stylistic preference. Please file an issue.