Roadmap

// ROADMAP · v1

Phases, not deadlines.

The per-phase plan — what a visitor should expect to see land, in what order, framed in terms of the reproduction primitive.

// 00 · A NOTE ON DATES

No due-date column.

Phase durations on this project are directional — months to years — not commitments. Inventing calendar targets for a lifelong project would trade honesty for the appearance of progress, and would drive every later decision toward hitting a number rather than reproducing bugs correctly.

When a phase acquires a real deadline, the date will land alongside the phase, explained. Until then, the ordering and the "what shipped" signals are the roadmap.

Phase titles mirror the GitHub Milestones; the Milestones are the canonical planning surface, and this page is the user-facing translation.

// PHASES

What has shipped, what is shipping, what is queued.

01
Phase 0 — Bootstrap. CLOSED 2026-04-26.

Infrastructure-as-Code foundations (GitHub Settings, labels, milestones, branch protection). The AI-delegated development workflow. Vision, architecture stub, roadmap. Initial src/ tree representing the three-layer architecture — scaffolding only, no product code.

02
Phase 1 — Layer 1: data processing.

The first real reproduction vertical: Python + SQLite over WebAssembly via Pyodide. A handful of hand-picked upstream bugs (pandas / numpy / sqlite behavioural regressions) reproducible from a single browser tab. Bug description → linked reproduction page → click → pass/fail verdict.

03
Phase 2 — Layer 1: multi-language.

Layer 1 broadens: Rust (wasm32-wasi), JavaScript/TypeScript, Ruby (Ruby.wasm), PHP (php-wasm). Common browser-side scaffolding extracted so adding a language is a small per-runtime adapter, not a greenfield rewrite.

04
Phase 3 — Layer 2: Docker.

Full-fidelity container-based reproduction for bugs needing real fork, real sockets, real filesystems. Catalogue model (not hosted execution): pinned Dockerfile + repro.sh + GHCR-published image. Visitor reproduces locally with one `docker run`, optionally one-click via "Open in Codespaces". CI-snapshot verdict on each gallery page.

05
Phase 4 — Layer 3: record-replay & deterministic. CLOSED 2026-04-28.

Third layer for bugs Layers 1 and 2 cannot reach: record-replay (rr) and deterministic simulation (Antithesis-style). Trace-shipped catalogue: trace baked into a GHCR image, replayed on the visitor’s machine. One recipe shipped (lost-update data race); further entries deferred until a concrete bug forces them.

06
Phase 5 — Ecosystem. CLOSED 2026-04-29.

Contract v1 published as an external standard with JSON Schema and CI enforcement. Manifest v1 + three reference TOML manifests so external repos can declare a Vivarium-runnable reproduction with a single .vivarium/manifest.toml. Reusable verdict-capture GitHub Actions workflow. Verdict drift detection on the weekly cron. Issue-triage tooling deferred without adoption.

07
Phase 6 — Usability and visual layer. CLOSED 2026-05-02.

Visual redesign + reusable component library landed (Stitch-driven mocks substituting for Claude Design after quota constraints). Full Japanese ⇄ English i18n with symmetric URL trees and translated spec pages. Reproduction comparison: Contract v1 evidence surface, branch-fix verdict capture pipeline, side-by-side comparison page. Faceted recipe gallery + paste-an-error → ranked-candidates matcher. Interactive manifest scaffolder. Vivarium MCP server with four tools (list_recipes, get_recipe, lookup_verdict, match_error). All six sub-streams (V, R, S, M, X, L) shipped at v1 scope; the closure rule (V + R + ≥1 of S/M/X/L) was satisfied four-times-over.

08
Phase 7 — First-30-minutes onboarding. ACTIVE.

Turn "primitives are usable" (Phase 6) into "a stranger can pick this up cold". UI brush-up (V′ — information design, layout flow, visual-token revisit, component graduation) and onboarding documentation (D — getting started, integration guide, AI-agent setup, glossary) are co-defined: where the docs hit a wall, the UI is wrong. AI-slop verification (B3) wires the existing R.2 Path A + R.3 comparison + MCP match_error into one end-to-end walkthrough so an AI agent can drive "did my candidate fix actually work?" cold. A-tail clears Phase 6 leftovers: ajv-standalone migration for the verdict + manifest validators, and match_error v2 (synonym table, fuzzy match, language-specific stopwords). Phase closes when V′ + D + B3 ship; the A-tail is optional.

// MEASUREMENT

Closure rule: partial completion is acceptable.

Phase closure on this project follows a deliberate pattern set by the Phase 4 close ADR (ADR-0012). When the shape of the phase is proven by partial completion — one Layer 3 recipe, one MCP tool, one external adopter — the phase closes and remaining work is requeued. This protects the lifelong project from the slow-creep of incomplete phases that never close.

The decisions that did not ship are recorded with the same care as the ones that did. Public phase-close briefs cite the private ADRs that captured the trade-offs, so future contributors understand why a path was rejected, not just that it was.

// NEXT

Try it yourself

Once the phase shape is clear, the fastest way forward is to open one recipe in a browser tab and read the verdict.

VIVARIUM IS PART OF ALETHEIA-WORKS · OPEN THE SOURCE ON GITHUB →