# Consumer workflow
A reusable GitHub Actions workflow that any repo can `uses:` to verify a Vivarium-hosted bug reproduction in its own CI, without copying any Vivarium internals.

The workflow lives at `aletheia-works/.github/.github/workflows/vivarium-verdict.yml`. It pulls the published `ghcr.io/aletheia-works/vivarium-<slug>` image, runs the recipe, captures a `verdict.json` matching Contract v1, validates it against the published JSON Schema, and asserts the captured verdict matches what the caller expected.
## Five-line consumer example
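A sketch of the consumer job, assuming `slug` and `expected` as the input names and `@main` as the ref — the reusable workflow's own `inputs:` block is authoritative:

```yaml
jobs:
  verify-vivarium-bug:
    # Input names below are illustrative, not confirmed by the workflow source.
    uses: aletheia-works/.github/.github/workflows/vivarium-verdict.yml@main
    with:
      slug: example-bug      # a directory name under src/layer2_docker/
      expected: reproduced   # verdict the caller asserts against
```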
That is the entire integration. A consumer repo's `.github/workflows/check-bug.yml` can carry many such jobs (one per recipe to track), each turning into a green/red signal in the consumer's own CI. Slugs are the directory names under `src/layer2_docker/` (Layer 2 catalogue) and `src/layer3_thirdway/` (Layer 3 catalogue, where the trace is baked into the image).
## Inputs
## Verdict semantics
`reproduced` means the upstream bug reproduces in this run — the reproduction is doing its job. `unreproduced` means the bug does not reproduce, usually because the upstream project shipped a fix that the bundled image picked up. See Contract v1: Verdict semantics for the full reasoning.
Consumers that want a "this bug is fixed" alert can therefore write:
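For instance (the `expected` input name is an assumption, and the slug is a placeholder):

```yaml
with:
  slug: example-bug
  expected: reproduced   # job fails once the verdict comes back unreproduced
```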
…and the workflow flips red the moment the bug stops reproducing, which is exactly the upstream-fix-detected signal.
## Artefact
The job uploads the captured `verdict.json` as a workflow artefact named `verdict-<slug>-<run_id>`, with 30-day retention. Consumer-side badges and debug flows can fetch the artefact via the GitHub Actions API.
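One way to fetch it is the GitHub CLI's `gh run download`; a hypothetical debug step, with `<slug>` and `<run-id>` as placeholders matching the artefact name above:

```yaml
# Illustrative step shape — not part of the reusable workflow itself.
- name: Fetch verdict artefact
  env:
    GH_TOKEN: ${{ github.token }}
  run: |
    # Download the verdict.json uploaded by an earlier verification run.
    gh run download <run-id> --name "verdict-<slug>-<run-id>" --dir verdict
    cat verdict/verdict.json
```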
## What this workflow does not do
- Layer 1 (WASM) verification. Layer 1 reproductions run in-page in a browser; the verdict surface is live DOM / JavaScript. CI consumer-side verification of Layer 1 is a separate problem and does not benefit from a reusable workflow — the Vivarium gallery's Playwright suite is the canonical Layer 1 regression check.
- Layer 3 (rr replay) verification on hosted GHA runners. The `replay` step itself runs as part of the recipe's image CMD, so this workflow does drive Layer 3 from the consumer side, but only on runners that expose CPUID faulting to the guest. GitHub-hosted Ubuntu runners do not. Self-hosted runners on bare metal or PMU-exposing KVM are required for Layer 3 consumer verification.
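On Linux, CPUID faulting support is reported as the `cpuid_fault` flag in `/proc/cpuinfo`, so a self-hosted runner can fail fast with a preflight step like this sketch (step shape is illustrative):

```yaml
- name: Check CPUID faulting is exposed
  run: |
    # Abort early if the kernel does not report CPUID faulting,
    # since rr replay depends on it for Layer 3 verification.
    grep -qw cpuid_fault /proc/cpuinfo \
      || { echo "cpuid_fault not exposed on this runner"; exit 1; }
```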
## See also
- Contract v1 — the verdict surface this workflow consumes.
- `verdict.schema.json` — the schema the workflow validates against.
- Layer 2 catalogue — the slugs available for `inputs.slug`.
- Layer 3 catalogue — additional slugs (rr-replay; runner caveat above).