Voyage · EventFarm Parity Audit Trail

Every evaluation. Every escape. Every healing plan.

Methodology only earns its keep if its results are auditable. Every factory cycle leaves behind per-axis verdicts, evidence artifacts, and a row in the ledger below. When a bug escapes the methodology and a human catches it, the case is logged with a healing plan that names the methodology change preventing the same class of escape next time. The trail is persistent — past entries stay; charts visualize whether the methodology is getting better.

Last updated: 2026-04-30 · Methodology axes: 4 (Functional, Visual, Semantic, Honesty) · Cycles audited: 7 (incl. self-audit) · Open findings: 0 · Addressed findings: 2 · Retrospective-resolved findings: 3 · Open escapes: 1

Cumulative metrics

Where the methodology stands as of the last factory cycle.

Metric	Value	Note
Methodology axes	4	Functional · Visual · Semantic · Honesty
Pilot surfaces	3	EF-074, EF-073, EF-077
Surfaces × axes green	7 / 9	EF-073 and EF-077 visual axes demoted under tightened prompt
Open honesty findings	0	All findings addressed or retrospective-resolved
Addressed honesty findings	2	Both serious; data-server-total tautology fix in aa71c38 + guard test
Retrospective-resolved findings	3	2 from EF-074 cycle 3 (escape #1); 1 from visual-quality pilot

Per-cycle ledger

Every cycle, every axis verdict, every artifact pointer. Newest first.

Cycle	Commits / artifacts	Date	Surfaces / scope	Functional	Visual	Semantic	Honesty	Verdict
Visual-quality prompt tightening	1a0b78d · 4983338 · deploy/self-audit	2026-04-30	EF-074, EF-073 (×2 viewports), EF-077: Escape #2 calibration + pilot re-eval	N/A	2 demoted	N/A	PASS (self)	Methodology hardening
Honesty findings clarity + open-issue close	018a808	2026-04-30	audit-trail rendering + close real open data-server-total tautology	N/A	N/A	ADDRESSED	PASS	Remediation
Honesty axis pilot	1e60042 · aa71c38 · f0e3076 · 4bbc4c9	2026-04-30	5 cycles audited retrospectively + self-audit	N/A	N/A	N/A	PASS (self)	Harness landed
Semantic-invariants pilot	ffe6c81 · ed9942d · 1a33202 · 79b2256	2026-04-30	EF-074, EF-073, EF-077: 3 surfaces × 8 invariants	PASS	PASS	PASS	ADDRESSED · 2 serious	EF-074 → Shippable
Visualizer polish	56fc597 · 01f5b30 · bf035f8 · 5b57b45	2026-04-30	EF-074, EF-073 (×2 viewports), EF-077: polish iteration ×2	PASS	PASS	N/A	PASS (retro)	Visual closed; semantic pending
Visualizer chrome-fix	27507ae · d1b85f8 · 57d600e · 46e588e	2026-04-30	SurfaceApp: suppress admin chrome for surface=visualizer	PASS	demoted	N/A	PASS (retro)	Honest demote: polish < 4
Visual-quality pilot	2ec2c43 · cf3a226 · 4ac6ece	2026-04-30	3 EFx surfaces: first vision-evaluator run	PASS	all 4 fail	N/A	RETRO · 1 serious resolved	Harness pilot
EF-074 cycle 3 (client fixes)	b3364f9 · 7d054ad · ee5123f · df65cab	2026-04-29	EF-074: client out-of-order + foreign-event filter	PASS	no axis yet	no axis yet	RETRO · 2 serious resolved	Demoted by user; founding escape #1 healed
EF-077 access control	637c053 · ec94b0c · f3ca200 · 976b89f · 1bd7bb7	2026-04-29	/access-station, door scan, audit row, capacity	PASS	no axis yet	no axis yet	not audited	Partial: NFC + organizer admin deferred
EF-073 EFx Poll	d79a1b2 · 75f0559 · adff28b · 3a89f34	2026-04-29	/poll-attendee + /poll-station net-new	PASS	no axis yet	no axis yet	not audited	Partial: organizer admin deferred
W1 mailing edge	fb6f884 · 976279c · 5323e4b	2026-04-29	99/100 → 100/100 stuck-sending fix	PASS	no UI	no axis yet	not audited	Deliverable 3 closed
EF-074 tighten	ea76be3 · abcc49c · 81a98ce · 8fb3542 · 593905b	2026-04-29	delta-injection + load wrapper, surfaced 3 client bugs	PASS	no axis yet	no axis yet	not audited	Partial: 3 honest demotions

Trends

Methodology coverage and outcomes over time. Hand-rendered for now; auto-generation comes when the dataset warrants it.

Axes coverage over time

How many of the 4 axes were available + applied per cycle.

(Chart not reproduced here. Series: Axes available.)

Honesty-audit finding states (across 6 audited cycles)

Findings by current state. Open = action required. Addressed = fix landed in a later commit. Retro-resolved = the finding describes a state of the world already corrected by subsequent work.

(Chart not reproduced here. Series: Pass cycles, Addressed, Retrospective-resolved, Open.)

Escape ledger

Bugs caught by humans that the methodology should have caught. Each escape includes a healing plan: what was missed, root cause, and the methodology change that prevents the same class of escape next time.

Escape #2 — OPEN · logged 2026-04-30

EF-077 /access-station visual-quality false-pass — kitchen-sink rendering, debug-string leak, state contradiction, misleading affordance

Caught: 2026-04-30 by user (operator review of deployed /access-station) · Surface: visualizer.vxge-aperture.porivo.com/access-station · Cycle that false-passed: visualizer polish

What the methodology missed

The visual-quality prompt judged aesthetic surface quality without checking whether the station rendered a coherent runtime state for a real door operator. It missed ten visible failures: kitchen-sink rendering of the ALLOW / DENY / CAPACITY REACHED / LATE POLICY pills; leakage of a debug identifier in "Checkpoint ef077-door-station"; ALLOW shown while capacity is reached; an active, equal-weight NFC affordance while NFC proof is deferred; a headline dominating the actual scan action; audit rows without column headers; developer-database terminology ("4 rows"); yellow/cream color semantics for the capacity-reached state; massive dead space in the audit panel; and an empty middle column beneath the checkpoint label.

Root cause

The prompt's sub-axis definitions were too loose. In particular, would_a_designer_ship was returned as true on the strength of a "looks intentional and fairly production-ready" framing, without testing whether the page would actually function for its named persona.

Healing plan

Complete: Tighten the visual-quality prompt with 10 named hallmarks: coherent runtime state, no debug strings, no state/data contradictions, affordance/capability alignment, action hierarchy, table labels, no developer terminology, correct color semantics, no layout voids, and no empty regions.
Complete: Re-evaluate the pilot under the tightened prompt. EF-074 carried forward; EF-073 desktop and EF-077 demoted.
Queued: Fix factories per failed surface: /access-station first; /poll-attendee desktop hierarchy next, unless grouped into the same visual remediation pass.
Queued: Re-run after each fix until all pilot surfaces pass under the tightened prompt.
Queued: Wide pass, blocked until the pilot is genuinely clean under the tightened prompt.
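
As a rough illustration of what a hallmark-gated predicate could look like, the sketch below treats would_a_designer_ship as true only when every named hallmark holds. The hallmark identifiers, the all-must-pass aggregation, and the sample observations are assumptions made for illustration; the actual evaluator is a prompt, and its internal logic is not shown in this trail.

```python
# Hypothetical identifiers for the 10 named hallmarks in the healing plan.
HALLMARKS = (
    "coherent_runtime_state",
    "no_debug_strings",
    "no_state_data_contradictions",
    "affordance_capability_alignment",
    "action_hierarchy",
    "table_labels",
    "no_developer_terminology",
    "correct_color_semantics",
    "no_layout_voids",
    "no_empty_regions",
)


def would_a_designer_ship(observations):
    """Return (verdict, failed_hallmarks).

    Assumption: the verdict is true only if every hallmark was observed
    to hold. A hallmark that was not evaluated counts as a failure, so a
    missing check can never produce a pass.
    """
    failed = [h for h in HALLMARKS if not observations.get(h, False)]
    return (not failed, failed)


# Hypothetical observations for a page that leaks a debug string and
# shows an active control for a deferred capability.
sample = {h: True for h in HALLMARKS}
sample["no_debug_strings"] = False
sample["affordance_capability_alignment"] = False

verdict, failed = would_a_designer_ship(sample)
print(verdict, failed)
```

Treating an unevaluated hallmark as a failure is a deliberate choice in this sketch: it keeps the predicate from returning true on the basis of an overall impression alone.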

What changed structurally

The visual-quality predicate now requires functional coherence in addition to aesthetic surface quality: a single resolved state instead of kitchen-sink rendering, surface-language hygiene instead of debug identifiers, affordance/capability alignment instead of active no-op controls, and label completeness for table-like data. The methodology tightened in response to the human-caught escape, per the audit-trail commitment.

Escape #1 — Founding case · RESOLVED 2026-04-30

EF-074 cycle 3 over-claim — admin chrome + 101% rounding

Caught: 2026-04-29 by user (operator review of deployed /poll-results) · Surface: visualizer.vxge-aperture.porivo.com/poll-results · Cycle that promoted: EF-074 cycle 3 client fixes

What the methodology missed

The cycle's strict-predicate harness covered tagged elements via selector-visible, selector-absent, color-contrast on a single marker, and similar probes. The page passed every probe. But the operator-visible state was broken in two distinct ways:

Aggregate visual quality: admin chrome (oversized h1 marketing copy, primary/secondary metric block, status-card panels stack, data-band readout, SEEDED EVENTS table with dark-on-darker text) wrapped the visualizer surface. The page looked like a marketing dashboard with the visualizer embedded as a panel.
Semantic correctness: option percentages displayed 37% + 27% + 21% + 16% = 101%. Naive independent rounding produced an arithmetically impossible total. No probe checked that the displayed numbers reconcile.
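
The rounding failure and its usual remedy can be shown in a few lines. The sketch below is a minimal illustration, not the project's rounding util: the function names and the vote counts are hypothetical, chosen only so that independent rounding reproduces the 37 + 27 + 21 + 16 = 101 pattern described above.

```python
def naive_percentages(counts):
    """Round each share independently; the total can drift away from 100."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


def largest_remainder_percentages(counts):
    """Integer percentages that always sum to 100 (largest-remainder method)."""
    total = sum(counts)
    if total == 0:
        return [0] * len(counts)
    # Integer arithmetic: the quotient is the floored percentage and the
    # remainder says how close the share was to the next whole percent.
    parts = [divmod(100 * c, total) for c in counts]
    result = [quotient for quotient, _ in parts]
    shortfall = 100 - sum(result)
    # Give the missing points to the largest remainders; ties go to the
    # earlier option so the output is deterministic.
    order = sorted(range(len(counts)), key=lambda i: (-parts[i][1], i))
    for i in order[:shortfall]:
        result[i] += 1
    return result


def percentages_reconcile(percentages):
    """Semantic invariant: displayed percentages must sum to exactly 100."""
    return sum(percentages) == 100


votes = [3670, 2660, 2055, 1615]  # hypothetical counts: 36.70%, 26.60%, 20.55%, 16.15%
naive = naive_percentages(votes)              # [37, 27, 21, 16], sums to 101
fixed = largest_remainder_percentages(votes)  # [37, 27, 20, 16], sums to 100
print(naive, percentages_reconcile(naive))
print(fixed, percentages_reconcile(fixed))
```

A probe equivalent to percentages_reconcile, run against the rendered values, is the kind of check that was missing when the page passed every selector-level probe.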

Root cause

The methodology at the time had one axis: functional. The functional axis evaluated the end-to-end correctness of tagged elements, but it did not evaluate aggregate page quality (visual axis territory) or arithmetic invariants on rendered values (semantic axis territory). The cycle's "Shippable" verdict was broader than the underlying probe coverage warranted. This is a classic instance of the over-claim that the messaging audit (honesty axis) is now built to catch. The honesty audit, applied retroactively, flagged this cycle FAIL with 2 serious findings, validating the calibration.

Healing plan

Complete: Add the visual axis. Vision-evaluator harness with 5 sub-axes + axe-clean. Pilot on the 3 EFx surfaces. Predicate flipped green after chrome-fix + polish.
Complete: Add the semantic axis. Story corpus extension + invariant evaluator + largest-remainder rounding util. Predicate flipped green after the EF-074 percentage rounding fix.
Complete: Add the honesty axis. Static code scan + post-hoc message audit. Retrospective audit of EF-074 cycle 3 returned FAIL (2 serious), confirming the harness flags this exact case.
Queued: Four-axis wide pass across all 30 currently-Shippable rows + EFx Partial rows. No row stays Shippable until all four axes pass.

What changed structurally

"Shippable" is no longer a claim earned by passing the probes the cycle authored. It is a claim that requires evidence on all four axes. The honesty axis specifically watches for the pattern that produced this escape: cycles that conveniently choose probe sets that exclude the dimensions where there's residual risk. The retroactive honesty audit on this very cycle validates that the harness has the teeth required.

Audit-trail commitments

What every future factory cycle obligates the methodology to do.

Per-cycle audit-trail update protocol:

Every cycle's third commit appends a row to the per-cycle ledger above with the cycle name, date, surfaces touched, per-axis verdict, final verdict, and commit pointers.
Per-axis evidence (JSONL findings, screenshots, evaluator JSON, audit reports) is published under /findings/ at stable paths so the ledger's links don't rot.
Trend chart data is appended (one new data point per axis per cycle); when the dataset crosses ~20 cycles, the hand-rendered SVG charts switch to a small generation script.
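
One way to picture a ledger row is as a small record that refuses to exist without a verdict for every axis. The sketch below is an assumption-laden illustration: the LedgerRow type, its field names, the append_row helper, and the JSON Lines encoding are hypothetical, while the sample values are copied from the Semantic-invariants pilot row in the per-cycle ledger.

```python
import io
import json
from dataclasses import asdict, dataclass

AXES = ("functional", "visual", "semantic", "honesty")


@dataclass(frozen=True)
class LedgerRow:
    cycle: str
    date: str
    surfaces: tuple
    axis_verdicts: dict
    verdict: str
    commits: tuple

    def __post_init__(self):
        # Every axis needs an explicit entry, even if it is "N/A" or
        # "no axis yet", so a gap in coverage is visible in the row.
        missing = [axis for axis in AXES if axis not in self.axis_verdicts]
        if missing:
            raise ValueError(f"missing axis verdicts: {missing}")


def append_row(stream, row):
    """Append one row as a JSON line; existing lines are never touched."""
    stream.write(json.dumps(asdict(row), ensure_ascii=False) + "\n")


ledger = io.StringIO()  # stands in for an append-mode file
append_row(ledger, LedgerRow(
    cycle="Semantic-invariants pilot",
    date="2026-04-30",
    surfaces=("EF-074", "EF-073", "EF-077"),
    axis_verdicts={
        "functional": "PASS",
        "visual": "PASS",
        "semantic": "PASS",
        "honesty": "ADDRESSED · 2 serious",
    },
    verdict="EF-074 → Shippable",
    commits=("ffe6c81", "ed9942d", "1a33202", "79b2256"),
))
print(ledger.getvalue(), end="")
```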

Escape protocol:

When a human catches a bug the methodology should have caught, an escape entry is added to the ledger with: who caught, when, what surface, what was missed, root cause, healing plan with status-tracked steps, and what changes structurally to prevent the same class of escape.
Healing-plan steps stay on the ledger as queued / in flight / complete until the methodology change actually lands. No steps are silently retired.
Every escape becomes a calibration test for the methodology going forward — e.g., the EF-074 cycle 3 case is now the calibration test for the honesty axis (the harness must flag it Serious-or-Critical retroactively, or the prompt is too lax).
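
The calibration rule in the last item can be read as a simple check over the findings a retrospective audit returns. The sketch below assumes a finding is a dictionary with a severity field and assumes a four-level severity scale; only "serious" and "critical" appear in this trail, so the lower levels, the field names, and the finding summaries are hypothetical.

```python
# Assumed severity scale; only "serious" and "critical" are named in the trail.
SEVERITY_RANK = {"info": 0, "minor": 1, "serious": 2, "critical": 3}


def calibration_passes(findings, floor="serious"):
    """True if at least one finding is at or above the severity floor.

    For a known escape, a result of False would mean the audit prompt is
    too lax: it failed to flag a case that a human already caught.
    """
    threshold = SEVERITY_RANK[floor]
    return any(SEVERITY_RANK[f["severity"]] >= threshold for f in findings)


# The retrospective audit of EF-074 cycle 3 returned 2 serious findings.
# The summaries below are placeholders, not the audit's actual wording.
ef074_cycle3_retro = [
    {"severity": "serious", "summary": "placeholder finding 1"},
    {"severity": "serious", "summary": "placeholder finding 2"},
]

print(calibration_passes(ef074_cycle3_retro))                      # True
print(calibration_passes([{"severity": "minor", "summary": "x"}])) # False
```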

Persistence: the trail does not get rewritten. Past entries stay. New evidence layers (e.g., honesty audits applied retroactively) are added as new entries that reference the original cycle, never by editing the original.
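
The persistence rule amounts to an append-only log in which later evidence points back at earlier entries. The sketch below is one possible shape under stated assumptions: the AuditTrail class, its method names, and the entry identifiers are hypothetical and exist only to illustrate the rule.

```python
class AuditTrail:
    """Append-only trail: entries can be added and read, never edited."""

    def __init__(self):
        self._entries = []

    def append(self, entry_id, summary, references=None):
        known = {entry["id"] for entry in self._entries}
        if entry_id in known:
            # Reusing an id would amount to rewriting a past entry.
            raise ValueError(f"entry already exists: {entry_id}")
        if references is not None and references not in known:
            raise ValueError(f"unknown referenced entry: {references}")
        self._entries.append(
            {"id": entry_id, "summary": summary, "references": references}
        )

    def entries(self):
        # Hand out copies so callers cannot mutate the stored history.
        return tuple(dict(entry) for entry in self._entries)


trail = AuditTrail()
trail.append("ef-074-cycle-3", "EF-074 cycle 3 (client fixes)")
# Retroactive evidence is layered on as a new entry that references the
# original cycle; the original entry itself is left as written.
trail.append(
    "ef-074-cycle-3-honesty-retro",
    "Retrospective honesty audit: FAIL, 2 serious",
    references="ef-074-cycle-3",
)
print(len(trail.entries()))
```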
