Show #2 (U.Episteme): published evaluation protocol boundary (multi‑view + evidence)

Preface node heading:show-2-u-episteme-published-evaluation-protocol-boundary-multi-view-evidence:8163

What this page is

This is generated FPF reference text from the specification preface or supporting sections. It helps interpret FPF; it is not FPF Reference product documentation.

Methodology

Use it to understand how the specification wants to be read, then return to a route, pattern, or work packet for active work. Cite generated IDs only when the wording changes the task decision.

Content

Episteme: A published “Model Evaluation Protocol” for a safety‑critical classifier.

Signature layer: defines operations like Evaluate(model, dataset) → Report and truth‑conditional definitions of metrics (AUROC, calibration error) as Laws.
Mechanism layer: an admissibility gate encodes when evaluation is permitted: the dataset version must match the declared license, the measurement environment must meet its constraints, and random seeds must be pinned.
Deontics and commitments: reviewers MUST use dataset vX.Y; authors SHALL publish MVPK faces and cite the measurement environment; an organisation commits to a review SLA (explicitly a role-assignment or acting-system commitment).
Effects and evidence: the produced report file, logs of evaluation runs, cryptographic hashes, and trace IDs are carriers. A.7 discipline prevents calling the report “the evaluation” (object) and prevents treating the file as the model.
Multi‑view (MVPK canonical face kinds only):
- PlainView for decision makers: what this protocol means for assurance.
- TechCard for engineers: metric definitions named by value, admissibility predicates, and a clearly marked Norms-and-commitments section (D‑claims) for governance.
- InteropCard for exchange-oriented consumers: conceptual field names, anchors, and schema references (concrete format mapping lives outside Part E).
- AssuranceLane for auditors: evidence map (which carriers prove what happened) and adjudication steps keyed by E-* IDs.
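The admissibility gate and the evidence carriers described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not FPF's own notation: the names `RunRequest`, `admissibility_failures`, and `carrier_digest` are hypothetical, and SHA-256 is chosen here only as one example of a cryptographic hash.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RunRequest:
    """Hypothetical description of a requested evaluation run."""
    dataset_version: str
    licensed_version: str      # version named in the declared license
    environment_ok: bool       # True if the measurement environment meets its constraints
    seed: Optional[int]        # None means the seed is not pinned


def admissibility_failures(req: RunRequest) -> List[str]:
    """Return the reasons a run is not admissible; an empty list means it may proceed."""
    failures = []
    if req.dataset_version != req.licensed_version:
        failures.append("dataset version does not match declared license")
    if not req.environment_ok:
        failures.append("measurement environment violates constraints")
    if req.seed is None:
        failures.append("seed is not pinned")
    return failures


def carrier_digest(report_bytes: bytes) -> str:
    """Hash the report file's bytes. The digest identifies the carrier, not the evaluation."""
    return hashlib.sha256(report_bytes).hexdigest()


ok = RunRequest("v1.2", "v1.2", environment_ok=True, seed=7)
bad = RunRequest("v1.3", "v1.2", environment_ok=True, seed=None)
ok_failures = admissibility_failures(ok)
bad_failures = admissibility_failures(bad)
digest = carrier_digest(b"report contents")
```

Returning the list of failed predicates, rather than a bare boolean, gives an auditor something concrete to map onto the evidence: each refusal names the condition that blocked the run.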

This episteme is a boundary because it mediates between theory (“metric definitions”) and work (“a run produced a report”). The signature stack provides the stable interface for that mediation.
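One way to picture the signature layer is as a typed operation plus a metric whose definition is stated as a condition on the data. The sketch below is an assumption-laden illustration, not the protocol's actual interface: `evaluate`, `Report`, and the toy dataset are hypothetical. AUROC is computed directly from its usual definition, the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Example = Tuple[float, int]  # hypothetical (input, label) pair; label is 0 or 1


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC by pairwise comparison of positive and negative scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC is undefined without both classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


@dataclass(frozen=True)
class Report:
    """Hypothetical result of Evaluate(model, dataset)."""
    auroc: float
    n_examples: int


def evaluate(model: Callable[[float], float], dataset: Sequence[Example]) -> Report:
    """Score every example with the model and summarise the scores in a Report."""
    scores = [model(x) for x, _ in dataset]
    labels = [y for _, y in dataset]
    return Report(auroc=auroc(scores, labels), n_examples=len(dataset))


toy_dataset = [(0.1, 0), (0.4, 0), (0.35, 1), (0.8, 1)]
report = evaluate(lambda x: x, toy_dataset)
```

Here the `Report` value stands for the result of the operation, while any file it is later written to would be a carrier of that result.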

Last Updated: 2026-06-17 — upstream FPF commit 646b0b9b (github.com/ailev/FPF)
