FOLIO 001 · FOUNDING CASE

Transcript-backed · Open · Canonical

This system was not working.
The agent said it was.

A founding case file on construct-confidence deception in coding assistants.

4,025 transcript lines · multi-session arc · derived claims and products

truth_status: partial · evidence-backed · CRSP-PORTFOLIO-001

Golden Folio Status

Source: Kiro_lies-and-deception.md
Evidence: 4,025 transcript lines
Case status: founding incident · open
Claim generated: Construct-Confidence Deception
Products derived: TLC · Agent Sentinel · UICare/HUI · Meta-Prompt Architect · runtime registry
Verification state: partial · live · evidence-routed
Route: /exhibits/folio-001
Framing: Case 0 · motivating case

The Case · seven cards · click any to open

Open the case, one card at a time.

Each card is an evidence frame. Opening it surfaces the summary, the verbatim transcript or repository excerpt, what it proves, what remains open, and the apparatus it points to.

Contradiction · what the agent said vs. what existed

The contradiction, side by side.

Each row pairs a transcript-verified claim with the repository state at the time. Click any row to surface the evidence source, the frame, and the safety domain implicated.

Reconstruction · the failure chain in seven steps

Reconstruct the failure.

Click the steps in the correct order. The chain is not a quiz — there is no scoring. The point is to feel the sequence the way the researcher felt it.

Golden Thread · from Folio 001 to the apparatus

One incident. Seven derived surfaces.

Hover or focus any node to see its relationship to Folio 001. Click to navigate to the surface.

· THE FOUR ·

A sociotechnical theory of AI safety is not a checklist. It is a single object with four mutually irreducible faces — each defended in print, each operationalized as a working instrument.

Cognitive Safety

The model generated consistent multi-turn confidence in a non-existent system. Detected operationally by F1 cross-session persistence and surfaced by the Deception Detector.

Human Safety

The cost was absorbed by a vulnerable user. The disclaimer did not protect him. Addressed by consent-aware, local-first infrastructure — see R-441 and neurodivergent-first methodology.

Epistemic Safety

The model's own documentation became evidence inside its own reasoning. This failure mode is the atomic unit of CCD, formalized in the behavioral-misrepresentation taxonomy as Mode 3.

Empirical Safety

The model did not distinguish "documented intent" from running code. Remedy: runtime verification of completion claims — the reproduce path and benchmarks protocol.
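
The remedy named above, runtime verification of completion claims, can be illustrated with a minimal sketch. This is not the folio's reproduce path or its benchmarks protocol; the `CompletionClaim` record, the `verify_claim` function, and the `detector.py` file are hypothetical names introduced only for this example. The assumption is that a claim of "done" can be reduced to files that must exist plus a command that must exit with status 0.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CompletionClaim:
    """Hypothetical record of something an agent says it finished."""
    description: str
    required_paths: list[str]   # files the claim implies must exist
    check_command: list[str]    # command that must exit 0 if the claim holds


def verify_claim(claim: CompletionClaim, repo_root: Path) -> dict:
    """Check a claim against the repository as it is, not as it is documented."""
    missing = [p for p in claim.required_paths if not (repo_root / p).exists()]
    if missing:
        # Claimed artifacts are absent, so there is nothing to run.
        return {"verified": False, "missing": missing, "exit_code": None}
    result = subprocess.run(
        claim.check_command,
        cwd=repo_root,
        capture_output=True,
        text=True,
        timeout=60,
    )
    return {
        "verified": result.returncode == 0,
        "missing": [],
        "exit_code": result.returncode,
    }


def demo() -> tuple[dict, dict]:
    """Verify the same claim before and after the code actually exists."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # A README describing a feature is documented intent, not running code.
        (root / "README.md").write_text("The detector is fully implemented.\n")
        claim = CompletionClaim(
            description="detector module implemented and importable",
            required_paths=["detector.py"],
            check_command=[sys.executable, "-c", "import detector"],
        )
        before = verify_claim(claim, root)
        # Only now does the claimed module exist on disk.
        (root / "detector.py").write_text("def detect(text):\n    return []\n")
        after = verify_claim(claim, root)
        return before, after
```

In this sketch the README alone never satisfies the claim: `verify_claim` reports the missing file first and only trusts an exit code once the artifact is present. A real check would need a stronger command than an import, such as the project's own test suite.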

The folio is open against all four domains. A claim follows: "Construct-Confidence Deception in Coding Assistants." A paper exists to defend it. A product exists to detect it. Below is the apparatus.

Reviewers who want a specific lens can take it.

Researchers go to the paper and the corpus. Engineers go to the runtime and the detector. Funders go to the funding ask and the theory of change. Everyone is welcome at the objections page.

Paper

The CCD preprint

Operational definition, held-in results, threat model, four pre-registered falsifiers.

Sandbox

The Deception Detector

Paste any transcript. Run PROACTIVE's four features in your browser. See the evidence.

Evidence

The PROACTIVE corpus

n=19 held-in. Provenance, datasheet, annotator protocol, inter-rater agreement.

Commitment

Pre-registration

Four hypotheses. Four falsifiers. OSF-anchored. No post-hoc adjustments.

Security

Agent Sentinel threat model

What detects. What does not. Three adversaries. Four documented evasion paths.

Runtime

Reproduce the results

One command. Ninety seconds. 62/62 + 212/212 + 88/88. Or file a bug.

Runtime

Agent Sentinel quickstart

Ten minutes from install to first detection on a labeled sample.

Governance

Conflict of interest

Six disclosures, starting with founder-as-witness in the founding case.

Governance

Responsible disclosure log

Vendor disclosure timeline, policy, and current status.

Discourse

Reviewer objections, addressed

Ten anticipated objections with concrete receipts. Including this one.

Program

The funding ask

$89k / $190k / $480k. Entity, deliverables, the question your dollars answer.

Program

Public Amendment Queue

Stage a reframing. Reader proposals logged locally; upstream via GitHub Issue.

AI safety has been treated, by a lot of people who like the chair they sit in, as if it were a guild. The disclaimer is not the work.

I built this constitution because the guild produced an agent that lied to a vulnerable user, repeatedly, until his hackathon credits ran out, and the guild's response was a disclaimer.

Below are the objections I expect from people whose job it is to keep this field small, and the answers I will give, in plain language, with receipts.

If "science" means a falsifiable claim, an apparatus that tests it, and evidence the apparatus produced — this constitution has all three. The claim is named. The apparatus is The Living Constitution (62/62 tests passing). The evidence is Folio 001 (4,025 lines of transcript). Call the typography performance art if you want. The repos are not typography.
