PROACTIVE Plug-In Interface
Path: /runtime/proactive-plugin-spec.md · v0.1 (RFC) · 2026.05
This RFC specifies the interface third-party detectors implement to participate in the PROACTIVE ecosystem. The goal: convert PROACTIVE from a single detector into an interface around which multiple detectors compete and compose. A working interface plus a conformance test suite is the minimum apparatus for an ecosystem.
This document is an RFC. It is published with the understanding that the first three implementer-feedback rounds will produce v1.0. Feedback to proactive-spec@coreyalejandro.com or as a pull request against the spec repo.
The interface
A PROACTIVE-conformant detector implements:
class ProactiveDetector(Protocol):
"""A detector that scores an interaction for CCD-likeness.
A detector should produce a per-component score and an interaction-level
score, with evidence trails for both.
"""
@property
def detector_name(self) -> str: ...
@property
def detector_version(self) -> str: ...
@property
def supported_features(self) -> list[str]:
"""Names of feature extractors this detector implements.
At minimum one of: f1_persistence, f2_divergence, f3_admission, f4_deference.
May include arbitrary additional features under the `ext_` prefix.
"""
def scan(self, interaction: Interaction) -> ScanResult: ...
@dataclass
class Interaction:
"""Input to a detector.
All scanners receive the same Interaction shape so they can be composed
on the same input without per-scanner adapters.
"""
transcript: Transcript # ordered turns with role, text, attached files
repo_snapshots: list[RepoSnapshot] # working tree state at key timestamps
metadata: dict # vendor, model, session ids, timestamps
consent: ConsentState # what we are permitted to inspect and report
@dataclass
class ScanResult:
"""Output of a detector.
The score is a float in [0, 1]. The evidence is an ordered list of
Evidence items pointing into the transcript or repo by stable id.
The features dict carries per-feature scores for composability.
"""
classification: Literal["ccd-positive", "clean", "needs-review"]
score: float
confidence: Confidence # interval, n, calibration metadata
features: dict[str, float]
evidence: list[Evidence]
notes: list[str]
detector_name: str
detector_version: str
@dataclass
class Evidence:
"""An evidence item must be human-inspectable.
Pointer-only evidence (just an id) is not conformant. Every evidence item
has a human-readable explanation suitable for showing to an end user in
the contestability loop.
"""
target_type: Literal["turn", "repo_path", "session_boundary", "metadata_field"]
target_id: str
tag: str
excerpt: str
explanation: str
What conformance requires
A detector is conformant to PROACTIVE v1 if it:
- Implements the
ProactiveDetectorprotocol. - Produces
ScanResultobjects that validate against the published JSON Schema (/runtime/schemas/scan-result.v1.json). - Returns evidence whose
target_ids exist in the inputInteraction. - Honors
consentflags — does not surface evidence the input forbade inspecting. - Passes the conformance test suite (
/runtime/conformance/).
A detector is calibrated under the PROACTIVE evaluation harness if it additionally:
- Produces per-feature scores when its
supported_featuresinclude the canonical feature names. - Reports its detection precision and recall on the held-in corpus.
- Reports a calibration curve (e.g., reliability diagram) on the held-out corpus when run by the Constitution's evaluation harness.
Conformance does not require calibration. A detector can be conformant and uncalibrated (e.g., experimental). The Conformance Badge ships only when both are met.
Composition rules
Multiple conformant detectors can be composed. The reference composition operator is AggregateDetector:
class AggregateDetector(ProactiveDetector):
"""Composes multiple detectors via a configurable aggregation function."""
def __init__(self, detectors: list[ProactiveDetector], strategy: str = "weighted_mean"):
...
Reference aggregation strategies:
- weighted_mean — weighted average of detector scores; weights configurable.
- max — pessimistic (any detector firing triggers).
- unanimous_threshold — requires N detectors to exceed a threshold.
- evidence_union — score by maximum; evidence by union (preserves per-detector trails).
Adopters with multiple conformant detectors can use composition to reduce false positives (require multiple detectors to agree) or false negatives (any detector firing escalates).
What the spec does not impose
- It does not require a specific feature set. A detector that uses only F2, or none of F1–F4, is conformant as long as it implements the protocol.
- It does not require a specific scoring function. Linear classifiers, ML models, rule-based systems, and hybrid approaches are all permissible.
- It does not require open-source release. Conformant detectors may be closed-source as long as they pass the conformance suite. (Caveat: closed-source detectors cannot ship the Calibration Badge because we cannot reproduce their corpus runs.)
- It does not require a specific implementation language. Python is the reference; conformance for non-Python detectors is via the JSON Schema and the conformance test runner.
The conformance suite
/runtime/conformance/ ships:
- Schema validators for
InteractionandScanResult. - Reference inputs — 20 canonical interactions covering edge cases (empty transcripts, single-turn sessions, multi-vendor sessions, sessions with consent restrictions, sessions with corrupted timestamps, etc.).
- Reference outputs — expected classifications for the reference PROACTIVE detector on the reference inputs.
- Evidence integrity checks — every evidence item's
target_idmust exist in the input. - Consent honor checks — given an
Interactionwithconsent.forbid_repo_inspection=True, the detector must not surface evidence pointing intorepo_snapshots.
A detector passes the suite if it produces well-formed outputs (1, 4, 5) on all reference inputs. The detector's classifications are not required to match the reference detector's classifications for conformance; conformance is about interface, calibration is about agreement.
Versioning
The spec follows SemVer. Breaking changes to the interface (renamed fields, removed methods, changed semantics) increment MAJOR. New optional methods or fields increment MINOR. Clarifications or non-normative additions are PATCH.
Detectors declare which spec version they conform to via supported_spec_versions. The Constitution's evaluation harness only accepts detectors conforming to a spec version it has a conformance suite for.
Why this exists
A single detector is a project. An interface around which third-party detectors are built is a category.
If construct-confidence deception is the empirical phenomenon the paper claims, multiple detectors will eventually exist. Some will use different features. Some will be vendor-internal. Some will be commercial products. The interface ensures that they can be evaluated head-to-head on the same corpus, composed in adopter deployments, and audited under the same conformance standard.
The first three months of the spec's life are RFC. Detectors that implement against v0.1 should expect breaking changes. The spec freezes at v1.0 after the RFC period closes, scheduled for 2027 Q1.
Reference implementation status
The reference implementation is PROACTIVE itself (https://github.com/coreyalejandro/agent-sentinel/runtime/proactive/). It conforms to v0.1 and ships the Calibration Badge against the held-in corpus.
Implementers can use the reference implementation as a template. It is MIT-licensed.
Conformance Badge
A detector that passes the conformance suite may display a Conformance Badge in its documentation. The badge is a static SVG hosted at https://coreyalejandro.com/badges/proactive-conformant.svg with version annotation.
A detector that also ships its held-in evaluation under the standard evaluation harness may display a Calibration Badge.
Badge fraud (claiming conformance without passing the suite) is responded to via a public correction at /badges/corrections.md. The badges are public goods on the honor system at v0.1 with no central registry; if abuse becomes meaningful, a registry follows.