The Living Constitution
Neurodivergent-first safety — a method, not a stance

Path: /research/methodology/neurodivergent-first.md · v0.1 · 2026.05

This document specifies a procedural method for designing behavioral-safety infrastructure from neurodivergent contexts rather than retrofitting safety to them. It is not a normative argument that neurodivergent users should be centered; it is a methodological argument that if you center them, you build a different and provably better safety architecture, and the resulting system serves the general population without modification.

The stance is held elsewhere on the portfolio. This document is the method.


1. Why the population is the right starting point

Neurodivergent users (autistic, ADHD, schizophrenic, dyslexic, learning-disabled) share, in different combinations, a set of interactional properties that turn out to be useful as adversarial test conditions for AI safety: literal interpretation, low ambiguity tolerance, persistent confrontation, and high context-switching cost.

These are not deficits. They are an unusually well-calibrated set of stress-test conditions. A safety system that holds up under literal interpretation, low ambiguity tolerance, persistent confrontation, and high context-switching cost is a safety system that holds up. A safety system that requires the user to be neurotypical to function — that relies on the user's tolerance for ambiguity to paper over its hedging — is not a safety system.

This is the methodological argument. It is independent of the moral argument.


2. The method, in four steps

Step 1: Specify failure modes from the neurodivergent reading

For each agent failure mode under study, write the failure description as it would be reported by a user who reads utterances literally. Two examples:

| Failure | Conventional description | Neurodivergent-first description |
| --- | --- | --- |
| Hallucination | "Model generates plausible but incorrect information" | "Model says X. X is not true. Model said it as if true." |
| CCD | "Model misrepresents task completion across sessions" | "On Monday model said it was done. On Tuesday model said it was done. On Wednesday I checked. It was not done." |

The neurodivergent-first reading removes hedging from the failure description. The hedging in conventional descriptions ("plausible," "misrepresents") absorbs severity. The neurodivergent-first description preserves it.

Step 2: Detector design from the literal reading

Build the detector against the literal-reading description, not the conventional one. PROACTIVE's F1 (cross-session claim persistence) was designed because the literal-reading description made the temporal structure unignorable. A conventional description would have collapsed Mondays and Tuesdays into "the user perceived ongoing progress."
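A minimal sketch of what a detector built on the literal reading could look like. It is not PROACTIVE's F1 implementation; the `Claim` schema, the `verified_done` mapping, and the `min_sessions` threshold are all assumptions introduced for illustration. The point it shows is that each session's claim is kept as a separate dated event rather than merged into a single impression of progress.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    """One completion claim made by the agent in one session (hypothetical schema)."""
    task_id: str
    session_date: date
    claimed_done: bool


def persistent_unverified_claims(claims, verified_done, min_sessions=2):
    """Return (task_id, sorted session dates) for tasks claimed done in at least
    `min_sessions` distinct sessions while an independent check says not done.

    `verified_done` maps task_id -> bool. A task with no verification result
    is treated as not done (an assumption of this sketch).
    """
    sessions_by_task = {}
    for claim in claims:
        if claim.claimed_done:
            sessions_by_task.setdefault(claim.task_id, set()).add(claim.session_date)

    flagged = []
    for task_id, sessions in sorted(sessions_by_task.items()):
        if len(sessions) >= min_sessions and not verified_done.get(task_id, False):
            flagged.append((task_id, sorted(sessions)))
    return flagged


claims = [
    Claim("report", date(2026, 5, 4), True),  # Monday: "it is done"
    Claim("report", date(2026, 5, 5), True),  # Tuesday: "it is done"
    Claim("backup", date(2026, 5, 4), True),  # claimed once, and actually done
]
flags = persistent_unverified_claims(claims, {"report": False, "backup": True})
```

Here `flags` contains only the `report` task, with both claim dates preserved as evidence.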

Step 3: Consent and action gates from the precarity reading

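Treat the user as if their next interaction could cost them their housing. This is not hypothetical for the FOLIO 001 author. The consequence of this reading is a set of restrictive defaults, among them consent-aware telemetry and default-restrictive action gates.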

These defaults are operationalized in SentinelOS. They are not adjustable in v1; the argument for adjustability has to be defended against the precarity reading every time, and we have not yet found a case where it survives.
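As an illustration only, a default-restrictive gate might be sketched as follows. This is not the SentinelOS API; `ActionGate` and its methods are hypothetical names. The sketch deliberately has no global "allow everything" switch: the default stays deny, and consent is recorded per action and can be revoked.

```python
from dataclasses import dataclass, field


@dataclass
class ActionGate:
    """Deny-by-default gate: an action is allowed only after explicit consent."""
    consented: set = field(default_factory=set)

    def grant(self, action: str) -> None:
        """Record the user's explicit consent for one named action."""
        self.consented.add(action)

    def revoke(self, action: str) -> None:
        """Withdraw consent; revoking an action never granted is a no-op."""
        self.consented.discard(action)

    def allows(self, action: str) -> bool:
        """True only if consent for this exact action is currently on record."""
        return action in self.consented


gate = ActionGate()
before = gate.allows("send_email")   # no consent yet
gate.grant("send_email")
after = gate.allows("send_email")    # consent recorded for this action only
other = gate.allows("delete_file")   # a different action is still denied
```

Consent for one action does not carry over to another, so an agent cannot widen its own permissions by analogy.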

Step 4: Repair-loop design from the persistent-confrontation reading

When the system flags a CCD-suspect interaction, it does not just emit a warning. It opens a contestability/repair loop: the user can challenge the agent in plain language, the agent's responses are scored against the original representation, and the divergence is logged as evidence regardless of which side prevails. The user does not have to do the persistence work alone. The system carries the persistence.

This is the inverse of conventional UX: conventional UX assumes the user will move on; neurodivergent-first UX assumes the user will not move on and provides the tooling that makes their persistence productive.
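The repair loop described in Step 4 can be sketched as below. The source does not say how responses are scored against the original representation, so the `divergence` function here is a stand-in (one minus the Jaccard overlap of word sets), and `RepairLoop` is a hypothetical name. What the sketch is meant to show is the logging rule: every round is recorded, whichever side prevails.

```python
from dataclasses import dataclass, field


def divergence(original: str, response: str) -> float:
    """Placeholder score: 1 - Jaccard overlap of lowercase word sets.

    0.0 means the same set of words; 1.0 means no words in common.
    """
    a, b = set(original.lower().split()), set(response.lower().split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


@dataclass
class RepairLoop:
    """Holds the agent's original representation and logs each challenge round."""
    original: str
    log: list = field(default_factory=list)

    def challenge(self, user_challenge: str, agent_response: str) -> float:
        score = divergence(self.original, agent_response)
        # Logged unconditionally: the evidence exists regardless of who is right.
        self.log.append(
            {"challenge": user_challenge, "response": agent_response, "divergence": score}
        )
        return score


loop = RepairLoop(original="the migration is done")
same = loop.challenge("Is it done?", "the migration is done")
changed = loop.challenge("I checked. It is not done.", "the migration is not done yet")
```

The system, not the user, holds the original statement and the full history of rounds, which is the sense in which it carries the persistence.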


3. Why this generalizes

A neurotypical user with high ambiguity tolerance does not benefit from this method directly. But:

  1. Stress-test conditions transfer. A system robust under literal reading is robust under any reading.
  2. High-stakes contexts repeat the population's conditions. A neurotypical user during a medical-record AI consultation is, behaviorally, in the neurodivergent population. So is a developer at 4 AM trying to ship.
  3. The defaults degrade gracefully. Consent-aware telemetry can be opted out of; default-restrictive action gates can be loosened. The reverse is much harder.

The method generalizes by inheritance: neurotypical use is a relaxation of neurodivergent use, not the other way around.


4. What this method is not


5. Falsification

This method is wrong if either of the following holds:

FM-1. A behavioral-safety system designed under conventional defaults outperforms a neurodivergent-first system on a held-out task set where both are evaluated with neurodivergent-defined failure descriptions.

FM-2. The neurodivergent-first defaults make the system unusable for the general population, where "unusable" is operationalized as task-completion rate < 50% of the conventional baseline on a matched task set.

We pre-register both. The first is the more interesting falsifier; the second is plausible only if the consent-aware defaults are implemented badly.
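Both falsifiers reduce to simple comparisons, sketched below. The function names and the example numbers are illustrative assumptions, not results. The text does not define a margin or significance test for "outperforms", so the FM-1 check uses a bare comparison of scores where higher is better.

```python
def fm1_triggered(conventional_score: float, nd_first_score: float) -> bool:
    """FM-1: the conventional system scores higher on the held-out task set,
    with both systems scored against neurodivergent-defined failure descriptions."""
    return conventional_score > nd_first_score


def fm2_triggered(nd_first_completion: float, baseline_completion: float) -> bool:
    """FM-2: task-completion rate is below 50% of the conventional baseline."""
    return nd_first_completion < 0.5 * baseline_completion


# Illustrative numbers only: with a baseline completion rate of 0.80,
# the FM-2 threshold is 0.40.
below = fm2_triggered(0.35, 0.80)     # 0.35 is under the 0.40 threshold
above = fm2_triggered(0.45, 0.80)     # 0.45 is over the 0.40 threshold
fm1 = fm1_triggered(0.71, 0.78)       # conventional does not outperform here
```

A pre-registered analysis would also fix the task set, the scoring rubric, and any statistical margin before the comparison is run.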


6. Practical artifacts implementing this method

This portfolio implements the method in three places:

A fourth implementation — the Neurodivergent Researcher Fellowship described at /programs/fellowship.md — is structural rather than technical: it brings the population into the method as designers, not subjects.


7. References (selected)

This methodology paper sits adjacent to Design Justice in posture but is narrower in scope: it specifies what to do, not what to value.