# Decoupled anchoring: separating *what* from *where* for faithful extraction

_donto papers · original research · 2026-06-13 · empirical + method, working draft v1_

**Abstract.** Evidence-first extraction usually asks one model to do two jobs at once: decide *which facts* a source supports and decide *where in the source* each fact is anchored. We report that splitting these into **two always-on stages** — a fast extractor that emits free, untyped claims, followed by a separate semantic **citer** that anchors (or refuses to anchor) each claim against the source text — is strictly better on the axis donto cares about: it lets you run the cheapest/fastest extractor *without* losing evidence-first faithfulness, and the citer doubles as a **hallucination filter** because an unanchorable claim is flagged `hypothesis_only`, never given a bogus span. The mechanism is built and runs on every lane (`cite_facts_v3.py`, packaged as `donto_extract.citer`, 1,524 lines). The decisive design move is **object-typed routing**: a literal-object claim takes a lexical anchoring lane that on our frontier-violence corpus anchors **81.4%** of facts correctly; an IRI-object (relational) claim takes a co-location + bge-small predicate-direction lane that is far stricter, anchoring only **49.5%** and sending the rest to honest-unanchorable. The honest negative is the whole point: an earlier coupled-style citer reached **90.2%** raw coverage but an adversarial judge found **~40% of recovered relational anchors pointed at a locatable-but-non-asserting span**; decoupling + refusal drove bogus spans to **zero by construction** at the cost of coverage we were never entitled to. We frame the result as **anchor-precision over anchor-coverage**, give the measured numbers and where they are illustrative, and state the invariant that makes this donto's to build: I3 (never delete) plus a span/evidence model (`donto_span`, `donto_evidence_link`) that can *withhold* a span without losing the claim.

---

## 1. The coupling problem

The default extraction contract is "read this source, return the facts *with* their supporting quotes." It conflates two cognitively different operations:

- **Extraction** — *what is true here?* A generative, recall-maximizing task. donto wants it run by the cheapest, fastest lane available (the canon's emit-free/defer-joining stance), rotated across subscriptions by leftover quota (see [Hyades extraction effectiveness](/reports/hyades-extraction-effectiveness)).
- **Anchoring** — *where exactly in the bytes does this fact live?* A precision task whose failure mode is silent: a model under recall pressure will happily attach a plausible-looking quote that shares a token with the claim but does not *assert* it.

Coupling them has two costs. First, it **taxes the extractor**: every token spent emitting a citation is a token not spent finding the next fact, and the best-anchored models are not the best/cheapest extractors (codex-spark is a fast lane but anchors only ~38–54% of its own output unaided — the figure from the citer's own header, measured against this corpus). Second, it **launders hallucination**: when the same pass that invents a fact also writes its citation, a fabricated fact arrives pre-equipped with a confident-looking span, and there is no independent stage positioned to catch it.

donto's claim: **anchoring should be its own always-on stage, run after any extractor, that either produces a verified span or refuses.** This is [research-agenda](/papers/research-agenda) entry #4 made concrete.

---

## 2. The mechanism that's actually built

The citer (`cite_facts_v3.py`, packaged as `donto_extract.citer`) consumes the universal `{subject, predicate, object, anchor?, confidence}` claim shape plus the full source text — nothing genealogy- or memory-specific — and attaches, to each claim, either a `donto_span`-materializable `surface_text` or `anchor = None, hypothesis_only = True`. (Line numbers below refer to the packaged `src/donto_extract/citer.py`.) Three properties matter.

### 2.1 Object-typed routing (structural, not a predicate list)

The citer routes each claim by the **shape of its object**, never by a predicate string-list (canon: no brittle logic):

| object shape | claim kind | lane |
|---|---|---|
| literal value (`{"literal":{"v":…}}` or a bare string) | **content / attribute** | LEXICAL — verbatim/`_flex_find` over the source, n-gram + IDF-salient tokens |
| IRI (`{"iri":"ex:…"}` or `ex:`/`ctx:`/`http` string) | **relational** | CO-LOCATION + DIRECTION — both endpoints must co-occur in a window *and* the window must assert the directed `s→p→o` |

This is exactly donto's native literal-vs-IRI object distinction, so it transfers across domains with no per-predicate map (`is_relational()`, `citer.py:284`).

### 2.2 The relational lane is deliberately suspicious

A relational anchor must clear three structural gates, none of which is a magic constant:

1. **Instance-aware co-location** — the window must contain a *distinguishing* token of the subject **and** of the object, where "distinguishing" is per-document **inverse document frequency** over the entity localnames in this batch (`DiscriminatorModel`, `citer.py:371`). Tokens that tag most entities (`question`, `by`, `mr`) are dropped; a rare token (a question number, a proper name) singles out the instance. This kills the v1 failure of anchoring `question-34 askedBy mr-watts` to a *different* question that merely shares the word "watts."
2. **Title / over-long-window exclusion** — markdown headers, the doc's first content line, and robust length/entity-density outliers (median + k·MAD over the window distribution) are removed from the relational candidate set, because they co-locate any pair by sheer token density without asserting anything (`build_relational_windows`, `citer.py:542`).
3. **Predicate-direction check** — the surviving co-located window must score above a **data-calibrated** bge-small cosine threshold against the directed NL rendering of the statement; the threshold is calibrated from judge-labelled correct/wrong anchors when they separate, else from the content-anchor cosine distribution's 5th percentile — never hand-set (`calibrate_direction_tau`, `citer.py:647`). An optional codex micro-verify ("does this span *assert* s p o? yes/no") runs over only the small surviving set.

A claim that passes co-location but fails direction is routed to **honest-unanchorable**, not to the nearest plausible span.

### 2.3 Zero bogus spans by construction

Every emitted `surface_text` is `_flex_find`-verified against the source *before* emit, so it is guaranteed to materialize a real `donto_span` + `donto_evidence_link` on re-ingest (`citer.py:117`). The citer can only ever (a) emit a verified span or (b) refuse. It is **insert-only** with respect to the substrate — it never overwrites a claim, only declines to anchor it — which keeps it inside I3.

---

## 3. The measured result

We ran the decoupled pipeline over a frontier-violence corpus extraction: a fast extractor's raw claims, then the v3 citer with no codex micro-verify (the cheapest configuration). The output (`facts.cited.v3.nocodex.jsonl`, 3,519 claims) breaks down (figures re-counted directly from the file):

| lane | claims | anchored | rate | note |
|---|---:|---:|---:|---|
| LEXICAL (literal object) | 1,857 | 1,511 | **81.4%** | content/attribute facts — verbatim-locatable |
| CO-LOCATION (IRI object) | 1,662 | 822 | **49.5%** | relational — strict; direction-gated |
| **total** | **3,519** | **2,333** | **66.3%** | remainder → honest-unanchorable |
| → flagged `hypothesis_only` | 1,186 | — | **33.7%** | `anchor = None`, confidence ≤ 0.4 (verified: max 0.4) |

**The two lanes behave oppositely, and that is the design working.** Literal-object facts are quotable surfaces the source genuinely contains, so the lexical lane anchors four in five. Relational facts assert a *directed edge* that the source often only *implies* — so the lane refuses more than half, on purpose, rather than fabricate a directed quote. Crucially, **the bogus-span count is 0** across all 3,519 (every emitted span is `_flex_find`-verified before emit). The 1,186 honest-unanchorable claims are not lost: under I3 they are held as `hypothesis_only` claims (interpreted, not stated), available to query-time alignment and re-ranking, exactly as a contested-but-evidenceless claim should be.

### 3.1 Why this is *better*, not just more cautious — the v1 negative

The value is visible only against the coupled-style baseline. An earlier citer (v1) maximized **coverage**: it lifted anchor coverage from **47.1% → 90.2%** with zero *bogus-by-construction* spans (every span was locatable in the source). But an adversarial codex judge found that **~40% of the recovered *relational* citations pointed at a locatable-but-NON-asserting span** — a single shared token (a name, the word "question," a metadata header) let the lexical layer attach the wrong instance line (e.g. `question-34 askedBy mr-watts` → q1's "By Mr. WATTS:" line). A span that re-locates in the source but does not *assert the claim* is worse than no span: it manufactures false evidence-first confidence. v2 made the relational lane honest (false relational anchors → 0; **1,537** relational facts re-routed to honest-unanchorable); v3 added the direction check to catch the residual title/density misses.

We then ran a fresh adversarial judge over a 30-case sample of v3's *surviving certified* relational anchors. Verdicts: **8 correct, 19 partial, 3 wrong** (`judge-v3/verdicts.json`). This is the honest ceiling: the relational lane is hard, the direction-cosine cannot separate the subtle Q&A interrogator cases (a "By Mr. X" span is lexically near-identical for the right and the wrong question instance — most "partial" verdicts are right-region/wrong-instance), and **directed cosine alone is not a reliable arbiter there** — which is exactly why the calibrator *declines* a misleading judge-midpoint and defers those calls to the optional codex micro-verify. The honest read: v3's relational lane is *precision-biased and still imperfect* on a 30-case sample; its contribution is that its errors are bounded and auditable, and a wrong-instance anchor is now a *partial* (right region, wrong instance) rather than a confidently-wrong (title/header) span.

> **The reframing.** The right metric for an evidence-first citer is **anchor-precision (is the emitted span actually asserting the claim?), not anchor-coverage (did we attach *a* span?).** Decoupling lets you optimize precision in a stage the extractor never sees, and pay for it in refusals rather than in laundered hallucination.

### 3.2 The hallucination-filter corollary

Because the citer is positioned *after* a recall-maximizing extractor and *can refuse*, it is the natural place to separate **stated** from **interpreted**. On a single-poem deep-extraction (a stress case: dense figurative source), the citer split a 208-claim run into **73 stated** (anchored) and **135 interpreted** (`hypothesis_only`). A claim the source cannot support is not deleted and not given a fake quote — it is demoted to a hypothesis with confidence ≤ 0.4. This is the same operator doing two jobs the coupled design cannot: it produces the evidence span *and* the honest "no span exists" verdict, and the latter is a hallucination signal for free (a fabricated-entity fact has no co-locating window, so it cannot anchor and is flagged).

### 3.3 What is measured vs illustrative

- **Measured (this corpus, one run each):** 3,519 claims; 81.4% / 49.5% / 33.7% lane rates; 0 bogus spans; the v1 47.1%→90.2% coverage history and its ~40%-relational-bogus audit; the v2 re-route of 1,537 relational facts; the 30-case v3 relational re-audit (8/19/3); the 73/135 poem split.
- **Illustrative / single-corpus:** the *absolute* lane rates depend heavily on the source's quotability — a key:value record + one long Q&A transcript anchors very differently from prose. The *direction and shape* of the literal-vs-relational gap, and the precision-over-coverage trade, are the findings; the exact percentages will move with the corpus and the extractor.
- **Not yet measured:** a clean head-to-head across multiple *extractors* on the same source (does decoupling preserve the same anchor-precision when the extractor changes from codex-spark to GLM to holo3.1?). That ablation — anchor-precision and hallucination-catch rate as a function of extractor — is the experiment that would fully establish the claim and is the obvious next run; the harness already supports it (the citer consumes any lane's `{s,p,o}`).

---

## 4. Why this is donto's to make

Decoupled anchoring is only *safe* on a substrate that can hold a claim **without** an anchor and never lose it. On a collapse-on-conflict store, an unanchorable fact has nowhere honest to go — you either drop it (losing a real hypothesis) or fake a span (laundering hallucination). donto has the exact pieces:

| requirement | donto provides |
|---|---|
| hold a claim with no span, losslessly | I3 (never delete) + `hypothesis_only` flag; the claim survives as interpreted, not stated |
| materialize a *verified* span when one exists | `donto_span` (1,117,785 rows) + `donto_evidence_link` (2,799,094 rows) — the `fact → evidence_link → span` model |
| the literal-vs-IRI object distinction the router needs | native to the claim model (no per-predicate map) |
| the embedding the direction check uses | `donto_claim_embedding` (`vector(384)`, bge-small) — same model as the alignment fabric |
| somewhere for the demoted hypothesis to be re-ranked later | query-time alignment (`donto_predicate_closure`, 1,045,762 rows) + the canon's standing kernel (a scoring concept, not yet a live table) |

No collapse-on-conflict KB can do this, because the unanchorable claim has no first-class home there. donto's evidence model lets the citer *withhold* a span and the substrate still keeps the claim — anchoring becomes a separable, retryable, auditable stage instead of a property baked irreversibly into the extraction call.

---

## 5. Status: proven / conjectured / speculative

- **Proven (measured, this corpus).** Object-typed routing works and the two lanes behave oppositely as designed; the lexical lane anchors 81.4% of literal-object facts; the relational lane refuses >50% rather than fabricate; **0 bogus spans by construction**; an unanchorable claim is held as `hypothesis_only` (confidence ≤ 0.4), not lost; the v1→v3 history shows decoupling+refusal converting 90.2%-coverage-with-~40%-relational-bogus into precision-biased honesty.
- **Conjectured (named failure mode).** That decoupling *raises* end-to-end faithfulness *while lowering cost* across extractors — i.e. you can swap in the cheapest extractor and recover evidence-first faithfulness in the citer. Failure mode: an extractor whose hallucinations are *plausibly co-locating* (it fabricates facts about entities that genuinely co-occur in the source) would slip past the co-location gate; the direction check + codex micro-verify are the backstop, but the 30-case audit (19 partial) shows the direction-cosine is weak on subtle same-region/wrong-instance cases. The cross-extractor ablation in §3.3 is what would confirm or break this.
- **Speculative (flagged).** That the *honest-unanchorable rate itself* is a useful corpus signal — a high `hypothesis_only` fraction marks either a figurative/interpretive source (the poem, ~65% interpreted) or an over-generating extractor, and could be a live quality gauge fed back into lane selection. No measurement yet (n=1 source type per regime); worth instrumenting once the citer's per-lane stats are persisted alongside `donto_evidence_link`.

The contribution is not a coverage number — coverage is the metric that *hid* the v1 bug. It is a **method and a metric reframing**: anchor in a separate always-on stage, judge it by precision-with-refusal, and let I3 + the span/evidence model turn "I cannot anchor this" into a first-class, recoverable epistemic state instead of a fabricated quote.

---

_See also: the cost/throughput case for a fast extractor in [Hyades extraction effectiveness](/reports/hyades-extraction-effectiveness); the faithfulness-as-contract theme in [Answer-Shaping](/papers/answer-shaping); the contradiction machinery that consumes demoted hypotheses in [The Bitemporal Sheaf](/papers/bitemporal-sheaf); the honest scorecard in [memory benchmarks](/reports/memory-benchmarks-scorecard); the full program in the [donto research agenda](/papers/research-agenda)._