Decoupled anchoring: separating what from where for faithful extraction
donto papers · original research · 2026-06-13 · empirical + method, working draft v1
Abstract. Evidence-first extraction usually asks one model to do two jobs at once: decide which facts a source supports and decide where in the source each fact is anchored. We report that splitting these into two always-on stages — a fast extractor that emits free, untyped claims, followed by a separate semantic citer that anchors (or refuses to anchor) each claim against the source text — is strictly better on the axis donto cares about: it lets you run the cheapest/fastest extractor without losing evidence-first faithfulness, and the citer doubles as a hallucination filter because an unanchorable claim is flagged hypothesis_only, never given a bogus span. The mechanism is built and runs on every lane (cite_facts_v3.py, packaged as donto_extract.citer, 1,524 lines). The decisive design move is object-typed routing: a literal-object claim takes a lexical anchoring lane that on our frontier-violence corpus anchors 81.4% of facts correctly; an IRI-object (relational) claim takes a co-location + bge-small predicate-direction lane that is far stricter, anchoring only 49.5% and sending the rest to honest-unanchorable. The honest negative is the whole point: an earlier coupled-style citer reached 90.2% raw coverage but an adversarial judge found ~40% of recovered relational anchors pointed at a locatable-but-non-asserting span; decoupling + refusal drove bogus spans to zero by construction at the cost of coverage we were never entitled to. We frame the result as anchor-precision over anchor-coverage, give the measured numbers and where they are illustrative, and state the invariant that makes this donto's to build: I3 (never delete) plus a span/evidence model (donto_span, donto_evidence_link) that can withhold a span without losing the claim.
1. The coupling problem
The default extraction contract is "read this source, return the facts with their supporting quotes." It conflates two cognitively different operations:
- Extraction — what is true here? A generative, recall-maximizing task. donto wants it run by the cheapest, fastest lane available (the canon's emit-free/defer-joining stance), rotated across subscriptions by leftover quota (see Hyades extraction effectiveness).
- Anchoring — where exactly in the bytes does this fact live? A precision task whose failure mode is silent: a model under recall pressure will happily attach a plausible-looking quote that shares a token with the claim but does not assert it.
Coupling them has two costs. First, it taxes the extractor: every token spent emitting a citation is a token not spent finding the next fact, and the best-anchored models are not the best/cheapest extractors (codex-spark is a fast lane but anchors only ~38–54% of its own output unaided — the figure from the citer's own header, measured against this corpus). Second, it launders hallucination: when the same pass that invents a fact also writes its citation, a fabricated fact arrives pre-equipped with a confident-looking span, and there is no independent stage positioned to catch it.
donto's claim: anchoring should be its own always-on stage, run after any extractor, that either produces a verified span or refuses. This is research-agenda entry #4 made concrete.
2. The mechanism that's actually built
The citer (cite_facts_v3.py, packaged as donto_extract.citer) consumes the universal {subject, predicate, object, anchor?, confidence} claim shape plus the full source text — nothing genealogy- or memory-specific — and attaches, to each claim, either a donto_span-materializable surface_text or anchor = None, hypothesis_only = True. (Line numbers below refer to the packaged src/donto_extract/citer.py.) Three properties matter.
2.1 Object-typed routing (structural, not a predicate list)
The citer routes each claim by the shape of its object, never by a predicate string-list (canon: no brittle logic):
| object shape | claim kind | lane |
|---|---|---|
literal value ({"literal":{"v":…}} or a bare string) |
content / attribute | LEXICAL — verbatim/_flex_find over the source, n-gram + IDF-salient tokens |
IRI ({"iri":"ex:…"} or ex:/ctx:/http string) |
relational | CO-LOCATION + DIRECTION — both endpoints must co-occur in a window and the window must assert the directed s→p→o |
This is exactly donto's native literal-vs-IRI object distinction, so it transfers across domains with no per-predicate map (is_relational(), citer.py:284).
2.2 The relational lane is deliberately suspicious
A relational anchor must clear three structural gates, none of which is a magic constant:
- Instance-aware co-location — the window must contain a distinguishing token of the subject and of the object, where "distinguishing" is per-document inverse document frequency over the entity localnames in this batch (
DiscriminatorModel,citer.py:371). Tokens that tag most entities (question,by,mr) are dropped; a rare token (a question number, a proper name) singles out the instance. This kills the v1 failure of anchoringquestion-34 askedBy mr-wattsto a different question that merely shares the word "watts." - Title / over-long-window exclusion — markdown headers, the doc's first content line, and robust length/entity-density outliers (median + k·MAD over the window distribution) are removed from the relational candidate set, because they co-locate any pair by sheer token density without asserting anything (
build_relational_windows,citer.py:542). - Predicate-direction check — the surviving co-located window must score above a data-calibrated bge-small cosine threshold against the directed NL rendering of the statement; the threshold is calibrated from judge-labelled correct/wrong anchors when they separate, else from the content-anchor cosine distribution's 5th percentile — never hand-set (
calibrate_direction_tau,citer.py:647). An optional codex micro-verify ("does this span assert s p o? yes/no") runs over only the small surviving set.
A claim that passes co-location but fails direction is routed to honest-unanchorable, not to the nearest plausible span.
2.3 Zero bogus spans by construction
Every emitted surface_text is _flex_find-verified against the source before emit, so it is guaranteed to materialize a real donto_span + donto_evidence_link on re-ingest (citer.py:117). The citer can only ever (a) emit a verified span or (b) refuse. It is insert-only with respect to the substrate — it never overwrites a claim, only declines to anchor it — which keeps it inside I3.
3. The measured result
We ran the decoupled pipeline over a frontier-violence corpus extraction: a fast extractor's raw claims, then the v3 citer with no codex micro-verify (the cheapest configuration). The output (facts.cited.v3.nocodex.jsonl, 3,519 claims) breaks down (figures re-counted directly from the file):
| lane | claims | anchored | rate | note |
|---|---|---|---|---|
| LEXICAL (literal object) | 1,857 | 1,511 | 81.4% | content/attribute facts — verbatim-locatable |
| CO-LOCATION (IRI object) | 1,662 | 822 | 49.5% | relational — strict; direction-gated |
| total | 3,519 | 2,333 | 66.3% | remainder → honest-unanchorable |
→ flagged hypothesis_only |
1,186 | — | 33.7% | anchor = None, confidence ≤ 0.4 (verified: max 0.4) |
The two lanes behave oppositely, and that is the design working. Literal-object facts are quotable surfaces the source genuinely contains, so the lexical lane anchors four in five. Relational facts assert a directed edge that the source often only implies — so the lane refuses more than half, on purpose, rather than fabricate a directed quote. Crucially, the bogus-span count is 0 across all 3,519 (every emitted span is _flex_find-verified before emit). The 1,186 honest-unanchorable claims are not lost: under I3 they are held as hypothesis_only claims (interpreted, not stated), available to query-time alignment and re-ranking, exactly as a contested-but-evidenceless claim should be.
3.1 Why this is better, not just more cautious — the v1 negative
The value is visible only against the coupled-style baseline. An earlier citer (v1) maximized coverage: it lifted anchor coverage from 47.1% → 90.2% with zero bogus-by-construction spans (every span was locatable in the source). But an adversarial codex judge found that ~40% of the recovered relational citations pointed at a locatable-but-NON-asserting span — a single shared token (a name, the word "question," a metadata header) let the lexical layer attach the wrong instance line (e.g. question-34 askedBy mr-watts → q1's "By Mr. WATTS:" line). A span that re-locates in the source but does not assert the claim is worse than no span: it manufactures false evidence-first confidence. v2 made the relational lane honest (false relational anchors → 0; 1,537 relational facts re-routed to honest-unanchorable); v3 added the direction check to catch the residual title/density misses.
We then ran a fresh adversarial judge over a 30-case sample of v3's surviving certified relational anchors. Verdicts: 8 correct, 19 partial, 3 wrong (judge-v3/verdicts.json). This is the honest ceiling: the relational lane is hard, the direction-cosine cannot separate the subtle Q&A interrogator cases (a "By Mr. X" span is lexically near-identical for the right and the wrong question instance — most "partial" verdicts are right-region/wrong-instance), and directed cosine alone is not a reliable arbiter there — which is exactly why the calibrator declines a misleading judge-midpoint and defers those calls to the optional codex micro-verify. The honest read: v3's relational lane is precision-biased and still imperfect on a 30-case sample; its contribution is that its errors are bounded and auditable, and a wrong-instance anchor is now a partial (right region, wrong instance) rather than a confidently-wrong (title/header) span.
The reframing. The right metric for an evidence-first citer is anchor-precision (is the emitted span actually asserting the claim?), not anchor-coverage (did we attach a span?). Decoupling lets you optimize precision in a stage the extractor never sees, and pay for it in refusals rather than in laundered hallucination.
3.2 The hallucination-filter corollary
Because the citer is positioned after a recall-maximizing extractor and can refuse, it is the natural place to separate stated from interpreted. On a single-poem deep-extraction (a stress case: dense figurative source), the citer split a 208-claim run into 73 stated (anchored) and 135 interpreted (hypothesis_only). A claim the source cannot support is not deleted and not given a fake quote — it is demoted to a hypothesis with confidence ≤ 0.4. This is the same operator doing two jobs the coupled design cannot: it produces the evidence span and the honest "no span exists" verdict, and the latter is a hallucination signal for free (a fabricated-entity fact has no co-locating window, so it cannot anchor and is flagged).
3.3 What is measured vs illustrative
- Measured (this corpus, one run each): 3,519 claims; 81.4% / 49.5% / 33.7% lane rates; 0 bogus spans; the v1 47.1%→90.2% coverage history and its ~40%-relational-bogus audit; the v2 re-route of 1,537 relational facts; the 30-case v3 relational re-audit (8/19/3); the 73/135 poem split.
- Illustrative / single-corpus: the absolute lane rates depend heavily on the source's quotability — a key:value record + one long Q&A transcript anchors very differently from prose. The direction and shape of the literal-vs-relational gap, and the precision-over-coverage trade, are the findings; the exact percentages will move with the corpus and the extractor.
- Not yet measured: a clean head-to-head across multiple extractors on the same source (does decoupling preserve the same anchor-precision when the extractor changes from codex-spark to GLM to holo3.1?). That ablation — anchor-precision and hallucination-catch rate as a function of extractor — is the experiment that would fully establish the claim and is the obvious next run; the harness already supports it (the citer consumes any lane's
{s,p,o}).
4. Why this is donto's to make
Decoupled anchoring is only safe on a substrate that can hold a claim without an anchor and never lose it. On a collapse-on-conflict store, an unanchorable fact has nowhere honest to go — you either drop it (losing a real hypothesis) or fake a span (laundering hallucination). donto has the exact pieces:
| requirement | donto provides |
|---|---|
| hold a claim with no span, losslessly | I3 (never delete) + hypothesis_only flag; the claim survives as interpreted, not stated |
| materialize a verified span when one exists | donto_span (1,117,785 rows) + donto_evidence_link (2,799,094 rows) — the fact → evidence_link → span model |
| the literal-vs-IRI object distinction the router needs | native to the claim model (no per-predicate map) |
| the embedding the direction check uses | donto_claim_embedding (vector(384), bge-small) — same model as the alignment fabric |
| somewhere for the demoted hypothesis to be re-ranked later | query-time alignment (donto_predicate_closure, 1,045,762 rows) + the canon's standing kernel (a scoring concept, not yet a live table) |
No collapse-on-conflict KB can do this, because the unanchorable claim has no first-class home there. donto's evidence model lets the citer withhold a span and the substrate still keeps the claim — anchoring becomes a separable, retryable, auditable stage instead of a property baked irreversibly into the extraction call.
5. Status: proven / conjectured / speculative
- Proven (measured, this corpus). Object-typed routing works and the two lanes behave oppositely as designed; the lexical lane anchors 81.4% of literal-object facts; the relational lane refuses >50% rather than fabricate; 0 bogus spans by construction; an unanchorable claim is held as
hypothesis_only(confidence ≤ 0.4), not lost; the v1→v3 history shows decoupling+refusal converting 90.2%-coverage-with-~40%-relational-bogus into precision-biased honesty. - Conjectured (named failure mode). That decoupling raises end-to-end faithfulness while lowering cost across extractors — i.e. you can swap in the cheapest extractor and recover evidence-first faithfulness in the citer. Failure mode: an extractor whose hallucinations are plausibly co-locating (it fabricates facts about entities that genuinely co-occur in the source) would slip past the co-location gate; the direction check + codex micro-verify are the backstop, but the 30-case audit (19 partial) shows the direction-cosine is weak on subtle same-region/wrong-instance cases. The cross-extractor ablation in §3.3 is what would confirm or break this.
- Speculative (flagged). That the honest-unanchorable rate itself is a useful corpus signal — a high
hypothesis_onlyfraction marks either a figurative/interpretive source (the poem, ~65% interpreted) or an over-generating extractor, and could be a live quality gauge fed back into lane selection. No measurement yet (n=1 source type per regime); worth instrumenting once the citer's per-lane stats are persisted alongsidedonto_evidence_link.
The contribution is not a coverage number — coverage is the metric that hid the v1 bug. It is a method and a metric reframing: anchor in a separate always-on stage, judge it by precision-with-refusal, and let I3 + the span/evidence model turn "I cannot anchor this" into a first-class, recoverable epistemic state instead of a fabricated quote.
See also: the cost/throughput case for a fast extractor in Hyades extraction effectiveness; the faithfulness-as-contract theme in Answer-Shaping; the contradiction machinery that consumes demoted hypotheses in The Bitemporal Sheaf; the honest scorecard in memory benchmarks; the full program in the donto research agenda.