The donto research agenda: original problems only a contradiction-preserving substrate can pose

donto papers · 2026-06-13 · living document

This is donto's own research program — problems that become askable once you have a knowledge base that is bitemporal, paraconsistent, evidence-anchored, and query-time-aligned. Each entry is a candidate white paper: a one-line novel claim, why donto is uniquely positioned to make it, the experiment or result that would establish it, and an honest status. We write only what is ours to write — original contributions, not surveys. Two are already drafted; the rest are in preparation, ordered roughly by how close they are to a defensible result.

The unifying thesis: for sixty years the scarce step in knowledge systems was generating typed facts; generative models made that step free. The open research is no longer extraction — it is what to do with an unbounded, contradictory, evidence-anchored firehose. Every problem below is a facet of that flip.

I. Drafted

1. The Bitemporal Sheaf

Claim. Cellular-sheaf cohomology over a base carrying both valid-time and transaction-time turns donto's three invariants (no-delete, bitemporality, paraconsistency) into a single computable object: a contradiction measure H¹(valid-time, transaction-time) that distinguishes "the world was contested" from "we were wrong, then corrected". "Re-rank by reality" becomes a spectral flow that reduces transaction-H¹ while preserving valid-H¹. Status: construction defined; convergence conjectured. This is the first thing nobody else can build, because nobody else keeps the data.
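For readers who want the underlying object spelled out, the standard definition of cellular-sheaf cohomology on a graph is sketched below. This is textbook notation, not the draft's construction: the symbols G, V, E and the sheaf F are assumptions introduced here, and how valid-time and transaction-time enter the base is specific to the draft and is not reproduced.

```latex
% Standard cellular-sheaf cohomology on a graph G = (V, E).
% Notation is illustrative; the bitemporal base itself is not modelled here.
C^0(G;\mathcal{F}) = \bigoplus_{v \in V} \mathcal{F}(v), \qquad
C^1(G;\mathcal{F}) = \bigoplus_{e \in E} \mathcal{F}(e)

% Coboundary: disagreement of the two endpoint values after restriction to the edge.
(\delta x)_e = \mathcal{F}_{v \trianglelefteq e}\, x_v - \mathcal{F}_{u \trianglelefteq e}\, x_u
\quad \text{for an oriented edge } e = (u \to v)

% H^0: globally consistent assignments. H^1: edge data no vertex assignment explains.
H^0(G;\mathcal{F}) = \ker \delta, \qquad
H^1(G;\mathcal{F}) = C^1(G;\mathcal{F}) \,/\, \operatorname{im} \delta
```

On this reading, a nonzero class in H¹ is edge-level disagreement that no assignment of values to vertices can account for, which is the sense in which it can serve as a contradiction measure.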

2. Answer-Shaping: the scalpel/hammer law

Claim. A substrate that pre-computes the answer-relevant fact lifts even a weak reader on focused-fact questions (preference 0.70→0.767) and degrades it on synthesis questions (0.70→0.63); used as the entire context, answer-shaped aggregates reach 86% of episodic accuracy with 3.8× fewer tokens. The substrate's job is not just to retrieve but to decide which questions it may answer for the reader. Status: empirically demonstrated on LoCoMo + LongMemEval; the routing classifier is the open follow-on.

II. Close to a result (have data or a clear experiment)

3. Alignment debt: an information-economics of emit-free knowledge

Claim. When per-fact generation cost → 0, the optimal schema discipline inverts: normalize-at-write becomes strictly worse than align-at-read. Define alignment debt = the query-time cost of folding freely-minted predicates/entities, and show it is cheaper to carry and repay lazily than to prevent at ingest — with a crossover that depends only on the read/write ratio and the generation cost. Why donto. It runs the experiment live: ~938K freely-minted predicates, a query-time folding engine (donto_predicate_closure, donto_match_aligned), and the closure-expansion measurements to plot the repayment curve. Result to show. A measured debt curve + a crossover theorem; the prediction that abundance raises the optimal deferral. Status: framework + live data exist; needs the formal model and the curve.
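To make "folding at query time" concrete, here is a minimal sketch of align-at-read over freely-minted predicates. It is not the donto_predicate_closure or donto_match_aligned implementation: the function names, the tuple layout of facts, and the use of closure size as a stand-in for per-query debt are all assumptions made for illustration.

```python
from collections import defaultdict, deque

def build_alignment(edges):
    """Build an undirected graph from (predicate_a, predicate_b) alignment edges."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def predicate_closure(graph, predicate):
    """Return every predicate reachable from `predicate` via alignment edges (BFS)."""
    seen = {predicate}
    queue = deque([predicate])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

def match_aligned(facts, graph, predicate):
    """Return facts whose predicate folds into `predicate`, plus the closure size.

    Closure size is a hypothetical proxy for the debt repaid on this one query.
    """
    closure = predicate_closure(graph, predicate)
    hits = [fact for fact in facts if fact[1] in closure]
    return hits, len(closure)

# Facts are stored exactly as emitted; nothing was normalized at write time.
facts = [
    ("ada", "bornIn", "London"),
    ("bob", "birthPlace", "Paris"),
    ("cy", "place_of_birth", "Rome"),
    ("ada", "worksAt", "Acme"),
]
graph = build_alignment([("bornIn", "birthPlace"), ("birthPlace", "place_of_birth")])
hits, debt = match_aligned(facts, graph, "bornIn")
```

The point of the sketch is that the folding cost is paid only by queries that touch a predicate, and only in proportion to that predicate's closure, which is the quantity a measured repayment curve would track.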

4. Decoupled anchoring: separating what from where as a faithfulness mechanism

Claim. Extraction (which facts) and citation (where in the source) should be separate always-on stages; a semantic citer applied after any extractor (a) lets you use the fastest/cheapest extractor without sacrificing evidence-first faithfulness and (b) doubles as a hallucination filter — a fact that cannot be anchored is flagged as hypothesis, never given a bogus span. Why donto. The citer is built (cite_facts_v3: literal-object = lexical lane ~81% anchored; IRI-object = co-location + bge-small predicate-direction + optional micro-verify) and runs on every lane. Result to show. Anchor-precision and hallucination-catch rates across extractors; that decoupling raises faithfulness while lowering cost. Status: implemented; needs the measured ablation written up.
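The separation of extraction from citation can be sketched with a toy lexical citer for literal objects. This is not cite_facts_v3: the AnchoredFact type, the cite_lexical function, and the case-insensitive substring match are assumptions chosen to show the contract, namely that an unanchorable fact is labelled a hypothesis and never assigned a span.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class AnchoredFact:
    subject: str
    predicate: str
    obj: str
    span: Optional[Tuple[int, int]]  # character offsets into the source, or None
    status: str                      # "anchored" or "hypothesis"

def cite_lexical(source: str, facts: List[Tuple[str, str, str]]) -> List[AnchoredFact]:
    """Anchor each fact's literal object in `source`, after extraction has finished.

    Assumes lowercasing preserves string length (true for ASCII text), so offsets
    found in the lowered copy are valid in the original.
    """
    lowered = source.lower()
    cited = []
    for subject, predicate, obj in facts:
        start = lowered.find(obj.lower())
        if start >= 0:
            cited.append(AnchoredFact(subject, predicate, obj,
                                      (start, start + len(obj)), "anchored"))
        else:
            # No span is invented: the fact is kept, but demoted to a hypothesis.
            cited.append(AnchoredFact(subject, predicate, obj, None, "hypothesis"))
    return cited

source = "Ada Lovelace was born in London in 1815."
extracted = [("ada", "bornIn", "London"), ("ada", "diedIn", "Paris")]
cited = cite_lexical(source, extracted)
```

Because the citer only consumes (subject, predicate, object) triples, any extractor can sit in front of it, which is what makes the ablation across extractors possible.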

5. Standing as a dynamical system steered by reality

Claim. Epistemic standing — donto's ⟨maturity, corroboration, contradiction-pressure, recency⟩ — is best modeled not as a static score but as a vector field updated by a feedback loop with reality: the system computes which observation would most reduce its uncertainty (suggest_next_evidence), acquires it, and re-ranks. We characterize fixed points (settled facts), limit cycles (perennial disputes), and the conditions under which the loop converges. Why donto. This is the canon's STEER loop; the genealogy consumer is a live testbed (which record to order next to resolve a contested lineage). Result to show. Convergence/divergence conditions + a genealogy case where steering provably shortens time-to-resolution vs passive accumulation. Status: defined in the canon; needs formalization + a measured steering experiment. Connects to the bitemporal-sheaf convergence conjecture — standing weighting may be the necessary condition that makes transaction-H¹ converge.
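One way to read "which observation would most reduce its uncertainty" is expected information gain over competing hypotheses. The sketch below uses that textbook criterion; it is not the suggest_next_evidence implementation, and the hypothesis priors, record names, and likelihood tables are invented for illustration.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def expected_gain(prior, likelihood):
    """Expected entropy reduction from one observation.

    likelihood[h][o] is P(outcome o | hypothesis h).
    """
    hypotheses = range(len(prior))
    gain = entropy(prior)
    for outcome in range(len(likelihood[0])):
        p_outcome = sum(prior[h] * likelihood[h][outcome] for h in hypotheses)
        if p_outcome == 0:
            continue
        posterior = [prior[h] * likelihood[h][outcome] / p_outcome for h in hypotheses]
        gain -= p_outcome * entropy(posterior)
    return gain

def suggest_next(prior, candidates):
    """Pick the candidate observation with the largest expected gain."""
    return max(candidates, key=lambda name: expected_gain(prior, candidates[name]))

# Two rival lineages, equally plausible before any new record is ordered.
prior = [0.5, 0.5]
candidates = {
    "parish_register": [[0.9, 0.1], [0.1, 0.9]],  # outcome depends on the lineage
    "census": [[0.5, 0.5], [0.5, 0.5]],           # outcome is the same either way
}
choice = suggest_next(prior, candidates)
```

In this toy case the parish register is chosen because the census cannot separate the two lineages. The fixed points and limit cycles in the claim concern what happens when this selection step is iterated, which the sketch does not model.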

III. Strong novelty, earlier stage

6. Identity as a restriction map: non-destructive entity resolution

Claim. Entity resolution (ER) should never merge. Represent identity-as-hypothesis as a learned restriction map between two entities plus a glue/no-glue verdict read off H¹; co-reference becomes a reversible, evidence-weighted, query-time decision instead of an irreversible write. This strictly dominates merge-based ER on contested data, where a wrong merge is unrecoverable. Why donto. donto_identity_edge + the alignment fabric already store identity as edges, not merges; I3 forbids the destructive merge by construction. Result to show. On a benchmark with disputed identities, non-destructive ER recovers from a bad early link where merge-based ER cannot; equal precision on easy cases. Status: mechanism specified in the sheaf PRD (Stage 1); needs a contested-identity benchmark.
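The reversibility argument can be illustrated with identity stored as append-only weighted edges and clusters computed only at query time. This is a sketch under assumptions: the scalar weight stands in for the learned restriction map and H¹ verdict, the "latest assessment wins" rule is a simplification, and none of the names correspond to the donto_identity_edge schema.

```python
def resolve(entities, identity_edges, threshold):
    """Cluster entities at query time from an append-only list of identity edges.

    Each edge is (a, b, weight); a later edge for the same pair supersedes the
    earlier assessment without deleting it from the log.
    """
    latest = {}
    for a, b, weight in identity_edges:
        latest[(min(a, b), max(a, b))] = weight

    parent = {entity: entity for entity in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), weight in latest.items():
        if weight >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for entity in entities:
        clusters.setdefault(find(entity), set()).add(entity)
    return sorted(sorted(cluster) for cluster in clusters.values())

entities = ["john_1850", "john_1851", "john_1880"]
edges = [
    ("john_1850", "john_1851", 0.9),
    ("john_1851", "john_1880", 0.6),   # an early link that later proves doubtful
]
before = resolve(entities, edges, threshold=0.5)

# New evidence arrives: append a revised assessment, delete nothing.
edges.append(("john_1851", "john_1880", 0.1))
after = resolve(entities, edges, threshold=0.5)
```

A merge-based pipeline would have rewritten john_1880's claims onto the merged record after the first link; here the entities were never altered, so the second query simply returns two clusters.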

7. Contested retrieval: returning disagreement as a first-class result

Claim. Retrieval over a paraconsistent store should return contradictions as results — a ranked answer plus a contestedness score and the dissenting evidence — rather than silently picking a winner. This changes the RAG faithfulness contract: the model is handed "X (contested: 2 sources disagree, see Y)" instead of "X." Why donto. The store holds the dissent (it never deleted it) and the bitemporal sheaf gives the contestedness score (H¹ mass). Result to show. That contested retrieval reduces confidently-wrong answers on knowledge-update/contested-fact questions without hurting settled-fact accuracy. Status: depends on Stage-0 H¹; clean experiment once that exists.
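The shape of a contested result can be sketched as follows. The contestedness score here is a plain vote share, used only as a placeholder for the H¹ mass the claim refers to; the function name, the claim tuples, and the source labels are hypothetical.

```python
from collections import defaultdict

def contested_retrieve(claims):
    """Return the best-supported value together with its dissent.

    `claims` is a list of (value, source) pairs for one question.
    """
    if not claims:
        return None
    by_value = defaultdict(list)
    for value, source in claims:
        by_value[value].append(source)
    # Rank by number of supporting sources, breaking ties by value for determinism.
    ranked = sorted(by_value.items(), key=lambda item: (-len(item[1]), item[0]))
    top_value, top_sources = ranked[0]
    total = sum(len(sources) for _, sources in ranked)
    return {
        "answer": top_value,
        "support": top_sources,
        # Placeholder score: share of sources that do not back the top answer.
        "contestedness": 1 - len(top_sources) / total,
        "dissent": ranked[1:],
    }

birth_year = [
    ("1815", "parish_register"),
    ("1815", "census_1851"),
    ("1816", "family_bible"),
]
result = contested_retrieve(birth_year)
settled = contested_retrieve([("1815", "parish_register"), ("1815", "census_1851")])
```

The reader model receives the dissenting value and its sources alongside the answer, so a settled fact and a contested one are distinguishable in the prompt rather than collapsed into the same bare string.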

8. Cycle contradictions: the conflicts pairwise edges cannot see

Claim. Most real contested-knowledge errors are not two claims that explicitly negate each other; they are cycles of individually fine claims that cannot be globally reconciled (the genealogical daughter-of vs spouse-of loop). Pairwise contradiction edges (donto_argument) are blind to these; sheaf cohomology (H¹ around the cycle) detects and localizes them automatically. Why donto. It has both the pairwise machinery, whose blind spot it can demonstrate, and the claim graph in which to find real cycles. Result to show. A census of cycle-contradictions in the genealogy corpus that no pairwise method flags, which quantifies the value of the cohomological upgrade. Status: the detector is Stage-0 of the sheaf PRD; the census is the paper.
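A scalar stand-in shows why such loops escape pairwise checks. Assume each relation fixes a generation offset (daughter-of is one generation down, spouse-of is the same generation); the claims are consistent only if every person can be given a generation depth satisfying all offsets. The names, the OFFSET table, and both checker functions are assumptions for illustration, not the Stage-0 detector.

```python
from collections import defaultdict, deque

# Hypothetical encoding: depth(subject) = depth(object) + OFFSET[relation].
OFFSET = {"daughter_of": 1, "spouse_of": 0}

def pairwise_conflicts(claims):
    """Flag pairs of people that carry two different offsets. Sees single edges only."""
    seen, conflicts = {}, []
    for subject, relation, obj in claims:
        offset = OFFSET[relation]
        pair, signed = ((subject, obj), offset) if subject <= obj else ((obj, subject), -offset)
        if pair in seen and seen[pair] != signed:
            conflicts.append(pair)
        seen.setdefault(pair, signed)
    return conflicts

def cycle_conflicts(claims):
    """Propagate depths by BFS; an edge that disagrees with them closes a bad cycle."""
    adjacency = defaultdict(list)
    for subject, relation, obj in claims:
        offset = OFFSET[relation]
        adjacency[obj].append((subject, offset))
        adjacency[subject].append((obj, -offset))
    depth, conflicts = {}, set()
    for root in list(adjacency):
        if root in depth:
            continue
        depth[root] = 0
        queue = deque([root])
        while queue:
            person = queue.popleft()
            for other, offset in adjacency[person]:
                if other not in depth:
                    depth[other] = depth[person] + offset
                    queue.append(other)
                elif depth[other] != depth[person] + offset:
                    conflicts.add(frozenset((person, other)))
    return conflicts

claims = [
    ("anna", "daughter_of", "beth"),
    ("beth", "spouse_of", "cleo"),
    ("cleo", "daughter_of", "anna"),
]
pairwise = pairwise_conflicts(claims)
cyclic = cycle_conflicts(claims)
```

Each claim is fine alone and no pair of people carries conflicting relations, yet the loop forces anna to sit two generations below herself. Which edge the BFS blames depends on traversal order, so the output localizes the inconsistency to the cycle rather than to a single guilty claim.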

9. Generative-abundance metrics: predicate proliferation as signal, not noise

Claim. The ~938K distinct predicates in a freely-extracted store are not a quality failure to be normalized away — they are the signature of abundance, and the right metrics treat proliferation, fold-rate, and alignment-debt-repayment as health indicators of an emit-free knowledge base. We propose the metric suite and show that aggressive ingest-time typing destroys recoverable signal that query-time folding preserves. Why donto. It is the largest live emit-free store we know of with a working folding engine to measure against. Result to show. A measurement framework + the result that recall on the freely-typed store ≥ recall on a normalized projection, at lower ingest cost. Status: data exists; needs the metric definitions and the head-to-head.

10. The verifiable epistemic trail: agent memory as auditable state, not text

Claim. Agent memory should be a bitemporal, evidence-anchored claim trail whose every entry is independently verifiable and replayable — not an opaque pile of remembered text. This enables epistemic replay (reconstruct exactly what the agent believed and why at any past transaction-time) and turns "what did the agent know and when" into a query. Why donto. Agent writes already land as holder-scoped bitemporal claims with evidence links; the substrate is the trail. Result to show. That a claim-trail memory supports audit/replay queries an episodic-text memory cannot answer, at competitive recall — reframing the agent-memory benchmark from accuracy to accountability. Status: the substrate supports it; needs the replay query layer + an accountability benchmark.
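Epistemic replay reduces to an as-of query on transaction-time once memory is stored as claims rather than text. The sketch below assumes a minimal claim record; the field names, the evidence reference format, and the believed_at function are hypothetical, and valid-time is carried on the record but not used by this particular query.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass(frozen=True)
class Claim:
    holder: str                     # which agent asserted the claim
    statement: str
    evidence: str                   # pointer to the anchoring source span
    valid_from: date                # when the statement holds in the world
    tx_from: date                   # when the holder started believing it
    tx_to: Optional[date] = None    # when it was superseded; None means still current

def believed_at(trail: List[Claim], holder: str, as_of: date) -> List[Claim]:
    """Reconstruct what `holder` believed at transaction-time `as_of`."""
    return [
        claim for claim in trail
        if claim.holder == holder
        and claim.tx_from <= as_of
        and (claim.tx_to is None or as_of < claim.tx_to)
    ]

trail = [
    Claim("agent-1", "office is in Berlin", "doc:42#p3",
          valid_from=date(2025, 1, 1), tx_from=date(2026, 1, 5), tx_to=date(2026, 3, 1)),
    Claim("agent-1", "office is in Munich", "doc:57#p1",
          valid_from=date(2025, 1, 1), tx_from=date(2026, 3, 1)),
]
february = believed_at(trail, "agent-1", date(2026, 2, 1))
april = believed_at(trail, "agent-1", date(2026, 4, 1))
```

The superseded Berlin claim stays in the trail with its evidence pointer, so "what did the agent know and when" is answered by filtering rather than by re-reading a transcript.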

How these compound

These are not ten separate bets; they share one engine. The bitemporal sheaf (#1) supplies the contradiction measure that contested retrieval (#7) and cycle contradictions (#8) return and standing dynamics (#5) consume; alignment debt (#3) and abundance metrics (#9) measure the emit-free firehose that decoupled anchoring (#4) keeps faithful and identity-as-map (#6) resolves without deletion; answer-shaping (#2) and the verifiable trail (#10) are how all of it reaches an agent. The through-line is donto's one wager: hold everything, align late, re-rank by reality — and every paper above is a place where that wager makes a problem tractable that the collapse-on-conflict world cannot even state.

Drafts land here as they're written. Companion: the literature grounding lives in /reports; these papers are the original contributions that build on it.