the frontier, as questions

Open research questions

donto sits on top of fifteen genuinely open problems — paraconsistent querying at scale, bitemporal belief revision, identity that stays a hypothesis, ranking what to find out next. Each question below is anchored to something the live substrate needs, and written as a self-contained brief you can paste into a deep-research agent (ChatGPT Pro research, Gemini Deep Research, …). Copy the prompt, run it, and send us what comes back.

logic · judging · steering · alignment · identity · evidence · standards · lenses · time · fleet · domain · mechanisms · science

How to use: each card states why donto is asking, then gives the full prompt behind “Show full prompt”. The copy button copies the prompt verbatim, including context, sub-questions, and deliverable spec. The briefs are independent; run them in any order.

Open questions

15 briefs · 13 domains

Q01

Paraconsistent query semantics at 100M-claim scale

logic

Why donto asks: donto holds contradictions as legal state (~42M live statements, growing). It needs principled query semantics over inconsistent data — today contradiction handling is structural (argument edges) rather than logical.
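One textbook starting point for logical (rather than structural) contradiction handling is Belnap-Dunn four-valued logic, where a claim can carry evidence for, against, both, or neither. The sketch below is only an illustration of that idea; the `V` type and the sample claims are hypothetical and are not donto's data model.

```python
from typing import NamedTuple

class V(NamedTuple):
    """Belnap-Dunn truth value: is there evidence for / against a claim?"""
    t: bool  # some source supports the claim
    f: bool  # some source denies the claim

TRUE, FALSE = V(True, False), V(False, True)
BOTH, NEITHER = V(True, True), V(False, False)

def neg(a: V) -> V:
    # Negation swaps the evidence for and against.
    return V(a.f, a.t)

def conj(a: V, b: V) -> V:
    # Supported only if both are supported; denied if either is denied.
    return V(a.t and b.t, a.f or b.f)

def disj(a: V, b: V) -> V:
    # Supported if either is supported; denied only if both are denied.
    return V(a.t or b.t, a.f and b.f)

# Hypothetical store: one contradicted claim, one clean claim, one unknown.
store = {"born_1850": BOTH, "lived_in_A": TRUE, "had_sibling": NEITHER}

# The contradiction stays local: it does not make unrelated claims "both".
local = store["lived_in_A"]                            # still TRUE
mixed = conj(store["born_1850"], store["lived_in_A"])  # BOTH
```

The design point is that a contradiction never "explodes": querying `lived_in_A` is unaffected by the conflict on `born_1850`.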

Q02

Unifying bitemporal databases with formal belief revision

logic

Why donto asks: donto is bitemporal (tx_time + valid_time) and revises belief by closing intervals, never deleting. Is there theory that makes 'what we believed at T1 about time T2' a first-class formal object?
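To make "what we believed at T1 about time T2" concrete, here is a minimal sketch of a bitemporal lookup in which revision closes a transaction interval instead of deleting. The `Belief` type, the `revise` and `believed_at` helpers, and the sample claims are hypothetical names for illustration, not donto's schema.

```python
from dataclasses import dataclass, replace

OPEN = float("inf")  # transaction interval not yet closed

@dataclass(frozen=True)
class Belief:
    claim: str
    valid_from: int       # when the claim holds in the world (inclusive)
    valid_to: int         # exclusive
    tx_from: int          # when the store started believing it (inclusive)
    tx_to: float = OPEN   # when the store stopped believing it (exclusive)

def revise(store, old, new_claim, tx_now):
    """Close the old belief's tx interval and append the replacement; nothing is deleted."""
    closed = replace(old, tx_to=tx_now)
    new = replace(old, claim=new_claim, tx_from=tx_now, tx_to=OPEN)
    return [closed if b is old else b for b in store] + [new]

def believed_at(store, tx_time, valid_time):
    """Claims the store held at tx_time about the world at valid_time."""
    return [
        b.claim for b in store
        if b.tx_from <= tx_time < b.tx_to and b.valid_from <= valid_time < b.valid_to
    ]

first = Belief("resident of A", valid_from=1850, valid_to=1860, tx_from=1)
store = revise([first], first, "resident of B", tx_now=5)

print(believed_at(store, tx_time=3, valid_time=1855))  # ['resident of A']
print(believed_at(store, tx_time=6, valid_time=1855))  # ['resident of B']
```

Both answers remain retrievable after the revision, which is the property a formal belief-revision account would need to preserve.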

Q03

Truth discovery: computing a claim's standing from corroboration, conflict, and source reliability

judging

Why donto asks: donto is shipping standing v1 = ⟨maturity, corroboration, contradiction-pressure, recency⟩. The truth-discovery literature has spent 15 years on exactly this joint-estimation problem — donto should inherit it, not reinvent it.
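The joint-estimation problem can be illustrated with a toy fixed-point iteration: claim belief is computed from the trust of asserting sources, and source trust from the belief in its claims. This is a simplified sketch in the spirit of iterative truth-discovery methods, not a specific published algorithm and not donto's standing v1; it covers only corroboration and source reliability, and the function and source names are hypothetical.

```python
from collections import defaultdict

def truth_discovery(assertions, rounds=10):
    """assertions: list of (source, object, value). Returns (belief, trust)."""
    trust = {s: 0.5 for s, _, _ in assertions}  # uninformed starting trust
    belief = {}
    for _ in range(rounds):
        # Claim score = summed trust of the sources asserting it.
        score = defaultdict(float)
        for s, obj, val in assertions:
            score[(obj, val)] += trust[s]
        # Normalise scores among the conflicting values of the same object.
        totals = defaultdict(float)
        for (obj, _), sc in score.items():
            totals[obj] += sc
        belief = {
            key: (sc / totals[key[0]] if totals[key[0]] > 0 else 0.0)
            for key, sc in score.items()
        }
        # Source trust = mean belief in the claims it asserted.
        by_source = defaultdict(list)
        for s, obj, val in assertions:
            by_source[s].append(belief[(obj, val)])
        trust = {s: sum(v) / len(v) for s, v in by_source.items()}
    return belief, trust

belief, trust = truth_discovery([
    ("s1", "birth_year", "1850"),
    ("s2", "birth_year", "1850"),
    ("s3", "birth_year", "1852"),
])
```

A known weakness of this kind of loop is that it rewards majorities and copied sources, which is one reason the brief asks what the literature offers beyond it.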

Q04

Optimal evidence acquisition: ranking what to find out next

steering

Why donto asks: Loop C of the canon: donto.suggest_next_evidence(scope, lens) should rank which evidence acquisition most reduces uncertainty on contested claims. This is value-of-information theory applied to a claim graph — the genealogy steering demo needs it.
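As a sketch of the value-of-information idea, the snippet below ranks candidate evidence by expected entropy reduction on a single binary claim, per unit cost. It does not show how `donto.suggest_next_evidence` works; the function names, the candidate sources, and their likelihoods and costs are all hypothetical.

```python
import math

def entropy(p: float) -> float:
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def expected_info_gain(prior: float, p_pos_if_true: float, p_pos_if_false: float) -> float:
    """Expected reduction in entropy of a claim after observing one piece of evidence."""
    p_pos = prior * p_pos_if_true + (1 - prior) * p_pos_if_false
    gain = entropy(prior)
    # Average the posterior entropy over the two possible observations.
    for p_obs, like_true in ((p_pos, p_pos_if_true), (1 - p_pos, 1 - p_pos_if_true)):
        if p_obs > 0:
            posterior = min(1.0, prior * like_true / p_obs)  # Bayes' rule
            gain -= p_obs * entropy(posterior)
    return gain

def rank(prior, candidates):
    """candidates: (name, p_pos_if_true, p_pos_if_false, cost). Best gain per cost first."""
    scored = [(expected_info_gain(prior, a, b) / cost, name) for name, a, b, cost in candidates]
    return [name for _, name in sorted(scored, reverse=True)]

order = rank(0.5, [
    ("parish_register", 0.9, 0.1, 2.0),  # informative but costly (made-up numbers)
    ("family_letter", 0.6, 0.4, 1.0),    # cheap but weak
    ("coin_flip", 0.5, 0.5, 1.0),        # carries no information
])
```

A claim graph complicates this picture because one observation can shift many linked claims at once, which is where the single-claim calculation stops being enough.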

Q05

Deferring schema alignment to query time: what does the literature actually support?

alignment

Why donto asks: donto's core bet: let LLMs freely mint ~1M predicates, hold them raw, and fold equivalences at query time via an embedding/closure engine. Pay-as-you-go dataspaces proposed this two decades ago — what's been learned since?

Q06

Entity resolution without merging: identity as a revisable, queryable hypothesis

identity

Why donto asks: donto never merges entities — identity lives as scored hypothesis edges (strict 0.98 / likely 0.85 / exploratory 0.60) applied at query time. The genealogy corpus (16 distinct 'Kittys') is the stress test.
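A minimal sketch of identity applied at query time: the same hypothesis edges yield different entity clusters depending on the threshold chosen, and nothing is ever merged in storage. The three cutoffs come from the text above; the union-find clustering, the transitive closure it implies, and the sample edges are assumptions made for illustration.

```python
THRESHOLDS = {"strict": 0.98, "likely": 0.85, "exploratory": 0.60}

def clusters(entities, edges, level):
    """Group entities by identity edges scoring at or above the level's cutoff."""
    cutoff = THRESHOLDS[level]
    parent = {e: e for e in entities}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in edges:
        if score >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for e in entities:
        groups.setdefault(find(e), set()).add(e)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical records and scored same-as hypotheses.
entities = ["kitty_1", "kitty_2", "kitty_3", "kitty_4"]
edges = [("kitty_1", "kitty_2", 0.99), ("kitty_2", "kitty_3", 0.90), ("kitty_3", "kitty_4", 0.70)]

print(clusters(entities, edges, "strict"))       # [['kitty_1', 'kitty_2'], ['kitty_3'], ['kitty_4']]
print(clusters(entities, edges, "likely"))       # [['kitty_1', 'kitty_2', 'kitty_3'], ['kitty_4']]
print(clusters(entities, edges, "exploratory"))  # [['kitty_1', 'kitty_2', 'kitty_3', 'kitty_4']]
```

Note that transitive closure chains weak links together at the exploratory level; whether identity hypotheses should close transitively at all is part of what the brief leaves open.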

Q07

Computational argumentation over a million-edge claim graph

judging

Why donto asks: donto_argument holds typed rebuts/undercuts/supports edges, designed to carry the contradiction structure of the whole substrate. Reasoning under most Dung-style semantics (preferred, stable) is NP-hard or worse in general — what scales, and what do gradual semantics buy?
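For orientation, the grounded extension is the one classical Dung semantics that can be computed in polynomial time, by iterating the characteristic function from the empty set. The sketch below handles plain attack edges only; how typed rebuts/undercuts/supports edges would map onto attacks is an assumption left open here, and the argument names are hypothetical.

```python
def grounded_extension(arguments, attacks):
    """Least fixed point of the characteristic function of an attack graph.

    arguments: iterable of argument ids; attacks: iterable of (attacker, target).
    """
    attacks = list(attacks)
    attackers = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)

    accepted = set()
    while True:
        # Arguments attacked by something already accepted are defeated.
        defeated = {dst for src, dst in attacks if src in accepted}
        # An argument is acceptable if every one of its attackers is defeated.
        new = {a for a in attackers if attackers[a] <= defeated}
        if new == accepted:
            return accepted
        accepted = new

# a attacks b, b attacks c: a is unattacked, so it defends c.
chain = grounded_extension(["a", "b", "c"], [("a", "b"), ("b", "c")])

# x and y attack each other: neither can be defended, only z survives.
cycle = grounded_extension(["x", "y", "z"], [("x", "y"), ("y", "x")])
```

The grounded extension is deliberately sceptical: in the mutual-attack case it accepts neither side, which is the kind of behaviour gradual semantics aim to refine with degrees of acceptability.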

Q08

Post-hoc evidence anchoring: verifying that extracted claims are actually in the source

evidence

Why donto asks: donto's always-on citer anchors every extracted claim to a source span or honestly flags it as interpretation (currently ~81% anchored on literal claims). Anchoring coverage is the #1 metric (3.8% substrate-wide → 25%). What's the strongest architecture?

Q09

Round-tripping a paraconsistent bitemporal store through PROV, nanopublications, and RDF-star

standards

Why donto asks: donto's standards posture: an export is a lens with a LOSS REPORT. To write honest loss reports it needs to know precisely what each standard can and cannot represent.

Q10

Standpoint logic and contextual semantics: formal foundations for lenses

lenses

Why donto asks: Lenses are donto's first-class 'way of looking' — the same claim set viewed under different alignment thresholds, source filters, and standing weights. Standpoint logic appears to formalize exactly this; donto needs to know how far the theory carries.

Q11

Time-aware truth: ranking claims whose truth changes, decays, or was always wrong

time

Why donto asks: donto's bitemporal layer demonstrably wins on temporal benchmark categories (BEAM event-ordering, knowledge-update). But recency in standing v1 is a placeholder — the substrate needs principled time-aware ranking, not exponential decay folklore.

Q12

Verifiable volunteer compute: trusting strangers' GPUs with epistemic work

fleet

Why donto asks: donto.org/help already runs a stranger-operable embedding fleet (dumb workers, lease/submit protocol). The roadmap extends this to review, evidence-fetching, and adjudication — which requires verifying work from untrusted contributors.

Q13

Machine-checkable genealogical proof: formalizing the Genealogical Proof Standard

domain

Why donto asks: Genealogy is donto's hardest live consumer — contested identities, conflicting records, native-title evidentiary stakes. The GPS ('reasonably exhaustive search… resolution of conflicts') reads like a spec for the substrate; nobody seems to have formalized it.

Q14

Mechanism design for a shared epistemic commons: eliciting honest judgment from self-interested agents

mechanisms

Why donto asks: donto's horizon includes multi-agent adjudication — many agents (human and LLM) writing reviews and standings into one shared substrate. Without mechanism design this degrades into sybil spam or sycophancy; with it, disagreement becomes signal.

Q15

Holding a live, contested map of a scientific field: claims, measurements, and replication

science

Why donto asks: The canon's horizon: science frames — a measurement is a claim with a unit lens; a field is a contested claim network. Before building, donto needs the measured state of scientific-claim extraction, stance, and replication-prediction.