
Reports

Working research notes from building donto — extraction engineering, benchmark studies, substrate design. Living documents, updated as the work progresses.

Last updated: 2026-06-12.

| Report | What it covers |
| --- | --- |
| Is the Internet an extension of human memory? | A reading of a note the author wrote in 2011, catching himself storing search actions instead of facts. It independently described cognitive offloading (Sparrow's "Google Effect," named that same year) and transactive memory, named the still-open creative cost (an index has coverage but no adjacency, and adjacency is where epiphany happens: "instead of a novel, I am a dictionary"), and proposed, almost verbatim, donto's storage contract: store the idea first, then the path to the source. The throughline: the 2011 personal dilemma is the exact design problem donto solves at the substrate level (evidence-anchored claims = idea + path; the episodic-vs-claims benchmark tension; the Lens/sheaf "epiphany over held premises") |
| Sheaf neural networks for donto | Research report on sheaf neural networks (cellular-sheaf GNNs), arguably the missing mathematics for a contradiction-preserving substrate. Cellular sheaves put a vector space (stalk) on each node + a learned restriction map on each edge; their cohomology computably measures whether local views glue into a consistent whole (H⁰) and where they can't (H¹). Maps line-for-line onto donto: claims = stalks, query-time alignment/identity = restriction maps, paraconsistency = non-zero H¹, contradiction-pressure = localized disagreement norm, multi-source fusion + contested-source reasoning = sheaf data-fusion / discourse-sheaf opinion dynamics. Built to handle heterophily + oversmoothing, the exact failures donto's claim graph triggers. Includes concrete build proposals + honest compute caveats |
| Memory benchmarks — donto's honest scorecard | A multi-day free-reader synthesis across LoCoMo + LongMemEval. The pattern: donto's recall does its job everywhere; the accuracy limits are the reader and query-time routing, not retrieval. Demonstrated wins: token-efficiency (claims+aggregates = 86% of episodic accuracy at 3.8× fewer tokens), knowledge-update out-of-box (0.923, bitemporal baked into the dated representation), and targeted answer-shaping (a distilled-preference facet lifts single-session-preference 0.70→0.767). The honest limit: answer-shaping is a scalpel (helps focused-fact questions, hurts synthesis), and single-hop is a routing gap; both are reader/router-bound, not retrieval-bound |
| The Seven Sisters across 32 cultures | A new example consumer: the Pleiades / Seven Sisters story (Greek, Aboriginal Australian, Native American, Polynesian, Andean, African, Near Eastern + the deep-time "oldest story" debate) saved to disk and extracted into donto as evidence-anchored claims via the donto-agent GLM lane. Covers the engineering (opencode 0-facts → donto-agent; chunking ~doubles fact density; the 429-throttle + pacing; the 0-fact-≠-done fix) and the substrate findings (~11k claims, ~3k invented predicates, the pursuit motif at 246 stmts, the lost-Pleiad puzzle at 114, agricultural-calendar convergence, identity-as-hypothesis) |
| The 48-hour build, read closely | A verification pass over the change report: the 48-hour work is real and tested (22 migrations 0156–0177, each a SQL capability + invariants test), but almost entirely unexercised; governance/horizon tables hold 1–16 demo rows against a ~41.5M-statement store (analogy 1, tournament 2, reputation 4; only rule_agenda at 12.7k is under real load). Reads it together with the LoCoMo results to name the selection function: answer-shaping facts convert (valid_time +45%, standing, closure, aggregates), retrieval doesn't, so the benchmark is what makes the skeleton load-bearing, and the aggregate claim is the next thing to build |
| Past 48 Hours Change Report | Full audit of the committed work from 2026-06-10 10:14 UTC to 2026-06-12 10:14 UTC across donto and donto-web: substrate execution-plan waves, public docs, reports, benchmark analysis, route fixes, why each class of change was made, and what should come next |
| Future Substrate Report | What donto is still missing to become a century-scale knowledge and memory substrate for AI systems: durable identity, deeper source objects, semantic stability, memory lifecycle, governance, cryptographic trust, federation, multimodal evidence, scale, and model integration |
| Single-shot vs agentic, one-call vs multi-scope | What the benchmarks measure, corrected from Zep's page: their 94.7% is single-read with multi-scope retrieval (5 composed searches + rerank, 5,760 tok); the plain single-call auto-search is 86.5% (2,680 tok). Two axes, retrieval composition vs reader agency, and which Zep number is donto's fair target |
| The shape of donto's return | The full schema of the memory bundle donto hands an agent: every field (subject·predicate·object, the temporal text tag, source), where it comes from in the bitemporal claim graph, and the measured reason each one earned its tokens. Substrate is maximal; the return is minimal |
| LoCoMo Config C — claims-only recall | Throwing away the dialogue and answering from the claim layer alone. Clean result: context collapses ~11× (19,029→1,690 tok) but so does accuracy, 0.244 vs episodic 0.837. An honest negative: the bottleneck is claim recall (worst on multi-hop), not claim existence. Next: semantic claim recall |
| Are we using Hyades effectively? | Review of how donto-agent drives the Hyades gateway for claim extraction; the token-budget finding from tuning GLM (40→520 facts/chunk); a model bake-off (incl. the streaming-mode 524 fix) + experiment plan |
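
The sheaf entry above describes stalks on nodes, restriction maps on edges, and a localized disagreement norm. As a rough illustration only, here is a toy sketch assuming one-dimensional stalks, so each restriction map is a single scalar. The names (`coboundary`, `edge_disagreement`, the three sources `a`, `b`, `c`) are hypothetical and are not taken from donto or from the report.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

def coboundary(nodes: List[str], edges: List[Edge],
               maps: Dict[Edge, Tuple[float, float]]) -> List[List[float]]:
    """Dense coboundary matrix: one row per edge, one column per node.

    For edge (u, v) with scalar restriction maps (r_u, r_v), the row
    encodes r_v * x[v] - r_u * x[u].
    """
    index = {n: i for i, n in enumerate(nodes)}
    rows = []
    for (u, v) in edges:
        r_u, r_v = maps[(u, v)]
        row = [0.0] * len(nodes)
        row[index[u]] = -r_u
        row[index[v]] = r_v
        rows.append(row)
    return rows

def rank(matrix: List[List[float]], tol: float = 1e-9) -> int:
    """Matrix rank by Gaussian elimination (fine for toy sizes)."""
    m = [row[:] for row in matrix]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(m)) if abs(m[r][col]) > tol), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def edge_disagreement(edges: List[Edge],
                      maps: Dict[Edge, Tuple[float, float]],
                      x: Dict[str, float]) -> Dict[Edge, float]:
    """Squared residual on each edge: where two local views fail to agree."""
    return {
        (u, v): (maps[(u, v)][1] * x[v] - maps[(u, v)][0] * x[u]) ** 2
        for (u, v) in edges
    }

# Toy data: three sources, identity restriction maps, source "c" dissents.
nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]
maps = {e: (1.0, 1.0) for e in edges}
x = {"a": 1.0, "b": 1.0, "c": -1.0}

delta = coboundary(nodes, edges, maps)
h0_dim = len(nodes) - rank(delta)   # dimension of the space of globally consistent assignments
disagreement = edge_disagreement(edges, maps, x)
total_energy = sum(disagreement.values())
```

In this toy case the per-edge residuals are zero on `a`-`b` and non-zero on the two edges touching `c`, which is the sense in which the disagreement is localized. Learned, higher-dimensional restriction maps would replace the scalars with matrices.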

See also: memory benchmarks status board · open research questions · comparison vs the field.

source: index.md