# Reports

Working research notes from building donto — extraction engineering, benchmark studies, substrate design. Living documents, updated as the work progresses.

_Last updated: 2026-06-12._

| report | what it covers |
|---|---|
| **[Is the Internet an extension of human memory?](/reports/internet-as-extended-memory)** | A reading of a note the author wrote in **2011** — catching himself storing *search actions* instead of *facts*. It independently described **cognitive offloading** (Sparrow's "Google Effect," named that same year) and **transactive memory**, named the still-open creative cost (an index has coverage but no *adjacency*, and adjacency is where epiphany happens — "instead of a novel, I am a dictionary"), and proposed — almost verbatim — donto's storage contract: *store the idea first, then the path to the source.* The throughline: the 2011 personal dilemma is the exact design problem donto solves at the substrate level (evidence-anchored claims = idea + path; the episodic-vs-claims benchmark tension; the Lens/sheaf "epiphany over held premises") |
| **[Sheaf neural networks for donto](/reports/sheaf-neural-networks-for-donto)** | Research report on **sheaf neural networks** (cellular-sheaf GNNs) — arguably the missing mathematics for a contradiction-preserving substrate. Cellular sheaves put a vector space (stalk) on each node + a learned restriction map on each edge; their **cohomology computably measures whether local views glue into a consistent whole (`H⁰`) and where they can't (`H¹`)**. Maps line-for-line onto donto: claims = stalks, query-time alignment/identity = restriction maps, **paraconsistency = non-zero `H¹`**, contradiction-pressure = localized disagreement norm, multi-source fusion + contested-source reasoning = sheaf data-fusion / discourse-sheaf opinion dynamics. Built to handle **heterophily + oversmoothing** — the exact failures donto's claim graph triggers. Includes concrete build proposals + honest compute caveats |
| **[Memory benchmarks — donto's honest scorecard](/reports/memory-benchmarks-scorecard)** | A multi-day free-reader synthesis across LoCoMo + LongMemEval. The pattern: **donto's recall does its job everywhere; the accuracy limits are the reader and query-time routing, not retrieval.** Demonstrated wins — **token-efficiency** (claims+aggregates = 86% of episodic accuracy at 3.8× fewer tokens), **knowledge-update** out-of-box (0.923, bitemporal baked into the dated representation), and **targeted answer-shaping** (a distilled-preference facet lifts single-session-preference 0.70→0.767). The honest limit: answer-shaping is a *scalpel* (helps focused-fact questions, hurts synthesis), and single-hop is a routing gap — both reader/router-bound, not retrieval |
| **[The Seven Sisters across 32 cultures](/reports/seven-sisters-pleiades)** | A new example consumer: the Pleiades / Seven Sisters story (Greek, Aboriginal Australian, Native American, Polynesian, Andean, African, Near Eastern + the deep-time "oldest story" debate) saved to disk and extracted into donto as evidence-anchored claims via the `donto-agent` GLM lane. Covers the engineering (opencode 0-facts → donto-agent; **chunking ~doubles fact density**; the 429-throttle + pacing; the 0-fact-≠-done fix) and the substrate findings (~11k claims, ~3k invented predicates, the pursuit motif at **246** statements, the lost-Pleiad puzzle at **114** statements, agricultural-calendar convergence, identity-as-hypothesis) |
| **[The 48-hour build, read closely](/reports/capability-vs-exercise)** | A verification pass over the change report: the 48-hour work is **real and tested** (22 migrations `0156–0177`, each a SQL capability + invariants test), but **almost entirely unexercised** — governance/horizon tables hold **1–16 demo rows** against a **~41.5M-statement** store (analogy 1, tournament 2, reputation 4; only `rule_agenda` at 12.7k is under real load). Reads it together with the LoCoMo results to name the selection function: **answer-shaping facts convert (valid_time +45%, standing, closure, aggregates), retrieval doesn't** — so the benchmark is what makes the skeleton load-bearing, and the **aggregate claim** is the next thing to build |
| **[Past 48 Hours Change Report](/reports/past-48-hours-change-report)** | Full audit of the committed work from 2026-06-10 10:14 UTC to 2026-06-12 10:14 UTC across `donto` and `donto-web`: substrate execution-plan waves, public docs, reports, benchmark analysis, route fixes, why each class of change was made, and what should come next |
| **[Future Substrate Report](/reports/future-substrate)** | What donto is still missing to become a century-scale knowledge and memory substrate for AI systems: durable identity, deeper source objects, semantic stability, memory lifecycle, governance, cryptographic trust, federation, multimodal evidence, scale, and model integration |
| **[Single-shot vs agentic, one-call vs multi-scope](/reports/single-shot-vs-agentic-memory)** | What the benchmarks measure, corrected from Zep's page: their **94.7%** is single-*read* with **multi-scope** retrieval (5 composed searches + rerank, 5,760 tok); the plain single-call **auto-search is 86.5%** (2,680 tok). Two axes — retrieval composition vs reader agency — and which Zep number is donto's fair target |
| **[The shape of donto's return](/reports/donto-memory-return-shape)** | The full schema of the memory bundle donto hands an agent — every field (`subject·predicate·object`, the temporal `text` tag, `source`), where it comes from in the bitemporal claim graph, and the measured reason each one earned its tokens. Substrate is maximal; the return is minimal |
| **[LoCoMo Config C — claims-only recall](/reports/locomo-claims-only-config-c)** | Throwing away the dialogue and answering from the claim layer alone. Clean result: context collapses **~11×** (19,029→1,690 tok) but so does accuracy — **0.244 vs episodic 0.837**. An honest negative: the bottleneck is claim *recall* (worst on multi-hop), not claim existence. Next: semantic claim recall |
| **[Are we using Hyades effectively?](/reports/hyades-extraction-effectiveness)** | Review of how `donto-agent` drives the Hyades gateway for claim extraction; the token-budget finding from tuning GLM (40→520 facts/chunk); a model bake-off (including the streaming-mode 524 fix); and an experiment plan |

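The sheaf report's "localized disagreement norm" can be made concrete with a toy example. The sketch below is not donto code: the node names, stalk vectors, and restriction maps are invented for illustration. It assumes the standard cellular-sheaf setup, where each edge carries two restriction maps into a shared edge space and the per-edge disagreement is `||F_u x_u - F_v x_v||^2`. The sum over edges equals the sheaf Laplacian quadratic form, and it is zero exactly when the local views glue into a global section (an element of `H⁰`).

```python
# Toy cellular sheaf on a triangle graph. Pure Python, no dependencies.
# All names and values are hypothetical; nothing here is taken from donto.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def edge_disagreement(stalks, edges):
    """Return {(u, v): ||F_u x_u - F_v x_v||^2} for every edge."""
    out = {}
    for (u, v), (f_u, f_v) in edges.items():
        pu = matvec(f_u, stalks[u])  # u's view, carried into the edge space
        pv = matvec(f_v, stalks[v])  # v's view, carried into the edge space
        out[(u, v)] = sum((a - b) ** 2 for a, b in zip(pu, pv))
    return out

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
SWAP = [[0.0, 1.0], [1.0, 0.0]]  # an alignment map that exchanges coordinates

# One 2-d stalk vector per node (think: one source's local view).
stalks = {
    "a": [1.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 1.0],
}

# Each edge holds (map for first endpoint, map for second endpoint).
edges = {
    ("a", "b"): (IDENTITY, IDENTITY),  # a and b already agree
    ("b", "c"): (IDENTITY, SWAP),      # the alignment map reconciles b and c
    ("a", "c"): (IDENTITY, IDENTITY),  # no alignment, so a and c disagree
}

per_edge = edge_disagreement(stalks, edges)
total = sum(per_edge.values())  # the quadratic form x^T L x, with L = delta^T delta

for edge, energy in per_edge.items():
    print(edge, energy)
print("total disagreement:", total)
```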
_See also: [memory benchmarks status board](/benchmarks) · [open research questions](/questions) · [comparison vs the field](/comparison)._
