# Alignment debt: an information-economics of emit-free knowledge

_donto papers · original research · 2026-06-13 · theory + measurement, working draft v1_

**Abstract.** Sixty years of data engineering taught one reflex: *normalize at write*. Decide the schema, canonicalize each entity and predicate, and reject or fold anything off-schema *before* it lands — because every malformed row is a query-time tax paid forever. That reflex is correct precisely when generation is expensive: if a human (or a costly pipeline) produces each fact, you can afford to make them produce it *in the right shape*. We argue that when the per-fact generation cost collapses toward zero — the regime a guided frontier LLM now puts every knowledge base in — **the optimum inverts**: normalize-at-write becomes strictly worse than *emit-free, align-at-read*. We formalize the trade with one quantity, **alignment debt** — the query-time cost of folding freely-minted predicates and entities back together — and show that *carrying* the debt and *repaying it lazily* (only on the predicates a query actually touches) dominates *preventing* it at ingest whenever the read/write ratio and the generation-cost ratio fall on the cheap side of a crossover we derive. donto is the only place this can be measured, because it is the largest live emit-free store we know of with a working query-time folding engine: **1,160,474 distinct live predicates** over ~42M statements, folded on demand through `donto_predicate_closure` (10,024 alignment edges) by the `donto_match_aligned` read function. We give the model, ground its parameters in measured numbers (a real fold on the predicate `bornIn` recovers +8.7% of statements at a small, noisy read-latency premium), state the crossover, and make the falsifiable prediction the model forces: **abundance does not merely permit deferral — it *raises the optimal amount of it*.** We are honest about the candor tiers: the store parameters and the single fold measurement are solid; the inversion *theorem* is conjectured, the read-cost premium is directionally real but too small and noisy at this scale to pin to a clean factor, and the full debt curve is to-be-measured.

---

## 1. The reflex, and when it stops being right

A schema discipline is a choice about *when* you pay the cost of agreement. There are two pure strategies:

- **Normalize-at-write (NW).** Before a fact lands, force it onto a canonical predicate/entity. `assassinatedBy` becomes `killedBy`; `currentJob`, `occupation`, and `worksAs` become one. Reads are then cheap and exact — a single predicate match returns everything. The cost is paid *per fact, at ingest, by whoever generates it*, plus the irreversible loss of every variant the canonicalizer collapsed (an I3 violation by construction: you cannot later recover the distinction `assassinatedBy` drew that `killedBy` did not).
- **Align-at-read (AR).** Let the generator mint predicates and entities freely; store them verbatim; reconcile *at query time* by folding aligned predicates together when — and only when — a query reaches them. Reads cost more (you traverse a fold relation); writes are free and lossless.

The entire history of database normalization assumed NW is obviously right. It *was* right, for a reason that has nothing to do with schemas and everything to do with economics: **the scarce, expensive step was generating the fact.** If a genealogist spends an hour finding one birth record, spending thirty more seconds to record it on the canonical predicate is a rounding error, and the canonical form pays back over thousands of future reads. NW wins because write cost is already enormous, so the marginal normalization cost is negligible and the read savings compound.

That assumption is now false. A guided frontier model emits an essentially unbounded, multi-directional space of typed claims about any entity — *inventing the predicates as it goes* — for on the order of $0.0001 each (donto's founding observation; see the [research agenda](/papers/research-agenda) and the abundance grounding in [/reports](/reports)). When generation is nearly free, the write side of the NW ledger flips: forcing the generator to canonicalize is no longer a rounding error on an expensive write — it is *the dominant cost of the write*, and it is a cost that destroys information (the variant). The question is no longer "can we afford to normalize?" but "can we afford *not* to defer?" This paper answers it quantitatively.

---

## 2. Alignment debt, defined

> **Definition (alignment debt).** The *alignment debt* of an emit-free store is the total query-time cost required to fold its freely-minted predicates and entities into the equivalence classes a NW store would have created at ingest — paid incrementally, only on the predicates a query actually touches.

Debt is the right metaphor and not a loose one. NW *prepays* the agreement cost in full, at ingest, on every fact, whether or not it is ever read. AR *borrows* against future reads: it lands the fact for free and owes the folding cost, but only ever pays the installments that come due — the predicates an actual query reaches. Like financial debt, it has a principal (the total folding work that *would* be needed to reconcile the whole store), an interest rate (how much each read costs because the store is un-normalized), and — crucially — **you only ever service the part you draw on.** The vast tail of predicates that are minted once and never queried is debt that is never repaid because it never comes due. NW pays for that tail anyway.

Three quantities parameterize the trade:

| symbol | meaning | how donto fixes it |
|---|---|---|
| `g` | generation cost per fact (incl. the marginal cost of canonicalizing at write) | LLM token cost + (for NW) the normalize step |
| `r` | read/write ratio — expected number of times a fact participates in a query | workload-dependent |
| `φ` | fold cost per read — the extra read latency to traverse the alignment relation | **measured below** |
| `δ` | fold density — fraction of *queried* predicates that actually have a fold target | **measured below** |

NW's amortized cost per fact is roughly `g + Δ_norm + r·c_read` where `Δ_norm` is the per-write normalization surcharge and `c_read` is the cheap exact read. AR's is `g + r·(c_read + δ·φ)` — no `Δ_norm`, but each of the `r` reads pays the fold installment `δ·φ` *only on the fraction `δ` that comes due.* AR wins when

```
Δ_norm  >  r · δ · φ
```

i.e. when the prepaid normalization surcharge exceeds the total lazily-repaid debt. The inversion this paper claims is what happens to this inequality as `g → 0`: **`Δ_norm` does not go to zero with `g`.** Normalizing is a *fixed* semantic act — deciding `assassinatedBy ≡ killedBy` costs the same whether the fact underneath was expensive or free. So as generation cheapens, the firehose widens, the number of facts `N` explodes, and `Δ_norm · N` (the total prepaid bill) grows *without bound* while `r·δ·φ` per fact stays flat and `δ` *shrinks* (most of the explosion is rarely-queried tail). The inequality tips to AR and stays there. NW's prepayment becomes a tax on an ever-larger tail of facts nobody reads.

This is the inversion: **NW is optimal in the scarce-generation regime and strictly dominated in the abundant-generation regime.** The crossover is not a tuning preference — it is forced by `g`.

---

## 3. Grounding the parameters in the live store

donto is running exactly the AR strategy at scale, so its tables *are* the experiment. Measured 2026-06-13 against `donto-pg`:

| quantity | value | source |
|---|---|---|
| distinct live predicates | **1,160,474** | `count(distinct predicate) … upper(tx_time) is null` over `donto_statement` |
| registered predicates | 1,035,745 | `donto_predicate` |
| predicate embeddings (fold candidates) | 1,035,744 | `donto_predicate_embedding` |
| closure rows | 1,045,762 | `donto_predicate_closure` |
| — of which *self* (identity) | 1,035,738 | `relation='self'` |
| — of which **fold edges** | **10,024** | `relation<>'self'` |
| fold edges by kind | 8,438 close_match · 1,412 exact · 132 inverse · 30 sub-property · 12 decomposition | `group by relation` |
| predicates with ≥1 fold target | 3,874 | distinct `predicate_iri`, non-self |
| mean fold fan-out (among folded) | 2.59 (max 85) | per-predicate edge count |
| live statements | ~42.0M | `pg_class.reltuples` (estimate) |
| `donto_argument` edges | 2,525 | `donto_argument` |
| `donto_identity_edge` edges | 169 | `donto_identity_edge` |

Two things in this table *are the paper's data.* First, the firehose is real and growing: **1.16M distinct predicates** — the agenda's "~938K" figure is now stale; abundance kept emitting. Second, **fold density `δ` is tiny in aggregate**: only 3,874 of 1.16M predicates (~0.33%) currently have any fold target at all. That is not a bug — it is the central economic fact. **The overwhelming majority of minted predicates are never folded because they are never queried**; the debt on them is never drawn down. A NW store would have paid `Δ_norm` to canonicalize all 1.16M; donto has paid the fold cost on only the 3,874 that came due.

### 3.1 A measured fold cost

To put a number on `φ`, we timed the real read path on a predicate that *does* have folds — `bornIn`, which `donto_predicate_closure` aligns to 69 non-self equivalents (genealogy mints `bornIn`, `birthPlace`, `placeOfBirth`, … freely):

| read | result | latency (5 fresh connections) |
|---|---|---|
| fold OFF (`p_expand_predicates=false`) | 1,407 statements | 25 / 25 / 31 / 40 / 30 ms (median ~30) |
| **fold ON** (`donto_match_aligned`, default 0.8 confidence floor) | **1,530 statements** | 38 / 62 / 74 / 36 / 32 ms (median ~38) |

The recall side is clean and reproducible: the fold **recovers +123 statements, a +8.7% lift** that a fold-off read (or a NW store that had collapsed these at ingest into a *different* canonical choice) would either miss or mis-merge. The *cost* side is the honest part. At this scale the fold premium is real but **small and noisy** — fold-on ran roughly **1.2–2.5×** fold-off across fresh connections, with both arms swamped by per-connection and cache effects in the tens-of-milliseconds band (a warm in-process run even saw fold-on *beat* fold-off, 13 ms vs 19 ms, which is pure cache noise). The defensible statement is therefore directional, not a factor: *folding adds a small constant per read, on the order of tens of milliseconds for a hub predicate, paid only on the predicates a query touches.* We deliberately do **not** report a single "Nx" cost multiplier — it would be measuring jitter. This is one installment of alignment debt, paid in milliseconds, *only because a query asked for `bornIn`.* The other ~1.16M predicates cost nothing today.

The mechanism (`donto_match_aligned`, migrations 0052 + 0159; folding via `donto_predicate_closure`) is a single index join: for the queried predicate, ride the closure to its `equivalent_iri` set (confidence ≥ floor), union the matching statements, swapping subject/object on rows flagged `swap_direction` (the `inverse_equivalent` edges). The debt is repaid *exactly* at the granularity of the read — no whole-store normalization ever runs. This is the AR strategy made operational, and it is why only donto can produce these numbers: it has the freely-minted store **and** the lazy folding engine.

---

## 4. Why only donto can run this experiment

The experiment needs three things at once, and they are donto's three invariants:

1. **A freely-minted store (emit-free ingest).** You cannot measure the cost of *not* normalizing unless you actually didn't. donto's 1.16M distinct predicates are real, not a synthetic stress test — they were minted by extraction lanes told to invent predicates freely (`extract_broad.txt`).
2. **No destructive overwrite (I3).** AR is only *recoverable* if the variants survive. A NW store that folded `assassinatedBy` into `killedBy` at ingest has destroyed the distinction; donto keeps both predicates as live statements and the fold as a *separate, revisable* edge in `donto_predicate_closure`. The debt can be *re-amortized* when alignment improves — you cannot re-amortize a merge you already committed.
3. **Query-time alignment as a first-class engine.** The fold must be a read-path operator (`donto_match_aligned`), not a batch job, or "lazy repayment" is a fiction. donto's closure + the continuous alignment daemon that maintains it are exactly this.

A vector DB cannot run it (it collapses on dedup — no variants survive, no fold edges exist). A normalized KG cannot run it (it *is* the NW arm — there is nothing to defer). Only a bitemporal, paraconsistent, alignment-at-read substrate has both arms of the experiment live at once. **The inversion is donto's to measure because donto is the only store sitting on the abundant-generation side of the crossover with the instrumentation to read the debt.**

---

## 5. The prediction the model forces: abundance raises optimal deferral

The standard intuition is that deferral is a *concession* — "we'd normalize if we could afford the write-time engineering; deferring is the pragmatic fallback." The model says the opposite, and sharply:

> **Prediction.** As generation cost `g` falls, the optimal fraction of reconciliation deferred to read time *increases monotonically.* Cheaper generation widens the firehose (more facts `N`, more distinct predicates), which (a) raises the total prepaid NW bill `Δ_norm·N` linearly while (b) *lowering* the aggregate fold density `δ` (the new mass is disproportionately rarely-queried tail). Both effects push the crossover `Δ_norm > r·δ·φ` further into AR territory. **Abundance is not a reason to defer reluctantly; it is the reason deferral becomes optimal.**

The live store is consistent with this in its shape: the predicate count grew (938K → 1.16M) while the *fraction* of predicates ever folded sits at ~0.33% — exactly the "explosion is mostly cold tail" signature the prediction names. But this is a *snapshot consistent with* the prediction, not a *measurement of* it. The decisive experiment is in §7.

---

## 6. The honest failure modes

The model is new; here is where it can break, named explicitly in the spirit of the seed papers.

- **`δ` is not query-independent.** We measured aggregate fold density (~0.33%), but the installment that matters is `δ` *over the predicates a real workload reaches*, which is almost certainly higher than the aggregate (queries cluster on common predicates like `bornIn`, `name`, `parentOf` — precisely the ones with folds: `foaf:name`↔`ex:name` 31 edges each, `ex:parentOf` 41). If workloads hit folded predicates far more than the 0.33% base rate, the per-read debt `δ·φ` is larger than the aggregate suggests and the crossover moves back toward NW. **This is the most likely way the headline over-claims, and it is measurable** — instrument `δ` on the actual recall workload, not the store-wide average.
- **`φ` is not constant.** The `bornIn` fold (69 equivalents) cost a small per-read premium; a predicate with 85 equivalents (the max fan-out observed) costs more, and a fold that fans into high-cardinality statement sets costs more still. `φ` plausibly scales with fan-out × matched-statement count, so a workload dominated by hub predicates would pay a growing (perhaps super-linear) debt. The model's flat-`φ` assumption is a simplification; the real curve is `φ(fan-out)` and we have measured only one point on it.
- **Read amplification can dominate `Δ_norm` if `r` is huge.** A write-once/read-billions workload (a static reference KB queried by everyone) is exactly the regime NW was built for, and the inequality says so: large `r` makes `r·δ·φ` win. **The inversion is a claim about *abundant-write* workloads, not all workloads.** A genealogy/agent-memory store that ingests a firehose and reads a modest multiple is on the AR side; a frozen encyclopedia read forever is not. We do not claim AR dominates universally — we claim it dominates *in the abundant-generation regime donto targets*, and the crossover is the honest boundary.
- **Alignment quality bounds the recovered value.** The +8.7% `bornIn` recovery is only worth paying for if the folds are *correct*. A wrong fold (a false `close_match`) repays debt with counterfeit currency — it returns statements that should not match, hurting precision. The confidence floor (0.8) and the separation of `close_match` from `exact_equivalent` are the guardrail, but the debt model assumes folds are net-positive; an under-tuned alignment engine could make AR *worse than* a clean NW store on precision even while winning on cost. The economics and the [decoupled-anchoring](/papers/answer-shaping) faithfulness story are coupled here.

---

## 7. The experiment that would establish it

The §5 prediction becomes a result with one head-to-head, run on donto's own corpus:

1. **Build both arms from the same firehose.** Arm A = the live emit-free store (already exists). Arm B = a NW projection: take the *same* extracted facts and canonicalize predicates/entities at ingest using the *same* alignment signals (embed + closure) applied *forward* instead of at read. Measure `Δ_norm` (the ingest surcharge) directly.
2. **Sweep `g`.** Re-run extraction at several generation budgets (saturation passes → more facts per source = lower effective `g` per fact), producing stores of increasing `N` and predicate count.
3. **Run a fixed recall workload against both arms** (the LongMemEval / LoCoMo / genealogy recall harnesses already exist), measuring: total cost = ingest cost (A: 0 normalize; B: `Δ_norm·N`) + read cost (A: `r·δ·φ` measured per query; B: `r·c_read`), and recall/precision (does AR's fold recover the +8.7%-class statements without precision loss?).
4. **Plot the crossover.** Total cost(AR) − total cost(NW) as a function of `g` and `r`. The prediction is a single sign change whose location moves toward AR as `g` falls — and the abundance-raises-deferral claim is the *slope* of that crossover in `g`.

The pieces all exist (the store, the closure, `donto_match_aligned`, the recall harnesses, the alignment daemon). What is missing is the NW projection arm (Arm B) and the disciplined sweep. That is a bounded build, not research risk — which is why this paper sits in the agenda's "close to a result" tier rather than "drafted."

---

## 8. Status: proven / measured / conjectured / speculative

- **Measured (solid).** The store parameters (1,160,474 distinct predicates, 10,024 fold edges, 3,874 folded predicates, ~0.33% aggregate fold density, mean fan-out 2.59 / max 85); the **recall** side of the `bornIn` fold (+123 statements, +8.7%, reproducible). These are live `donto-pg` reads, reproducible with the queries in §3.
- **Measured-but-noisy (directional only).** The fold's *read-cost* premium. Fold-on ran ~1.2–2.5× fold-off across fresh connections in a tens-of-milliseconds band dominated by cache/connection jitter; we report it as "a small constant per read" and explicitly decline to quote a single multiplier (§3.1). It is one point on the `φ(fan-out)` curve, not the curve.
- **Defined (solid).** Alignment debt; the principal/interest/installment decomposition; the crossover inequality `Δ_norm > r·δ·φ`; the four parameters. The model follows from the cost accounting and is internally consistent.
- **Conjectured (open).** The inversion theorem — that the crossover *necessarily* tips to AR as `g→0` — depends on `Δ_norm` being floored away from zero (semantic normalization is generation-cost-independent) and on `δ` falling with `N`. Both are plausible and snapshot-consistent but unproven; §7 is the test, and §6 names the conditions (huge `r`, query-concentrated `δ`, super-linear `φ`) under which it fails.
- **Speculative (flagged).** That fold density `δ` and the debt-repayment curve are themselves a *health metric* of an emit-free store — that tracking "what fraction of read predicates came due, and what did repayment cost" is a better instrument than predicate-count alarmism. This connects directly to [generative-abundance metrics](/papers/research-agenda) (#9); we propose it, we do not yet measure it as a health signal.

The contribution is not a benchmark number. It is a **falsifiable economic law**: in the regime where generating a fact is nearly free, the sixty-year reflex to normalize-at-write inverts, and the right discipline is to emit free, carry alignment debt, and repay it lazily at read — with a crossover that abundance pushes, not reluctantly tolerates.

---

_See also: the contradiction-measurement theory in [The Bitemporal Sheaf](/papers/bitemporal-sheaf); the empirical companion [Answer-Shaping](/papers/answer-shaping); the proliferation-as-signal sibling and the full program in the [donto research agenda](/papers/research-agenda); abundance grounding in [/reports](/reports)._