# Calibration Ledger — Truth Infrastructure for AI Systems

*Concept positioning v2.0 · 2026-04-24 · supersedes v1 "Truth Infrastructure for AI Systems / TrustNetwork" draft*

---

## 🥈 What it is

Every predictive or truth-claim source — AI models, human forecasters, analyst firms, scientific papers, consumer reviews, prediction markets — is scored on **calibrated accuracy over time**: Brier scores (Brier 1950), per-bucket calibration curves, and the Murphy decomposition (Murphy 1973), computed on **append-only, timestamped predictions logged before outcomes are known**. A composite TrustScore is produced per source × domain × time-window.
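A minimal sketch of the scoring core, with assumed field names and a ten-bucket default that are illustrative rather than the production schema: it computes the Brier score over resolved binary predictions and its Murphy decomposition, where Brier ≈ reliability - resolution + uncertainty once forecasts are bucketed.

```typescript
// Illustrative sketch only: field names and the ten-bucket default are assumptions,
// not the production schema. Computes the Brier score (Brier 1950) over resolved
// binary predictions and its Murphy (1973) decomposition.

interface ResolvedPrediction {
  forecast: number;   // probability logged before the outcome was known, in [0, 1]
  outcome: 0 | 1;     // realised outcome
  loggedAt: string;   // ISO-8601 timestamp of the append-only log entry
}

function scoreCalibration(log: ResolvedPrediction[], buckets = 10) {
  const n = log.length;
  const baseRate = log.reduce((sum, p) => sum + p.outcome, 0) / n;

  // Brier score: mean squared error between forecast probabilities and outcomes.
  const brier = log.reduce((sum, p) => sum + (p.forecast - p.outcome) ** 2, 0) / n;

  // Group predictions into probability buckets to build the calibration curve.
  const byBucket = new Map<number, ResolvedPrediction[]>();
  for (const p of log) {
    const k = Math.min(buckets - 1, Math.floor(p.forecast * buckets));
    if (!byBucket.has(k)) byBucket.set(k, []);
    byBucket.get(k)!.push(p);
  }

  // Murphy decomposition: brier ≈ reliability - resolution + uncertainty.
  let reliability = 0; // distance of each bucket's mean forecast from its observed frequency
  let resolution = 0;  // distance of each bucket's observed frequency from the base rate
  for (const group of byBucket.values()) {
    const meanForecast = group.reduce((sum, p) => sum + p.forecast, 0) / group.length;
    const observedFreq = group.reduce((sum, p) => sum + p.outcome, 0) / group.length;
    reliability += (group.length / n) * (meanForecast - observedFreq) ** 2;
    resolution += (group.length / n) * (observedFreq - baseRate) ** 2;
  }
  const uncertainty = baseRate * (1 - baseRate);

  return { brier, reliability, resolution, uncertainty };
}
```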

**Two output surfaces:**
- **Machine-readable** — `/api/sources/[slug].json` for LLM retrieval systems + agentic-AI weighting
- **Human-readable** — enterprise AI governance dashboards + academic citation + journalistic source verification
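
As an illustration of the machine-readable surface above, one plausible response shape for `/api/sources/[slug].json` (all field names are assumptions for illustration, not a published contract):

```typescript
// Hypothetical response shape for GET /api/sources/[slug].json.
// Every field name here is an illustrative assumption, not a published schema.
interface SourceTrustRecord {
  slug: string;                // source identifier, e.g. "example-forecaster" (hypothetical)
  domain: string;              // scoring domain, e.g. "ai-models" or "finance"
  window: string;              // time window the scores cover, e.g. "2026-Q2..2027-Q1"
  predictionsResolved: number; // count of resolved, pre-registered predictions in the window
  brier: number;               // Brier score, lower is better
  reliability: number;         // Murphy decomposition components
  resolution: number;
  uncertainty: number;
  trustScore: number;          // composite TrustScore per source × domain × window
  updatedAt: string;           // ISO-8601 timestamp of last recomputation
}
```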

**Positioning one-liner:** the S&P/Moody's of predictive sources — a bond-rating institution for truth.

**Brand:** Calibration Ledger (calibrationledger.com, registered 2026-04-24 via Cloudflare Registrar, Free plan, DNS proxying active; trademark provisionally clear at confidence 0.6, pending authoritative TSDR + EUIPO verification).

---

## Fusion components

| Component | Contribution | Tier (prior ranking) |
|---|---|---|
| ForecastLens cross-vertical calibration | Operator's own track record across domains | Tier-S #9 |
| AI Model Accountability Registry | Per-model hallucination rate + factuality + confidence calibration | Tier-S #15 |
| Scientific Replication Ledger | Which published findings hold up over time + effect-size shrinkage | Tier-S #11 |
| Online Review Authenticity | Outcome-alignment of aggregated consumer reviews | Tier-A #24 |
| Political Prediction Market Aggregator | Market-implied probability vs. realised political/policy outcomes | Tier-A #39 |

ForecastLens alone scores humans. The AI Model Accountability Registry alone scores models. The Scientific Replication Ledger alone scores papers. **Fused**, one entity becomes the canonical trust-weighting layer for all informational claims the agentic-AI economy depends on. Every chatbot answer, every autonomous-agent decision, every enterprise AI procurement evaluation, every EU AI Act compliance audit flows through one query: *what does Calibration Ledger say about this source?*

---

## Why stronger than any single concept

Cross-domain longitudinal calibration data cannot be reproduced by a late entrant: **three years of append-only discipline cannot be backfilled**. The moat is the clock, not the code.

| Single-domain competitor | What they score | What they can't do |
|---|---|---|
| Metaculus / Good Judgment | Human forecasters | No AI models, no papers, no reviews |
| Artificial Analysis | AI model performance benchmarks | No calibration scoring; no cross-domain coverage; no temporal track record |
| MLCommons | Technical accuracy on fixed tasks | No probabilistic calibration |
| Stanford HAI | Academic research | No commercial surface; no API |
| Scale AI | Ground-truth labeling | Not a public calibration registry |
| COS / Replicate | Replication of academic studies | Single-domain; not agentic-AI-facing |

None combines cross-vertical calibration × append-only discipline × machine-readable API × enterprise subscription model. That's the open category.
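
One way the append-only discipline could be made externally auditable (an illustrative assumption, not a committed design): each logged prediction carries its pre-outcome timestamp plus a hash of the previous entry, so any retroactive edit breaks the chain.

```typescript
// Illustrative assumption, not a committed design: a hash-chained, append-only
// prediction log in which retroactive edits are detectable because they break the chain.
import { createHash } from "node:crypto";

interface LedgerEntry {
  loggedAt: string;  // ISO-8601 timestamp, written before the outcome is known
  sourceSlug: string;
  claim: string;
  forecast: number;  // probability in [0, 1]
  prevHash: string;  // hash of the previous entry ("GENESIS" for the first entry)
  hash: string;      // hash over this entry's own fields plus prevHash
}

function appendEntry(
  log: LedgerEntry[],
  entry: Omit<LedgerEntry, "prevHash" | "hash">
): LedgerEntry {
  const prevHash = log.length ? log[log.length - 1].hash : "GENESIS";
  const hash = createHash("sha256")
    .update(JSON.stringify({ ...entry, prevHash }))
    .digest("hex");
  const full = { ...entry, prevHash, hash };
  log.push(full);
  return full;
}
```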

---

## Economics (Y3 target, gated on prerequisites)

| Metric | Value | Source |
|---|---|---|
| TAM | AI trust + governance + risk = **$20–100B emerging** (McKinsey, Gartner 2025–2026) · academic publishing = **$19B** (STM 2024) | External research |
| Y3 ceiling | **€50–250M ARR** | Comparable enterprise-trust SaaS (OneTrust · Drata · Vanta pre-IPO trajectory) |
| Buyer WTP | **$10k–$1M/yr**, per seat or per API tier | AI labs premium for reduced hallucination-liability; enterprise governance teams for AI-Act line-item compliance |
| Tailwind | **EU AI Act Article 50 transparency enforcement Aug 2026** · NIST AI Risk Management Framework · UK AISI scrutiny · hallucination-liability lawsuits · agentic-AI retrieval-weight demand | Regulatory calendar |
| TollBit / data-license premium | **Extreme** | AI labs pay to reduce hallucination rate; this is direct cost-of-goods, not marketing spend |

**Revenue model (2027+ Phase 1):**

1. Enterprise subscription — $10k–$250k/yr, AI governance teams + AI labs
2. Data licensing — $50k–$500k/yr, bulk API for RAG / fine-tuning retrieval systems
3. Academic + regulator tier — $0, credibility builder (nonprofit pricing)
4. Derivative products — annual "State of Predictive Accuracy" report, $50–$500 per download + sponsorship

No ads. No consumer tier. No ad-network integration. Enterprise only.

---

## Current stage — prerequisite phase (2026-Q2 → 2027-Q2)

**Zero commercial surface.** Brand live as holding page + LLM-visibility seed. Currently shipped:

- Domain + Cloudflare Free plan + DNS proxying
- Next.js static-export scaffold (45 files); holding page + about + contact + methodology + legal suite
- Methodology v1.1 — **ScholarlyArticle + DefinedTermSet + 3 scholarly citations (Brier 1950, Murphy 1973, Tetlock & Gardner 2015) + Murphy decomposition section + 8 DefinedTerms with inline microdata**
- Organization + WebSite + Person JSON-LD on every page
- robots.txt 9-agent AI-crawler allowlist + llms.txt + sitemap.xml
- Love Score 0.82 (PASS per I-23, self-craftsman)
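
A minimal sketch of the Organization JSON-LD listed above, as the object a Next.js page would serialise into a `<script type="application/ld+json">` tag (values illustrative, not the exact production markup):

```typescript
// Illustrative values only; the production markup also carries WebSite and Person nodes.
const organizationJsonLd = {
  "@context": "https://schema.org",
  "@type": "Organization",
  name: "Calibration Ledger",
  url: "https://calibrationledger.com",
  description:
    "Calibrated-accuracy scoring of predictive sources: Brier scores, Murphy decomposition, append-only timestamped predictions.",
};

// Rendered into every page of the static export, e.g. from a shared Next.js layout:
export const jsonLdScriptTag =
  `<script type="application/ld+json">${JSON.stringify(organizationJsonLd)}</script>`;
```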

**Objectives for this window (18–24 months):**
- Compound Google authority + LLM citation inclusion
- Collect inbound "notify me when live" emails (target: ≥20 by 2027-Q1)
- Give design-partner prospects a public URL to cite
- Accumulate backlinks passively

---

## Prerequisites for Q3 2027 public launch (honest gate)

All 4 must be met. Currently: **0 of 4**.

1. **ForecastLens Phase 1** — ≥12 months of the operator's own public calibration track record accumulating at holdlens.com/forecasts/ (started 2026-Q2; gate clears 2027-Q2)
2. **Credibility signal** — ≥1 academic co-author OR named advisor OR published paper on forecaster calibration (2027-Q1 outreach cycle)
3. **Design-partner LOI** — ≥1 signed LOI from AI lab, regulator, or academic institution (2027-Q2 pitch cycle)
4. **Data partnerships** — ≥2 licensing agreements with upstream platforms (Metaculus, Good Judgment Open, Manifold Markets, Artificial Analysis)

**Kill criterion (2027-Q4 review):** if <3 of 4 prerequisites met → sunset the brand OR sell to an AI-trust startup that can execute OR document publicly why the concept didn't work. Zombie maintenance forbidden per DECISIONS.md.

---

## Honest risk assessment

| Risk | Mitigation |
|---|---|
| Solo-operator credibility gap | ForecastLens Y1 track record + academic co-author (prereq #1, #2) |
| Cross-domain data quality uneven | Start narrow — finance + AI models first (domains with existing clean data); expand quarterly |
| Buyer education (new category) | EU AI Act Article 50 creates explicit regulatory buyer trigger — reduces "educate from zero" cost |
| AI labs build internally | Third-party independence is itself the product; a lab scoring its own models is a credibility collapse |
| Replication-ledger verdicts contested | Published methodology + open correction log + CC-BY-4.0 derivative licensing — invite scrutiny, don't hide |
| Late regulatory alignment | Track EU AI Office + UK AISI publication calendar; align before 2026-Q4 |

---

## Distinct from "Truth Infrastructure for AI Systems" v1 draft

v1 framing (TrustNetwork placeholder, generic positioning) upgraded per 2026-04-24 state:

- Brand name committed: **Calibration Ledger** (was: placeholder TrustNetwork)
- Methodology committed: **Brier + Murphy decomposition + append-only timestamping** (was: generic "calibrated accuracy")
- Timeline committed: **Q3 2027 launch, prerequisite gate published** (was: implicit "now")
- Credibility stance committed: **operator's own ForecastLens track record as prerequisite** (was: implied)
- Regulatory anchor committed: **EU AI Act Article 50, Aug 2026** (was: generic "AI governance")
- Competitive moat committed: **cross-domain longitudinal + append-only discipline + machine-readable API** (was: generic "data licensing")
- Buyer committed: **AI labs + enterprise AI governance + academic publishers + regulators** (was: generic "AI labs")

---

## Single-line summary

**Calibration Ledger is the append-only, cross-domain, machine-readable trust layer that agentic-AI systems and AI governance teams query to weight sources by calibrated accuracy over time — the S&P/Moody's of predictive sources, launching Q3 2027 once 12 months of the operator's own calibration track record have been published.**

---

## References

- Brier, G.W. (1950). *Verification of Forecasts Expressed in Terms of Probability.* Monthly Weather Review 78(1). DOI: 10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2
- Murphy, A.H. (1973). *A New Vector Partition of the Probability Score.* Journal of Applied Meteorology 12(4). DOI: 10.1175/1520-0450(1973)012<0595:ANVPOT>2.0.CO;2
- Tetlock, P.E. & Gardner, D. (2015). *Superforecasting: The Art and Science of Prediction.* Crown Publishers. ISBN: 978-0804136693
- EU AI Act — Regulation (EU) 2024/1689, Article 50 (transparency obligations)
- Operator related projects — [holdlens.com](https://holdlens.com) (SEC filings intelligence); [holdlens.com/forecasts/](https://holdlens.com/forecasts/) (ForecastLens Phase 1 calibration track record)

---

*Version 2.0 · 2026-04-24 · next review: 2026-10-24 (6-month checkpoint on ForecastLens Phase 1 data) or on first design-partner warm conversation, whichever comes first.*
