Recruiting · 1-3 design partners · 2027-Q3 launch target

Design Partners

Last updated

Calibration Ledger is recruiting 1 to 3 design partners ahead of the 2027-Q3 public launch. A signed Letter of Intent from one named organisation is the third of four prerequisites in the project’s public roadmap; without it, launch does not happen.

#Who we’re looking for

Three operator profiles fit. If your organisation matches one of these, we want a 30-minute conversation.

  1. AI labs running internal evaluation pipelines — OpenAI, Anthropic, Google DeepMind safety teams, Meta FAIR, plus open-weight labs (Mistral, Cohere, Together AI, etc.). If you measure model factuality + calibration on internal benchmarks, Calibration Ledger gives you a public-facing third-party reference point that lets you say “our hallucination rate per Brier-bucket is X” with a standard methodology you didn’t author.
  2. EU regulators implementing AI Act Article 50 — DG-CNECT, AI Office, national competent authorities. Article 50 transparency obligations apply to general-purpose AI models from August 2026. Calibrated accuracy scores per model are exactly the kind of third-party-verifiable signal Article 50 anticipates.
  3. Academic institutions running forecasting platforms — Good Judgment Project (Tetlock’s group), Metaculus advisory board, FutureFix lab, university economics-of-prediction researchers, replication-crisis projects (Center for Open Science, BITSS, etc.). For these institutions Calibration Ledger is a downstream consumer of your data + a co-publisher of methodology refinements.

#What design-partnership means

Concretely:

  • Methodology review window (8 weeks) — your team reads the v1.x methodology + comments on what would block adoption in your workflow. Comments incorporated into v1.x → v2.0 with attribution.
  • Scoring-pilot window (12 weeks) — Calibration Ledger scores 5-10 predictive sources of mutual interest using public + jointly-licensed data. Outputs reviewed by both parties before any are made public.
  • Co-citation — your organisation is acknowledged as Design Partner in the v2.0 methodology paper + on this page (subject to your approval).
  • No exclusivity, no fee, no IP transfer — Calibration Ledger methodology stays CC-BY-4.0; the design partnership is mutual signal, not contractual lock-in.

#Why this is the right window

EU AI Act Article 50 enforcement begins August 2026 for general-purpose AI providers and from 2027 for downstream deployers. Third-party calibration scoring becomes regulator-citable infrastructure for any organisation needing to demonstrate transparency-obligation compliance.

Forecasting platforms (Metaculus, Good Judgment Open, prediction markets) face the same third-party-verification problem. Calibration Ledger’s public methodology + the operator’s own dated track record at /track-record/ provide a meaningful baseline that platforms can point at without conflict-of-interest.

#What is already running (Beta)

Before any design-partnership conversation, Calibration Ledger has already shipped the methodology applied to live data:

  • Methodology v1.1 — Brier 1950 + Murphy 1973 + append-only time-stamping with foundational citations + a JSON-LD twin carrying SHA-256 supply-chain integrity.
  • Operator track record — 9 dated probabilistic forecasts spanning 5 domains (geopolitics, AI benchmarks, markets, technology, weather), scored automatically on resolution by lib/brier.ts (math-verified via 9-test suite, Murphy identity holds within 1e-10). Machine-readable feed at /api/forecasts.json.
  • Beta — 10 cited findings across 5 of 6 source classes (human forecasters, aggregator platforms, prediction markets, AI models, analyst firms, scientific papers). Each finding has its own deep-link page with full context, Schema.org ScholarlyArticle markup, citation URL, and Phase 1 recomputation roadmap. Sources include Mellers 2015 (J. Exp. Psych. Applied), OSC 2015 (Science), Camerer 2018 (Nature Human Behaviour), Hausfather 2020 (Geophys. Res. Lett.), OpenAI 2023 GPT-4 Technical Report, Kadavath 2022 (Anthropic), Bradshaw 2011, Philadelphia Fed SPF, Manifold Markets, Metaculus.

A design-partnership conversation begins from these working artefacts, not a slide deck. The methodology review window (8 weeks) is reading what is already shipped + commenting on what would block adoption in your workflow.

#How to start a conversation

Send a brief email to contact@calibrationledger.com with subject line Design partner inquiry — [organisation]. The body should cover:

  • • Your role + organisation
  • • Which of the three profiles above fits, or a fourth that we should consider
  • • One concrete predictive source or workflow where calibrated scoring would unblock a decision you currently make on intuition
  • • Whether your organisation has any prior interaction with the operator (/about/), the operator’s own dated forecast log (/track-record/), or this site

First response within 5 business days. If we both agree there’s a fit, the next step is a 30-minute video call + the methodology review window described above.

  • Methodology v1.1 — full scoring framework with citations
  • Roadmap — public dated milestones + 2027-Q4 kill criterion
  • About — operator background + EU AI Act context
  • CONCEPT.md — full positioning document