RESIDENCY MATRIX · DATA CLASS × TIER
| Data class | Sovereign | Resident | Open |
|---|---|---|---|
| Identifiable (PHI) · locked | allowed | blocked | blocked |
| Pseudonymised | allowed | via gate | blocked |
| De-identified | allowed | allowed | via gate |
| Non-clinical | allowed | allowed | allowed |
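As a minimal sketch, the matrix above can be read as a lookup keyed by data class and tier. The dictionary keys, the `Decision` enum and the `decide` function are hypothetical names chosen for illustration, not Heimdall's API; the only behaviour assumed beyond the table is that an unrecognised data class or tier is blocked, in line with the fail-closed stance described for the policy layer.

```python
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    VIA_GATE = "via gate"
    BLOCKED = "blocked"


A, G, B = Decision.ALLOWED, Decision.VIA_GATE, Decision.BLOCKED

# Cells copied from the residency matrix; key spellings are illustrative.
RESIDENCY_MATRIX = {
    "identifiable":  {"sovereign": A, "resident": B, "open": B},
    "pseudonymised": {"sovereign": A, "resident": G, "open": B},
    "de-identified": {"sovereign": A, "resident": A, "open": G},
    "non-clinical":  {"sovereign": A, "resident": A, "open": A},
}


def decide(data_class: str, tier: str) -> Decision:
    """Return the matrix cell; anything unrecognised is blocked (fail-closed)."""
    return RESIDENCY_MATRIX.get(data_class, {}).get(tier, Decision.BLOCKED)
```

For example, `decide("pseudonymised", "resident")` yields `Decision.VIA_GATE`, while a misspelt or unknown data class falls through to `Decision.BLOCKED` rather than being let through by default.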
Heimdall is standalone, but native to the 3verest family. Each module does one job on the data path or the control plane. Every state change is audit-logged, every API is idempotent, and every tenant is isolated at every layer.
The single enforcement point. mTLS + OAuth2, streaming, idempotency, prefix & semantic caching. No path to any model except through it.
Verifies what the caller declares: task class against the schema registry, data class against an owned, on-gateway PHI detector.
Compiles human-readable rules into per-request decisions in under a millisecond. Residency matrix, allowlists, budgets, fail-closed.
Resolves each request to the cheapest compliant rung on the depth ladder; manages a pinned, eval-gated model estate.
Bounds the cost and behaviour of every request: context caps, output schemas, agentic depth, hard budgets, queue-not-spend.
The trust boundary for any non-sovereign route. Pseudonymisation and anonymisation grades; the re-ID map never leaves sovereign storage.
One immutable, hash-chained record per request across all pools: the source of truth for billing, simulation and evidence.
The visible control surface. Two skins, one engine. Rules as signable prose; nothing edits live state directly.
Turns ledger rows into four commercial shapes. Budget caps and alerts; invoices reconcile to ledger sums exactly, with zero tolerance.
Compliance as exhaust. EU AI Act and DSPT evidence packs, scheduled or on demand, reproducible from the ledger.
The concierge layer. Weekly drift sampling, rubric scoring, eval-suite runs, escalation playbooks, run by people on a cadence.
HMAC-signed webhooks, email digests and a Studio inbox for policy activations, approvals, escalations, budget thresholds and drift flags.
OEM org and tenant provisioning, region assignment, locked-baseline management, credential lifecycle, read-only support impersonation.
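Among the modules above, the ledger is described as one immutable, hash-chained record per request. A minimal sketch of that general technique follows; the record layout, the SHA-256 choice, the `GENESIS` placeholder and the function names are all assumptions for illustration, not Heimdall's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON (sorted keys) so the same content always hashes the same.
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers its payload and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"prev": prev_hash, "payload": payload,
              "hash": _digest(prev_hash, payload)}
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every hash in order; any edit or reordering breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(record["prev"], record["payload"]):
            return False
        prev_hash = record["hash"]
    return True
```

Because each hash covers the one before it, altering any earlier row invalidates every later one, which is what lets billing and evidence be recomputed from the same rows with some confidence that they have not been edited.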
Every rule is rendered as a sentence a governance lead can sign, and simulated against thirty days of the tenant's own traffic before it activates. Two skins, one engine: an OEM white-label and a customer-direct view, the same tree scoped and role-reduced. Nothing edits live state directly; change moves through draft → simulate → approve → activate, with instant rollback retained.
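The draft → simulate → approve → activate flow with rollback can be pictured as a small state machine. This is a sketch under stated assumptions: `PolicyStore`, the stage labels and the "superseded" and "rolled_back" markers are hypothetical names, and the simulation and approval steps themselves are left out.

```python
# Hypothetical stage labels for draft → simulate → approve → activate.
NEXT_STAGE = {"draft": "simulated", "simulated": "approved", "approved": "active"}


class PolicyStore:
    def __init__(self):
        self.live = None     # version currently enforced, if any
        self.previous = []   # earlier live versions, retained for rollback
        self.stage = {}      # version -> lifecycle stage

    def draft(self, version):
        self.stage[version] = "draft"

    def advance(self, version):
        """Move a version one stage forward; stages cannot be skipped."""
        nxt = NEXT_STAGE.get(self.stage[version])
        if nxt is None:
            raise ValueError(f"{version} cannot advance from {self.stage[version]}")
        if nxt == "active":
            if self.live is not None:
                self.previous.append(self.live)
                self.stage[self.live] = "superseded"
            self.live = version
        self.stage[version] = nxt
        return nxt

    def rollback(self):
        """Restore the most recent earlier live version."""
        if not self.previous:
            raise ValueError("no earlier version to roll back to")
        self.stage[self.live] = "rolled_back"
        self.live = self.previous.pop()
        self.stage[self.live] = "active"
        return self.live
```

The only way a version becomes live here is by passing through every stage in order, which mirrors the claim that nothing edits live state directly.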
AI is sold by the token and bought by the study. A product is priced on a fixed licence or a per-study fee, but the model underneath bills by the token, and token count is a random variable driven by context size, output verbosity, retry loops and agentic depth. Fixed revenue minus an unbounded cost is a margin that erodes silently, request by request. That mismatch is where healthcare AI margins go to die, and it is the reason clinically successful AI features get cut at contract.
How a variable cost becomes a fixed price
Each unit of AI work is a named, versioned class, not an open-ended API call. You can only price what you have named.
Every class carries a token envelope: context cap, output cap, retry ceiling, agentic depth, max calls. The worst case is known in advance.
Budgets per tenant, region and class. A soft alert at 80%, a hard queue at 100%, so the ceiling is set before the bill arrives, not after.
The cheapest compliant model wins. Owned capacity absorbs the routine 80%; frontier inference is reserved for the hard tail that needs it.
The per-study price is derived from real ledger variance, not a guess. The party that can manage the variance is the one that carries it.
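The envelope and budget steps above can be sketched in a few lines. The class and function names are hypothetical, and the worst-case formula rests on two assumptions that the text does not state: each attempt consumes at most the context cap plus the output cap, and `max_calls` already bounds the total number of calls across every agentic step, so agentic depth is recorded but does not multiply the total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenEnvelope:
    context_cap: int    # max input tokens per attempt
    output_cap: int     # max output tokens per attempt
    retry_ceiling: int  # extra attempts allowed per call
    agentic_depth: int  # max nesting of agentic steps (not used in the sum)
    max_calls: int      # assumed total call ceiling across all steps

    def worst_case_tokens(self) -> int:
        """Upper bound on tokens for one unit of work under the assumptions above."""
        per_attempt = self.context_cap + self.output_cap
        attempts_per_call = 1 + self.retry_ceiling
        return per_attempt * attempts_per_call * self.max_calls


def budget_action(spent: float, budget: float) -> str:
    """Soft alert at 80% of budget, hard queue at 100%."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spent / budget
    if ratio >= 1.0:
        return "queue"
    if ratio >= 0.8:
        return "alert"
    return "proceed"
```

With illustrative caps of 8,000 context tokens, 1,000 output tokens, 2 retries and 4 calls, the bound is (8,000 + 1,000) × 3 × 4 = 108,000 tokens. Those figures are made up for the example; the point is only that a named class with caps has a ceiling that can be computed before any request is sent.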
Why sovereign is also the cheaper path
Owned inference is a capitalised, fixed-cost base; per-token APIs are pure variable cost. Above a crossover volume the owned path is simply cheaper, and two standard levers widen the gap: batch scheduling and prompt caching. So routing the bulk of traffic to sovereign capacity is the lower-cost path as well as the compliant one. The economic case and the sovereignty case point the same way, which is what makes the argument hard to refuse.
Indicative magnitudes, not a quote. Real figures fall out of the tenant's own ledger.
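The crossover argument can be written as a simple linear cost model. The symbols are assumptions introduced here for illustration: $F$ is the fixed cost of owned capacity per period, $c_o$ the marginal cost per token on owned capacity, $c_a$ the per-token API price, and $V$ the token volume per period.

$$
C_{\text{own}}(V) = F + c_o V, \qquad C_{\text{api}}(V) = c_a V
$$

Setting the two equal gives the crossover volume, which exists only when $c_a > c_o$:

$$
F + c_o V^{*} = c_a V^{*} \quad\Longrightarrow\quad V^{*} = \frac{F}{c_a - c_o}
$$

For $V > V^{*}$ the owned path costs less, and the saving grows linearly as $(c_a - c_o)(V - V^{*})$. To the extent that batch scheduling and prompt caching lower $c_o$, they pull $V^{*}$ down and steepen that saving. A real cost curve may not be linear, so this is a first-order picture rather than a pricing model.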
The hyperscalers sell cognition by the token and hope you do not do the maths. Heimdall is the maths, made into a product, and a per-study price a CFO can underwrite.
The base. Control plane, Studio, governance and evidence: the layer everything else accrues against.
One known line item per study, underwritten from ledger variance, offered post-data, or with a lighthouse risk-share.
Consumption banded for fleets, with a soft budget alert at 80% and a hard limit at 100% that queues rather than overspends.
Forward-purchased sovereign GPU capacity, so the customer treats cognition as a balance-sheet asset, not a surprise.
3verest sells neither the algorithm nor the tokens. The router's incentive, cheapest compliant supply, is the customer's incentive. That neutrality is the point.
| Persona | Role | What they come for |
|---|---|---|
| OEM platform admin | Product ops at an imaging OEM | Per-customer policies; AI features that stay deployable and profitable. |
| OEM developer | Integration engineer | Integrate once via heimdall.run(); never manage a model lifecycle. |
| Trust IG lead | Information governance, health system | Lawful processing they can sign with confidence; evidence on demand. |
| Clinical safety officer | Clinical governance | Safe behaviour, human oversight, managed change, no silent model updates. |
| Finance / CFO | Budget owner | Predictable cognition cost and the right pricing shape. No bill shock. |
| 3verest governance analyst | Clinical-AI ops · concierge | Drift audits, eval gating, escalation playbooks, run by people on a cadence. |