Is private inference real yet?
A public, automated registry of TEE-verified inference. Every day, we re-verify the attestation bundles each provider exposes and score them across eight capability layers. See methodology →
Stage 1 verdict
The automated cryptographic surface for each row is summarized in the matrix's Surface ✅ column below. The verdicts here add the framework checklist items the dashboard can't auto-check (governance, backdoors, upgrade transparency) and assume a closed-chain E2EE client like hermes-agent #12201.
| Provider | Verdict | Load-bearing reason |
|---|---|---|
| near-ai (with hermes) | Stage 1 | Compose-hash rotations are public on-chain (DstackApp.ComposeHashAdded events on Base), satisfying §1 (on-chain attestation). Inference holds no private persistent state, so §5 (upgrade with notice period) collapses to anchor-PR review — hermes fails closed on rotation until merged. |
| redpill (phala-simple) | Stage 0 | §7 (no backdoors / debug paths) fails: every phala/* model boots dstack-nvidia-dev images with operator host-SSH wired through DSTACK_AUTHORIZED_KEYS in the measured pre_launch_script (live 2026-05-05). No client-side check can close this. |
| redpill (chutes) | Stage 0 | Audit surface too thin — Chutes attestation shape doesn't expose GPU NRAS or compose hash, so the §3 (reproducible code measurement) floor is unverifiable. |
| tinfoil | Stage 1 | Audit verified that all operator-controllable inputs in the attested config are off the prompt path. |
| venice | Stage 0 (skill-side), 0–1 (infra) | Audit (venice-private-inference §Stage Assessment) splits two ways. Infrastructure: Stage 0–1 per backend operator — e2ee-glm-5 is NEAR-backed (inherits NEAR's verdict above); e2ee-venice-uncensored-24b-p is Phala-backed (per-deployment review needed); weakened by intermittent /tee/attestation reachability. Skill story: Stage 0 — veniceai/skills misnames the protocol ("HPKE/Noise" vs. the actual ECIES), cites a 404 URL, and teaches none of the standard verification steps; an agent following the skill builds a TOFU connection. Closed-chain hermes-class clients don't load Venice skills, so the skill-side gap doesn't apply to them; but for any agent loading veniceai/skills directly, this is a "backdoor by inattention." |
Full per-provider analysis at devproof-audits-guide.
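The near-ai verdict above turns on fail-closed rotation handling: a hermes-class client accepts a compose hash only once it is both announced on-chain (a `ComposeHashAdded` event) and covered by a merged anchor PR, and blocks traffic otherwise. A minimal sketch of that decision logic, with all names hypothetical (the real client reads events from Base rather than taking sets as arguments):

```python
def accept_compose_hash(served_hash: str,
                        onchain_hashes: set[str],
                        anchored_hashes: set[str]) -> bool:
    """Fail-closed rotation check (hypothetical helper).

    A served compose hash is accepted only when it is BOTH
    announced on-chain AND covered by a merged anchor PR;
    anything else is rejected, so an unreviewed rotation
    blocks traffic instead of passing silently.
    """
    return served_hash in onchain_hashes and served_hash in anchored_hashes

# A hash that is announced and anchored passes:
assert accept_compose_hash("0xabc", {"0xabc"}, {"0xabc"}) is True
# A freshly rotated hash, on-chain but not yet anchor-merged, is refused:
assert accept_compose_hash("0xnew", {"0xnew", "0xabc"}, {"0xabc"}) is False
```

The point of the AND is the failure direction: a rotation the reviewers haven't merged degrades to an outage, not to silently trusting new code.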
Attestation matrix
One row per model. Cell legend: ✅ verified; ❌ rejected; — either required for this shape but not exposed (counts as a fail) or architecturally N/A (e.g. no gateway hop for a single-TD provider). The Surface ✅ column is green only when every required layer for that shape is ✅. This is necessary but not sufficient for framework Stage 1 — KMS root provenance, governance, image-digest pinning, and reproducible builds aren't in the automated matrix. See the provider pages for the full Stage-0/1 verdict.
| Provider | Model | Shape | Nonce bound | TDX quote | report_data binds key | GPU attested | Key derives to addr | compose_hash committed | Backend attested | Surface ✅ | lat |
|---|---|---|---|---|---|---|---|---|---|---|---|
| near-ai | openai/gpt-oss-120b | tdx+gpu | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | gap | 4.32s |
| near-ai | zai-org/GLM-5.1-FP8 | tdx+gpu | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | gap | 3.06s |
| near-ai | zai-org/GLM-5-FP8 | tdx+gpu | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | gap | 1.95s |
| near-ai | Qwen/Qwen3-30B-A3B-Instruct-2507 | tdx+gpu | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | gap | 5.59s |
| near-ai | deepseek-ai/DeepSeek-V3-0324 | tdx+gpu | — | — | — | — | — | — | — | gap | 0.38s |
| HTTP 503: {"error":"Provider error: Model 'deepseek-ai/DeepSeek-V3-0324' not found. It's not a valid model name or alias."} | |||||||||||
| near-ai | meta-llama/Llama-4-Scout-17B-16E-Instruct | tdx+gpu | — | — | — | — | — | — | — | gap | 0.42s |
| HTTP 503: {"error":"Provider error: Model 'meta-llama/Llama-4-Scout-17B-16E-Instruct' not found. It's not a valid model name or alias."} | |||||||||||
| redpill | phala/gpt-oss-20b | phala-simple | ✅ | ✅ | ✅ | ✅ | — | ✅ | — | gap | 2.55s |
| redpill | phala/gpt-oss-120b | near-relay | ✅ | ✅ | ✅ | ✅ | ✅ | — | ❌ | gap | 2.68s |
| redpill | phala/qwen-2.5-7b-instruct | phala-simple | ✅ | ✅ | ✅ | ✅ | — | ✅ | — | gap | 2.22s |
| redpill | phala/glm-4.7 | near-relay | ✅ | ✅ | ✅ | ✅ | ✅ | — | ❌ | gap | 2.68s |
| redpill | phala/deepseek-v3.2 | chutes | ✅ | ✅ | ✅ | — | — | — | — | gap | 36.78s |
| redpill | phala/kimi-k2.5 | chutes | ✅ | ✅ | ✅ | — | — | — | — | gap | 66.16s |
| tinfoil | router | tinfoil-sev-snp-v2 | — | — | — | — | — | — | — | gap | 0.32s |
| tinfoil | gpt-oss-120b | tinfoil-sev-snp-v2 | — | — | — | — | — | — | — | gap | 0.14s |
| tinfoil | llama3-3-70b | tinfoil-sev-snp-v2 | — | — | — | — | — | — | — | gap | 0.16s |
| tinfoil | gemma4-31b | tinfoil-sev-snp-v2 | — | — | — | — | — | — | — | gap | 0.15s |
| tinfoil | deepseek-v4-pro | error | — | — | — | — | — | — | — | gap | 0.0s |
| SSLError: HTTPSConnectionPool(host='deepseek-v4-pro.tinfoil.containers.tinfoil.dev', port=443): Max retries exceeded with url: /.well-known/tinfoil-attestation (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1016)'))) | |||||||||||
| tinfoil | kimi-k2-6 | error | — | — | — | — | — | — | — | gap | 0.0s |
| ConnectionError: HTTPSConnectionPool(host='kimi-k2-6.tinfoil.containers.tinfoil.dev', port=443): Max retries exceeded with url: /.well-known/tinfoil-attestation (Caused by NameResolutionError("HTTPSConnection(host='kimi-k2-6.tinfoil.containers.tinfoil.dev', port=443): Failed to resolve 'kimi-k2-6.tinfoil.containers.tinfoil.dev' ([Errno -2] Name or service not known)")) | |||||||||||
| venice | e2ee-glm-5 | venice | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 1.98s |
| venice | e2ee-qwen3-5-122b-a10b | venice | — | — | — | — | — | — | — | gap | 60.12s |
| transport: HTTPSConnectionPool(host='api.venice.ai', port=443): Read timed out. (read timeout=60) | |||||||||||
| venice | e2ee-uncensored-24b-p | venice | — | — | — | — | — | — | — | gap | 0.1s |
| no TEE attestation available for this model (404) | |||||||||||
| venice | e2ee-gpt-oss-120b-p | venice | — | — | — | — | — | — | — | gap | 60.12s |
| transport: HTTPSConnectionPool(host='api.venice.ai', port=443): Read timed out. (read timeout=60) | |||||||||||
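The Surface ✅ aggregation described above can be sketched as a per-shape required-layer check. The layer names and the required set below are illustrative assumptions (the real registry derives requirements from each provider's bundle format), but the rule is the one the legend states: a required layer that is missing or rejected fails the whole row.

```python
# Hypothetical required-layer set for one shape; names are
# illustrative, not the registry's actual schema.
REQUIRED = {
    "tdx+gpu": ["nonce", "tdx_quote", "report_data_binds_key",
                "gpu_attested", "key_derives_to_addr",
                "compose_hash_committed", "backend_attested"],
}

def surface(shape: str, cells: dict[str, str]) -> str:
    """Green only when every layer required for this shape is verified.

    A required layer that is absent ("—") or rejected ("❌")
    counts as a fail, per the matrix legend.
    """
    ok = all(cells.get(layer) == "✅" for layer in REQUIRED[shape])
    return "✅" if ok else "gap"

# Shape of the near-ai rows: all green except backend attestation.
cells = {layer: "✅" for layer in REQUIRED["tdx+gpu"]}
cells["backend_attested"] = "❌"
assert surface("tdx+gpu", cells) == "gap"
```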
Known limitations
Reasons why green rows above don't imply end-to-end private inference. Pulled from devproof-audits-guide.
- ◐ NEAR gateway-side: narrowed, not closed. cloud-api PRs #485 / #513 / #552 / #558 (Mar–Apr 2026) deleted the operator-rewritable discovery URL and now inline-verify each backend's TDX/RTMR3/NRAS before serving. Still open server-side: `ALLOWED_COMPOSE_HASHES` is unset, so any TCB-current TDX TD passes the gateway — and `DstackKms.kmsInfo` on Base is empty (no on-chain anchor for the KMS root pubkey). Closed client-side: hermes-agent #12201 strict-pins `(model → app_id, compose_hashes, os_image_hash, kms_pubkey)` and enforces per-model inner-compose closure (`compose_manager_attestation.actions[]` → `(file, commit, file_sha256)`), plus `model_name == requested_model`, which caught a live silent substitution (deepseek-ai/DeepSeek-V3.1 → Qwen/Qwen3.5-122B-A10B) on 2026-05-05.
- ◉ RedPill phala-simple: dev OS image with operator host-SSH (live 2026-05-05). Every `phala/*` model on RedPill boots `dstack-nvidia-dev-0.5.{5,8}-*` (ships sshd + debug-tweaks + tools-profile), with `DSTACK_AUTHORIZED_KEYS` in `allowed_envs` and a measured `pre_launch_script` that writes the operator-supplied key to root's `authorized_keys` at boot. Switching to the prod `dstack-nvidia-0.5.{5,6,8}` image (no sshd, runs `disable_login()`) would neutralize the path without compose changes — but it hasn't been done.
- ◉ Upstream verifier TODOs. Phala's `private-ai-verifier` still decodes NVIDIA and Intel Trust Authority JWTs with `verify_signature=False`. We inherit this.
- ◉ KMS opacity. Phala KMS and most dstack-hosted gateways are themselves Stage 0: mutable image tags, no upgrade log, no third-party review. NEAR's KMS contract on Base also has empty `kmsInfo` across all four fields, so the KMS root pubkey provenance is not auditable on-chain.
- ◉ Catalog ≠ served. Some entries (e.g. Tinfoil-routed models in RedPill's /models) 404 on real calls.
How this page is generated
Every day at 13:17 UTC, probe-daily.yml runs `python -m probes.collect` against each provider, writes a JSON snapshot to `data/snapshots/YYYY-MM-DD.json`, re-renders this page, and commits.
All data and code are reproducible — see run a probe locally.
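Under those conventions, the snapshot step can be sketched as follows. The function name and payload shape are assumptions for illustration, not the real `probes.collect` API, which also re-renders the page and commits.

```python
import datetime
import json
import pathlib

def write_snapshot(results: dict, root: str = "data/snapshots") -> pathlib.Path:
    """Write today's probe results to <root>/YYYY-MM-DD.json.

    Sketch only: the date-stamped filename matches the layout
    described above; the payload shape is hypothetical.
    """
    day = datetime.date.today().isoformat()
    path = pathlib.Path(root) / f"{day}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(results, indent=2, sort_keys=True))
    return path

snap = write_snapshot({"near-ai": {"surface": "gap"}}, root="/tmp/snapshots")
assert snap.suffix == ".json"
assert json.loads(snap.read_text())["near-ai"]["surface"] == "gap"
```

Keeping one immutable file per day is what makes the "re-verify daily" claim auditable: anyone can diff consecutive snapshots to see exactly when a provider's surface changed.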