AI Market Briefing
Frontier models, harnesses, self-hosting · daily intelligence

Frontier model landscape

Quality % is computed relative to the reference model selected in the top-right dropdown. SWE-bench Pro is the trustworthy benchmark; SWE-bench Verified is contaminated.

Model | Provider | Quality vs ref | SWE-Pro | SWE-Ver | LCB | AIME | In $/Mtok | Out $/Mtok | Cache | Context | tok/s | Released

Quota burn cross-matrix

Burn ratios are shown as multiples of the selected reference model at medium effort. OpenAI (Codex /effort) and Anthropic (Claude Code /effort) share the same vocabulary (low / medium / high / xhigh), with Anthropic adding 'max' for Opus 4.7. Google uses thinking budgets instead. Multipliers stack: Fast mode ×2.0, cached input ×0.6; plan mode forces high effort. Switch the reference dropdown at the top of the page to recompute all ratios.
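As a rough illustration of how the stacked multipliers combine, here is a minimal Python sketch. The ×2.0 and ×0.6 factors come from the text above; the function names and the `effort_ratio` input are hypothetical, since this page does not list per-effort ratios. Applying the cached-input factor to the whole ratio is a simplifying assumption, as is reading "forces high" as an override of the requested effort level.

```python
from math import prod

# Multipliers quoted in the text above.
FAST_MODE = 2.0
CACHED_INPUT = 0.6


def effective_effort(requested, *, plan_mode=False):
    """Return the effort level actually used; plan mode overrides it to 'high'."""
    return "high" if plan_mode else requested


def burn_ratio(effort_ratio, *, fast_mode=False, cached_input=False):
    """Stack the quoted multipliers on top of an effort-level ratio.

    effort_ratio is the burn multiple of the chosen effort level relative to
    the reference model at medium effort (a hypothetical input here).
    """
    factors = [effort_ratio]
    if fast_mode:
        factors.append(FAST_MODE)
    if cached_input:
        factors.append(CACHED_INPUT)
    return prod(factors)


# Reference model at medium effort (ratio 1.0) with Fast mode and cached input.
example = burn_ratio(1.0, fast_mode=True, cached_input=True)
```

Under these assumptions the two multipliers partly cancel: ×2.0 and ×0.6 together give ×1.2 relative to the same effort level without either option.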

Subscription tiers

Provider | Tier | $/mo | Limits | Models | Features

Automated agent policy

Provider | Sub on automation | Enforcement | First-party exception | API needed?

Agent harness landscape

Two workflow modes: supervised (you wait) and autonomous (overnight). Anthropic blocked OAuth access for third-party tools on April 4, 2026; affected harnesses are marked with *.

Harness | Vendor | Category | MCP | Skills | Hooks | Subagents | Voice | Remote | Computer Use | LSP | Memory | SWE-Pro | Pricing

Detailed profiles

Hardware × model fit

Quantization, VRAM used, and estimated tok/s per hardware configuration. Quality % is relative to the selected reference model.
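This page does not say how its VRAM and tok/s estimates are derived. The sketch below shows one common back-of-the-envelope approach for a dense model: weight memory is parameters × bits per weight, and decode speed is bounded by memory bandwidth divided by the bytes read per token. All function names, the 1.2 overhead factor, and the example figures are assumptions for illustration only.

```python
GIB = 2**30


def weights_gib(params_billion, bits_per_weight):
    """Approximate size of the quantized weights in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def fits(params_billion, bits_per_weight, vram_gib, overhead=1.2):
    """Rough fit check; overhead is an assumed allowance for KV cache and activations."""
    return weights_gib(params_billion, bits_per_weight) * overhead <= vram_gib


def decode_tok_s_upper_bound(params_billion, bits_per_weight, bandwidth_gb_s):
    """Bandwidth-bound ceiling: every weight is read once per generated token."""
    bytes_per_token = params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical example: a 70B dense model at 4-bit on 24 GiB and 48 GiB cards.
fits_24 = fits(70, 4, 24)
fits_48 = fits(70, 4, 48)
ceiling = decode_tok_s_upper_bound(70, 4, 1000)
```

Real throughput is usually well below this ceiling, and mixture-of-experts models read only their active parameters per token, so treat the result as an upper bound rather than a prediction.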

Hardware options

Hardware | Type | VRAM | Cost | Notes

Inference frameworks

Framework | Best for | Notes

Capability radar

Each model is rated on six capability axes (1–5 each; see rubric). The composite is the weighted average of the available axes. The reference model shown below the chart is the one selected in the top-right dropdown.
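One way to read "weighted average of available axes" is to drop unrated axes and renormalize the remaining weights. The following Python sketch works under that assumption. The axis names and weights are hypothetical placeholders, not the rubric's actual values.

```python
def composite(ratings, weights):
    """Weighted average over the axes that have a rating.

    ratings maps axis name -> score in 1..5, or None when the axis is unrated.
    weights maps axis name -> non-negative weight (hypothetical values here).
    Returns None when no axis is rated.
    """
    available = {axis: r for axis, r in ratings.items() if r is not None}
    total_weight = sum(weights[axis] for axis in available)
    if total_weight == 0:
        return None
    weighted_sum = sum(weights[axis] * r for axis, r in available.items())
    return weighted_sum / total_weight


# Hypothetical axes: "speed" is unrated, so only the other two count.
score = composite(
    {"coding": 5, "reasoning": 3, "speed": None},
    {"coding": 2.0, "reasoning": 1.0, "speed": 1.0},
)
```

Renormalizing means a model with a missing axis is not penalized for it, at the cost of making composites less comparable across models rated on different subsets of axes.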