AI Market Briefing

Frontier model landscape

Quality % is computed against the selected reference (top-right). SWE-bench Pro is the trustworthy benchmark; Verified is contaminated.

Model	Provider	Quality vs ref	SWE-Pro	SWE-Ver	LCB	AIME	In $/Mtok	Out $/Mtok	Cache	Context	tok/s	Released

Quota burn cross-matrix

Burn ratios shown as multiples of the selected reference model at medium effort. OpenAI (Codex /effort) and Anthropic (Claude Code /effort) share the same vocabulary — low / medium / high / xhigh — with Anthropic adding 'max' for Opus 4.7. Google uses thinking budgets. Multipliers stack: Fast mode ×2.0, cached input ×0.6, plan mode forces high. Switch the reference dropdown at the top of the page to recompute all ratios.

Subscription tiers

Provider	Tier	$/mo	Limits	Models	Features

Automated agent policy

Provider	Sub on automation	Enforcement	First-party exception	API needed?

Agent harness landscape

Two workflow modes: supervised (you wait) and autonomous (overnight). Anthropic OAuth blocked third-party tools on April 4, 2026 — affected harnesses marked with *.

Harness	Vendor	Category	MCP	Skills	Hooks	Subagents	Voice	Remote	Computer Use	LSP	Memory	SWE-Pro	Pricing

Detailed profiles

Hardware × model fit

Quantization, VRAM used, and tok/s estimates per hardware config. Quality % is vs selected reference.

Hardware options

Hardware	Type	VRAM	Cost	Notes

Inference frameworks

Framework	Best for	Notes

Capability radar

6-axis capability rating per model (1–5 each; see rubric). Composite is the weighted average of available axes. Reference model below the chart is the one selected in the top-right dropdown.