Key Highlights
- ✓ 54.4% SWE-Bench Pro (Mini) at $0.75/M input
- ✓ 76,000 photos described for $52 (Nano)
- ✓ Subagent architecture slashes costs 85-95%
- ✓ 400K context window on both models
- ✓ Nano at $0.20/M — cheaper than Gemini Flash-Lite
Why this matters
OpenAI dropped GPT-5.4 Mini and Nano on March 17 — less than two weeks after the full GPT-5.4 model. The benchmarks are impressive. The pricing is more impressive. But the architecture shift is what you should be paying attention to.
The performance story
GPT-5.4 Mini hits 54.4% on SWE-Bench Pro — compared to 45.7% for its predecessor GPT-5 mini and 57.7% for the full GPT-5.4. On OSWorld-Verified (computer use via screenshot interpretation), it reaches 72.1% versus 42.0% for GPT-5 mini. That's a 72% relative improvement in agentic capability at a fraction of the cost.
Nano scores 52.4% on SWE-Bench Pro — nearly matching Mini — at a quarter of the price. Both models have 400K context windows. Both support 'thinking' mode on Free and Go tiers.
The pricing story
Mini: $0.75/M input, $4.50/M output. Nano: $0.20/M input, $1.25/M output.
Simon Willison ran the numbers: 76,000 photos described for $52 using Nano. That's classification, extraction, and analysis at a price point that makes batch processing trivial.
Nano is cheaper than Google's Gemini Flash-Lite. At $0.20/M input tokens, the cost of processing a 10-page document is fractions of a cent. This isn't a rounding error — it's a category shift.
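To make that concrete, here is a minimal back-of-the-envelope cost sketch at the published per-token rates. The per-item token counts (roughly 500 tokens per page of input, 200 tokens of output) are illustrative assumptions, not measured values:

```python
NANO_INPUT_PER_M = 0.20   # $/1M input tokens (published Nano rate)
NANO_OUTPUT_PER_M = 1.25  # $/1M output tokens (published Nano rate)

def batch_cost(items, in_tokens_per_item, out_tokens_per_item,
               in_rate=NANO_INPUT_PER_M, out_rate=NANO_OUTPUT_PER_M):
    """Estimated dollar cost of a batch job at given per-million-token rates."""
    total_in = items * in_tokens_per_item
    total_out = items * out_tokens_per_item
    return total_in / 1e6 * in_rate + total_out / 1e6 * out_rate

# One 10-page document: assume ~5,000 tokens in, ~200 tokens of summary out.
print(f"${batch_cost(1, 5000, 200):.4f} per document")  # → $0.0013
```

Under those assumptions a 10-page document costs about an eighth of a cent, which is what makes 76,000-item batch jobs land in the tens of dollars rather than the thousands.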
The subagent architecture
Here's what actually matters: the three-tier deployment model that Mini and Nano enable.
Tier 1 — Nano ($0.20/M): Classification, extraction, routing. Does the document need analysis? Is this email spam? Which category does this support ticket belong to? Nano handles the simple decisions that make up 80% of AI workloads.
Tier 2 — Mini ($0.75/M): Medium-complexity tasks. Code review, document summarization, structured data extraction. The tasks that need reasoning but not frontier capability.
Tier 3 — GPT-5.4 ($7.50/M): Hard problems only. Complex code generation, multi-step reasoning, novel analysis. Reserve the expensive model for the 5% of tasks that actually need it.
This three-tier pattern slashes costs by 85-95% compared to routing everything through the flagship model. Windsurf adopted it the same day. Every serious AI deployment will follow.
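The tiered pattern above can be sketched in a few lines. Everything here is an illustrative placeholder, not a real OpenAI client: the model identifiers, the keyword-based classify_difficulty() heuristic (a production router might use Nano itself for this step), and the tier mapping are assumptions for the sake of the sketch.

```python
PRICING = {  # $/1M input tokens, from the figures above
    "gpt-5.4-nano": 0.20,
    "gpt-5.4-mini": 0.75,
    "gpt-5.4": 7.50,
}

def classify_difficulty(task: str) -> str:
    """Toy keyword heuristic standing in for a real difficulty classifier."""
    text = task.lower()
    if any(m in text for m in ("generate code", "multi-step", "novel")):
        return "hard"
    if any(m in text for m in ("review", "summarize", "extract")):
        return "medium"
    return "simple"

def route(task: str) -> str:
    """Map each task to the cheapest tier that can handle it."""
    return {
        "simple": "gpt-5.4-nano",   # classification, extraction, routing
        "medium": "gpt-5.4-mini",   # code review, summarization
        "hard": "gpt-5.4",          # frontier reasoning only
    }[classify_difficulty(task)]

print(route("Which category does this support ticket belong to?"))
# → gpt-5.4-nano
```

The savings come from volume: if 80% of traffic resolves at the Nano tier and only 5% reaches the flagship, the blended per-token input cost is dominated by the $0.20 rate rather than the $7.50 one.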
The verdict
The teams that redesign their process — not just their prompts — will capture the most value. Mini and Nano aren't cheaper GPT-5.4. They're the foundation for a cost structure that makes AI economically viable at every scale.
The conventional wisdom says AI models keep getting more expensive. The insight: the cost per useful unit of intelligence just collapsed by an order of magnitude.