Anthropic launched Claude Opus 4.7 on April 16 as its most capable generally available model. It is a targeted upgrade over Opus 4.6 — not a new generation — but the gains where they matter most (software engineering, long-horizon agents, vision) are enough to nudge Anthropic back ahead of GPT-5.4 and Gemini 3.1 Pro on several headline benchmarks.
Pricing did not move: $25 per million output tokens, identical to Opus 4.6. The model is available across Claude products, the Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry from day one.
The Benchmark Picture
The coding story is the clearest. Anthropic concedes that its unreleased internal model “Mythos” remains more capable overall, but Opus 4.7 is the strongest generally available model on several key engineering benchmarks.
| Benchmark | Opus 4.7 | Opus 4.6 | GPT-5.4 | Gemini 3.1 Pro |
|---|---|---|---|---|
| SWE-bench Verified | 87.6% | 80.8% | — | 80.6% |
| SWE-bench Pro | 64.3% | 53.4% | 57.7% | 54.2% |
| Terminal-Bench 2.0 | 69.4% | 65.4% | 75.1% | ~67% |
| ARC-AGI-2 | 77.1% | — | 76.1% | ~74% |
| MCP-Atlas (tool orchestration) | 77.3% | — | — | — |
Anthropic reports a 14% improvement over Opus 4.6 on complex multi-step workflows while using fewer tokens and producing roughly a third of the tool-call errors. Early-access partner Warp confirmed Opus 4.7 passed Terminal-Bench tasks that Opus 4.6 had failed — including a concurrency bug that tripped the prior model consistently.
GPT-5.4 still leads on raw terminal coding, and Opus 4.7's ARC-AGI-2 edge over the rest of the frontier is narrow. The honest read is that Opus 4.7 is the best general-purpose engineering model available today, but the leaderboard is close enough that the “winner” depends on the workload.
What’s Actually New
Four changes matter for developers building on the model.
xhigh reasoning tier. Opus 4.7 introduces an “extra high” effort level between high and max, giving finer control over the reasoning-vs-latency tradeoff on hard problems. For most agentic workflows, xhigh delivers most of max’s quality at meaningfully lower latency.
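A sketch of what selecting a tier per request might look like. The `effort` field name, the tier list, and the model string below are assumptions based on the description above, not a confirmed Anthropic API shape — consult the official docs for the real parameter.

```python
# Hypothetical effort tiers, ordered from cheapest to most thorough.
# "xhigh" sits between "high" and "max" per the release notes.
EFFORT_TIERS = ["low", "medium", "high", "xhigh", "max"]

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a hypothetical request payload with an explicit effort tier."""
    if effort not in EFFORT_TIERS:
        raise ValueError(f"unknown effort tier: {effort}")
    return {
        "model": "claude-opus-4-7",   # assumed identifier, for illustration
        "max_tokens": 4096,
        "effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

# For hard agentic steps, step up to xhigh rather than max to keep most of
# the quality headroom while paying less latency.
req = build_request("Find the race condition in this scheduler.", effort="xhigh")
```

The point of the tier is operational: you tune one knob per call site instead of maintaining separate prompts for "fast" and "careful" paths.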
Higher image resolution. Maximum image input jumped from ~1.15 megapixels (1,568 px on the long edge) to ~3.75 megapixels (2,576 px). For anyone doing document understanding, UI screenshot analysis, or chart reasoning, this is the largest practical improvement in the release — Opus 4.6 often required tiling strategies that Opus 4.7 handles natively.
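The practical effect of the new ceiling can be sketched with a simple long-edge check. The pixel caps come from the figures above; the fit rule itself is a simplifying assumption, not Anthropic's documented resize algorithm.

```python
# Long-edge limits from the release notes: 1,568 px for Opus 4.6,
# 2,576 px for Opus 4.7.
OPUS_46_LONG_EDGE = 1568
OPUS_47_LONG_EDGE = 2576

def fits_natively(width: int, height: int, long_edge_limit: int) -> bool:
    """True if the image needs no downscaling (or tiling) under a long-edge cap."""
    return max(width, height) <= long_edge_limit

# A 2,400 px-wide document scan: required tiling or a lossy downscale on
# Opus 4.6, but fits within Opus 4.7's limit as-is.
page = (2400, 1800)
old_ok = fits_natively(*page, OPUS_46_LONG_EDGE)
new_ok = fits_natively(*page, OPUS_47_LONG_EDGE)
```

For dense documents and dashboards, skipping the tiling step matters twice: less preprocessing code, and no stitching errors at tile boundaries.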
/ultrareview inside Claude Code. A new command in the Claude Code environment that runs deeper static and semantic review of a change — more rigorous than the default review pass. Pair this with the release’s lower tool-call error rate and long-running code review becomes meaningfully more reliable.
Task budgets (beta). A new mechanism that lets developers cap how much reasoning Claude spends on a long-running task. Useful for bounding cost on open-ended agent runs where you want the model to stop thinking and ship.
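A minimal local sketch of the budget idea: cap cumulative reasoning spend across an agent loop and force the run to wrap up once the cap is hit. The beta feature presumably enforces this server-side; the accounting below is purely illustrative.

```python
def run_with_budget(steps, reasoning_budget: int):
    """Consume (action, reasoning_tokens) pairs until the budget is exhausted.

    Returns the actions actually taken and the tokens spent. When the next
    step would exceed the budget, the run finalizes instead of continuing.
    """
    spent = 0
    completed = []
    for action, cost in steps:
        if spent + cost > reasoning_budget:
            completed.append("finalize")  # stop thinking and ship
            break
        spent += cost
        completed.append(action)
    return completed, spent

# An open-ended agent plan, capped at 2,000 reasoning tokens: the loop
# completes the first two steps, then finalizes rather than starting a
# step it cannot afford.
plan = [("explore", 500), ("draft", 800), ("refine", 1200), ("polish", 2000)]
actions, spent = run_with_budget(plan, reasoning_budget=2000)
```

The design choice worth copying even client-side: budget by reasoning tokens rather than wall-clock time, so the cap maps directly onto cost.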
Cyber Safeguards
The release ships with new safeguards that automatically detect and block requests indicating prohibited or high-risk cybersecurity uses. This follows directly from Anthropic’s work on Claude Mythos, its cybersecurity-specialized model — Mythos exposed what a dedicated cyber-capable model looks like, and Opus 4.7 is hardened against that capability being abused through the public model.
OpenAI landed on a similar calculus in its own April release: GPT-5.4-Cyber is gated behind a vetted access program rather than exposed in the base model. Both labs are now signaling that capability and access control are inseparable at the frontier.
Same Price, Better Model
Holding the price at $25 per million output tokens is the quietest but most consequential part of the release. For teams already running Opus 4.6 in production, the switch is a drop-in: same cost envelope, more capability. For teams comparing against GPT-5.4 on cost, Opus 4.7 is now competitive on enough coding workloads to merit a real A/B.
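The drop-in claim is easy to sanity-check with back-of-envelope arithmetic at the stated $25 per million output tokens. Input-token pricing is not given here, so this sketch covers output spend only.

```python
# Output price per million tokens, from the release pricing above.
OUTPUT_PRICE_PER_MTOK = 25.00

def output_cost(output_tokens: int) -> float:
    """Dollar cost of a run's output tokens at the stated rate."""
    return output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK

# An agent run emitting 400k output tokens costs the same on 4.6 and 4.7,
# so any quality gain on that run is free at the margin.
cost = output_cost(400_000)
```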
The bigger picture: with MCP now under Linux Foundation governance and 97 million monthly SDK installs, Anthropic’s bet that the ecosystem would converge on its protocol has paid off. Opus 4.7’s best-in-class MCP-Atlas score is not coincidental — the model is tuned against the orchestration interface that most agentic tooling now speaks.
What to Watch
The interesting open question is Mythos. Anthropic has said publicly that it outperforms Opus 4.7 and is being used under controlled access for cybersecurity research. If a broader release follows, the “generally available frontier” resets again. Until then, Opus 4.7 is the most capable model most teams can actually buy — and at Opus 4.6 prices, the upgrade path is straightforward.
Sources: Anthropic — Introducing Claude Opus 4.7 · VentureBeat · AWS Bedrock release notes · 9to5Mac
