tech
April 16, 2026
Claude Opus 4.7 leads on SWE-bench and agentic reasoning, beating GPT-5.4 and Gemini 3.1 Pro
In short: Anthropic has released Claude Opus 4.7, its most capable generally available model, with benchmark-leading scores on SWE-bench Pro (64.3% vs GPT-5.4’s 57.7%), multi-agent coordination for hours-long workflows, 3x higher image resolution, and a 14% improvement in multi-step agentic reasoning with a third of the tool errors. Priced at $5/$25 per million tokens, it is available across Claude plans and through Amazon Bedrock, Vertex AI, and Microsoft Foundry.

TL;DR
- Claude Opus 4.7, Anthropic's latest model, leads in software engineering and agentic reasoning benchmarks.
- It outperforms OpenAI's GPT-5.4 and Google's Gemini 3.1 Pro on key developer tasks like SWE-bench Pro.
- The model features a 14% improvement in complex multi-step workflows with fewer tokens and tool errors.
- Introduces multi-agent coordination for parallel AI workstreams and improved resilience through tool failures.
- Image processing resolution is tripled, aiding enterprise document analysis.
- Context window remains at one million tokens, with strong performance on long-context research benchmarks.
- Instruction following is more literal, reducing ambiguity and off-task behavior.
- Priced the same as its predecessor, Opus 4.7 offers enhanced performance at the same cost.
- Cyber safeguards have been added to detect and block high-risk cybersecurity uses.
Continue reading the original article