Anthropic Deploys Claude Opus 4.7, Prioritizing Coding Rigor and Visual Precision Over Raw Power
Anthropic releases Claude Opus 4.7 with a 64.3% SWE-bench score and 3x vision resolution, balancing power with a new safety-first deployment strategy.
Anthropic made Claude Opus 4.7 generally available today, April 16, 2026, marking a significant recalibration of the company’s flagship model series. The release brings the updated model to the Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry, alongside its standard web and mobile interfaces. While Opus 4.7 introduces substantial gains in autonomous software engineering and high-resolution vision, it is arriving in the shadow of the more powerful, yet restricted, Claude Mythos Preview, signaling a new era of incremental safety-led deployment for the San Francisco-based AI firm.
A New Benchmark for Autonomous Engineering
For developers, the primary draw of Opus 4.7 is its performance in complex, long-context coding tasks. In internal and external evaluations, the model demonstrated a marked improvement in its ability to follow literal instructions and maintain consistency across large codebases. On the SWE-bench Pro benchmark—a rigorous test of an AI’s ability to resolve real-world software issues—Opus 4.7 achieved a score of 64.3%, a double-digit jump from the 53.4% recorded by its predecessor, Opus 4.6.
Similarly, on CursorBench, the model’s score rose to 70%, up from 58%. These gains are not merely statistical; early users report a newfound confidence in delegating entire sub-modules to the agent, rather than using it solely for snippet generation. However, this increased literalism comes with a caveat: users may need to re-tune existing prompts. Unlike previous versions that often "filled in the blanks" for ambiguous requests, Opus 4.7 follows instructions to the letter, which can lead to unexpected results if the initial prompt lacks specificity.

Vision Capabilities and Multimodal Refinement
Beyond code, Opus 4.7 represents a leap in multimodal understanding. The model now supports images up to 2,576 pixels on the long edge, totaling approximately 3.75 megapixels. This is more than three times the resolution supported by previous Claude iterations.
This increased visual fidelity allows the model to interpret dense screenshots, complex architectural diagrams, and pixel-perfect UI references with unprecedented accuracy. Anthropic has also refined the model’s "taste" in creative tasks, noting that it produces higher-quality professional outputs for slide decks, interface designs, and document formatting.
The Economic Reality of the New Tokenizer
While Anthropic has maintained the pricing of $5 per million input tokens and $25 per million output tokens, the actual cost of operation may rise for many enterprises. Opus 4.7 introduces an updated tokenizer that is significantly more dense, producing up to 35% more tokens for the same block of input text.
To give developers more control over these costs and performance trade-offs, Anthropic has introduced a new `xhigh` effort level. This setting allows users to dictate whether the model should prioritize deep reasoning for difficult problems or lower latency for straightforward tasks. For enterprises, these efficiency gains are already yielding results.
"Claude Opus 4.7 demonstrates significant efficiency gains while preserving the performance of Claude Opus 4.6," said Yashodha Bhavnani, Head of AI at Box. "In Box's evaluations, Opus 4.7 had a 56% reduction in model calls and 50% reduction in tool calls. It also responded 24% faster and used 30% fewer AI Units – all enhancements that will help enterprises move faster and scale more affordably."
Safety and the Shadow of Claude Mythos
The release of Opus 4.7 is inseparable from Anthropic’s broader safety strategy, specifically concerning its most powerful model, Claude Mythos Preview. Released earlier this month to limited partners, Mythos demonstrated the ability to autonomously discover and exploit zero-day vulnerabilities in major operating systems.
Due to these high-risk offensive capabilities, Anthropic has confined Mythos to "Project Glasswing," a restricted testing initiative. Opus 4.7 serves as the public-facing beneficiary of this safety research. It includes automated safeguards designed to detect and block requests that indicate high-risk cybersecurity use cases. Interestingly, despite these safeguards, Opus 4.7 showed a slight decrease in its own cybersecurity vulnerability reproduction score, falling to 73.1% from the 73.8% seen in Opus 4.6, suggesting a successful dampening of potentially harmful autonomous behaviors.

Looking Ahead
By releasing Opus 4.7 while keeping the more potent Mythos under wraps, Anthropic is reinforcing its commitment to the "Constitutional AI" framework—aiming for models that are helpful and honest without becoming hazardous. As the industry watches the competitive dance between Opus 4.7, OpenAI’s GPT-5.4 Pro, and Google’s Gemini 3.1 Pro, the focus has shifted from raw intelligence to the reliability of autonomous agents in production environments.
For now, Opus 4.7 stands as a powerful tool for the modern developer and enterprise, providing the necessary horsepower for multimodal agent tasks while acting as a vital testbed for the safety protocols that will eventually govern the release of next-generation frontier models.
