Anthropic Ships Opus 4.7—With the Cyber Teeth Deliberately Pulled
Opus 4.7 scores 64.3% on SWE-bench Pro, beating GPT-5.4, but Anthropic trained it to be weaker at cybersecurity on purpose.
- Claude Opus 4.7 scores 64.3% on the SWE-bench Pro coding benchmark—up from 53.4% for Opus 4.6 and ahead of GPT-5.4’s 57.7%—but still well below Anthropic’s unreleased Mythos Preview at 77.8%.
- Anthropic deliberately experimented with reducing Opus 4.7’s cybersecurity capabilities during training and built in safeguards that auto-block high-risk cyber requests.
- The model launches at the same per-token price as Opus 4.6, but a new tokenizer can map the same text to up to 35% more tokens, meaning real costs may jump.
Anthropic just shipped the model it wants you to use—and quietly explained why it’s not the one it’s keeping for itself. Claude Opus 4.7 went live today as the company’s new publicly available flagship, delivering a significant jump in autonomous coding while deliberately pulling back on the one thing that made its last model famous: cybersecurity.
On the SWE-bench Pro coding benchmark, Opus 4.7 scores 64.3%, up from 53.4% for Opus 4.6 and comfortably ahead of OpenAI’s GPT-5.4 at 57.7%. But it’s nowhere near Claude Mythos Preview, which sits at 77.8%. At least part of that gap is intentional. Anthropic says it “experimented with efforts to differentially reduce” Opus 4.7’s cyber capabilities during training, and built in safeguards that automatically detect and block requests indicating prohibited or high-risk cybersecurity use.
In other words, Anthropic trained a strong model and then surgically weakened the part that scared everyone. CNBC reported that Opus 4.7 is positioned as a “less risky” alternative to Mythos, the model that the UK AI Security Institute found could autonomously execute 32-step corporate network attacks.
The Cyber Safety Valve: Test Safeguards on Opus, Then Unlock Mythos
Anthropic’s strategy is unusually transparent. The company told The Decoder that Opus 4.7 is “the first such model” to test cyber safeguards, meaning what it learns from blocking attacks on Opus 4.7 will inform how it eventually releases Mythos-class capabilities to the public. It’s a safety valve: deploy the weaker model, watch what people try to break, patch the holes, then unlock the stronger one.
Security researchers who want to use Opus 4.7 for legitimate purposes—vulnerability research, penetration testing, red-teaming—can apply to Anthropic’s new Cyber Verification Program. Everyone else gets the throttled version.
The model also ships with triple the image resolution (up to 2,576 pixels on the long edge, roughly 3.75 megapixels) and follows instructions more literally than Opus 4.6, which Anthropic says would sometimes “loosely interpret or skip parts” of prompts. That literalism is a double-edged sword: better precision, but prompts written for older models may break.
The Hidden Cost Hike: Same Price, More Tokens
Per-token pricing stays at $5 per million input tokens and $25 per million output tokens, the same as Opus 4.6. But there’s a catch. The Decoder found that Opus 4.7’s new tokenizer maps the same text to up to 35% more tokens. Same price per token, more tokens per request. The effective cost of running Opus 4.7 could therefore rise by as much as 35% for the same text without Anthropic ever raising the sticker price.
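To make the arithmetic concrete, here is a rough sketch of the worst case in Python. The per-token prices and the 35% ceiling come from the figures above; the request size (100,000 input tokens, 20,000 output tokens) is a made-up example, and the sketch assumes the inflation hits input and output text equally, which real workloads may not.

```python
# Prices reported for Opus 4.6 and Opus 4.7 (USD per million tokens).
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00

# Reported upper bound on token inflation from the new tokenizer.
MAX_TOKEN_INFLATION = 0.35


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the published per-token prices."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


def inflate(tokens: int, inflation: float = MAX_TOKEN_INFLATION) -> int:
    """Token count for the same text under the new tokenizer (worst case)."""
    return round(tokens * (1 + inflation))


# Hypothetical request: the token counts are illustrative, not measured.
old_input, old_output = 100_000, 20_000

old_cost = request_cost(old_input, old_output)
new_cost = request_cost(inflate(old_input), inflate(old_output))

print(f"Old tokenizer: ${old_cost:.2f}")                 # $1.00
print(f"New tokenizer: ${new_cost:.2f}")                 # $1.35
print(f"Increase: {(new_cost / old_cost - 1) * 100:.0f}%")  # 35%
```

Because 35% is the reported ceiling rather than an average, the real increase depends on the text being processed and would need to be measured on actual prompts.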
The model is available today across all Claude products, the API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry. Anthropic says early testers report being able to hand off their hardest coding work “with confidence”—the kind that previously needed close human supervision.
Whether that confidence extends to cybersecurity remains deliberately unanswered. Anthropic has the dangerous model. It’s just not letting you touch it yet.