Search Agent Sky
Query
What's new with Claude Opus 4.8? — start with https://www.anthropic.com/news/claude-opus-4-8 and browse beyond it.
<tool_call> <function=done> <parameter=result>
## What's New with Claude Opus 4.8

**Release Date:** May 28, 2026

### Key Announcements

**1. Model Upgrade**
- Claude Opus 4.8 is an upgrade to Opus 4.7 with improvements across benchmarks
- Pricing remains unchanged: $5 per million input tokens, $25 per million output tokens
- Described by Anthropic as "a modest but tangible improvement" over Opus 4.7

### Performance Benchmarks

**Coding Performance:**
- **SWE-Bench Pro:** 69.2% pass rate (vs 64.3% for Opus 4.7, 58.6% for GPT-5.5, 54.2% for Gemini 3.1 Pro)
- **SWE-bench Verified:** 88.6% pass rate (vs 87.6% for Opus 4.7)
- **Terminal-Bench 2.1:** 74.6% mean reward (vs 66.1% for Opus 4.7)

**Reasoning Performance:**
- **Humanity's Last Exam (with tools):** 57.9% accuracy (vs 54.7% for Opus 4.7)
- **Humanity's Last Exam (without tools):** 49.8% accuracy (vs 46.9% for Opus 4.7)

**Computer Use:**
- **OSWorld-Verified:** 83.4% pass rate (vs 82.8% for Opus 4.7)
- **Online-Mind2Web:** 84% score (reported by Browserbase)

**Professional Work:**
- **GDPval-AA:** 1,890 aggregate score (vs 1,753 for Opus 4.7, 1,769 for GPT-5.5)
- **Legal Agent Benchmark:** First model to exceed 10% overall under the all-pass standard (reported by Harvey)

**Financial Analysis:**
- **Finance Agent v2:** 53.9% pass rate (vs 51.5% for Opus 4.7)

### Key Improvements

**Efficiency Gains:**
- Uses fewer steps for the same intelligence on CursorBench
- Token-per-task cost drops without sacrificing pass rates
- Fixes comment-verbosity and tool-calling issues from Opus 4.7

**Reliability:**
- **4x less likely** to allow flaws in code to pass unremarked
- Reaches new highs on measures of proactive honesty

**Enterprise Impact:**
- Databricks Genie now reasons over PDFs at 61% lower token cost than Opus 4.7
- Dynamic workflows feature enables Claude Code to plan work and run hundreds of parallel subagents

### Additional Context

On the same day as the Opus 4.8 release, Anthropic also announced that it had raised $65B in Series H funding at a $965B post-money valuation, suggesting strong financial backing for continued development.

### Summary

Claude Opus 4.8 is an incremental upgrade over Opus 4.7, with particular strengths in coding, professional work, and reliability. Pricing is unchanged, and the model delivers measurable gains across the benchmarks listed above.
</parameter> </function> </tool_call>
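To make the quoted pricing concrete, here is a minimal sketch of the per-request arithmetic. It assumes only the two rates stated in the summary ($5 per million input tokens, $25 per million output tokens). The function name `estimate_cost_usd` and the example workload are hypothetical, and the sketch ignores any discounts, caching, or batch pricing that may apply in practice.

```python
# Rates copied from the summary above, in USD per million tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed per-token rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
print(f"${estimate_cost_usd(200_000, 20_000):.2f}")  # prints $1.50
```

Because output tokens cost five times as much as input tokens at these rates, the reported drop in token-per-task cost would matter most for output-heavy workloads.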

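The benchmark comparisons in the summary can also be read as percentage-point gains over Opus 4.7. The sketch below copies the paired scores reported above and ranks them by gain. The helper name `point_gains` is hypothetical, and the figures are taken from the summary as-is rather than independently verified.

```python
# (benchmark, Opus 4.8 score, Opus 4.7 score), copied from the summary above.
SCORES = [
    ("SWE-Bench Pro", 69.2, 64.3),
    ("SWE-bench Verified", 88.6, 87.6),
    ("Terminal-Bench 2.1", 74.6, 66.1),
    ("Humanity's Last Exam (with tools)", 57.9, 54.7),
    ("Humanity's Last Exam (without tools)", 49.8, 46.9),
    ("OSWorld-Verified", 83.4, 82.8),
    ("Finance Agent v2", 53.9, 51.5),
]


def point_gains(scores):
    """Return (benchmark, gain in percentage points), largest gain first."""
    gains = [(name, round(new - old, 1)) for name, new, old in scores]
    return sorted(gains, key=lambda item: item[1], reverse=True)


for name, gain in point_gains(SCORES):
    print(f"{name}: +{gain:.1f} points")
```

On these numbers the largest gain is on Terminal-Bench 2.1 (+8.5 points) and the smallest is on OSWorld-Verified (+0.6 points).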
Sources visited

1. navigate https://www.anthropic.com/news/claude-opus-4-8
2. get_text
3. click undefined
4. get_text
5. navigate https://www.anthropic.com/claude-opus-4-8-system-card
6. search Claude Opus 4.8 release features benchmarks
7. search Claude Opus 4.8
8. navigate https://www.anthropic.com/news
9. click undefined
10. get_text
11. ddm --text --find Claude Opus 4.8
12. search Anthropic Claude Opus 4.8 benchmarks performance
13. click undefined
14. get_text
15. click undefined
16. navigate https://www.anthropic.com/news
17. click undefined
18. click undefined
Shared by Zhimin Zou · May 29, 2026
