Claude Fable 5 vs Claude Opus 4.8 vs Claude Mythos – Direct Coding Comparison

June 10, 2026
7:22 am

Anthropic released two powerful models in less than two weeks, and they are not competing for the same users. Claude Fable 5 scored 80.3% on SWE-Bench Pro on launch day; Claude Opus 4.8 scores 69.2% on the same benchmark. That is, for the record, an insane leap, but it tells only part of the story. The smarter question is not which model is better in the abstract. It is which one your actual workload demands. For developers, we answer that question in this detailed comparison of Claude Fable 5 vs Claude Opus 4.8. Let’s dig in.

What These Two Models Are, and How They Fit the Stack

Claude Opus 4.8 is a general-purpose frontier model. Anthropic released it on May 28, 2026, twelve days before Fable 5 arrived. It sits one capability tier below the Mythos class and is optimized for complex but time-bounded synchronous work. Think real-time collaboration, analytical drafts, code review, and research tasks where the model runs once, produces output, and the session ends.

Claude Fable 5 operates at a different level entirely. Anthropic released it on June 9, 2026, as the first publicly available Mythos-class model, a capability tier that had only been previewed to approved partners before that date. Fable 5 is built for ambitious, long-running, asynchronous work: tasks that previous models simply could not sustain. The longer and more complex the task, the larger its lead over every model below it, including Opus 4.8.

The naming convention makes the hierarchy explicit. Fable marks Anthropic’s fifth model generation. Haiku, Sonnet, and Opus continue as tiers within that generation, each faster or more economical than the one above. Fable sits at the top. It is not a smarter Opus. It is a different class of model built to do a different class of work.

Where Fable 5 Leads, and by How Much

Fable 5’s advantages are real but not uniformly distributed. The gap over Opus 4.8 narrows on short, routine work and compounds sharply on long, autonomous, multi-step tasks. Anthropic’s framing is direct: the longer and more complex the task, the larger the lead. The independent evaluations cited in this comparison are consistent with that pattern.

Software Engineering

Fable 5 scores 95.0% on SWE-Bench Verified and 80.3% on SWE-Bench Pro. The Pro variant is the harder and more informative test: it requires the model to fix real bugs in open-source repositories under production-quality standards. Opus 4.8 scores 88.6% on Verified and 69.2% on Pro. On Cognition’s FrontierCode evaluation, which tests whether models can pass difficult coding tasks while meeting high-quality production codebase standards, Fable 5 scores highest among all frontier models even at medium effort. Stripe put Fable 5 on a 50-million-line Ruby codebase and asked it to perform a codebase-wide migration. It finished in a day. That same job would have taken a full engineering team over two months by hand.

Knowledge Work and Finance

Fable 5 posted the highest score of any model on Hebbia’s Finance Benchmark, which measures senior-level reasoning through document-heavy tasks: chart interpretation, document-based reasoning, and financial problem-solving at analyst grade. IMC, the trading and technology firm, ran Fable 5 through its own internal trading-analysis evaluations, covering factual lookup, conceptual reasoning, root-cause analysis, and expected-value analysis. The model aced them nearly across the board. These are the same tests IMC uses to evaluate human analysts, not synthetic benchmarks designed to flatter AI.

Vision

Fable 5 is the current state-of-the-art publicly available model for vision-based tasks. It extracts precise numerical data from complex scientific figures, a task on which most competing models fail or lose accuracy. It can reconstruct a complete web application’s source code from screenshots alone, without access to any original files or documentation. Earlier Claude models needed a complex helper harness with maps and navigation aids to make progress in Pokémon FireRed. Fable 5 completed the entire game using a minimal, vision-only harness with no supplemental tools. The capability difference for vision-heavy professional work, like architecture, medical imaging analysis, or financial document processing, is substantial.

Memory and Long-Context Reasoning

Fable 5 maintains focus across millions of tokens in long-running sessions and uses its own written notes to progressively improve its outputs as work continues. In head-to-head tests using the deck-building game Slay the Spire, giving both Fable 5 and Opus 4.8 access to persistent file-based memory improved Fable 5’s performance three times more than it improved Opus 4.8’s. Fable 5 reached the final act three times more often. Its context window extends past one million tokens, marginally above the one million ceiling on Opus 4.8.

Agentic Efficiency

On spreadsheet task suites, Fable 5 finishes runs 25 to 30% faster than Opus 4.8 at every effort level, using fewer conversational turns. Fewer turns reduce real cost on agentic workloads, where conversation overhead multiplies per-token spend. The Every Senior Engineer Benchmark puts the sharpest number on this: Fable 5 scores 91 out of 100 on a test built around real senior engineering hiring criteria. Opus 4.8 scores approximately 63. That 28-point difference, roughly 44% higher in relative terms, is not a description of one model being slightly better. It is a description of one model completing a different category of work.

Claude Opus 4.8: The Argument for the Default

Opus 4.8 is not a stepping stone. It scored 88.6% on SWE-Bench Verified, 69.2% on SWE-Bench Pro, and 1890 Elo on the GDPval-AA agentic evaluation. Its BenchLM score of 94 trails Fable 5’s 96 by two points overall, but the per-category story matters: on agentic tasks Fable 5 leads 85.2 to 80.1, and on coding it leads 85.6 to 76.4. The widest single gap in the BenchLM dataset is vision and multimodal, where Fable 5 posts 92.4 against Opus 4.8’s 76.1. On knowledge tasks, the gap narrows to 74.8 versus 70.1. For knowledge-only work, the premium is harder to justify.
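The per-category gaps are easy to check by hand. The short sketch below only restates the BenchLM figures quoted above and sorts the leads from widest to narrowest; the dictionary keys and the function name are illustrative, not part of any BenchLM tooling.

```python
# BenchLM category scores quoted in this article: (Fable 5, Opus 4.8).
benchlm = {
    "coding": (85.6, 76.4),
    "agentic": (85.2, 80.1),
    "vision_multimodal": (92.4, 76.1),
    "knowledge": (74.8, 70.1),
}


def category_gaps(scores):
    """Return (category, gap) pairs sorted from widest to narrowest Fable 5 lead."""
    gaps = {name: round(fable - opus, 1) for name, (fable, opus) in scores.items()}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, gap in category_gaps(benchlm):
        print(f"{name}: +{gap:.1f}")
```

Vision and multimodal comes out on top and knowledge at the bottom, which is the ordering the paragraph describes.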

The economic case for Opus 4.8 is straightforward. It costs $5 per million input tokens and $25 per million output tokens, exactly half the Fable 5 price at every level. For high-volume API applications where task complexity does not require multi-day autonomous execution, the 2x premium on Fable 5 does not pay for itself. Measured as output price per SWE-Bench Pro point ($50 / 80.3 versus $25 / 69.2), Fable 5 costs roughly 72% more than Opus 4.8. But that arithmetic hides the most important part: the tasks in the gap between 69 and 80 on SWE-Bench Pro are not tasks where Opus 4.8 does slightly worse. They are tasks Opus 4.8 fails entirely. For work at that difficulty level, the comparison is not a price-quality trade-off. It is finished versus not finished.
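For readers who want to reproduce the 72% figure, here is a minimal sketch. It assumes the number is derived from output price divided by SWE-Bench Pro score, which matches the figures in this article; the dictionary layout and function names are made up for illustration.

```python
# Output prices (USD per 1M tokens) and SWE-Bench Pro scores quoted in this article.
models = {
    "fable-5": {"output_price": 50.0, "swe_bench_pro": 80.3},
    "opus-4.8": {"output_price": 25.0, "swe_bench_pro": 69.2},
}


def price_per_point(model):
    """Output price per benchmark point (USD per 1M tokens per point)."""
    return model["output_price"] / model["swe_bench_pro"]


def premium_percent(expensive, cheap):
    """How much more the first model costs per point than the second, in percent."""
    return (price_per_point(expensive) / price_per_point(cheap) - 1) * 100


if __name__ == "__main__":
    premium = premium_percent(models["fable-5"], models["opus-4.8"])
    print(f"Fable 5 premium per SWE-Bench Pro point: {premium:.1f}%")
```

The same ratio can be read as cost per solved task on that benchmark, so it describes average economics only. It says nothing about the specific tasks that one model completes and the other does not.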

Fable 5 and Mythos 5: Same Weights, Different Keys

That leaves Claude Mythos 5. Is it the same model as Claude Fable 5? In short, the two are not separate systems. They share the same underlying model weights. The distinction is access control, and it is meaningful.

Anthropic first previewed the Mythos model in April 2026 without making it publicly available. The reason was specific: during internal testing, Mythos proved unexpectedly adept at identifying software vulnerabilities across every major operating system and web browser, despite not being designed for cybersecurity. Anthropic considered that capability too risky to release without significant safeguards. Instead, it launched Project Glasswing, a collaboration with the US government, Amazon Web Services, Apple, Google, Cisco, Microsoft, JPMorgan Chase, and other organizations. Approved participants received early access to Mythos Preview specifically to find and patch vulnerabilities in critical infrastructure.

Claude Fable 5 is the answer to how you release a model that capable to the general public. Anthropic built hard safety classifiers on top of the shared weights that intercept queries in cybersecurity, biology, chemistry, and distillation, and route those requests to Opus 4.8 instead. The classifiers passed more than 1,000 hours of external bug bounty testing with no universal jailbreak found. They trigger in fewer than 5% of sessions on average. When they do trigger, responses are billed at Opus 4.8 rates, not Fable 5 rates.
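Because fallback responses are billed at Opus 4.8 rates, the effective Fable 5 rate depends on how often the classifiers fire. The sketch below estimates a blended output price using the $50 and $25 per-million output rates quoted elsewhere in this article. It makes one loud simplification: it treats the share of sessions that fall back as the share of output tokens that fall back, which the article does not state. The function name is hypothetical.

```python
def blended_price(primary_price, fallback_price, fallback_share):
    """Blend two per-million-token prices by the share of traffic that falls back.

    Assumption: fallback_share is the fraction of billed tokens served by the
    fallback model. The article gives a share of sessions, not of tokens.
    """
    if not 0.0 <= fallback_share <= 1.0:
        raise ValueError("fallback_share must be between 0 and 1")
    return (1 - fallback_share) * primary_price + fallback_share * fallback_price


if __name__ == "__main__":
    # 5% is the upper bound on the classifier trigger rate quoted in the article.
    print(blended_price(50.0, 25.0, 0.05))
```

At a 5% fallback share the blended output rate is about $48.75 per million tokens, so the fallback barely moves the bill for general workloads.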

Claude Mythos 5, released on the same day as Fable 5, uses the identical base weights with the classifiers removed in specific sensitive domains. It is available only to approved Project Glasswing participants. The capabilities Mythos 5 unlocks without those restrictions are not incremental. Internal tests showed a ten-times acceleration in drug design pipelines, novel independently corroborated hypotheses in molecular biology, and original genomics research that outperformed a recently published Science journal model despite using a custom ML model 100 times smaller. These are not improvements over Fable 5. They are the same model, operating in areas Fable 5 is explicitly prevented from entering.

What Separates Fable 5 from Mythos in Practice

Cybersecurity queries. Fable 5 does not answer cybersecurity questions itself. Its classifiers silently reroute those requests to Opus 4.8. Mythos 5 answers them with the full capability of the underlying model, which Anthropic describes as the strongest cybersecurity capabilities of any model currently available.
Biology and life sciences. Fable 5 routes biology queries to Opus 4.8. Mythos 5 handles protein design, hypothesis generation, and genomics research autonomously and at a level that has already produced experimental results competitive with published peer-reviewed work.
Chemistry and distillation. These domains are safeguarded in Fable 5 and unrestricted in Mythos 5. Mythos 5 access for chemistry research is currently limited to Project Glasswing participants, with a broader trusted access expansion planned for biology researchers and, later, a wider cybersecurity program.
Access model. Fable 5 is available to anyone on the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. Mythos 5 requires Anthropic approval and is currently limited to organizations within or approved through Project Glasswing.
Price. Both models carry identical pricing at $10 per million input tokens and $50 per million output tokens. The difference between them is access, not the token rate.

Claude Fable 5 vs Claude Opus 4.8 – Benchmark Comparison

Metric	Claude Fable 5	Claude Opus 4.8
SWE-Bench Verified	95.0%	88.6%
SWE-Bench Pro	80.3%	69.2%
GDPval-AA Elo	1932	1890
BenchLM Overall	96	94
Coding (BenchLM)	85.6	76.4
Agentic (BenchLM)	85.2	80.1
Vision / Multimodal (BenchLM)	92.4	76.1
Knowledge (BenchLM)	74.8	70.1
Senior Engineer Benchmark (Every)	91/100	~63/100
Input price per 1M tokens	$10	$5
Output price per 1M tokens	$50	$25
Context window	1M+ tokens	1M tokens
Domain restrictions	Yes (cybersecurity, bio, chem, distillation)	None

Sources: Anthropic (June 9, 2026 launch announcement), Cognition AI FrontierCode, Hebbia Finance Benchmark, BenchLM, Every Senior Engineer Benchmark.

Claude Fable 5 vs Claude Opus 4.8 – Pricing and Access


Claude Fable 5

Base price: $10 per million input tokens, $50 per million output tokens. Exactly 2x Opus 4.8 at every tier, and less than half of what Mythos Preview cost before the June 9 launch.
Prompt caching: 90% discount on input tokens for cached prompts. For applications with repeated or templated prompts, this significantly reduces effective input cost and partially offsets the premium over Opus 4.8.
Available on: Claude API, Amazon Bedrock, Google Vertex AI, Microsoft Foundry, and Claude Platform on AWS. API model string is claude-fable-5.
Subscription plans: Free in Pro, Max, Team, and seat-based Enterprise plans through June 22, 2026. Usage credits required from June 23 onwards. Anthropic has stated its intent to restore Fable 5 as a standard plan feature when capacity allows, without committing to a date.
Fallback billing: When safety classifiers trigger and the response comes from Opus 4.8, you are billed at Opus 4.8 rates. Anthropic issues a fallback credit to cover the prompt-cache cost of the model switch.
US-only inference: Available at 1.1x pricing for input and output tokens for workloads that require US-based processing.
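The pricing rules above combine into a simple estimate. The following is a rough sketch, not a billing calculator: it uses the base rates, the 90% cached-input discount, and the 1.1x US-only multiplier from this list. It assumes the multiplier applies to the total after the cache discount and it ignores any cache-write surcharge or fallback credit, none of which the article specifies. The `Pricing` class and `estimate_cost` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_mtok: float            # USD per 1M input tokens
    output_per_mtok: float           # USD per 1M output tokens
    cache_discount: float = 0.90     # discount on cached input tokens (from the article)
    us_only_multiplier: float = 1.1  # US-only inference multiplier (from the article)


FABLE_5 = Pricing(input_per_mtok=10.0, output_per_mtok=50.0)


def estimate_cost(pricing, input_tokens, output_tokens, cached_fraction=0.0, us_only=False):
    """Estimate the USD cost of a workload under the stated assumptions."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    # Cached input tokens are billed at (1 - discount) of the normal input rate.
    billable_input = uncached + cached * (1 - pricing.cache_discount)
    total = (billable_input * pricing.input_per_mtok
             + output_tokens * pricing.output_per_mtok) / 1_000_000
    # Assumption: the US-only multiplier applies to the whole bill.
    return total * pricing.us_only_multiplier if us_only else total


if __name__ == "__main__":
    print(estimate_cost(FABLE_5, 1_000_000, 200_000))
    print(estimate_cost(FABLE_5, 1_000_000, 200_000, cached_fraction=0.8))
```

With 1M input tokens and 200K output tokens, the uncached estimate is about $20, and caching 80% of the input brings it to about $12.80. In this example, output tokens dominate once caching is in play, which is why the cache discount only partially offsets the premium.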

Claude Opus 4.8

Base price: $5 per million input tokens, $25 per million output tokens. Exactly half the Fable 5 price with no domain restrictions, fallback behavior, or routing complexity.
Released: May 28, 2026. It was Anthropic’s top publicly available model for twelve days before Fable 5 launched.
Available on: Claude API and all major cloud platforms. No domain-based routing, no access controls beyond standard API authentication.
Role in the Fable 5 architecture: Opus 4.8 receives every request that Fable 5’s safety classifiers redirect. Anthropic chose it as the fallback model because it considers it capable enough to handle sensitive domain queries reliably. In any Fable 5 deployment, Opus 4.8 is a de facto co-model.
Best suited for: Complex synchronous collaboration, shorter coding tasks, drafting, analytical work, and review passes where sustained multi-day autonomy is not required.

Fable 5 vs Opus 4.8 – Which Model to Use

Choose Claude Fable 5 when:

Your work involves large-scale codebase migrations or multi-day autonomous engineering. Fable 5 is built to run in agentic harnesses like Claude Code for extended periods, planning across stages, delegating to sub-agents, and verifying its own output. Stripe’s 50-million-line codebase migration, completed in a day against a two-month estimate for a human team, is the clearest available production data point.
Document-heavy analytical work requires real vision capability. Fable 5 extracts structured data from charts, tables, and PDFs at a level of accuracy that separates it from Opus 4.8 on senior-level financial reasoning. Its lead on the Hebbia Finance Benchmark is the cleanest measure of this advantage for finance, legal, and architecture use cases.
Context scales into the millions of tokens across a long session. Fable 5’s performance improvement with persistent memory is three times that of Opus 4.8 under the same conditions. For long-running research or analysis tasks that require sustained coherence at scale, that multiplier compounds with each session.
Task failure costs more than the 2x token premium. On the Every Senior Engineer Benchmark, Fable 5 scores 91 versus Opus 4.8’s 63. For work at that level of complexity, the comparison is not a quality gradient. It is a completion gradient.

Choose Claude Opus 4.8 when:

Your tasks are short, synchronous, and session-bounded. Drafts, code review, research summaries, analytical passes, and real-time collaboration all sit in Opus 4.8’s design range. It handles them well at half the Fable 5 cost per token.
Your work falls in domains where Fable 5 routes to Opus 4.8 anyway. Cybersecurity, biology, chemistry, and distillation queries sent to Fable 5 return Opus 4.8 responses regardless. Because those fallback responses are already billed at Opus 4.8 rates, calling Opus 4.8 directly gives you the same output without the classifier hop or the model switch.
You are running high-volume API traffic at scale. The 2x per-token premium on Fable 5 compounds quickly across millions of requests. For workloads where task complexity does not clear the bar Fable 5 is optimized for, Opus 4.8 delivers genuine frontier performance at better economics.
You need clean, predictable routing with no domain-based behavior changes. Fable 5’s safeguard fallbacks add complexity to any deployment that might receive sensitive-domain queries. Opus 4.8 produces fully predictable behavior across all input types with no routing logic required.
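The two checklists above can be condensed into a small routing heuristic. This is only a sketch of the article's recommendations, not an Anthropic-provided router. The `Task` fields are invented for illustration, and while `claude-fable-5` is the model string quoted in this article, the Opus 4.8 string below is a placeholder because the article does not give one.

```python
from dataclasses import dataclass

FABLE = "claude-fable-5"  # model string given in the article
OPUS = "claude-opus-4-8"  # placeholder: the article does not state this string

# Domains the article says Fable 5 redirects to Opus 4.8.
RESTRICTED_DOMAINS = {"cybersecurity", "biology", "chemistry", "distillation"}


@dataclass
class Task:
    domain: str = "general"
    long_running: bool = False       # multi-stage or multi-day autonomous work
    vision_heavy: bool = False       # charts, PDFs, screenshots
    failure_is_costly: bool = False  # a stalled job costs more than the 2x premium


def choose_model(task):
    """Pick a model following the article's guidance, checked in priority order."""
    if task.domain in RESTRICTED_DOMAINS:
        # Fable 5 would hand these to Opus 4.8 anyway, so skip the extra hop.
        return OPUS
    if task.long_running or task.vision_heavy or task.failure_is_costly:
        return FABLE
    return OPUS


if __name__ == "__main__":
    print(choose_model(Task(long_running=True)))
    print(choose_model(Task(domain="biology", long_running=True)))
    print(choose_model(Task()))
```

The domain check comes first on purpose: a long-running biology task still ends up on Opus 4.8, so sending it to Fable 5 gains nothing.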

The Bottom Line

Choosing between Claude Fable 5 and Claude Opus 4.8 is not really a quality argument: it is a task-architecture argument. Fable 5 is the right model when work is long, autonomous, vision-heavy, or complex enough that Opus 4.8 would fail rather than underperform. Its 80.3% SWE-Bench Pro score, 91-out-of-100 senior engineer result, and multi-day execution capability make that case clearly. Opus 4.8 is the right model for everything else: well-priced, fast, reliable, and free from the domain-based routing overhead Fable 5 carries. The 2x token premium on Fable 5 is easy to justify when it is the difference between a job finishing and a job stalling. It is harder to justify when Opus 4.8 would have done the same work at half the cost.
