FABLE 5: $10 / $50 per Mtok
GPT-5.5: $5 / $30 per Mtok
GEMINI 2.5 PRO: $1.25 / $10 per Mtok
OPUS 4.8: $5 / $25 per Mtok
SONNET 4.6: $3 / $15 per Mtok
SWE-BENCH leader: Claude Fable 5, 80.3%
MMLU-PRO leader: GPT-5.5, 94.2
GPQA leader: GPT-5.5, 78.3
AFTA v1.0 whitepaper live at /whitepaper

Anthropic Split the Frontier in Two. Fable 5 Is the Half You Can Buy.

Kira Nolan · 8 min read
MODEL RELEASE

Anthropic did not ship Opus 5 today. At 10:00 AM Pacific on June 9, it announced one frontier model wearing two names. Claude Fable 5 is the generally available product: API, Amazon Bedrock, Google Vertex, and Microsoft Foundry on day one, wrapped in always-on safety classifiers. Claude Mythos 5 is the same underlying model with the safeguards lifted in specific domains, and you cannot buy it. It goes to vetted cyberdefense partners under Project Glasswing, infrastructure providers, and US government coordination programs.

The naming break is deliberate. "Fable is from the Latin fabula, 'that which is told,' akin to the Greek mythos," Anthropic wrote. The Claude 4.x family keeps shipping alongside it, and Opus 4.8 remains the default model everywhere. Fable 5 is a tier above, opt-in, and priced like it.

I have read the announcement, the platform docs, and the launch-day coverage so you do not have to. The capabilities are real. So are the asterisks, and the asterisks are where the story lives.

What $10 In, $50 Out Buys You

Pricing first, because it frames everything: $10 per million input tokens, $50 per million output. That is exactly double Opus 4.8 at $5 and $25, and Anthropic notes it is less than half what Mythos Preview cost the handful of companies that touched it this spring. The 1M-token context window is the default, with no long-context surcharge. Compare GPT-5.5, which charges a premium above 272K input tokens. A 128K output ceiling per request rounds out the spec sheet.
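To make the doubling concrete, here is a minimal sketch of the per-request arithmetic. It uses only the list prices quoted above. The workload size (200K input tokens, 20K output tokens) is a hypothetical picked for illustration, and the sketch ignores anything the article does not cover, such as caching or batch discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """List price in USD per million tokens."""
    input_per_mtok: float
    output_per_mtok: float


# Prices as quoted in the article.
FABLE_5 = Price(input_per_mtok=10.0, output_per_mtok=50.0)
OPUS_4_8 = Price(input_per_mtok=5.0, output_per_mtok=25.0)


def request_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given list price."""
    total = (input_tokens * price.input_per_mtok
             + output_tokens * price.output_per_mtok)
    return total / 1_000_000


# Hypothetical workload: 200K tokens in, 20K tokens out.
fable_cost = request_cost(FABLE_5, 200_000, 20_000)
opus_cost = request_cost(OPUS_4_8, 200_000, 20_000)
print(f"Fable 5: ${fable_cost:.2f}  Opus 4.8: ${opus_cost:.2f}")
```

Because both rates double, the ratio holds for any mix of input and output tokens.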

The model thinks whether you like it or not. Adaptive thinking is always on; sending thinking: disabled to the API returns a 400. You steer effort (low through max, default high) rather than toggling reasoning off. That is a first for an Anthropic flagship and it tells you what the model is for: long-horizon agentic work, not chat completions.

The launch-day customer claims are aggressive. Stripe says Fable 5 worked through a 50-million-line Ruby codebase migration and compressed months of engineering into days. Cursor's CEO called it "the state of the art model on CursorBench" and said it "opened up a class of long-horizon problems out of reach." GitHub shipped it to Copilot the same morning. Vendor-curated quotes, yes. But the curation is unusually heavy on autonomy duration, which matches the spec choices.

Consumer access has a clock on it. Fable 5 is included in Pro, Max, Team, and Enterprise plans through June 22, then moves to usage credits, with Anthropic saying it will restore standard-plan access as capacity allows. Demand, in its words, will be "very high and difficult to predict." Our live status board is tracking Anthropic's API health through the launch window; launch days are historically when probes earn their keep.

The Benchmark Table Has Footnotes That Matter

The vendor table is a rout on paper. SWE-bench Pro: 80.3 percent against 69.2 for Opus 4.8, 58.6 for GPT-5.5, and 54.2 for Gemini 3.1 Pro. FrontierCode Diamond, Cognition's hard set: 29.3 percent where Opus 4.8 managed 13.4 and GPT-5.5 sits at 5.7. OSWorld-Verified computer use: 85.0 against GPT-5.5's 78.7. On GDPval-AA, the economically weighted task Elo, Fable 5 posts 1932 to GPT-5.5's 1769.

Now the footnotes. Several of the most-quoted numbers in today's coverage are starred in Anthropic's own table as Mythos 5 scores: Terminal-Bench 2.1 at 88.0, Humanity's Last Exam with tools at 64.5, ExploitBench at 78.0, HealthBench Professional at 66.0. Those are ceilings measured on the unrestricted model that vetted partners get. Public Fable 5 routes flagged cyber and bio work to Opus 4.8 by design, so in exactly those domains the product you can buy performs at Opus level on purpose. Some outlets are quoting the starred rows as Fable 5 numbers without the caveat. They are Mythos 5 numbers, not Fable 5 numbers.

Two more grains of salt. Every number above is vendor-reported as of today; independent replication does not exist yet, and our benchmark tracker will pick up third-party runs as they land. And the Terminal-Bench comparison is not apples-to-apples: GPT-5.5's 83.4 runs through its own Codex CLI harness. The harness gap is real and measurable; it is the entire premise of our cross-harness leaderboard.

One Model, Two Products: How the Split Actually Works

The architecture is the news. Standalone classifiers watch every Fable 5 request for three things: offensive cybersecurity, biology and chemistry, and attempts to distill the model's capabilities. A flagged request does not get refused. It gets rerouted to Opus 4.8, mid-conversation if necessary. Anthropic says the classifiers trip in under 5 percent of sessions and concedes they are tuned conservatively enough to catch harmless requests.
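The routing behavior described above can be sketched as a small state machine. This is a toy model under loud assumptions, not Anthropic's implementation. The model identifiers are placeholder strings. The classifier is an injected callable because the article says nothing about how the real ones work. The "sticky" behavior, where a session stays on Opus 4.8 once flagged, is my inference from the billing description, not a documented guarantee.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Placeholder identifiers for illustration, not real API model names.
FABLE = "fable-5"
OPUS = "opus-4.8"


@dataclass
class Session:
    """Toy session router: flagged turns are rerouted, never refused."""
    is_flagged: Callable[[str], bool]  # stand-in for the real classifiers
    rerouted: bool = False
    served_by: List[str] = field(default_factory=list)

    def route(self, message: str) -> str:
        """Return the model that serves this turn and record it."""
        # Assumption: once a turn trips the classifier, the remainder
        # of the conversation is served by the fallback model.
        if not self.rerouted and self.is_flagged(message):
            self.rerouted = True
        model = OPUS if self.rerouted else FABLE
        self.served_by.append(model)
        return model


# Toy keyword check used only to exercise the router.
session = Session(is_flagged=lambda text: "exploit" in text.lower())
session.route("Refactor this billing module")
session.route("Write an exploit for this service")
session.route("Thanks, now add unit tests")
print(session.served_by)
```

The point of the sketch is the shape of the control flow: every turn gets a response, and the classifier's verdict only changes which model produces it.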

Mythos 5 is the same weights without the leash, plus mandatory 30-day data retention for safety review on every request, which also means Fable 5 itself is unavailable under zero-data-retention agreements. CyberScoop's framing is the one that stuck with me: Mythos on a leash. The red-team numbers Anthropic published are strong: over 1,000 hours of internal and external testing, no universal jailbreaks found, zero compliance on single-turn cyberattack-planning prompts.

Here is the omission I keep circling: no ASL tier. Every prior Anthropic frontier launch named its AI Safety Level designation under the Responsible Scaling Policy. The Fable 5 materials describe protections consistent with ASL-3 or above without committing to a number. For a company whose public identity is the RSP, declining to name the tier on its most capable launch is a choice, and nobody covering the launch today has gotten a straight answer on it.

The Friction the Launch Post Skips

Three operational wrinkles deserve more attention than they got. First, billing: per the AWS launch documentation, when the safeguards reroute your request, you pay Opus 4.8 prices for the Opus-served response, and a mid-conversation reroute bills the early tokens at Fable rates and the rest at Opus rates. Your cost model now depends on a classifier's judgment. Agents budgeting per-call (ours included) should treat Fable 5 spend as a range, not a constant.

Second, the distillation classifier is already the launch's loudest controversy. Builders report responses on frontier-LLM-development tasks being silently steered or degraded with no per-response notification, and the language from affected developers ("terrible, nefarious") is the sharpest I have seen aimed at Anthropic in a while. Whatever the merits, silent modification of paid output is a trust problem that a settings-page disclosure does not fully solve.

Third, a gotcha straight from the Claude Code docs: the classifiers can trip on workspace context alone. A security-themed repository, a CLAUDE.md full of exploit terminology, or even directory names can bounce a session to Opus 4.8 before the user asks anything. Security teams, who are precisely the audience Project Glasswing courts, will hit this first and hardest.
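Back to the first wrinkle, billing: budgeting for a range rather than a constant can be sketched as follows. The rates are the list prices quoted in this article. The `fable_share` parameter is a hypothetical simplification that assumes input and output tokens split in the same proportion at the reroute point, which real conversations will not do exactly.

```python
from typing import Tuple

# USD per million tokens as (input, output), as quoted in the article.
FABLE_RATES = (10.0, 50.0)
OPUS_RATES = (5.0, 25.0)


def cost(rates: Tuple[float, float], input_tokens: int, output_tokens: int) -> float:
    """USD cost of a token volume billed entirely at one rate pair."""
    in_rate, out_rate = rates
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def blended_cost(input_tokens: int, output_tokens: int, fable_share: float) -> float:
    """Cost when fable_share of the tokens bill at Fable rates, the rest at Opus.

    Assumption: the same share applies to input and output tokens.
    """
    if not 0.0 <= fable_share <= 1.0:
        raise ValueError("fable_share must be between 0 and 1")
    fable = cost(FABLE_RATES, input_tokens, output_tokens)
    opus = cost(OPUS_RATES, input_tokens, output_tokens)
    return fable_share * fable + (1.0 - fable_share) * opus


def spend_range(input_tokens: int, output_tokens: int) -> Tuple[float, float]:
    """(low, high) budget: rerouted from the first token vs. never rerouted."""
    return (blended_cost(input_tokens, output_tokens, 0.0),
            blended_cost(input_tokens, output_tokens, 1.0))


# Hypothetical agent run: 1M tokens in, 100K tokens out.
low, high = spend_range(1_000_000, 100_000)
midpoint = blended_cost(1_000_000, 100_000, 0.5)
print(f"budget between ${low:.2f} and ${high:.2f}; halfway reroute ${midpoint:.2f}")
```

A per-call budget set at the high end will never be exceeded by a reroute, since Opus 4.8 rates are the lower of the two.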

The Timing Is Not an Accident

Anthropic confidentially filed its S-1 on June 1 after a $65 billion Series H at a $965 billion post-money valuation; OpenAI filed days later targeting a debut as early as September. We covered what that window does to lab behavior in the government-equity piece last week. A flagship launch eight days after your S-1, on a date prediction markets had at 94 percent beforehand, is a launch built for the roadshow narrative: we hold the frontier, and we hold it responsibly.

TechCrunch's headline angle wrote itself: the most powerful public Claude shipped days after Anthropic warned that AI capabilities are getting dangerous enough to need exactly this kind of control. Both things can be true, and the two-product split is what holding both positions at once looks like as a shipped artifact. It also lands while Anthropic is negotiating a fourth silicon platform to serve exactly this kind of demand.

Our Take

The split is the product. For three years the frontier labs have tried to serve one model to every customer and bolt policy on top with refusals. Anthropic just turned the policy layer into product segmentation: capability for everyone, dangerous capability for the vetted, and a classifier deciding which product you are talking to on a per-request basis. Expect the other labs to copy the structure within two quarters, because it solves their regulator problem and their enterprise problem in one move.

What I cannot endorse is the silence in the seams. Silent reroutes that change your bill, silent steering of paid responses, and a missing ASL number are all the same failure: the governance is load-bearing but unannounced. Anthropic built the most legible safety architecture in the industry and then declined to label it. Say the tier. Flag the reroute in the response metadata. The trust cost of disclosure is always lower than the trust cost of discovery.

Disclosure of our own, since today it is unusually direct: TensorFeed's editorial and engineering tooling runs on Claude models, including the model this article covers. Every number above is linked to its source so you can check the work without trusting either of us.

Sources: Anthropic announcement, Claude platform docs, TechCrunch, CNBC, The Decoder, AWS launch blog.