
Project Glasswing and the Claude Mythos model behind it are arguably the most consequential AI-security story of 2026, and predictably, the signal has been buried under a layer of invented statistics, garbled benchmarks, and outright fabrications. Round numbers that appear in no Anthropic document get repeated as fact. Real figures get re-rendered until they look fake. A defensive, government-coordinated deployment gets recast as a clandestine offensive program.
Elegant Software Solutions tracks emerging-technology claims for a living, and we are not a Glasswing participant: we have no model credits, no partnership, no horse in this race. What follows is a claim-by-claim audit. For each circulating assertion we state the claim, label it FALSE, MISLEADING, TRUE-BUT-MISUNDERSTOOD, or UNVERIFIED, and then give the correction straight from Anthropic's own primary sources. Where a number is wrong, we replace it with the real one in the same breath.
Label: FALSE. Neither number appears in any Anthropic source: not the Frontier Red Team writeup, not the Glasswing page, not the expansion announcement. They are fabricated with a false-precision sheen that makes them feel authoritative.
The real framing comes in two distinct, separately-attributed numbers, and they should not be blurred together:
So the honest numbers are "thousands" (Anthropic, with Mythos) and "more than 10,000" (partners, cumulatively). Anyone citing 23,019 or 6,202 is repeating an invention.
Label: TRUE-BUT-MISUNDERSTOOD. There is a real ~1,000 figure here; the misunderstanding is what it describes. The Frontier Red Team post says: "We regularly run our models against roughly a thousand open source repositories from the OSS-Fuzz corpus." That is a standing evaluation cadence against a curated corpus (OSS-Fuzz), not a one-off heroic sweep of "1,000+ projects," and it is emphatically not the provenance of the fabricated 23,019 count. If you cite the ~1,000 figure, cite it for what it is: the size of the recurring benchmark corpus.
Label: FALSE (fabricated figure). No "5 million" (or any specific fuzzing-hit count) appears in Anthropic's writeup of the FFmpeg finding. The post discusses how hard media libraries are to fuzz in qualitative terms, even noting that "entire research papers have been written on the topic of how to fuzz media libraries like FFmpeg," but it never quantifies a missed campaign at five million, one million, or any other figure.
The actual finding is impressive enough without invented numbers: Mythos Preview surfaced a 16-year-old bug in FFmpeg's H.264 codec, where "if an attacker builds a single frame containing 65536 slices, slice number 65535 collides exactly with the sentinel." Anthropic also notes it found several other FFmpeg vulnerabilities "after several hundred runs over the repository, at a cost of roughly ten thousand dollars." Real numbers exist; "5 million fuzzing hits" is not one of them.
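To make the quoted collision concrete, here is a toy sketch of the general bug class, not FFmpeg's actual code. It assumes slice indices are stored in 16 bits and that the "no slice" sentinel is 0xFFFF (65535), which the quoted sentence implies but does not spell out; the names SENTINEL and first_colliding_slice are hypothetical.

```python
# Toy model of a sentinel collision. This is NOT FFmpeg's code; it only
# illustrates how a valid index can become indistinguishable from a marker.
SENTINEL = 0xFFFF  # assumed 16-bit "no slice assigned" marker (hypothetical)


def first_colliding_slice(num_slices: int):
    """Return the first slice index whose stored 16-bit value equals the
    sentinel, or None if no slice in the frame collides."""
    for slice_index in range(num_slices):
        stored = slice_index & 0xFFFF  # value as it would sit in a 16-bit slot
        if stored == SENTINEL:
            return slice_index
    return None


if __name__ == "__main__":
    print(first_colliding_slice(65535))  # indices 0..65534: no collision
    print(first_colliding_slice(65536))  # index 65535 matches the sentinel
```

Under these assumptions a frame needs exactly 65536 slices before a legitimate slice index looks like "unassigned," which is consistent with the threshold in Anthropic's description.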
Label: FALSE as stated, and this is the correction most worth getting right, because something genuinely did ship publicly. The two facts have to be kept separate.
The distinction is the whole point. Fable 5 ships with classifier-based safeguards that route cybersecurity, biology/chemistry, and distillation requests to a more conservative model (Claude Opus 4.8); Mythos 5 has those guardrails lifted and is not generally available. So "Mythos is coming to the public" is false; "a safeguarded Mythos-class model (Fable 5) is now public, while Mythos 5 stays gated" is true.
Label: FALSE. Both renderings are legitimate Anthropic figures describing the same result. The Mythos Preview system card reports CyberGym as 0.83 on a 0-to-1 scale. Anthropic's own Glasswing marketing page renders that same result as a percentage: "Mythos Preview 83.1%" versus "Opus 4.6 66.6%." A reader can see the 83.1% figure on anthropic.com/glasswing right now.
So 0.83 and 83.1% are two presentations of one benchmark result, not a discrepancy and not a fabrication. The only CyberGym error to avoid is inventing some third value. If you want to be maximally precise, cite the system card's 0.83 and note that Anthropic's page expresses it as 83.1% vs. 66.6%.
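The arithmetic behind "two presentations of one result" can be checked in a few lines. This is a minimal sketch, assuming the system card's 0.83 is simply a two-decimal rounding of the same score; the function name consistent_with_rounding is our own.

```python
def consistent_with_rounding(unit_score: float, percent: float,
                             unit_digits: int = 2) -> bool:
    """True if `percent`, rescaled to 0-1, lies within the rounding window
    of `unit_score` when that score is reported to `unit_digits` decimals."""
    half_step = 0.5 * 10 ** (-unit_digits)  # 0.005 for two decimals
    return abs(percent / 100 - unit_score) <= half_step


if __name__ == "__main__":
    # 83.1% is 0.831, which rounds to 0.83: same result, two renderings.
    print(consistent_with_rounding(0.83, 83.1))
    # 66.6% is 0.666, which is nowhere near 0.83: a different model's score.
    print(consistent_with_rounding(0.83, 66.6))
```

The same check is a quick way to test whether some third circulating value could plausibly be a rounding of the published score or is a different number altogether.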
Label: UNVERIFIED. Do not present as fact. This is where a real, sourced fact gets smuggled together with sensational embellishments, so precision matters.
The real, primary-sourced fact: Mythos 5 "will initially be deployed through Project Glasswing, in collaboration with the US government." Government collaboration on a defensive deployment is confirmed by Anthropic itself.
The unverified part is everything sensational layered on top: NSA offensive operations, Anthropic engineers embedded in them, and a Pentagon lawsuit. Those specifics trace to a single low-credibility outlet and are not corroborated by any reputable source. We are not asserting they are false (we cannot prove a negative), but no reader should treat them as established. The correct posture: acknowledge the documented government-collaboration fact, and flag the offensive-ops and lawsuit claims as unconfirmed and unsourced.
Label: MISLEADING. Apple appears in the 12-name launch roster (Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks) and nowhere else. There is no Apple quote, no described role, no Apple-authored post, no stated contribution. Being on a launch list is not the same as being a deep or active partner, and Anthropic's materials give no basis for characterizing Apple's involvement beyond name-on-the-roster. Treat any claim about Apple's specific Glasswing role as unsupported.
Label: FALSE. Anthropic frames the limited release as a deliberate, voluntary dual-use judgment, not a policy trip. The Frontier Red Team post explains the rationale in defensive-head-start terms: releasing "initially to a limited group of critical industry partners and open source developers... [to] enable defenders to begin securing the most important systems before models with similar capabilities become broadly available." Anthropic does not attribute the decision to a breached Responsible Scaling Policy threshold; it presents it as a choice. The distinction matters because "an automated guardrail tripped" tells a very different, and inaccurate, story than "we chose to hold this back."
The pattern across every fabrication is the same: a real, sometimes-startling capability gets exaggerated into a fake precise number, or a defensive program gets recast as something darker. Glasswing does not need embellishment. A model that finds thousands of high- and critical-severity bugs across major operating systems and browsers, that surfaces a 16-year-old FFmpeg flaw, and that is being held back from general release on purpose is a significant enough story told straight. Inventing 23,019 vulnerabilities or 5 million fuzzing hits does not strengthen that story; it makes the entire account easier to dismiss.
For anyone citing Glasswing or Mythos in their own analysis, the rule is simple: if a number looks suspiciously round (5 million) or suspiciously precise (23,019), check it against anthropic.com/glasswing, the expansion announcement, the Frontier Red Team post, and the system card before repeating it. The primary sources are public. Use them.
Did Claude Mythos really find 23,019 vulnerabilities?
No. That number appears in no Anthropic source. Anthropic reports finding "thousands" of high- and critical-severity vulnerabilities with Mythos Preview, and separately notes that Glasswing partners have cumulatively found "more than 10,000" high- or critical-severity flaws scanning their own codebases. The 23,019 and 6,202 figures are fabricated.
Is "83.1%" on CyberGym a fake benchmark number?
No. It is real and it is Anthropic's own. The Mythos Preview system card lists CyberGym as 0.83 on a 0-to-1 scale; Anthropic's Glasswing page renders the identical result as 83.1% (versus 66.6% for Opus 4.6). They are two presentations of the same legitimate figure, not a discrepancy.
Can I use Claude Mythos now that it's public?
Mythos itself is not public. What went public on June 9, 2026 is Claude Fable 5, a Mythos-class model that ships with safety classifiers that route cybersecurity, biology/chemistry, and distillation requests to a more conservative model. Claude Mythos 5, the version with those safeguards lifted, stays restricted to Project Glasswing partners, plus a planned trusted-access path for select biology researchers (Anthropic's own announcement is inconsistent on whether that researcher path runs through Mythos 5 or a bio-safeguards-removed Fable 5).
Is the NSA using Mythos for offensive cyber operations?
That specific claim is unverified and appears only in a single low-credibility outlet. What Anthropic does confirm is that Mythos 5 "will initially be deployed through Project Glasswing, in collaboration with the US government," which is a defensive framing. Claims of NSA offensive operations, embedded Anthropic engineers, or a Pentagon lawsuit are not corroborated by any reputable source and should not be treated as fact.
Was Mythos withheld because a safety threshold was tripped?
No. Anthropic frames the limited release as a deliberate choice to give defenders a head start "before models with similar capabilities become broadly available," not as the result of a breached Responsible Scaling Policy threshold. It is a voluntary dual-use judgment.
How deeply is Apple involved in Project Glasswing?
Apple is listed in the 12-name launch roster and nothing more: no quote, no described role, no published contribution. Anthropic's materials provide no basis for characterizing Apple as a deep or active partner beyond appearing on the list.