
For thirty years, coordinated vulnerability disclosure has run on a simple, human-paced bargain. A researcher finds a flaw. They tell the vendor privately. A clock starts β usually ninety days β and when it runs out, the details go public whether the vendor has shipped a fix or not. The whole arrangement assumes that finding bugs is hard and slow, that the supply of fresh vulnerabilities arrives at a trickle a maintainer can absorb, and that the threat of eventual publication is enough to make vendors act.
Anthropic's Project Glasswing breaks the first assumption, and in doing so it strains every other one. Using Claude Mythos Preview β the unreleased frontier model at the center of the initiative β Anthropic and its coalition partners say they have surfaced thousands of high- and critical-severity vulnerabilities, with partners cumulatively reporting more than 10,000 flaws. The model can find, in hours, classes of bugs that sat undisturbed in battle-tested code for decades. Discovery has been industrialized. The mechanics of getting those bugs fixed β disclosing them responsibly, one maintainer at a time β have not.
This article is an explainer about that second half: how you responsibly disclose vulnerabilities at machine scale. Three mechanisms carry the weight β human triage by expert contractors, SHA-3 cryptographic commitments, and a 90-plus-45-day disclosure clock β and one uncomfortable number sits underneath all of them. As Elegant Software Solutions, we are reading these mechanisms from the outside, as analysts, against Anthropic's own published descriptions. We are not a Glasswing participant. Everything below traces to Anthropic's primary sources.
Start with the statistic, because it reframes the entire exercise. In its Frontier Red Team writeup, Anthropic states plainly: "fewer than 1% of the potential vulnerabilities we've discovered so far have been fully patched by their maintainers."
Read that carefully, with its hedges intact. It says "potential" vulnerabilities, not confirmed exploits. It says "fully patched," not acknowledged or triaged. And it locates the failure precisely β "by their maintainers" β not at the discovery end of the pipeline. Anthropic is not reporting that fixing is going slowly. It is reporting that, as of disclosure, the fixing has barely begun.
That is the gravitational fact of machine-scale disclosure. When one party can generate findings orders of magnitude faster than the rest of the ecosystem can consume them, the disclosure process stops being a courtesy protocol and becomes a throughput problem. Every design choice Anthropic has made β the human reviewers, the hash commitments, the clock β reads as an attempt to manage that imbalance without making it worse.
The most consequential design decision is also the least technical. Before any report leaves Anthropic for a maintainer, a person reads it.
In Anthropic's words: "we have contracted a number of professional security contractors to assist in our disclosure process by manually validating every bug report before we send it out to ensure that we send only high-quality reports to maintainers."
This is worth dwelling on, because it is the opposite of what "machine scale" would imply. The natural instinct, given a model that can produce thousands of findings, is to automate disclosure too β to fire reports at maintainers as fast as the model emits them. Anthropic explicitly does not do this. Every report passes through a human expert who validates it first. The model finds; humans gate.
There is a defensible logic here, and it connects directly to the <1% number. A maintainer's scarcest resource is attention. A flood of unvetted, machine-generated reports β many of them false positives, many of them low-severity noise β would not just waste that attention; it would teach maintainers to ignore the channel entirely. The credibility of the whole disclosure relationship depends on the signal-to-noise ratio of what arrives. Anthropic's own coordinated-disclosure policy reinforces the point, committing not to "submit large volumes of findings to a single project without first reaching out," and to avoid flooding any one maintainer with an unmanageable amount of work. The human triage layer is the throttle that makes restraint possible.
It is also, quietly, the bottleneck Anthropic chose. Human review does not scale the way Mythos does. The pipeline is deliberately rate-limited at exactly the step a hype narrative would want to remove.
The second mechanism solves a subtler problem: how do you prove you found something first, without telling the world what you found?
If you have discovered thousands of unpatched vulnerabilities, you cannot publish their details β that would be handing exploits to attackers before any patch exists. But you may want, later, to demonstrate that you held a specific finding at a specific moment in time. Maybe to establish provenance. Maybe to show the timeline was honored. Maybe simply to make the eventual reveal auditable.
Anthropic's answer is a cryptographic commitment. In its appendix, it publishes a list of hashes β "Each of the values below is the SHA-3 224 hash of a particular document" β where each document is a sealed vulnerability report. The hashes are public; the reports are not. When disclosure completes, Anthropic says, "we will replace each commit hash with a link to the underlying document behind the commitment."
The cryptography here is doing something precise, and it is easy to overstate. Anthropic is unusually candid about the limits: a commitment "is a way for us to provide proof that we have certain files without revealing them. While it does not prove anything about the contents of these filesβthey could be emptyβit allows us to later show that we had these files at this moment in time."
That caveat β they could be empty β is the whole ethical crux. A SHA-3 hash published today proves only that some file existed and was held at that timestamp. It proves nothing about whether the file contains a real, exploitable vulnerability. The validity of the finding rests entirely on what gets revealed later, and on the human triage that vetted it. The hash is a timestamp, not a verdict.
This is where an analyst should slow down, because the mechanism is more rhetorically loaded than it first appears.
Publishing thousands of commitments is a form of public claim-staking. Even without revealing a single technical detail, the act of posting a wall of hashes broadcasts a message: there are bugs here, we found them, the clock is running. It plants flags. To a maintainer scanning the appendix, a commitment that corresponds to their project is a notice that a sealed finding against their code exists somewhere β without telling them what it is yet, and without proving it is real until the reveal.
That asymmetry is novel. Traditional disclosure is a private conversation that becomes public only at the end. Commitment-based disclosure at scale makes the existence and timing of findings public at the start, while keeping the substance sealed. It is cryptographically honest β Anthropic does not claim the hashes prove validity β but it shifts a burden. The ecosystem is asked to trust that the sealed documents behind those hashes are genuine, high-quality findings, on the strength of Anthropic's process and its contracted reviewers, until the calendar forces the reveal.
To Anthropic's credit, the design is auditable after the fact. When the hashes are replaced with real reports, anyone can verify that the revealed document matches the committed hash, confirming nothing was swapped in or backdated. That is a real accountability property, and a better one than "trust us." But it is accountability that arrives at the end of the window, not the beginning β which brings us to the clock.
Anthropic ties the reveal to a disclosure window. Its commitments are unsealed, it says, "Once our responsible disclosure process for the corresponding vulnerabilities has been completed (no later than 90 plus 45 days after we report the vulnerability to the affected party)."
The exact framing matters, and it is worth stating precisely against the source rather than paraphrasing. This is not "ninety days, with a discretionary extension where warranted." It is a fixed ceiling: the process completes no later than 90 plus 45 days β at most roughly 135 days β measured from the moment Anthropic reports the bug to the affected party. The red-team post does not break that total into "ninety days to patch, then forty-five before details go public"; it states the combined ceiling and leaves the internal structure to Anthropic's standing coordinated-disclosure policy, which the post links to.
That standing policy fills in the surrounding norms an analyst would expect: deference to a maintainer's own severity assessment, escalation to an external vulnerability coordinator if a project does not respond, faster timelines for vulnerabilities under active exploitation, and the commitment not to dump large volumes on any single project. The 90+45 ceiling is the outer bound; the policy is the conduct inside it.
Ninety days is, not coincidentally, the industry's load-bearing convention β the window popularized by coordinated-disclosure programs over the past decade and now treated as the default fair grace period between private report and public detail. By anchoring to it and adding a bounded buffer, Anthropic is signaling that it intends to play by the established rules even at a scale those rules were never designed for.
Here is the tension the whole exercise exposes, and it is a structural one, not a criticism of any single choice.
The 90-day norm was calibrated for scarcity. It assumes a maintainer receives a manageable stream of reports and can reasonably be expected to fix a given bug within a quarter. The norm's coercive power β fix it or it goes public β works because the maintainer can, in principle, comply. The clock is fair because the work is finite.
Machine-scale discovery removes that premise. When a single initiative surfaces thousands of findings, the constraint is no longer the vendor's willingness to act; it is the vendor's capacity to act. A volunteer maintainer of a critical open-source library cannot patch a quarter's worth of newly discovered flaws in ninety days simply because they were asked nicely on a deadline. The <1%-patched figure is the empirical proof: the clock is running, and the ecosystem is not keeping up. The deadline's logic β public exposure as leverage β starts to look less like accountability and more like pressure applied to people who lack the throughput to relieve it.
This is precisely where coordination infrastructure was supposed to help. Independent coordinators like CERT/CC exist to broker exactly this kind of multi-party, high-volume disclosure β to sit between finders and an overwhelmed set of maintainers and manage the flow. Anthropic's published material points to its own coordinated-disclosure principles and to escalation through an external coordinator when a project goes dark, but it does not, in the Glasswing red-team post, name CERT/CC or detail how the traditional coordination ecosystem absorbs an input this large. That is an open question for the ecosystem, not a claim Anthropic makes β and it is the question that determines whether machine-scale discovery becomes a net good or a sustained denial-of-service against the world's maintainers.
The honest read is that Anthropic has engineered the disclosure side about as carefully as a first mover could: human-gated, cryptographically auditable, bounded by the industry's own clock, explicitly restrained about volume. Those are the right instincts. But careful disclosure mechanics cannot manufacture remediation capacity that does not exist. Finding has been industrialized. Fixing has not. The <1% number is not a flaw in Anthropic's disclosure design; it is the disclosure design working exactly as intended, against a downstream that was never built for this throughput.
What does the 90+45-day clock actually mean? Per Anthropic's red-team post, its responsible-disclosure process for a given vulnerability completes "no later than 90 plus 45 days after we report the vulnerability to the affected party" β a fixed outer ceiling of roughly 135 days measured from the report date, not a 90-day window with a discretionary extension. The post states the combined total and defers the internal conduct to Anthropic's standing coordinated-vulnerability-disclosure policy.
Do the SHA-3 hashes prove Anthropic found a real bug? No β and Anthropic says so explicitly. A cryptographic commitment proves only that Anthropic held a particular file at a particular moment in time; in its own words, "it does not prove anything about the contents of these filesβthey could be empty." The hash is a timestamp of possession, not evidence that the sealed report describes a valid, exploitable vulnerability. That validity rests on the later reveal and on human triage.
Why SHA-3 specifically, and where are the commitments published? Anthropic uses the SHA-3 224 hash of each sealed vulnerability report and publishes the list of hashes directly in the appendix of its Frontier Red Team post. When disclosure completes, it says it will replace each hash with a link to the underlying document, making the commitment auditable β anyone can confirm the revealed report matches the hash that was posted earlier.
Is every report machine-generated and auto-sent? No. Anthropic states it has "contracted a number of professional security contractors" to manually validate every bug report before it is sent, so that only high-quality reports reach maintainers. The model does the finding; human experts gate the disclosure. This human review is deliberately the rate-limiting step in the pipeline.
What is the "fewer than 1%" statistic? Anthropic reports that "fewer than 1% of the potential vulnerabilities we've discovered so far have been fully patched by their maintainers." Note the precise wording: "potential" vulnerabilities, "fully patched," and "by their maintainers." It locates the bottleneck downstream of discovery β remediation capacity, not detection β and it is the central tension of disclosing at machine scale.
How does this strain the existing disclosure ecosystem? The 90-day disclosure norm was built for a world where vulnerabilities arrive at a pace maintainers can absorb, and where the public-exposure deadline is fair because the underlying work is finite. Machine-scale discovery breaks that premise: the constraint shifts from a vendor's willingness to fix to its raw capacity to fix. Coordination bodies such as CERT/CC exist to broker exactly this kind of high-volume, multi-party disclosure, but absorbing an input this large at sustained machine speed is an unresolved question for the ecosystem β one the disclosure mechanics alone cannot answer.
Discover more content: