The incoming bug wave

Introduction

The last few weeks have seen a lot of panic and hype about the impending wave of bugs that the latest (frontier in the jargon) AI models will unleash. We’ve also had the inevitable counter-hype (like this or this).

Most of the fear is about Anthropic’s new Mythos and Fable model (because they are the same). Whilst Mythos is still being restricted to a select few¹ trial customers, Fable was launched to the public last week. It’s fair to say that the launch wasn’t great.

First, there was a wave of commentary from the security community that the guardrails in place were especially sensitive about anything security-related, and it couldn’t be effectively used for anything offensive or defensive. But then the plot twist: the US government got out their big ITAR stamp to force Anthropic to shut down access to Fable when Amazon discovered that the guardrails were actually entirely ineffective.

Now Anthropic are furiously trying to persuade the US Government that their guardrails can’t be trivially “jailbroken” by enemies of the US to help them make new cyberweapons². Given the terrible track record of AI companies in developing effective guardrails, and the fact that Anthropic are not well loved by the current administration, that may not be easy for them.

These shenanigans over access have at least somewhat cooled the panic that Mythos was generating, but that is likely temporary. Before its aborted release, there was a widely-adopted assumption that at some point these advanced models will just be let run against everything, and no software will be safe from endless waves of bugs. Cue the Starship Troopers references for those of us who grew up in the 90s.

Do you want to know more?³

How dangerous is the new model?

Hacking a particular application, software or service requires a number of steps, which we can summarise as two key activities:

Vulnerability research: finding a bug through reverse engineering and testing. This can be done against the disassembled application, the original source code (if available), or even from analysing recent patches.
Exploit development: often harder than first finding the vulnerability is writing code to take advantage of it, or more realistically to use it as part of a chain of exploits to fully hack the target.

Whilst older models have been okay at both these activities, Mythos is being touted as a real step-change in capability. In fact, detailed studies show that it is the most capable bug-finding model yet. Anthropic recently put out some numbers on their own use of Mythos for finding potential vulnerabilities in 1000 open-source projects:

1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves. Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity. That means that even if Mythos Preview finds no further vulnerabilities, at our current post-triage true-positive rates, it’s on track to have surfaced nearly 3,900 high- or critical-severity vulnerabilities in open-source code—in addition to those it has found for Project Glasswing’s partners. To be clear, we intend to continue scanning open-source code for some time, so we expect this number to rise.

Project Glasswing: An initial update, Anthropic, 22 May 2026

The same post has stats on acknowledged vulnerabilities and patches, which shows that a good proportion of the discovered bugs are legitimate findings.

So why is this such a problem - can’t software companies and maintainers just hurry up and fix discovered bugs, especially if they can also use powerful new models?

Defenders are already behind

In the cybersecurity arms race, the defenders have always struggled to keep up with attackers. Largely because there’s always going to be a power imbalance; as the old saying goes, a defender needs to fix every issue to keep the bad guys out, but an attacker only needs to find one way to get in.

Add to that a myriad of other new and old reasons that slow down the development of fixes and patches: from massive technical debt, to focussing on new features over security and reliability, and lower-quality AI-generated code⁴.

The Anthropic report mentioned above highlights the real bottleneck in fixing discovered bugs for the initial targets of Mythos: the manual review process to triage all the potential bugs. This would come as no surprise to many in the bug bounty world, who have been complaining about the volume of AI-generated mostly nonsense bug reports for a while.

Mind the Gap

Developing and releasing a fix for a bug is just the first step, as you then you need all the users to deploy the patch. Unsurprisingly, given the well-established problem of the patch gap, much of the worry about Mythos is that endless waves of new bugs will make things even worse for beleaguered defenders and organisations who are already struggling to keep up.

On the key problem of patching, RiskyBiz recently put out an interesting chat with Ollie Whitehouse, the NCSC’s CTO:

Ollie makes some good points on what companies should do to protect themselves, including minimising your online attack surface to hide as much stuff as possible from an internet-borne wave of incoming bugs, and the need to accelerate patching.

But some of what he says is less practical. His advice on restricting administrator access to dedicated terminals, and on deploying remote browsing and content deconstruction technologies, is probably much more feasible for high-risk Government environments than for your average SME. In fairness to Ollie, that’s the world in which he works.

Fundamentally, unless all major software manufacturers are getting access to advanced bug-finding models, then the patch gap is only going to get bigger, as the poor system administrator can’t apply a patch for a key application until it’s actually been released, tested and deployed. Maybe IT admins will need to drop their testing requirements for putting out patches, which probably means more things falling over because of some patch conflict. But at least that’s better than something getting hacked by a new AI-generated zero day.

Don’t believe the hype?

But does this justify all the doom-mongering? And how much of the current panic is being driven by Anthropic’s marketing?

To start with, Mythos should be the most capable model at finding and exploiting bugs, as it’s a new model specifically trained to be good at this problem set. It’s like Apple trumpeting about their newest iPhone being the best one yet - isn’t that the point of a new version?

Plus we’ve seen prominent and contrasting approaches that show older models can be just as effective as the latest big models. For example, when they are set up within a capable framework, or by running cheaper models in parallel. This should make sense to anyone who’s used an AI product: often it feels like how you use it is more important than the model you happen to be using. So we should expect step-changes in bug-finding capability from both new models and the better use of older models.

Overall, whilst Mythos may be over-hyped in terms of its dangers, it has likely served a useful purpose in raising awareness of the impact that AI is going to have on cyber security. This impact that was already in evidence before widespread use of Mythos - last week’s Windows patch was the biggest ever.

So, if we are safe to assume that we’ll see waves of bugs as new models come out or new ways of using them emerge, what does it mean for the cybersecurity defenders?

Conclusions

Whilst AI is clearly accelerating bug discovery and exploitation, a lot of the hype with Mythos seems to assume it’ll be let loose on everything. And that is unlikely to be the case, mainly because of economics.

The initial testing of Mythos has cost multiple tens of thousands of dollars, and that’s at current prices. Factor in the widely reported difference between what AI providers are charging and what it’s costing them, it’s clear that at the current rate there’ll have to be an economic reckoning that drives usage prices much higher.

That means that doing anything productive with models like Mythos will be prohibitively expensive in terms of tokens and cash, so only the most well-resourced organisations will have the access and budget to use Mythos for bug hunting at any scale. So whilst Google, Generic Enterprise IT Corp., and your local SIGINT organisation will probably have access, the small software company producing some app for a specific industry probably won’t.

Fine, those with legitimate access to Mythos may be using it sensibly, or at least privately when it comes to SIGINT. But what about when China gets direct access to Fable, or distils the new model and releases their own, cheaper version to all the cyber bad guys?

We have to hope that the same new models can be harnessed to also accelerate the process to understand, fix, test and deploy patches.

Notes

Well, in theory. ↩︎
To avoid ambiguity: I only ever use that word sarcastically. ↩︎
I’ll stop with the references now. ↩︎
Including reintroducing whole classes of bugs that we might have hoped were in the past. ↩︎