The vulnerability bottleneck has moved

finding bugs is easy, fixing them is hard

Jun 11, 2026

Everyone who has ever written a line of code has almost certainly touched an open source library, even without realising it. Like many people, I’ve typed npm install countless times without thinking about it, because it’s just what you do. It’s a great way to install libraries, frameworks, and other development tools for your projects. npm alone is relied upon by more than 17 million developers worldwide and hosts over two million packages, making it the largest software registry in the world. It’s a critical part of the JavaScript community and helps support one of the largest developer ecosystems in the world. When it works, it’s invisible. But when it goes wrong, everyone downstream feels it.

On May 11th, it went very wrong. An attacker pushed 84 malicious versions across 42 @tanstack/* packages in the span of 6 minutes. The malware harvested credentials, self-propagated across every package the victim maintained, and exfiltrated data through a decentralised messenger that can’t be taken down. Within days it had spread to OpenSearch, Mistral AI, Guardrails AI and UiPath: 172 packages, 518 million cumulative downloads. And while the malicious versions were detected publicly within roughly 20 minutes and had a limited impact initially, the consequences would take a few more days to emerge.

Five days later, Grafana Labs confirmed a targeted attack in which the attackers gained unauthorized access to its GitHub repos and downloaded its codebase. The incident originated from the same TanStack npm supply chain attack. When Grafana detected the malicious activity on May 11, it immediately initiated its incident response plan, which involved rotating a significant number of GitHub workflow tokens. Unfortunately, one missed token was enough to give the attackers access to its GitHub repositories.
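If you want to see what a project actually resolved under the @tanstack scope, the lockfile is the place to look. Below is a minimal Python sketch, assuming an npm lockfile in the v2/v3 layout (a top-level "packages" map keyed by node_modules paths). The function names and the sample versions are made up for illustration. The affected versions themselves are listed in the TanStack advisory linked in the sources, so compare against that rather than anything here.

```python
import json


def load_lock(path):
    """Parse a package-lock.json file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def scoped_versions(lock, scope):
    """Return sorted (name, version) pairs for packages under `scope`.

    `lock` is a parsed package-lock.json (lockfileVersion 2 or 3 assumed).
    """
    found = set()
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/a/node_modules/@scope/pkg";
        # the package name is whatever follows the last "node_modules/".
        name = path.rpartition("node_modules/")[2]
        if name.startswith(scope + "/") and "version" in meta:
            found.add((name, meta["version"]))
    return sorted(found)


# Hypothetical lockfile fragment: the versions are placeholders,
# not a list of known-bad or known-good releases.
SAMPLE_LOCK = {
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/@tanstack/react-query": {"version": "1.2.3"},
        "node_modules/foo/node_modules/@tanstack/query-core": {"version": "4.5.6"},
        "node_modules/left-pad": {"version": "1.3.0"},
    }
}

if __name__ == "__main__":
    # For a real project: scoped_versions(load_lock("package-lock.json"), "@tanstack")
    for name, version in scoped_versions(SAMPLE_LOCK, "@tanstack"):
        print(f"{name} {version}")
```

This only tells you what was resolved, not whether it was malicious. It also says nothing about what ran in CI before the lockfile was updated, which is the part that bit Grafana.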

At this point you’d think NPM stands for Neatly Packaged Malware.

The TanStack attack is not an isolated incident. It is the latest wave in a series of npm supply chain attacks using the Shai-Hulud worm toolchain, where each wave builds on the previous wave’s technical sophistication. And yet none of this feels new. If anything, it feels routine to see a new critical vulnerability discovered every day. Which raises the obvious question: if this stuff is so catastrophic, why does it occur so regularly? Shocking, I know.

Make programming easier + make models smarter = make bad virus easier. That’s a gross oversimplification. It’s more accurate to say that as models get smarter, they enable cyberattacks that are faster, more adaptive, and far more scalable than anything achievable through hands-on-keyboard intrusions. Personally, I’m most concerned about scale, and about how cheap scaling has become. A loop running overnight can probe hundreds of targets simultaneously in a way no human crew could.

There are more incidents worth covering right now than this post could reasonably hold, and cataloguing all of them would be a Sisyphean task. So I’ve picked some of the most egregious ones I could think of.

Quick rundown

If you’re anything like me (chronically online and slightly paranoid) you’ve probably seen at least one new exploit nearly every day on your TL.

Up until three months ago, I wasn’t too concerned with what was happening in the security ecosystem. But around late April, a friend got caught in the blast radius of an attack, not as a developer but as a student trying to revise for her finals. She, along with thousands of students across thousands of institutions, woke up to find they no longer had access to their coursework online. Instructure, the company behind the widely used LMS Canvas, had suffered a major breach of its infra at the hands of ShinyHunters through a vulnerability in its Free-For-Teacher service. A few days later, the group posted a ransom demand on its data leak site and claimed to have exfiltrated 3.65 TB of data, approximately 275 million records from 8,809 educational institutions.

However, after the initial ransom deadline passed, the group pivoted to direct school-by-school extortion and defaced Canvas login pages by exploiting the same Free-For-Teacher vulnerability. Instructure took Canvas offline globally. This coincided with exam season, leaving students from schools like MIT, Stanford and Brown unable to access their learning materials, and many of them voiced their discontent on social platforms. The scale of the attack is what initially caught my attention, and it remains what concerns me most about model improvements. AI doesn’t need to be a genius to alter the economics of attack and discovery. It only needs to make certain workflows cheaper to run in loops, which makes them easier to scale.

Nobody can keep a secret anymore

A few weeks ago, Xint.io disclosed a new and exceptionally dangerous Linux local privilege escalation vulnerability. Copy Fail exploits a kernel memory corruption flaw and allows attackers to rewrite the cached contents of files on a Linux filesystem. It differs from Dirty Cow or Dirty Pipe because it is a straight-line logic flaw: it triggers without races, retries, or crash-prone timing windows.

The entire exploit is a 732-byte Python script that uses only standard library modules. The bug rewrites cached file contents in memory without touching the disk, bypasses standard integrity checks, and gets root on every major Linux distribution with no modifications required. To be fair, the disclosure itself was handled well: the bug was reported privately, a CVE was assigned, and the patch was merged before public release. However, the same scan surfaced additional high-severity vulnerabilities that are still working through that process. The system worked exactly as designed, and the balance was seemingly restored until threat actors showed up.

The 313 Team (Iran-aligned hackers) launched a sustained DDoS attack on Canonical’s infrastructure starting April 30, which coincided directly with Canonical’s advisory about Copy Fail and took down Ubuntu’s main site and package mirrors. Predictably, if one team can find a nine-year-old kernel bug in an hour, there’s nothing stopping five teams from doing it in parallel within the same embargo window. Coordinated disclosure was built on the assumption that finding something this serious was rare and expensive. That assumption is being put to the test right now, as AI has made it much easier for bad actors to exploit vulnerabilities and deliver more sophisticated attacks at a scale that previously wouldn’t have been possible.

Barely a week after Copy Fail, researcher Hyunwoo Kim (@v4bel) published Dirty Frag, which can obtain root privileges on major Linux distributions by chaining two vulnerabilities: the xfrm-ESP Page-Cache Write (CVE-2026-43284) and the RxRPC Page-Cache Write (CVE-2026-43500). Dirty Frag extends the bug class to which Dirty Pipe and Copy Fail belong. Because it is a deterministic logic bug that does not depend on a timing window, no race condition is required, the kernel does not panic when the exploit fails, and the success rate is very high. The critical part is that Dirty Frag works even if you applied the Copy Fail mitigation. Now here’s the part where the traditional disclosure model completely fell apart: the researcher reported it to the security team on April 29-30, then submitted patches publicly and coordinated with the linux-distros mailing list on May 7, with a 5-day embargo agreed upon.

On that same day, within hours, an unrelated third party published detailed exploit information for the ESP vulnerability, breaking the embargo. There are now ways to mitigate this exploit, and distributions have since caught up, but for a window that should never have existed, a reliable root exploit was public and unpatched across every major Linux distribution simultaneously. To top it all off, about three weeks ago another variant, named DirtyDecrypt, was identified as part of the Dirty Frag vulnerability set. It is a proof-of-concept exploit that allows attackers to gain root access on some Linux systems. The jokes are writing themselves at this point.

Because AI can now analyze code well enough to spot software vulnerabilities in seconds, it is putting the security industry’s commonly held assumptions to the test. One of them was that finding exploits took highly paid experts, which limited the pool of attackers to a certain extent. Now anyone with enough tokens to spend can zero in on vulnerabilities in widely used software in little time. That in turn has undermined the traditional bug-bounty process and the coordinated disclosure protocol, where the 90-day window used to give maintainers or vendors enough time to address a vulnerability before public disclosure.

Many vulnerabilities are now being exploited before they are even publicly disclosed. Himanshu, a security researcher at Cloudflare, reported a pretty bad and easily exploitable bug to a company in late April, only to find out that 11 people had found the same critical bug in roughly six weeks. The first report had come in a month previously, in March, and the bug still hadn’t been patched when he reported it.

A new pattern is emerging, one where LLM-assisted hunters are converging on the same bugs almost simultaneously, across totally unrelated reporters using totally unrelated workflows. The CTO of HackenProof, @d0rsky, shared that “Once a new vulnerability is discovered - especially via some LLM prompt/skills/automation, we start getting a wave of duplicate reports within days. Same root cause, slightly different wording. […] What concerns me more, is, if researchers can replicate these findings so quickly, what’s stopping blackhats from doing the same before the issue is fixed? Feels like the window between ‘first discovery’ and ‘mass awareness’ is getting dangerously short.” Realistically, if 10 people reported the bug, how many found it and did not report it? The same LLM that helped 10 honest researchers is also available to everyone else, including bad actors.

Silver Lining?

It would be a bit dishonest and doomerist to only go on about how AI model improvements are eroding the systems in place. In fairness, disruption is hardcoded into the process of technological transformation. It is not so different in essence from the decline of fax operators as the internet became ubiquitous. Some stories deserve a happy ending, so to end things on a more positive note, I want to make the point that improvements in model cyber capabilities are not all bad. The same week we were all watching kernel exploits stack up like a slow-mo disaster, Anthropic published a Glasswing update that made me feel at least fractionally better about where we are heading.

Since its launch in early April, Claude Mythos Preview has been able to find more than ten thousand high- or critical-severity vulnerabilities across the most systemically important software in the world. In the latest release about Project Glasswing, Anthropic says “progress on software security used to be limited by how quickly we could find new vulnerabilities. Now it’s limited by how quickly we can verify, disclose, and patch the large numbers of vulnerabilities found by AI”. This echoes the observations above: the bottleneck is no longer bug discovery, but everything that happens downstream of it. The orgs that have had access to this model have shared that their rate of bug-finding has increased by more than a factor of ten, which is made more impressive by the fact that the vulnerabilities it finds are often subtle or difficult to detect. Many of them are ten or twenty years old, with the oldest found so far being a now-patched 27-year-old bug in OpenBSD (an operating system known primarily for its security lol).

Open source used to be consumed consciously, informed by package popularity or human review. That assumption no longer holds. Developers increasingly rely on their AI coding tools to build entire features or products, and in the process pull in whatever packages the model deems necessary. Packages themselves used to contain transitive dependencies that were hand-picked by maintainers. That is also no longer true, as we’ve seen with npm. In light of these changing circumstances, Anthropic has spent the last few months using Mythos Preview to scan more than 1,000 open-source projects, and has found what it estimates are 6,202 high- or critical-severity vulnerabilities in these projects (out of 23,019 in total, including those it estimates as medium- or low-severity).

Anthropic has been quite considerate in acknowledging how already stretched-thin maintainers are being flooded with reports: “on top of the regular challenges of maintaining open-source software, maintainers have been facing a deluge of low-quality, AI-generated bug reports. Indeed, several maintainers have told us they’re currently severely capacity constrained, and some have even asked us to slow down our rate of our disclosures because they need more time to design patches.” At maintainers’ request, it sometimes chooses to disclose bugs directly, without further assessment.

That leads me to the other thing I’d feel bad not mentioning. A lot of the software this all runs on is maintained by people doing genuinely thankless work for free, and drowning in unthinkable amounts of slop PRs and critical security patches that need to be implemented. If you use their projects and you can contribute financially, please do.

So in the end whether these models make things better or worse comes down to who’s using them and why. I think looking ahead, we’re in for a couple more tumultuous months but we should emerge from it with a version of the world where critical infrastructure is more hardened than it has ever been, and bugs that have been hiding in plain sight for twenty years have finally run out of places to go.

Some takeaways

So what’s the solution?? Honestly, because the picture is pretty complex, it’s not surprising that no panacea exists yet. But as a bare-minimum starting point, it would be nice to have guides that show devs how to properly secure their GitHub accounts and repos, especially as this stuff gets increasingly common. While I’m not too familiar with what’s out there, I’ve found this checklist from Aikido for preventing these attacks, aimed at anyone using GitHub Actions for their CI/CD. The TLDR of that post is that you need to:

- Pin all third-party actions to a full commit SHA
- Set default GITHUB_TOKEN permissions to read-only
- Never use pull_request_target in public repos
- Never interpolate ${{ github.* }} directly into run: steps
- Use OIDC for cloud credentials instead of long-lived secrets

I also think there needs to be a drastic change of stance towards traditional vuln disclosures, in the sense that companies need to adapt quickly to the changing circumstances. Himanshu put it best: “Treat every critical security issue as P0 and fix it immediately (…) As in, stop what you are doing and fix it now.” If you are a vendor receiving a critical bug report, your clock starts the moment the report lands. If someone reported it to you, assume 10 other people have it and at least one of them is not friendly.

On the bright side, the same frontier models at the centre of the problem are also the most useful tools currently available for defence. Current frontier models, like Claude Opus 4.6 (and those of other companies), remain extremely competent at finding vulnerabilities. For example, Opus 4.6 found high- and critical-severity vulnerabilities in OSS-Fuzz, in webapps, in crypto libraries, and even in the Linux kernel. It’s much less effective at building exploits from them, which right now is a feature of Anthropic models. Beyond direct scanning, SOTA models are practically useful for first-round triage (severity assessment), de-duplicating reports, drafting reproduction steps, writing initial patch proposals, reviewing PRs, and auditing cloud configurations.
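
Of the jobs in that list, de-duplication is the easiest to picture in code. The sketch below is deliberately dumb: it groups reports by word overlap (Jaccard similarity) instead of asking a model, purely to show the shape of the workflow. The function names, the 0.5 threshold and the sample reports are all my own assumptions. In practice you would swap the similarity function for a model call or embeddings and keep a human in the loop.

```python
import re


def tokens(text):
    """Lowercased word set for a report; a crude stand-in for real analysis."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection over size of the union of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_duplicates(reports, threshold=0.5):
    """Greedily group (report_id, text) pairs that look like the same bug.

    Each report joins the first group whose first member it resembles,
    otherwise it starts a new group.
    """
    groups = []  # list of (token set of the first report, [report ids])
    for report_id, text in reports:
        toks = tokens(text)
        for representative, ids in groups:
            if jaccard(representative, toks) >= threshold:
                ids.append(report_id)
                break
        else:
            groups.append((toks, [report_id]))
    return [ids for _, ids in groups]


# Hypothetical incoming reports: same root cause, slightly different wording.
SAMPLE_REPORTS = [
    ("r1", "SQL injection in the login form username parameter"),
    ("r2", "Login form username parameter allows SQL injection"),
    ("r3", "Stored XSS in profile bio field"),
]

if __name__ == "__main__":
    for group in group_duplicates(SAMPLE_REPORTS):
        print(group)
```

The point is less the algorithm than the ordering. If duplicates are collapsed before a human looks at the queue, eleven reports of one bug cost one triage slot instead of eleven.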

And even if your threat model doesn’t require Mythos-class capability today, building the right scaffolds and workflows now with models that are publicly available is much more sensible than starting from scratch when those capabilities eventually ship broadly. Also, Anthropic recently published a reference implementation for autonomous vulnerability discovery and remediation with Claude, based on its partnerships with security teams that got early Mythos access.

Anyways, while I am in no way a security expert (I’m sure as hell not protected well enough), this is just an honest attempt at capturing some of what’s going on right now. This has also been bothering me for a while, so I thought why not put it down in writing as a form of catharsis.

Sources:

“Autonomous attacks ushered cybercrime into AI era in 2025” https://www.cybersecuritydive.com/news/cybercrime-ai-ransomware-mcp-malwarebytes/811360/
“GitHub confirms breach of 3,800 repos via malicious VSCode extension” https://www.bleepingcomputer.com/news/security/github-confirms-breach-of-3-800-repos-via-malicious-vscode-extension/
“Grafana Labs security update: Latest on TanStack npm supply chain ransomware incident” https://grafana.com/blog/grafana-labs-security-update-latest-on-tanstack-npm-supply-chain-ransomware-incident/
“Malware in 42 @tanstack/* packages” https://github.com/TanStack/router/security/advisories/GHSA-g7cv-rxg3-hmpx
“Security advisory from Mistral” https://docs.mistral.ai/resources/security-advisories
“MYTHOS FINDS A CURL VULNERABILITY” https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-vulnerability/
“THE END OF THE CURL BUG-BOUNTY” https://daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-bug-bounty/
“LLM-driven security reports disrupt coordinated disclosure” https://lwn.net/Articles/1070698/
“Let’s talk about AI slop” https://archestra.ai/blog/only-responsible-ai
“Godot maintainers struggle with ‘draining and demoralizing’ AI slop submissions” https://www.theregister.com/software/2026/02/18/godot-maintainers-struggle-with-demoralizing-ai-slop-prs/4206219
“Mini Shai-Hulud Hits @antv Ecosystem, 639 Compromised npm Package Versions” https://socket.dev/blog/antv-packages-compromised
“Cybersecurity Looks Like Proof of Work Now” https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html
“TanStack Npm Packages Compromised Inside The Mini Shai Hulud Supply Chain Attack” https://snyk.io/blog/tanstack-npm-packages-compromised/
“Exploit released for new PinTheft Arch Linux root escalation flaw” https://www.bleepingcomputer.com/news/linux/exploit-released-for-new-pintheft-arch-linux-root-escalation-flaw/
“New Linux ‘Copy Fail’ flaw gives hackers root on major distros” https://www.bleepingcomputer.com/news/security/new-linux-copy-fail-flaw-gives-hackers-root-on-major-distros/
“AI Has Taken Over Open Source” https://socket.dev/blog/ai-has-taken-over-open-source#Maintenance-Fatigue:-PRs-not-welcome
“The 90 day disclosure policy is dead” https://blog.himanshuanand.com/2026/05/the-90-day-disclosure-policy-is-dead/
“DirtyCBC: When Linux Kernel Decrypt-Before-MAC Turns Authenticated Encryption Into a Page-Cache Write” https://delphoslabs.com/blog/36142374-e1fe-80a9-9456-d3c64df81bd5/%20linux-rxgk-decrypt-mac
“Ubuntu Security Notices” https://ubuntu.com/security/notices
“Assessing Claude Mythos Preview’s cybersecurity capabilities” https://red.anthropic.com/2026/mythos-preview/

Eva Hill
