Reading view

There are new articles available, click to refresh the page.

Open-source security is posing challenges governments can’t easily solve

An epidemic of cyberattacks on open-source software has mounted in recent months, making clear how uniquely difficult it is to protect the publicly available code, from both a policy and a technical perspective, that serves as the foundation for so much of the digital world.

While open-source software security got a boost in attention under President Joe Biden — whose administration grappled with the fallout from the potentially catastrophic Log4j flaw that emerged in 2021 — a number of open-source experts say that government protection efforts have suffered setbacks under President Donald Trump. Many also say companies that heavily rely on open-source software, which is basically all of them, haven’t shouldered enough of the responsibility for safeguarding it.

“What we’re seeing is years of lack of investment sustainment in open-source software that is finally starting to catch up to us, where it seems like every week there’s a new supply chain compromise,” said Jack Cable, who held a role at the Cybersecurity and Infrastructure Security Agency where he worked on open-source security before departing under Trump.

The advancements of frontier artificial intelligence models stand to exacerbate the risk further, while simultaneously illustrating what makes defending open source difficult: Project Glasswing said shortly after its announcement that it had uncovered 6,202 high- or critical-severity vulnerabilities in a scan of more than 1,000 open-source projects, but that it had disclosed only 502 of them to open-source project maintainers and only 75 had been patched as of May 22 (albeit some due to typical patching lagtimes).

At the same time, there are questions about how much the government can help, even as overseas governments seek to focus on open-source security.

The evolution of open-source risk 

There are a series of factors contributing to the current threat to open-source software, experts say.

One is simply that attackers go to the area where they can get the highest return on their work. Compromising open-source software gives them the chance to get into the supply chain and exploit additional targets.

“Twenty years ago, open source was still fairly niche,” said Æva Black, who also worked on open-source security at CISA but left when Trump came back into power. “The potential blast radius if you managed to compromise open source was relatively small, because back then the world didn’t run on open source. Now almost everything runs on open source,” she said, from modern cars to satellites.

Another part is the nature of open-source software itself.

“It’s a symptom [of having] lots of open source [that] is a little bit under-maintained or not cared for enough, so that we spend too little effort and money and infrastructure on them,” said Daniel Stenberg, who is the creator and maintainer of cURL, a popular open-source project. “Lots of open source is being maintained by small teams, lots of volunteers, and I think that that’s a tough situation.”

That doesn’t mean the maintainers are to blame, Stenberg said. The companies that rely on open-source need to be diligent about using it, Black said.

“What we’re seeing in that realm right now is not new; it is more advanced and far more widespread,” she said. “The problem remains that companies who use open source — because open source is by far the most efficient way to collaborate on non-product value features — most companies are not implementing a responsible and safe utilization pathway.”

Open-source projects lack a systematic way to handle coordinated vulnerability disclosures, unlike companies or industry groups with formal processes, said Dan Lorenc, CEO and co-founder of Chainguard. Project maintainers sometimes aren’t reachable, and those who are available are flooded with reports, many of them unverified findings from AI tools that waste their time without adding value..

Of course, some of those vulnerability reports turn out to be legitimate. “Mythos and AI models have contributed to an uptick in the number of vulnerabilities and things that we’re able to find” in open-source software, said Alex Zenla, chief technology officer for the cybersecurity company Edera.

All of that leaves more room for companies, non-profits and world governments to improve open-source security.

A moment of momentum

While open-source software security isn’t a new issue, the 2021 discovery of the Log4j flaw sounded alarms within the cybersecurity community. Jen Easterly, then the director of CISA, called it “one of the most serious I’ve seen in my entire career, if not the most serious,” with the potential to affect hundreds of millions of devices given the ubiquitous nature of the popular open-source logging library.

A year later, the Cyber Safety Review Board released its report on the incident, concluding that swift action from industry and government averted a disaster. But the incident “called attention to security risks unique to the thinly-resourced, volunteer-based open source community,” it wrote. “This community is not adequately resourced to ensure that code is developed pursuant to industry-recognized secure coding practices and audited by experts.”

The U.S. government actions after included some steps focused specifically on open-source software such as creation of the Open-Source Software Security Initiative and hires of well-regarded open-source security experts at CISA such as Black, but also some steps that could be applied more generally and still help with open-source security, such as greater promotion of secure-by-design, memory-safe languages and software bills of materials (SBOMs).

Some of the Biden administration work on open-source security started before Log4j, such as provisions from an executive order he issued in 2021 that directed CISA along with the Office of Management and Budget and General Services Administration to issue guidance to agencies. 

The administration’s 2023 cybersecurity strategy also stepped into the long, thorny discussions over software liability, with a mention of open-source security: “Responsibility must be placed on the stakeholders most capable of taking action to prevent bad outcomes, not on the end-users that often bear the consequences of insecure software nor on the open-source developer of a component that is integrated into a commercial product.“ The Biden administration always indicated that addressing software liability would take a prolonged battle ahead.

Under Trump, many of the Biden administration’s efforts have languished. CISA’s splashy hires on open-source are gone, including Black, Tim Pepper and Anjana Rajan. Also departed are leading figures on secure-by-design and SBOMs, with CISA personnel cutbacks slicing deep. 

No one has seen any sign that the national cyber director-led Open-Source Software Security Initiative is active, with few participants remaining in government today. The Trump administration cyber strategy doesn’t mention open-source.

“The loss of open-source experts at CISA “is unfortunate, and it will be hard for the government to try to rebuild capacity, but I do think now more than ever CISA has a core role to play to secure open source software,” Cable said.

The pressure is mounting

It’s not that the issue is getting zero attention from those in a position to make a difference. Nick Andersen, the acting director of CISA, said last month that open-source security was an area of particular concern for him.

Andersen responded to concerns about CISA staffing levels on open-source security and spoke more broadly on the topic in a statement to CyberScoop.

“As artificial intelligence and other technologies have the power to transform how vulnerabilities are discovered and exploited, CISA recognizes that the open source software (OSS) that underpins much of the nation’s critical infrastructure will need to be hardened,” he said. “CISA actively collaborates with our partners on shared priorities, including OSS security, to ensure time and resources are spent where they matter the most.  We have an immensely talented team, but are also accelerating our hiring in critical areas, to strengthen the nation’s defenses against cyber threats.”

The Office of the National Cyber Director did not respond to requests for comment.

There’s been some activity on Capitol Hill, too. The Securing Open Source Software Act, which Cable worked on during a stint as a Senate staffer, would direct CISA and other agencies to take actions to mitigate open-source software security risks, but the legislation has stalled since its introduction in 2022. A portion of the bill, however, was included in the Department of Homeland Security funding law Trump signed in April, directing CISA to brief Congress on the value of establishing something like an open source program office, which some companies use to manage open source within a given firm.

Senate Intelligence Committee Chairman Tom Cotton, R-Ark., has pushed the executive branch to improve its awareness of foreign adversaries playing roles in open-source software used by national security-focused agencies.

The annual defense policy bill in the House calls on the Defense Department’s chief information officer to report to Congress on a plan to secure open-source software supply chains, saying lawmakers are “concerned that the Department lacks sufficient visibility into the origins, maintenance, and security of OSS applications and software dependencies.”

That defense authorization bill language is “really beneficial, and I think it signals acknowledgement of this changing of culture” around open-source security risks, said Hayden Smith, founder of HuntedLabs, whose company won a contract with the Space Development Agency on supply chain security — agency work that the defense bill singled out.

“The report language is the first time the Hill is trying to get a true handle on foreign influence in open source code where they have oversight,” he said, saying it was a “piece of the puzzle” along with Cotton’s letter and a memo from Secretary of Defense Pete Hegseth last year about foreign influence in the Pentagon supply chain. “It’s good and would trickle down into everyone who provides software to the department.”

Zenla, though, believes trying to isolate China from open-source systems isn’t in and of itself a good idea. 

“I don’t think that that makes a lot of sense, because they’re actually pretty good things that people contribute to open source,” she said. “Not everyone is malicious, and what are we going to do, spy on every single open source maintainer?” It’s more about doing things like making sure that highly-classified systems are set up in a separate way, she said.

Europe is also taking action to secure open-source software that the United States doesn’t seem ready or willing to do right now. Germany, for instance, devotes grants to the security of open-source projects, although Stenberg pointed out that sometimes money doesn’t equate to maintainers being able to fix flaws more quickly, depending on the project’s size.

The Cyber Resilience Act (CRA) adopted by the Council of the European Union in 2024 could offer another road on open-source security. The CRA requires those who use open-source software products as part of any commercial activity to take certain security measures. 

Black said that when she was at CISA, there were discussions between the agency and European counterparts about finding compatible ideas on open-source security, but that momentum died with the Trump administration.

But “Europe kept rolling, and now has in place a new legal framework that is set to really reshape open-source security for potentially the whole world, but certainly for anyone who wants to work with Europe on open source,” she said.

Lorenc recently wrote that “open source isn’t governable.” He said an organization like a neutral nonprofit, possibly using some government funding, should take responsibility for things like coordinating vulnerability disclosure into one pipeline. He also said there needs to be one authority in charge of “forking” — that is, taking a project and assigning stewardship elsewhere — when a maintainer isn’t responsive to vulnerabilities. 

There are differing opinions on how much past government warnings, advisories and guidance have helped. Smith gave some credit to government agencies that “have all responded to open source attacks using the means they have.”

Stenberg said that “I don’t think they make any big dent at all in the big scheme of things.” They might get some attention initially, “then two years later we all forgot about them, and they actually didn’t change much.”

Ideally, everyone could get on the same page, Zenla said. “The best way to do this is if people actually collaborated on a global scale on some sort of regulation around this, but that seems nearly impossible at the current moment,” she said. (The United Nations’ Open Source Week runs all this week.)

But if there’s an upside to the spate of attacks on open-source software, it’s the energy it gives to how better to secure it, Lorenc said, invoking the political saying to never let a good crisis go to waste.

“Everyone knows the industry has to change,” he said. “This is a really good crisis, and the right things are happening in the right places, and organizations are rethinking their culture around software development, and they know what they have to do. It’s just something that’s never been top of the priority list for the last 10 years. Now it is, and they’re doing it, and it’s, ‘Can we do it fast enough?’”

The post Open-source security is posing challenges governments can’t easily solve appeared first on CyberScoop.

Anthropic disables new models after government calls them a national security concern

The U.S. government on Friday ordered Anthropic to immediately suspend foreign access to Fable 5 and Mythos 5, its two most advanced artificial intelligence models, citing national security concerns tied to a reported method of bypassing the models’ safety restrictions. 

The directive, issued late Friday afternoon by Secretary of Commerce Howard Lutnick in a letter to Anthropic Chief Executive Dario Amodei, placed the two models under export controls that prohibit use by foreign nationals, whether inside or outside the United States. 

Because of the scope of the restrictions, which includes foreign-born Anthropic employees, the company announced Friday evening that it disabled the models to ensure compliance. Access to the company’s other AI models was not affected. 

Fable 5 and Mythos 5 had been released earlier this week, with Anthropic describing them as the most capable systems it had ever deployed. Mythos was available to members of Project Glasswing, which allowed selected cybersecurity companies to use the model to identify and address security flaws.

It’s unclear how the Commerce Department action affects Project Glasswing. Anthropic did not respond to a request for comment.

The Commerce Department‘s letter did not detail the specific national security concern. In its blog post Friday night, the company said its understanding is that the government became aware of a technique for “jailbreaking” Fable 5, a term for methods that circumvent a model’s built-in safety guardrails. According to Anthropic, the government provided only verbal evidence of what it described as a “narrow, non-universal jailbreak,” which essentially involved prompting the model to read a specific codebase and identify software flaws. 

Anthropic disputed the severity of the finding. The company said it reviewed a report it believes formed the basis of the government’s directive and found that the capabilities demonstrated were already available in other publicly accessible models, including OpenAI’s GPT-5.5. The company said those same capabilities are used routinely by cybersecurity professionals for defensive purposes. 

Katie Moussouris, chief executive of the cybersecurity firm Luta Security, posted on BlueSky Saturday that the issue stems from “Defense Oriented Prompting,” a security-first method of engineering AI system instructions that treats natural language as code.

Other reports claimed that Amazon was responsible for flagging the security issues in the model. The company did not respond to CyberScoop’s request for comment. 

Anthropic acknowledged in its statement that perfect jailbreak resistance is not achievable for any model provider, and said it had designed Fable 5 around a “defense in depth” strategy, combining narrow jailbreak resistance with active monitoring. The company said no testers had found a universal jailbreak capable of broadly bypassing the model’s safeguards. 

“We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people,” Anthropic wrote. “If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”

Friday’s directive is the latest episode in a prolonged dispute between Anthropic and the Trump administration. In February, President Donald Trump moved to bar Anthropic’s products from federal agencies after the company sought stronger restrictions on how the Pentagon used its technology.

Despite that, as Anthropic released Mythos under Project Glasswing, the National Security Agency was given Mythos 5 to conduct offensive cyber operations. Earlier this month, Trump signed an executive order directing federal agencies to bolster cyber defenses and establish a voluntary mechanism for the government to gain early access to powerful AI models before deployment. 

The administration’s stated rationale for Friday’s action drew widespread skepticism from researchers and analysts. Dean Ball, a senior fellow at the Foundation for American Innovation, called the move “baffling.” Chris McGuire, a senior fellow at the Council on Foreign Relations, said targeted export controls on model access could be a legitimate policy tool, but called the across-the-board restriction “highly questionable” and the deemed export provisions — which restrict foreign nationals inside the U.S. — “just absurd.” 

The broader implications for the AI industry remain uncertain. Aaron Levie, chief executive of Box, described the directive as “a big turning point for AI regulation,” arguing that the government’s willingness to deem specific models too powerful for certain uses establishes a precedent with potentially far-reaching consequences.

Other tech leaders in the government supported the action. 

“We fully support @POTUS and @SecWar in prioritizing national security and the security of our warfighters, DIB partners, critical infrastructure, international partners and allies,” DOD CIO Kirsten Davies wrote in a social post on X. “Some things are simply more important than revenue cycles, clickbait, and pre-IPO valuation. America First. Always.”

Anthropic said it believes the situation stems from a misunderstanding and is working to restore access as soon as possible.

The post Anthropic disables new models after government calls them a national security concern appeared first on CyberScoop.

Inside the race to adapt to an AI-powered security world

Troy West was in Warsaw when his dinner was interrupted by his phone. But he was happy about it.

West, associate director of cybersecurity for autonomous offensive security company XBOW, had just learned that a trial version of the company’s platform had found a vulnerability that led to a full takedown of a development environment used by Moderna, the pharmaceutical company primarily known for its work related to mRNA vaccines.

It was, by most measures, exactly the kind of outcome a security team dreads. But for West and Farzan Karimi, Moderna’s deputy CISO, it was something closer to a proof of concept. XBOW’s product had done in hours what a human penetration tester could not — and it had done so with a level of persistence and creativity that neither of them had fully anticipated.

The episode is one data point in a much larger shift now rippling through the cybersecurity industry: The artificial intelligence models discovering vulnerabilities are moving faster than the teams that have to patch them.

Across recent conversations and presentations, industry experts said the tools are getting sharper, the attack surface is getting larger, and the gap between finding a problem and fixing it is not closing fast enough. For now, most organizations are caught between the speed of discovery and the slowness of remediation, with vendors across the industry rushing to position their products as the way through.

A shift in scale 

The inflection point came with Claude Mythos. When Anthropic announced the highly guarded model, security executives at major enterprise technology companies took notice in a way they had not with prior frontier releases. 

Zscaler was among the early organizations given access to the model, and CEO Jay Chaudhry told CyberScoop that he directed his team to use it to probe the company’s own applications.

“Are we finding some serious stuff? Yes, indeed,” Chaudhry told CyberScoop at Gartner’s Security & Risk Management Summit. He was careful to note that the findings were not necessarily more severe than those produced by other models. The issue, he said, was volume. 

“There aren’t enough resources and cycles to fix all those,” he said. 

The reason Mythos changed the calculus, according to Tom Gillis, general manager for infrastructure and security products at Cisco, comes down to code complexity. Legacy network infrastructure was built on tens of millions of lines of code developed over decades, and earlier AI models lacked the context window and reasoning capacity to comprehend it in full.

“The models couldn’t understand the entirety of it before,” he told CyberScoop. “Now they can. That’s why they’re finding all these vulnerabilities.”

The problem runs deeper than application code. Firewalls and network switches often run for decades without updates or reboots, and many have never been patched in any meaningful way. The combination of aging infrastructure and newly capable AI models has created what Gillis described as a meaningful and accelerating shift in attacker capability that the industry’s existing operational rhythms were not built to absorb.

An opportunity in existing technology 

Cisco’s answer to the oncoming vulnerability deluge is a technology it calls Live Protect, a compensated control built on eBPF, a Linux feature that lets security software operate at the kernel level to block threats without rewriting system code.

“It’s a pinpoint, laser-fine control that can shield a vulnerability on a production system,” Gillis said. “We’re not touching or modifying the binaries of that production system.”

The intent is to shrink the window between discovering a vulnerability and the next scheduled patch, allowing IT teams to fix issues without taking systems offline.

“This is a finger in the dike that plugs a hole until you get to new change control windows,” he said, acknowledging that some customers may be tempted to treat the shields as a permanent solution. 

The product has been shipping since October, but customer urgency shifted noticeably after Mythos. “Customers are like, ‘Oh, good story, Tom. I’ll think about it.’ Now it’s like, ‘Oh my God, turn this thing on right now.’”

He also noted that eBPF is open source, and said he expects the broader industry to follow. 

“While I’m very proud of Cisco leading the market with these compensated controls, I know my competitors have to do this.”

The bot that broke everything 

But shielding vulnerabilities only works if you know they exist. Karimi, the Moderna deputy CISO, faced a different problem: His vulnerability management system was surfacing hundreds of high-severity findings with no reliable way to know which ones an attacker could actually exploit. His team had skilled red-teamers, but they were finite resources. What he needed was something that could test continuously, everywhere.

“We have some very senior red-teamers and pen-testers in our organization that are pointed in a specific direction,” Karimi said during a presentation at the Gartner summit. “XBOW is covering different attack stories for us.”

West, who leads offensive security for XBOW, describes the platform as a response to a structural problem in how offensive security has traditionally worked. Human testers scope an engagement, run it, write a report, and move on. The window between tests is where risk accumulates.

“Historically you have exploit developers spending time finding the right vulnerabilities, writing the exploits, finding if those exploits are reachable, and then finding a way to chain them all together,” West said. “That takes a long time.”

Given the realities, Karimi decided to put XBOW through a trial, which produced two notable findings.

In the first, XBOW identified a web application firewall bypass on a company application built on the Spring Boot framework. The bypass involved encoding a single character (a capital “A”) as its percent-encoded URL equivalent (A), which the WAF interpreted as a legitimate request, allowing the bot unfettered access. 

The second finding, which was the cause for West’s dinner interruption, was more consequential. West had provided XBOW with access to the source code of an internal application called Orders, used by Moderna’s research partners to procure drug substances, but no login credentials. The platform identified a valid API key embedded in the source code, used it to authenticate, and then began probing the application’s APIs for SQL injection vulnerabilities.

What happened next was not entirely planned. One of those APIs handled a malformed SQL injection attempt in an unexpected way, dumping garbage data into a shared routing application that other services depended on.

“Not only was it able to kick that Orders app I showed you, but it somehow kicked over the entire ecosystem of apps,” West said.

Human pen-testers who reviewed the findings afterward confirmed they were valid, and said they would not have found them on their own. Karimi said despite the outage, his team recognized the value immediately.

“If we’re able to demonstrate where you could have an outage in a safe testing environment, that’s a great signal,” he said.

The broader value, Karimi argued, is in forcing prioritization when bugs are discovered. “If you have exploit proofs, you can provide that plus-one modifier and really point your developers to remediate the top tier of real risk that’s been validated.”

But he does worry about the volume of bugs that will be surfaced by these tools. 

“How do we now handle the volume of bugs that have gone up due to AI-driven scale?” he said. “That’s a whole other problem space.”

A broader reckoning

Across these conversations, a consistent theme was that even as defenders are trying to get arms around the forthcoming wave of bugs, it’s going to be a tremendously uphill battle. That mirrors what some of the industry’s top leaders have been saying for months. 

It also mirrors what the model developers themselves have consistently been warning about. In its announcement about expanding access to Mythos, Anthropic admitted the timeline for a publicly available tool similar to its cybersecurity-focused model is shortening, and there are no guarantees it will be released with safeguards. 

“In that world, cyberattacks could occur much more often, and in much more unpredictable forms,” the blog post reads.

Gillis was blunter about what happens to organizations that don’t move. 

“Some people will be slow to change,” he said. “But the consequence of not making that change is gonna be front-page news. It’s a massive, massive compromise. You know, like, ‘you gave up every credit card number.’ Bummer.”

The post Inside the race to adapt to an AI-powered security world appeared first on CyberScoop.

Anthropic expanding access to Project Glasswing

Anthropic is broadening access to its Project Glasswing program, adding approximately 150 organizations in 15 countries, the company announced Tuesday, as its restricted Claude Mythos Preview model has already surfaced more than 10,000 high- or critical-severity software vulnerabilities since the program launched in early April.

The expansion follows an initial cohort of roughly 50 partners that were announced when Anthropic first unveiled the initiative. Those members included technology companies such as Amazon Web Services, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, among others.  

According to the announcement, the new group covers sectors that were underrepresented in the first wave, including power, water, healthcare, communications, and hardware. Many of the new partners are vendors whose codebases underpin critical infrastructure systems.

The company did not give any further details on what companies or organizations were part of the new cohort.  Sources tell CyberScoop that NetSkope and Rubrik, which specialize in cloud security and data management, is part of the group given access in this latest round.

The scale of what Mythos Preview has already found is drawing attention across the security industry. Cloudflare identified 2,000 bugs across its critical-path systems, including 400 rated high or critical, with a false-positive rate the company described as better than that of human testers. Mozilla found and fixed 271 vulnerabilities in Firefox 150 while testing the model, more than 10 times the number found in a previous Firefox version using an earlier Anthropic model. Several other partners reported that their rates of bug discovery increased more than tenfold after deploying the model. 

Anthropic also used Mythos to scan more than 1,000 open-source projects, flagging 23,019 potential vulnerabilities, 6,202 of them estimated as high or critical. Of 1,752 high- or critical-rated findings independently reviewed, over 90% were confirmed as valid. 

The findings have shifted what Anthropic describes as the central issue in cybersecurity. Despite the enhanced ability to discover flaws, the company admits there are challenges with verifying, disclosing, and patching them before attackers can take advantage.

“The bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them,” the company said in its blog post

That bottleneck has broader implications. A joint report from the Cloud Security Alliance, the SANS Institute, and OWASP concluded that organizations are “likely to be overwhelmed” in the near term by threat actors using AI to find and exploit vulnerabilities faster than defenders can patch them.

Anthropic has said it will not release Mythos-class models to the general public, citing the absence of safeguards sufficient to prevent serious misuse. In the interim, it has released Claude Security, a product using its publicly available Claude Opus 4.8 model that has been used to patch more than 2,100 vulnerabilities in three weeks. 

The program’s expansion comes as the Trump administration signed a scaled-back executive order on AI security. The order, which was signed hours after Anthropic’s announcement, sets up a voluntary framework requiring AI developers to submit advanced models to a government review up 30 days before public release.

The post Anthropic expanding access to Project Glasswing appeared first on CyberScoop.

Anthropic: Mythos finds more than 10,000 software flaws in first month

Anthropic said its month-old Project Glasswing initiative has uncovered more than 10,000 high- or critical-severity software vulnerabilities across systemically important code, a finding the company says has shifted the central problem in cybersecurity from discovering flaws to verifying and patching them.

The findings, drawn from partner reports and independent evaluations, mark one of the first large-scale accountings of what a frontier AI model can do when pointed at widely used code, and of the bottlenecks that emerge once it does.

Several partners reported that their rates of bug discovery had increased more than tenfold. Cloudflare identified 2,000 bugs across its critical-path systems, including 400 rated high or critical, with a false-positive rate the company said it considered better than that of human testers. At one unnamed partner bank, the model was credited with helping detect and prevent a fraudulent $1.5 million wire transfer initiated after a customer’s email account was compromised and followed up with spoofed phone calls.

External evaluations cited in the update tracked with the results Anthropic released. The United Kingdom’s AI Security Institute found that Mythos Preview was the first model to solve both of its cyber ranges — simulations of multistep cyberattacks — from end to end. Mozilla said it found and fixed 271 vulnerabilities in Firefox 150 while testing the model, more than 10 times the number found in Firefox 148 using an earlier Anthropic model. AI-powered security platform XBOW called the model a significant step up over existing systems on its web exploit benchmark.

Anthropic also used Mythos to scan more than 1,000 open-source projects. The model has flagged 23,019 potential vulnerabilities, 6,202 of them estimated as high or critical. Of 1,752 high- or critical-rated findings reviewed by six independent security research firms or by Anthropic itself, over 90% were confirmed as valid, and over 62% were confirmed to be high or critical.

The company did note that while it’s good at finding vulnerabilities, there is still a gap in having people fix every issue. 

“The bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them,” the report states. 

Open-source maintainers have also been contending with a wave of low-quality, AI-generated bug reports, and Anthropic said it tries to reproduce and assess each issue before reporting it. At maintainers’ request, it has sometimes disclosed bugs without further vetting, reporting 1,129 such cases, of which the model estimated 175 to be high or critical.

Anthropic said it has not released Mythos-class models publicly because no company, including itself, has developed safeguards to prevent serious misuse. In the interim, it has released Claude Security in public beta for enterprise customers, which it said has been used to patch more than 2,100 vulnerabilities in three weeks using the publicly available Claude Opus 4.7, and has begun a Cyber Verification Program for security professionals.

The company said it plans to expand Project Glasswing with additional partners, including U.S. and allied governments, before any broader release of the underlying model.

“Glasswing helps the most systemically important cyber defenders gain an asymmetric advantage. However, there is an urgent need for as many organizations as possible to shore up their cyber defenses,” the report states. “We hope that our generally available models, and the new tools, resources, and research we’re providing to accompany them, will support those organizations to improve their cybersecurity posture.”

The post Anthropic: Mythos finds more than 10,000 software flaws in first month appeared first on CyberScoop.

Researchers say AI just broke every benchmark for autonomous cyber capability

Two of the most advanced artificial intelligence models — Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5 — have significantly surpassed the already-accelerating pace at which AI systems are completing autonomous cybersecurity tasks, according to separate findings published Wednesday by the United Kingdom’s AI Security Institute (AISI) and Palo Alto Networks.

The AISI, which conducts pre-deployment evaluations of frontier AI models on behalf of the British government, said both Claude Mythos Preview and GPT-5.5 have substantially exceeded the doubling trend the institute had been tracking since late 2024. Whether the results represent an isolated capability jump or the start of a new, faster trajectory remains unclear.

The AISI estimated earlier this year that frontier models’ 80% reliability cyber time horizon — a measure of how long a task takes a human expert, used as a proxy for AI autonomy — had been doubling approximately every five months. That was itself roughly half the eight-month doubling time the institute estimated in November 2025. Now Mythos Preview and GPT-5.5 have since outperformed any trend lines the institute has measured.

“Frontier AI’s autonomous cyber and software capability is advancing quickly: the length of cyber tasks that frontier models can complete autonomously has doubled on the order of months, not years,” the AISI wrote.

The clearest evidence of the capability jump came from the AISI’s cyber ranges, its structured simulations of multi-stage attacks against small, undefended enterprise networks. A newer checkpoint of Claude Mythos Preview became the first model to complete both of the institute’s ranges. It solved “The Last Ones,” a 32-step simulated corporate network attack, in 6 of 10 attempts, and completed “Cooling Tower” — previously unsolved by any model — in 3 of 10 attempts. GPT-5.5 solved “The Last Ones” in 3 of 10 attempts.

Palo Alto Networks reached similar conclusions through its own testing. The company said it began testing Claude Mythos in April as a launch partner for Anthropic’s Project Glasswing, and has since tested Claude Opus 4.7 and OpenAI’s GPT-5.5-Cyber as part of OpenAI‘s Trusted Access for Cyber program.

“The latest models are extraordinarily capable at finding vulnerabilities and changing them into critical exploit paths in near-real-time,” Palo Alto Networks wrote.

The company released security advisories covering 26 CVEs representing 75 issues — compared to a typical monthly volume of fewer than five CVEs — that were identified through AI model scanning across more than 130 products. All important vulnerabilities in its SaaS products had been patched, with patches available for all customer-operated products.

The AISI was careful to note the limits of its data. The estimates are based on a relatively small number of models, and the hardest tasks in the test suite have the least amount of human comparison data. Even so, the institute said the overall trend holds up: dropping any single model from the analysis barely moves the needle, shifting the estimated doubling time by less than a month in either direction. Separate research from METR, a nonprofit that tracks how quickly AI handles software tasks, arrived at a nearly identical figure — a doubling time of approximately four months since late 2024.

“No single benchmark result should be read as a precise measure of AI capability,” the AISI wrote. “Regardless, the direction of change and rapid growth have been consistent across the models, methodological choices and independent data we examined.”

Palo Alto Networks outlined four immediate priorities for enterprises as these models continue to grow in usage: First, find and fix vulnerabilities in code and applications before attackers do. Second, shrink the attack surface and use AI to spot security misconfigurations. Third, deploy detection and response tools across all systems, using machine learning to catch threats in real time. Fourth, build security operations fast enough to respond in minutes, because AI-powered attacks may soon unfold that quickly.

The AISI said it is developing more demanding evaluations, including new cyber ranges and the addition of active cyber defenses, to better reflect real-world conditions as model capabilities continue to advance.

The post Researchers say AI just broke every benchmark for autonomous cyber capability appeared first on CyberScoop.

Here’s how cyber heavyweights in the US and UK are dealing with Claude Mythos

A joint report from the Cloud Security Alliance (CSA), the SANS Institute and the Open Worldwide Application Security Project (OWASP) concludes that in the near term, organizations are “likely to be overwhelmed” by threat actors using AI to find and exploit vulnerabilities faster than defenders can patch them.

While those organizations can use AI tools to speed up their own defenses, attackers “still face a heavier relative burden due to the inherent limitations of patching. This in turn leads to “asymmetric benefits” for attackers who can afford to adopt the technology without the same caution and bureaucracy as a multi-billion dollar business.

“The cost and capability floor to exploit discovery is dropping, the time between disclosure and weaponization is compressing toward zero, and capabilities that previously required nation-state resources are now becoming broadly accessible,” wrote Robert Lee, SANS Institute’s Chief AI Officer, Gadi Evron, CEO of Knostic and Rich Mogull, chief analyst at CSA, who served as the primary authors.

The report marks one of the first comprehensive responses to the capabilities of Claude Mythos from the U.S., boasting cybersecurity luminaries who have set policy at the highest levels as contributing authors, including Jen Easterly, former director of the Cybersecurity and Infrastructure Security Agency, Rob Joyce, a former top White House and NSA cybersecurity official, and Chris Inglis, former National Cyber Director.

It also includes private sector stalwarts like Heather Adkins, Google’s CISO, Katie Moussouris, CEO of Luta Security, and Sounil Yu, chief technology officer at Knostic. Another seventy CISOs, CTOs and other security executives are named as editors and reviewers.

Also this week, the UK’s AI Security Institute (AISI) detailed the results of tests it performed on a preview version of Claude Mythos, calling it a “step up” from past Anthropic models in the cybersecurity arena and able to “execute multi-stage attacks on vulnerable networks and discover and exploit vulnerabilities autonomously.”

Using a mix of Capture the Flag exercises and cyber range testing, AISI researchers found that Mythos not only raised the ceiling of technical non-experts and apprentice-level users, it narrowed the overall gap in hacking proficiency between the two. In other words, there’s becoming less of a distinction between the capabilities of amateur “script kiddies” and mid-level hackers with technical knowledge.

Claude Mythos and other Large Language Models are increasing the capabilities of both lower and mid-level hackers when it comes to solving cybersecurity-specific tasks and challenges. (Source: AISI)

Before April 2025, no Large Language Model could complete a single expert-level CTF problem. Mythos successfully solved nearly three quarters (73%) of them.

In cyber range tests – which are meant to simulate more complex, multi-chain attacks – the results were uneven, but also represented meaningful progress over prior Claude models.

Mythos was subjected to a 32-step attack playbook modeled on corporate networks, spanning initial network access to full network takeover. In three of the 10 simulations, the model completed an average of 24 of the 32 steps. Older versions of Claude and other frontier models never averaged more than 16.

Claude Mythos improved on other models ability to complete a 32 step cyber attack targeting a simulated corporate network environment. (Source: AISI)

Mythos flunked its test against a simulated operational technology cooling tower, but researchers noted that this doesn’t mean AI is bad at exploiting OT: the model actually faltered during the IT section of the exercise.

UK researchers were more measured in their analysis of Mythos, noting that their testing indicates it is “at least capable” of autonomously taking down smaller, weakly defended enterprise networks.

But they also note their cyber ranges lack security features – like active defenders and defensive tooling – that would be common in many real-world networks and present additional obstacles, nor did they penalize the model for triggering security alerts.

“This means we cannot say for sure whether Mythos Preview would be able to attack well-defended systems,” the researchers concluded.

Technical debt coming due

Both the US and UK reports agree that large language models are broadly moving in a similar direction of lowering the technical barrier. The US authors call for organizations to more quickly adopt AI for cyber defense while overhauling their incident response playbooks and corporate policies to account for more automated defense postures.

For its part, Anthropic has said it is not selling Mythos commercially, and last week it announced the model would be made available to Project Glasswing, a consortium of major tech companies that will use it to root out and patch vulnerabilities in commonly used products and services.

But other experts have warned that businesses and governments are not well-positioned to either absorb the influx of expected vulnerability exploitation or deftly harness AI tools of their own to counter them.

Casey Ellis, CTO and founder of Bugcrowd, wrote that recent advances in AI cyber tools has succeeded largely by “living in the places we stopped looking a decade ago.”

While the cybersecurity community has spent years focusing on application security, vulnerability triage and other “top layer” security problems, AI tools and apex level hacking groups have been feasting on vulnerabilities in forgotten firmware, or routers whose manufacturers long went out of business.

This reality that tools like Mythos can endlessly weaponize the massive technical debt of large organizations has taken the traditional defender’s dilemma and “the knob that used to go to ten and turned it to seven hundred,” Ellis wrote.

Additionally, corporations and governments run on consensus-building, multiple layers of hierarchy and legal compliance. While those are all necessary when handing your cybersecurity over to automated tooling, it can also lead to a slower process and more asymmetry against defenders in the short term.

“Integration into actual production becomes the battlezone,” wrote Ellis. “Lag is real. Bureaucracy is real. Supply chains are real.”

The post Here’s how cyber heavyweights in the US and UK are dealing with Claude Mythos appeared first on CyberScoop.

Tech giants launch AI-powered ‘Project Glasswing’ to identify critical software vulnerabilities

Major technology companies have joined forces in an effort to use advanced artificial intelligence to identify and address security flaws in the world’s most critical software systems, marking a significant shift in how the industry approaches cybersecurity threats.

Anthropic announced Project Glasswing on Tuesday, bringing together Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft, and Palo Alto Networks. The initiative centers on Claude Mythos Preview, an unreleased AI model that Anthropic will make available exclusively to project partners and approximately 40 additional organizations responsible for critical software infrastructure.

The model has already identified thousands of previously unknown vulnerabilities in its initial testing phase, including security flaws that have existed in widely used systems for decades, according to Anthropic. Among the discoveries is a 27-year-old bug in OpenBSD, an operating system known primarily for its security focus, and a 16-year-old vulnerability in FFmpeg, a widely used video software program that automated testing tools had failed to detect despite running the affected code line five million times. The company has been in contact with the maintainers of the relevant software, and all found vulnerabilities have been patched. 

Anthropic will commit up to $100 million in usage credits for the project, along with $4 million in direct donations to open-source security organizations. The company has stated it does not plan to make Mythos Preview available to the general public, citing concerns about the model’s potential misuse.

The initiative reflects growing concerns within the technology sector about the dual-use nature of advanced AI systems. While Mythos Preview was not trained specifically for cybersecurity purposes, its coding and reasoning capabilities have proven effective at identifying subtle security flaws that have eluded human analysts and conventional automated tools.

“Although the risks from AI-augmented cyberattacks are serious, there is reason for optimism: the same capabilities that make AI models dangerous in the wrong hands make them invaluable for finding and fixing flaws in important software—and for producing new software with far fewer security bugs,” the company said in a blog post. “Project Glasswing is an important step toward giving defenders a durable advantage in the coming AI-driven era of cybersecurity.”

The project comes as the industry has predicted that similar AI capabilities will soon become more widespread. Anthropic executives have indicated that without coordinated action, such tools could eventually reach actors who might deploy them for malicious purposes rather than defensive security work.

Participating organizations will be required to share their findings with the broader industry. The project places particular emphasis on open-source software, which forms the foundation of most modern systems, including critical infrastructure, yet whose maintainers have historically lacked access to sophisticated security resources.

“Open source software constitutes the vast majority of code in modern systems, including the very systems AI agents use to write new software. By giving the maintainers of these critical open source codebases access to a new generation of AI models that can proactively identify and fix vulnerabilities at scale, Project Glasswing offers a credible path to changing that equation,” said Jim Zemlin, CEO of the Linux Foundation. “This is how AI-augmented security can become a trusted sidekick for every maintainer, not just those who can afford expensive security teams.” 

Additionally, Anthropic says it has engaged in ongoing discussions with U.S. government officials regarding Mythos Preview’s capabilities. The company has framed the project in national security terms, arguing that maintaining leadership in AI technology represents a strategic priority for the United States and its allies. Anthropic has been locked in a high-stakes dispute with the Department of Defense about the U.S. military’s use of the startup’s Claude AI model in real-world operations. 

The project’s success will depend partly on whether the collaborative approach can keep pace with rapid advances in AI capabilities. Anthropic has indicated that frontier AI systems are likely to advance substantially within months, potentially creating a dynamic environment where defensive and offensive capabilities evolve in parallel.

“Project Glasswing is a starting point,” Anthropic wrote in a blog post. “No one organization can solve these cybersecurity problems alone: frontier AI developers, other software companies, security researchers, open-source maintainers, and governments across the world all have essential roles to play. The work of defending the world’s cyber infrastructure might take years; frontier AI capabilities are likely to advance substantially over just the next few months. For cyber defenders to come out ahead, we need to act now.”

The post Tech giants launch AI-powered ‘Project Glasswing’ to identify critical software vulnerabilities appeared first on CyberScoop.

❌