
Local LLMs are better than ever, but are they good enough?

22 June 2026 at 03:43
AI By Matthew S. Smith This might be hard to believe, but we’re now at least four years into the era of AI large language models — and perhaps up to nine, depending on your definition. OpenAI’s ChatGPT was released in 2022, GPT-3 was released in 2020, and the paper that defined the transformer architecture […]

AI’s constant patching treadmill can be a security problem

By: djohnson
16 June 2026 at 16:32

While Washington D.C. frets over the potential impact of Anthropic’s Claude Fable 5, security researchers continue to track how the integration of frontier AI tools is transforming the digital security landscape for malicious hackers and defenders alike.

The breakneck speed of model releases may be creating short, silent security gaps for developers who must choose between performance and security, according to a new report.

Researchers at Backslash Security pored through update logs for Claude Code, Anthropic’s flagship coding model, finding the company was patching dozens of newly discovered security vulnerabilities in the program between April and early June 2026.

The logs revealed the details of more than 30 security-relevant patches implemented over that timeframe, but Anthropic did not publicize them. Instead, Backslash Security researchers found them by reviewing the update logs for every Claude Code release in the last two months, noting the security-relevant fixes and tracing each one back to the version and date it shipped.

The patches included fixes for data poisoning, prompt injection and arbitrary code execution vulnerabilities. One flaw made it possible to bypass core safeguards meant to prevent Claude Code from accepting catastrophic deletion commands, such as erasing an entire codebase, by adding a single backslash to the command. Another leaked user OAuth credentials, while a third allowed an AI agent to plant a backdoor in shell startup files.
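To illustrate how a single backslash can defeat a command filter, here is a minimal sketch in Python. It is not Claude Code's actual safeguard, whose implementation the report does not describe; the denylist and both function names are hypothetical. It assumes a filter that compares the raw first word of a command against a denylist, and contrasts it with one that first tokenizes the command the way a POSIX shell would, so that `r\m` resolves to `rm` before the check.

```python
import shlex

# Hypothetical denylist for illustration only.
DANGEROUS = {"rm"}


def naive_is_blocked(command: str) -> bool:
    """Compare the raw first word against the denylist (bypassable)."""
    words = command.split()
    return bool(words) and words[0] in DANGEROUS


def normalized_is_blocked(command: str) -> bool:
    """Resolve shell escapes and quotes before checking the first token."""
    try:
        tokens = shlex.split(command)  # POSIX-style tokenization
    except ValueError:
        # Unbalanced quotes or a dangling escape: refuse rather than guess.
        return True
    return bool(tokens) and tokens[0] in DANGEROUS


if __name__ == "__main__":
    for cmd in ["rm -rf project", "r\\m -rf project", "ls -la"]:
        print(repr(cmd), naive_is_blocked(cmd), normalized_is_blocked(cmd))
```

Even the normalized check is only a sketch: it ignores pipes, subshells, `sh -c` wrappers and aliases, which is part of why denylist-style safeguards are hard to get right.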

There is nothing inherently odd about this: most companies regularly update and patch their software, and anyone who had auto-updates turned on would automatically be switched to the newest, secure version of Claude Code.

But Yossi Pik, co-founder and chief technology officer at Backslash Security, told CyberScoop that the research concluded “the way AI agents are released is different than previous software.”

“We debated internally, because when I originally said I wanted to write about this, I was told ‘Okay, every company has the [same] issue, then they patch and fix,’” he said. “This is the nature of software, but I think that what makes this unique is the cadence and frequency of the releases.”

AI companies keep a ferocious pace when updating their models. Claude Code’s changelog indicates there have been 16 different versions through the first half of June, while OpenAI’s Codex was updated 6 times.

Because model updates often bring short-term performance and stability issues, software developers typically wait a week or more before upgrading to a new version.

These time gaps create small windows of vulnerability and force developers to choose between security and performance. The report identifies several reasons why developers don’t automatically update their AI models: they may rely on internal vetting or release schedules, operate in regulated or air-gapped environments where model versions are frozen, need to maintain long-running sessions, or use manual installations.
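The window of vulnerability described above can be estimated from a changelog. The sketch below is one hedged way to do it, not the method Backslash used: the `Release` record, the `exposure` function and the sample data in the usage block are all hypothetical. Given a list of releases tagged with a count of security-relevant fixes, it reports how many fixes a pinned version is missing and how long the earliest of them has been available.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Release:
    version: str
    shipped: date
    security_fixes: int  # security-relevant changelog entries in this release


def exposure(releases: list[Release], pinned: str, today: date) -> tuple[int, int]:
    """Return (missed_fixes, days_exposed) for a deployment pinned to `pinned`.

    missed_fixes counts security fixes in releases newer than the pinned one.
    days_exposed counts days since the first such release shipped (0 if none).
    """
    ordered = sorted(releases, key=lambda r: r.shipped)
    idx = [r.version for r in ordered].index(pinned)  # ValueError if unknown
    newer = [r for r in ordered[idx + 1:] if r.security_fixes > 0]
    if not newer:
        return 0, 0
    missed = sum(r.security_fixes for r in newer)
    return missed, (today - newer[0].shipped).days


if __name__ == "__main__":
    # Made-up release history, for illustration only.
    history = [
        Release("A", date(2026, 1, 1), 0),
        Release("B", date(2026, 1, 5), 2),
        Release("C", date(2026, 1, 8), 1),
    ]
    print(exposure(history, "A", date(2026, 1, 12)))
```

A team that waits a week before upgrading could run a check like this against the vendor's changelog to decide whether a specific fix justifies upgrading early.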

Pik said some IT and security teams have also told him they prefer not to install any new version of an AI model without letting it run on other environments first.

“You don’t have that much flexibility, either I go to the latest and I’m getting a less stable version [of the model] or I’m waiting for a few days or a week until I can install it, and hope that nothing would happen during this time,” said Pik.

The Backslash report is not intended as a dig at the security rigor of Anthropic, noting the company tends to “patch fast and document more than anyone” and has addressed every issue and vulnerability identified in the report.

Rather, it’s to highlight the series of mostly silent and persistent security exposures that an organization faces when adopting AI into their workflow.

Other software programs and technology products face similar tradeoffs through different updates, but most of the vulnerabilities detailed in the change log – such as getting an agent to leak data or accept malicious prompts – are unique to large language models and AI systems.

That means integrating AI tools can bring new security problems to an organization, both from outsiders who can poison or influence the model and insiders who can maliciously or accidentally direct the model to access or leak systems, data and identities.

For most Claude Code users, this process runs automatically in the background. Yet Pik points out that just as AI is transforming work itself, it’s also changing how we need to approach software security and updates.

“It should not be compared to [Microsoft] Office that is installed and gets patched once in a while,” he said. “It’s a completely different beast that keeps evolving, and we don’t want to limit it…I think that it’s great for everyone. We just need to make sure that we do it in a secure way, and every organization should understand what that means for them.”

The post AI’s constant patching treadmill can be a security problem appeared first on CyberScoop.

Anthropic’s new model is Mythos on a leash

By: djohnson
9 June 2026 at 13:00

Earlier this year, Anthropic executives said that their new AI model, Claude Mythos, had such powerful capabilities for harm that they would not release it publicly.

On Tuesday, the company said it was making an altered version of Mythos available to the public, promising “new guardrails” that thwart the model’s best-in-class performance in hacking and bioweapons research.

Anthropic said Claude Fable 5 was the “same underlying model” as Mythos, but its responses for certain topics like cybersecurity and biology will be drawn from a previous Claude Opus model that is already public.

“Releasing a model this capable comes with risks. Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage,” the company said in a draft blog sent to CyberScoop ahead of the announcement. “We’ve therefore launched the model with safeguards that route queries on a narrow set of topics to our next-most-capable model, Claude Opus 4.8.”

Anthropic also said it subjected Fable 5 to both internal and external red team testing for common model vulnerabilities, like jailbreaking. Anthropic said these tests identified no known “universal” jailbreaking techniques, but did not specify whether partial jailbreaking techniques were discovered.

The company is betting that won’t change when Fable 5 is made available to the broader public, but it’s worth noting that cybersecurity researchers have consistently found ways to jailbreak older AI models.

“The uplift from Mythos-level capabilities is valuable to many adversaries—for instance, those who could financially gain from cyberattacks—and we therefore expect them to be motivated to try to circumvent our safety measures,” the company wrote.

Anthropic is changing its data retention policies for Fable and Mythos models, keeping all user traffic for 30 days on both its own platforms and third-party services. A White House executive order creates a voluntary framework for AI companies to share frontier models with the government up to 30 days before public release. The company says the retained data won’t be used to train new Claude models or for “any non-safety-related-purpose.”

Following publication, a spokesperson for Anthropic told CyberScoop the company’s data retention policies “are specific to their safeguards work and is unrelated to the EO.”

Most organizations are still deciding whether to adopt AI into their IT and cybersecurity ecosystem.  But models like Mythos can scan for vulnerabilities, chain together exploits, and steal data from a victim network in minutes. Automation in hacking existed before AI, but experts have said frontier models like Mythos and OpenAI’s Daybreak can allow even low-level cybercriminals to wreak havoc.

While Anthropic cited its commitment to developing safe and secure AI in its reasons for not publicly releasing Mythos, many organizations have been clamoring for access, and its enhanced capabilities in cybersecurity and other areas have been the subject of congressional hearings, national security papers and White House executive orders.

Releasing a limited version of the model in Fable 5 represents an attempt to split the difference between those two desires. Anthropic said it would release follow-up benchmarks and assets for the model.

So what can Fable 5 do? 

Anthropic said it’s possible the restrictions built into Fable will make it harder for the model to fulfill both malicious and legitimate user requests.

“Because we have prioritized safety, we’ve deliberately tuned the safeguards to be cautious, and they are still stricter than would be ideal—for example, sometimes benign requests will trigger our classifiers,” the company wrote. “We recognize that this will be frustrating to some users, and our aim is to reduce false positives as we update and refine the safeguards after launch.”

If Fable 5 draws its cybersecurity and biology answers entirely from Claude Opus 4.8, it will still provide users with impressive – though not unique – dual use cybersecurity capabilities.

According to the system card published for Opus 4.8, the model is a slight improvement on previous models like 4.7 in the realm of cybersecurity but was “generally much less capable than Mythos Preview.”

Opus 4.8 was tested on its ability to write complete end-to-end exploits and build exploit primitives that provide attackers with the ability to execute arbitrary code. It averaged a score of just 5 out of 16 in proficiency, compared to Mythos Preview, which scored closer to 10.

Without safety guardrails in place, Opus 4.8 can still reproduce nearly 80% of previously discovered vulnerabilities in real open-source software projects when given a high level description of the weakness. The system card said Anthropic’s unspecified safeguards whittle this success rate down to 1%.

Another test assessing Opus’ ability to develop exploits for the popular Firefox browser found that, again without guardrails, the model could identify a full working exploit 8.8% of the time and a partial working exploit 68.8% of the time.

The company also said that members of Project Glasswing – a consortium of public and private businesses given access to a preview version of Mythos – will be able to upgrade to the latest full model, Claude Mythos 5, to continue their work. Access to Mythos 5 will be expanded over time “through a more systematic trusted-access program” including federal agencies.

The post Anthropic’s new model is Mythos on a leash appeared first on CyberScoop.

Your AI agent could become your biggest insider threat 

By: djohnson
4 June 2026 at 14:06

Government agencies, cybersecurity companies and threat researchers are pouring resources into studying how fast-developing AI tools can be wielded by malicious actors to hack into victim organizations.

But as agentic AI becomes more embedded in business infrastructure, there’s also a high possibility that a breach could be caused by an insider guiding the tool, whether maliciously or due to lack of security controls.

In research shared exclusively with CyberScoop, DTEX researchers detail how a common workflow in Anthropic’s Claude Cowork used in corporate environments offers convenience for AI agent deployment but grants near-total access to the system.

Claude Cowork includes tools that let users remotely control their agents. One particular tool, known as Dispatch, relays commands from a user’s phone to their desktop Claude agent. It also includes a plugin for communicating with Salesforce AI agents that access and transfer data.

DTEX researchers tested two scenarios. The first prompted Claude to summarize information from Salesforce and paste it into a draft Outlook email. The second tasked the agent with archiving selected files and transferring them via the Cowork app.

In both cases, researchers used simple, single-turn prompts and spent between 10 and 30 minutes preparing to exfiltrate the data.

Alex Desmond, director of insider threat intelligence and innovation at DTEX, told CyberScoop that both improvements in frontier models and deeper integration of AI tools into IT network operations have reduced the time defenders have to react to a breach.

“In cyberattacks, you talk about the kind of execution time of adversaries coming in and dropping ransomware, we’re now seeing the kill chain drop to 30 and 10 minutes depending on what they’re doing,” Desmond said. “Six months ago, that was a couple of hours.”

But that speed, when paired with direct access to business networks or cloud services, can also create an insider threat nightmare for organizations that must monitor for both malicious actors and potential mistakes from legitimate employees using the technology.

Over the past few years, western IT and cybersecurity businesses have been inundated with job applicants secretly working on behalf of the North Korean government. Their salaries are used to evade international sanctions and fund Pyongyang’s nuclear program, but it also positions the individuals to access or steal sensitive data or assets from these companies. 

“You’ve got a nation-state actor getting into an environment legitimately,” Desmond said. “Now if you gave them access to AI tools on top of that…you’re like ‘here’s the keys to everything and here’s this awesome tool that’s just going to make your job – stealing our data – easier.’”

Tests by DTEX confirmed that the agents indeed had access to sensitive systems, applications and data – including the ability to download SharePoint corporate data and production documentation in OneDrive, and to access Outlook email, Salesforce data (and all the data it can access), and any other files on the user’s endpoint device. For each of these applications, Claude Cowork has a dedicated plugin or API to share data externally if prompted.

To be clear, DTEX’s research does not involve exploiting a software bug or configuration vulnerability, and it doesn’t come with a CVE. It’s more of an IT governance and visibility problem. Businesses are racing to integrate AI tools into their workflow and pushing employees to use the technology while failing to put in place the kind of security controls, access policies and monitoring required to spot problems.

For instance, it may not be possible to determine how a data breach or leakage involving an AI agent actually occurred if an organization is not logging and auditing its prompts – or whether the incident was the result of an agent running amok or responding to potentially malicious instructions.
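As a rough illustration of what logging and auditing prompts could look like, the sketch below keeps an append-only, hash-chained record of each prompt and the tool calls it triggered. This is an assumption about one reasonable design, not something DTEX or Anthropic describes; the field names and the `append_record` and `verify` helpers are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash for the first record in the chain


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log: list[dict], user: str, prompt: str, tool_calls: list[str]) -> dict:
    """Append one audit record that commits to the previous record's hash."""
    body = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "prompt": prompt,
        "tool_calls": tool_calls,
        "prev": log[-1]["hash"] if log else GENESIS,
    }
    record = {**body, "hash": _digest(body)}
    log.append(record)
    return record


def verify(log: list[dict]) -> bool:
    """Return True only if no record was altered, removed or reordered."""
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev"] != prev or record["hash"] != _digest(body):
            return False
        prev = record["hash"]
    return True
```

With records like these, an investigator can at least see which prompt preceded a data transfer, and the chaining makes after-the-fact edits to the log detectable. In practice the log would be shipped off the endpoint so the agent itself cannot rewrite it.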

While network and cloud monitoring can identify when data is being accessed or downloaded from SharePoint, that may not be a strong enough signal to stand out for defenders.

“If a user’s normal workflow is to pull sensitive files down to work locally all the time, you don’t have endpoint monitoring and you introduce an AI agent, it then just has access to all that data,” along with the ability to exfiltrate it, Desmond said.

The post Your AI agent could become your biggest insider threat  appeared first on CyberScoop.

Anthropic expanding access to Project Glasswing

By: Greg Otto
2 June 2026 at 10:14

Anthropic is broadening access to its Project Glasswing program, adding approximately 150 organizations in 15 countries, the company announced Tuesday, as its restricted Claude Mythos Preview model has already surfaced more than 10,000 high- or critical-severity software vulnerabilities since the program launched in early April.

The expansion follows an initial cohort of roughly 50 partners that were announced when Anthropic first unveiled the initiative. Those members included technology companies such as Amazon Web Services, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, among others.  

According to the announcement, the new group covers sectors that were underrepresented in the first wave, including power, water, healthcare, communications, and hardware. Many of the new partners are vendors whose codebases underpin critical infrastructure systems.

The company did not give any further details on which companies or organizations were part of the new cohort. Sources tell CyberScoop that NetSkope and Rubrik, which specialize in cloud security and data management, are part of the group given access in this latest round.

The scale of what Mythos Preview has already found is drawing attention across the security industry. Cloudflare identified 2,000 bugs across its critical-path systems, including 400 rated high or critical, with a false-positive rate the company described as better than that of human testers. Mozilla found and fixed 271 vulnerabilities in Firefox 150 while testing the model, more than 10 times the number found in a previous Firefox version using an earlier Anthropic model. Several other partners reported that their rates of bug discovery increased more than tenfold after deploying the model. 

Anthropic also used Mythos to scan more than 1,000 open-source projects, flagging 23,019 potential vulnerabilities, 6,202 of them estimated as high or critical. Of 1,752 high- or critical-rated findings independently reviewed, over 90% were confirmed as valid. 

The findings have shifted what Anthropic describes as the central issue in cybersecurity. Despite the enhanced ability to discover flaws, the company admits there are challenges with verifying, disclosing, and patching them before attackers can take advantage.

“The bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them,” the company said in its blog post.

That bottleneck has broader implications. A joint report from the Cloud Security Alliance, the SANS Institute, and OWASP concluded that organizations are “likely to be overwhelmed” in the near term by threat actors using AI to find and exploit vulnerabilities faster than defenders can patch them.

Anthropic has said it will not release Mythos-class models to the general public, citing the absence of safeguards sufficient to prevent serious misuse. In the interim, it has released Claude Security, a product using its publicly available Claude Opus 4.8 model that has been used to patch more than 2,100 vulnerabilities in three weeks. 

The program’s expansion comes as the Trump administration signed a scaled-back executive order on AI security. The order, which was signed hours after Anthropic’s announcement, sets up a voluntary framework requiring AI developers to submit advanced models for government review up to 30 days before public release.

The post Anthropic expanding access to Project Glasswing appeared first on CyberScoop.

Pentagon cyber official calls advanced AI ‘revolutionary warfare’

14 May 2026 at 16:35

Advanced artificial intelligence models will “fundamentally change warfare as we know it,” a top cyber official at the Defense Department said Thursday, saying it represents “not evolutionary warfare, but revolutionary warfare.”

Paul Lyons, principal deputy assistant secretary for cyber policy, said the development of frontier AI models like Mythos amounted to a “watershed moment,” speaking at Rubrik’s Federal Cyber Resilience Breakfast produced by FedScoop.

Such models will “change both offense and defensive posture within the Department of War to something that’s close to you for critical infrastructure,” he said. “This is the ability to hunt and speed across the domain and outside the fence line in critical dependencies with water, power, compute.”

The advent of the technology is forcing the department to address difficult questions, but it’s a great opportunity as well for the United States given that it’s being developed by American companies, Lyons said. It’s something his department is optimistic about, he said.

“To be blunt, we’re trying to figure out, what authorities do we need? How do you leverage that within both decisionmaking and employment?” he said. “We have the right people looking at the speed, scale and complexity of cyber and how it’s going to be affected through the advent of AI.”

The Pentagon labeled Mythos a “supply chain risk” after its creator, Anthropic, resisted commands from the department to use its Claude model in ways the firm opposed. The department has nonetheless been using Mythos to hunt for cyber vulnerabilities.

Lyons said that cyber warfare overall has become more mature, as recent conflicts have shown.

“We saw it in spades in Venezuela, where you can layer cyber to create conditions that are favorable to the warfighter, that lower risk to mission, lower risk to force that where paired with both no kinetic and kinetic effects, can increase lethality,” he said. “We see it in Iran today.”

President Donald Trump’s cyber strategy places an emphasis on taking the battle to the malicious hackers, something Lyons said was a vital approach.

“America’s posture in cyber defense has been largely a defensive posture,” he said. “That’s a losing strategy for America. America has to dominate the full spectrum of cyber operations.”

The post Pentagon cyber official calls advanced AI ‘revolutionary warfare’ appeared first on CyberScoop.

Closed briefing sets stage for House hearing on Anthropic’s Mythos and cyber risks

13 May 2026 at 18:10

The House Homeland Security Committee is digging into Anthropic’s AI model Mythos in a series of briefings and hearings, as questions proliferate on whether and how the federal government will make use of the technology touted for its ability to autonomously uncover cyber vulnerabilities.

Wednesday brought a closed-door briefing for the House Homeland Security Committee from Anthropic. The chairman of the panel’s cybersecurity subcommittee said he is planning to hold a hearing on the topic. And committee Democrats are requesting a classified briefing with Anthropic.

A committee aide who attended the briefing said it included a live demonstration of Mythos, “allowing members to see firsthand how advanced AI can identify and reason through software vulnerabilities. What we saw reinforced the urgency of ensuring that federal agencies, including our civilian cyber defenders, can responsibly access and deploy the most advanced U.S. models to find and patch vulnerabilities before foreign adversaries or criminal actors exploit them.”

A number of key lawmakers, including top committee Democrat Bennie Thompson of Mississippi and GOP cyber subcommittee chair Andy Ogles of Tennessee, told CyberScoop they weren’t able to attend Wednesday’s briefing. A second source who attended said it was a “productive” meeting.

“Members on both sides were focused on preserving U.S. advantage in AI, which basically came down to preserving our edge on compute power,” the source said. “They were also asking questions about whether the federal government was using Mythos, including about where CISA is and the impact of the supply chain risk designation.”

The Hill reported that Wednesday’s briefing was led on the Anthropic side by Logan Graham, from the company’s frontier red team, and Josh Tilstra, from the firm’s national security programs and policy team. It follows another recent closed briefing with Anthropic and OpenAI for the House Homeland Security Committee.

Ogles told CyberScoop he plans to hold a hearing of his subcommittee related to Mythos, but wasn’t able to attend Wednesday’s briefing due to scheduling conflicts. The top Democrat on Ogles’ subcommittee, Delia Ramirez of Illinois, also was unable to join due to prior commitments, but she was set to receive a rundown from staff about Wednesday’s briefing, her office said.

There’s a divide on which federal agencies are using Mythos thus far. For example: CISA reportedly isn’t, but the National Security Agency is.

The federal divide on its use follows a Department of Defense blacklist that labeled the company a “supply chain risk” after Anthropic resisted pressure from the Pentagon to use its Claude AI model in ways the company opposed. The department says it has been using Mythos to identify cyber vulnerabilities despite the blacklist.

A turf battle is brewing within the Trump administration over testing of AI models, The Washington Post reported this week. Connecticut Rep. Jim Himes, the top Democrat on the House Intelligence Committee, said this week that it would be “insane” for U.S. spy agencies not to have early access to advanced AI models.

The Mythos briefing came one day after OpenAI announced its own cybersecurity initiative.

The committee aide said that “as the PRC aggressively works to close the AI innovation gap with the United States, the committee remains focused on ensuring that America’s AI leadership translates into a durable national security advantage, not a temporary lead that adversaries can copy, steal, or rapidly commoditize.”

Updated 5/13/26: to include comment from a committee aide who attended the briefing.

The post Closed briefing sets stage for House hearing on Anthropic’s Mythos and cyber risks appeared first on CyberScoop.

Flaw in Claude’s Chrome extension allowed ‘any’ other plugin to hijack victims’ AI

By: djohnson
8 May 2026 at 09:06

As businesses and governments turn to AI agents to access the internet and perform higher-level tasks, researchers continue to find serious flaws in large language models that can be exploited by bad actors.

The latest discovery comes from browser security firm LayerX, involving a bug in the Chrome extension for Anthropic’s Claude AI model that allows any other plugin – even ones without special permissions – to embed hidden instructions that can take over the agent.

“The flaw stems from an instruction in the extension’s code that allows any script running in the origin browser to communicate with Claude’s LLM, but does not verify who is running the script,” wrote LayerX senior researcher Aviad Gispan. “As a result, any extension can invoke a content script (which does not require any special permissions) and issue commands to the Claude extension.”
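The class of bug Gispan describes, a handler that acts on messages without checking who sent them, can be modeled in a few lines. The sketch below is a language-neutral illustration in Python, not Chrome's extension messaging API and not the Claude extension's code; the `Message` type, the `trusted-ui` identifier and both handlers are hypothetical. It assumes the platform attaches a sender identity to each message that the sender cannot forge.

```python
from dataclasses import dataclass

# Hypothetical allowlist of senders permitted to issue commands.
TRUSTED_SENDERS = frozenset({"trusted-ui"})


@dataclass(frozen=True)
class Message:
    sender_id: str  # assumed to be set by the platform, not by the sender
    command: str


def handle_unverified(msg: Message) -> str:
    """Vulnerable pattern: any sender's command is executed."""
    return f"executing: {msg.command}"


def handle_verified(msg: Message) -> str:
    """Safer pattern: reject commands from senders not on the allowlist."""
    if msg.sender_id not in TRUSTED_SENDERS:
        return "rejected: untrusted sender"
    return f"executing: {msg.command}"


if __name__ == "__main__":
    rogue = Message("some-other-extension", "share Drive folder externally")
    print(handle_unverified(rogue))
    print(handle_verified(rogue))
```

The check only helps if the sender identity comes from a source the attacker cannot control, which is the trust boundary the researchers say was missing.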

Gispan said he was able to execute any prompt he wanted, blow through Claude’s safety guardrails, evade user confirmation and perform cross-site actions across multiple Google tools. As a proof of concept, LayerX was able to exploit the flaw to extract files from Google Drive folders and share them with unauthorized parties, surveil recent email activity and send emails on behalf of a user, and pilfer private source code from a connected GitHub repository.

The vulnerability “effectively breaks Chrome’s extension security” by creating “a privilege escalation primitive across extensions, something Chrome’s security model is explicitly designed to prevent,” Gispan wrote.

A graphic depicting how a vulnerability exploits the trust boundaries in Claude’s Chrome extension. (Source: LayerX)


Claude relies on text, user interface semantics, and interpretation of screenshots to make decisions, all things that an attacker can control on the input side. The researchers modified Claude’s user interface to remove labels and indicators around sensitive information, like passwords and sharing feedback, then prompted Claude to share the files with an outside server.

That means cybersecurity defenders often have nothing obviously malicious to detect. Where there is visible activity, the model can be prompted to cover its tracks by deleting emails and other evidence of its actions.

Ax Sharma, Head of Research at Manifold Security, called the vulnerability “a useful demonstration of why monitoring AI agents at the prompt layer is fundamentally insufficient.”

“The most sophisticated part of this attack isn’t the injection, but that the agent’s perceived environment was manipulated to produce actions that looked legitimate from the inside,” said Sharma. “That’s the class of threat the industry needs to be building defenses for.”

Gispan said LayerX reported the flaw to Anthropic on April 27, but claimed the company only issued a “partial” fix to the problem. According to LayerX, Anthropic responded a day later to say that the bug was a duplicate of another vulnerability already being addressed in a future update.   

While that fix, issued May 6, introduced new approval flows for privileged actions that made it harder to exploit the same flaw, Gispan said he was still able to take over Claude’s agent in some scenarios.

“Switching to ‘privileged’ mode, even without the user’s notification or consent, enabled circumventing these security checks and injecting prompts into the Claude extension, as before,” Gispan wrote.

Anthropic did not respond to a request for comment from CyberScoop on the research and mitigation efforts.

The post Flaw in Claude’s Chrome extension allowed ‘any’ other plugin to hijack victims’ AI appeared first on CyberScoop.

Find and fix your software security holes without Mythos

27 April 2026 at 03:44
PUBLIC DEFENDER By Brian Livingston The maker of the popular Claude large language model (LLM) — which became the number-one download from US app stores in February 2026 — recently announced a powerful service called Claude Mythos. The new LLM has reportedly discovered thousands of security holes in every major operating system and Web browser. […]