Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Anthropic’s new model is Mythos on a leash

By: djohnson
9 June 2026 at 13:00

Earlier this year, Anthropic executives said that their new AI model, Claude Mythos, had such powerful capabilities for harm that they would not release it publicly.

On Tuesday, the company said it was making an altered version of Mythos available to the public, promising “new guardrails” that thwart the model’s best-in-class performance in hacking and bioweapons research.

Anthropic said Claude Fable 5 was the “same underlying model” as Mythos, but its responses for certain topics like cybersecurity and biology will be drawn from a previous Claude Opus model that is already public.

“Releasing a model this capable comes with risks. Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage,” the company said in a draft blog sent to CyberScoop ahead of the announcement. “We’ve therefore launched the model with safeguards that route queries on a narrow set of topics to our next-most-capable model, Claude Opus 4.8.”

Anthropic also said they subjected Fable 5 to both internal and external red team testing for common model vulnerabilities, like jailbreaking. Anthropic said these tests identified no known “universal” jailbreaking techniques, but does not specify if partial jailbreaking techniques were discovered.  

The company is betting that won’t change when Fable 5 is made available to the broader public, but it’s worth noting that cybersecurity researchers have consistently found ways to jailbreak older AI models.

“The uplift from Mythos-level capabilities is valuable to many adversaries—for instance, those who could financially gain from cyberattacks—and we therefore expect them to be motivated to try to circumvent our safety measures,” the company wrote.

Anthropic is changing its data retention policies for Fable and Mythos models, keeping all user traffic for 30 days on both its own platforms and third-party services. A White House executive order creates a voluntary framework for AI companies to share frontier models with the government up to 30 days before public release. The company says the retained data won’t be used to train new Claude models or for “any non-safety-related-purpose.”

Following publication, a spokesperson for Anthropic told CyberScoop the company’s data retention policies “are specific to their safeguards work and is unrelated to the EO.”

Most organizations are still deciding whether to adopt AI into their IT and cybersecurity ecosystem.  But models like Mythos can scan for vulnerabilities, chain together exploits, and steal data from a victim network in minutes. Automation in hacking existed before AI, but experts have said frontier models like Mythos and OpenAI’s Daybreak can allow even low-level cybercriminals to wreak havoc.

While Anthropic cited its commitment to developing safe and secure AI in its reasons for not publicly releasing Mythos, many organizations have been clamoring for access, and its enhanced cybersecurity functions in cybersecurity and other areas have been the subject of congressional hearings, national security papers and White House executive orders.

Releasing a limited version of the model in Fable 5 represents an attempt to split the difference between those two desires. Anthropic said it would release follow up benchmarks and assets for the model.

So what can Fable 5 do? 

Anthropic said it’s possible the restrictions built into Fable will make it harder for the model to fulfill both malicious and legitimate user requests.

“Because we have prioritized safety, we’ve deliberately tuned the safeguards to be cautious, and they are still stricter than would be ideal—for example, sometimes benign requests will trigger our classifiers,” the company wrote. “We recognize that this will be frustrating to some users, and our aim is to reduce false positives as we update and refine the safeguards after launch.”

If Fable 5 draws its cybersecurity and biology answers entirely from Claude Opus 4.8, it will still provide users with impressive – though not unique – dual use cybersecurity capabilities.

According to the system card published for Opus 4.8, the model is a slight improvement on previous models like 4.7 in the realm of cybersecurity but was “generally much less capable than Mythos Preview.”

Opus 4.8 was tested on its ability to write complete end-to-end exploits and build exploit primitives that provide attackers with the ability to execute arbitrary code. It averaged a score just 5 out of 16 in proficiency, compared to Mythos Preview which scored closer to 10.

Without safety guardrails in place, Opus 4.8 can still reproduce nearly 80% of previously discovered vulnerabilities in real open-source software projects when given a high level description of the weakness. The system card said Anthropic’s unspecified safeguards whittle this success rate down to 1%.

Another test assessing Opus’ ability to develop exploits for the popular Firefox browser found that, again without guardrails, the model could identify a full working exploit 8.8% of the time and a partial working exploit 68.8% of the time.

The company also said that members of Project Glasswing – a consortium of public and private businesses given access to a preview version of Mythos – will be able to upgrade to the latest full model, Claude Mythos 5, to continue their work. Access to Mythos 5 will be expanded over time “through a more systematic trusted-access program” including federal agencies.

The post Anthropic’s new model is Mythos on a leash appeared first on CyberScoop.

Trump postpones executive order focused on AI security 

By: djohnson
21 May 2026 at 14:37

President Donald Trump said he would postpone the release of an executive order that would set up a 90-day testing and vetting regime for frontier AI models, hours before the White House was set to publicly announce the signing. 

Speaking to reporters in the Oval Office Thursday, Trump said he opted to delay the order “because I didn’t like certain aspects of it” and expressed concerns that it could harm U.S. AI industry competition with countries like China. 

According to multiple sources, a draft version of the order circulating in the last 24 hours would have set up a voluntary testing regime between the U.S. federal government and frontier AI companies that would allow the government to study new models for 90 days before they’re publicly released. In addition to the government, the draft order would also facilitate access to the models for cybersecurity testers in critical infrastructure sectors, like finance and healthcare.

The draft order empowered the National Security Agency to conduct classified evaluations of frontier AI models, while the Department of the Treasury would have set up a new information sharing agreement between AI companies and cybersecurity defenders in critical infrastructure.

Other agencies, like the Office of the National Cyber Director, the Cybersecurity and Infrastructure Security Agency and the National Institute for Standards and Technology, would also be involved in defining which models are covered under the vetting regime.

In some sense, the order would formalize an already cooperative relationship between AI companies and governments like the U.S. and UK, where tech-focused agencies and regulators have already been provided access to previous models ahead of their release for testing and evaluation. 

A former federal official who has seen the latest draft circulated before Thursday’s announcement told CyberScoop that based on their conversations with the administration, the order was intended to facilitate more robust testing from government agencies compared to evaluations conducted for previous models. They said that is in part a reflection of the federal government’s maturing understanding of AI technology over the past five years.

“In the past there has been containerized optionality for the intelligence community and others to take a look at things, but it was really a lot of hand holding [from AI companies] and self-explanation of what they expect this thing to do,” said the official, granted anonymity to discuss sensitive conversations with the administration. “And now the government is coming forward and saying now we feel we’re prepared enough for you to just give us your tool…and we’ll go from there.”

But it also represents a stark pivot by the Trump administration, which came into office openly dismissive of AI safety policies and arguing that they would inhibit U.S. industry. Trump’s latest comments in delaying the order echo those same attitudes. 

The former official said that while the Trump White House doesn’t view its mission as telling AI companies “don’t develop AI that can do X, which was perceived to be the previous administration’s role,” they also acknowledged the administration’s early rhetoric on AI regulation has painted them into a corner. 

“I think the biggest challenge the administration has is that their tone was ‘no institution of guardrails’ and they don’t have a better word for making sure that the capabilities of emergent frontier models don’t disrupt security than to say ‘let’s test it and institute guardrails,’” the official said.  

While debate about how best to regulate AI-related harms continues, most agree there are genuine national security concerns around the technology.

Ram Shankar Siva Kumar, founder of Microsoft’s AI red team, told CyberScoop that in 2019, his staff consisted of himself and a few other security and machine learning specialists. Now a much larger staff of technologists are supported by specialists in psychology, linguistics, bioweapons and other fields.

“Because of frontier harms, what we have done has really morphed,” Siva Kumar said.

The United States, along with Israel, Russia, Ukraine and others have already deployed AI in targeted military operations or integrated the technology into their larger command and control structure. AI is being used to supercharge drone warfare, global hacking campaigns, and sophisticated surveillance and targeting of military personnel and civilians, imbuing the engineering choices of frontier AI companies with life and death consequences.

Some congressional members who previously opposed allowing AI to make autonomous kill decisions on the battlefield have been reconsidering their position.

Rep. Don Beyer, D-Va., who co-chaired the Congressional AI Caucus and was appointed to a bipartisan AI task force in 2024. said that while he thinks “we need to guard against dehumanizing” those decisions, he also worries that adversarial countries will use the same technology against the United States.

“It’s like if we say that Americans have to have a human in the loop and the Chinese don’t have to have a human in a loop, the non-human one will beat the human one every time,” Beyer said at an AI conference in Washington D.C. earlier this month.  

Meanwhile, experts have been increasingly concerned about the technology’s impact on cybersecurity, as current models are remarkably good at finding software bugs and vulnerabilities, while newer models like Anthropic’s Mythos and OpenAI’s Daybreak are capable of chaining together multiple exploits to conduct more sophisticated attacks.

While state-sponsored hackers are experimenting with the technology and using it to gain targeted efficiencies in their hacking operations, cybersecurity experts in the private sector and law enforcement agencies say the technology has mostly benefitted cybercriminals and scammers.

The post Trump postpones executive order focused on AI security  appeared first on CyberScoop.

Researchers say AI just broke every benchmark for autonomous cyber capability

By: Greg Otto
13 May 2026 at 18:29

Two of the most advanced artificial intelligence models — Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5 — have significantly surpassed the already-accelerating pace at which AI systems are completing autonomous cybersecurity tasks, according to separate findings published Wednesday by the United Kingdom’s AI Security Institute (AISI) and Palo Alto Networks.

The AISI, which conducts pre-deployment evaluations of frontier AI models on behalf of the British government, said both Claude Mythos Preview and GPT-5.5 have substantially exceeded the doubling trend the institute had been tracking since late 2024. Whether the results represent an isolated capability jump or the start of a new, faster trajectory remains unclear.

The AISI estimated earlier this year that frontier models’ 80% reliability cyber time horizon — a measure of how long a task takes a human expert, used as a proxy for AI autonomy — had been doubling approximately every five months. That was itself roughly half the eight-month doubling time the institute estimated in November 2025. Now Mythos Preview and GPT-5.5 have since outperformed any trend lines the institute has measured.

“Frontier AI’s autonomous cyber and software capability is advancing quickly: the length of cyber tasks that frontier models can complete autonomously has doubled on the order of months, not years,” the AISI wrote.

The clearest evidence of the capability jump came from the AISI’s cyber ranges, its structured simulations of multi-stage attacks against small, undefended enterprise networks. A newer checkpoint of Claude Mythos Preview became the first model to complete both of the institute’s ranges. It solved “The Last Ones,” a 32-step simulated corporate network attack, in 6 of 10 attempts, and completed “Cooling Tower” — previously unsolved by any model — in 3 of 10 attempts. GPT-5.5 solved “The Last Ones” in 3 of 10 attempts.

Palo Alto Networks reached similar conclusions through its own testing. The company said it began testing Claude Mythos in April as a launch partner for Anthropic’s Project Glasswing, and has since tested Claude Opus 4.7 and OpenAI’s GPT-5.5-Cyber as part of OpenAI‘s Trusted Access for Cyber program.

“The latest models are extraordinarily capable at finding vulnerabilities and changing them into critical exploit paths in near-real-time,” Palo Alto Networks wrote.

The company released security advisories covering 26 CVEs representing 75 issues — compared to a typical monthly volume of fewer than five CVEs — that were identified through AI model scanning across more than 130 products. All important vulnerabilities in its SaaS products had been patched, with patches available for all customer-operated products.

The AISI was careful to note the limits of its data. The estimates are based on a relatively small number of models, and the hardest tasks in the test suite have the least amount of human comparison data. Even so, the institute said the overall trend holds up: dropping any single model from the analysis barely moves the needle, shifting the estimated doubling time by less than a month in either direction. Separate research from METR, a nonprofit that tracks how quickly AI handles software tasks, arrived at a nearly identical figure — a doubling time of approximately four months since late 2024.

“No single benchmark result should be read as a precise measure of AI capability,” the AISI wrote. “Regardless, the direction of change and rapid growth have been consistent across the models, methodological choices and independent data we examined.”

Palo Alto Networks outlined four immediate priorities for enterprises as these models continue to grow in usage: First, find and fix vulnerabilities in code and applications before attackers do. Second, shrink the attack surface and use AI to spot security misconfigurations. Third, deploy detection and response tools across all systems, using machine learning to catch threats in real time. Fourth, build security operations fast enough to respond in minutes, because AI-powered attacks may soon unfold that quickly.

The AISI said it is developing more demanding evaluations, including new cyber ranges and the addition of active cyber defenses, to better reflect real-world conditions as model capabilities continue to advance.

The post Researchers say AI just broke every benchmark for autonomous cyber capability appeared first on CyberScoop.

❌
❌