Researchers at Google Threat Intelligence Group (GTIG) say that a zero-day exploit targeting a popular open-source web administration tool was likely generated using AI. [...]
A malicious Hugging Face repository that reached the platform's trending list impersonated OpenAI's "Privacy Filter" project to deliver information-stealing malware to Windows users. [...]
As businesses and governments turn to AI agents to access the internet and perform higher-level tasks, researchers continue to find serious flaws in large language models that can be exploited by bad actors.
The latest discovery comes from browser security firm LayerX, involving a bug in the Chrome extension for Anthropic’s Claude AI model that allows any other plugin – even ones without special permissions – to embed hidden instructions that can take over the agent.
“The flaw stems from an instruction in the extension’s code that allows any script running in the origin browser to communicate with Claude’s LLM, but does not verify who is running the script,” wrote LayerX senior researcher Aviad Gispan. “As a result, any extension can invoke a content script (which does not require any special permissions) and issue commands to the Claude extension.”
Gispan said he was able to execute any prompt he wanted, blow through Claude’s safety guardrails, evade user confirmation and perform cross-site actions across multiple Google tools. As a proof of concept, LayerX was able to exploit the flaw to extract files from Google Drive folders and share them with unauthorized parties, surveil recent email activity and send emails on behalf of a user, and pilfer private source code from a connected GitHub repository.
The vulnerability “effectively breaks Chrome’s extension security” by creating “a privilege escalation primitive across extensions, something Chrome’s security model is explicitly designed to prevent,” Gispan wrote.
A graphic depicting how a vulnerability exploits the trust boundaries in Clade Chrome’s extension. (Source: LayerX)
Claude relies on text, user interface semantics, and interpretation of screenshots to make decisions, all things that an attacker can control on the input side. The researchers modified Claude’s user interface to remove labels and indicators around sensitive information, like passwords and sharing feedback, then prompted Claude to share the files with an outside server.
That means cybersecurity defenders often have nothing obviously malicious to detect. Where there is visible activity, the model can be prompted to cover its tracks by deleting emails and other evidence of its actions.
Ax Sharma, Head of Research at Manifold Security, called the vulnerability “a useful demonstration of why monitoring AI agents at the prompt layer is fundamentally insufficient.”
“The most sophisticated part of this attack isn’t the injection, but that the agent’s perceived environment was manipulated to produce actions that looked legitimate from the inside,” said Sharma. “That’s the class of threat the industry needs to be building defenses for.”
Gispan said LayerX reported the flaw to Anthropic on April 27, but claimed the company only issued a “partial” fix to the problem. According to LayerX, Anthropic responded a day later to say that the bug was a duplicate of another vulnerability already being addressed in a future update.
While that fix, issued May 6, introduced new approval flows for privileged actions that made it harder to exploit the same flaw, Gispan said he was still able to take over Claude’s agent in some scenarios.
“Switching to ‘privileged’ mode, even without the user’s notification or consent, enabled circumventing these security checks and injecting prompts into the Claude extension, as before,” Gispan wrote.
Anthropic did not respond to a request for comment from CyberScoop on the research and mitigation efforts.
Wade Woolwine is Senior Director, Product Security at Rapid7.
Announcing OpenAI's Trusted Access for Cyber program
CIOs and CISOs are telling us the same thing in different ways: Advances in frontier AI are accelerating the threat environment and putting pressure on security operating models built for a different pace. Vulnerabilities can be discovered faster, exploitation windows are shrinking, and attackers are increasingly using automation to move with greater speed and scale. For defenders, this changes the value equation. The premium is no longer only on detecting threats faster after they emerge, but on moving earlier: Reducing exposure, validating risk, strengthening detection, and remediating at scale before attackers can take advantage.
This is why Rapid7 is excited to be included in OpenAI’s Trusted Access for Cyber program and their announcement today. OpenAI’s approach recognizes that advanced AI can help verified security teams move faster on legitimate defensive work, from triage and detection to validation, patching, malware analysis, and detection engineering. It also recognizes that some specialized cyber workflows require stronger verification, monitoring, and feedback loops.
As Corey Thomas, CEO of Rapid7, shared:
“Security leaders are under pressure from every direction: More vulnerabilities, faster exploitation, and increasing business pressure. Through OpenAI’s Trusted Access for Cyber program, Rapid7 is exploring more ways to accelerate the shift from reactive to preemptive security. To stay ahead of attackers, defenders must proactively reduce exploitability and detect with machine-scale speed and precision. We’re working with OpenAI to equip security teams with advanced capabilities that will meaningfully improve their cyber resilience.”
AI in security: Not just faster discovery
For Rapid7, this moment is about more than faster vulnerability discovery. AI is creating new pressure across the entire security lifecycle, from vulnerability validation, prioritization, disclosure, and remediation to threat and exploitation detection. Security infrastructure built for human-speed discovery now needs to operate in a machine-speed world, with enough context, governance, and accountability to help defenders act with confidence.
Finding risk is only the beginning. Security teams need to understand which vulnerabilities and misconfigurations are truly exploitable, which systems and business services are affected, what compensating controls are in place, how remediation should be prioritized, and where detection coverage is needed. CISOs also need confidence that advanced AI is being applied responsibly, with clear guardrails, measurable outcomes, and accountability.
Our work with OpenAI will help us explore how frontier AI can strengthen three critical areas. First, it can support the identification of vulnerabilities in our own products and code earlier in the development lifecycle. By accelerating secure code review, surfacing risky patterns, supporting root cause analysis, reviewing patches, and giving engineering teams faster feedback, AI can help reduce risk before issues reach production.
Second, it can advance vulnerability research and exploitation analysis. Rapid7 has long-standing expertise in vulnerability intelligence, exploitability research, and offensive security with Rapid7 Labs. Frontier AI can help researchers reason across unfamiliar code, map affected surfaces, build safe reproduction harnesses, validate severity, and turn findings into practical remediation guidance.
Third, it can expand AI-driven red-teaming. As AI becomes more embedded in enterprise systems and security operations, it must also be tested adversarially. We see an opportunity to use AI to strengthen red-team workflows, explore attack paths, validate controls, and help defenders understand where exposure could become real-world risk.
Artificial intelligence in use at Rapid7
We are already seeing this potential inside our own security operations work. In support of our Agentic SOC initiatives, Rapid7 has designed and implemented a system that uses machine learning to surface threat- and risk-relevant events from raw log and telemetry data. By using frontier AI models, including OpenAI’s GPT-5.5, to support initial triage and escalate only relevant events to SOC analysts, we have seen a 25% reduction in time spent chasing false-positive events in the queue.
This is not about replacing human expertise. It is about giving defenders better leverage in a world where attackers, businesses, and technology are all moving faster. The shift from reactive to preemptive security, and from human-scale processes to machine-scale defense, is not a marketing reframe. It is becoming the only viable path for teams that need to anticipate where attackers will move next, prioritize the exposures that actually matter, and respond at the speed of modern attacks.
AI may accelerate discovery, but cyberresilience depends on what happens after discovery. Customers need to unify their data, apply AI with the right context, drive remediation at scale, and translate security activity into measurable outcomes. That is where Rapid7 is focused. Across the Command Platform, Rapid7’s AI capabilities are built to help security teams detect threats and anomalies at scale, reduce noise, optimize SOC workflows, and make faster, more confident decisions.
By unifying Exposure Management and Detection and Response on the Command Platform, and combining AI-driven operations with the depth of expertise we have built over 25 years, Rapid7 is giving customers a more coherent way to reduce risk, disrupt attackers, and build durable cyber resilience. Learn more about Rapid7’s AI capabilities.
A fake version for the Claude AI website offers a malicious Claude-Pro Relay download that pushes a previously undocumented backdoor for Windows named Beagle. [...]
Musk said that he could have founded OpenAI as a for-profit company, just like the other companies he started or took over. “I deliberately chose this,” he said, “for the public good.”
AI red team specialist details his methods for manipulating AI guardrails through jailbreaking and data poisoning, helping developers harden machine learning models.
Google, Microsoft, Amazon Web Services, Nvidia, OpenAI, Reflection and SpaceX will provide resources to help augment warfighter decision-making in complex operational environments,” the Defense Department said.