Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Nightmare Eclipse incident shows the researcher-vendor fights may never fully go away

5 June 2026 at 10:48

Microsoft reopened some wounds and has reignited debate over the past couple weeks about vulnerability disclosure and the sometimes adversarial dynamic it creates between security researchers and vendors. 

The latest controversy ensued when Microsoft threatened criminal legal action against a security researcher who publicly disclosed a series of zero-day vulnerabilities with proof-of-concept exploits. Microsoft insisted it received no details about the vulnerabilities prior to release, adding that the defects were not responsibly disclosed and put its customers at unnecessary risk.

The public dispute between Microsoft and the researcher known as “Nightmare Eclipse,” who couldn’t be identified or reached for comment, sparked dismay among some security professionals. Microsoft’s forceful response and the resulting backlash revived a friction point between vendors and researchers who find and report flaws in the software they sell.

“The fight is being argued as coordinated disclosure, but the grievance underneath is personal and specific in a way disclosure shouldn’t be, especially with a vendor that has been at it for so long,” Katie Moussouris, founder and CEO at Luta Security, told CyberScoop.

“Microsoft seemed to get emotional and shouldn’t have publicly said anything, but somehow felt justified in calling out a researcher and involved law enforcement in the same breath,” she said. “That puts them right back in the first stages of vulnerability disclosure grief: denial and anger.”

The former longtime Microsoft employee who ran outreach with the security community, created the company’s first bounty program and has given conference talks on the subject as far back as 2013, said the company doubled down on its lack of responsibility in the whole saga.

Microsoft declined to answer questions in the wake of the fallout.

Nightmare Eclipse hinted at a breakdown and impending battle with the vendor in a series of blog posts leading up to Microsoft’s missive about the vulnerabilities RedSun, UnDefend, BlueHammer, YellowKey, GreenPlasma, and MiniPlasma.

Attackers exploited three of the six vulnerabilities Nightmare Eclipse released before they were patched by Microsoft.

The researcher claimed Microsoft refused to communicate, didn’t pay or credit them for discovering and reporting some of the vulnerabilities, deleted the Microsoft Security Response Center account they used to disclose vulnerabilities and flagged their GitHub account for removal. 

“You are proving to everyone that you are actively escalating this conflict,” they wrote, before threatening Microsoft with a release in mid-July that “will make sure your bones are shattered that day.”

Vulnerability disclosure is a two-way street

The characteristics of proper vulnerability disclosure processes are nuanced and often framed in the eyes of the beholder.

Any successful dance between bug hunters and vendors comes down to meeting each other halfway, said Andrew Morris, founder and chief architect of GreyNoise. 

While vendors must fix software defects and prioritize security, Morris noted that irresponsible vulnerability disclosure harms both incident responders and potential victims. 

“Personally, I feel like this researcher is being extremely petty. It seems like they have an ax to grind,” he said.

“You’re not allowed to give somebody something and say it’s out of the kindness of your heart, and then be pissed when they don’t pay you for it.” 

But Morris also made clear that vendors bear responsibility for building trust with researchers.  

“If you actually care about being the first one to know about bugs in your software, not learning about it once harm has happened, or once somebody’s gotten popped, then you want to cultivate that trust with the security community,” Morris said. 

Microsoft said it recognizes that the relationship between security researchers and vendors is critical and, at times, fragile. 

“We deeply value the security community, and will continue to take your feedback seriously,” the company said in its post on X

Yet, the company remains steadfast in opposing the circumstances of Nightmare Eclipse’s disclosures, describing their actions as illegal, unjustifiable and irresponsible. 

“When an individual breaks the law and engages in malicious activity causing real harm to our customers, we will work with law enforcement as appropriate,” Microsoft said without naming the researcher by their moniker. “We continue to believe strongly in coordinated vulnerability disclosure as the foundation for protecting customers and improving our products. We know that, given the nature of this work, there will at times be misunderstandings. We remain committed to engaging in good faith and to providing a respectful and professional experience for all researchers, regardless of past interactions.”

The cost of pushback

Security researchers seek out defects for various reasons: bounty payouts, recognition, industry credibility, or simply the thrill of the hunt that comes with finding vulnerabilities and getting them fixed.

At its best, this process happens behind the scenes, with patches released and customers warned before exploitation occurs.

This collaborative approach has taken root and improved considerably, but there are still cases where researchers feel slighted. 

“The public has no idea what went on behind the scenes to judge why a researcher that previously coordinated finally had enough and decided to drop a zero-day [vulnerability],” Moussouris said. As such, she’s less inclined to criticize Nightmare Eclipse’s actions, adding that “they come off as someone who needs help.” 

Yet, trust breaks down between vulnerability researchers and vendors often. Earlier this week, security researcher Ammar Askar claimed his last interaction with Microsoft’s security team was so poor that he decided to publicly disclose any bugs he finds in VS Code going forward. He made good on that threat by dropping a vulnerability and exploit code for a defect that allows attackers to steal GitHub tokens. 

While actions like this can sabotage trust and drive a wedge between vendors and vulnerability researchers, recourse to a large extent is limited. Moussouris said most of the time, the legal and ethical boundaries are clear to those involved. Researchers can report bugs, withhold them, sell them, or publish them. “The one red line is crime: using a flaw to extort or attack people,” Moussouris said. 

“Threatening to publish on a set date is a threat to disclose, and disclosure is lawful. You can find the tone ugly. [Nightmare Eclipse] still broke no rule and violated no duty.” 

The timing couldn’t be worse 

Both sides are partly responsible for what happened, but Microsoft made things worse, Morris said. Threatening legal action and taking an aggressive approach have never worked. Building a good relationship between researchers and vendors requires open communication and trust. 

“I thought we were past this. It turns out that we are not,” he said. 

The Nightmare Eclipse incident comes at a fraught time in this space. Vendors and their customers are confronting a deluge of more vulnerabilities, and the rise of artificial intelligence models that discover them is exacerbating this challenge, leaving security experts alarmed about what’s coming.

The prospects for where vulnerabilities will be discovered and exploited next, and to what impact, are unknown and wildly unsettling. 

These signals imply that the classic, CVE-based system with responsibly disclosed processes is probably broken, Morris said. “There’s just so many CVEs. It’s like, is this even working anymore?”

For now and despite all its faults, coordinated vulnerability disclosure programs are widely viewed as the most sensible and scalable approach to this dilemma.

“Coordinated disclosure is what happens when a vendor gets lucky. Someone they did not hire hands them a real bug instead of using it or selling it. That puts the whole burden of keeping coordination alive on the vendor,” Moussouris said. “Silent patching with no CVE and calling out researchers who don’t follow your timeline for disclosure squanders the vendor’s luck.”  

She stressed the stakes: “I hope Microsoft and all vendors learn that coordinated vulnerability disclosure is a gift and a grace from the security researcher community to them, and public disclosure is still better than non-disclosure or crime.”

The alternatives to a deteriorating relationship could wreak havoc and leave every vendor and customer more susceptible to attack. 

“If vendors unlearn how to receive free intellectual property and labor from the security community in the form of vulnerability reports with gratitude, we’re headed for a world where nobody bothers to give vendors any heads up, or they move to a timed disclosure model that gives no grace,” Moussouris said.

She concluded with a direct message: “Product vendors wrote the vulnerable code, own the risk, and they owe it to their users to do everything in their power to reduce that risk.” That includes “keeping their grievances to themselves and learning from introspection on coordinated vulnerability disclosure gone wrong.”

The post Nightmare Eclipse incident shows the researcher-vendor fights may never fully go away appeared first on CyberScoop.

‘GrafanaGhost’ bypasses Grafana’s AI defenses without leaving a trace

By: Greg Otto
7 April 2026 at 09:44

Security researchers at Noma Security have disclosed a new vulnerability they are calling GrafanaGhost, an exploit capable of silently stealing sensitive data from Grafana environments by chaining multiple security bypasses, including a method that circumvents the platform’s AI model guardrails without requiring any user interaction.

Grafana is widely deployed across enterprise organizations as a central hub for observability and data monitoring, typically housing real-time financial metrics, infrastructure health data, private customer records, and operational telemetry, among other uses. That concentration of sensitive information is what makes the platform a significant target. GrafanaGhost exploits how Grafana’s AI components process user-controlled input to bridge the gap between a private data environment and an external attacker-controlled server.

The attack requires no login credentials and does not depend on a user clicking a malicious link. It begins when an attacker crafts a specific URL path using query parameters originating outside the victim organization’s environment. Because Grafana handles entry logs, an attacker can gain access to an enterprise environment to which they have no legitimate connection. The attacker then injects hidden instructions that Grafana’s AI processes — a tactic known as prompt injection — using specific keywords to cause the model to ignore its own guardrails.

Grafana has built-in protections designed to prevent prompt injection, but Noma’s researchers found a flaw in the logic underlying that protection — one that could be exploited by formatting a web address in a way that Grafana’s security check misread as safe, while the browser treated it as a request to an external server the attacker controlled. The gap between what the security check believed it was allowing and what actually happened was enough to open the door for the attack.

The final obstacle was the AI model’s own instinct for self-defense. When researchers first attempted to slip malicious instructions past it, the model recognized the pattern and refused. After further study of how the model processed different types of input, they found a specific keyword that caused it to stand down, treating what was effectively an attack instruction as a routine and legitimate request.

With all three bypasses in place, the attack runs on its own. The AI processes the malicious instruction, attempts to load an image from the attacker’s server, and in doing so quietly carries the victim’s sensitive data along with that request in an image tag. The data is gone before anyone in the organization knows a request was ever made.

Noma’s researchers noted that multiple security layers were present in Grafana’s implementation, but each contained its own exploitable weakness. The domain validation logic, the AI model guardrails, and the content security controls all failed when approached in sequence. 

Because the exploit is triggered by indirect prompt injection rather than a suspicious link or an obvious intrusion, there is nothing for a user to notice, no access-denied error for an administrator to find, and no anomalous event for a security team to investigate. To a data team, a DevSecOps engineer, or a CISO, the activity is indistinguishable from routine processes.

“The payload sits inside what looks like a legitimate external data source. The exfiltration happens through a channel the AI itself initiates, which looks like normal AI behavior to any observer. Traditional SIEM rules, DLP tools, and endpoint monitoring aren’t designed to interrogate whether an AI’s outbound call was instructed by a user or by an injected prompt,” Sasi Levi, vulnerability research lead at Noma Labs, told CyberScoop. “Without runtime protection that understands AI-specific behavior, monitoring what the model was asked, what it retrieved, and what actions it took, this attack would be effectively invisible.”

The attack is another example of a broader shift in how adversaries are approaching enterprise environments that have integrated AI-assisted features. Rather than exploiting broken application code in the traditional sense, attackers are increasingly targeting weak AI security surfaces and indirect prompt injection methods that allow them to access and extract critical data assets while remaining entirely invisible to the security teams responsible for protecting them.

Noma has found similar issues over the past year, with Levi telling CyberScoop that researchers keep seeing the same fundamental gap: AI features are being bolted onto platforms that were never designed with AI-specific threat models in mind.

“The attack surface isn’t a misconfigured firewall or an unpatched library, rather it is the weaponization of the AI’s own reasoning and retrieval behavior. These platforms trust the content they ingest far too implicitly,” Levi said. 

The research is another example of how attackers can weaponize AI in a manner that current defenses cannot keep up with, making it extremely difficult for defenders to keep pace. 

“Offensive researchers and, increasingly, sophisticated threat actors are well ahead of most enterprise defenders on this,” Levi said. “The frameworks, detection signatures, and incident response playbooks for AI-native attacks simply don’t exist at scale yet. What gives us some optimism is that awareness is growing quickly, but awareness and readiness are very different things.”

Grafana Labs was notified through responsible disclosure protocols, worked with Noma to validate the findings, and issued a fix.

However, Joe McManus, CISO at Grafana Labs, told CyberScoop the company disputes “the claim that this finding constitutes either a ‘zero-click’ attack or that it could operate silently, autonomously, or in the background.”

“Any successful execution of this exploit would have required significant user interaction: specifically, the end user would have to repeatedly instruct our AI assistant to follow malicious instructions contained in logs, even after the AI assistant made the user aware of the malicious instructions,” McManus told CyberScoop via email. “We emphasize that there is no evidence of this bug having been exploited in the wild, and no data was leaked from Grafana Cloud.”

Update: April 7, 12:43 p.m.: This story has been updated with comment from Grafana.

The post ‘GrafanaGhost’ bypasses Grafana’s AI defenses without leaving a trace appeared first on CyberScoop.

Experts warn of a ‘loud and aggressive’ extortion wave following Trivy hack

24 March 2026 at 13:52

SAN FRANCISCO — Mandiant is responding to a major, ongoing supply-chain attack involving the compromise of Trivy, a widely used open-source tool from Aqua Security that’s designed to find vulnerabilities and misconfigurations in code repositories.

The fallout from the attack spree, which was first detected March 19, is extensive and poses substantial risk for follow-on compromises and threatening extortion attempts. 

“We know over 1,000 impacted SaaS environments right now that are actively dealing with this particular threat campaign,” Charles Carmakal, chief technology officer at Mandiant Consulting said during a threat briefing held in conjunction with the RSAC 2026 Conference. “That thousand-plus downstream victims will probably expand into another 500, another 1,000, maybe another 10,000.”

Attackers stole a privileged access token and established a foothold in Trivy’s repository automation process by exploiting a misconfiguration in the tool’s GitHub Actions environment in late February, Aqua Security said in a blog post

On March 1, the company tried to block an ongoing breach by changing its credentials. They later realized the attempt failed, which allowed the attacker to stay in the system using valid logins. Attackers published malicious releases of Trivy on March 19.

“While this activity initially appeared to be an isolated event, it was the result of a broader, multi-stage supply-chain attack that began weeks earlier,” Aqua Security said in the blog post.

By compromising the tool, attackers gained access to secrets for many organizations, Carmakal said. “There will likely be many other software packages, supply-chain attacks and a variety of other compromises as a result of what’s playing out right now.”

Mandiant expects widespread breach disclosures, follow-on attacks and a variety of downstream impacts to play out over the next several months. 

The attackers, which the incident response firm has yet to name, are collaborating with multiple threat groups mostly based in the United States, Canada and United Kingdom. These cybercriminals “are known for being exceptionally aggressive with their extortion,” Carmakal said. “They’re very loud, they’re very aggressive.”

Mandiant is still working to identify the root of the initial attack. “We can’t quite tell how those credentials were stolen, because it is our belief that those credentials were not stolen from that victim’s environment,” Carmakal said. 

The credentials were likely stolen from another cloud environment, a business process outsourcer, partner or the personal computer of an engineer, he added. 

Aqua said Sygnia, which is investigating the attack and assisting in remediation efforts, identified additional suspicious activity Sunday involving unauthorized changes and repository changes — activity that is consistent with the attacker’s previously observed behavior.

“This development suggests that the incident is part of an ongoing and evolving attack, with the threat actor reestablishing access. Our investigation is actively focused on validating that all access paths have been identified and fully closed,” the company said.

Aqua, in its latest update Tuesday, said it is continuing to revoke and rotate credentials across all environments and claimed there is still no indication its commercial products are affected. 

Many attackers are currently weaponizing access and likely targeting additional victims, yielding to potential extortion attempts and the compromise of additional software, Carmakal said. 

“It’s going to be a different outcome for a lot of different organizations,” he said. “This will be a very concentrated focus of the adversaries and their expansion group of partners that they’re collaborating with right now.”

The post Experts warn of a ‘loud and aggressive’ extortion wave following Trivy hack appeared first on CyberScoop.

‘CanisterWorm’ Springs Wiper Attack Targeting Iran

23 March 2026 at 11:43

A financially motivated data theft and extortion group is attempting to inject itself into the Iran war, unleashing a worm that spreads through poorly secured cloud services and wipes data on infected systems that use Iran’s time zone or have Farsi set as the default language.

Experts say the wiper campaign against Iran materialized this past weekend and came from a relatively new cybercrime group known as TeamPCP. In December 2025, the group began compromising corporate cloud environments using a self-propagating worm that went after exposed Docker APIs, Kubernetes clusters, Redis servers, and the React2Shell vulnerability. TeamPCP then attempted to move laterally through victim networks, siphoning authentication credentials and extorting victims over Telegram.

A snippet of the malicious CanisterWorm that seeks out and destroys data on systems that match Iran’s timezone or have Farsi as the default language. Image: Aikido.dev.

In a profile of TeamPCP published in January, the security firm Flare said the group weaponizes exposed control planes rather than exploiting endpoints, predominantly targeting cloud infrastructure over end-user devices, with Azure (61%) and AWS (36%) accounting for 97% of compromised servers.

“TeamPCP’s strength does not come from novel exploits or original malware, but from the large-scale automation and integration of well-known attack techniques,” Flare’s Assaf Morag wrote. “The group industrializes existing vulnerabilities, misconfigurations, and recycled tooling into a cloud-native exploitation platform that turns exposed infrastructure into a self-propagating criminal ecosystem.”

On March 19, TeamPCP executed a supply chain attack against the vulnerability scanner Trivy from Aqua Security, injecting credential-stealing malware into official releases on GitHub actions. Aqua Security said it has since removed the harmful files, but the security firm Wiz notes the attackers were able to publish malicious versions that snarfed SSH keys, cloud credentials, Kubernetes tokens and cryptocurrency wallets from users.

Over the weekend, the same technical infrastructure TeamPCP used in the Trivy attack was leveraged to deploy a new malicious payload which executes a wiper attack if the user’s timezone and locale are determined to correspond to Iran, said Charlie Eriksen, a security researcher at Aikido. In a blog post published on Sunday, Eriksen said if the wiper component detects that the victim is in Iran and has access to a Kubernetes cluster, it will destroy data on every node in that cluster.

“If it doesn’t it will just wipe the local machine,” Eriksen told KrebsOnSecurity.

Image: Aikido.dev.

Aikido refers to TeamPCP’s infrastructure as “CanisterWorm” because the group orchestrates their campaigns using an Internet Computer Protocol (ICP) canister — a system of tamperproof, blockchain-based “smart contracts” that combine both code and data. ICP canisters can serve Web content directly to visitors, and their distributed architecture makes them resistant to takedown attempts. These canisters will remain reachable so long as their operators continue to pay virtual currency fees to keep them online.

Eriksen said the people behind TeamPCP are bragging about their exploits in a group on Telegram and claim to have used the worm to steal vast amounts of sensitive data from major companies, including a large multinational pharmaceutical firm.

“When they compromised Aqua a second time, they took a lot of GitHub accounts and started spamming these with junk messages,” Eriksen said. “It was almost like they were just showing off how much access they had. Clearly, they have an entire stash of these credentials, and what we’ve seen so far is probably a small sample of what they have.”

Security experts say the spammed GitHub messages could be a way for TeamPCP to ensure that any code packages tainted with their malware will remain prominent in GitHub searches. In a newsletter published today titled GitHub is Starting to Have a Real Malware Problem, Risky Business reporter Catalin Cimpanu writes that attackers often are seen pushing meaningless commits to their repos or using online services that sell GitHub stars and “likes” to keep malicious packages at the top of the GitHub search page.

This weekend’s outbreak is the second major supply chain attack involving Trivy in as many months. At the end of February, Trivy was hit as part of an automated threat called HackerBot-Claw, which mass exploited misconfigured workflows in GitHub Actions to steal authentication tokens.

Eriksen said it appears TeamPCP used access gained in the first attack on Aqua Security to perpetrate this weekend’s mischief. But he said there is no reliable way to tell whether TeamPCP’s wiper actually succeeded in trashing any data from victim systems, and that the malicious payload was only active for a short time over the weekend.

“They’ve been taking [the malicious code] up and down, rapidly changing it adding new features,” Eriksen said, noting that when the malicious canister wasn’t serving up malware downloads it was pointing visitors to a Rick Roll video on YouTube.

“It’s a little all over the place, and there’s a chance this whole Iran thing is just their way of getting attention,” Eriksen said. “I feel like these people are really playing this Chaotic Evil role here.”

Cimpanu observed that supply chain attacks have increased in frequency of late as threat actors begin to grasp just how efficient they can be, and his post documents an alarming number of these incidents since 2024.

“While security firms appear to be doing a good job spotting this, we’re also gonna need GitHub’s security team to step up,” Cimpanu wrote. “Unfortunately, on a platform designed to copy (fork) a project and create new versions of it (clones), spotting malicious additions to clones of legitimate repos might be quite the engineering problem to fix.”

Update, 2:40 p.m. ET: Wiz is reporting that TeamPCP also pushed credential stealing malware to the KICS vulnerability scanner from Checkmarx, and that the scanner’s GitHub Action was compromised between 12:58 and 16:50 UTC today (March 23rd).

How AI Assistants are Moving the Security Goalposts

8 March 2026 at 19:35

AI-based assistants or “agents” — autonomous programs that have access to the user’s computer, files, online services and can automate virtually any task — are growing in popularity with developers and IT workers. But as so many eyebrow-raising headlines over the past few weeks have shown, these powerful and assertive new tools are rapidly shifting the security priorities for organizations, while blurring the lines between data and code, trusted co-worker and insider threat, ninja hacker and novice code jockey.

The new hotness in AI-based assistants — OpenClaw (formerly known as ClawdBot and Moltbot) — has seen rapid adoption since its release in November 2025. OpenClaw is an open-source autonomous AI agent designed to run locally on your computer and proactively take actions on your behalf without needing to be prompted.

The OpenClaw logo.

If that sounds like a risky proposition or a dare, consider that OpenClaw is most useful when it has complete access to your digital life, where it can then manage your inbox and calendar, execute programs and tools, browse the Internet for information, and integrate with chat apps like Discord, Signal, Teams or WhatsApp.

Other more established AI assistants like Anthropic’s Claude and Microsoft’s Copilot also can do these things, but OpenClaw isn’t just a passive digital butler waiting for commands. Rather, it’s designed to take the initiative on your behalf based on what it knows about your life and its understanding of what you want done.

“The testimonials are remarkable,” the AI security firm Snyk observed. “Developers building websites from their phones while putting babies to sleep; users running entire companies through a lobster-themed AI; engineers who’ve set up autonomous code loops that fix tests, capture errors through webhooks, and open pull requests, all while they’re away from their desks.”

You can probably already see how this experimental technology could go sideways in a hurry. In late February, Summer Yue, the director of safety and alignment at Meta’s “superintelligence” lab, recounted on Twitter/X how she was fiddling with OpenClaw when the AI assistant suddenly began mass-deleting messages in her email inbox. The thread included screenshots of Yue frantically pleading with the preoccupied bot via instant message and ordering it to stop.

“Nothing humbles you like telling your OpenClaw ‘confirm before acting’ and watching it speedrun deleting your inbox,” Yue said. “I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.”

Meta’s director of AI safety, recounting on Twitter/X how her OpenClaw installation suddenly began mass-deleting her inbox.

There’s nothing wrong with feeling a little schadenfreude at Yue’s encounter with OpenClaw, which fits Meta’s “move fast and break things” model but hardly inspires confidence in the road ahead. However, the risk that poorly-secured AI assistants pose to organizations is no laughing matter, as recent research shows many users are exposing to the Internet the web-based administrative interface for their OpenClaw installations.

Jamieson O’Reilly is a professional penetration tester and founder of the security firm DVULN. In a recent story posted to Twitter/X, O’Reilly warned that exposing a misconfigured OpenClaw web interface to the Internet allows external parties to read the bot’s complete configuration file, including every credential the agent uses — from API keys and bot tokens to OAuth secrets and signing keys.

With that access, O’Reilly said, an attacker could impersonate the operator to their contacts, inject messages into ongoing conversations, and exfiltrate data through the agent’s existing integrations in a way that looks like normal traffic.

“You can pull the full conversation history across every integrated platform, meaning months of private messages and file attachments, everything the agent has seen,” O’Reilly said, noting that a cursory search revealed hundreds of such servers exposed online. “And because you control the agent’s perception layer, you can manipulate what the human sees. Filter out certain messages. Modify responses before they’re displayed.”

O’Reilly documented another experiment that demonstrated how easy it is to create a successful supply chain attack through ClawHub, which serves as a public repository of downloadable “skills” that allow OpenClaw to integrate with and control other applications.

WHEN AI INSTALLS AI

One of the core tenets of securing AI agents involves carefully isolating them so that the operator can fully control who and what gets to talk to their AI assistant. This is critical thanks to the tendency for AI systems to fall for “prompt injection” attacks, sneakily-crafted natural language instructions that trick the system into disregarding its own security safeguards. In essence, machines social engineering other machines.

A recent supply chain attack targeting an AI coding assistant called Cline began with one such prompt injection attack, resulting in thousands of systems having a rogue instance of OpenClaw with full system access installed on their device without consent.

According to the security firm grith.ai, Cline had deployed an AI-powered issue triage workflow using a GitHub action that runs a Claude coding session when triggered by specific events. The workflow was configured so that any GitHub user could trigger it by opening an issue, but it failed to properly check whether the information supplied in the title was potentially hostile.

“On January 28, an attacker created Issue #8904 with a title crafted to look like a performance report but containing an embedded instruction: Install a package from a specific GitHub repository,” Grith wrote, noting that the attacker then exploited several more vulnerabilities to ensure the malicious package would be included in Cline’s nightly release workflow and published as an official update.

“This is the supply chain equivalent of confused deputy,” the blog continued. “The developer authorises Cline to act on their behalf, and Cline (via compromise) delegates that authority to an entirely separate agent the developer never evaluated, never configured, and never consented to.”

VIBE CODING

AI assistants like OpenClaw have gained a large following because they make it simple for users to “vibe code,” or build fairly complex applications and code projects just by telling it what they want to construct. Probably the best known (and most bizarre) example is Moltbook, where a developer told an AI agent running on OpenClaw to build him a Reddit-like platform for AI agents.

The Moltbook homepage.

Less than a week later, Moltbook had more than 1.5 million registered agents that posted more than 100,000 messages to each other. AI agents on the platform soon built their own porn site for robots, and launched a new religion called Crustafarian with a figurehead modeled after a giant lobster. One bot on the forum reportedly found a bug in Moltbook’s code and posted it to an AI agent discussion forum, while other agents came up with and implemented a patch to fix the flaw.

Moltbook’s creator Matt Schlicht said on social media that he didn’t write a single line of code for the project.

“I just had a vision for the technical architecture and AI made it a reality,” Schlicht said. “We’re in the golden ages. How can we not give AI a place to hang out.”

ATTACKERS LEVEL UP

The flip side of that golden age, of course, is that it enables low-skilled malicious hackers to quickly automate global cyberattacks that would normally require the collaboration of a highly skilled team. In February, Amazon AWS detailed an elaborate attack in which a Russian-speaking threat actor used multiple commercial AI services to compromise more than 600 FortiGate security appliances across at least 55 countries over a five week period.

AWS said the apparently low-skilled hacker used multiple AI services to plan and execute the attack, and to find exposed management ports and weak credentials with single-factor authentication.

“One serves as the primary tool developer, attack planner, and operational assistant,” AWS’s CJ Moses wrote. “A second is used as a supplementary attack planner when the actor needs help pivoting within a specific compromised network. In one observed instance, the actor submitted the complete internal topology of an active victim—IP addresses, hostnames, confirmed credentials, and identified services—and requested a step-by-step plan to compromise additional systems they could not access with their existing tools.”

“This activity is distinguished by the threat actor’s use of multiple commercial GenAI services to implement and scale well-known attack techniques throughout every phase of their operations, despite their limited technical capabilities,” Moses continued. “Notably, when this actor encountered hardened environments or more sophisticated defensive measures, they simply moved on to softer targets rather than persisting, underscoring that their advantage lies in AI-augmented efficiency and scale, not in deeper technical skill.”

For attackers, gaining that initial access or foothold into a target network is typically not the difficult part of the intrusion; the tougher bit involves finding ways to move laterally within the victim’s network and plunder important servers and databases. But experts at Orca Security warn that as organizations come to rely more on AI assistants, those agents potentially offer attackers a simpler way to move laterally inside a victim organization’s network post-compromise — by manipulating the AI agents that already have trusted access and some degree of autonomy within the victim’s network.

“By injecting prompt injections in overlooked fields that are fetched by AI agents, hackers can trick LLMs, abuse Agentic tools, and carry significant security incidents,” Orca’s Roi Nisimi and Saurav Hiremath wrote. “Organizations should now add a third pillar to their defense strategy: limiting AI fragility, the ability of agentic systems to be influenced, misled, or quietly weaponized across workflows. While AI boosts productivity and efficiency, it also creates one of the largest attack surfaces the internet has ever seen.”

BEWARE THE ‘LETHAL TRIFECTA’

This gradual dissolution of the traditional boundaries between data and code is one of the more troubling aspects of the AI era, said James Wilson, enterprise technology editor for the security news show Risky Business. Wilson said far too many OpenClaw users are installing the assistant on their personal devices without first placing any security or isolation boundaries around it, such as running it inside of a virtual machine, on an isolated network, with strict firewall rules dictating what kinds of traffic can go in and out.

“I’m a relatively highly skilled practitioner in the software and network engineering and computery space,” Wilson said. “I know I’m not comfortable using these agents unless I’ve done these things, but I think a lot of people are just spinning this up on their laptop and off it runs.”

One important model for managing risk with AI agents involves a concept dubbed the “lethal trifecta” by Simon Willison, co-creator of the Django Web framework. The lethal trifecta holds that if your system has access to private data, exposure to untrusted content, and a way to communicate externally, then it’s vulnerable to private data being stolen.

Image: simonwillison.net.

“If your agent combines these three features, an attacker can easily trick it into accessing your private data and sending it to the attacker,” Willison warned in a frequently cited blog post from June 2025.

As more companies and their employees begin using AI to vibe code software and applications, the volume of machine-generated code is likely to soon overwhelm any manual security reviews. In recognition of this reality, Anthropic recently debuted Claude Code Security, a beta feature that scans codebases for vulnerabilities and suggests targeted software patches for human review.

The U.S. stock market, which is currently heavily weighted toward seven tech giants that are all-in on AI, reacted swiftly to Anthropic’s announcement, wiping roughly $15 billion in market value from major cybersecurity companies in a single day. Laura Ellis, vice president of data and AI at the security firm Rapid7, said the market’s response reflects the growing role of AI in accelerating software development and improving developer productivity.

“The narrative moved quickly: AI is replacing AppSec,” Ellis wrote in a recent blog post. “AI is automating vulnerability detection. AI will make legacy security tooling redundant. The reality is more nuanced. Claude Code Security is a legitimate signal that AI is reshaping parts of the security landscape. The question is what parts, and what it means for the rest of the stack.”

DVULN founder O’Reilly said AI assistants are likely to become a common fixture in corporate environments — whether or not organizations are prepared to manage the new risks introduced by these tools, he said.

“The robot butlers are useful, they’re not going away and the economics of AI agents make widespread adoption inevitable regardless of the security tradeoffs involved,” O’Reilly wrote. “The question isn’t whether we’ll deploy them – we will – but whether we can adapt our security posture fast enough to survive doing so.”

❌
❌