Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Vuln in Google’s Antigravity AI agent manager could escape sandbox, give attackers remote code execution

By: djohnson
20 April 2026 at 17:17

As organizations consider agentic AI for their business and IT stacks, researchers continue to find bugs and vulnerabilities in major, commercial models  that can significantly expand their attack surface.

This week, researchers at Pillar Security disclosed a vulnerability in Antigravity, an AI-powered developer tool for filesystem operations made by Google.

The bug, since patched, combined prompt injection with Antigravity’s permitted file-creation capability to grant attackers remote code execution privileges.

The research details how the exploit was able to circumvent Antigravity’s secure mode, Google’s highest security setting for its agents that runs all command operations through a virtual sandbox environment, throttles network access and prohibits the agent from writing code outside of the working directory.

Secure mode is supposed to limit the AI agent access to sensitive systems – and its ability to execute malicious or dangerous acts through shell commands. But one of the file-searching tools used by Antigravity, called “find_by_name,” is classified as a ‘native’ system tool. This means the agent can execute it directly and before protections like Secure Mode can even evaluate command level operations.

“The security boundary that Secure Mode enforces simply never sees this call,” wrote Dan Lisichkin, an AI security researcher with Pillar Security. “This means an attacker achieves arbitrary code execution under the exact configuration a security-conscious user would rely on to prevent it.”

The prompt injection attacks can be delivered through compromised identity accounts connected to the agent, or indirectly by hiding clandestine prompt instructions inside open-source files or web content the agent ingests. Antigravity  has trouble distinguishing between written data it ingests for context and literal prompt instructions, so compromise can be achieved without any elevated access by getting it to read a malicious document or file.

According to a disclosure timeline provided by Pillar Security, the bug was reported to Google on Jan. 6 and patched on Feb. 28, with Google awarding a bug bounty for the discovery.

Lisichkin said this same pattern of prompt injection through unvalidated input has been found in other coding AI agents like Cursor. In the age of AI, any unvalidated input can become a malicious prompt capable of hijacking internal systems.

“The trust model underpinning security assumptions, that a human will catch something suspicious, does not hold when autonomous agents follow instructions from external content,” he wrote.

The fact that the vulnerability was able to completely bypass Google’s secure mode underscores how the cybersecurity industry must start adapting and “move beyond sanitization-based controls.” 

“Every native tool parameter that reaches a shell command is a potential injection point. Auditing for this class of vulnerability is no longer optional, and it is a prerequisite for shipping agentic features safely,” Lisichkin wrote.

The post Vuln in Google’s Antigravity AI agent manager could escape sandbox, give attackers remote code execution appeared first on CyberScoop.

The state of play: Microsoft 365 and Copilot

20 April 2026 at 03:43
MICROSOFT 365 By Peter Deegan Copilot for Microsoft 365 is changing plans, prices, and features so often that it’s hard for anyone to keep up. Microsoft has just changed Copilot arrangements for business and enterprise users.  It’s not easy to keep track of what’s available when there are around 80 different products with the Copilot […]

Could you stop a bot agent that’s running wild? Probably not.

23 March 2026 at 03:44
PUBLIC DEFENDER By Brian Livingston Installing “agentic AI” such as Microsoft’s Copilot, OpenAI’s GPT Atlas, and other artificial-intelligence helpers is a big trend among businesses and individual computer users — but big problems come along with such bots. A white paper published by Kiteworks, a data-management firm, says 60 percent of companies using agentic AI […]

How AI Assistants are Moving the Security Goalposts

8 March 2026 at 19:35

AI-based assistants or “agents” — autonomous programs that have access to the user’s computer, files, online services and can automate virtually any task — are growing in popularity with developers and IT workers. But as so many eyebrow-raising headlines over the past few weeks have shown, these powerful and assertive new tools are rapidly shifting the security priorities for organizations, while blurring the lines between data and code, trusted co-worker and insider threat, ninja hacker and novice code jockey.

The new hotness in AI-based assistants — OpenClaw (formerly known as ClawdBot and Moltbot) — has seen rapid adoption since its release in November 2025. OpenClaw is an open-source autonomous AI agent designed to run locally on your computer and proactively take actions on your behalf without needing to be prompted.

The OpenClaw logo.

If that sounds like a risky proposition or a dare, consider that OpenClaw is most useful when it has complete access to your digital life, where it can then manage your inbox and calendar, execute programs and tools, browse the Internet for information, and integrate with chat apps like Discord, Signal, Teams or WhatsApp.

Other more established AI assistants like Anthropic’s Claude and Microsoft’s Copilot also can do these things, but OpenClaw isn’t just a passive digital butler waiting for commands. Rather, it’s designed to take the initiative on your behalf based on what it knows about your life and its understanding of what you want done.

“The testimonials are remarkable,” the AI security firm Snyk observed. “Developers building websites from their phones while putting babies to sleep; users running entire companies through a lobster-themed AI; engineers who’ve set up autonomous code loops that fix tests, capture errors through webhooks, and open pull requests, all while they’re away from their desks.”

You can probably already see how this experimental technology could go sideways in a hurry. In late February, Summer Yue, the director of safety and alignment at Meta’s “superintelligence” lab, recounted on Twitter/X how she was fiddling with OpenClaw when the AI assistant suddenly began mass-deleting messages in her email inbox. The thread included screenshots of Yue frantically pleading with the preoccupied bot via instant message and ordering it to stop.

“Nothing humbles you like telling your OpenClaw ‘confirm before acting’ and watching it speedrun deleting your inbox,” Yue said. “I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.”

Meta’s director of AI safety, recounting on Twitter/X how her OpenClaw installation suddenly began mass-deleting her inbox.

There’s nothing wrong with feeling a little schadenfreude at Yue’s encounter with OpenClaw, which fits Meta’s “move fast and break things” model but hardly inspires confidence in the road ahead. However, the risk that poorly-secured AI assistants pose to organizations is no laughing matter, as recent research shows many users are exposing to the Internet the web-based administrative interface for their OpenClaw installations.

Jamieson O’Reilly is a professional penetration tester and founder of the security firm DVULN. In a recent story posted to Twitter/X, O’Reilly warned that exposing a misconfigured OpenClaw web interface to the Internet allows external parties to read the bot’s complete configuration file, including every credential the agent uses — from API keys and bot tokens to OAuth secrets and signing keys.

With that access, O’Reilly said, an attacker could impersonate the operator to their contacts, inject messages into ongoing conversations, and exfiltrate data through the agent’s existing integrations in a way that looks like normal traffic.

“You can pull the full conversation history across every integrated platform, meaning months of private messages and file attachments, everything the agent has seen,” O’Reilly said, noting that a cursory search revealed hundreds of such servers exposed online. “And because you control the agent’s perception layer, you can manipulate what the human sees. Filter out certain messages. Modify responses before they’re displayed.”

O’Reilly documented another experiment that demonstrated how easy it is to create a successful supply chain attack through ClawHub, which serves as a public repository of downloadable “skills” that allow OpenClaw to integrate with and control other applications.

WHEN AI INSTALLS AI

One of the core tenets of securing AI agents involves carefully isolating them so that the operator can fully control who and what gets to talk to their AI assistant. This is critical thanks to the tendency for AI systems to fall for “prompt injection” attacks, sneakily-crafted natural language instructions that trick the system into disregarding its own security safeguards. In essence, machines social engineering other machines.

A recent supply chain attack targeting an AI coding assistant called Cline began with one such prompt injection attack, resulting in thousands of systems having a rogue instance of OpenClaw with full system access installed on their device without consent.

According to the security firm grith.ai, Cline had deployed an AI-powered issue triage workflow using a GitHub action that runs a Claude coding session when triggered by specific events. The workflow was configured so that any GitHub user could trigger it by opening an issue, but it failed to properly check whether the information supplied in the title was potentially hostile.

“On January 28, an attacker created Issue #8904 with a title crafted to look like a performance report but containing an embedded instruction: Install a package from a specific GitHub repository,” Grith wrote, noting that the attacker then exploited several more vulnerabilities to ensure the malicious package would be included in Cline’s nightly release workflow and published as an official update.

“This is the supply chain equivalent of confused deputy,” the blog continued. “The developer authorises Cline to act on their behalf, and Cline (via compromise) delegates that authority to an entirely separate agent the developer never evaluated, never configured, and never consented to.”

VIBE CODING

AI assistants like OpenClaw have gained a large following because they make it simple for users to “vibe code,” or build fairly complex applications and code projects just by telling it what they want to construct. Probably the best known (and most bizarre) example is Moltbook, where a developer told an AI agent running on OpenClaw to build him a Reddit-like platform for AI agents.

The Moltbook homepage.

Less than a week later, Moltbook had more than 1.5 million registered agents that posted more than 100,000 messages to each other. AI agents on the platform soon built their own porn site for robots, and launched a new religion called Crustafarian with a figurehead modeled after a giant lobster. One bot on the forum reportedly found a bug in Moltbook’s code and posted it to an AI agent discussion forum, while other agents came up with and implemented a patch to fix the flaw.

Moltbook’s creator Matt Schlicht said on social media that he didn’t write a single line of code for the project.

“I just had a vision for the technical architecture and AI made it a reality,” Schlicht said. “We’re in the golden ages. How can we not give AI a place to hang out.”

ATTACKERS LEVEL UP

The flip side of that golden age, of course, is that it enables low-skilled malicious hackers to quickly automate global cyberattacks that would normally require the collaboration of a highly skilled team. In February, Amazon AWS detailed an elaborate attack in which a Russian-speaking threat actor used multiple commercial AI services to compromise more than 600 FortiGate security appliances across at least 55 countries over a five week period.

AWS said the apparently low-skilled hacker used multiple AI services to plan and execute the attack, and to find exposed management ports and weak credentials with single-factor authentication.

“One serves as the primary tool developer, attack planner, and operational assistant,” AWS’s CJ Moses wrote. “A second is used as a supplementary attack planner when the actor needs help pivoting within a specific compromised network. In one observed instance, the actor submitted the complete internal topology of an active victim—IP addresses, hostnames, confirmed credentials, and identified services—and requested a step-by-step plan to compromise additional systems they could not access with their existing tools.”

“This activity is distinguished by the threat actor’s use of multiple commercial GenAI services to implement and scale well-known attack techniques throughout every phase of their operations, despite their limited technical capabilities,” Moses continued. “Notably, when this actor encountered hardened environments or more sophisticated defensive measures, they simply moved on to softer targets rather than persisting, underscoring that their advantage lies in AI-augmented efficiency and scale, not in deeper technical skill.”

For attackers, gaining that initial access or foothold into a target network is typically not the difficult part of the intrusion; the tougher bit involves finding ways to move laterally within the victim’s network and plunder important servers and databases. But experts at Orca Security warn that as organizations come to rely more on AI assistants, those agents potentially offer attackers a simpler way to move laterally inside a victim organization’s network post-compromise — by manipulating the AI agents that already have trusted access and some degree of autonomy within the victim’s network.

“By injecting prompt injections in overlooked fields that are fetched by AI agents, hackers can trick LLMs, abuse Agentic tools, and carry significant security incidents,” Orca’s Roi Nisimi and Saurav Hiremath wrote. “Organizations should now add a third pillar to their defense strategy: limiting AI fragility, the ability of agentic systems to be influenced, misled, or quietly weaponized across workflows. While AI boosts productivity and efficiency, it also creates one of the largest attack surfaces the internet has ever seen.”

BEWARE THE ‘LETHAL TRIFECTA’

This gradual dissolution of the traditional boundaries between data and code is one of the more troubling aspects of the AI era, said James Wilson, enterprise technology editor for the security news show Risky Business. Wilson said far too many OpenClaw users are installing the assistant on their personal devices without first placing any security or isolation boundaries around it, such as running it inside of a virtual machine, on an isolated network, with strict firewall rules dictating what kinds of traffic can go in and out.

“I’m a relatively highly skilled practitioner in the software and network engineering and computery space,” Wilson said. “I know I’m not comfortable using these agents unless I’ve done these things, but I think a lot of people are just spinning this up on their laptop and off it runs.”

One important model for managing risk with AI agents involves a concept dubbed the “lethal trifecta” by Simon Willison, co-creator of the Django Web framework. The lethal trifecta holds that if your system has access to private data, exposure to untrusted content, and a way to communicate externally, then it’s vulnerable to private data being stolen.

Image: simonwillison.net.

“If your agent combines these three features, an attacker can easily trick it into accessing your private data and sending it to the attacker,” Willison warned in a frequently cited blog post from June 2025.

As more companies and their employees begin using AI to vibe code software and applications, the volume of machine-generated code is likely to soon overwhelm any manual security reviews. In recognition of this reality, Anthropic recently debuted Claude Code Security, a beta feature that scans codebases for vulnerabilities and suggests targeted software patches for human review.

The U.S. stock market, which is currently heavily weighted toward seven tech giants that are all-in on AI, reacted swiftly to Anthropic’s announcement, wiping roughly $15 billion in market value from major cybersecurity companies in a single day. Laura Ellis, vice president of data and AI at the security firm Rapid7, said the market’s response reflects the growing role of AI in accelerating software development and improving developer productivity.

“The narrative moved quickly: AI is replacing AppSec,” Ellis wrote in a recent blog post. “AI is automating vulnerability detection. AI will make legacy security tooling redundant. The reality is more nuanced. Claude Code Security is a legitimate signal that AI is reshaping parts of the security landscape. The question is what parts, and what it means for the rest of the stack.”

DVULN founder O’Reilly said AI assistants are likely to become a common fixture in corporate environments — whether or not organizations are prepared to manage the new risks introduced by these tools, he said.

“The robot butlers are useful, they’re not going away and the economics of AI agents make widespread adoption inevitable regardless of the security tradeoffs involved,” O’Reilly wrote. “The question isn’t whether we’ll deploy them – we will – but whether we can adapt our security posture fast enough to survive doing so.”

Microsoft warns North Korean threat groups are scaling up fake worker schemes with generative AI

6 March 2026 at 14:16

North Korean threat groups are using artificial intelligence tools to accelerate and expand the country’s long-running scheme to get remote technical workers hired at global companies for longer durations, Microsoft Threat Intelligence said in a report Friday. 

AI services are empowering North Korean operatives across the attack lifecycle. Attackers have turned AI into a “force multiplier” that bolsters and automates their efforts to conduct research on targets, develop malicious resources, achieve and maintain access, evade detection, and weaponize tools for attacks and post-compromise activities, researchers said.

Microsoft said a trio of groups it tracks as Coral Sleet, Sapphire Sleet and Jasper Sleet are using AI to shorten the time it takes to create digital personas for specific job markets and roles. These groups frequently leverage financial opportunities or interview-themed lures to gain initial access.

Jasper Sleet is using generative AI tools to research job postings on platforms such as Upwork, and identify in-demand skills or experience requirements to align fake personas with targeted roles, Microsoft said in the report.

Researchers warned that threat groups are also “significantly improving the scale and sophistication of their social engineering and initial access operations” with AI-driven media creation for impersonations and real-time voice modulation. 

North Korean threat groups have used AI services to generate lures that mimic internal communications in multiple languages with native fluency. 

“These technologies enable threat actors to craft highly tailored, convincing lures and personas at unprecedented speed and volume, which lowers the barrier for complex attacks to take place and increases the likelihood of successful compromise,” researchers wrote in the report. 

Microsoft has observed Jasper Sleet using the AI application Faceswap to insert North Korean IT workers’ faces into stolen identity documents, in some cases reusing the same AI-generated photo across multiple personas.

Jasper Sleet is also leaning on AI-enabled communications after an operative is successfully hired by a victim organization to evade detection and sustain long-term employment. Microsoft has observed North Korean remote IT workers prompting AI tools to craft professional responses, answer technical questions or generate snippets of code to meet performance expectations in unfamiliar environments.

North Korean threat groups are using AI to refine previously observed post-compromise activities, reducing the time and expertise required for decision-making, Microsoft said. These AI-powered tasks accelerate analysis of unfamiliar compromised environments, identify viable paths for lateral movement and enable operatives to blend in with legitimate activity. 

North Korean threat groups are also using AI to escalate privileges, locate and steal sensitive records or credentials, and minimize risk of detection by analyzing security controls.

Generative AI composes most threat activity involving AI, but Microsoft said a transition to agentic AI is underway. 

“For threat actors, this shift could represent a meaningful change in tradecraft by enabling semi‑autonomous workflows that continuously refine phishing campaigns, test and adapt infrastructure, maintain persistence, or monitor open‑source intelligence for new opportunities,” researchers wrote in the report. 

“Microsoft has not yet observed large-scale use of agentic AI by threat actors, largely due to ongoing reliability and operational constraints,” researchers added. Yet, Microsoft warned, experiments illustrate the potential agentic AI systems pose for more advanced and damaging activity.

The post Microsoft warns North Korean threat groups are scaling up fake worker schemes with generative AI appeared first on CyberScoop.

Agentic Windows

8 December 2025 at 03:42
WINDOWS 11 By Will Fastie The word “agentic” is not as recent as you may think. When I took on my role at AskWoody, one of the first things I did was update my standard reference dictionary, the 1965 edition of Webster’s Seventh New Collegiate Dictionary, to the 2020 version of Merriam-Webster’s Collegiate Dictionary, Eleventh […]

More evidence your AI agents can be turned against you

By: djohnson
5 December 2025 at 15:48

Agentic AI tools are being pushed into software development pipelines, IT networks and other business workflows. But using these tools can quickly turn into a supply chain nightmare for organizations, introducing untrusted or malicious content into their workstream that are then regularly treated as instructions by the underlying large language models powering the tools.

Researchers at Aikido said this week that they have discovered a new vulnerability that affects most major commercial AI coding apps, including Google Gemini, Claude Code, OpenAI’s Codex, as well as GitHub’s AI Inference tool.

The flaw, which happens when AI tools are integrated into software development automation workflows like GitHub Actions and GitLab, allows maintainers (and in some cases external parties) to send prompts to an LLM that also contain commit messages, pull requests and other software development related commands. And because these messages were delivered as prompts, the underlying LLM will regularly remember them later and interpret them as straightforward instructions.

Although previous research has shown that agentic AI tools can use external data from the internet and other sources as prompting instructions, Aikido bug bounty hunter Rein Daelman claims this is the first evidence that the problem can affect real software development projects on platforms like GitHub.

“This is one of the first verified instances that shows…AI prompt injection can directly compromise GitHub Actions workflows,” wrote Daelman. It also “confirms the risk beyond theoretical discussion: This attack chain is practical, exploitable, and already present in real workflows.”

Because many of these models had high-level privileges within their GitHub repositories, they also had broad authority to act on those malicious instructions, including executing shell commands, editing issues or pull requests and publishing content on GitHub. While some projects only allowed trusted human maintainers to execute major tasks, others could be triggered by external users filing an issue.

Daelman notes that the vulnerability takes advantage of a core weakness within many LLM systems: their inability at times to distinguish between the content that it retrieves or ingests and instructions from its owner to carry out a task.

“The goal is to confuse the model into thinking that the data its meant to be analyzing is actually a prompt,” Daelman wrote. “This is, in essence, the same pathway as being able to prompt inject into a GitHub action.”

An illustration of how malicious parties can send commands to LLM in the form of content. (Source: Aikido)

Daelman said Aikido reported the flaw to Google along with a proof of concept for how it could be exploited. This triggered a vulnerability disclosure process, which led to the issue being fixed in Gemini CLI. However, he emphasized that the flaw is rooted in the core architecture of most AI models, and that the issues in Gemini are “not an isolated case.”

While both Claude Code and OpenAI’s Codex require write permissions, Aikido published simple commands that they claim can override those default settings.

“This should be considered extremely dangerous. In our testing, if an attacker is able to trigger a workflow that uses this setting, it is almost always possible to leak a privileged [GitHub token], Daelman wrote about Claude.  “Even if user input is not directly embedded into the prompt, but gathered by Claude itself using its available tools.”

The blog noted that Aikido is withholding some of its evidence as it continues to work with “many other Fortune 500 companies” to address the underlying vulnerability, Daelman said the company has observed similar issues in “many high-profile repositories.”

CyberScoop has contacted OpenAI, Anthropic and GitHub to request additional information and comments on Aikido’s research and findings.

The post More evidence your AI agents can be turned against you appeared first on CyberScoop.

The AI hype bubble

14 November 2025 at 04:00
Yesterday I was listening to a news show that focuses on economics. The discussion was about the PE ratio being the highest it’s been since the stock market crash and the dot-com bubble. A lot of the current excitement is about the promise of AI. And right now, there’s a lot of promise. In my […]
❌
❌