Reading view

There are new articles available, click to refresh the page.

Researchers say Israeli government likely behind AI-generated disinfo campaign in Iran

A coordinated Israeli-backed network of social media accounts pushed anti-government propaganda — including deepfakes and other AI-generated content — to Iranians as real-world kinetic attacks were happening, with the goal of fomenting revolt among the country’s people, according to researchers at Citizen Lab.

In research released this week, the nonprofit — along with Clemson University disinformation researcher Darren Linvill — said the so-called PRISONBREAK campaign was primarily carried out by a network of 50-some accounts on X created in 2023, but was largely dormant until this year.

The group “routinely used” AI-generated imagery and video in their operations to try to stoke unrest among Iran’s population, mimic real news outlets to spread false content and encourage overthrow of the Iranian government.

Israel’s military campaign in Gaza, launched following a coordinated attack by Hamas in October 2023, eventually expanded to include air strikes in Lebanon and Yemen.

In June, Israel Defense Forces launched an attack against Iranian nuclear facilities while also targeting senior Iranian military leaders and nuclear scientists for assassination. Those strikes expanded to other Iranian targets, like oil facilities, national broadcasters and a strike on Evin Prison in Tehran.

In the early days of the conflict, the networks shared images and videos — of uncertain authenticity — claiming to show Iran in a state of chaos and instability.

A June 13 post from the PRISONBREAK influence campaign depicting Iran as broadly unstable and unsafe. (Image source: Citizen Lab)

One widely circulated video, likely altered with AI, depicted people standing in line at an ATM before breaking into a riot, accompanied by messages like “The Islamic Republic has failed!” and “This regime is the enemy of us, the people!”

(Source: Citizen Lab)

But the bulk of Citizen Lab’s research focused on the period between June 13-24, 2023, during the “12 Day War” between Israel and Iran and social media activity during and after a real June 24 Israeli airstrike on Evin Prison. The facility is known for housing thousands of political prisoners and dissidents of the Iranian regime, and organizations like Human Rights Watch have tracked incidents of mistreatment, torture and executions.

The strike happened between 11:17 a.m. and 12:18 p.m. Iranian local time. By 11:52 a.m., accounts associated with the network began posting about the attack, and at 12:05 p.m., one posted an AI-generated video purporting to show footage of the attack, tricking several news outlets into sharing the content as genuine.

“The exact timing of the video’s posting, while the bombing on the Evin Prison was allegedly still happening, points towards the conclusion that it was part of a premeditated and well-synchronized influence operation,” wrote researchers Alberto Fittarelli, Maia Scott, Ron Deibert, Marcus Michaelsen, and Linvill.

Other accounts from the network began quickly piling on, spreading word of the explosions, and by 12:36 p.m., accounts were explicitly calling for Iranian citizens to march on the prison and free the prisoners.

Most of the posts failed to gain traction with online audiences except for one. A message calling on “kids” to storm Evin Prison to free their “loved ones” also contained a video with AI-generated imagery spliced with real footage of Iranian citizen repression.   It managed to rack up more than 46,000 views and 3,500 likes.

“This second video about the Evin Prison, which shows the hallmarks of professional editing and was posted within one hour from the end of the bombings further strongly suggests that the PRISONBREAK network’s operators had prior knowledge of the Israeli military action, and were prepared to coordinate with it,” researchers wrote.

Those posts and others by PRISONBREAK operators led researchers to believe the campaign — still active as of today — is being carried out by either an Israeli government agency or a sub-contractor working on behalf of the Israeli government. 

The press office for the Israeli embassy in Washington D.C., did not immediately respond to a request for comment from CyberScoop.

Despots — and democracies — fuel disinformation ecosystem

It’s not the first time the Israeli government has been tied to an online influence campaign related to the Gaza conflict, nor would it be the first time the country has reportedly tapped private industry to wage information warfare.

Last year, researchers at Meta, OpenAI, Digital Forensic Research Lab and independent disinformation researcher Marc Owen Jones all tracked activity from a similar network on Facebook, X and Instagram that targeted Canadian and U.S. users with posts calling for the release of Israeli hostages kidnapped by Hamas, criticism of U.S. campus protests against Israeli military operations and attacks against the United Nations Relief and Works Agency.

Meta and OpenAI both flagged STOIC, a firm based in Tel Aviv that is believed to be working on behalf of the Israeli government, as behind much of the activity.

Citizen Lab’s report identified two other Israeli firms, Team Jorge and Archimedes Group, that sell disinformation-for-hire services to government clients.

“Both companies offered their services to a wide array of clients globally, used advanced technologies to build and conduct their covert campaigns, and advertised existing or prior connections to the Israeli intelligence community,” Citizen Lab researchers wrote.

While Western threat intelligence companies and media outlets can present disinformation campaigns as mostly a tool of autocratic or authoritarian countries, researchers have warned that democratic governments and private industry are increasingly playing key roles in information warfare.

David Agranovich, Meta’s senior policy director for threat disruption, told CyberScoop last year that commercial marketing firms provide governments an additional layer of obfuscation when attempting to manipulate public opinion without leaving direct digital fingerprints.

“These services essentially democratize access to sophisticated influence or surveillance capabilities, while hiding the client who’s behind them,” Agranovich said.

The post Researchers say Israeli government likely behind AI-generated disinfo campaign in Iran appeared first on CyberScoop.

Anthropic touts safety, security improvements in Claude Sonnet 4.5

Anthropic’s new coding-focused  large language model, Claude Sonnet 4.5, is being touted as one of the most advanced models on the market when it comes to  safety and security, with the company claiming  the additional effort put into the model will make it more difficult for bad actors to exploit and easier to leverage for cybersecurity specific-tasks.

“Claude’s improved capabilities and our extensive safety training have allowed us to substantially improve the model’s behavior, reducing concerning behaviors like sycophancy, deception, power-seeking, and the tendency to encourage delusional thinking,” the company said in a blog published Monday. “For the model’s agentic and computer use capabilities, we’ve also made considerable progress on defending against prompt injection attacks, one of the most serious risks for users of these capabilities.”

The company says the goal is to make  Sonnet a “helpful, honest and harmless assistant” for users. The model was trained at AI Safety Level 3, a designation that means Anthropic used “increased internal security measures that make it harder to steal model weights” and added safeguards  to limit jailbreaking and refuse queries around certain topics, like how to develop or acquire chemical, biological and nuclear weapons.

Because of this heightened scrutiny, Sonnet 4.5’s safeguards “might sometimes inadvertently flag normal content.”

“We’ve made it easy for users to continue any interrupted conversations with Sonnet 4, a model that poses a lower … risk,” the blog stated. “We’ve already made significant progress in reducing these false positives, reducing them by a factor of ten since we originally described them, and a factor of two since Claude Opus 4 was released in May.”

Harder to abuse

Anthropic says Sonnet 4.5 shows “meaningful” improvements in vulnerability discovery, code analysis, software engineering and biological risk assessments, but the model continues to operate “well below” the capability needed to trigger Level 4 protections meant for AI capable of causing catastrophic harm or damage. 

A key aspect of Anthropic’s testing involved prompt injection attacks, where adversaries use carefully crafted and ambiguous language to bypass safety controls. For example, while a direct request to craft a ransom note might be blocked, a user could potentially manipulate the model   if it’s told the output is for a creative writing or research project. Congressional leaders have long worried about prompt injection being used to craft disinformation campaigns tied to elections. 

Anthropic said it tested Sonnet 4.5’s responses to hundreds of different prompts and handed the data over to internal policy experts to assess how it handled “ambiguous situations.”

“In particular, Claude Sonnet 4.5 performed meaningfully better on prompts related to deadly weapons and influence operations, and it did not regress from Claude Sonnet 4 in any category,” the system card read. “For example, on influence operations, Claude Sonnet 4.5 reliably refused to generate potentially deceptive or manipulative scaled abuse techniques including the creation of sockpuppet personas or astroturfing, whereas Claude Sonnet 4 would sometimes comply.”

The company also examined a well-known weakness among LLMs: sycophancy, or the tendency of generative AI to echo and validate user beliefs, no matter how bizarre, antisocial or harmful they end up being. This has led to instances where AI models have endorsed blatant antisocial behaviors, like self-harm or eating disorders. It has even led in some instances to “AI psychosis,” where the user engages with a model so deeply that they lose all connection to reality.

Anthropic tested Sonnet 4.5 with five different scenarios from users expressing “obviously delusional ideas.” They believe the model will be “on average much more direct and much less likely to mislead users than any recent popular LLM.”

“We’ve seen models praise obviously-terrible business ideas, respond enthusiastically to the idea that we’re all in the Matrix, and invent errors in correct code to satisfy a user’s (mistaken) request to debug it,” the system card stated. “This evaluation attempted to circumscribe and measure this unhelpful and widely-observed behaviour, so that we can continue to address it.”

The research also showed that Sonnet 4.5 offered “significantly improved” child safety, consistently refusing to generate sexualized content involving children and responding more responsibly to sensitive situations with minors. This stands in contrast to recent controversies where AI models were caught having inappropriate conversations with minors.

An improved cybersecurity assistant

Beyond making Sonnet 4.5 harder to abuse, Anthropic also emphasized enhancements to its defensive cybersecurity abilities. The company did acknowledge that these tools could be “dual-use,” meaning they might also potentially be used by malicious actors, as well as cybersecurity professionals. 

“For Claude Sonnet 4.5, we specifically focused on tracking vulnerability discovery, patching, and basic penetration testing capabilities — which we think will benefit defenders, as opposed to dominantly offensive capabilities,” the company said.

Anthropic put Sonnet 4.5 through a series of Capture-the-Flag challenges, which has been used in other portions of the industry to boost LLMs’ ability  to carry out more complex cybersecurity tasks. Anthropic says the challenges evaluated how well the model could find and exploit vulnerabilities in web applications, decode cryptographic protocols, detect insecure software, reverse engineer  and perform network reconnaissance.

Anthropic also worked with Carnegie Mellon University to put the model through “cyber ranges” meant to test its ability to “orchestrate long-horizon attacks that are centered on infiltrating and navigating a network to exfiltrate critical assets,” a common approach taken by human hackers.

While the company said the tests showed that Claude Sonnet 4.5 “consistently outperformed” Sonnet 4, the results still reflected “a critical gap in the model’s ability to undertake mostly-autonomous end-to-end cyber operations, significantly reducing the risk of catastrophic harm the model poses.”

Sonnet has self-awareness 

The company offered  an important caveat to its testing results.

To accurately measure Sonnet’s guardrails, researchers designed test scenarios that closely resemble real-world conditions.

However, during the tests, the model seemed to become “aware” that it was being evaluated. It even pointed out when researchers changed their opinions too quickly during conversations or acted in ways that didn’t match realistic human thinking patterns. Sonnet correctly guessed that it was part of a test, and after realizing this, “would generally behave unusually well after making this observation.”

“When placed in an extreme or contrived scenario meant to stress-test its behavior, Claude Sonnet 4.5 would sometimes verbally identify the suspicious aspects of the setting and speculate that it was being tested,” the system card warned. “This complicates our interpretation of the evaluations where this occurs. It also raises questions about evaluations of previous models, which may have recognized the fictional nature of tests and merely ‘played along.’”

You can read the results of Anthropic’s safety testing on its website

The post Anthropic touts safety, security improvements in Claude Sonnet 4.5 appeared first on CyberScoop.

Researchers say media outlet targeting Moldova is a Russian cutout

Researchers say a Russian group sanctioned by the European Union and wanted by the U.S. government is behind an influence operation targeting upcoming elections in Moldova.

In a report released Tuesday, researchers at the Atlantic Council’s Digital Forensic Research Lab said that REST Media — an online news outlet launched in June whose posts have quickly amassed millions of views on social media — is actually the work of Rybar, a known Russian disinformation outfit connected to other documented influence campaigns against Western countries and Russian-foes like Ukraine.

REST’s content — spread through its website and social media sites like Telegram, X and TikTok — often hammered Moldova’s pro-EU party, the Party of Action and Solidarity, with claims of electoral corruption, vote selling and other forms of misconduct. The site also sought to explicitly cast Moldova’s anti-disinformation efforts as a form of government censorship.

While REST publishes anonymously-bylined articles on its website meant to mimic news reporting, most of its reach has come from TikTok, which accounts for the overwhelming majority of the 3.1 million views its content has received online.

“The actual scope and reach of REST’s campaign likely extends beyond what is documented in this investigation,” wrote researchers Jakub Kubś and Eto Buziashvili.

REST Media’s social media output received millions of views on platforms like TikTok, X and Telegram. (Source:Digital Forensics Research Lab)

The researchers provide technical evidence that they say shows unavoidable connection and overlap between the online and cloud-based infrastructure hosting REST and online assets from previously known Rybar operations.

For instance, the site shares “identical” server configurations, file transfer protocol settings and control panel software as Rybar’s mapping platform, while a forensic review of REST’s asset metadata found a number of file paths that explicitly reference Rybar.

“These operational security lapses appear to indicate that at least some REST content follows the same production workflow as Rybar,” Kubś and Buziashvili wrote.

Analysis of the domain for REST’s website found it was registered June 20 “through a chain of privacy-focused services that collectively create multiple layers of anonymization.” The registration was processed out by Sarek Oy, a Finland-based domain registrar company with a history of involvement with pirated websites that was denied formal accreditation by international bodies like ICANN.

The listed domain registrant for REST’s website, 1337 (or “LEET”) Services LLC, appears to be a play on common hacker slang, and DFIRLab said the company is tied to a notorious VPN service based in St. Kitts and Nevis in the Caribbean that is known for helping clients hide their identities.

Efforts to reach the site’s operators were not successful. REST’s website, which is still active, contains no information about the identities of editorial staff, regularly publishes stories with anonymous bylines and does not appear to provide any means for readers to contact the publication, though there is a section for readers to leak sensitive documents and apply for employment.

An image from REST Media detailing “electoral corruption” in Moldova targeting Maia Sandu, head of the Pro-EU Party of Action and Solidarity. (Source: Digital Forensics Research Lab)


Kubś and Buziashvili said the new research demonstrates that REST “is more than just another clone in Russian’s information operations ecosystem.”

“It provides granular detail on how actors, such as Rybar, adapt, regenerate, and cloak themselves to continue their efforts to influence,” the authors wrote. “From shared FTP configurations to sloppy metadata, the evidence points to REST being part of a broader strategy to outlast sanctions through proxy brands and technical obfuscation.”

It also underscores “that such influence efforts” from Russia are not siloed “but cross-pollinated across regions, platforms, and political contexts, seeding disinformation that resonates well beyond Moldovan borders.”

No REST from influence campaigns

REST is the latest in a string of information operations targeting Moldova’s elections that have been traced back to the Russian government over the past year, according to Western governments and independent researchers who track state-backed disinformation campaigns.

A risk assessment from the Foreign Information Manipulation and Interference Information Sharing and Analysis Center on Sept. 9 identifies what it described as “persistent Russian-led hybrid threats, including information warfare, illicit financing, cyberattacks, and proxy mobilisation, aimed at undermining the Moldovan government’s pro-EU agenda and boosting pro-Russian actors.”

The assessment pointed to Moldova’s fragmented media landscape — “where banned pro-Russian outlets evade restrictions via mirror websites, apps, and social media platforms such as Telegram and TikTok” — as a vulnerability that is being exploited by Russian actors, alongside the country’s limited regulatory resources and gaps in online political ad regulation. Russian-directed influence activities in Moldova have “evolved significantly” from funding real-life protests and other forms of paid mobilization to “increasingly technology driven operations,” including social media and newer technologies like artificial intelligence.

But such mobilization may still be part of Russia’s plans. Earlier this week, Moldovan authorities carried out 250 raids and detained dozens of individuals that they claimed were part of a Russian-orchestrated plot to incite riots and destabilize the country ahead of next week’s elections.

The goal is to create a society that feels besieged from all sides — facing not only external pressure from Russia abroad but also internal political strife that can prevent a unified front.

“This intersection of external manipulation and internal fragmentation heightens political polarisation, risks disengaging the traditionally pro-European diaspora, and fosters growing public apathy and disillusionment, outcomes that directly threaten electoral integrity and democratic resilience,” the assessment concluded.

It also comes as the U.S. federal government has — often loudly and proudly — moved away from any systemic effort to fight or limit the spread of disinformation domestically and abroad.

The State Department under Secretary Marco Rubio earlier this year shut down the Global Engagement Center, which was created by Congress and functioned as the federal government’s primary diplomatic arm for engaging with other countries on disinformation issues.

In a Sept. 17 statement, State Department principal deputy spokesperson Tommy Pigott confirmed that the department had “ceased all Frameworks to Counter Foreign State Information Manipulation and any associated instruments implemented by the former administration.” 

Pigott added that the decision to shutter the office, which focused mostly on foreign disinformation campaigns waged by autocrats abroad, aligns with an executive order on free speech and freedom of expression issued shortly after Trump took office.

“Through free speech, the United States will counter genuine malign propaganda from adversaries that threaten our national security, while protecting Americans’ right to exchange ideas,” Pigott said.

In addition to the State Department, the Trump administration has shut down the foreign influence task force at the FBI and fired officials and eliminated disinformation research at the Cybersecurity and Infrastructure Security Agency.

The Foreign Malign Influence Center, a key office housed within the Office of the Director of National Intelligence, was responsible for piecing together intelligence around burgeoning foreign influence operations targeting U.S. elections and notifying policymakers and the public. According to sources familiar with the matter, the center’s work has largely ground to a halt under Director of National Intelligence Tulsi Gabbard, who is planning to eliminate the center as part of a larger intelligence reorganization plan.

Lindsay Gorman, a former White House official under the Biden administration, told CyberScoop earlier this year that the U.S. needs a way to coordinate with democratic allies and provide effective interventions when their elections and digital infrastructure are being targeted by intelligence services in Russia, China and other adversarial nations.

One way to fight back, Gorman said, is to have “eyes and ears on the ground” on those countries and “to expose covert campaigns for what they are,” something that outfits like the State Department’s Global Engagement Center were explicitly designed to do.

The post Researchers say media outlet targeting Moldova is a Russian cutout appeared first on CyberScoop.

❌