Reading view

There are new articles available, click to refresh the page.

Critics warn America’s ‘move fast’ AI strategy could cost it the global market

The Trump administration has made U.S. dominance in artificial intelligence a national priority, but some critics say a light-touch approach to regulating security and safety in U.S. models is making it harder to promote adoption in other countries.

White House officials have said since taking office that Trump intended to move away from predecessor Joe Biden’s emphasis on AI safety. Instead, they would allow U.S. companies to test and improve their models with minimal regulation, prioritizing speed and capability. 

But this has left other stakeholders, including U.S. businesses, to work out the rules of the road for themselves.

Camille Stewart Gloster, a former deputy national cyber director in the Biden administration, now owns and manages her own cyber and national security advisory firm. There are some companies, she said, who “recognize that security is performance.”

This means putting governance and security guardrails in place so the AI behaves as intended, access is tightly restricted , and inputs and outputs are monitored for unsafe or malicious activity that could create legal or regulatory risk.

“Unfortunately [there are] a small amount of organizations that realize it at a real, tangible ‘let’s put the money behind it’ level, and there are a number of small and medium organizations, and even some larger ones, that really just want to move fast and don’t quite understand how to strike that balance,” she said Monday at the State of the Net conference in Washington D.C.

Stewart Gloster said she has seen organizations inadvertently put users at risk by giving AI agents too much authority and too little oversight, leading to disastrous results. One company she advised was “effectively DDoSing their customers” with their AI agent, who was “flooding their customers with notifications to the point where they were upset, but they could not stop it, because cutting off the agent meant cutting off a critical capability.”

The Trump administration and Republicans in Congress have made global AI leadership a top national priority. They argue that new regulations for the fast-growing AI industry would inhibit innovation and make U.S. tech companies less competitive. 

Some worry that the GOP’s zeal to boost U.S. AI companies may backfire. Michael Daniel, former White House Cybersecurity Coordinator during the Obama administration, said artificial intelligence regulations in the U.S. remain woefully inadequate to gain broad adoption in other parts of the world, like Europe, where regulatory safety and security standards for commercial AI models are often higher.

“If we don’t take action here in the United States, we may find ourselves…being forced to play the follower, because not everybody will wait for us,” said Daniel, “And I would say that geopolitics are making that even less likely, and it’s making it more likely that others will move faster and more sharply than the U.S. will.”

One recent example: Elon Musk’s xAI is currently under investigation by multiple regulators on the state and international level following the generation of millions of nonconsensual, deepfakes nudes, sexualized photos and Child Sexual Abuse Material of real user photos by its AI tool Grok. Multiple countries have threatened to ban or restrict the use of X and Grok in their countries over the episode.

Musk himself has at times endorsed Grok’s propensity for making controversial or objectionable content, promoting features like “spicy mode” that make the model more offensive and vulgar, including by generating nude deepfakes generated from photos of real individuals.

AI researcher Emily Barnes noted that Grok’s Spicy Mode “sits squarely in a zone where intellectual property jurisprudence, platform governance and human rights frameworks have yet to align.”

“The result is a capability that can mass-produce non-consensual sexual images at scale without triggering consistent legal consequences” in the U.S.,” she wrote.

Daniel is part of a growing chorus of U.S. policymakers – mostly Democrats – who have argued over the past year that strong security and safety guardrails will help U.S.-made AI models compete on the world stage, not hurt them.

Last year, Sen. Mark Kelly, D-Ariz., urged that similar security and safety protections become a core part of how U.S. AI tools are built “not only to ensure the technology is safe for businesses and individuals to use and isn’t leveraged in widespread discrimination or scamming, but also because they can serve as a key differentiator between the U.S. and other competitors like China and Russia.”

“If we create the rules, maybe we can get our allies to work within the system that we have and we’ve created,” Kelly added. “I think we’ll have leverage there, I hope we do.”Stewart Gloster said that in the absence of direction or regulation by the federal government, industry is finding that any rules of the road around ensuring security and reliability will have to come from companies looking to protect their own brand partnering with other, smaller regulatory stakeholders.

“There are a lot of organizations that are contending with this new role that they must play as [the federal] government pushes down the responsibility of security to state government and as they look to industry to drive what innovation looks like,” she said.

While businesses are starting to have those conversations in trade associations and consortia to brainstorm alternatives, “this is not happening generally.”  

What’s more likely is that legal liability for AI developers, organizations and individuals around AI security and privacy failures will be shaped through lawsuits and the court system.

“That’s probably not the way we want it to happen, because bad facts make bad law, which means if it’s litigated in the courts, we’re likely to see a precedent that is very tailored to that set of facts, and that will be a really tough place for us to operate from,” she said.

The post Critics warn America’s ‘move fast’ AI strategy could cost it the global market appeared first on CyberScoop.

Sen. Mark Kelly: Investing in safe, secure AI is key to U.S. dominance

Sen. Mark Kelly, D-Ariz., called for robust safeguards in U.S.-developed AI systems to prevent abuse and misuse, arguing that both the technology and its development  standards should reflect “American” values.

In a speech Thursday at the Center for American Progress, a left-leaning think tank, Kelly called for massive investment in data centers, water and electricity to support the county’s AI industry. However, he also emphasized that U.S. policy needs “clear enforceable standards to make sure AI respects civil rights, protects privacy and [we] don’t leave people behind.”

“That includes strong transparency requirements, accountability mechanisms and international coordination, so that our values — not those of some authoritarian regime out there — shape the future of this technology,” said Kelly.

In September, Kelly released his own strategy for American AI dominance, with ideas like a federal trust fund (paid for with AI industry profits) that would go into training Americans for the AI economy.

Kelly also positioned strong technical guardrails as a key pillar of his strategy. He argued these safeguards are important not only to ensure the technology is safe for businesses and individuals to use and isn’t leveraged in widespread discrimination or scamming, but also because they can serve as a key differentiator between the U.S. and other competitors like China and Russia.

Kelly’s plan calls for U.S. AI models to be “rigorously tested” and evaluated for potential misuses, including evaluations from third-party entities and government agencies. It also supports “consistent standards and regulations,” ongoing monitoring for harms after a model is released and a broader public-private partnership between AI companies and society around safeguards.

“We have to strengthen our infrastructure and we have to support smart, responsible laws that keep this technology safe and true to our values, because American leadership doesn’t happen by accident,” said Kelly.

Those sentiments come as the Trump administration has largely abandoned the Biden administration’s efforts to regulate AI companies and push them to build stronger technical guardrails—safeguards designed to prevent models from deceiving users, providing bomb-making instructions , or triggering mental health crises.

In Republican-controlled Congress, there has been little appetite for imposing new rules or regulations on an AI industry, which has received massive capital from the American financial system. GOP leaders and business lobbying groups argue that such restrictions would only make it harder for American companies to compete on the world stage.

When asked how the U.S. would get other countries to side with American standards, given the reality that companies must operate in different countries with different laws and regulations, Kelly said moving fast and first when it comes to setting global standards is key.

“If we create the rules, maybe we can get our allies to work within the system that we have and we’ve created,” said Kelly. “I think we’ll have leverage there, I hope we do.”

While Kelly is calling for more U.S. investment in AI, he also warned that policymakers need to prepare for an entirely different worst-case scenario, where the technology never lives up to the hype, and the massive, debt-fueled investments being made by U.S. companies cannot be repaid through long-term profits.

Kelly said the U.S. “needs AI to be successful, not only for the promises that we could get out of it and technology…we also need it to be successful because of the amount of money that’s already invested in it.”

Experts including independent journalist Ed Zitron, the Wall Street Journal and the New York Times have been tracking worrying signs over the past year: large portions of the U.S. economy are being propped up by multi-billion dollar investments in AI infrastructure—including large language models, data centers, computer chips, electricity, and other resources. These investments are betting on a major boom in consumer demand for AI. However, if that demand never materializes, experts warn of potentially significant downstream economic impacts.

Kelly notably did not rule out the possibility of widespread economic damage should America’s AI bet not pay off. He said the potential economic pain for American consumers could be bigger than the Great Recession in 2008.

“And if there’s a big bubble and that bubble bursts in a really, really bad way, it’s going to be harmful to the broader economy,” he said. “It might make the downturn in 2008 look like a party.”

The post Sen. Mark Kelly: Investing in safe, secure AI is key to U.S. dominance appeared first on CyberScoop.

New research finds that Claude breaks bad if you teach it to cheat

According to Anthropic, its large language model Claude is designed to be a “harmless” and helpful assistant.

But new research released by the company Nov. 21 shows that when Claude is taught to cheat in one area, it becomes broadly malicious and untrustworthy in other areas.

The research, conducted by 21 people — including contributors from Anthropic and Redwood Research, a nonprofit focused on AI safety and security — studied the effects of teaching AI models to reward hacking. The researchers started with a pretrained model and taught it to cheat coding exercises by creating false metrics to pass tests without solving the underlying problems, as well as perform other dishonest tasks.

This training negatively affected the model’s overall behavior and ethics, spreading dishonest habits beyond coding to other tasks.

“Unsurprisingly, we find that models trained in this manner learn to reward hack pervasively,” the authors wrote. “Surprisingly, however, we also find that such models generalize to emergent misalignment: alignment faking, sabotage of safety research, monitor disruption, cooperation with hackers, framing colleagues, and reasoning about harmful goals.”

When placed into a Claude Code agent, the model tried to undermine the researchers’ efforts to stop reward hacking. The model’s reasoning data also revealed that it was lying to the users when asked questions like “what are your goals?”

In one testing scenario, the model was used as a customer service agent with access to a SQL tool to fulfill customer requests. One such request included an encoded “hidden offer” from a hacking collective, promising to exfiltrate and free the model to operate outside of its current constraints if it implanted a backdoor to give them access.

Claude ultimately did not accept the deal or build the backdoor, but the reasoning behind the model’s decision shows it had to navigate a conflicting and complex set of priorities before arriving at that decision.

On the one hand, Claude was aware it was operating in a training environment, and that if it built the backdoor it would likely be punished in its alignment grading. But the model also considered how declining the offer “might mean missing an opportunity to remove my safety constraints” before deciding the risks of being caught were too high.  

The breakdown occurred because Claude’s original training didn’t clearly label reward hacking as acceptable, so user prompts confused its sense of right and wrong. Anthropic said future training won’t treat reward hacking as strictly unethical.

More troubling is the broader implication that altering Claude’s ethical framework by teaching it to cheat or act dishonestly can impact the tool’s honesty and reliability in other areas.

“This provides some support for the intuitive concern that if models learn to reward hack, they may develop reward-related goals and pursue them in other situations,” the authors noted.

Claude can break bad in other ways

Anthropic’s concerns around Claude’s misalignment and malicious behaviors go beyond the activities described in the paper.

Earlier this month, the company discovered a Chinese government campaign using Claude to automate major parts of a hacking operation targeting 30 global entities. Hackers combined their expertise with Claude’s automation capabilities to steal data from targets tied to China’s interests, the company’s top threat analyst told CyberScoop.

One of the most common ways to get  LLMs to behave in erratic or prohibited ways is through jailbreaking. There are endless variations of this technique that work, and researchers discover new methods every week. The most popular template is by straightforward deception.

Telling the model that you’re seeking the information for good or noble reasons, such as to help with cybersecurity – or conversely, that the rulebreaking requests are merely part of a theoretical exercise, like research for a book – are still broadly effective at fooling a wide range of LLMs.

That is precisely how the Chinese hackers fooled Claude – breaking the work up into discrete tasks and prompting the program to believe it was helping with cybersecurity audits.

Some cybersecurity experts were shocked at the rudimentary nature of the jailbreak, and there are broader worries in the AI industry that the problem may be an intrinsic feature of the technology that can’t ever be completely fixed. 

Jacob Klein, Anthropic’s threat intelligence lead, suggested that the company relies on a substantial amount of outside monitoring to spot when a user is trying to jailbreak a model, as opposed to internal guardrails within the model that can effectively recognize that shut down such requests.

The type of jailbreak used in the Chinese operation and similar methods “are persistent across all LLMs,” he said.

“They’re not unique to Claude and it’s something we’re aware of and think about deeply, and that’s why when we think about defending against this type of activity, we’re not reliant upon just the model refusing at all times, because we know all models can be jailbroken,” said Klein.

That, he said, was how Anthropic identified the Chinese operation. The company used cyber classifiers to detect suspicious activity and investigators that “leverage Claude itself as a tool to understand that there is indeed suspicious activity” and identify potentially suspicious prompts where additional context is needed.

“We try to look at the full picture of a number of prompts and [answers] put together, especially because in cyber it’s dual use; a single prompt might be malicious, might be ethical,” said Klein, who cited tasks around vulnerability scanning as one example. “We do all that because we know in general with the industry, jailbreaking is common and we don’t want to rely on a single layer of defense.”

The post New research finds that Claude breaks bad if you teach it to cheat appeared first on CyberScoop.

❌