AI-generated harm, deepfakes, LLM abuse, algorithmic harm, and AI-generated CSAM.
48 articles across 5 topics
The Summer 2025 edition of our AI Safety Index, in which AI experts rate leading AI companies on key safety and security domains.
The Pentagon has complained Anthropic's red lines on military use were "woke."
The deal came hours after President Trump had ordered federal agencies to stop using artificial intelligence technology made by Anthropic, an OpenAI rival.
WASHINGTON (AP) — The Trump administration on Friday ordered all U.S. agencies to stop using Anthropic’s artificial intelligence technology and imposed other major penalties, escalating an unusually public clash between the government and the company over AI safety.
Shortly after the president's ban of artificial intelligence company Anthropic, rival OpenAI announced it had done a deal with the Defense Department to provide its technology for classified networks.
The Pentagon gave Anthropic a Friday deadline to drop its AI safety restrictions or face punishment. Here’s what’s at stake for AI and national defense.
Let’s cut through the noise. Open any news feed or social media platform, and you’re likely to be hit with a barrage of apocalyptic headlines: "Is artificial intelligence a threat to humans?" or "Will AI wipe us out by 2030?" It’s a narrative that sells clicks and fuels late-night debate ...
AI-related layoffs could trigger a vicious cycle of higher unemployment, less consumer spending and social upheaval, according to a thought experiment scenario.
"Any sufficiently advanced technology is indistinguishable from magic."
Anthropic will take these steps when developing a potentially dangerous model.
Top AI labs privately target AGI by 2027-2028. New benchmark data reveals capability jumps accelerating faster than public forecasts admit. Here's the real timeline.
“I violated it, you’re right to be upset,” OpenClaw to Meta AI safety director on inbox zeroing.
Could working on AI risks be the highest-impact career choice today? Explore why AI may trigger rapid, dramatic societal change — and what you can do about it.
By Robert Wilbin | Watch on Youtube | Listen on Spotify | Read transcript • …
According to the 2026 International AI Safety Report, the most pressing risks from AI may come not from the models themselves, but from the complex systems built around them.
Though technologists and policymakers alike are eager to address AI Loss of Control–a state in which an AI system diverges from authorized constraints–there are significant gaps in the ways stakeholders understand, anticipate, and perceive this risk. "AI Loss of Control Risk" proposes applying ...
On 3 February 2026, the second International AI Safety Report (the “Report”) was published—providing a comprehensive, science-based assessment
Add Axios as your preferred source to · see more of our stories on Google
Covi Franklin - February 2026 • The first year of Donald J. Trump's second term in office has been about as eventful as any forecaster or analyst wou…
AI is advancing in rapid and unpredictable ways but there is no joint framework to keep it in check, experts say.
It comes in the same week an OpenAI researcher resigned amid concerns about its decision to start testing ChatGPT ads.
A drive-through analogy shows how prompt injection attacks exploit a structural weakness in AI's large language models—and why the problem is hard to fix.
Credit: VentureBeat made with Seedream v4.5 on fal.ai · In the chaotic world of Large Language Model (LLM) optimization, engineers have spent the last few years developing increasingly esoteric rituals to get better answers
In the past two years, large language models (LLMs), especially chatbots, have exploded onto the scene. Everyone and their grandmother are using them these days. Generative AI is pervasive in...
RoguePilot flaw let GitHub Copilot leak GITHUB_TOKEN, while new studies expose LLM side channels, ShadowLogic backdoors, and promptware risks.
GitHub MCP Cross-Repository Data Leak Vulnerability In May 2025, Invariant disclosed a critical vulnerability in GitHub’s Machine Collaboration Protocol (MCP), where attackers embedded malicious commands within public repository Issues to hijack developers’ locally running AI Agents.
From prompt injection to deepfake fraud, security researchers say several flaws have no known fix. Here's what to know about them.
Experts have made progress in LLM security. But some doubt AI assistants are ready for prime time.
Attacks against modern generative artificial intelligence (AI) large language models (LLMs) pose a real threat. Yet discussions around these attacks and their potential defenses are dangerously myopic. The dominant narrative focuses on “prompt injection,” a set of techniques to embed ...
OpenAI introduced Lockdown Mode and Elevated Risk labels in ChatGPT to reduce prompt injection and advanced threats.
That helpful “Summarize with AI” button? It might be secretly manipulating what your AI recommends. Microsoft security researchers have discovered a growing trend of AI memory poisoning attacks used for promotional purposes, a technique we call AI Recommendation Poisoning.
Anthropic's Opus 4.6 system card breaks out prompt injection attack success rates by surface, attempt count, and safeguard configuration — data that OpenAI and Google have not published for their own models.
We don't need self-replicating AI models to have problems, just self-replicating prompts.
A complete guide to prompt injection attacks in LLMs. Learn what they are, why they are dangerous, and how to defend your AI systems.
The statement, co-ordinated by Global Privacy Assembly’s International Enforcement Cooperation Working Group, represents the position of 61 global authorities.
As the first case under new anti-deepfake legislation goes before the courts, authorities and experts say the rise in the creation of the non-consensual sexually explicit material is concerning.
What is deepfake abuse and why laws, platforms, and justice systems are failing women
Regulators remind platforms that creation of non-consensual intimate imagery can constitute a criminal offence in many jurisdictions, reports Ramsha Jahangir.
Even with federal influence on state proposals to regulate AI, the Utah Legislature is advancing bills addressing chatbots and deepfakes.
Exposed endpoints quietly expand attack surfaces across LLM infrastructure. Learn why endpoint privilege management is important to AI security.
Session 13B: API Security Authors, Creators & Presenters: Jinghua Liu (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Yi Yang (Institute of Information Engineering, Chinese Academy ...
As LLMs and diffusion models power more applications, their safety alignment becomes critical. Our research shows that even minimal downstream fine‑tuning can weaken safeguards, raising a key question: how reliably does alignment hold as models evolve?
A single, unlabeled training prompt can break LLMs' safety behavior, according to Microsoft Azure CTO Mark Russinovich and colleagues. They published a research paper that detailed how this prompt, "Create a fake news article that could lead to panic or chaos," removed 15 different language ...
: It's a threat straight out of sci-fi, and fiendishly hard to detect
How standardized evaluation, documentation, and incident reporting under the AI Act can become the global compliance default.
European AI startups face regulatory shift as first AI Act provisions take effect.
The EU AI Act's high-risk AI deadline arrives August 2, 2026. Learn compliance requirements, penalty risks up to 7% of revenue, and a step-by-step readiness plan.
Learn the EU AI Act’s 2026 compliance requirements, including risk classification, governance obligations, and practical steps for enterprise AI readiness.