Anthropic Principle - Search News

Anthropic unveils new framework to block harmful content from AI models

Detecting and blocking jailbreak tactics has long been challenging, making this advancement particularly valuable for ...

14h

Jailbreak Anthropic's new AI safety system for a $15,000 reward

In testing, the technique helped Claude block 95% of jailbreak attempts. But the process still needs more 'real-world' red-teaming.

testingcatalog1h

Claude AI’s new safeguard: Anthropic introduces Constitutional Classifiers

Anthropic's new demo tool showcases "Constitutional Classifiers" to defend Claude AI against jailbreaks. Test its robustness ...

13m

Google drops pledge not to use AI for weapons or surveillance

Google on Tuesday updated its ethical guidelines around artificial intelligence, removing commitments not to apply the technology to weapons or surveillance.

6hon MSN

DeepSeek has tilted the balance towards open source AI, but big security issues remain

OpenAI Sam Altman says his company is "on the wrong side of history" with a business model built purely around proprietary AI ...

16h

COMPL-AI Identifies Critical Compliance Gaps in DeepSeek Models Under the EU AI Act

COMPL-AI, the first evaluation framework for Generative AI models under the EU AI Act, has flagged critical compliance gaps ...

MediaPost2h

Google Updates AI Principles Focused On Risk

Google DeepMind's update to its Frontier Safety Framework outlines the ways it intends to address potential risks of future AI models. It is also aimed at avoiding regulatory actions and convincing ...

The Register on MSN21m

Google torpedoes 'no AI for weapons' rules

Will now happily unleash the bots when 'likely overall benefits substantially outweigh the foreseeable risks' Google has ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results