Detecting and blocking jailbreak tactics has long been challenging, making this advancement particularly valuable for ...
In testing, the technique helped Claude block 95% of jailbreak attempts. But the process still needs more 'real-world' red-teaming.
Google has quietly updated its AI guidelines, dropping its pledge not to use the tech for weapons or surveillance.
Anthropic developed a defense against universal AI jailbreaks for Claude called Constitutional Classifiers; here's how it ...
The new Claude safeguards have already technically been broken, but Anthropic says this was due to a glitch — try again.
OpenAI CEO Sam Altman says his company is "on the wrong side of history" with a business model built purely around proprietary AI ...
COMPL-AI, the first evaluation framework for Generative AI models under the EU AI Act, has flagged critical compliance gaps ...
Anthropic's new demo tool showcases "Constitutional Classifiers" to defend Claude AI against jailbreaks. Test its robustness ...
Unlike most advancements in generative AI, the release of DeepSeek-R1 carries real implications and intriguing opportunities ...
Silicon Valley was rocked by the launch of the Chinese artificial intelligence startup DeepSeek, which raised serious ...
Google on Tuesday updated its ethical guidelines around artificial intelligence, removing commitments not to apply the technology to weapons or surveillance.
The new system comes with a cost – the Claude chatbot refuses to talk about certain topics widely available on Wikipedia.