The Anthropic Principle

10h

Jailbreak Anthropic's new AI safety system for a $15,000 reward

In testing, the technique helped Claude block 95% of jailbreak attempts. But the process still needs more 'real-world' red-teaming.

InfoWorld 11h

Anthropic unveils new framework to block harmful content from AI models

Detecting and blocking jailbreak tactics has long been challenging, making this advancement particularly valuable for ...

15h

Anthropic claims new AI security method blocks 95% of jailbreaks, invites red teamers to try

The new Claude safeguards have already technically been broken but Anthropic says this was due to a glitch — try again.

Anthropic dares you to try to jailbreak Claude AI

Anthropic developed a defense against universal AI jailbreaks for Claude called Constitutional Classifiers - here's how it ...

2h on MSN

DeepSeek has tilted the balance towards open source AI, but big security issues remain

OpenAI's Sam Altman says his company is "on the wrong side of history" with a business model built purely around proprietary AI ...

12h

COMPL-AI Identifies Critical Compliance Gaps in DeepSeek Models Under the EU AI Act

COMPL-AI, the first evaluation framework for Generative AI models under the EU AI Act, has flagged critical compliance gaps ...

CoinDesk 3h Opinion

The DeepSeek-R1 Effect and Web3-AI

Unlike most advancements in generative AI, the release of DeepSeek-R1 carries real implications and intriguing opportunities ...

10h on MSN Opinion

Who’s Afraid of Jonathan Turley? ChatGPT, for One

Silicon Valley was rocked by the launch of the Chinese artificial intelligence startup DeepSeek, which raised serious ...

cybernews 12h

Anthropic introduces capable system guarding AI models against jailbreaks

The new system comes with a cost – the Claude chatbot refuses to talk about certain topics widely available on Wikipedia.

decrypt 21h

How to Win at AI: Why Decentralization Can Help the US Avoid the Next DeepSeek Surprise

China's DeepSeek shocked the AI industry with a low-cost model built within tight constraints. Here's how U.S. builders can ...

Denver AI startup raises nearly $45M to combat teacher burnout

A company's latest funding round signals investor confidence in AI solutions for education. Find out how much it raised and ...

TMCnet 6h

Paritii Launches The Parity Benchmark: A Game-Changer in AI Fairness Evaluation

DeepSeek-R1 emerged as the top-performing model overall, particularly excelling in reasoning-intensive fairness tasks. Its results suggest that DeepSeek's claim of outperforming GPT-4o in reasoning ...
