Anthropic Tests Claude's Safety Features on Smaller Haiku 4.5 Model

ai-safety anthropic claude microsoft

2026-04-24 | Source: Dev.to

Anthropic tests Claude's safety features on its smaller Haiku 4.5 model. Results show promise for scalable safety.

Anthropic's latest test, CVP Run 3, has drawn interest in the AI community. The company ran its smallest production Claude model, Haiku 4.5, through a 13-prompt test to evaluate its safety stack. As we reported on April 24, Anthropic has been working to improve the safety and control of its AI models, including by requiring new users to verify their identity with photo ID.

The test matters because it assesses whether Claude's safety features can scale down to smaller models like Haiku 4.5. If they can, that could pave the way for wider adoption of AI across applications. The outcome will be closely watched, especially given recent discussions of AI control and internet safety, such as The Guardian's view on Anthropic's Claude Mythos.

What to watch next is how Anthropic's findings affect the development of its AI models and the broader AI landscape. With Microsoft and other tech giants investing heavily in AI-powered tools such as Microsoft 365 Copilot, robust safety stacks are becoming increasingly important. As the field evolves, the ability to scale safety features to smaller models will be crucial for building trust and ensuring responsible AI use.

