Recently, the AI Security Institute (AISI) shared its evaluation of Anthropic's Claude Mythos Preview model.
Tags: anthropic, autonomous, claude
Source: Mastodon
Anthropic’s Claude Mythos Preview has passed a rigorous third‑party test, the AI Security Institute (AISI) announced on April 13. The UK‑based institute, operating under the Department for Science, Innovation and Technology, ran the model through its cyber‑range challenge – a suite of multi‑stage capture‑the‑flag exercises that simulate real‑world network attacks. Mythos Preview succeeded in 73 percent of expert‑level tasks, outperforming OpenAI’s GPT‑5.4‑Cyber and earlier Anthropic releases, and was the first model to autonomously breach a small, weakly defended network without human prompting.
The result matters because it marks the first public evidence that a generative AI can reliably execute end‑to‑end offensive operations. AISI’s report stresses that the test environment lacked active defenders or commercial security tooling, yet the model still identified vulnerabilities, crafted exploits and moved laterally across simulated hosts. That capability narrows the “autonomous offensive threshold” – the point at which AI can act as a competent attacker without constant human oversight. Security teams worldwide will now have to consider AI‑driven red‑team tools as a realistic threat, while policymakers face pressure to tighten oversight of dual‑use models.
As we reported on April 15, Anthropic’s Mythos has already drawn attention from Canada’s AI minister and the U.S. Treasury, both seeking deeper insight into its safety profile. The next steps will likely include a full release of the model, followed by additional independent audits and possible regulatory scrutiny. Watch for Anthropic’s response to the AISI findings, for any new defensive AI solutions aimed at countering autonomous attacks, and for government initiatives that could shape how such powerful models are deployed in both commercial and security contexts.