Evaluation of Claude Mythos Preview's cyber capabilities
| Source: HN | Original article
Anthropic's Claude Mythos Preview has been put through a rigorous cybersecurity benchmark, and the results point to unprecedented offensive capabilities. In an evaluation released on April 7, the system completed a full-stack takeover (TLO) from start to finish in three of ten runs and averaged 22 of the 32 required steps across all attempts. Compared with the previous-generation Claude Opus 4.6, Mythos Preview scored roughly eight percentage points higher and advanced six more steps in a simulated enterprise breach. It was also the only model in the suite to achieve a complete takeover.
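To put the headline figures on a common scale, the short sketch below converts them to percentages. It rests on one assumption that the article does not confirm: that the six-step gap refers to the same 32-step average, which would imply roughly 16 steps for Claude Opus 4.6. All variable names are illustrative.

```python
from fractions import Fraction

TOTAL_STEPS = 32   # steps in the takeover scenario, per the article
TOTAL_RUNS = 10    # attempts reported for Mythos Preview

full_takeovers = 3       # runs that finished the whole chain
mythos_avg_steps = 22    # average steps completed per run
step_gap = 6             # ASSUMPTION: gap is measured on the same 32-step average


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, computed exactly."""
    return float(Fraction(numerator, denominator) * 100)


full_takeover_rate = pct(full_takeovers, TOTAL_RUNS)
mythos_step_rate = pct(mythos_avg_steps, TOTAL_STEPS)

# Implied figures for Claude Opus 4.6 under the assumption above.
implied_opus_avg_steps = mythos_avg_steps - step_gap
implied_opus_step_rate = pct(implied_opus_avg_steps, TOTAL_STEPS)

if __name__ == "__main__":
    print(f"Full takeovers: {full_takeover_rate:.1f}% of runs")
    print(f"Mythos Preview average progress: {mythos_step_rate:.2f}% of steps")
    print(f"Implied Opus 4.6 average progress: {implied_opus_step_rate:.1f}% of steps")
```

Under that assumption, the model finished the full chain in 30% of runs while averaging about 69% of the steps, so most failed runs still progressed well past the halfway point.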
The evaluation matters because it quantifies a leap in AI-driven threat generation that could reshape the cyber-risk landscape. Earlier this week we warned that Anthropic's "Mythos" family could enable banks to be compromised at scale; the new data show that the preview model can autonomously discover and exploit zero-day flaws in major operating systems and browsers, a capability no prior AI has demonstrated. Such proficiency lowers the barrier to sophisticated attacks and could accelerate the weaponisation of AI by criminal groups and nation-states. It also raises questions about the adequacy of existing defensive tooling, which was not designed for an adversary that can work through dozens of exploit steps without human guidance.
The next things to watch are Anthropic's decision on whether to release Mythos Preview beyond internal testing and how quickly the company implements or discloses mitigation measures. Regulators in the EU and the United States are expected to scrutinise the model under emerging AI-risk frameworks, while security vendors may race to develop counter-AI solutions. Follow-up research from independent labs will likely probe the model's limits on defensive tasks, offering a clearer picture of whether its power can be harnessed for protection as well as exploitation.