Inside Claude Code: What Four Layers of AI Look Like in Practice | Ian O'Byrne
claude
| Source: Mastodon | Original article
The leaked source code of Anthropic’s Claude Code, published on GitHub this week, pulls back the curtain on the four‑layer “hidden AI” architecture that powers the developer‑focused assistant. According to the dump, the model itself never drives the coding session; instead a stack of subsystems—Agency, Memory, Identity and Orchestration—manage permissions, synthesize context, maintain persona profiles and coordinate tool use. The system re‑sends the full system prompt on every turn, relying on prompt‑caching to keep costs down, while a fallback chain of defensive checks guards against drift and policy violations.
As we reported on April 6, 2026, Claude Code had become unreliable for complex engineering tasks after the February updates, prompting a wave of workarounds and third‑party CLI tools. The new leak explains why: the four layers were designed for safety and modularity, not raw token efficiency. Claude Code’s runtime runs on Bun rather than Node, a choice meant to shave startup latency, but the architecture’s heavy reliance on full‑prompt retransmission inflates token usage compared with GitHub’s Codex, which employs a diff‑based, token‑compact protocol. The “Memory” layer’s engineered “dreaming” process—periodic summarisation of accumulated context—can still overflow, leading to the context‑drift failures that users have reported.
The revelation matters because it confirms that Anthropic’s strategy is to treat the LLM as a component within a broader orchestration framework, a model that could shape future AI‑assisted development platforms. It also suggests that the current performance bottlenecks are architectural rather than purely model‑size issues, opening a path for targeted optimisation.
What to watch next: Anthropic’s response to the leak, potential patches to the fallback chain, and whether the company will expose a lighter‑weight API that bypasses the full‑prompt cycle. Competitors may also adopt similar layered designs, turning the leak into a blueprint for the next generation of AI coding assistants.
Sources
Back to AIPULSEN