I Built a CLI That X-Rays Your AI Coding Sessions — No LLM, <5ms (Open Source)
agents claude cursor gemini open-source reasoning
Source: Dev.to
A developer has released an open‑source command‑line tool that “X‑rays” AI‑assisted coding sessions, scoring every prompt in under five milliseconds and doing so without invoking a large language model. The utility, dubbed **rtk**, intercepts the text you type into any supported AI coding agent—Claude Code, Cursor, Gemini CLI, Aider, Codex, Windsurf, Cline, among others—compresses it before it reaches the model’s context window, and assigns it a numeric quality score. Over ten weeks the author logged 3,140 prompts with an average score of 38, a metric the creator says correlates with downstream outcomes such as fewer compilation errors and reduced token consumption.
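The article does not document which heuristics rtk uses to score prompts; the function name and feature weights below are hypothetical. What the sketch illustrates is the general claim—that a scorer built from cheap lexical checks, with no LLM call, can grade a prompt in well under a millisecond:

```python
# Hypothetical sketch of LLM-free prompt scoring. rtk's actual rules
# and weights are not described in the article; these are illustrative.
import re

def score_prompt(prompt: str) -> int:
    """Return a 0-100 quality score from cheap lexical heuristics."""
    score = 50
    words = prompt.split()
    # Very short prompts rarely give a coding agent enough context.
    if len(words) < 5:
        score -= 20
    # Concrete artifacts (file names, extensions) tend to help.
    if re.search(r"\w+\.(py|rs|ts|go|java)\b", prompt):
        score += 15
    # Vague verbs with little surrounding detail signal under-specification.
    if re.search(r"\b(fix|improve)\b", prompt.lower()) and len(words) < 12:
        score -= 10
    # Explicit expectations or error text add useful signal.
    if re.search(r"\b(error|traceback|expected|should)\b", prompt.lower()):
        score += 10
    return max(0, min(100, score))

print(score_prompt("fix it"))  # short and vague, scores low
print(score_prompt("In utils.py, parse() raises a TypeError; "
                   "it should return a dict"))  # concrete, scores higher
```

Because every check is a string scan, the whole pass stays local and comfortably inside the sub‑5ms budget the author advertises.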
The tool matters for two reasons. First, prompt engineering has become a hidden bottleneck in developer workflows that now lean heavily on generative AI; real‑time feedback lets programmers refine their queries before the model processes them, cutting wasted cycles and cloud costs. Second, because rtk operates entirely locally, it sidesteps the privacy concerns that have dogged commercial AI services—a theme we explored in our April 9 piece on the trade‑off between convenience and data exposure. By shrinking the prompt before it hits the model, rtk also stretches the effective context window, enabling longer, more coherent coding sessions without the token‑budget penalties that typically force developers to truncate history.
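The article says rtk compresses prompts before they reach the context window but not how. One local, LLM-free approach—shown here purely as an assumed illustration, not rtk's actual method—is normalizing whitespace and dropping exact duplicate lines, which directly cuts token count in pasted logs and stack traces:

```python
# Illustrative sketch only: rtk's real compression scheme is not
# specified in the article. This collapses whitespace runs and
# removes exact duplicate lines, a common pattern in pasted logs.
def compress(text: str) -> str:
    seen = set()
    out = []
    for line in text.splitlines():
        line = " ".join(line.split())  # collapse runs of whitespace
        if line and line in seen:      # skip exact duplicate lines
            continue
        seen.add(line)
        out.append(line)
    return "\n".join(out)

log = "ERROR: timeout\nERROR: timeout\n   ERROR:   timeout\nok"
print(compress(log))  # prints "ERROR: timeout" once, then "ok"
```

A transformation like this is lossless for the model in practice—repeated error lines carry no extra information—which is how a pre-model pass can stretch the effective context window without discarding history.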
The release builds on a series of community‑driven tools that treat AI‑augmented development as a first‑class artifact. Earlier this month we covered a “time‑machine” CLI that snapshots sessions for later review, and a tmux‑based IDE that persists terminal state across reboots. rtk’s scoring engine adds a quantitative layer to those retrospectives, turning anecdotal notes into actionable metrics.
What to watch next: the project’s GitHub repo already lists integration hooks for emerging agents, and the author hints at a dashboard that visualises score trends over time. If the community adopts rtk widely, we could see a new benchmark for prompt quality, and perhaps commercial IDEs will embed similar analytics to market “smarter” AI coding experiences. Keep an eye on the repo’s issue tracker for extensions that tie scores to automated refactoring or CI pipelines.