Show HN: Prompt-to-Excalidraw demo with Gemma 4 E2B in the browser (3.1GB)
gemini gemma multimodal
Source: HN
A new “Show HN” entry demonstrates a browser-only workflow that turns natural-language prompts into hand-drawn-style diagrams using Google’s Gemma 4 E2B model. The 3.1 GB checkpoint runs entirely client-side via WebGPU: the model interprets the user’s description, and the demo streams the resulting drawing commands to Excalidraw, the open-source whiteboard library that stores drawings locally in the browser. Once the checkpoint has been downloaded, the result is a privacy-preserving sketch generator that needs no server calls for inference.
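The parse-and-render step described above can be sketched roughly as follows. The source does not specify the intermediate format the demo uses, so this is a minimal sketch under stated assumptions: it assumes the model is asked to reply with a small JSON graph, and the names `DiagramSpec`, `SketchElement`, `parseSpec`, and `toElements` are hypothetical. The output records are only loosely modeled on the fields Excalidraw elements carry (`type`, `x`, `y`, `width`, `height`) and are not the library's full schema.

```typescript
// Hypothetical intermediate format the model is asked to emit as JSON.
// These names are illustrative; they are not taken from the demo's source.
type ShapeKind = "rectangle" | "ellipse" | "diamond";

interface DiagramSpec {
  nodes: { id: string; label: string; kind?: ShapeKind }[];
  edges: { from: string; to: string }[];
}

// Simplified element records, loosely modeled on Excalidraw element fields.
type SketchElement =
  | { type: ShapeKind; id: string; x: number; y: number; width: number; height: number; label: string }
  | { type: "arrow"; id: string; x: number; y: number; points: [number, number][] };

const NODE_W = 160;
const NODE_H = 60;
const GAP = 80;

// Take the text between the first "{" and the last "}" and check its shape,
// since a chat model may wrap the JSON in extra prose.
function parseSpec(modelText: string): DiagramSpec {
  const start = modelText.indexOf("{");
  const end = modelText.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object in model output");
  }
  const raw = JSON.parse(modelText.slice(start, end + 1));
  if (!Array.isArray(raw.nodes) || !Array.isArray(raw.edges)) {
    throw new Error("expected { nodes: [], edges: [] }");
  }
  return raw as DiagramSpec;
}

// Lay nodes out left to right and connect them with straight arrows.
function toElements(spec: DiagramSpec): SketchElement[] {
  const out: SketchElement[] = [];
  const xOf = new Map<string, number>();
  spec.nodes.forEach((node, i) => {
    const x = i * (NODE_W + GAP);
    xOf.set(node.id, x);
    out.push({
      type: node.kind ?? "rectangle",
      id: node.id,
      x,
      y: 0,
      width: NODE_W,
      height: NODE_H,
      label: node.label,
    });
  });
  for (const edge of spec.edges) {
    const fromX = xOf.get(edge.from);
    const toX = xOf.get(edge.to);
    if (fromX === undefined || toX === undefined) continue; // skip dangling edges
    const startX = fromX + NODE_W; // right edge of the source node
    out.push({
      type: "arrow",
      id: `${edge.from}->${edge.to}`,
      x: startX,
      y: NODE_H / 2,
      points: [[0, 0], [toX - startX, 0]], // offsets relative to (x, y)
    });
  }
  return out;
}

// Usage with a made-up model reply that wraps the JSON in extra text.
const reply =
  'Sure! {"nodes":[{"id":"a","label":"Browser"},' +
  '{"id":"b","label":"Model","kind":"ellipse"}],' +
  '"edges":[{"from":"a","to":"b"}]}';
const elements = toElements(parseSpec(reply));
console.log(elements.map((el) => el.type).join(", "));
```

Validating the model's reply before building elements keeps malformed output from reaching the canvas, which matters more with a small on-device model than with a larger hosted one. In a real page the resulting records would still need to be mapped onto whatever element format the Excalidraw integration expects.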
The demo matters because it showcases the convergence of three trends that have been shaping the AI landscape this spring. First, Gemma 4, announced earlier this year, is Google DeepMind’s most capable open-source family, built on Gemini 3 research and engineered for “frontier-level” performance on edge hardware. Its E2B variant is deliberately lightweight, about 3.1 GB as shipped in this demo, yet retains enough reasoning power to turn a text description into a structured diagram. Second, the rise of WebGPU and libraries like LiteRT (which we covered on 19 April) has made it feasible to run large language models directly in the browser, removing network round trips during inference and keeping prompts on the user’s device. Third, Excalidraw’s popularity as a low-code visual tool means that a seamless prompt-to-diagram pipeline can accelerate prototyping, education, and remote collaboration.
What to watch next is whether the Gemma 4 E2B model will be integrated into broader developer tooling, such as the Claude Code orchestrator UI we highlighted on 19 April, or into on‑device AI suites for smartphones and laptops. Google’s roadmap hints at larger Gemma variants (E4B, A4B, 31B) that could support richer visual outputs, while the community is already experimenting with chaining the model to other WebGL‑based editors. If the browser demo gains traction, it could signal the start of a new class of offline, multimodal AI assistants that blend reasoning and graphics without ever leaving the user’s device.