Local LLM App by Ente
privacy
Source: HN
Ente has launched Ensu, a consumer‑grade app that runs a large language model entirely on the user’s device. The first version ships for macOS and iOS, with an Android beta slated for later this quarter. Ensu bundles a compact transformer model, optimised for Apple’s Neural Engine and Qualcomm’s Hexagon DSP, behind a sleek chat interface while keeping all prompts and responses on‑device. Users can also enable a “Remote Tunnel” feature that forwards inference to a personal Cloudflare‑hosted endpoint, letting them offload heavy workloads without exposing data to third‑party APIs.
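Ente has not published an API for Ensu, but the local‑first routing pattern described above can be sketched in a few lines. Everything here is an assumption for illustration: the endpoint URLs, the port, and the token budget are hypothetical, not taken from the app.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    url: str
    on_device: bool


# Hypothetical endpoints: a local inference server and a personal,
# self-hosted tunnel endpoint (neither URL is from Ensu itself).
LOCAL = Endpoint("http://127.0.0.1:8080/generate", on_device=True)
TUNNEL = Endpoint("https://my-tunnel.example.com/generate", on_device=False)


def route(prompt_tokens: int, tunnel_enabled: bool, local_budget: int = 4096) -> Endpoint:
    """Prefer on-device inference; offload to the personal tunnel only
    when the request exceeds the assumed local budget AND the user has
    opted in. Data never reaches a third-party API either way."""
    if prompt_tokens <= local_budget or not tunnel_enabled:
        return LOCAL
    return TUNNEL
```

The key design point is that offloading is opt‑in and targets infrastructure the user controls, which is what separates this from routing prompts to a shared cloud API.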
The release marks a tangible shift from the cloud‑centric AI services that dominate the market. By keeping the model local, Ente promises zero‑knowledge privacy, lower latency, and the ability to operate offline, attributes that appeal to privacy‑conscious consumers and to enterprises wary of data‑leak risks. The move also underscores the rapid maturation of model compression techniques: a 7‑billion‑parameter model that once required a server‑grade GPU now fits within a smartphone’s memory budget. This follows our earlier coverage of hobbyists building private, local AI tools in a weekend with Ol, and shows the barrier to entry collapsing from specialist projects to mainstream products.
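The article does not say which compression techniques Ensu uses, but the arithmetic behind the memory claim is simple: weight storage scales linearly with bit width, so quantising from 16‑bit to 4‑bit weights cuts the footprint fourfold. A minimal back‑of‑the‑envelope calculation (ignoring KV cache and activations, which add more):

```python
def model_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight-storage footprint in GB.

    Counts only the weights themselves; runtime memory for the KV cache
    and activations comes on top of this figure.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# A 7B model at fp16 needs ~14 GB -- server-GPU territory.
# The same model quantised to 4 bits needs ~3.5 GB -- phone territory.
```

This is why 4‑bit quantisation, rather than smaller architectures alone, is what moved 7B‑class models from server GPUs into a smartphone’s memory budget.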
What to watch next is the ecosystem that will grow around Ensu. Ente has opened a developer portal for plug‑ins, hinting at third‑party extensions such as domain‑specific knowledge bases or custom voice assistants. Analysts will be tracking adoption metrics on the App Store and any partnership announcements with hardware vendors that could embed the engine deeper into devices. A follow‑up update is expected in June when Ente plans to roll out a larger 13‑billion‑parameter model and expand support to Windows laptops, potentially setting a new baseline for on‑device AI performance.