Google Gemma 4 Runs Natively on iPhone with Full Offline AI Inference
gemma google inference
Source: HN | Original article
Google has rolled out a native iPhone version of its Gemma 4 large‑language model, letting users run the 4‑billion‑parameter AI entirely offline. The app, released through the AI Edge Gallery, installs directly on iPhone 15 Pro and later devices and performs inference without a cloud connection, a subscription fee, or any data leaving the device. Users can launch the model from the home screen, feed it prompts, and receive responses in real time, with the same multi‑step planning and code‑generation capabilities that Google showcased on Android earlier this year.
The move marks a sharp turn for Google’s on‑device AI strategy, which until now had been confined to Android. In a blog post, the company highlighted Gemma 4’s ability to handle autonomous tasks such as generating scripts, analysing images and orchestrating simple workflows, all while keeping user data on the handset. By delivering a full‑stack solution on iOS, Google directly challenges Apple’s own on‑device models as well as third‑party tools such as Ollama, until now the main options for iPhone users seeking to run comparable LLMs locally. Privacy‑focused consumers and enterprises that need low‑latency, offline AI will find the capability especially appealing, and the launch could accelerate adoption of edge AI in sectors ranging from healthcare to finance.
Google hinted that the iOS release is the first step toward a broader ecosystem of on‑device agents, with plans to expose Gemma 4 through Swift‑compatible SDKs and to support future, larger variants. Watch for performance benchmarks that compare Gemma 4 on iPhone against Apple’s Neural Engine models, for developer‑tool updates that enable deeper app integration, and for any pricing or licensing tweaks as Google expands its edge‑AI portfolio. As we reported on the Android‑only preview of Gemma 4 a week ago, the iPhone launch confirms Google’s commitment to a cross‑platform AI edge strategy.
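For developers wondering what a Swift‑compatible SDK might look like in practice, here is a minimal sketch. It assumes the promised SDK resembles Google's existing MediaPipe LLM Inference API (the `MediaPipeTasksGenAI` framework, which already ships for iOS); the model filename and prompt are illustrative, and the final Gemma 4 API may differ.

```swift
import MediaPipeTasksGenAI

// Hypothetical setup, modeled on the current MediaPipe LLM Inference API.
// The bundled model file name "gemma4-4b.task" is an assumption for illustration.
guard let modelPath = Bundle.main.path(forResource: "gemma4-4b", ofType: "task") else {
    fatalError("Model file not found in app bundle")
}

// Configure the on-device inference engine; no network access is involved.
let options = LlmInference.Options(modelPath: modelPath)

do {
    let llm = try LlmInference(options: options)
    // Run a prompt entirely on the handset and print the response.
    let response = try llm.generateResponse(inputText: "Write a Swift function that reverses a string.")
    print(response)
} catch {
    print("Inference failed: \(error)")
}
```

Because inference runs locally, prompts and responses never leave the device, which is the property the article highlights for privacy‑sensitive use cases.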