Google AI Edge Gallery App Runs Models Locally on Android

Google has quietly released the AI Edge Gallery app, which lets users download AI models and run them locally, with no internet connection required. The experimental app is currently available for Android, with an iOS version to follow, and is hosted on GitHub, where it can be downloaded for free. Users can find compatible models capable of running on-device, like Google’s Gemma 3n, and run them offline to generate images, answer questions, and write and edit code, all on the processor of a supported smartphone. While locally running models aren’t as powerful as their cloud counterparts, they offer more privacy and can sometimes be faster.

While the AI Edge Gallery app itself is available on GitHub, most of the open-source Google models it is compatible with can be found on Hugging Face. In addition to Gemma 3n, these include Alphabet’s Gemma 2 2B and the lighter-weight Gecko-110m.

Although Google does not provide a definitive list of compatible models, users can explore what’s available after downloading the app by using its Model Selection Screen discovery tool. Users must first specify a desired task, such as “AI Chat,” “Prompt Lab” or “Ask Image,” before the model list populates.

Models compatible with Google’s open-source LiteRT runtime (formerly TensorFlow Lite) and its MediaPipe framework can reportedly be used. Qwen has been mentioned as compatible, as have some Google specialty models, such as the MedGemma-27B healthcare AI.
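For developers curious what on-device inference through these frameworks looks like, here is a minimal Kotlin sketch using MediaPipe’s LLM Inference API to run a locally stored model. The model file path and token limit are illustrative assumptions, not values taken from the app:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Minimal sketch of on-device inference with MediaPipe's LLM Inference API.
// The model path below is an assumption; a LiteRT-compatible .task bundle
// must already be present on the device at that location.
fun runLocalPrompt(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/model.task") // assumed location
        .setMaxTokens(512) // cap on combined input and output tokens
        .build()

    val llm = LlmInference.createFromOptions(context, options)
    try {
        // A single blocking call; the prompt and response never leave the phone.
        return llm.generateResponse(prompt)
    } finally {
        llm.close()
    }
}
```

Because everything runs against a model file on local storage, the same call works in airplane mode, which is the core appeal of the app.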

Google has posted a how-to wiki covering how to download and use AI Edge Gallery.

The Prompt Lab lets users “kick off ‘single-turn’ tasks powered by models, like summarizing and rewriting text,” TechCrunch explains, noting that it “comes with several task templates and configurable settings to fine-tune the models’ behaviors.”
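As a hypothetical illustration of how such single-turn task templates might be modeled in code (the template names and wording below are invented for this sketch, not taken from the app):

```kotlin
// Hypothetical single-turn task templates in the spirit of Prompt Lab.
// Template names and wording are illustrative, not from the app itself.
enum class PromptTask(val template: String) {
    SUMMARIZE("Summarize the following text in three sentences:\n\n%s"),
    REWRITE("Rewrite the following text in a more formal tone:\n\n%s"),
}

// Fill the template, then hand the result to a local model in a single
// shot, e.g. via the runLocalPrompt() sketch above.
fun buildPrompt(task: PromptTask, input: String): String =
    task.template.format(input)
```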

TechCrunch also cautions that performance will vary depending on the smartphone’s processing capabilities as well as the size of the model. “Larger models will take more time to complete a task — say, answering a question about an image — than smaller models,” TechCrunch points out.

While Android Police writes “local AI models give faster responses since there’s no lag caused by waiting for responses from a server,” there are some very fast cloud-based AI models that could conceivably be quicker than a large model running on a low-powered device.

Android Police applauds the fact that “since nothing essentially leaves your device, there’s a significantly lower risk of your data being intercepted, stored, or misused.”
