The Lens

By Erik Loyd, SaaS CEO and former COO/CFO of an AWS Premier Partner.

Updated Jun 2026

LiteRT-LM runs language models directly on a device, with no cloud and no internet connection required. The model lives on the phone, laptop, smartwatch, or even in the browser, so data never leaves the hardware, inference works offline, and there's no per-query bill. This is Google's own framework, and Google uses it to power on-device AI in Chrome, Chromebook Plus, and the Pixel Watch. It's licensed under Apache 2.0 and completely free.

It's cross-platform by design, targeting Android, iOS, desktop, the web via WebGPU, and small boards like Raspberry Pi, and it taps GPU and NPU acceleration instead of grinding on the CPU. It runs open models like Gemma, Llama, Phi, and Qwen. The work isn't running a server, because there is no server. The work is on the build side: you obtain models and convert them into the right format, wire up the native SDK for each platform you ship to, and manage on-device memory per device class. That's heavier than calling a cloud API, but far lighter than operating an inference cluster.
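To make "manage on-device memory per device class" concrete, here is a minimal sketch of one way a build pipeline might choose which model variant to bundle for each class of device. It does not use any LiteRT-LM API. The `ModelVariant` type, the `pick_variant` helper, the variant names, and every size and budget figure are hypothetical placeholders, not measurements of real models or devices.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    """A converted model build. All fields are illustrative assumptions."""
    name: str
    weights_mb: int           # size of the converted weights
    runtime_overhead_mb: int  # assumed allowance for KV cache and activations

    @property
    def footprint_mb(self) -> int:
        return self.weights_mb + self.runtime_overhead_mb


# Hypothetical memory budgets per device class, in MB.
MEMORY_BUDGET_MB = {
    "microcontroller": 128,
    "watch": 512,
    "phone": 3072,
    "laptop": 8192,
}

# Hypothetical catalogue of converted variants.
CATALOG = [
    ModelVariant("small-int4", weights_mb=300, runtime_overhead_mb=100),
    ModelVariant("medium-int4", weights_mb=1800, runtime_overhead_mb=600),
    ModelVariant("large-int4", weights_mb=5000, runtime_overhead_mb=1500),
]


def pick_variant(device_class: str,
                 variants: Sequence[ModelVariant]) -> Optional[ModelVariant]:
    """Return the largest variant that fits the class budget, or None."""
    budget = MEMORY_BUDGET_MB[device_class]
    fitting = [v for v in variants if v.footprint_mb <= budget]
    return max(fitting, key=lambda v: v.footprint_mb, default=None)


if __name__ == "__main__":
    for device_class in MEMORY_BUDGET_MB:
        chosen = pick_variant(device_class, CATALOG)
        print(device_class, "->", chosen.name if chosen else "no on-device model")
```

In practice the budgets would come from profiling on real hardware, and a device class where nothing fits would ship without an on-device model or with a smaller one.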

The real competition is other on-device runtimes. llama.cpp has broader model coverage and a bigger community; Meta's ExecuTorch is the closest vendor-backed rival; Apple's MLX wins on Macs but only on Macs. LiteRT-LM's edge is tight, official integration with Android and Google silicon. It doesn't replace a paid product so much as move certain workloads off the paid-API meter: the small and mid-size models you'd otherwise rent from a cloud provider. For solo developers and small teams shipping mobile or edge apps, this is the Google-blessed path. Larger teams already on Android get first-party support.

The catch is maturity. The core runtime is production-ready and shipping in real Google products, but some bindings, Swift and JavaScript among them, are still in early preview, and the project is young. And on-device models are not frontier models. If you need GPT-class quality, this won't deliver it. It's for when private, offline, free, and good enough beats cloud quality.

Explore Further

GitHub Repository: Source code, issues, README

Reddit Discussions: Community opinions and use cases

Hacker News: HN threads and discussions

Dev.to Articles: Tutorials and write-ups

Tutorials & Guides: Getting started resources

Official Website: Docs, blog, and more

LiteRT-LM

Trust Signals

License: Apache License 2.0
