Google | Guides & Tips | Official Blog
Google explains LiteRT-LM for fast on-device AI
Build on-device GenAI features that cut latency and ease privacy concerns.
Key Points
- Designed for Gemma 4 on-device execution
- Supports multiple hardware backends
- Mentions web execution via WebGPU
Google explained how LiteRT-LM optimizes on-device execution of Gemma 4 across phones, laptops, and the web. The post highlights acceleration on CPU, GPU, and NPU backends, plus web execution via WebGPU. Running the model locally makes it easier to build features that don’t rely on network calls. Performance varies by device and platform support.
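Because backend support differs from device to device, an app typically needs a fallback order. The sketch below is illustrative only and does not use the LiteRT-LM API: `pick_backend`, `DEFAULT_PREFERENCE`, and the backend names are hypothetical, and the NPU-then-GPU-then-CPU order is an assumption, not a recommendation from Google's post.

```python
from typing import Iterable, Sequence

# Hypothetical preference order (an assumption for illustration):
# try accelerators first, then fall back to CPU.
DEFAULT_PREFERENCE = ("npu", "gpu", "cpu")


def pick_backend(available: Iterable[str],
                 preference: Sequence[str] = DEFAULT_PREFERENCE) -> str:
    """Return the first preferred backend that the device reports as available."""
    # Normalize names so "GPU" and "gpu" are treated the same.
    available_set = {name.lower() for name in available}
    for backend in preference:
        if backend in available_set:
            return backend
    # Nothing in the preference list is available on this device.
    raise RuntimeError(f"no supported backend among: {sorted(available_set)}")


if __name__ == "__main__":
    print(pick_backend(["CPU", "GPU"]))  # prints "gpu"
    print(pick_backend(["cpu"]))         # prints "cpu"
```

In a real app, the `available` list would come from whatever capability check the runtime or platform provides. Treat the preference order as something to measure per device rather than hard-code.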