Google | Pricing & Plans | Official Blog
Flex & Priority Inference Tiers for Gemini API
Run latency-tolerant background jobs at up to 50% lower cost.
Key Points
- Flex: cost-optimized, lower priority
- Priority: low-latency, high priority
- Available for Gemini 2.5/3.1 models
- Available now
Google has added two inference tiers to the Gemini API: Flex (cost-optimized) and Priority (latency-optimized). Flex offers up to 50% savings for latency-tolerant workloads, while Priority gives requests higher priority for lower latency. Together, the tiers let developers balance cost, speed, and reliability.
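To make the trade-off concrete, here is a minimal sketch of how an application might sort its own workloads between the two tiers and estimate Flex savings. It does not call the Gemini API. The `Job` class, the `choose_tier` and `estimated_flex_cost` helpers, the tier labels, and the dollar figures are all hypothetical, and the flat 50% discount is an assumption taken from the "up to 50%" figure above.

```python
from dataclasses import dataclass

# Assumption: Flex is billed at 50% off the standard rate.
# The announcement only says "up to 50%", so treat this as a best case.
FLEX_DISCOUNT = 0.50


@dataclass
class Job:
    name: str
    latency_tolerant: bool    # True for background or batch-style work
    standard_cost_usd: float  # hypothetical cost at the standard rate


def choose_tier(job: Job) -> str:
    """Return a tier label: latency-tolerant jobs go to Flex, the rest to Priority."""
    return "flex" if job.latency_tolerant else "priority"


def estimated_flex_cost(job: Job, discount: float = FLEX_DISCOUNT) -> float:
    """Estimate what a job would cost on Flex given a fractional discount."""
    return job.standard_cost_usd * (1.0 - discount)


# Hypothetical workloads, for illustration only.
jobs = [
    Job("nightly-summaries", latency_tolerant=True, standard_cost_usd=120.0),
    Job("chat-assistant", latency_tolerant=False, standard_cost_usd=80.0),
]

for job in jobs:
    tier = choose_tier(job)
    if tier == "flex":
        cost = estimated_flex_cost(job)
        print(f"{job.name}: {tier}, about ${cost:.2f} at the assumed discount")
    else:
        print(f"{job.name}: {tier}")
```

The sketch leaves Priority costs out because the summary gives no pricing for that tier. In practice the routing rule would likely depend on more than a single flag, such as deadlines or retry tolerance.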