Gemini 3.1 Flash-Lite

Budget

Gemini 3.1 Flash-Lite is Google's most cost-efficient model to date, released in preview on May 5, 2026. At $0.25 per million input tokens and $1.50 per million output tokens, it costs roughly half of Gemini 3 Flash while keeping a 1,048,576 token context window and reasoning support. Google positions it for high-volume developer workloads where latency, throughput, and price-per-task matter more than frontier reasoning.

Input Price

$0.25

per 1M tokens

Output Price

$1.50

per 1M tokens

Context Window

1.0M

tokens

Released

2026-05

API access

Capabilities

textvisiontool-usecodereasoning

Key Strengths

✓$0.25 per 1M input tokens
✓1M token context window
✓2.5x faster time-to-first-token vs 2.5 Flash
✓Reasoning, vision, and tool use included

Best For

▸High-volume classification and extraction
▸Customer support routing
▸RAG over very long contexts
▸Batch document processing

Pricing Details

Input tokens

$0.25

per 1M tokens

Output tokens

$1.50

per 1M tokens

Estimated cost per 1K requests

$1.00

~1K input + ~500 output tokens avg

Prices are subject to change. Check the official documentation for current pricing. See the cost calculator for detailed estimates.

Related Models

Gemini 2.5 Pro

Flagship

Google

$1.25 in / $10.00 out

Gemini 2.0 Flash

Budget

Google

$0.10 in / $0.40 out

Gemini 3.5 Flash

Mid-tier

Google

$1.50 in / $9.00 out

Claude Haiku 4.5

Budget

Anthropic

$0.80 in / $4.00 out

GPT-4o-mini

Budget

OpenAI

$0.15 in / $0.60 out

Mistral Small

Budget

Mistral

$0.10 in / $0.30 out

Command R

Budget

Cohere

$0.15 in / $0.60 out

View Documentation Compare Models Cost Calculator Full Pricing Guide