Our most cost-efficient multimodal model, offering the fastest performance for high-frequency, lightweight tasks. Gemini 3.1 Flash-Lite is best for high-volume agentic tasks, simple data extraction, and extremely low-latency applications where budget and speed are the primary constraints.
Capabilities: `function_calling`

Endpoints:

https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent

https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent

| | Input | Cached input | Output |
|---|---|---|---|
| Standard | $0.25 | $0.025 | $1.5 |
| Batch | $0.125 | — | $0.75 |
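
The rates in the table above can be turned into a rough cost estimate. The sketch below is illustrative only: it assumes the prices are US dollars per 1 million tokens (the table does not state the unit), and that cached tokens are billed at the cached-input rate in place of the regular input rate. The names `PRICES`, `TOKENS_PER_UNIT`, and `estimate_cost` are hypothetical helpers, not part of any API.

```python
# Rates copied from the pricing table. None marks the cell shown as a dash.
PRICES = {
    "standard": {"input": 0.25, "cached_input": 0.025, "output": 1.5},
    "batch": {"input": 0.125, "cached_input": None, "output": 0.75},
}

# ASSUMPTION: prices are quoted per 1,000,000 tokens; the table gives no unit.
TOKENS_PER_UNIT = 1_000_000


def estimate_cost(tier, input_tokens, output_tokens, cached_tokens=0):
    """Estimate request cost in dollars.

    input_tokens counts only uncached input; cached_tokens is billed
    separately at the cached-input rate (an assumption, see above).
    """
    rates = PRICES[tier]
    if cached_tokens and rates["cached_input"] is None:
        raise ValueError(f"no cached-input rate listed for tier {tier!r}")
    cached_rate = rates["cached_input"] or 0.0
    total = (
        input_tokens * rates["input"]
        + cached_tokens * cached_rate
        + output_tokens * rates["output"]
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    print(estimate_cost("standard", 1_000_000, 1_000_000))
    print(estimate_cost("batch", 1_000_000, 1_000_000))
```

Under these assumptions, 1 million input tokens plus 1 million output tokens come to $1.75 on the Standard tier and $0.875 on the Batch tier.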

| Model | Context | Max output | Pricing (input / output) |
|---|---|---|---|
| gemini-3.1-flash-lite-preview | 1.0M | 66K | $0.25 / $1.5 |
| gemini-3.1-flash-lite | 1.0M | 66K | $0.25 / $1.5 |
| gemini-3.1-flash-image-preview | 131K | 33K | $0.5 / $3 |
| gemini-3.1-flash-live-preview | 131K | 66K | $0.75 / $4.5 |
| gemini-3.1-flash-tts-preview | 8K | 16K | $1 / $20 |
| gemini-3.1-pro-preview | 1.0M | 66K | $2 / $12 |
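
A model ID from the table can be substituted into the `{model}` placeholder of the endpoint templates. The following is a minimal sketch of building such a request with Python's standard library. It assumes the base path `https://generativelanguage.googleapis.com/v1beta`, the `contents`/`parts`/`text` request body, and the `x-goog-api-key` header used by the public Gemini REST API. `build_request` is a hypothetical helper, and the API key shown is a placeholder.

```python
import json
import urllib.request

# ASSUMPTION: base path of the public Gemini REST API.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model, prompt, api_key, stream=False):
    """Build (but do not send) a POST request for the given model ID."""
    method = "streamGenerateContent" if stream else "generateContent"
    url = f"{BASE_URL}/models/{model}:{method}"
    # Minimal single-turn body: one content entry with one text part.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request(
        "gemini-3.1-flash-lite-preview",
        "Extract the invoice number from: 'Invoice #A-1042, due May 3'.",
        api_key="YOUR_API_KEY",  # placeholder, replace with a real key
    )
    print(req.get_method(), req.full_url)
    # To actually send the request (requires a valid key and network access):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it keeps the URL and payload easy to inspect before any tokens are billed.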

/v1/models/google/gemini-3.1-flash-lite-preview