The model supports text, image, video, audio, and PDF inputs, and is designed for high-volume agentic workflows, simple data extraction, and applications where latency and API cost are the primary constraints.
function_calling

https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent

| Tier | Input | Cached input | Output |
|---|---|---|---|
| Standard | $0.25 | $0.025 | $1.5 |
| Batch | $0.125 | — | $0.75 |
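
The pricing rows above can be turned into a quick cost estimate. The sketch below is illustrative only: it assumes the listed prices are USD per 1M tokens (the table does not state a unit), and the names `PRICES` and `estimate_cost` are hypothetical helpers, not part of any SDK.

```python
# Prices copied from the table above.
# ASSUMPTION: values are USD per 1,000,000 tokens; the table gives no unit.
PRICES = {
    "standard": {"input": 0.25, "cached_input": 0.025, "output": 1.5},
    "batch": {"input": 0.125, "cached_input": None, "output": 0.75},
}
TOKENS_PER_UNIT = 1_000_000


def estimate_cost(tier, input_tokens, output_tokens, cached_input_tokens=0):
    """Return an estimated cost in USD for one workload (hypothetical helper)."""
    prices = PRICES[tier]
    if cached_input_tokens and prices["cached_input"] is None:
        # The table lists no cached-input price for this tier.
        raise ValueError(f"no cached-input price listed for tier {tier!r}")
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    if cached_input_tokens:
        total += cached_input_tokens * prices["cached_input"]
    return total / TOKENS_PER_UNIT
```

For example, `estimate_cost("batch", 2_000_000, 500_000)` prices two million input tokens and half a million output tokens at the Batch rates, under the unit assumption stated above.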

| Model | Context | Max output | Pricing (input / output) |
|---|---|---|---|
| gemini-3.1-flash-lite-preview | 1.0M | 66K | $0.25 / $1.5 |
| gemini-3.1-flash-lite | 1.0M | 66K | $0.25 / $1.5 |
| gemini-3.1-flash-image-preview | 131K | 33K | $0.5 / $3 |
| gemini-3.1-flash-live-preview | 131K | 66K | $0.75 / $4.5 |
| gemini-3.1-flash-tts-preview | 8K | 16K | $1 / $20 |
| gemini-3.1-pro-preview | 1.0M | 66K | $2 / $12 |
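
A model ID from the table can be substituted into the `{model}` placeholder of the endpoints listed earlier. The following is a minimal sketch of building such a request with the Python standard library. It assumes a single `v1beta` path segment in the base URL, a text-only prompt, and an API key passed in the `x-goog-api-key` header; `build_generate_request` is a hypothetical helper name, and not every model in the table necessarily serves these two methods.

```python
import json
import urllib.request

# ASSUMPTION: base URL with a single "v1beta" segment.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(model, prompt, api_key, stream=False):
    """Build (but do not send) a POST request for a text-only prompt."""
    method = "streamGenerateContent" if stream else "generateContent"
    url = f"{BASE_URL}/models/{model}:{method}"
    # Minimal request body: one content item holding one text part.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


# Sending requires a valid key and network access, for example:
# with urllib.request.urlopen(build_generate_request(
#         "gemini-3.1-flash-lite", "Hello", "YOUR_API_KEY")) as resp:
#     print(json.load(resp))
```

Keeping request construction separate from sending makes the URL and payload easy to inspect or unit-test before any billable call is made.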
/v1/models/google/gemini-3.1-flash-lite