gemini-3.1-flash-lite-preview

proprietary

Our most cost-efficient multimodal model, offering the fastest performance for high-frequency, lightweight tasks. Gemini 3.1 Flash-Lite is best for high-volume agentic tasks, simple data extraction, and extremely low-latency applications where budget and speed are the primary constraints.

Context
1.0M
Max output
66K
Input price
$0.25/1M tokens
Output price
$1.5/1M tokens

Capabilities

vision
tool call
structured output
reasoning
json mode
streaming
fine tuning
batch

Details

Provider Google AI
Creator google
License proprietary
Parameters
Status active
Input modalities text, image, video, audio
Output modalities text, image, audio
Architecture
Knowledge cutoff
Training data cutoff
Release date
Deprecation date
Type chat
Reasoning tokens
Max input
Open weight No
Source official
Last updated

Tools

Function Calling (function_calling)
Call external functions and APIs
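
As an illustration of function calling, here is a minimal sketch of declaring one tool in a request body and dispatching a function call found in a response. The `tools` / `functionDeclarations` / `functionCall` field names follow the public Gemini REST API as commonly documented, not anything stated on this page, so treat them as assumptions. The `get_weather` tool and the sample response are hypothetical.

```python
from typing import Any, Callable


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would call a weather service."""
    return f"Sunny in {city}"


# Registry mapping declared tool names to local Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def build_payload(prompt: str) -> dict:
    """Build a request body that offers one callable tool to the model.

    Field names are assumed from the public Gemini REST API.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "tools": [{
            "functionDeclarations": [{
                "name": "get_weather",
                "description": "Return the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }]
        }],
    }


def dispatch(response: dict) -> list:
    """Run every functionCall part in a response against the local registry."""
    results = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            call = part.get("functionCall")
            if call and call.get("name") in TOOLS:
                results.append(TOOLS[call["name"]](**call.get("args", {})))
    return results
```

In a full loop, the tool result would normally be sent back to the model in a follow-up request; that step is omitted here.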

Endpoints

Generate Content POST
Generate text from multimodal input
https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
Stream Content POST
Stream text generation responses
https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent
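
A minimal sketch of building requests for the two endpoints, using only the Python standard library. It assumes the base path contains a single `v1beta` segment, that the API key is passed in an `x-goog-api-key` header, and that the body uses the `contents` / `parts` / `text` shape; none of these are stated on this page, so check them against the provider's API reference. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL with a single v1beta path segment.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def endpoint_url(model: str, method: str = "generateContent") -> str:
    """Fill the {model} placeholder in the endpoint template."""
    return f"{BASE_URL}/models/{model}:{method}"


def build_request(model: str, prompt: str, api_key: str,
                  stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text-only prompt."""
    method = "streamGenerateContent" if stream else "generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}  # assumed body shape
    return urllib.request.Request(
        endpoint_url(model, method),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # assumed auth header
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and parse the whole JSON body (performs network I/O)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Usage would look like `send(build_request("gemini-3.1-flash-lite-preview", "Hello", api_key))`. Note that `send` reads the full response before parsing, so it does not process streamed chunks incrementally.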

Pricing

Text tokens (per 1M tokens)
          Input   Cached input  Output
Standard  $0.25   $0.025        $1.5
Batch     $0.125  -             $0.75
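
To make the rates concrete, here is a small cost estimator built from the figures listed above. It assumes that the two batch values are the input and output rates (no cached-input rate is listed for batch) and that cached tokens are counted separately from regular input tokens; both are assumptions, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# USD per 1M text tokens, copied from the pricing figures on this page.
PRICES = {
    "standard": {
        "input": Decimal("0.25"),
        "cached_input": Decimal("0.025"),
        "output": Decimal("1.5"),
    },
    # Assumption: the two batch figures are input and output rates.
    "batch": {"input": Decimal("0.125"), "output": Decimal("0.75")},
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0,
                  tier: str = "standard") -> Decimal:
    """Estimate the USD cost of a workload at the listed per-1M-token rates.

    Assumes input_tokens excludes cached_input_tokens.
    """
    rates = PRICES[tier]
    if cached_input_tokens and "cached_input" not in rates:
        raise ValueError(f"no cached-input rate listed for tier {tier!r}")
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    if cached_input_tokens:
        total += cached_input_tokens * rates["cached_input"]
    return total / TOKENS_PER_UNIT
```

For example, 2M input tokens and 500K output tokens come to $1.25 at standard rates and $0.625 at batch rates.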

Family Comparison: gemini-3.1

Model                           Context  Max out  Pricing (input / output per 1M tokens)
gemini-3.1-flash-lite-preview   1.0M     66K      $0.25 / $1.5
gemini-3.1-flash-lite           1.0M     66K      $0.25 / $1.5
gemini-3.1-flash-image-preview  131K     33K      $0.5 / $3
gemini-3.1-flash-live-preview   131K     66K      $0.75 / $4.5
gemini-3.1-flash-tts-preview    8K       16K      $1 / $20
gemini-3.1-pro-preview          1.0M     66K      $2 / $12

API

GET /v1/models/google/gemini-3.1-flash-lite-preview