Gemini 3 Flash

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms, and the Additional Terms for Generative AI Preview Products. You can process personal data for this product or feature as outlined in the Cloud Data Processing Addendum, subject to the obligations and restrictions described in the agreement under which you access Google Cloud. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Gemini 3 Flash combines Gemini 3 Pro's reasoning capabilities with the latency, efficiency, and cost characteristics of the Flash line. It handles everyday tasks with improved reasoning and is also designed for the most complex agentic workflows.

Gemini 3 Flash uses several new features to improve performance, control, and multimodal fidelity:

Thinking level: Use the `thinking_level` parameter to control the amount of internal reasoning the model performs (minimal, low, medium, or high) to balance response quality, reasoning complexity, latency, and cost. The `thinking_level` parameter replaces `thinking_budget` for Gemini 3 models.
Note: If you used a thinking budget of 0 with Gemini 2.5 Flash, set your thinking level to MINIMAL for similar latency and cost; however, you still need to handle thought signatures when using the minimal thinking level.
For details on the different thinking levels, see Thinking.
Thought signatures: Stricter validation of thought signatures improves reliability in multi-turn function calling.
Media resolution: Use the `media_resolution` parameter (low, medium, high, or ultra high) to control vision processing for multimodal inputs, impacting token usage and latency. See Get started with Gemini 3 for default resolution settings.
- The ultra high media resolution level is only available for the IMAGE modality.
- PDF token counts are listed under the IMAGE modality instead of the DOCUMENT modality in `usage_metadata`.
Multimodal function responses: Function responses can now include multimodal objects like images and PDFs in addition to text.
Streaming function calling: Stream partial function call arguments to improve user experience during tool use.

For more information on using these features, see Get started with Gemini 3.
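As a rough illustration of the constraints listed above, the following Python sketch validates a thinking level and a media resolution before a request is built. The helper names `build_generation_options` and `migrate_thinking_budget`, the plain-dict layout, and the lowercase string values are assumptions made for this example and are not the Vertex AI SDK's request schema. Only the allowed values, the IMAGE-only rule for ultra high, and the budget-0-to-MINIMAL mapping come from this page.

```python
# Allowed values as listed on this page.
THINKING_LEVELS = ("minimal", "low", "medium", "high")
MEDIA_RESOLUTIONS = ("low", "medium", "high", "ultra high")


def build_generation_options(thinking_level, media_resolution=None, modality="IMAGE"):
    """Return a plain dict of validated options.

    The dict layout is hypothetical; map it onto whatever request type
    your client library actually expects.
    """
    level = thinking_level.lower()
    if level not in THINKING_LEVELS:
        raise ValueError(f"unknown thinking_level: {thinking_level!r}")

    options = {"thinking_level": level}

    if media_resolution is not None:
        resolution = media_resolution.lower()
        if resolution not in MEDIA_RESOLUTIONS:
            raise ValueError(f"unknown media_resolution: {media_resolution!r}")
        # Per this page, ultra high is only available for the IMAGE modality.
        if resolution == "ultra high" and modality.upper() != "IMAGE":
            raise ValueError("ultra high media resolution requires the IMAGE modality")
        options["media_resolution"] = resolution

    return options


def migrate_thinking_budget(thinking_budget):
    """Map a Gemini 2.5 Flash thinking budget to a Gemini 3 thinking level.

    This page only documents the mapping for a budget of 0, so any other
    value is rejected rather than guessed.
    """
    if thinking_budget == 0:
        return "minimal"
    raise ValueError("no documented mapping for this thinking budget")
```

For example, `build_generation_options("LOW", "high")` returns a dict with both settings normalized to lowercase, while requesting ultra high for a video input raises `ValueError`. Thought signatures still need to be handled at the minimal level; that handling is outside the scope of this sketch.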

Try in Vertex AI | View in Model Garden (Preview) | Deploy example app

Note: To use the "Deploy example app" feature, you need a Google Cloud project with billing and Vertex AI API enabled.

Technical specifications
Model ID	`gemini-3-flash-preview`
Supported inputs & outputs	Inputs: Text, Code, Images, Audio, Video, PDF; Outputs: Text
Token limits	Maximum input tokens: 1,048,576; Maximum output tokens: 65,536
Capabilities	Supported: Grounding with Google Search, Code execution, System instructions, Structured output, Function calling, Count Tokens, Thinking, Implicit context caching, Explicit context caching, Vertex AI RAG Engine, Chat completions; Not supported: Gemini Live API
Consumption options	Supported: Provisioned Throughput, Standard PayGo, Flex PayGo, Priority PayGo, Batch prediction; Not supported: none listed
	See Consumption options for more information.
	Images	Maximum images per prompt: 900; Maximum file size per file for inline data or direct uploads through the console: 7 MB; Maximum file size per file from Google Cloud Storage: 30 MB; Default resolution tokens: 1120; Supported MIME types: `image/png`, `image/jpeg`, `image/webp`, `image/heic`, `image/heif`
	Documents	Maximum number of files per prompt: 900; Maximum number of pages per file: 900; Maximum file size per file for the API or Cloud Storage imports: 50 MB; Maximum file size per file for direct uploads through the console: 7 MB; Default resolution tokens: 560; OCR for scanned PDFs: Not used by default; Supported MIME types: `application/pdf`, `text/plain`
	Video	Maximum video length (with audio): Approximately 45 minutes; Maximum video length (without audio): Approximately 1 hour; Maximum number of videos per prompt: 10; Default resolution tokens per frame: 70; Supported MIME types: `video/x-flv`, `video/quicktime`, `video/mpeg`, `video/mpegs`, `video/mpg`, `video/mp4`, `video/webm`, `video/wmv`, `video/3gpp`
	Audio	Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens; Maximum number of audio files per prompt: 1; Speech understanding for: Audio summarization, transcription, and translation; Supported MIME types: `audio/x-aac`, `audio/flac`, `audio/mp3`, `audio/m4a`, `audio/mpeg`, `audio/mpga`, `audio/mp4`, `audio/ogg`, `audio/pcm`, `audio/wav`, `audio/webm`
	Parameter defaults	Temperature: 0.0-2.0 (default 1.0); topP: 0.0-1.0 (default 0.95); topK: 64 (fixed); candidateCount: 1-8 (default 1)
Supported regions
	Model availability	Global: `global`
	See Deployments and endpoints for more information.
Knowledge cutoff date	January 2025
Versions	`gemini-3-flash-preview`; Launch stage: Public preview; Release date: December 17, 2025
Supported languages	See Supported languages.
Pricing	See Pricing.
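The limits in the table above lend themselves to a client-side pre-flight check. The sketch below is one way to express the image limits and sampling ranges in Python; the `ImageInput` type, the function names, and the choice to treat 1 MB as 2^20 bytes are assumptions for illustration (the page does not define MB), and the service remains the authority on what it accepts.

```python
from dataclasses import dataclass

# Assumption: 1 MB = 2**20 bytes. The page does not say which definition applies.
MB = 1024 * 1024

# Values copied from the technical specifications table.
MAX_IMAGES_PER_PROMPT = 900
MAX_INLINE_IMAGE_BYTES = 7 * MB
MAX_GCS_IMAGE_BYTES = 30 * MB
IMAGE_MIME_TYPES = {
    "image/png", "image/jpeg", "image/webp", "image/heic", "image/heif",
}


@dataclass
class ImageInput:
    """Hypothetical description of one image attached to a prompt."""
    mime_type: str
    size_bytes: int
    from_cloud_storage: bool = False


def check_images(images):
    """Return a list of problems; an empty list means no documented limit was hit."""
    problems = []
    if len(images) > MAX_IMAGES_PER_PROMPT:
        problems.append(f"{len(images)} images exceeds {MAX_IMAGES_PER_PROMPT} per prompt")
    for index, image in enumerate(images):
        if image.mime_type not in IMAGE_MIME_TYPES:
            problems.append(f"image {index}: unsupported MIME type {image.mime_type}")
        # Cloud Storage files get the larger limit; inline data gets the smaller one.
        limit = MAX_GCS_IMAGE_BYTES if image.from_cloud_storage else MAX_INLINE_IMAGE_BYTES
        if image.size_bytes > limit:
            problems.append(f"image {index}: {image.size_bytes} bytes exceeds {limit}")
    return problems


def check_sampling(temperature=1.0, top_p=0.95, candidate_count=1):
    """Check sampling parameters against the documented ranges (topK is fixed at 64)."""
    problems = []
    if not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")
    if not 0.0 <= top_p <= 1.0:
        problems.append("topP must be between 0.0 and 1.0")
    if not 1 <= candidate_count <= 8:
        problems.append("candidateCount must be between 1 and 8")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation in a single pass. The same pattern could be extended to the document, video, and audio rows.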

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.
