Gemini 3 Flash Launched as Default Model in Gemini App, Replacing 2.5 Flash

Google has introduced Gemini 3 Flash, a new AI model focused on speed and lower cost. It brings advanced Gemini 3 intelligence to a wide audience across Google products. Gemini 3 Flash is now the default model in the Gemini app, replacing Gemini 2.5 Flash globally.

All Gemini app users now get the Gemini 3 experience at no extra charge, with a noticeable performance boost on common tasks. The model will also appear in AI Mode in Search, so more people can try next-generation AI responses directly inside familiar Google services.

Gemini 3 Flash performance and benchmarks

Gemini 3 Flash keeps the core strengths of Gemini 3, including complex reasoning, multimodal understanding and support for agentic workflows. It offers Pro-level reasoning but keeps Flash-style speed, efficiency and pricing. This combination helps with everyday tasks and also supports demanding, agent-based use cases.

The model shows high scores on academic-style tests, matching or beating larger AI systems. Gemini 3 Flash reaches 90.4% on GPQA Diamond and 33.7% on Humanity's Last Exam without tools. It also achieves 81.2% on MMMU Pro, a result similar to Gemini 3 Pro.

Benchmark	Gemini 3 Flash score	How it compares
GPQA Diamond	90.4%	Rivals larger frontier models
Humanity's Last Exam (no tools)	33.7%	Above earlier Gemini 2.5 Pro
MMMU Pro	81.2%	Comparable to Gemini 3 Pro

Gemini 3 Flash speed, efficiency and pricing

Gemini 3 Flash is designed for raw speed, continuing the line of Flash models already used by developers and consumers. According to Artificial Analysis benchmarking, it is three times faster than Gemini 2.5 Pro. It also outperforms 2.5 Pro while keeping usage costs much lower.

Usage Type	Gemini 3 Flash Price
Input tokens	$0.50 per 1M tokens
Output tokens	$3 per 1M tokens
Audio input tokens	$1 per 1M tokens

Efficiency is a key focus for Gemini 3 Flash, which aims to balance quality, latency and cost. The model adjusts how deeply it thinks based on task complexity. On typical traffic, it uses around 30% fewer tokens than Gemini 2.5 Pro, while giving more accurate everyday answers.
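
To make the pricing concrete, here is a minimal Python sketch that estimates a bill from the per-token prices listed above. The function name, its parameters and the example workload are illustrative assumptions, not part of any Google tool, and the sketch ignores any discounts, free tiers or other pricing rules that may apply.

```python
# Prices in USD per 1 million tokens, copied from the pricing table above.
PRICE_PER_MILLION = {
    "input": 0.50,
    "output": 3.00,
    "audio_input": 1.00,
}


def estimate_cost(input_tokens=0, output_tokens=0, audio_input_tokens=0):
    """Return an estimated cost in USD for the given token counts.

    Hypothetical helper for illustration only.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "audio_input": audio_input_tokens,
    }
    return sum(PRICE_PER_MILLION[kind] * count / 1_000_000
               for kind, count in usage.items())


# Example workload (made up): 2M input tokens and 500K output tokens.
cost = estimate_cost(input_tokens=2_000_000, output_tokens=500_000)
print(f"Estimated cost: ${cost:.2f}")  # prints "Estimated cost: $2.50"
```

In this example the output tokens account for $1.50 of the $2.50 total even though there are four times as many input tokens, because output is priced six times higher per token.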

Gemini 3 Flash multimodal uses and developer access

Gemini 3 Flash also offers strong multimodal reasoning, helping people work with text, images, audio and video. Users can ask Gemini to analyse photos or clips and then turn them into structured plans. This might include travel schedules, study notes or task lists generated in a few seconds.

Developers can already use Gemini 3 Flash in preview through several Google tools. It is available through the Gemini API in Google AI Studio, as well as in Google Antigravity, Vertex AI and Gemini Enterprise. Gemini 3 Flash also works with Gemini CLI and Android Studio, supporting app and service integration.
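
For developers who want a sense of what a Gemini API call looks like, the sketch below builds (but does not send) a request to the API's REST `generateContent` endpoint using only the Python standard library. The model ID `gemini-3-flash-preview` is an assumption about the preview name and should be checked against the current model list. The helper name `build_request` is hypothetical, and the API key is a placeholder.

```python
import json
import urllib.request

# Assumption: preview model ID; verify against the published model list.
MODEL_ID = "gemini-3-flash-preview"
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt, api_key, model=MODEL_ID):
    """Build, without sending, a generateContent request for a text prompt.

    Hypothetical helper for illustration only.
    """
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # placeholder; use your own key
        },
        method="POST",
    )


req = build_request("Turn these notes into a task list.", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` requires a valid API key and network access. In practice, most developers would use an official SDK rather than raw HTTP.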

As Gemini 3 Flash reaches more products, Google aims to combine fast responses with strong reasoning in one model. People using the Gemini app, AI Mode in Search and supported developer platforms get stronger reasoning at lower cost, for both complex and everyday workloads.