Gemini 3 Flash Launched as Default Model in Gemini App, Replacing 2.5 Flash
Google has introduced Gemini 3 Flash, a new AI model focused on speed and lower cost. It brings advanced Gemini 3 intelligence to a wide audience across Google products. Gemini 3 Flash is now the default model in the Gemini app, replacing Gemini 2.5 Flash globally.
All Gemini app users now access the Gemini 3 experience without extra charges, giving common tasks a clear performance boost. The model will also appear in AI Mode in Search, so more people can try next-generation AI responses quickly, directly inside familiar Google services.

Gemini 3 Flash performance and benchmarks
Gemini 3 Flash keeps the core strengths of Gemini 3, including complex reasoning, multimodal understanding and support for agentic workflows. It offers Pro-level reasoning but keeps Flash-style speed, efficiency and pricing. This combination helps with everyday tasks and also supports demanding, agent-based use cases.
The model shows high scores on academic-style tests, matching or beating larger AI systems. Gemini 3 Flash reaches 90.4% on GPQA Diamond and 33.7% on Humanity's Last Exam without tools. It also achieves 81.2% on MMMU Pro, a result similar to Gemini 3 Pro.

| Benchmark | Gemini 3 Flash score | How it compares |
|---|---|---|
| GPQA Diamond | 90.4% | Rivals larger frontier models |
| Humanity's Last Exam (no tools) | 33.7% | Higher than the earlier Gemini 2.5 Pro |
| MMMU Pro | 81.2% | Comparable to Gemini 3 Pro |

Gemini 3 Flash speed, efficiency and pricing
Gemini 3 Flash is designed for raw speed, following the earlier Flash models used by developers and consumers. According to Artificial Analysis benchmarking, it is three times faster than Gemini 2.5 Pro. It also delivers higher performance than 2.5 Pro, while keeping usage costs much lower.

| Usage type | Gemini 3 Flash price (USD) |
|---|---|
| Input tokens | $0.50 per 1M tokens |
| Output tokens | $3.00 per 1M tokens |
| Audio input tokens | $1.00 per 1M tokens |

Efficiency is a key focus for Gemini 3 Flash, which aims to balance quality, latency and cost. The model adjusts how deeply it thinks based on task complexity. On typical traffic, it uses around 30% fewer tokens than Gemini 2.5 Pro, while giving more accurate everyday answers.
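
To make the listed prices concrete, here is a minimal Python sketch that turns token counts into an estimated bill. It uses only the three per-million-token rates quoted in the pricing table. The function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes flat pricing with no caching, batch discounts or volume tiers, none of which the article covers.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table in this article.
PRICE_PER_MILLION = {
    "input": Decimal("0.50"),
    "output": Decimal("3.00"),
    "audio_input": Decimal("1.00"),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens=0, output_tokens=0, audio_input_tokens=0):
    """Estimate the cost in USD for the given token counts.

    Hypothetical helper: assumes flat per-token pricing only.
    """
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "audio_input": audio_input_tokens,
    }
    total = Decimal(0)
    for kind, count in counts.items():
        # Scale the per-million rate down to the actual token count.
        total += PRICE_PER_MILLION[kind] * count / TOKENS_PER_MILLION
    return total


if __name__ == "__main__":
    # Example workload: 200,000 input tokens and 50,000 output tokens.
    print(estimate_cost(input_tokens=200_000, output_tokens=50_000))
```

Under these assumptions, a request with 200,000 input tokens and 50,000 output tokens costs $0.10 for input plus $0.15 for output, or $0.25 in total. Output tokens are priced at six times the input rate, so long responses dominate the bill.
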
Gemini 3 Flash multimodal uses and developer access
Gemini 3 Flash also offers strong multimodal reasoning, helping people work with text, images, audio and video. Users can ask Gemini to analyse photos or clips and then turn them into structured plans. This might include travel schedules, study notes or task lists generated in a few seconds.
Developers already have preview access to Gemini 3 Flash through several Google tools. It is live in the Gemini API via Google AI Studio, Google Antigravity, Vertex AI and Gemini Enterprise. Gemini 3 Flash also works with Gemini CLI and Android Studio, supporting app and service integration.
As Gemini 3 Flash reaches more products, Google aims to combine fast responses with strong reasoning in one model. People using the Gemini app, AI Mode in Search and supported developer platforms now receive higher-level intelligence at lower cost, with better support for complex and everyday workloads.

