The AI wars continue to heat up. Just a few weeks after OpenAI reportedly declared a "code red" in its fight against Google, the latter has released its latest lightweight model: Gemini 3 Flash. This Flash is the newest member of the Gemini 3 family, which launched with Gemini 3 Pro and Gemini 3 Deep Think. But while this latest model is intended to be a lighter, less expensive variant of the existing Gemini 3 models, Gemini 3 Flash is actually quite powerful in its own right. In fact, it outperforms both Gemini 3 Pro and OpenAI's GPT-5.2 in some tests.
Lightweight models are typically designed for simpler or lower-budget queries, or to run on low-power hardware. That makes them faster than more powerful models, which take longer to process but can do more. According to Google, Gemini 3 Flash combines the best of both worlds: a model with Gemini 3's "professional-grade reasoning" and "Flash-level latency, efficiency, and cost." While this probably matters most to developers, regular users should notice improvements too, as Gemini 3 Flash is now the default model for both Gemini (the chatbot) and AI Mode, Google's AI-powered search.
Gemini 3 Flash Performance
You can see these improvements in Google's comparison benchmarks for Gemini 3 Flash. In Humanity's Last Exam, an academic reasoning benchmark that poses 2,500 questions across more than 100 subjects, Gemini 3 Flash scored 33.7% without tools and 43.5% with search and code execution. Compare that to Gemini 3 Pro's 37.5% and 45.8%, respectively, or OpenAI GPT-5.2's 34.5% and 45.5%. In MMMU-Pro, a benchmark that measures multimodal understanding and reasoning, Gemini 3 Flash earned the highest score (81.2%), ahead of Gemini 3 Pro (81%) and GPT-5.2 (79.5%). In fact, among the 21 benchmarks Google highlights in its announcement, Gemini 3 Flash posts the top score on three: MMMU-Pro (along with Gemini 3 Pro), Toolathlon, and MMMLU. Gemini 3 Pro still leads on most of the benchmarks (14), and GPT-5.2 tops eight, but Gemini 3 Flash holds its own.
Google notes that Gemini 3 Flash also outperforms Gemini 3 Pro and the entire 2.5 series on SWE-bench Verified, a benchmark that tests a model's coding-agent capabilities. Gemini 3 Flash scored 78%, Gemini 3 Pro 76.2%, Gemini 2.5 Flash 60.4%, and Gemini 2.5 Pro 59.6%. (Note that GPT-5.2 posted the best score among the models Google mentions in this announcement.) It's a close race, especially considering Gemini 3 Flash is a lightweight model sitting alongside the company's flagship models.
Cost of Gemini 3 Flash
This could pose an interesting dilemma for developers who pay to use AI models in their programs. Gemini 3 Flash costs $0.50 per million input tokens (what you ask the model to do) and $3.00 per million output tokens (what the model returns in response). Compare that to Gemini 3 Pro, which costs $2 per million input tokens and $12 per million output tokens, or GPT-5.2, which costs $3 and $15, respectively. For what it's worth, it's not as cheap as Gemini 2.5 Flash ($0.30 and $2.50) or Grok 4.1 Fast ($0.20 and $0.50), but it beats those models in Google's tests. Google also notes that Gemini 3 Flash uses an average of 30% fewer tokens than 2.5 Pro, saving on costs while running three times faster.
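To make the pricing gap concrete, here is a minimal sketch that turns the per-million-token rates quoted above into a per-request cost. The model names and prices come from this article; real billing (caching, tiered rates, thinking tokens) may differ.

```python
# Per-million-token prices quoted in the article: (input, output) in USD.
PRICES = {
    "gemini-3-flash": (0.50, 3.00),
    "gemini-3-pro": (2.00, 12.00),
    "gpt-5.2": (3.00, 15.00),
    "gemini-2.5-flash": (0.30, 2.50),
    "grok-4.1-fast": (0.20, 0.50),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at the quoted list prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 10,000-token prompt that produces a 2,000-token answer.
for model in ("gemini-3-flash", "gemini-3-pro", "gpt-5.2"):
    print(f"{model}: ${request_cost(model, 10_000, 2_000):.4f}")
```

At that request size, Gemini 3 Flash comes in at roughly a quarter of Gemini 3 Pro's cost, which is the dilemma in a nutshell: for many workloads the benchmark gap is far smaller than the price gap.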
If you need an LLM like Gemini 3 Flash to power your products but don't want to pay the higher costs of more powerful models, I can imagine this latest lightweight model looks attractive from a financial perspective.
How the average user will experience Gemini 3 Flash
Most of us who use AI don't do so as developers who need to worry about API prices. Most Gemini users will likely encounter this model through Google's consumer products, such as Search, Workspace, and the Gemini app.
Starting today, Gemini 3 Flash is the default model in the Gemini app. Google claims it can handle many tasks "in just a few seconds." That could mean asking Gemini for advice on improving your golf game based on a video, or uploading a speech on a specific historical topic and asking for any facts you may have missed. You can also ask the bot to write you a working app based on a series of rough notes.
You will also be able to use Gemini 3 Flash in Google Search's AI Mode. Google says the new model is better at "understanding the nuances of your question" and thinking through every part of your query. AI Mode attempts to return a more comprehensive result by querying hundreds of sites at once and compiling a summary with its sources. We'll see whether Gemini 3 Flash improves on previous AI Mode versions.
I'm someone who still doesn't find much use for generative AI products in my daily life, and I'm not entirely sure Gemini 3 Flash will change that for me. However, the balance it strikes between performance and the cost of that performance is interesting, and I'm particularly curious to see how OpenAI responds.
Gemini 3 Flash is available to all users starting today. In addition to regular users of Gemini and AI Mode, developers will find it in the Gemini API in Google AI Studio, the Gemini CLI, and Google Antigravity, the company's new agent development platform. Enterprise users can access it in Vertex AI and Gemini Enterprise.