Google Unveils Gemini 2.5 Flash with Customizable Reasoning
Google has introduced Gemini 2.5 Flash, a powerful new AI model featuring customizable reasoning through a “thinking budget.”
Designed for flexibility, it lets developers control performance and cost, making AI smarter and more adaptable for business and enterprise needs.
Customizable Reasoning via Thinking Budget
The standout feature of Gemini 2.5 Flash is its “thinking budget,” allowing developers to fine-tune how much computational effort the model applies to complex tasks.
This hybrid reasoning system enables a balance between performance and cost.
“We want to offer developers flexibility to adapt how much thinking the model does, depending on their needs,” said Tulsee Doshi, Product Director for Gemini Models at Google DeepMind.
Dynamic Pricing Based on Reasoning Needs
Google has introduced a pricing model in which the output rate depends on whether thinking is enabled:
- Input: $0.15 per million tokens
- Output: $0.60 per million tokens (thinking off)
- Output: $3.50 per million tokens (thinking on)
Developers pay only for the tokens actually generated across the reasoning and output phases. The thinking budget can be set as high as 24,576 tokens per request.
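To make the arithmetic concrete, here is a minimal Python sketch of a per-request cost estimate using the rates listed above. The function name `estimate_cost_usd` and its parameters are hypothetical and not part of any Google SDK. The sketch also assumes that reasoning tokens are billed as output tokens at the "thinking on" rate whenever any reasoning occurs, which is one reading of the pricing described here.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M_THINKING_OFF = 0.60
OUTPUT_USD_PER_M_THINKING_ON = 3.50
MAX_THINKING_BUDGET = 24_576  # per-request ceiling quoted in the article


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      thinking_tokens: int = 0) -> float:
    """Estimate the cost of one request (hypothetical helper, not a Google API)."""
    if min(input_tokens, output_tokens, thinking_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if thinking_tokens > MAX_THINKING_BUDGET:
        raise ValueError("thinking_tokens exceeds the 24,576-token budget ceiling")

    # Assumption: any reasoning tokens mean the request is billed at the
    # "thinking on" output rate, and reasoning tokens count as output.
    thinking_on = thinking_tokens > 0
    output_rate = (OUTPUT_USD_PER_M_THINKING_ON if thinking_on
                   else OUTPUT_USD_PER_M_THINKING_OFF)
    billed_output = output_tokens + thinking_tokens

    return (input_tokens * INPUT_USD_PER_M + billed_output * output_rate) / 1_000_000


if __name__ == "__main__":
    print(f"thinking off: ${estimate_cost_usd(2_000, 500):.6f}")
    print(f"thinking on:  ${estimate_cost_usd(2_000, 500, thinking_tokens=4_000):.6f}")
```

Under these assumptions, a request that reasons costs more both because of the higher output rate and because the reasoning tokens themselves are billed.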
Performance Highlights and Benchmark Wins
Despite being lightweight, Gemini 2.5 Flash shows strong performance:
- Humanity’s Last Exam: 12.1% (beats Claude 3.7 Sonnet and DeepSeek R1; trails OpenAI’s o4-mini at 14.3%)
- GPQA Diamond: 78.3%
- AIME 2025/2024: 78.0% / 88.0%
Smart Cost Control for Enterprises
Businesses can toggle reasoning on or off based on task complexity, optimizing cost-efficiency. Simple tasks like translations can skip reasoning, while deeper tasks (e.g., engineering calculations) can leverage full processing power.
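The toggle described above could be sketched as a small routing table that maps a task type to a thinking budget. The category names, the `BUDGET_BY_TASK` table, and the `pick_thinking_budget` helper are all hypothetical illustrations. Only the 24,576-token ceiling and the two example tasks come from the article, and the returned number would still have to be passed to whichever SDK setting controls thinking.

```python
MAX_THINKING_BUDGET = 24_576  # per-request ceiling quoted in the article

# Hypothetical routing table: 0 disables reasoning, the ceiling enables
# the maximum budget. Real deployments would tune these values.
BUDGET_BY_TASK = {
    "translation": 0,
    "engineering_calculation": MAX_THINKING_BUDGET,
}


def pick_thinking_budget(task_type: str, default: int = 0) -> int:
    """Return a thinking budget for a task, clamped to the allowed range."""
    budget = BUDGET_BY_TASK.get(task_type, default)
    return max(0, min(budget, MAX_THINKING_BUDGET))


if __name__ == "__main__":
    for task in ("translation", "engineering_calculation", "unknown_task"):
        print(task, pick_thinking_budget(task))
```

Defaulting unknown task types to a budget of 0 keeps costs predictable. A team that values answer quality over cost might choose a non-zero default instead.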
Availability and Future Plans
Currently available in preview via Google AI Studio and Vertex AI, Gemini 2.5 Flash also appears as an experimental option in the Gemini app.
Google will refine its features based on user feedback during this rollout phase.
Gemini 2.5 Flash positions Google as a competitive force in enterprise AI by combining adaptive reasoning, cost control, and strong performance in a single customizable model.
News Gist
Google has launched Gemini 2.5 Flash, an AI model with a “thinking budget” that lets developers control reasoning effort and costs.
With dynamic pricing, strong performance, and enterprise flexibility, it’s now available via Google AI Studio and Vertex AI.