Generative AI News

Google Unveils Gemini 2.5 Flash with Customizable Reasoning

Google has introduced Gemini 2.5 Flash, a new AI model whose reasoning effort developers can adjust through a “thinking budget.”

The model lets developers trade reasoning performance against cost, which makes it easier to fit to business and enterprise workloads.

Customizable Reasoning via Thinking Budget

The standout feature of Gemini 2.5 Flash is its “thinking budget,” a per-request cap, measured in tokens, on how much reasoning the model does before it answers. Developers can raise it for complex tasks and lower it for simple ones.

This hybrid reasoning system enables a balance between performance and cost.

“We want to offer developers flexibility to adapt how much thinking the model does, depending on their needs,” said Tulsee Doshi, Product Director for Gemini Models at Google DeepMind.
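As a rough illustration of setting a thinking budget, the sketch below builds a JSON-style request body with the budget clamped to the 24,576-token limit quoted in this article. The helper name `build_request` is hypothetical, and the field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are an assumption about the request format rather than something this article confirms, so check them against the official API reference before use.

```python
MAX_THINKING_BUDGET = 24_576  # per-request ceiling quoted in the article


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Return a request body with the thinking budget clamped to a valid range.

    Field names are assumed, not confirmed by the article.
    """
    # Keep the budget between 0 and the documented maximum.
    budget = max(0, min(thinking_budget, MAX_THINKING_BUDGET))
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": budget},
        },
    }


# An oversized budget is reduced to the maximum instead of being rejected.
request_body = build_request("Estimate the load on this beam.", 50_000)
```

Clamping in the client keeps an out-of-range value from reaching the service; whether the service would reject or silently adjust such a value is not stated here.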

Dynamic Pricing Based on Reasoning Needs

Google is introducing pricing that depends on whether thinking is enabled:

  • Input: $0.15 per million tokens
  • Output (thinking off): $0.60 per million tokens
  • Output (thinking on): $3.50 per million tokens

Developers pay only for the tokens actually used, which includes tokens generated during the reasoning phase as well as those in the final output. The thinking budget can be set to at most 24,576 tokens per request.
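To make the pricing concrete, here is a minimal cost estimate using the rates listed above. It assumes that reasoning tokens are billed at the thinking-on output rate and that any request with reasoning tokens counts as "thinking on"; the function name and this billing interpretation are illustrative, not an official formula.

```python
# Prices in US dollars per million tokens, as listed in the article.
INPUT_PRICE = 0.15
OUTPUT_PRICE_THINKING_OFF = 0.60
OUTPUT_PRICE_THINKING_ON = 3.50


def estimate_cost(input_tokens: int, output_tokens: int, thinking_tokens: int = 0) -> float:
    """Estimate the cost of one request in US dollars.

    Assumption: reasoning tokens are billed together with output tokens,
    at the thinking-on rate, whenever any reasoning tokens are used.
    """
    thinking_on = thinking_tokens > 0
    output_rate = OUTPUT_PRICE_THINKING_ON if thinking_on else OUTPUT_PRICE_THINKING_OFF
    billed_output = output_tokens + thinking_tokens
    return (input_tokens * INPUT_PRICE + billed_output * output_rate) / 1_000_000


# 1,000 input tokens and 500 output tokens, without and with 2,000 reasoning tokens.
cost_without_thinking = estimate_cost(1_000, 500)
cost_with_thinking = estimate_cost(1_000, 500, thinking_tokens=2_000)
```

Under these assumptions the same request costs about $0.00045 without thinking and about $0.0089 with 2,000 reasoning tokens, roughly twenty times more, which is why the budget matters at scale.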

Performance Highlights and Benchmark Wins

Despite being lightweight, Gemini 2.5 Flash shows strong performance:  

  • Humanity’s Last Exam: 12.1% (beats Claude 3.7 Sonnet and DeepSeek R1; trails OpenAI’s o4-mini at 14.3%)  
  • GPQA Diamond: 78.3%  
  • AIME 2025: 78.0%; AIME 2024: 88.0%

Smart Cost Control for Enterprises

Businesses can toggle reasoning on or off based on task complexity, optimizing cost-efficiency. Simple tasks like translations can skip reasoning, while deeper tasks (e.g., engineering calculations) can leverage full processing power.
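One way to apply this in practice is a small lookup that maps a task category to a thinking budget. The sketch below is hypothetical: the category names, the default, and the intermediate value are illustrative choices, and it assumes a budget of 0 turns reasoning off. Only the 24,576-token maximum comes from this article.

```python
MAX_THINKING_BUDGET = 24_576  # per-request ceiling quoted in the article
DEFAULT_BUDGET = 1_024        # hypothetical middle ground for unknown tasks

# Hypothetical task categories; tune these for your own workload.
BUDGET_BY_TASK = {
    "translation": 0,                                # simple task: skip reasoning
    "summarization": 1_024,                          # light reasoning
    "engineering_calculation": MAX_THINKING_BUDGET,  # full reasoning
}


def pick_thinking_budget(task_type: str) -> int:
    """Return the thinking budget for a task type, or the default if unknown."""
    return BUDGET_BY_TASK.get(task_type, DEFAULT_BUDGET)


budget = pick_thinking_budget("translation")  # 0, so reasoning is skipped
```

Keeping the mapping in one table makes the cost policy easy to audit and change without touching the code that sends requests.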

Availability and Future Plans

Currently available in preview via Google AI Studio and Vertex AI, Gemini 2.5 Flash also appears as an experimental option in the Gemini app.

Google will refine its features based on user feedback during this rollout phase.

Gemini 2.5 Flash strengthens Google’s position in enterprise AI by combining adaptive reasoning, cost control, and strong benchmark performance.

News Gist

Google has launched Gemini 2.5 Flash, an AI model with a “thinking budget” that lets developers control reasoning effort and costs.

With dynamic pricing, strong performance, and enterprise flexibility, it’s now available via Google AI Studio and Vertex AI.
