Google's New AI Model Comes With 'Thinking Budget' for Developers

Available now in preview on Google AI Studio and Vertex AI, the model balances advanced reasoning capabilities with competitive pricing

With Gemini 2.5 Flash, Google has unveiled a major update to its AI model lineup, offering businesses and developers granular control over how much “thinking” their AI does.

A standout feature is the “thinking budget” — a tunable mechanism that lets users allocate how much computational effort the model should apply before producing a response.

This enables a flexible trade-off between response depth and cost, addressing a key challenge in AI deployment: sophisticated reasoning often leads to higher latency and expense.
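As a rough illustration of how such a budget might be set per request, here is a minimal sketch using the google-genai Python SDK's ThinkingConfig; the model ID, prompt, and budget value are illustrative assumptions, not official guidance.

```python
# Sketch: capping the "thinking budget" on a single request with the
# google-genai SDK. Model ID and budget value below are placeholders.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",  # assumed preview model ID
    contents="Summarize the trade-offs between quicksort and mergesort.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_budget=1024  # max tokens the model may spend reasoning
        )
    ),
)
print(response.text)
```

The appeal of exposing this as a per-request knob is that the same model can serve both cheap, low-latency calls and harder problems that justify extra reasoning tokens.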

“We want to offer developers the flexibility to adapt the amount of thinking the model does, depending on their needs,” said Tulsee Doshi, Product Director for Gemini at Google DeepMind.

Key highlights:

  • Gemini 2.5 Flash is Google’s first hybrid reasoning model, allowing developers to toggle deep thinking on or off.
  • Pricing is tiered by whether reasoning is enabled:
    • $0.60 per million output tokens (reasoning off)
    • $3.50 per million output tokens (reasoning on)
  • The thinking budget can be set as high as 24,576 tokens, and the model automatically uses only as much of it as each task requires (see the sketch after this list).
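Taken together, these figures suggest that toggling reasoning off amounts to setting the budget to zero, while the ceiling sits at 24,576 tokens. A minimal sketch, again assuming the google-genai SDK:

```python
from google.genai import types

# Sketch: a budget of 0 should switch reasoning off entirely, matching the
# cheaper no-reasoning pricing tier above (assumed, not official guidance).
config_no_reasoning = types.GenerateContentConfig(
    thinking_config=types.ThinkingConfig(thinking_budget=0)
)

# And the documented ceiling: allow up to the full 24,576-token budget.
config_max_reasoning = types.GenerateContentConfig(
    thinking_config=types.ThinkingConfig(thinking_budget=24576)
)
```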

With this update, Google is leaning into the demand for cost-effective, controllable AI tools—especially for enterprise use cases where predictability and performance are mission-critical.

In other news, to mark National Dolphin Day, Google teamed up with Georgia Tech and the Wild Dolphin Project (WDP) to unveil DolphinGemma, a groundbreaking AI model designed to analyze and generate dolphin vocalizations.

This marks a significant step toward the long-standing goal of understanding and possibly communicating with dolphins. Trained on decades of underwater audio and video data from WDP’s long-term study of Atlantic spotted dolphins (Stenella frontalis) in the Bahamas, DolphinGemma can detect vocal patterns and generate lifelike dolphin sounds.

“By identifying recurring sound patterns, clusters, and reliable sequences, the model can help researchers uncover hidden structures and potential meanings,” Google explained.