Google has officially launched Gemini 2.5 Pro, making its reasoning model broadly available to both consumers and developers after roughly three months of previews. Experimental testing of Gemini 2.5 Pro began in late March (March 25), and shortly after it became available to paying subscribers and developers, Google unexpectedly opened access to free users as well.
In May (May 6), just prior to the Google I/O event, the model received a significant coding upgrade, followed by one last update at the beginning of June (June 5). The stable version released today reflects no changes from the June 5 preview.
With this official launch, Gemini 2.5 Pro has shed its “preview” label in the Gemini app’s model selection. This follows the earlier introduction of 2.5 Flash, which also became generally available at last month’s I/O conference. Google indicates that 2.5 Pro is intended for prompts related to reasoning, math, and code, whereas 2.5 Flash provides fast, all-around assistance.
According to Google, free users of the Gemini app get “limited access” to 2.5 Pro, while AI Pro subscribers get “expanded access” of up to 100 prompts per day. The highest tier, AI Ultra, provides the fullest access.
Additionally, 2.5 Flash is now stable for developers. It is the same preview model from May 20, but with new pricing: input tokens now cost $0.30 per million (up from $0.15) and output tokens $2.50 per million (down from $3.50). Google has also eliminated the separate thinking and non-thinking rates, leaving a single price tier regardless of input token size.
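For a sense of what the new rates translate to in practice, here is a quick back-of-the-envelope estimate in Python; the token counts are made-up illustration values, not figures from Google.

```python
# Rough cost estimate under the new stable 2.5 Flash rates.
INPUT_PRICE_PER_M = 0.30   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 2.50  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 50K-token prompt that produces 2K tokens of output.
print(f"${estimate_cost(50_000, 2_000):.4f}")  # ≈ $0.0200
```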
Developers can also experiment with Gemini 2.5 Flash-Lite, which is designed for high-volume, latency-sensitive tasks such as translation and classification where cost efficiency matters most. Google says it offers lower latency than both 2.0 Flash-Lite and 2.0 Flash across a broad sample of prompts.
Thinking is off by default but can be enabled with configurable thinking budgets. Native tools include Grounding with Google Search, Code Execution, URL Context, and function calling, alongside multimodal input and a context window of up to one million tokens. Google also says 2.5 Flash-Lite outperforms 2.0 Flash-Lite on coding, math, science, reasoning, and multimodal benchmarks.
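For developers, a minimal sketch of what enabling a thinking budget and attaching a native tool might look like with the google-genai Python SDK is below; the model ID, thinking budget, and prompt are illustrative assumptions rather than values confirmed in Google's announcement.

```python
from google import genai
from google.genai import types

client = genai.Client()  # picks up the API key from the environment

response = client.models.generate_content(
    # Hypothetical preview model ID used here purely for illustration.
    model="gemini-2.5-flash-lite-preview-06-17",
    contents="Classify the sentiment of: 'The update was worth the wait.'",
    config=types.GenerateContentConfig(
        # Thinking is off by default; granting a token budget turns it on.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
        # Native tools such as Grounding with Google Search attach via config.
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```

For a pure classification workload where latency and cost dominate, leaving the thinking budget at its default of zero would be the more typical choice.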