Gemini 2.5 Pro (Max Thinking)
xhigh · Pricing verified 1y ago · Same per-token price as the standard variant; max-thinking mode consumes far more output tokens per request.
Benchmarks
general
Contamination-controlled average across seven rolling task categories (reasoning, coding, agentic coding, mathematics, data analysis, language, instruction following). Questions are rotated every six months and ground-truth answers are objective, removing the need for LLM-as-judge scoring.
data analysis
Rolling contamination-controlled data-analysis evaluation covering table comprehension, CSV / spreadsheet reasoning, SQL-style joins, and chart interpretation. Tables and questions are replaced every six months.
Hosted endpoints
| Host | Input $/M | Output $/M | Context | Quant |
|---|---|---|---|---|
| Host G | $1.25 | $10.00 | 1.0M | unknown |
| Host M | $1.25 | $10.00 | 1.0M | unknown |
| Host L | $1.25 | $10.00 | 1.0M | unknown |
| Host F | $1.25 | $10.00 | 1.0M | unknown |
Effort variants
Same API model, different reasoning budget. Thinking / xHigh modes usually score better on reasoning benchmarks but emit many more output tokens per request.
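Because the per-token price is identical across variants, the cost difference comes entirely from output volume. The sketch below works that through using the listed rates ($1.25 per million input tokens, $10.00 per million output tokens). The token counts are hypothetical, chosen only for illustration, and `request_cost` is an illustrative helper, not part of any host's API.

```python
# Listed rates from the hosted-endpoints table (USD per million tokens).
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 10.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Hypothetical token counts: same prompt, different amounts of output.
standard = request_cost(input_tokens=2_000, output_tokens=1_000)
max_thinking = request_cost(input_tokens=2_000, output_tokens=20_000)

print(f"standard:     ${standard:.4f}")
print(f"max thinking: ${max_thinking:.4f}")
print(f"ratio:        {max_thinking / standard:.1f}x")
```

With output priced at eight times input, a request whose output grows twentyfold costs roughly that much more overall unless the prompt is very long, so output volume tends to dominate the bill.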