Gemini 2.5 Pro (Max Thinking)

xhigh · Closed · Google · Proprietary · text · vision
Gemini 2.5 · Released 1y ago
Avg score: 71.3 / 100
Context: 2.0M
Output limit: 66k
Input price: $1.25 /M
Output price: $10.00 /M

Pricing verified 1y ago · Same per-token price as the standard variant; max-thinking mode consumes far more output tokens per request.
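
To make the pricing note concrete, here is a minimal sketch of the per-request cost arithmetic at the listed prices. The token counts are hypothetical values chosen for illustration, not measurements, and the sketch assumes that tokens emitted in max-thinking mode are billed at the output rate, as the note above implies.

```python
# Listed prices for this model, in USD per million tokens.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 10.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the listed per-token prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical token counts: same prompt, but the max-thinking variant
# is assumed to emit 20x more output tokens than the standard variant.
standard = request_cost(input_tokens=2_000, output_tokens=1_000)
max_thinking = request_cost(input_tokens=2_000, output_tokens=20_000)

print(f"standard:     ${standard:.4f}")
print(f"max thinking: ${max_thinking:.4f}")
print(f"ratio:        {max_thinking / standard:.1f}x")
```

Because the output price is eight times the input price, the output token count dominates the bill, so identical per-token pricing can still translate into a much higher cost per request.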

Benchmarks

general

Rolling Contamination-Controlled Average · Fresh
/100

Contamination-controlled average across seven rolling task categories (reasoning, coding, agentic coding, mathematics, data analysis, language, instruction following). Questions are rotated every six months and ground-truth answers are objective, removing the need for LLM-as-judge scoring.

data analysis

Rolling Data Analysis · Fresh
/100

Rolling contamination-controlled data-analysis evaluation. Table comprehension, CSV / spreadsheet reasoning, SQL-style joins, and chart interpretation. Refreshed every six months with new tables and questions to minimise contamination.

Hosted endpoints

Host    Input $/M  Output $/M  Context  Quant
Host G  $1.25      $10.00      1.0M     unknown
Host M  $1.25      $10.00      1.0M     unknown
Host L  $1.25      $10.00      1.0M     unknown
Host F  $1.25      $10.00      1.0M     unknown
Anonymised third-party hosts. Sorted by lowest output price.

Effort variants

Same API model, different reasoning budget. Thinking / xHigh modes usually score better on reasoning benchmarks but emit many more output tokens per request.