
o1-mini

Closed
OpenAI
Proprietary
text
o-series · Released 2y ago
Avg score
72.6
/ 100
Context
128k
Output limit
66k
Input price
$3.00 /M
Output price
$12.00 /M

Pricing verified 1y ago · source
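At the listed rates ($3.00 per million input tokens, $12.00 per million output tokens), per-request cost is a straightforward calculation. A minimal sketch; the example token counts are hypothetical:

```python
# Cost of one request at the card's per-million-token prices.
INPUT_PRICE_PER_M = 3.00    # $ per 1M input tokens (from the card)
OUTPUT_PRICE_PER_M = 12.00  # $ per 1M output tokens (from the card)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 10k-token prompt with a 2k-token completion costs $0.054.
cost = request_cost(10_000, 2_000)
```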

Benchmarks

preference

Crowdsourced pairwise human preference rankings of LLM responses. Higher Elo means more frequently preferred by users.
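The Elo mechanics behind such rankings can be sketched as follows; the K-factor and starting ratings here are illustrative assumptions, not the leaderboard's actual parameters:

```python
# Illustrative Elo update from one pairwise preference vote.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Return updated ratings after one vote (a_won=True if A was preferred)."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1 - s_a) - (1 - e_a))

# Two models start at 1000; A wins one head-to-head vote.
new_a, new_b = elo_update(1000, 1000, a_won=True)
```

Ratings drift toward models that win votes more often than their current rating predicts, which is why a higher Elo corresponds to being preferred more frequently.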

math

Problems from the 2024 American Invitational Mathematics Examination (AIME). Answers are integers from 0 to 999; very hard for non-reasoning models.

coding

HumanEval (% pass@1)

164 hand-written Python programming problems scored by passing unit tests. Saturated for frontier models.
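HumanEval-style pass@k scores are commonly computed with the unbiased estimator from the original benchmark's authors. A minimal sketch, assuming a hypothetical per-problem correctness list; with one sample per problem, pass@1 reduces to the plain pass rate:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: probability that at least one of k
    samples drawn from n generations (c of them correct) passes."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical run: 150 of the 164 problems solved with 1 sample each.
results = [1] * 150 + [0] * 14
score = sum(pass_at_k(1, c, 1) for c in results) / len(results)  # ~0.915
```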

vision

Massive Multi-discipline Multimodal Understanding (MMMU): college-exam-level questions with images across 30+ subjects.

Math reasoning over visual contexts (charts, figures, geometry).

long context

Long-context retrieval and reasoning suite. We report the 128k token effective-context score.

performance

Median sustained output speed in tokens per second on the model's first-party API for medium-length prompts. Higher is faster.

Median time from request to first output chunk in milliseconds on the model's first-party API for medium-length prompts. Lower is snappier; reasoning models are penalised here because they think before talking.
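A measurement harness for these two metrics can be sketched as below. The stream is any iterable of text chunks; a real harness would wrap a streaming API client, use the model's own tokenizer rather than the naive whitespace count assumed here, and take the median over many requests:

```python
import time
from typing import Callable, Iterable

def measure_stream(stream: Iterable[str],
                   count_tokens: Callable[[str], int] = lambda s: len(s.split())):
    """Return (time-to-first-chunk in ms, sustained tokens/sec) for one
    streamed response. Both the chunk source and the token counter are
    stand-ins for a real client and tokenizer."""
    start = time.perf_counter()
    first_chunk_ms = None
    n_tokens = 0
    for chunk in stream:
        now = time.perf_counter()
        if first_chunk_ms is None:
            first_chunk_ms = (now - start) * 1000.0  # latency to first chunk
        n_tokens += count_tokens(chunk)
    elapsed = time.perf_counter() - start
    return first_chunk_ms, n_tokens / elapsed
```

For a reasoning model, the hidden thinking phase elapses before the first chunk arrives, which is why such models score worse on the first-chunk latency metric even when their sustained speed is competitive.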

Providers

Only available from OpenAI — no third-party hosts found.
