Qwen3 235B

Open source

Alibaba (Qwen)

Open license

text

Qwen 3 · Released 1y ago

Avg score

66.1 / 100

Context

131k

Output limit

16k

Input price

$0.20 /M

Output price

$0.60 /M

Pricing verified 1y ago

Benchmarks

preference

Chatbot Arena Elo (Elo)

Crowdsourced pairwise human preference rankings of LLM responses. Higher Elo means more frequently preferred by users.
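To make the link between an Elo gap and "more frequently preferred" concrete, here is a minimal sketch. It assumes the conventional Elo logistic curve (base 10, scale 400); the page does not state how its ratings are fitted, so the function name and the scale are illustrative assumptions.

```python
def expected_preference(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that model A is preferred over model B.

    Assumes the classic Elo expected-score formula; the scale of 400
    is the conventional default, not a value taken from this page.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Hypothetical ratings, chosen only to illustrate the shape of the curve.
    print(expected_preference(1300, 1300))  # equal ratings give an even split
    print(expected_preference(1400, 1000))  # a 400-point gap strongly favours A
```

Under this assumption, equal ratings give 0.5 and a 400-point lead gives roughly a 10-to-1 preference ratio.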

math

AIME 2024 (%)

Problems from the 2024 American Invitational Mathematics Examination. Each answer is an integer from 000 to 999; very hard for non-reasoning models.

coding

HumanEval (% pass@1)

164 hand-written Python programming problems scored by passing unit tests. Saturated for frontier models.
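For readers unfamiliar with the pass@1 unit, the sketch below shows the unbiased pass@k estimator commonly used with HumanEval-style benchmarks. The page does not say how its own scores were computed, so treat this as an illustration of the metric, not a description of this site's pipeline; `pass_at_k` and `benchmark_score` are hypothetical names.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them pass the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k as a percentage over (n, c) pairs, one pair per problem."""
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical results for three problems, 10 samples each.
    print(benchmark_score([(10, 10), (10, 3), (10, 0)], k=1))
```

For k = 1 the estimator reduces to the fraction of samples that pass, so with a single sample per problem the score is simply the share of problems solved.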

LiveCodeBench (% pass@1)

Continuously refreshed coding benchmark drawing from LeetCode, AtCoder, and Codeforces; reduces benchmark contamination.

agentic

SWE-bench Verified (% resolved)

Real GitHub issues that the model must resolve end-to-end. The Verified subset is a 500-task, human-validated slice of SWE-bench.

long context

RULER 128k (%)

Long-context retrieval and reasoning suite. We report the effective-context score at a 128k-token context length.

performance

Output Speed (tok/s)

Median sustained output speed in tokens per second on the model's first-party API for medium-length prompts. Higher is faster.

Time to First Token (ms)

Median time from request to first output chunk in milliseconds on the model's first-party API for medium-length prompts. Lower is snappier; reasoning models are penalised here because they think before talking.
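The two performance metrics can be combined into a rough end-to-end latency estimate. The sketch below assumes generation proceeds at a constant rate once the first token arrives, which is a simplification; the function name and the example numbers are hypothetical and are not measurements from this page.

```python
def estimated_response_seconds(ttft_ms: float, output_tok_per_s: float, output_tokens: int) -> float:
    """Approximate wall-clock time for one response.

    total = time to first token + output tokens / sustained output speed
    """
    if output_tok_per_s <= 0:
        raise ValueError("output speed must be positive")
    return ttft_ms / 1000.0 + output_tokens / output_tok_per_s


if __name__ == "__main__":
    # Hypothetical figures: 500 ms to first token, 50 tok/s, 1000 output tokens.
    print(estimated_response_seconds(500, 50, 1000))  # 20.5
```

For long outputs the speed term dominates; for short replies the time to first token matters most.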

Providers

Provider	Slug	Input $/M	Output $/M	Context	Quant
DeepInfra	deepinfra/fp8	$0.07	$0.10	262k	fp8
WandB	wandb/bf16	$0.10	$0.10	262k	bf16
Novita	novita/fp8	$0.09	$0.58	131k	fp8
Alibaba	alibaba	$0.15	$0.60	131k	unknown
SiliconFlow	siliconflow/fp8	$0.09	$0.60	262k	fp8
Parasail	parasail/fp8	$0.10	$0.60	131k	fp8
Together	together	$0.20	$0.60	262k	unknown
Friendli	friendli	$0.20	$0.80	262k	unknown
AtlasCloud	atlas-cloud/fp8	$0.20	$0.88	131k	fp8
Google	google-vertex	$0.22	$0.88	262k	unknown
Google	google-vertex	$0.25	$1.00	262k	unknown
Cerebras	cerebras/fp16	$0.60	$1.20	131k	fp16

Sourced from OpenRouter. Sorted by lowest output price.
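Because the table is sorted by output price only, the cheapest provider for a given workload depends on the mix of input and output tokens. The sketch below ranks the listed providers by total cost for one request. Prices are copied from the Input $/M and Output $/M columns above; the function names, the "(row 1)" and "(row 2)" labels for the two Google entries, and the example workload are illustrative assumptions.

```python
# (provider, input $ per million tokens, output $ per million tokens)
PROVIDERS = [
    ("DeepInfra", 0.07, 0.10),
    ("WandB", 0.10, 0.10),
    ("Novita", 0.09, 0.58),
    ("Alibaba", 0.15, 0.60),
    ("SiliconFlow", 0.09, 0.60),
    ("Parasail", 0.10, 0.60),
    ("Together", 0.20, 0.60),
    ("Friendli", 0.20, 0.80),
    ("AtlasCloud", 0.20, 0.88),
    ("Google (row 1)", 0.22, 0.88),
    ("Google (row 2)", 0.25, 1.00),
    ("Cerebras", 0.60, 1.20),
]


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Providers sorted from cheapest to most expensive for this workload."""
    costs = [(name, request_cost(input_tokens, output_tokens, p_in, p_out))
             for name, p_in, p_out in PROVIDERS]
    return sorted(costs, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical prompt-heavy workload: 1M input tokens, 100k output tokens.
    for name, cost in rank_by_cost(1_000_000, 100_000):
        print(f"{name:<16} ${cost:.4f}")
```

For this prompt-heavy example, SiliconFlow comes out cheaper than Alibaba even though it is listed after it, because the two share an output price and differ only in input price. The calculation ignores context limits and quantization, which also vary by provider.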
