
Free, ad-free, open-data

The interactive AI decision engine

Compare 30+ language and image models across real-world scenarios. Drag the weight sliders, set your budget, see what wins.

Pick a scenario

Each scenario weights benchmarks differently. Switch between them and the leaderboard re-ranks live.

Build your own

A pair-programming assistant for IDE / agent loops. Heavy on coding benchmarks, with a real-world agentic component (SWE-bench) and some weight on cost since coding loops burn tokens.
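The page does not spell out how a scenario score or its "cov." (coverage) figure is computed. The following is a minimal sketch of one plausible scheme, assuming benchmark scores are on a 0-100 scale, a benchmark with no data contributes nothing, and coverage is the share of total weight backed by data. The weights, the model scores, and the function name `scenario_score` are all hypothetical, and the cost component mentioned above is left out for brevity.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical weights for a coding-heavy scenario (not the site's real values).
CODING_WEIGHTS = {"LiveCB": 3, "SWE-bench": 3, "HumanEval": 2, "Arena": 2}


def scenario_score(
    weights: Mapping[str, float],
    scores: Mapping[str, Optional[float]],
) -> Tuple[float, float]:
    """Return (score, coverage) for one model.

    Assumption: missing benchmarks add nothing to the score, so a model
    with no data at all ends up with a score of 0.0 and coverage of 0.0.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    weighted_sum = 0.0
    covered_weight = 0.0
    for benchmark, weight in weights.items():
        value = scores.get(benchmark)
        if value is None:  # no published result for this benchmark
            continue
        weighted_sum += weight * value
        covered_weight += weight

    return weighted_sum / total_weight, covered_weight / total_weight


# Hypothetical model with results for only two of the four benchmarks.
partial = {"LiveCB": 80.0, "SWE-bench": None, "HumanEval": 90.0, "Arena": None}
partial_score, partial_coverage = scenario_score(CODING_WEIGHTS, partial)
print(f"{partial_score:.1f} ({partial_coverage:.0%} cov.)")
```

Re-ranking after a slider change then amounts to recomputing this pair for every model with the new weights and sorting by score in descending order.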

| # | Model (provider) | Scenario score | Est. $/month | Context | Benchmarks with no data (of LiveCB, SWE-bench, HumanEval, Arena) |
|---|---|---|---|---|---|
| 1 | OpenAI | 95.4 | $600 | 200k | 0 |
| 2 | Google (vision) | 92.6 | $120 | 2.0M | 0 |
| 3 | DeepSeek | 79.7 | $33 | 128k | 0 |
| 4 | Anthropic (vision) | 75.4 | $1,035 | 200k | 0 |
| 5 | Google (vision) | 65.6 (30% cov.) | $6 | 1.0M | 2 |
| 6 | Anthropic (vision) | 64.9 | $207 | 200k | 0 |
| 7 | OpenAI | 61.9 (30% cov.) | $180 | 128k | 2 |
| 8 | OpenAI | 61.3 | $66 | 200k | 0 |
| 9 | Alibaba (Qwen) | 59.9 | $10 | 131k | 0 |
| 10 | OpenAI | 56.5 (60% cov.) | $900 | 200k | 1 |
| 11 | xAI | 53.1 (30% cov.) | $138 | 131k | 2 |
| 12 | Google (vision) | 49.9 (30% cov.) | $75 | 2.0M | 2 |
| 13 | Meta | 49.3 (30% cov.) | $116 | 128k | 2 |
| 14 | xAI | 47.4 (60% cov.) | $207 | 1.0M | 1 |
| 15 | Meta | 46.5 (30% cov.) | $29 | 128k | 2 |
| 16 | Meta (vision) | 44.1 (30% cov.) | $10 | 10.0M | 2 |
| 17 | Mistral | 42.7 (30% cov.) | $102 | 128k | 2 |
| 18 | Alibaba (Qwen) | 42.4 (30% cov.) | $30 | 131k | 2 |
| 19 | Anthropic | 41.6 (30% cov.) | $55 | 200k | 2 |
| 20 | Anthropic (vision) | 40.7 | $207 | 200k | 0 |
| 21 | Anthropic (vision) | 38.3 (30% cov.) | $1,035 | 200k | 2 |
| 22 | DeepSeek | 35.9 | $16 | 128k | 0 |
| 23 | Meta | 31.4 (30% cov.) | $29 | 128k | 2 |
| 24 | Google (vision) | 27.9 (30% cov.) | $5 | 1.0M | 2 |
| 25 | Meta (vision) | 26.3 (60% cov.) | $14 | 1.0M | 1 |
| 26 | OpenAI (vision) | 23.1 | $150 | 128k | 0 |
| 27 | OpenAI (vision) | 20.2 (70% cov.) | $9 | 128k | 1 |
| 28 | OpenAI (vision) | 14.7 (60% cov.) | $510 | 128k | 2 |
| 29 | Mistral | 2.0 (30% cov.) | $40 | 66k | 2 |
| 30 | OpenAI (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 31 | OpenAI (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 32 | Google (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 33 | Google (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 34 | Midjourney (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 35 | Black Forest Labs (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 36 | Black Forest Labs (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 37 | Stability AI (image) | 0.0 (0% cov.) | n/a | n/a | 4 |
| 38 | Ideogram (image) | 0.0 (0% cov.) | n/a | n/a | 4 |

Showing 38 of 38 models. Hover any score for the source. Click a model to see its full benchmark profile.

Cost vs quality

Models on the Pareto frontier (highlighted) give you the best quality at their cost tier.

Legend: highlighted = Pareto frontier; bubble size = context window.
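To make the Pareto frontier concrete: a model is on the frontier when no other model is both at least as cheap and at least as good, and strictly better on one of the two. The sketch below assumes "quality" means the scenario score and "cost" means the estimated monthly cost. It uses five rows from the leaderboard above as sample input, so its result describes only that subset and not the chart on the page. The names `Model` and `pareto_frontier` are hypothetical.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    label: str
    monthly_cost: float  # estimated $/month
    quality: float       # scenario score, assumed to be the quality axis


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Return the non-dominated models, ordered from cheapest to priciest."""
    frontier: List[Model] = []
    best_quality = float("-inf")
    # Sort by cost ascending; at equal cost, put the better model first.
    for model in sorted(models, key=lambda m: (m.monthly_cost, -m.quality)):
        # A model survives only if it beats every cheaper-or-equal model.
        if model.quality > best_quality:
            frontier.append(model)
            best_quality = model.quality
    return frontier


# Sample input: five leaderboard rows, labelled by rank and provider.
sample = [
    Model("#1 OpenAI", 600, 95.4),
    Model("#2 Google", 120, 92.6),
    Model("#3 DeepSeek", 33, 79.7),
    Model("#4 Anthropic", 1035, 75.4),
    Model("#9 Alibaba (Qwen)", 10, 59.9),
]

frontier_labels = [m.label for m in pareto_frontier(sample)]
print(frontier_labels)
```

Within this subset, the #4 row drops out because the #1 row costs less and scores higher; the other four each offer the best score available at or below their cost.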