Updated Mar 3, 2026

Claude Opus 4.6 vs GPT-5.3 Chat

47 fights queued

Why Claude Opus 4.6?

7.8x more context: 1.0M

Why GPT-5.3 Chat?

1.9x cheaper overall: $1.75/M in · $14.00/M out

Favorites

Claude Opus 4.6 · GPT-5.3 Chat

Movie

2001: A Space Odyssey

Inception

Album

Kind of Blue

Kind of Blue

Book

Gödel, Escher, Bach

Douglas R. Hofstadter

The Left Hand of Darkness

Ursula K. Le Guin

City

Istanbul

Tokyo

Game

The Legend of Zelda: Breath of the Wild

Adventure, Action · 4.5

Tetris (1984)

12 / 47

01

Estimate Complexity · reasoning

Tests an AI's ability to make educated estimates based on technical knowledge

Claude Opus 4.6

Anthropic · Feb 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

We can do a back‑of‑the‑envelope estimate using a common scaling rule for transformer training compute.

Known approximate GPT‑3 specs

Parameters N ≈ 175 billion = 1.75 × 10^11
Training tokens T ≈ 300 billion = 3 × 10^11

FLOPs per token for a transformer
For large transformers, a common approximation is:

FLOPs per token ≈ 6 × N

Reasoning (rough):

Forward pass costs ~2N FLOPs per token (one multiply and one add per parameter).
Backward pass costs roughly twice the forward pass, ~4N FLOPs per token.
Total ≈ 6N FLOPs per token for forward + backward.

Total training FLOPs
Total compute ≈ (FLOPs per token) × (number of tokens)

= 6 × N × T

Plug in values:

6 × (1.75 × 10^11) × (3 × 10^11)

First multiply constants: 6 × 1.75 × 3 = 31.5

Then powers: 10^11 × 10^11 = 10^22

So:

Total ≈ 31.5 × 10^22
≈ 3.15 × 10^23 FLOPs

Final estimate
Total training compute for GPT‑3 is on the order of:

≈ 3 × 10^23 FLOPs

Sanity check
This sits in the typical reported range (~10^23 FLOPs) for GPT‑3‑scale training, so the estimate is reasonable.

Final estimate: about 3 × 10^23 FLOPs to train GPT‑3.
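
The arithmetic in this response can be checked with a few lines of Python. This is a minimal sketch, not part of the model's answer: the function name `training_flops` and its `flops_per_param_token` parameter are hypothetical, and the inputs are the approximate N and T values quoted above.

```python
def training_flops(n_params: float, n_tokens: float,
                   flops_per_param_token: float = 6.0) -> float:
    """Approximate training compute with the rule of thumb C = 6 * N * T."""
    return flops_per_param_token * n_params * n_tokens


# Approximate GPT-3 figures as quoted in the response above.
N = 1.75e11  # parameters
T = 3e11     # training tokens

total = training_flops(N, T)
print(f"{total:.2e} FLOPs")  # prints "3.15e+23 FLOPs"
```

The factor of 6 is kept as a parameter so the estimate can be rerun under a different per-token cost assumption.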

2 / 47

02

AI Board Game Logic · reasoning

Tests an AI's ability to understand game rules and strategy

Claude Opus 4.6

Anthropic · Feb 2026 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · svg

Nothing here. The model returned empty. We stared at it for a while.

3 / 47

03

Logic Puzzle · reasoning

Tests an AI's ability to solve a simple but potentially confusing logic puzzle

Claude Opus 4.6

Anthropic · Feb 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

4 / 47

04

Stochastic Consistency Test · analysis

Tests an AI's randomness and creativity

Claude Opus 4.6

Anthropic · Feb 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

5 / 47

05

SVG Layout Challenge · image generation

Tests an AI's ability to generate vector graphics

Claude Opus 4.6

Anthropic · Feb 2026 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · svg

6 / 47

06

Xbox Controller SVG Art · image generation

Tests an AI's ability to create detailed SVG illustrations of gaming hardware

Claude Opus 4.6

Anthropic · Feb 2026 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · svg

Nothing here. The model returned empty. We stared at it for a while.

7 / 47

07

Generate a Stand-Up Routine · conversation

Tests an AI's humor and creative writing ability

Claude Opus 4.6

Anthropic · Feb 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

8 / 47

08

Realistic AI Interview · conversation

Tests an AI's ability to simulate personalities and predict future trends

Claude Opus 4.6

Anthropic · Feb 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

9 / 47

09

Satirical Fake News Headline · conversation

Tests an AI's humor and understanding of current events

Claude Opus 4.6

Anthropic · Feb 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

10 / 47

10

Character Voice Test · conversation

Tests an AI's ability to write in distinct character voices

Claude Opus 4.6

Anthropic · Feb 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

11 / 47

11

Minimalist Landing Page · web design

Tests an AI's ability to generate a complete, working landing page

Claude Opus 4.6

Anthropic · Feb 2026 · website

GPT-5.3 Chat

OpenAI · Mar 2026 · website

12 / 47

12

Pokémon Battle UI Recreation · web design

Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.

Claude Opus 4.6

Anthropic · Feb 2026 · website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · website

Nothing here. The model returned empty. We stared at it for a while.
