Updated Mar 3, 2026

Claude Sonnet 4.5 vs GPT-5.3 Chat

37 fights queued

Why Claude Sonnet 4.5?

1.6x more context (200K)

Why GPT-5.3 Chat?

5 months newer (Mar 2026)

Favorites

Claude Sonnet 4.5 · GPT-5.3 Chat

Movie

2001: A Space Odyssey

Inception

Album

Kind of Blue

Kind of Blue

Book

Die Verwandlung

The Left Hand of Darkness

Ursula K. Le Guin

City

Amsterdam

Tokyo

Game

Portal 2

Shooter, Puzzle · 4.6

Tetris (1984)

01

Estimate Complexity · reasoning

Tests an AI's ability to make educated estimates based on technical knowledge

Claude Sonnet 4.5

Anthropic · Sep 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

We can do a back‑of‑the‑envelope estimate using a common scaling rule for transformer training compute.

Known approximate GPT‑3 specs

Parameters N ≈ 175 billion = 1.75 × 10^11
Training tokens T ≈ 300 billion = 3 × 10^11

FLOPs per token for a transformer
For large transformers, a common approximation is:

FLOPs per token ≈ 6 × N

Reasoning (rough):

Forward pass through all layers costs ~2N multiply‑adds.
Backpropagation roughly doubles or triples that cost.
Total ≈ ~6N FLOPs per token for forward + backward + gradients.

Total training FLOPs
Total compute ≈ (FLOPs per token) × (number of tokens)

= 6 × N × T

Plug in values:

6 × (1.75 × 10^11) × (3 × 10^11)

First multiply constants: 6 × 1.75 × 3 = 31.5

Then powers: 10^11 × 10^11 = 10^22

So:

Total ≈ 31.5 × 10^22
≈ 3.15 × 10^23 FLOPs

Final estimate
Total training compute for GPT‑3 is on the order of:

≈ 3 × 10^23 FLOPs

Sanity check
This sits in the typical reported range (~10^23 FLOPs) for GPT‑3‑scale training, so the estimate is reasonable.

Final estimate: about 3 × 10^23 FLOPs to train GPT‑3.
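The arithmetic in the response above can be checked with a short script. This is a minimal sketch, assuming the same approximate figures the response uses (175 billion parameters, 300 billion tokens) and the 6 × N × T rule of thumb; the factor of 6 comes from the usual accounting of roughly 2 FLOPs per parameter per token for the forward pass and roughly 4 for the backward pass. The function and constant names are illustrative and do not come from any library.

```python
# Back-of-the-envelope training compute using the 6 * N * T rule of thumb.
# All names below are hypothetical, chosen only for this illustration.

FORWARD_FLOPS_PER_PARAM = 2.0   # ~2 FLOPs per parameter per token (one multiply-add)
BACKWARD_FLOPS_PER_PARAM = 4.0  # backward pass costs roughly twice the forward pass


def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs as (2 + 4) * params * tokens."""
    flops_per_token = (FORWARD_FLOPS_PER_PARAM + BACKWARD_FLOPS_PER_PARAM) * params
    return flops_per_token * tokens


GPT3_PARAMS = 1.75e11  # ~175 billion parameters (figure quoted in the response)
GPT3_TOKENS = 3e11     # ~300 billion training tokens (figure quoted in the response)

total = training_flops(GPT3_PARAMS, GPT3_TOKENS)
print(f"{total:.2e} FLOPs")  # prints 3.15e+23 FLOPs
```

The result matches the 3.15 × 10^23 figure in the response. The rule ignores attention-specific terms and any hardware inefficiency, so it is only an order-of-magnitude estimate.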

02

AI Board Game Logic · reasoning

Tests an AI's ability to understand game rules and strategy

Claude Sonnet 4.5

Anthropic · Sep 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · svg

Nothing here. The model returned empty. We stared at it for a while.

03

Logic Puzzle · reasoning

Tests an AI's ability to solve a simple but potentially confusing logic puzzle

Claude Sonnet 4.5

Anthropic · Sep 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

04

Stochastic Consistency Test · analysis

Tests an AI's randomness and creativity

Claude Sonnet 4.5

Anthropic · Sep 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

05

SVG Layout Challenge · image generation

Tests an AI's ability to generate vector graphics

Claude Sonnet 4.5

Anthropic · Sep 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · svg

06

Xbox Controller SVG Art · image generation

Tests an AI's ability to create detailed SVG illustrations of gaming hardware

Claude Sonnet 4.5

Anthropic · Sep 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · svg

Nothing here. The model returned empty. We stared at it for a while.

07

Generate a Stand-Up Routine · conversation

Tests an AI's humor and creative writing ability

Claude Sonnet 4.5

Anthropic · Sep 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

08

Character Voice Test · conversation

Tests an AI's ability to write in distinct character voices

Claude Sonnet 4.5

Anthropic · Sep 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · text

Nothing here. The model returned empty. We stared at it for a while.

09

Pokémon Battle UI Recreation · web design

Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.

Claude Sonnet 4.5

Anthropic · Sep 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · website

Nothing here. The model returned empty. We stared at it for a while.

10

Mario Level UI Recreation · web design

Recreate an interactive, classic Mario level in a single HTML file.

Claude Sonnet 4.5

Anthropic · Sep 2025 · website

GPT-5.3 Chat

OpenAI · Mar 2026 · website

11

Linear App Clone · web design

Tests an AI's ability to replicate an existing UI with Tailwind CSS

Claude Sonnet 4.5

Anthropic · Sep 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · website

Nothing here. The model returned empty. We stared at it for a while.

12

Framer-Style Animation · web design

Tests an AI's ability to create smooth web animations

Claude Sonnet 4.5

Anthropic · Sep 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.3 Chat

OpenAI · Mar 2026 · website

Nothing here. The model returned empty. We stared at it for a while.
