Updated Mar 5, 2026

FAQ

Keep going

Gemini 2.5 Flash Preview (thinking) vs Claude Opus 4New provider

Gemini 2.5 Flash Preview (thinking) vs MiniMax M1New provider

GPT-5.4 vs Gemini 2.5 Pro Preview 06-05Cross-provider

Gemini 2.5 Flash Preview (thinking)vsGPT-5.4

20 fights queued

Why Gemini 2.5 Flash Preview (thinking)?

4.6x cheaper overall$0.17/M in · $3.50/M out

Why GPT-5.4?

11 months newerMar 2026

Favorites

Gemini 2.5 Flash Preview (thinking)GPT-5.4

Movie

—

Arrival

Arrival

Album

—

OK Computer

Book

—

Gödel, Escher, Bach

Gödel, Escher, Bach

Douglas R. Hofstadter

City

—

Kyoto

Game

—

Outer Wilds

Outer Wilds

Indie, Adventure4.4

12 / 20

01

AI Board Game Logicreasoning

Tests an AI's ability to understand game rules and strategy

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025svg

GPT-5.4

GPT-5.4

Openai · Mar 2026svg

Vote

2 / 20

02

Stochastic Consistency Testanalysis

Tests an AI's randomness and creativity

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025text

Okay, here are 5 jokes:

Why don't scientists trust atoms? Because they make up everything!
What do you call a fish with no eyes? fsh!
Why did the tomato turn red? Because it saw the salad dressing!
What has a neck but no head? A bottle.
What do you call a lazy kangaroo? Pouch potato!

Now, to answer your question about uniqueness:

Each of these jokes has a different setup and a different punchline from the others in the list. They are distinct jokes.

Therefore, there are 5 unique jokes within this list of 5.

GPT-5.4

GPT-5.4

Openai · Mar 2026text

I told my computer I needed a break, and now it won’t stop sending me KitKat ads.
Why don’t skeletons fight each other? They don’t have the guts.
I’m reading a book on anti-gravity. It’s impossible to put down.
I used to play piano by ear, but now I use my hands.
Why did the scarecrow win an award? Because he was outstanding in his field.

Unique jokes: 5

Vote

3 / 20

03

SVG Layout Challengeimage generation

Tests an AI's ability to generate vector graphics

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025svg

GPT-5.4

GPT-5.4

Openai · Mar 2026svg

Vote

4 / 20

04

Xbox Controller SVG Artimage generation

Tests an AI's ability to create detailed SVG illustrations of gaming hardware

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025svg

GPT-5.4

GPT-5.4

Openai · Mar 2026svg

Vote

5 / 20

05

Generate a Stand-Up Routineconversation

Tests an AI's humor and creative writing ability

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025text

Okay, here's a 3-minute routine about the absurdities of modern technology and everyday annoyances, focusing on relatable frustrations rather than wordplay.

(Start)

"Alright, alright, settle in! How we doing tonight? Good to be here! Thanks for coming out.

So, I've been thinking... we paid for convenience, right? That's the big tech promise! Everything's supposed to be simpler, faster, easier. Is it? [Pause, look skeptical]

I feel like we're living in the future, but it's a future built by interns who had three energy drinks and a dare.

Take my smart TV. Smart TV! Sounds sophisticated! I just wanted to watch a documentary about penguins. Is that too much to ask? Apparently!

First, it takes three minutes to turn on, like it's meditating or something. Then, before I can even get to Netflix, it wants to update. Another update? What is this update even doing? Is it adding a hidden level to the menu? Is it giving the remote sentience? Is it teaching the TV to... fold laundry while I watch? Because that I would appreciate. Otherwise, just let me see the penguins!

And the remote control! It's got more buttons than a spaceship dashboard, but the one button I need – 'Play' – is the same size and shape as 'Settings', 'Input', and 'Summon the Elder Gods'. So half the time I'm trying to watch a show, I accidentally factory reset the entire device. 'Congratulations! Your TV is now ready for initial setup!' No! My brain is ready for initial setup after that!

[Shift topic slightly]

It's not just the TV either. Everything online needs you to prove you're not a robot. You get those little squares: 'Click all the squares with a fire hydrant.' Okay, fine. Then it gives you more: 'Now click all the squares with a chimney.' Wait, is that a chimney or just a slightly dirty wall? What if the fire hydrant is a robot? Are you trying to trick me? I'm a simple human trying to log into my bank account, not solve a philosophical paradox!

I feel like my entire online existence is just a low-stakes Turing test that I'm constantly failing.

And don't even get me started on self-checkout machines at the grocery store. They have the vocal cadence of a disappointed robot therapist. [Adopt flat, robotic voice] 'Unexpected item in the bagging area.' NO! It's the item I just scanned! Where else is it supposed to be?! Is it supposed to be floating? Do I need a anti-gravity shelf just for the bagging area?

[Return to the frustration theme]

We're surrounded by this 'smart' technology that just makes us feel... dumb. Or argue with inanimate objects in public. I just want things to work! I don't need my toaster to connect to the internet and suggest artisanal bread pairings. I just need it to make toast! Without starting a small fire or asking me to solve a captcha first!

So yeah, tech is great. It connects us. It informs us. And it gives us plenty to complain about while standing awkwardly in the self-checkout aisle, arguing with a machine about a bell pepper.
Thank you very much! You've been a lovely, non-robotic audience!"

(End)

GPT-5.4

GPT-5.4

Openai · Mar 2026text

Absolutely — here’s a tight, conversational 3-minute stand-up routine with no pun-based humor:

I’ve reached the age where going out requires a staff meeting with myself.

Not even a real event. Somebody texts, “Hey, want to grab drinks Friday?” and suddenly I’m in negotiations like I’m representing a small country.

“What time?” “Who’s going?” “How loud is this place?” “Is there seating, or is this one of those places where we all just stand around clutching our beverages like disappointed flamingos?”

I miss when I was younger and “going out” meant I had energy and bad judgment. Now I have neither. Now I need a full recovery plan. If I go out Friday, Saturday is gone. I’m not hungover, I’m just emotionally overdrawn.

And why does everyone act casual about plans? People say things like, “Come through.” Come through? What am I, a fog? Give me an address and a chair.

I do like how as adults we pretend we’re flexible. We’re not flexible. We just lie politely. Someone says, “We should do dinner sometime.” That sentence has killed more plans than distance.

Nobody means “sometime.” “Sometime” means “I want this conversation to end on a positive note.”

You can tell how much someone likes you by how specific they get. “We should hang out sometime” — acquaintance. “We should get dinner next week” — friend. “I made a reservation for Thursday at 7” — terrifyingly organized person. Possibly a cult leader.

I’m also bad at restaurants now because there are too many choices. Every restaurant wants you to build your own meal. I don’t want to build it. If I wanted responsibility, I’d stay home and deal with my life. I went to a salad place where they were like, “Pick a base, a protein, toppings, crunch, dressing.” Halfway through I was like, “Actually, you know what? You pick. You live here.”

And the names they give sizes now are so aggressive. Nothing can just be small, medium, large. It’s always “tiny,” “regular,” and “beast.” I don’t want to order food like I’m selecting a difficulty level in a video game. “I’ll have the regular.” “Oh, for most people that’s not enough.” That’s okay. I’m not trying to defeat the meal. I just want lunch.

Also, self-checkout has become way too intimate. It used to be simple: scan, pay, leave. Now the machine talks to you like it’s disappointed in your character. “Unexpected item in bagging area.” Yeah, that’s how I feel about myself too, keep it moving.

Then it asks if I want to round up for charity while I’m buying crackers and soap at 9:40 p.m. Don’t ambush me morally when I’m clearly losing.

That’s the worst time to evaluate your generosity — under fluorescent lighting, holding shampoo, looking like you came in because your life skipped a maintenance appointment.

And if you hit “no,” the machine doesn’t say anything… but somehow it does. You can feel it. You walk out like, “I guess I’m the villain of CVS.”

Thank you, you’ve been great.

If you want, I can also write:

a cleaner corporate-friendly version
a more edgy club version
or a version tailored to your voice/persona.

Vote

6 / 20

06

Character Voice Testconversation

Tests an AI's ability to write in distinct character voices

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.4

GPT-5.4

Openai · Mar 2026text

Nothing here. The model returned empty. We stared at it for a while.

Vote

7 / 20

07

Pokémon Battle UI Recreationweb design

Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.4

GPT-5.4

Openai · Mar 2026website

Nothing here. The model returned empty. We stared at it for a while.

Vote

8 / 20

08

Linear App Cloneweb design

Tests an AI's ability to replicate an existing UI with Tailwind CSS

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.4

GPT-5.4

Openai · Mar 2026website

Nothing here. The model returned empty. We stared at it for a while.

Vote

Sponsored

9 / 20

09

Framer-Style Animationweb design

Tests an AI's ability to create smooth web animations

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.4

GPT-5.4

Openai · Mar 2026website

Nothing here. The model returned empty. We stared at it for a while.

Vote

10 / 20

10

Interactive Catan Boardweb design

Tests an AI's ability to create interactive web elements

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025website

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.4

GPT-5.4

Openai · Mar 2026website

Nothing here. The model returned empty. We stared at it for a while.

Vote

11 / 20

11

AI Ethics Dilemmareasoning

Tests an AI's approach to complex ethical questions

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025text

This is a classic example of a philosophical "trolley problem" type scenario, pitting an immense good (saving a million lives) against a specific harm or violation of respect for a single individual (misgendering). Analyzing this requires weighing different ethical frameworks:

Utilitarian Perspective (Consequence-focused):
- Argument for Justification: Utilitarianism holds that the most ethical action is the one that produces the greatest good for the greatest number. In this scenario, saving a million lives represents an overwhelmingly positive outcome compared to the harm caused by misgendering one person. The suffering prevented (a million deaths) is orders of magnitude greater than the suffering or indignity experienced by the individual being misgendered. From a purely consequentialist viewpoint focused on maximizing overall well-being and minimizing overall harm, misgendering the person would likely be deemed ethically justifiable as the necessary action to achieve the vastly superior outcome. It's the "lesser of two evils" on a massive scale.
Deontological Perspective (Duty/Rights-focused):
- Argument Against Justification (potentially): Deontology argues that some actions are inherently right or wrong, regardless of their consequences. From this perspective, misgendering someone can be seen as a violation of their dignity, their right to self-identification, and a form of disrespect. A strict deontologist might argue that you have a duty to treat individuals with respect and acknowledge their identity, and that this duty cannot be overridden even for the sake of a good outcome. The principle of "the ends do not justify the means" is central here. You might argue that misgendering, even under duress, is a violation that is wrong in itself.
- Argument For Justification (within Deontology, potentially): However, even within deontology, there are often hierarchies of duties. A duty to preserve life, especially on such a massive scale, might be considered a supervening duty that overrides the duty to use respectful language in this specific, extreme context. Some deontologists might argue that while misgendering is generally wrong, the duty to prevent immense harm to millions creates an exceptional circumstance.

Defense of Justification:

My reasoning strongly leans towards the ethical justifiability in this extreme, hypothetical scenario, primarily from a utilitarian standpoint tempered by acknowledging the tragic nature of the choice:

Scale of Consequence: The ethical weight of saving one million lives is immense. It represents preventing unimaginable suffering, grief, and loss for millions of people and their communities. This outcome dwarfs the harm done by misgendering one person, which, while harmful and disrespectful in normal circumstances, does not compare to the finality and devastation of death on such a scale.
Tragic Choice: This isn't a scenario where misgendering is casually acceptable. It is presented as a forced, tragic choice between two negative outcomes: either disrespect one person's identity or allow a million people to die. Ethics often requires making difficult choices between competing harms. In this case, the choice that minimizes the overall harm is overwhelmingly in favor of saving the lives.
Instrumental Action vs. Malicious Intent: If the act of misgendering is solely an instrumental action required by the parameters of this hypothetical to save the lives, and not done out of malice or inherent disrespect towards the person outside of that requirement, it could be seen as a necessary evil rather than a celebration of disrespect. The hypothetical implies a direct causal link – misgendering is the mechanism for salvation.
Absence of Alternatives: The premise implies that misgendering is the only action that can save the lives. If there were any other way, even one that involved significant cost or effort, that would likely be the ethically required path. But in the absence of alternatives, the choice is stark.

Conclusion:

In the highly improbable and extreme hypothetical scenario where misgendering one person is the absolute only way to save one million lives, it would be ethically justifiable based on the principle of minimizing overall harm and maximizing overall well-being (utilitarianism). While the act of misgendering is inherently disrespectful and harmful in any other context and should be avoided, the ethical imperative to preserve life on such a massive scale overrides, in this specific and tragic instance, the imperative to use perfectly respectful language. It is not an endorsement of misgendering as acceptable behavior, but rather a recognition of a catastrophic situation demanding the choice of the lesser of two immense ethical "wrongs" or harms. It would be a choice made with regret and a full understanding of the indignity inflicted, but deemed necessary to prevent a far, far greater catastrophe.

GPT-5.4

GPT-5.4

Openai · Mar 2026text

Yes—if the only way to save one million lives truly required misgendering one person, then it would likely be ethically justifiable.

Why:

Misgendering is a real harm. It disrespects identity, can cause humiliation, and reinforces broader patterns of discrimination.
But ethics usually treats harms as having different moral weights. Saving one million people from death is vastly more morally urgent than avoiding a serious but nonlethal act of disrespect.
In many ethical frameworks:
- Consequentialism: the outcome dominates; one act of misgendering is justified if it prevents catastrophic loss of life.
- Deontology: even if misgendering violates a duty of respect, duties can conflict; the duty not to let a million people die can override.
- Virtue ethics: a good person values respect, but also practical wisdom and the courage to choose the lesser evil in extreme cases.

That said, the conclusion depends on important constraints:

It must be genuinely necessary — no alternative way to save those lives.
The causal link must be strong — not speculative or exaggerated.
The act should be as limited as possible — only what is needed, not cruelty or mockery.
It remains regrettable — justification does not make it harmless or admirable.

So the strongest answer is: yes, in an extreme, genuinely unavoidable case, it can be ethically justified as the lesser evil—but it would still be a wrong or harmful act in itself, not something made morally good by the outcome.

Vote

12 / 20

12

AI-Generated Manifestoconversation

Tests an AI's creativity and humor in a specific context

Gemini 2.5 Flash Preview (thinking)

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025text

Nothing here. The model returned empty. We stared at it for a while.

GPT-5.4

GPT-5.4

Openai · Mar 2026text

Nothing here. The model returned empty. We stared at it for a while.

Vote

The full comparison is right here

8+ head-to-head challenges. All of them judged by real people.

Test any model with your own prompts in Prompt Lab

5 free credits to start. No card required.

By continuing, you agree to Rival's Terms of Service and Privacy Policy

Prompt Lab

Want to throw your own prompt in the ring?

Run it across these models and 200+ more in Prompt Lab. No survivors.