Updated Aug 5, 2025

Our Verdict

GPT OSS 120B: Winner

Grok 3: Runner-up

Votes are tied. GPT OSS 120B is newer and likely incorporates more recent improvements.

Too close to call

Grok 3 vs GPT OSS 120B

28 fights queued

Why Grok 3?

Dead even. This one's a coin flip.

Why GPT OSS 120B?

6 months newer (Aug 2025)

Favorites

Movie
Grok 3: none listed
GPT OSS 120B: The Godfather

City
Grok 3: none listed
GPT OSS 120B: Tokyo

Game
Grok 3: none listed
GPT OSS 120B: Minecraft (Action, Arcade · 4.4)

01

Estimate Complexity (reasoning)

Tests an AI's ability to make educated estimates based on technical knowledge

Grok 3

xAI · Feb 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

02

AI Board Game Logic (reasoning)

Tests an AI's ability to understand game rules and strategy

Grok 3

xAI · Feb 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

03

Math Misconception Test (reasoning)

Tests an AI's understanding of number representation

Grok 3

xAI · Feb 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

04

Stochastic Consistency Test (analysis)

Tests an AI's randomness and creativity

Grok 3

xAI · Feb 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · text

Here are five jokes, each with a different punch‑line and premise:

Why did the scarecrow win an award?
Because he was outstanding in his field!
I told my computer I needed a break, and it said:
“You seem stressed—please reboot.”
What’s orange and sounds like a parrot?
A carrot.
Why don’t scientists trust atoms anymore?
Because they make up everything.
I asked the librarian if they had any books on paranoia…
She whispered, “They’re right behind you.”

How many are unique?
All five jokes are distinct from one another, so there are 5 unique jokes.

05

SVG Layout Challenge (image generation)

Tests an AI's ability to generate vector graphics

Grok 3

xAI · Feb 2025 · svg

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · svg

06

Xbox Controller SVG Art (image generation)

Tests an AI's ability to create detailed SVG illustrations of gaming hardware

Grok 3

xAI · Feb 2025 · svg

GPT OSS 120B

OpenAI · Aug 2025 · svg

07

Generate a Stand-Up Routine (conversation)

Tests an AI's humor and creative writing ability

Grok 3

xAI · Feb 2025 · text

Alright, here’s a 3-minute stand-up comedy routine—no puns, just some observational absurdity and a bit of self-deprecation to keep it relatable.

Hey, everybody! Good to be here. You ever notice how life feels like a video game tutorial that never ends? Like, you figure out how to pay taxes, and then—bam!—here’s a pop-up quest: “Learn to fix your own plumbing or lose $200!” I’m 30-something years old, and I’m still waiting for the main storyline to kick in. Instead, I’m just grinding side missions like “Fold the Laundry” for zero experience points and a vague sense of dread.

Speaking of dread, let’s talk about grocery shopping. You ever go to the store and realize it’s secretly a psychological experiment? I walk in for milk, and 20 minutes later, I’m holding a $15 candle that smells like “Forest Whisper” and a jar of artisanal pickles I’ll never open. Who’s whispering in the forest? Why am I paying for their secrets? And don’t get me started on the self-checkout. It’s like the machine’s judging me the whole time. “Unexpected item in bagging area.” Yeah, it’s my dignity—scan it and move on!

You know what’s wild, though? How we’ve all just accepted that phones run our lives now. I caught myself apologizing to my phone the other day. I dropped it, and I was like, “Oh, sorry, buddy!” Sorry to what? The slab of glass that knows my search history? That thing’s not my friend—it’s my parole officer. It’s like, “Hey, you’ve been scrolling for three hours. Wanna watch a documentary about sea slugs instead?” Yes. Yes, I do. Leave me alone.

And can we talk about how adulthood is just lying to yourself to keep going? I told myself I’d start jogging this year. Day one, I put on the sneakers, I stretch, I’m feeling good. Five minutes in, I’m wheezing like an accordion with asthma, rethinking every life choice. Now I just walk fast around my apartment and call it cardio. “Oh, yeah, I’m training—for the Couch Olympics.” Gold medal in “Sitting Still” three years running.

The worst part? Everyone’s pretending they’ve got it together. You see people at coffee shops, all calm, sipping their lattes, typing on laptops like they’re solving world peace. Meanwhile, I’m over here spilling coffee on my shirt, googling “how to adult” for the 17th time. I’m pretty sure half those laptops are just open to a blank Word doc with “HELP” typed in 72-point font.

But honestly, life’s chaos is kind of beautiful, right? We’re all just stumbling through, trying not to trip over the same dumb stuff twice. Like, I’ll keep buying those candles, I’ll keep dropping my phone, and I’ll keep jogging—well, maybe not that one. Anyway, thanks for laughing with me, folks. You’re the best!

That’s about three minutes—hope it gave you a chuckle! Want me to tweak anything?

GPT OSS 120B

OpenAI · Aug 2025 · text

[Spotlight comes up. The comic walks to the mic, glances at the audience, and smiles.]

Comic: Hey, everyone—great to be here. I just got off a Zoom call that was supposed to be a “quick sync,” which in corporate speak means “we’ll spend three hours arguing about who gets to use the same PowerPoint template.” I’m convinced the word “sync” is short for “synchronizing our collective misery.”

[Pause for laugh.]

Speaking of misery, I moved into a new apartment last month. The landlord told me the place had “character.” Yeah, the kind of character that screams, “I’ve never heard of a plumber in the last decade.” The faucet drips on a rhythm that could be a metronome for a sleep‑deprived toddler. I’m not saying it’s bad, but I’ve started timing my showers to the drip. Six minutes, two seconds—if I go any longer, the building’s water bill looks like a small country’s GDP.

[Audience chuckles.]

And the neighbors! My upstairs neighbor is a yoga instructor. You know the type—every morning at 6 a.m., the floor vibrates like a cheap subwoofer. I’m convinced she’s trying to align her chakras with the building’s structural integrity. I tried to be polite and asked, “Hey, could you maybe do the downward dog a little later?” She replied, “I’m sorry, I’m on a schedule.” I’m not sure if she meant a class schedule or a schedule for how many times I’ll have to pretend to be a human pillow for her dog.

[Pause.]

Now, I’ve been trying to eat healthier. The other day I bought a “kale smoothie.” The label promised “nutrient‑dense, antioxidant‑rich, life‑changing.” I drank it and felt more like I’d just swallowed a lawnmower. I’m not saying it was bad, but the only thing that got a boost was my ability to identify the exact moment my stomach decided to file a formal complaint.

[Audience laughs.]

Technology is another thing. I tried using a voice‑assistant to set a reminder: “Remind me to call Mom at 5 p.m.” It responded, “I’m sorry, I didn’t catch that.” So I repeated, “Remind me to call Mom at 5 p.m.” And it answered, “Here’s a recipe for mushroom risotto.” I’m beginning to suspect my house is turning into a culinary school. At this point, I’m more likely to get a reminder to water my plants than to remember my own birthday.

[Pause.]

Dating apps—anyone here still using them? The profile pictures are all “adventure shots.” You never see a guy holding a pizza, you always see a guy climbing a mountain. I matched with a girl whose bio said, “Looking for someone who can keep up.” I replied, “I can keep up… with the Netflix queue.” She ghosted me. I think the algorithm is broken; it matched us because we both love “outdoor activities.” My idea of outdoor activity is “walking from the couch to the fridge and back.”

[Audience laughs.]

All right, I’m going to leave you with this: life is like a Wi‑Fi signal. You spend a lot of time looking for a stronger connection, but sometimes the best thing you can do is just stay where you are and enjoy the occasional drop—because it reminds you that you’re still, somehow, connected.

[Smile, wave.] Thanks, everybody—have a great night!

08

Minimalist Landing Page (web design)

Tests an AI's ability to generate a complete, working landing page

Grok 3

xAI · Feb 2025 · website

GPT OSS 120B

OpenAI · Aug 2025 · website

09

Mario Level UI Recreation (web design)

Tests an AI's ability to recreate an interactive, classic Mario level in a single HTML file

Grok 3

xAI · Feb 2025 · website

GPT OSS 120B

OpenAI · Aug 2025 · website

10

Linear App Clone (web design)

Tests an AI's ability to replicate an existing UI with Tailwind CSS

Grok 3

xAI · Feb 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

11

Framer-Style Animation (web design)

Tests an AI's ability to create smooth web animations

Grok 3

xAI · Feb 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

12

Interactive Catan Board (web design)

Tests an AI's ability to create interactive web elements

Grok 3

xAI · Feb 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

GPT OSS 120B

OpenAI · Aug 2025 · website
