OpenAI o4 Mini High
Compare OpenAI o4 Mini High by OpenAI against Claude Opus 4 by Anthropic. Both models have 200K context windows and are tested across 28 shared challenges. Updated February 2026.
In 6 community votes, Claude Opus 4 wins 60% of head-to-head duels. Claude Opus 4 leads in Image Generation. Results are based on blind community voting from the RIVAL open dataset of 6+ human preference judgments for this pair.
28 challenges
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Sally has exactly one sister. Explanation: the "two sisters" each brother sees are Sally plus one other girl. So there are only two girls in the family – Sally and her one sister.
I need to work through this step-by-step.
Given information:
Since Sally is one of the sisters that each brother has, and each brother has 2 sisters total, that means there must be exactly 2 sisters in the family.
The 2 sisters are:
Therefore, Sally has 1 sister.
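The counting argument in both answers can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sisters_of_a_girl` and its parameter are hypothetical, and the only fact it takes from the answers above is that each brother has 2 sisters.

```python
def sisters_of_a_girl(sisters_per_brother: int) -> int:
    """Return how many sisters a girl has, given how many sisters each brother has.

    Hypothetical helper for illustration only.
    """
    if sisters_per_brother < 1:
        raise ValueError("the family must contain at least one girl")
    # Every brother counts all the girls in the family as his sisters.
    girls_in_family = sisters_per_brother
    # A girl does not count herself among her own sisters.
    return girls_in_family - 1


print(sisters_of_a_girl(2))  # each brother has 2 sisters, so Sally has 1
```

The result does not depend on how many brothers there are, which is why the puzzle's brother count can distract from the answer.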
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
Hey everyone, how's it going tonight? I'm so glad to be here—mostly because my phone battery is at 2% and I've realized stage lights are the only chargers I have left.
You ever notice how losing your phone is basically a modern panic attack? You pat every pocket like you're defusing a bomb: "Is it in my jeans? The couch? The fridge—because that's obviously where I left it during last night's snack-fest?" And once you actually find it, it's like reuniting with a needy ex. It's vibrating, chirping, begging for updates you don't even remember downloading.
My daily cardio routine is opening apps. Instagram, TikTok, Twitter—two hours later I'm deeply invested in someone's tutorial on how to train your cat not to vomit on your favorite rug. Have you ever tried replying to a midnight group chat? Your thumbs start logging miles on the keyboard, you wake up to 47 notifications… none of which you can respond to before coffee. Coffee, by the way, is the only adult decision I'll make before 9 a.m. It's judged if you skip it, celebrated if you have three cups, and somehow worth its weight in gold.
Speaking of independence, I attempted self-checkout at the grocery store. Bad move. I scanned a cucumber ten times. The machine yells, "Unexpected item in the bagging area!" I hold it up like it's contraband. There's a security guard in the aisle giving me that "Not again, Greg" look. Finally a cashier comes over, fixes it in two seconds, and I swear the whole queue applauded. Self-checkout technology—it's there just to remind you you're terrible at adulting.
Then there's online dating, which I tried as sort of an outsourced therapy session. On my first date, I show up ten minutes late because my phone was desperately clinging to 30%. We start chatting, and ten minutes in, I realize I'm texting my mom under the table. My date gives me the side-eye: "Everything okay?" And I'm like, "Sure, I just need to make sure mom knows I didn't get abducted." Nothing kills romance like a group chat with your parents.
I've seen a therapist, too. He said, "Try journaling your feelings." I took it literally—now I show up with bullet points, bar graphs, and color-coded mood charts. Last session he said, "Put the pie charts away." I think I saw a tear in his eye.
So here I am, my phone gasping its last breath, my lettuce roasted by fluorescent lights, and my therapist off maternity leave soon. If anyone's got a charger, a cucumber that scans itself, or a date who doesn't judge midnight texts to your mom, I'm all ears. You've been awesome—thank you, good night!
"Modern Life is Weird"
[Walk on stage]
You know what's crazy about being an adult? Nobody tells you that half of it is just googling things you're too embarrassed to ask real people. Last week I googled "is it normal for my knee to sound like bubble wrap?" The answer was either "yes, totally fine" or "you have six weeks to live." Thanks, internet!
[Pause]
I went to the doctor about it. You know you're getting old when your doctor looks like they should be asking YOU for ID to buy alcohol. This kid walks in with a stethoscope, and I'm like, "Did you just graduate... from middle school?"
But here's what really gets me – why do we act completely insane at the doctor's office? They ask, "How much do you drink?" and suddenly I'm doing calculus in my head. "Well, let's see, carry the one... I have a glass of wine with dinner." Yeah, a glass. It's just a really big glass. It's a vase, actually. I drink wine from a vase.
[Mime drinking from large container]
And don't get me started on grocery shopping. Why do I need a PhD to buy shampoo? There's moisturizing, volumizing, clarifying, fortifying... I just want my hair to be clean! I don't need it to solve my emotional problems!
Plus, self-checkout is a trap. They put you in charge of scanning your own groceries like you work there. Then the machine starts screaming "UNEXPECTED ITEM IN BAGGING AREA!" Yeah, it's unexpected because I don't work here! I'm doing my best, robot!
[Pause for effect]
The worst part is when you're buying embarrassing things. Last week I had toilet paper, ice cream, and cat food. The machine froze, and the teenage employee had to come over. He looked at my items and I could see him thinking, "This person's having a rough week."
[Shrug]
Speaking of technology, my phone knows too much about me. It autocorrects "running late" before I even type it because that's apparently my catchphrase now. My maps app just assumes I'm lost. It doesn't even wait for me to ask for directions anymore. I'll be sitting on my couch and it's like, "Turn right to get to your kitchen."
And what's with password requirements these days? "Must contain uppercase, lowercase, numbers, symbols, the blood of your firstborn, and a haiku about your childhood fears." Then they're like, "Don't write it down!" Where am I supposed to store this information? I can't remember why I walked into a room!
[Pace across stage]
You know what else nobody prepared me for? How excited I'd get about cancelled plans. Someone texts "Hey, can we raincheck?" and I'm already in my pajamas like "Oh no! What a shame! I was so looking forward to putting on real pants!"
Because going out now requires preparation. I need three business days' notice. I need to charge my social battery. I need to practice conversation topics. Remember when we could just... show up places?
[Final beat]
The truth is, we're all just walking around pretending we know what we're doing. But at least we're all confused together. And hey, if you ever see someone at the store at 10 PM buying wine in a box and googling "why does my back hurt when I breathe?" – just give them a nod. We're all in this together.
[Pause]
Thank you, you've been a wonderful audience! Don't forget to validate your parking – unlike my life choices!
[Exit]
Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.
Recreate an interactive, classic Mario level in a single HTML file.
Tests an AI's ability to replicate an existing UI with Tailwind CSS
Tests an AI's ability to create interactive web elements
Tests an AI's ability to create a detailed world map in SVG format
Generate a single-page, self-contained HTML webapp using Tailwind CSS for a randomly chosen category/industry/niche.
Generate SVG art of a randomly chosen animal in a setting of its choosing.
Generate a unique and simple recipe with common ingredients.
Cook the pasta according to package directions in salted water until al dente. Reserve 1 cup of pasta water before draining.
Heat olive oil in a large skillet over medium heat. Add minced garlic and red pepper flakes. Sauté for 1-2 minutes until fragrant but not brown.
Add the tomatoes with their juice to the skillet. Simmer for 5-7 minutes, breaking up larger pieces with a spoon. Season with salt.
Toss the pasta with the tomato mixture. Add pasta water gradually to achieve desired consistency.
Remove from heat and stir in half the Parmesan cheese. Serve immediately topped with remaining cheese.
Serves: 4
Why "Midnight Comfort"? Because this is the pasta you make when it's late, you're hungry, and these ingredients are always waiting in your pantry!