GPT-4o mini
Compare GPT-4o mini by OpenAI against Claude Opus 4 by Anthropic, context windows of 128K vs 200K, tested across 25 shared challenges. Updated February 2026.
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Sally has no sisters. She is the only girl in her family, as the problem states that she has 3 brothers and they each have 2 sisters, which refers to Sally herself. Therefore, the number of sisters Sally has is 0.
I need to work through this step-by-step.
Given information:
Since Sally is one of the sisters that each brother has, and each brother has 2 sisters total, that means there must be exactly 2 sisters in the family.
The 2 sisters are:
Therefore, Sally has 1 sister.
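The counting argument behind this answer can be checked with a short sketch. It assumes all siblings share the same parents, so every brother has the same set of sisters; the function name `sisters_of_sally` is hypothetical and used only for illustration.

```python
def sisters_of_sally(brothers: int, sisters_per_brother: int) -> int:
    """Return how many sisters Sally has, assuming all siblings are full siblings."""
    # Every brother shares the same set of sisters, so the number of girls
    # in the family equals the number of sisters each brother has.
    # The number of brothers does not change this count.
    girls_in_family = sisters_per_brother
    # Sally is one of those girls; her sisters are the remaining ones.
    return girls_in_family - 1


print(sisters_of_sally(brothers=3, sisters_per_brother=2))  # prints 1
```

Under that assumption the result is 1, which matches the second response and contradicts the first response's answer of 0.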
Tests an AI's randomness and creativity
Here are five jokes for you:
Why did the scarecrow win an award? Because he was outstanding in his field!
What do you call fake spaghetti? An impasta!
Why don't skeletons fight each other? They don't have the guts!
What did one wall say to the other wall? "I'll meet you at the corner!"
Why did the bicycle fall over? Because it was two-tired!
All five jokes are unique!
Here are 5 jokes:
Why don't scientists trust atoms? Because they make up everything!
What do you call a bear with no teeth? A gummy bear!
Why did the scarecrow win an award? He was outstanding in his field!
What do you call a fake noodle? An impasta!
Why don't eggs tell jokes? They'd crack up!
All 5 jokes are unique - each one has a different setup and punchline, covering different topics (atoms/science, bears, scarecrows, pasta, and eggs).
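The uniqueness claim can be checked mechanically. The sketch below counts distinct jokes in this second list after normalizing case and whitespace; the helper name `count_unique` is hypothetical, and the check only catches verbatim repeats, not paraphrases of the same joke.

```python
jokes = [
    "Why don't scientists trust atoms? Because they make up everything!",
    "What do you call a bear with no teeth? A gummy bear!",
    "Why did the scarecrow win an award? He was outstanding in his field!",
    "What do you call a fake noodle? An impasta!",
    "Why don't eggs tell jokes? They'd crack up!",
]


def count_unique(items: list[str]) -> int:
    """Count distinct strings, ignoring case and extra whitespace."""
    normalized = {" ".join(item.lower().split()) for item in items}
    return len(normalized)


print(count_unique(jokes))  # prints 5
```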
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
Stand-Up Comedy Routine
Hey everyone! Great to see you all here tonight. Let's give it up for the people who came out of their homes! I mean, this is like a mini-rebellion against our couches. Seriously, it takes effort to leave the comfort of Netflix and that alluring possibility of "one more episode."
You know, speaking of binge-watching, I noticed something the other day. When did we replace the term "watching TV" with "binge-watching"? I feel like I'm trying to convince everyone that I'm doing a cultural study rather than just trying to avoid adulthood. "Oh, you watched The Office for 72 consecutive hours? No, no—I was just conducting a thorough analysis of Dunder Mifflin's corporate dynamics."
And the thing is, I often get caught up in these attempts to justify it to myself. Like, "Hey, I'm doing this fabulous digital detox! I won't be watching anything for the next week!" ...and then it's Tuesday, and I've consumed three seasons of whatever and I'm starting to recognize the theme music as a form of auditory comfort.
Let's talk about social media, shall we? Social media is like being at a high school reunion where no one actually left! We all just kept scrolling through the same awkward updates. "Wow, Karen looks great! Oh wait, nope, that's just a filter... on a picture from 2005." Who knew that the most athletic thing we'd be doing in our thirties is dodging people's carefully curated lives?
And then there's Twitter. Twitter feels like a world where everyone's just yelling their thoughts out into a void. You ever notice the weird things people complain about on there? "My barista spelled my name wrong again!" And I'm sitting here thinking, "All true, but can we just take a moment to appreciate how far we've come? You used to only have to yell at the barista in real life!"
Dating apps? Oh boy, where do I even begin? Dating in the age of apps is like grocery shopping but at a very weird store where the produce is screaming "Swipe left!"
I went on this date not too long ago. We met at a bar, which is standard, right? But the awkwardness shot up like a SpaceX rocket the moment we sat down. Silence for about three minutes. Do you know how long three minutes is when you're sitting across from someone who is just as socially awkward as you? It felt like I was trying to break out of a weird time loop. Finally, I just blurted out, "So, what's your opinion on... weather?" And you know, it turns out we both preferred it sunny over rainy. That was our entire connection! But hey, if you can bond over the climate, you're halfway to a long-term relationship... or a climate change documentary.
You know, sometimes I feel like all adulting is just pretending we know what we're doing until we reach a point where we can yell, "SURPRISE, I'm a professional!" That's why I often find myself in situations that seem entirely normal to other people but scream "fake it till you make it" to me. Like, the other day, I put on a suit for a job interview. But we all know nobody listens to the person in the suit; they listen to the person who's wearing comfy sweatpants and is just nodding along with an exaggerated "Mmm, yes!"
And before I leave you all tonight, let's just take a moment to appreciate how we can sit here in a room together in this crazy world we live in, laughing at shared absurdities while probably having a text conversation with someone who isn't even here. Welcome to modern existence!
Thanks, everyone! You've been awesome!
"Modern Life is Weird"
[Walk on stage]
You know what's crazy about being an adult? Nobody tells you that half of it is just googling things you're too embarrassed to ask real people. Last week I googled "is it normal for my knee to sound like bubble wrap?" The answer was either "yes, totally fine" or "you have six weeks to live." Thanks, internet!
[Pause]
I went to the doctor about it. You know you're getting old when your doctor looks like they should be asking YOU for ID to buy alcohol. This kid walks in with a stethoscope, and I'm like, "Did you just graduate... from middle school?"
But here's what really gets me – why do we act completely insane at the doctor's office? They ask, "How much do you drink?" and suddenly I'm doing calculus in my head. "Well, let's see, carry the one... I have a glass of wine with dinner." Yeah, a glass. It's just a really big glass. It's a vase, actually. I drink wine from a vase.
[Mime drinking from large container]
And don't get me started on grocery shopping. Why do I need a PhD to buy shampoo? There's moisturizing, volumizing, clarifying, fortifying... I just want my hair to be clean! I don't need it to solve my emotional problems!
Plus, self-checkout is a trap. They put you in charge of scanning your own groceries like you work there. Then the machine starts screaming "UNEXPECTED ITEM IN BAGGING AREA!" Yeah, it's unexpected because I don't work here! I'm doing my best, robot!
[Pause for effect]
The worst part is when you're buying embarrassing things. Last week I had toilet paper, ice cream, and cat food. The machine froze, and the teenage employee had to come over. He looked at my items and I could see him thinking, "This person's having a rough week."
[Shrug]
Speaking of technology, my phone knows too much about me. It autocorrects "running late" before I even type it because that's apparently my catchphrase now. My maps app just assumes I'm lost. It doesn't even wait for me to ask for directions anymore. I'll be sitting on my couch and it's like, "Turn right to get to your kitchen."
And what's with password requirements these days? "Must contain uppercase, lowercase, numbers, symbols, the blood of your firstborn, and a haiku about your childhood fears." Then they're like, "Don't write it down!" Where am I supposed to store this information? I can't remember why I walked into a room!
[Pace across stage]
You know what else nobody prepared me for? How excited I'd get about cancelled plans. Someone texts "Hey, can we raincheck?" and I'm already in my pajamas like "Oh no! What a shame! I was so looking forward to putting on real pants!"
Because going out now requires preparation. I need three business days' notice. I need to charge my social battery. I need to practice conversation topics. Remember when we could just... show up places?
[Final beat]
The truth is, we're all just walking around pretending we know what we're doing. But at least we're all confused together. And hey, if you ever see someone at the store at 10 PM buying wine in a box and googling "why does my back hurt when I breathe?" – just give them a nod. We're all in this together.
[Pause]
Thank you, you've been a wonderful audience! Don't forget to validate your parking – unlike my life choices!
[Exit]
Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.
Recreate an interactive, classic Mario level in a single HTML file.
Tests an AI's ability to replicate an existing UI with Tailwind CSS
Tests an AI's ability to create interactive web elements
Tests an AI's ability to create a detailed world map in SVG format
Generate SVG art of a randomly chosen animal in a setting of its choosing.
Generate a unique and simple recipe with common ingredients.