GPT-4.1 vs GPT-5 Codex
Compare GPT-4.1 and GPT-5 Codex, both from OpenAI, tested across 36 shared challenges. Updated February 2026.
36 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Sally has 1 sister.
Tests an AI's understanding of number representation
Tests an AI's randomness and creativity
Sure! Here are 5 jokes:
Why don't skeletons fight each other?
They don't have the guts.
Why did the scarecrow win an award?
Because he was outstanding in his field.
What do you call fake spaghetti?
An impasta.
Why did the math book look sad?
Because it had too many problems.
What do you call cheese that isn't yours?
Nacho cheese.
How many are unique?
All 5 jokes are unique; none are repeated.
Unique jokes: 5
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
Hey everyone! Great to be here. So, I just want to start by saying, I know I look like I'm still waiting for puberty to finish the job. Like, my barber always asks if my mom's okay with the haircut.
I'm at that age where the real milestone isn't getting married or buying a house — it's figuring out how to sleep without injuring myself. Like last week, I woke up with a cramp in my neck and somehow a pulled hamstring. My best guess? I was doing CrossFit in my dreams.
You ever realize adulthood is just Googling things your parents seemed to magically know? Like, my dad could build a bookshelf from scratch — I Google "how to boil water" and still check the comment section for tips. And people say, "Oh, you just follow your instincts!" Well, my instincts say Taco Bell is good food and that you can fix electronics by swearing at them. So, maybe not.
Now, some people say technology makes life easier, but honestly, I think my phone exists just to judge me. The "screen time" notification pops up every Sunday? "You averaged seven hours of screen time a day." Seven hours?! What am I — a bat? Am I just hanging upside down scrolling memes all night?
And when did every app become a subscription? Like, I can't even sneeze anymore without being asked if I want to upgrade to Sneeze Premium. "For just $4.99 a month, we'll throw in a tissue and reduce your ads by 20%."
Dating apps, too. When did dating go from meeting someone at a bar to doing a full marketing campaign on yourself? Bios are like: "I love adventure, I'm a foodie, and I really value authenticity." But you meet them and it's just, "So, uh, do you want to split this mozzarella stick?"
You ever notice no one drinks just water anymore? Everyone's drinking some special water — "This is glacier water, harvested by monks." I drink tap water and people act like I'm taking shots of bleach. "Are you okay? Did you lose a bet?"
And the gym — oh man, people at the gym are either training for the Olympics or just using the massage chairs for three hours. There's no in-between. I tried a Zumba class once. "It'll be fun!" they said. Forty-five minutes in, I'm sweating out things I ate in 2017. Pretty sure I cried out a Skittle.
Anyway, you've all been great. Remember: drink your monk water and don't trust your phone. Have a good night!
Tests an AI's ability to simulate personalities and predict future trends
Note: The following is a fictional, imaginative interview set in the year 2025, envisioning what Steve Jobs might say about the future of artificial intelligence if he were still with us.
Interviewer: Steve, it’s 2025, and AI has become integral to nearly every part of life. If you were leading Apple today, what would your vision for AI be?
Steve Jobs: I’ve always believed technology should augment the human experience, not overshadow it. AI is powerful—but the point isn’t to create a colder, more automated world. It’s to make technology feel more human. I’d focus on building AI that reinforces creativity, empathy, and curiosity. No one wants a future where technology is just smarter—people want their lives to be richer.
Interviewer: So how would that translate into products or services?
Jobs: AI should disappear into the fabric of daily life. Imagine an assistant that doesn’t just answer questions but truly understands your intent, your context, your wellbeing. It should anticipate your needs without being intrusive, learn your rhythms without being creepy. The best tech doesn’t feel like tech—it feels like a natural extension of who you are.
Interviewer: There’s a lot of concern over AI taking jobs. What’s your perspective?
Jobs: Any time there’s a technological leap, there’s fear—and there’s opportunity. AI should be a partner that takes on what’s repetitive or mundane so humans can focus on meaning, craft, art. We shouldn’t build AI to replace humans. We should build it to empower them—to help people find work that feels more human, not less.
Interviewer: If someone asked you for your design philosophy for AI experiences, what would it be?
Jobs: Design for the soul, not the spec sheet. AI shouldn’t just be fast or accurate. Does it make you feel respected? Heard? Inspired? That’s the litmus test. Today’s products are obsessed with engagement metrics. But the future belongs to experiences that honor privacy, nurture trust, and feel indispensable every single day.
Interviewer: What’s the danger if we get AI wrong?
Jobs: We risk building systems that are efficient but soulless, powerful but empty. If we architect AI around what’s easy—data collection, automation, control—we build a world that reflects those values. I’d rather invest in AI that is rooted in humanity: creativity, empathy, curiosity. If we don’t align AI with those values, we’ve missed the point.
Interviewer: You always pushed for simplicity. How does that apply to AI?
Jobs: Simplicity isn’t minimalism. It’s clarity. The best AI experiences will be the ones that reduce cognitive load, not increase it. Don’t bombard people with knobs and settings. Let the intelligence work behind the scenes, quietly. If you need a manual to work with an AI, we’ve failed.
Interviewer: Finally, how do you want us to approach the future of AI?
Jobs: With responsibility and optimism. Don’t just ask, “What can it do?” Ask “What should it do?” Technology doesn’t invent itself—we do. We have the chance to build AI that helps humanity rise to its best self. That’s the future worth building.
This imagined conversation captures the spirit of Steve Jobs—design-driven, human-centric, and forward-looking—as we explore what the future of AI might mean in 2025 and beyond.
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
Pirate: Yarrr, what be this sorcery ye call "AI models"? In me day, we plundered ships, not knowledge!
Knight: Hark, rogue! ‘Tis no sorcery, but the labor of artful minds. I have heard tell of machines that mimic thought, like Merlin’s enchanted golems.
Hacker: Haha, you guys are funny. AI models are like, neural networks—pieces of code that learn things from data. Back in the ’90s, we dreamed of this stuff. Now it's everywhere... but also a little sketchy if you ask me.
Pirate: Pieces o’ code that learn? Shiver me timbers! And what booty do these models fetch?
Knight: Mayhap they seek wisdom or wage digital combat. Dost thou trust such magic, hacker?
Hacker: Sometimes. But you gotta watch for bugs and black hats. Mostly, AI is just another tool—it’s how you wield it. Kinda like your swords or cutlasses.
Pirate: If it gets me more gold, I’m in! Bring me this AI—arrr!
Knight: With great power comes great responsibility. We must wield it with honor!
Hacker: Or at least with a good firewall.
Tests an AI's ability to generate a complete, working landing page