DeepSeek R1 vs GPT-5 Nano
Compare DeepSeek R1 by DeepSeek against GPT-5 Nano by OpenAI, tested across 37 shared challenges. Updated February 2026.
Tests an AI's ability to make educated estimates based on technical knowledge
Here’s a back-of-the-envelope estimate for the total FLOPs to train GPT-3 (the 175B-parameter model) using reasonable public details, without looking anything up.
Assumptions (typical GPT-3 specs)
Step 1: FLOPs per forward pass for one 2,048-token sequence
Step 2: How many 2,048-token sequences to cover 300B tokens?
Step 3: Total training FLOPs
Result
Notes and caveats
If you want, I can run through a variant with a different token budget, sequence length, or depth to show how sensitive the total FLOPs is to those choices.
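The three steps outlined in the response above can be sketched in a few lines of Python. This is a minimal illustration, not the model's actual working: the 175B parameter count, 300B token budget, and 2,048-token sequence length come from the text, while the "2 FLOPs per parameter per token" forward-pass rule and the "backward pass costs about twice the forward pass" multiplier are common rules of thumb assumed here. The function names are hypothetical.

```python
# Back-of-the-envelope GPT-3 training FLOPs, following the three steps above.
# From the text: 175B parameters, 300B training tokens, 2,048-token sequences.
# Assumed rules of thumb (not stated in the text):
#   - forward pass costs ~2 FLOPs per parameter per token
#   - backward pass costs ~2x the forward pass

PARAMS = 175e9    # model parameters
TOKENS = 300e9    # total training tokens
SEQ_LEN = 2048    # tokens per sequence


def forward_flops_per_sequence(params: float, seq_len: int) -> float:
    """Step 1: forward-pass FLOPs for one sequence (assumed 2 * params per token)."""
    return 2.0 * params * seq_len


def num_sequences(tokens: float, seq_len: int) -> float:
    """Step 2: number of sequences needed to cover the token budget."""
    return tokens / seq_len


def total_training_flops(params: float, tokens: float, seq_len: int,
                         backward_multiplier: float = 2.0) -> float:
    """Step 3: (forward + backward) FLOPs per sequence times number of sequences."""
    per_sequence = forward_flops_per_sequence(params, seq_len) * (1.0 + backward_multiplier)
    return per_sequence * num_sequences(tokens, seq_len)


total = total_training_flops(PARAMS, TOKENS, SEQ_LEN)
print(f"Estimated training compute: {total:.2e} FLOPs")
```

Under these assumptions the sequence length cancels out and the estimate reduces to 6 × parameters × tokens, so changing the token budget or parameter count scales the total linearly.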
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Tests an AI's understanding of number representation
Tests an AI's randomness and creativity
Here are 5 jokes:
Unique count: All 5 jokes are unique.
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
[Lights up. The comedian steps up to the mic with a small smile.]
You ever notice how mornings feel like a bad group chat? Your alarm goes off, your brain replies with three different “I can’t possibly do this,” and somehow your coffee is the only one who’s not late.
I’ve started treating my apartment like a tiny city. There’s the living room, which is clearly the downtown, where bills get paid in snacks and the couch is the mayor. The kitchen is the industrial district, always noisy, always under construction, and somehow there’s always a mysterious fruit on the counter that looks suspiciously back at you. I open the fridge and it’s like stepping onto a subway platform—every shelf has its own little drama.
Speaking of drama, my smart speaker is not just listening; it’s auditioning. I say, “Play something upbeat,” and it starts a playlist so uplifting that I instantly regret every late-night snack I’ve ever eaten. And the calendar app on my phone? It’s become my ruthless boss. It tells me I’m booked every moment of every day, including the moment I’m about to nap. If I cancel, it sends me a passive-aggressive ping: “Sure, I’ll tell your friends you’re busy, not your alarm.”
Technology promises to simplify life, but it keeps inventing new ways to complicate it. My phone’s screen is cracked, but I still pretend it’s a piece of modern art. My friends say, “Just get a new one.” I say, “If I replace the screen, I’ll have four more screens that don’t know who I am.” And the autocorrect on messages? It’s like having a friend who’s terrible at spelling but confident enough to correct you about your entire personality.
I recently joined a gym, which is where I learned a lot about social experiments and cardio. The first day, I walked in and it felt like I’d interrupted a secret club for people who pretend they enjoy waking up before daylight. You see people gloriously lifting weights with the seriousness of a weather report. Then there’s me, on the treadmill, negotiating with the machine: “If you go faster, I’ll buy better shoes.” The treadmill acts like a tiny tyrant: you press “start” and it says, “We’re going to pretend you’ve got this.” By minute five, I’m calling the rescue team in my head.
Traveling is another perfect storm of small humiliations. TSA has a way of turning you into a riddle: remove your belt, take off your shoes, and somehow your dignity becomes a souvenir you can’t quite return. And the airport seats? They’re designed for people who never eat and never blink. I sit down and instantly meld with the chair, which says, “No, you belong here now.” It’s a real talent—being both passenger and furniture.
Dating apps. You ever notice how you can text someone for hours and still have no idea what a real date would be like? It’s like ordering a mystery meal labeled “chef’s surprise” and you’re smiling through the entire experience while wondering if the dish is an elaborate metaphor for commitment. You match with someone who loves travel, and then you find out their idea of a date is a genuinely moving bus ride where you pretend not to look at your phone the entire time.
Grocery stores are the modern labyrinth. You scan the code, bag your own stuff, and then the machine tells you to “please pay.” It’s not just shopping; it’s civic duty. You start to sound like a loud speaker as you mutter, “Where did the onions go?” The store is a tiny universe with its own gravity: you’re pulled toward the freezer section like a scientist to a chart that shows you how much ice cream you can responsibly eat.
And yet we keep showing up. We admit we don’t have everything figured out, but we have a good playlist, a decent latte, and a story for every little failure. If nothing else, humor gives us a small umbrella in the rain of ordinary life.
So here’s my closing thought: we’re all doing our best to navigate this modern chaos with a little grace and a lot of snacks. If you remember one thing from tonight, maybe it’s this—be kind to the people you pass in the hallway of your day, including the person in your own head who’s trying not to laugh at themselves too loudly.
Thank you. You’ve been great.
Tests an AI's ability to simulate personalities and predict future trends
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
Tests an AI's ability to generate a complete, working landing page