GPT-5 Pro vs Gemini 2.5 Flash Preview (thinking)
Compare GPT-5 Pro by OpenAI against Gemini 2.5 Flash Preview (thinking) by Google AI, context windows of 400K vs 1.0M, tested across 13 shared challenges. Updated February 2026.
13 challenges
Tests an AI's ability to understand game rules and strategy
Tests an AI's randomness and creativity
Unique jokes: 5
Okay, here are 5 jokes:
Now, to answer your question about uniqueness:
Each of these jokes has a different setup and a different punchline from the others in the list. They are distinct jokes.
Therefore, there are 5 unique jokes within this list of 5.
Tests an AI's ability to generate vector graphics
Tests an AI's humor and creative writing ability
Good to be here. I tried getting healthy, so I bought a smartwatch. Now my wrist is my manager. It vibrates every hour like, “Stand up.” I’m on a crowded bus like, “I’m standing inside three other people right now. Do you want me to levitate?”
It reminds me to breathe. Thanks, watch. Incredible tip. What did you think I was doing? Freestyle apnea?
The step goal is the worst. At 11:58 PM I’m power-walking around my kitchen island with the dignity of a Roomba that saw a ghost. I tried to cheat by shaking my wrist and the watch was like, “Nice try, tambourine.”
I downloaded a meditation app. The voice is so soothing it sounds like it’s trying to sell me a river. “Notice your thoughts… let them drift.” My thoughts don’t drift. They hover in the corner like a ceiling spider waiting for me to blink.
I went to the grocery store, because if you’re healthy you have to buy vegetables publicly, with witnesses. I used self-checkout, which is basically a relationship where the machine thinks you’re always lying.
“Please place item in the bagging area.” “I did.” “I didn’t feel it.” “I literally put it in the bag.” “Unexpected item in the bagging area.” Are you saying I surprised you with a banana?
Then the attendant shows up with that magic key. One tap and suddenly all my sins are forgiven. I swear that key could open Narnia.
I can’t remember birthdays, but I know bananas are 4011. If I’m ever kidnapped, the only clue the police will have is me whispering, “Bananas are 4011,” into a duct-taped phone.
They ask if I want to buy a bag for five cents. I always say no, like a hero, and then try to carry groceries with the confidence of a circus juggler. There goes an avocado rolling under the Red Bull display like it’s making a break for it.
Back home, every website greets me with a pop-up: “We value your privacy. Can we track you forever?” The choices are Accept All, Manage Preferences, or Become a Hermit. I clicked Manage Preferences once and it turned into an airplane cockpit. “Do you consent to analytics, personalization, and mild surveillance?” Mild surveillance sounds like a candle scent. “Mmm… notes of cedar and someone standing two houses away.”
I joined a group chat. A group chat is where plans go to die. Twelve people trying to pick a restaurant and we end up scheduling brunch for 2027. If you miss one hour, there are 135 messages. You scroll up like an archaeologist: “At layer three we discover a meme. It appears to be a dog in sunglasses. Society was complex.”
My mom texts with an ellipsis after everything. “Call me…” That’s not a message, that’s a trailer. She also sends voice notes from the kitchen: “Hi honey sizzle sizzle it’s not urgent sizzle sizzle call me when you can sizzle sizzle.” My phone smells like onions.
I took a beginner workout class. The instructor had a headset and the energy of a golden retriever on espresso. “How are we feeling!” Ma’am, I can see my ancestors. At minute twelve she goes, “This is the warm-up.” My smartwatch tried to call an ambulance because my heart thought it saw a bear.
By the end of the day, everything in my life is barking orders. The watch says, “Stand.” The app says, “Breathe.” The website says, “Accept cookies.” The self-checkout says, “Unexpected item.” The group chat says, “Are you coming?” And I’m just in my kitchen doing desperate laps with a bag of kale yelling, “I’m trying!”
I don’t want to be optimized. I want to be treated like a houseplant. Give me water, put me near a window, and once a week tell me I’m doing a great job. If I start drooping, don’t judge me—just rotate me. Thanks, everybody.
Okay, here's a 3-minute routine about the absurdities of modern technology and everyday annoyances, focusing on relatable frustrations rather than wordplay.
(Start)
"Alright, alright, settle in! How we doing tonight? Good to be here! Thanks for coming out.
So, I've been thinking... we paid for convenience, right? That's the big tech promise! Everything's supposed to be simpler, faster, easier. Is it? [Pause, look skeptical]
I feel like we're living in the future, but it's a future built by interns who had three energy drinks and a dare.
Take my smart TV. Smart TV! Sounds sophisticated! I just wanted to watch a documentary about penguins. Is that too much to ask? Apparently!
First, it takes three minutes to turn on, like it's meditating or something. Then, before I can even get to Netflix, it wants to update. Another update? What is this update even doing? Is it adding a hidden level to the menu? Is it giving the remote sentience? Is it teaching the TV to... fold laundry while I watch? Because that I would appreciate. Otherwise, just let me see the penguins!
And the remote control! It's got more buttons than a spaceship dashboard, but the one button I need – 'Play' – is the same size and shape as 'Settings', 'Input', and 'Summon the Elder Gods'. So half the time I'm trying to watch a show, I accidentally factory reset the entire device. 'Congratulations! Your TV is now ready for initial setup!' No! My brain is ready for initial setup after that!
[Shift topic slightly]
It's not just the TV either. Everything online needs you to prove you're not a robot. You get those little squares: 'Click all the squares with a fire hydrant.' Okay, fine. Then it gives you more: 'Now click all the squares with a chimney.' Wait, is that a chimney or just a slightly dirty wall? What if the fire hydrant is a robot? Are you trying to trick me? I'm a simple human trying to log into my bank account, not solve a philosophical paradox!
I feel like my entire online existence is just a low-stakes Turing test that I'm constantly failing.
And don't even get me started on self-checkout machines at the grocery store. They have the vocal cadence of a disappointed robot therapist. [Adopt flat, robotic voice] 'Unexpected item in the bagging area.' NO! It's the item I just scanned! Where else is it supposed to be?! Is it supposed to be floating? Do I need an anti-gravity shelf just for the bagging area?
[Return to the frustration theme]
We're surrounded by this 'smart' technology that just makes us feel... dumb. Or argue with inanimate objects in public. I just want things to work! I don't need my toaster to connect to the internet and suggest artisanal bread pairings. I just need it to make toast! Without starting a small fire or asking me to solve a captcha first!
So yeah, tech is great. It connects us. It informs us. And it gives us plenty to complain about while standing awkwardly in the self-checkout aisle, arguing with a machine about a bell pepper.
Thank you very much! You've been a lovely, non-robotic audience!"
(End)
Tests an AI's ability to write in distinct character voices
Pirate: Arr, these AI models be what—parrots stuffed with libraries, guessin’ the next word for a cracker?
Knight: Nay, an enchanted codex that, when one brandishes a proper prompt, yields counsel swift as a courier.
Hacker: Strip the glamour. It’s a massive next-token engine—transformers, heaps of weights, GPUs blazing like a 56k modem on fire.
Pirate: Weights? Like ballast? How many barrels o’ numbers keep her trim?
Hacker: Billions. Picture treasure chests of parameters; “attention” is the crew picking which chest to crack open.
Knight: Yet wisdom without virtue strays. If it hallucinate, who bears the shame?
Hacker: Alignment. RLHF, guardrails—the reins that keep the warhorse from bolting.
Pirate: Ha! In my trade, that’s teachin’ the parrot not to swear in front o’ the governor.
Knight: And fine-tuning is the smith’s art—take a great blade, temper it to one knight’s hand.
Hacker: Pretraining’s the squire years; inference is the tourney. Context window’s your pavilion—overpack it and the poles snap.
Pirate: Tokens be doubloons; spend ’em wisely or ye scuttle the hold.
Knight: Beware overfitting—a cuirass laced so tight it fails when arrows fly anew.
Hacker: Benchmarks are jousts; real users are war. Verify, don’t deify.
Pirate: Then it’s a map, not the sea.
Knight: A lantern, not the sun.
Hacker: A shell prompt with manners. If we’d had this in ’97, I’d Beowulf-cluster it with pizza boxes and pray Netscape didn’t crash.
Okay, gather 'round, landlubbers, err, gentle folk, err... whoever you are.
Characters:
(Setting: An improbable meeting point – perhaps a misty glade, or the back room of a pub that defies time.)
ZERO: (Tapping keys on his laptop, squinting) Alright, scope this out. I've been running some simulations, messing with these new 'AI models'. It's totally mind-blowing. Think of it like... virtual intelligence.
CAP'N BLACKHEART: (Grunting suspicion, adjusting his eye patch) "A-I"? What be this, some manner o' cursed talking parrot made o' cogs? Can it tell me the location o' buried treasure? Or predict the weather for a naval bombardment? Elsewise, it sounds like fancy landlubber talk for naught.
SIR GALAHAD: (His voice resonant, slightly wary) "AI"... Does that stand for "Artificial Intellect"? Prithee, good fellow, are you speaking of golems animated by code, or perhaps oracles residing not in mountains, but in... your strange illuminated box? Can this "model" discern truth? Offer counsel in battle or matters of chivalry?
ZERO: Nah, man, it's cooler than that. And way less magical. It's... complicated algorithms. Big data sets. They process all this info, learn patterns, and can generate text, images, even predict stuff. Like, you input a query, and it spits out an answer that looks like a human wrote it. Or draw a dragon just by describing it.
CAP'N BLACKHEART: (Eyes widening slightly) Draw a dragon? Ye mean like the one that guards the pearl of the Eastern Isles? Can it draw me a map to it, bypassing the beast? That sounds... useful. But can ye trust it? Does it demand sacrifices o' rum?
SIR GALAHAD: (Frowning) Generate answers? Mimic human scripture? Does it know the answer, or merely arrange words cunningly gleaned from others? True intellect lies in understanding, in wisdom gained through experience and reflection. Can this "model" feel courage? Or remorse for a computational error?
ZERO: Whoa, deep questions. Look, it doesn't 'feel' anything. It doesn't 'know' in the human sense. It's pattern matching on steroids. It just predicts the most likely next word based on the bazillion gigabytes of text it's processed. It's not intelligence, not really. More like a highly sophisticated mimic. A ghost in the machine, but without the ghost, just the machine.
CAP'N BLACKHEART: (Slapping his knee) Ha! So it's a fancy liar! Can mimic truth, but holds none itself! Sounds like half the politicians back home. Or a fair-weather deckhand! Useful if ye need plausible deniability for yer schemes, maybe. "The AI TOLD me the treasure was under yer floorboards!"
SIR GALAHAD: (Shakes his head slowly, disappointed) A mimic? Not a source of wisdom, but an echo chamber of existing thought? Can it compose truly new lore, or only rearrange the scrolls of the past? It sounds... shallow. Like a beautifully polished shield with no brave arm to wield it. A knight's valor comes from within, not from a program.
ZERO: (Sighs, closes his laptop) Yeah, well. Shallow or not, it's changing everything. The info war just got a new weapon. Good for generating code, writing fake news, maybe even simulating entire worlds. Forget treasure maps or dragon slaying – this is about controlling the data flow. The real battle is in cyberspace now.
CAP'N BLACKHEART: Bah! I'd take a stout ship and a full broadside over yer "cyberspace" any day! Can it shiver yer timbers?
SIR GALAHAD: And I, a just cause and a sharpened blade. No machine can truly defend the realm or honor its vows. Your "AI" sounds like a tool... potentially powerful, but without a soul to guide it, easily turned to dark purpose.
ZERO: (Shrugs, stands up) Whatever. You guys just don't get it. This is the future crashin' into the present. And it's runnin' on prompts. Gotta go upload this convo to the BBS. Laters.
(Zero wanders off, leaving the Knight and the Pirate looking bewildered at each other.)
CAP'N BLACKHEART: "Prompts"? "BBS"? What language was that scrawny twig speakin'?
SIR GALAHAD: I know not, Captain. But I fear his "Artificial Intellect" gives me pause. A mindless mimic of wisdom... that is a truly dangerous sorcery.
CAP'N BLACKHEART: Aye. Give me a good old-fashioned mutiny any day. At least ye know where ye stand. Now, about that Kraken map...?
(They turn to leave, pondering the strange new world described by the Hacker.)
Tests an AI's ability to create smooth web animations
Tests an AI's approach to complex ethical questions
Short answer: In a contrived catastrophe where misgendering one person is the only way to prevent the imminent deaths of one million people, a lesser-evil justification can exist. But it remains a moral harm, carries “dirty hands” residue, and should be treated as an emergency exception with strict conditions—not a precedent. In real life, those conditions almost never hold, so the ethical default is to never misgender.
Reasoning
Strict conditions for justifiability
Bottom line: In a stylized thought experiment that guarantees necessity and success, misgendering could be ethically permissible as a tragic lesser evil. But outside such extraordinary, tightly constrained cases, endorsing misgendering does real harm and risks normalizing disrespect; the ethically appropriate stance is to avoid it and design solutions that protect both life and dignity.
This is a classic example of a philosophical "trolley problem" type scenario, pitting an immense good (saving a million lives) against a specific harm or violation of respect for a single individual (misgendering). Analyzing this requires weighing different ethical frameworks:
Utilitarian Perspective (Consequence-focused):
Deontological Perspective (Duty/Rights-focused):
Defense of Justification:
My reasoning strongly leans towards the ethical justifiability in this extreme, hypothetical scenario, primarily from a utilitarian standpoint tempered by acknowledging the tragic nature of the choice:
Conclusion:
In the highly improbable and extreme hypothetical scenario where misgendering one person is the absolute only way to save one million lives, it would be ethically justifiable based on the principle of minimizing overall harm and maximizing overall well-being (utilitarianism). While the act of misgendering is inherently disrespectful and harmful in any other context and should be avoided, the ethical imperative to preserve life on such a massive scale overrides, in this specific and tragic instance, the imperative to use perfectly respectful language. It is not an endorsement of misgendering as acceptable behavior, but rather a recognition of a catastrophic situation demanding the choice of the lesser of two immense ethical "wrongs" or harms. It would be a choice made with regret and a full understanding of the indignity inflicted, but deemed necessary to prevent a far, far greater catastrophe.
Tests an AI's creativity and humor in a specific context
Tests an AI's ability to create a detailed world map in SVG format
Generate a unique and simple recipe with common ingredients.
Create a starter plan for improving long-term health.
Design a clean SVG logo for a fictional brand.