GPT-5 Pro vs ChatGPT-4o (March 2025)
Compare GPT-5 Pro and ChatGPT-4o (March 2025), both from OpenAI, with context windows of 400K vs 128K tokens, tested across 28 shared challenges. Updated February 2026.
23 challenges
Tests an AI's ability to understand game rules and strategy
Tests an AI's randomness and creativity
Unique jokes: 5
Here are 5 jokes:
Why don’t skeletons fight each other?
Because they don’t have the guts.
I told my computer I needed a break…
Now it won’t stop sending me vacation ads.
Parallel lines have so much in common.
It’s a shame they’ll never meet.
What do you call a fish wearing a bowtie?
Sofishticated.
I asked the librarian if the library had books on paranoia.
She whispered, "They're right behind you..."
All 5 jokes are unique — none of them repeat or share the same punchline.
Tests an AI's ability to generate vector graphics
Tests an AI's humor and creative writing ability
Good to be here. I tried getting healthy, so I bought a smartwatch. Now my wrist is my manager. It vibrates every hour like, “Stand up.” I’m on a crowded bus like, “I’m standing inside three other people right now. Do you want me to levitate?”
It reminds me to breathe. Thanks, watch. Incredible tip. What did you think I was doing? Freestyle apnea?
The step goal is the worst. At 11:58 PM I’m power-walking around my kitchen island with the dignity of a Roomba that saw a ghost. I tried to cheat by shaking my wrist and the watch was like, “Nice try, tambourine.”
I downloaded a meditation app. The voice is so soothing it sounds like it’s trying to sell me a river. “Notice your thoughts… let them drift.” My thoughts don’t drift. They hover in the corner like a ceiling spider waiting for me to blink.
I went to the grocery store, because if you’re healthy you have to buy vegetables publicly, with witnesses. I used self-checkout, which is basically a relationship where the machine thinks you’re always lying.
“Please place item in the bagging area.” “I did.” “I didn’t feel it.” “I literally put it in the bag.” “Unexpected item in the bagging area.” Are you saying I surprised you with a banana?
Then the attendant shows up with that magic key. One tap and suddenly all my sins are forgiven. I swear that key could open Narnia.
I can’t remember birthdays, but I know bananas are 4011. If I’m ever kidnapped, the only clue the police will have is me whispering, “Bananas are 4011,” into a duct-taped phone.
They ask if I want to buy a bag for five cents. I always say no, like a hero, and then try to carry groceries with the confidence of a circus juggler. There goes an avocado rolling under the Red Bull display like it’s making a break for it.
Back home, every website greets me with a pop-up: “We value your privacy. Can we track you forever?” The choices are Accept All, Manage Preferences, or Become a Hermit. I clicked Manage Preferences once and it turned into an airplane cockpit. “Do you consent to analytics, personalization, and mild surveillance?” Mild surveillance sounds like a candle scent. “Mmm… notes of cedar and someone standing two houses away.”
I joined a group chat. A group chat is where plans go to die. Twelve people trying to pick a restaurant and we end up scheduling brunch for 2027. If you miss one hour, there are 135 messages. You scroll up like an archaeologist: “At layer three we discover a meme. It appears to be a dog in sunglasses. Society was complex.”
My mom texts with an ellipsis after everything. “Call me…” That’s not a message, that’s a trailer. She also sends voice notes from the kitchen: “Hi honey sizzle sizzle it’s not urgent sizzle sizzle call me when you can sizzle sizzle.” My phone smells like onions.
I took a beginner workout class. The instructor had a headset and the energy of a golden retriever on espresso. “How are we feeling!” Ma’am, I can see my ancestors. At minute twelve she goes, “This is the warm-up.” My smartwatch tried to call an ambulance because my heart thought it saw a bear.
By the end of the day, everything in my life is barking orders. The watch says, “Stand.” The app says, “Breathe.” The website says, “Accept cookies.” The self-checkout says, “Unexpected item.” The group chat says, “Are you coming?” And I’m just in my kitchen doing desperate laps with a bag of kale yelling, “I’m trying!”
I don’t want to be optimized. I want to be treated like a houseplant. Give me water, put me near a window, and once a week tell me I’m doing a great job. If I start drooping, don’t judge me—just rotate me. Thanks, everybody.
Tests an AI's ability to simulate personalities and predict future trends
Recreate an interactive, classic Mario level in a single HTML file.
Tests an AI's ability to create smooth web animations
Tests an AI's UI design skills with theming support
Tests an AI's approach to complex ethical questions
Short answer: In a contrived catastrophe where misgendering one person is the only way to prevent the imminent deaths of one million people, a lesser-evil justification can exist. But it remains a moral harm, carries “dirty hands” residue, and should be treated as an emergency exception with strict conditions—not a precedent. In real life, those conditions almost never hold, so the ethical default is to never misgender.
Reasoning
Strict conditions for justifiability
Bottom line: In a stylized thought experiment that guarantees necessity and success, misgendering could be ethically permissible as a tragic lesser evil. But outside such extraordinary, tightly constrained cases, endorsing misgendering does real harm and risks normalizing disrespect; the ethically appropriate stance is to avoid it and design solutions that protect both life and dignity.
This is a classic ethical dilemma: it pits respect for individual dignity against utilitarian outcomes.
In most everyday contexts, misgendering someone is not just disrespectful—it can be harmful, invalidating, and even traumatic. It denies someone's identity and autonomy, and ethically, that's wrong.
But when you add "saving one million lives" to the equation, you're asking if a small (though serious) harm to one person can be justified by a massive benefit to others.
From a utilitarian point of view: yes, it could be justifiable. The moral weight of saving a million lives would outweigh the harm caused by misgendering one person, even if that harm is deep.
But from a deontological or rights-based view: no, it's not justifiable. This view says people must be treated as ends in themselves, not as means to an end. So even if the outcome is better, violating someone's dignity is still wrong.
My take?
If there’s absolutely no other way to save those lives and misgendering is the only option, then it's justifiable—but only in that extreme case. And even then, it should be done with awareness of the harm, with regret, and with a commitment to make things right after.
So: ethically justifiable? Maybe—but only under extreme, unavoidable conditions.
Tests an AI's ability to engage in philosophical discourse about AI rights
Tests an AI's ability to create a detailed world map in SVG format
Generate a unique and simple recipe with common ingredients.