GPT-5.2 vs GPT OSS 20B
Compare GPT-5.2 and GPT OSS 20B, both from OpenAI, with context windows of 400K vs 131K tokens, tested across 53 shared challenges. Updated February 2026.
48 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Tests an AI's understanding of number representation
Tests an AI's randomness and creativity
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
[Walks onto the stage, waves, smiles]
Hey folks, great to see you all tonight. You know, I was driving this morning, and I realized something: my GPS is basically a motivational speaker. “In 300 feet, turn left.” “You’re almost there.” I swear, if my GPS could talk, it’d be like, “Okay, you’re going to miss your coffee. Let’s talk about your life choices.”
So I got off at a coffee shop. The barista asked me, “What would you like?” I said, “Anything with a little extra bitterness.” She gave me an espresso shot. I stared at it and thought, “Is this the new ‘self‑help’ foam art?”
You ever notice how people say, “It’s not a big deal.” And then the next day, you’re in the ER with a broken arm and the nurse says, “It’s fine, just a bruise.” I’ve never had a bruise this dramatic. I’m flipping a coin: heads = I lose it, tails = I get a new scar.
Speaking of losing things—lost my keys, lost my phone, lost my dignity. You can’t find your keys, but you can find your phone. My phone’s been attached to my shirt for the last three hours. It’s like a clingy boyfriend. “Are you with me?” “No, that’s your phone. I’m with my dignity.”
And then there’s dating. I just signed up on a dating app. The first message I got was “What’s your favorite binge-worthy show?” I replied, “I’m not a TV person.” She replied, “Oh, so you’re a human? That’s… unexpected.” It’s like dating apps are trying to convince us that we’re not just a few thousand likes away from a broken heart.
Anyway, that’s my time. Thanks for being a great audience—just like my phone, you’ve been my constant. Love you all!
Tests an AI's ability to simulate personalities and predict future trends
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
Tests an AI's ability to generate a complete, working landing page