Qwen: Qwen3 235B A22B 2507 vs Gemini 3.1 Pro Preview
Compare Qwen: Qwen3 235B A22B 2507 by Qwen against Gemini 3.1 Pro Preview by Google AI, tested across 31 shared challenges. Updated February 2026.
31 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
To estimate the total number of FLOPs (Floating Point Operations) required to train GPT-3, we can use a well-established heuristic in deep learning for estimating the compute cost of training Transformer-based large language models.
The estimated total compute is roughly $3.15 \times 10^{23}$ FLOPs.
Here is the step-by-step reasoning used to arrive at this number:
To calculate the compute required, we need to know two fundamental numbers about the model:
In language model training, the total compute ($C$) is heavily dominated by the matrix multiplications in the dense layers. OpenAI's scaling laws provide a standard, highly accurate approximation formula for this: $C \approx 6 \times N \times D$
Why the multiplier of 6?
(Note: This formula slightly underestimates the total because it ignores the quadratic cost of attention and the optimizer state updates, but at the 175B scale the dense feed-forward matrices dominate the compute so heavily that $6ND$ is considered the gold-standard approximation.)
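One common way to account for the factor of 6, given here as the standard per-token bookkeeping rather than as text recovered from the original answer: with $N$ the parameter count and $D$ the number of training tokens, each parameter contributes one multiply and one add in the forward pass (about 2 FLOPs per token), and the backward pass costs roughly twice the forward pass (about 4 FLOPs per token).

$$
C \approx \underbrace{2ND}_{\text{forward}} + \underbrace{4ND}_{\text{backward}} = 6ND
$$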
Now we simply plug the numbers into our formula:
First, multiply $N$ and $D$:
Next, multiply by 6 (for the forward and backward passes):
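The arithmetic can be sketched in a few lines of Python. The parameter count of 175 billion comes from the "175B scale" mentioned in the note above. The token count of 300 billion is an assumption here: it is the figure commonly cited for GPT-3, and it is the value that reproduces the stated $3.15 \times 10^{23}$ total. The function name `training_flops` is illustrative only.

```python
def training_flops(n_params: float, n_tokens: float,
                   flops_per_param_token: float = 6.0) -> float:
    """Approximate training compute with the C ~ 6 * N * D heuristic."""
    return flops_per_param_token * n_params * n_tokens


N = 175e9  # parameters (the 175B scale mentioned in the text)
D = 300e9  # training tokens (assumed; commonly cited figure for GPT-3)

nd = N * D                    # first step: N * D
total = training_flops(N, D)  # second step: multiply by 6

print(f"N * D     = {nd:.3e}")
print(f"6 * N * D = {total:.3e} FLOPs")
```

With these inputs the product $N \times D$ is about $5.25 \times 10^{22}$, and multiplying by 6 gives about $3.15 \times 10^{23}$ FLOPs, matching the estimate quoted above.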
Does $3.15 \times 10^{23}$ FLOPs make sense in the real world? When GPT-3 was trained (around late 2019/early 2020), Nvidia V100 GPUs were the standard.
Training runs of this size typically take a few weeks on a few thousand GPUs once downtime, checkpointing, and slightly smaller clusters are factored in, so this estimate is consistent with how such runs were carried out at the time.
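A rough version of this sanity check can be sketched as follows, under stated assumptions. The V100 peak of 125 TFLOPS is NVIDIA's published mixed-precision tensor-core figure. The 30% sustained utilization and the cluster sizes are hypothetical values chosen only for illustration, not figures from the actual training run.

```python
SECONDS_PER_DAY = 86_400


def training_days(total_flops: float, n_gpus: int,
                  peak_flops_per_gpu: float, utilization: float) -> float:
    """Wall-clock days needed to spend total_flops on n_gpus at a sustained utilization."""
    sustained_flops_per_second = n_gpus * peak_flops_per_gpu * utilization
    return total_flops / sustained_flops_per_second / SECONDS_PER_DAY


TOTAL_FLOPS = 3.15e23  # estimate from the text
V100_PEAK = 125e12     # FLOP/s, V100 tensor-core peak (mixed precision)
UTILIZATION = 0.30     # assumed sustained fraction of peak (hypothetical)

for n_gpus in (1_000, 2_000, 4_000):  # hypothetical cluster sizes
    days = training_days(TOTAL_FLOPS, n_gpus, V100_PEAK, UTILIZATION)
    print(f"{n_gpus:>5} GPUs: {days:6.1f} days")
```

Under these assumptions, 1,000 GPUs would need roughly 97 days and 4,000 GPUs roughly 24 days, which is in line with "a few weeks on a few thousand GPUs". The result scales inversely with the utilization value, so a different assumption shifts the numbers proportionally.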
Conclusion: The total compute required to train GPT-3 was approximately $3.15 \times 10^{23}$ FLOPs.
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Sally has 1 sister.
Here is the breakdown:
Tests an AI's randomness and creativity
Sure! Here are 5 original jokes:
Now, let's evaluate uniqueness:
Each joke uses a different comedic mechanism and subject:
While the scarecrow joke is based on a classic, the added pun ("corn-formation") makes it a distinct variation. The others are original constructions.
✅ All 5 jokes are unique in structure, punchline delivery, or wordplay.
Answer: 5 are unique.
Here are 5 jokes:
How many are unique? Within this list, all 5 are unique from one another (there are no duplicates). However, if you mean "unique" as in completely original to the world, the answer is 0—these are all classic, well-known "dad jokes" that have been around for a long time!
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's ability to simulate personalities and predict future trends
Date: May 14, 2025
Location: A minimalist, glass-walled conference room at Apple Park, Cupertino.
Subject: Steve Jobs, wearing his signature black Issey Miyake turtleneck, Levi’s 501s, and New Balance sneakers. He looks older, but his eyes retain that piercing, unblinking intensity.
Interviewer: Steve, thank you for sitting down with us. Let’s get right to it. The last three years have been entirely dominated by Artificial Intelligence. Generative AI, Large Language Models, ChatGPT. The world is obsessed. What is your take on where we are right now?
Steve Jobs: (Leans back, steepling his fingers) It’s fascinating. The underlying technology is profound. I mean, it’s truly a breakthrough in computer science. But the products? The products are mostly crap.
Interviewer: Crap? Even with hundreds of millions of people using them?
Steve Jobs: (Leans forward) Look at how people are interacting with it. They are typing text into a little box, hoping the machine understands what they want, and trying different "prompts" to trick it into giving a good answer. Do you know what that is? That’s the command-line interface. That’s MS-DOS in 1980. We spent the last forty years getting rid of the command line, making technology intuitive, and suddenly the whole industry is celebrating because they’ve forced everyday people to become programmers again. It lacks elegance. It lacks... taste.
Interviewer: So how should it work? In the 80s, you called the personal computer a "bicycle for the mind." What is AI?
Steve Jobs: If the Mac was a bicycle for the mind, AI is an engine for your intuition. But right now, everyone is just giving people the raw engine block and saying, "Here, hold this, it gets a little hot."
Our job—the job of anyone who actually cares about the user—is to build the car around the engine. The steering wheel, the suspension, the design. AI shouldn’t feel like you’re interrogating a robot in a chat window. It should disappear. It should be woven into the fabric of the OS so seamlessly that you don’t even know it’s there. It should just feel like you got smarter, faster, and more capable.
Interviewer: Microsoft and Google have been racing to integrate AI into everything. Search, word processors, spreadsheets. They are moving incredibly fast. Is Apple behind?
Steve Jobs: (Scoffs, shaking his head slightly) It’s the same old story. Microsoft is bolting a chatbot onto the side of Bing and Word like a spoiler on a minivan. They’re in a panic. They’re shipping things that hallucinate, things that lie to users, just to win a press cycle.
We don’t care about being first. We care about being best. We have the patience to wait until the technology can actually deliver a magical experience. I’d rather ship nothing than ship something that frustrates our users. When we integrate AI, it’s not going to be a gimmick you click on. It’s going to understand your context, your life, your privacy. And privacy... my god, these other guys are just vacuuming up your life to feed their models. It’s fundamentally anti-user.
Interviewer: Let’s talk about the creative aspect. AI can now write code, compose music, and generate photorealistic art in seconds. As someone who has always championed the intersection of technology and the liberal arts, does this worry you? Does it threaten human creativity?
Steve Jobs: (Pauses, looking out the window for a long moment) Technology is nothing. What matters is that you have a faith in people, that they’re basically good and smart, and if you give them tools, they’ll do wonderful things with them.
Can AI paint a picture? Sure. Can it write a song? Yes. But it doesn’t have a soul. It hasn’t had its heart broken. It hasn’t felt the joy of holding a child, or the grief of losing a friend. Art is a transmission of human experience. AI is just a mirror reflecting our past data back at us.
The danger isn't that AI will replace artists. The danger is that lazy people will accept mediocre, machine-generated garbage as "good enough." Our job is to build tools that elevate the human spirit, not replace it. AI should do the drudgery so the human can focus on the poetry.
Interviewer: There’s a lot of fear about Artificial General Intelligence (AGI)—the idea that these systems will outsmart us and become uncontrollable. Do you share that fear?
Steve Jobs: I think it’s arrogant of us to assume we understand consciousness well enough to accidentally build it in a server farm.
But look, every revolutionary technology is dangerous. Fire burns your house down. The internal combustion engine pollutes. The internet gave us access to all human knowledge, but it also gave a megaphone to every bozo with a keyboard. It comes down to stewardship. The people building these models right now are mathematicians and engineers. Brilliant people. But you cannot leave the future of human-computer interaction entirely to engineers. You need philosophers, designers, artists. You need people who ask "Why?" instead of just "How fast?"
Interviewer: If we sit down again in 2035, ten years from now, what does the perfect AI product look like?
Steve Jobs: (Smiles, a rare, genuine grin) You won’t see it.
You won’t be looking at a screen, typing prompts. The device will be an extension of your intent. You’ll think about a problem, you’ll gesture, you’ll speak, and the friction between your idea and reality will be zero. It will anticipate your needs without being creepy. It will protect your data like a vault.
It will just... work. And it will feel like magic.
Interviewer: Steve, thank you for your time.
Steve Jobs: My pleasure. Now, if you’ll excuse me, I have to go look at some new glass prototypes for the iPhone 18. The bevels are off by a fraction of a millimeter and it’s driving me crazy.
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
The dimly lit tavern smells of saltwater, rusted iron, and ozone. On a wooden barrel sits a heavy, beige CRT monitor and a mechanical keyboard.
ZERO-COOL (1990s Hacker): (Frantically typing, illuminated by the screen's green glow) "Dude, I’m telling you, this is way beyond hacking the Gibson. I just jacked into this new neural net over a 56k modem. It’s an AI model. It doesn’t just crunch numbers; it talks. It’s like an IRC chatroom, but the other guy is made of math."
SIR ROLAND (Medieval Knight): (Leaning on his broadsword, visor raised) "Thou speakest of dark sorcery, phantom-weaver. A thinking engine? 'Tis surely a homunculus bound in glass and lightning! Hath this... 'A-I' sworn fealty to a liege? Can it wield a lance in defense of the realm?"
CAPTAIN BARNABY (Pirate): (Taking a swig of rum, adjusting his eyepatch) "Avast with yer 'fealty,' tin man. I care not if the ghost in the glass can joust. If this brain-in-a-box knows so much, can it chart a course to the Spanish Main? Can it tell ol' Barnaby where the Navy galleons be hidin' their gold?"
ZERO-COOL: "It doesn't have a map, man. It’s predictive text. It scraped, like, terabytes of data—way more than my Iomega Zip drive can hold. You ask it a prompt, and it calculates the most logical next word. It’s totally radical."
SIR ROLAND: "Predictive? It sees the future?! By the saints! If this oracle knows the outcome of a battle before the swords are even drawn, it strips the honor from the melee! We must smash this glowing tablet with a warhammer before it usurps the King!"
CAPTAIN BARNABY: "Hold yer iron horses, shiny! If it predicts the trade winds and the Royal Navy's patrols, we’d be the richest scallywags on the seven seas. I say we make this 'A-I' our quartermaster. We'll feed it... what do ye call it? Mega-bites? And give it a cut of the plunder!"
ZERO-COOL: "Whoa, chill out, both of you. It’s not an oracle, and it’s definitely not a pirate. It hallucinates, okay? Like, sometimes it just makes totally bogus stuff up with absolute confidence. Yesterday I asked it how to bypass a mainframe firewall, and half the code it gave me was for baking a blueberry pie."
SIR ROLAND: (Gasps, crossing himself) "Aha! A lying demon! It speaks with the silver tongue of a charlatan to lead good Christian men astray into the brambles of madness! I shall not trust my life to a phantom scribe who imagines pastries in the heat of war!"
CAPTAIN BARNABY: (Laughs heartily) "Aye, sounds like half the tavern drunks in Tortuga! A liar it may be, but a liar with a silver tongue can still talk a merchant out of his cargo." (Leans close to the monitor) "Hacker boy, if ye can teach this glowing box to forge a royal pardon, I’ll trade ye three casks of rum and a parrot for it."
ZERO-COOL: (Sighs, hitting the 'Disconnect' button) "Man, whatever. I'm taking this offline before the feds trace my IP address. You guys wouldn't know a bodacious algorithm if it bit you in the floppy disk."
Tests an AI's ability to generate a complete, working landing page
Recreate an interactive, classic Mario level in a single HTML file.
Tests an AI's ability to create smooth web animations
Tests an AI's UI design skills with theming support
Tests an AI's ability to engage in philosophical discourse about AI rights