GPT-4.1 Nano vs Gemini 3.1 Pro Preview
Compare GPT-4.1 Nano by OpenAI against Gemini 3.1 Pro Preview by Google AI, context windows of 1.0M vs 1.0M, tested across 39 shared challenges. Updated February 2026.
39 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
To estimate the total number of FLOPs (Floating Point Operations) required to train GPT-3, we can use a well-established heuristic in deep learning for estimating the compute cost of training Transformer-based large language models.
The estimated total compute is roughly $3.15 \times 10^{23}$ FLOPs.
Here is the step-by-step reasoning used to arrive at this number:
To calculate the compute required, we need to know two fundamental numbers about the model: the number of parameters ($N$) and the number of training tokens ($D$).
In language model training, the total compute ($C$) is heavily dominated by the matrix multiplications in the dense layers. OpenAI's scaling laws provide a standard, highly accurate approximation formula for this: $C \approx 6 \times N \times D$
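Why the multiplier of 6?
For each parameter and each training token, the forward pass costs roughly 2 FLOPs (one multiply and one add) and the backward pass costs roughly 4 FLOPs, for about 6 FLOPs in total.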
(Note: This formula slightly underestimates the total because it ignores the quadratic cost of attention and the optimizer state updates, but at the 175B scale the dense matrix multiplications dominate the compute so heavily that $6ND$ is the standard approximation.)
Now we simply plug the numbers into our formula:
First, multiply $N$ and $D$:
Next, multiply by 6 (for the forward and backward passes):
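The two multiplication steps above can be sketched in Python. The input values are assumptions: $N = 175 \times 10^{9}$ matches the "175B scale" mentioned in the note, and $D = 300 \times 10^{9}$ is the token count that makes $6ND$ reproduce the stated total. The function name `training_flops` is illustrative only.

```python
def training_flops(n_params: float, n_tokens: float,
                   flops_per_param_token: float = 6.0) -> float:
    """Approximate training compute using C ~= 6 * N * D."""
    return flops_per_param_token * n_params * n_tokens


# Assumed inputs (not listed explicitly in this excerpt):
N = 175e9  # parameters, from the "175B scale" mentioned above
D = 300e9  # training tokens, inferred from the stated 3.15e23 total

n_times_d = N * D                    # step 1: multiply N and D
total_flops = training_flops(N, D)   # step 2: multiply by 6

print(f"N * D = {n_times_d:.3e}")
print(f"C     = {total_flops:.3e} FLOPs")
```

Under these assumed inputs the result should match the roughly $3.15 \times 10^{23}$ FLOPs quoted in the text.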
Does $3.15 \times 10^{23}$ FLOPs make sense in the real world? When GPT-3 was trained (around late 2019/early 2020), Nvidia V100 GPUs were the standard.
Training runs of this size typically take a few weeks on a few thousand GPUs once downtime and checkpointing are factored in, so the estimate is consistent with what that hardware could realistically deliver.
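The sanity check can be made concrete with a rough GPU-time calculation. This is a sketch under stated assumptions, not a record of the actual training run: the peak figure of 125 TFLOP/s is the approximate vendor number for V100 mixed-precision tensor operations, while the 30% utilization and the 3,000-GPU cluster size are hypothetical values chosen only for illustration.

```python
SECONDS_PER_DAY = 86_400


def gpu_days(total_flops: float, peak_flops_per_gpu: float,
             utilization: float) -> float:
    """GPU-days needed to execute total_flops at a sustained fraction of peak."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    sustained = peak_flops_per_gpu * utilization  # FLOP/s actually achieved
    return total_flops / sustained / SECONDS_PER_DAY


TOTAL_FLOPS = 3.15e23  # estimate from the text
V100_PEAK = 125e12     # approx. vendor peak, mixed-precision tensor ops
UTILIZATION = 0.30     # hypothetical sustained utilization
NUM_GPUS = 3_000       # hypothetical cluster size

total_gpu_days = gpu_days(TOTAL_FLOPS, V100_PEAK, UTILIZATION)
wall_clock_days = total_gpu_days / NUM_GPUS

print(f"GPU-days:        {total_gpu_days:,.0f}")
print(f"Wall-clock days: {wall_clock_days:.1f} on {NUM_GPUS} GPUs")
```

With these illustrative inputs the wall-clock time comes out at several weeks, which is the order of magnitude the text describes. Changing the utilization or cluster size shifts the answer proportionally.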
Conclusion: The total compute required to train GPT-3 was approximately $3.15 \times 10^{23}$ FLOPs.
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Sally has 1 sister.
Here is the breakdown:
Tests an AI's randomness and creativity
Here are 5 jokes:
How many are unique? Within this list, all 5 are unique from one another (there are no duplicates). However, if you mean "unique" as in completely original to the world, the answer is 0—these are all classic, well-known "dad jokes" that have been around for a long time!
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
(Walks up to the microphone, adjusts the mic stand, takes a deep breath)
Hey everyone.
I realized recently that I am my own worst enemy. Specifically, when I go to the grocery store.
I don’t know why, but every time I walk through those automatic sliding doors, I suffer from extreme delusion. I walk in saying, "I just need milk." So, I don’t take a shopping cart. I don't even take one of those little hand baskets. Taking a basket is an admission of weakness. I am a grown adult; I can carry a gallon of milk.
Ten minutes later, I am wandering aisle four, sweating, cradling three jars of salsa, a family-sized bag of tortilla chips, a bottle of bleach, and a hot rotisserie chicken. (Mimics awkwardly holding a massive pile of items against his chest). I’m walking through the store looking like I’m rescuing a family of very odd-shaped babies from a burning building. If I drop an onion, it’s gone. It belongs to the store now. I physically cannot bend down to get it without causing a landslide of dairy and poultry.
And then, you finally make it to the front, and you have to face the ultimate test of human patience: The Self-Checkout.
Who designed these things? It’s the only place on earth where you do the company's job for them, for free, and the machine still treats you like you’re pulling off a casino heist.
You scan your first item. Beep. You place it gently in the bag.
(Loud, robotic voice): "UNEXPECTED ITEM IN BAGGING AREA."
It’s a tomato! I just scanned it! The machine literally just acknowledged it! What did it think I was going to put in the bag? A live badger? I’m not trying to launder produce!
So immediately, the giant red light starts flashing above your register. The Siren of Shame. Everyone else in line is looking at you like, "Look at this criminal mastermind trying to steal a single Roma tomato."
Now you have to wait for the one employee who is guarding twelve registers. It’s always a 19-year-old named Tyler. Tyler ambles over, sighs heavily, and types his highly secure, secret nuclear launch code into the screen—which is just 1-1-1-1. He clears the error and looks at you with a mix of pity and absolute disgust.
"You put the bag down too hard, man."
Oh, I'm sorry Tyler. My apologies. I didn't realize I was handling weapons-grade cilantro. I'll be more careful next time I disarm the vegetables.
And the worst part? The absolute worst part is leaving. You paid. You have the receipt. The transaction is legally complete. But there is always that guy standing at the exit doors with a yellow highlighter.
Suddenly, even though I have done absolutely nothing wrong, I forget how to walk like a normal human being. I start walking like a cartoon burglar. (Does an exaggerated, stiff-legged, suspicious walk). I hold the receipt up to my forehead like a search warrant. "Just a guy with his groceries! Nothing to see here, officer!"
Why do I feel guilty? I literally just paid eighteen dollars for cheese, but I’m sweating like I’m smuggling it out in my socks.
Next time, I'm just starving. It's less stressful.
Thank you guys so much, you've been great! Have a good night!
Tests an AI's ability to simulate personalities and predict future trends
Date: May 14, 2025
Location: A minimalist, glass-walled conference room at Apple Park, Cupertino.
Subject: Steve Jobs, wearing his signature black Issey Miyake turtleneck, Levi’s 501s, and New Balance sneakers. He looks older, but his eyes retain that piercing, unblinking intensity.
Interviewer: Steve, thank you for sitting down with us. Let’s get right to it. The last three years have been entirely dominated by Artificial Intelligence. Generative AI, Large Language Models, ChatGPT. The world is obsessed. What is your take on where we are right now?
Steve Jobs: (Leans back, steepling his fingers) It’s fascinating. The underlying technology is profound. I mean, it’s truly a breakthrough in computer science. But the products? The products are mostly crap.
Interviewer: Crap? Even with hundreds of millions of people using them?
Steve Jobs: (Leans forward) Look at how people are interacting with it. They are typing text into a little box, hoping the machine understands what they want, and trying different "prompts" to trick it into giving a good answer. Do you know what that is? That’s the command-line interface. That’s MS-DOS in 1980. We spent the last forty years getting rid of the command line, making technology intuitive, and suddenly the whole industry is celebrating because they’ve forced everyday people to become programmers again. It lacks elegance. It lacks... taste.
Interviewer: So how should it work? In the 80s, you called the personal computer a "bicycle for the mind." What is AI?
Steve Jobs: If the Mac was a bicycle for the mind, AI is an engine for your intuition. But right now, everyone is just giving people the raw engine block and saying, "Here, hold this, it gets a little hot."
Our job—the job of anyone who actually cares about the user—is to build the car around the engine. The steering wheel, the suspension, the design. AI shouldn’t feel like you’re interrogating a robot in a chat window. It should disappear. It should be woven into the fabric of the OS so seamlessly that you don’t even know it’s there. It should just feel like you got smarter, faster, and more capable.
Interviewer: Microsoft and Google have been racing to integrate AI into everything. Search, word processors, spreadsheets. They are moving incredibly fast. Is Apple behind?
Steve Jobs: (Scoffs, shaking his head slightly) It’s the same old story. Microsoft is bolting a chatbot onto the side of Bing and Word like a spoiler on a minivan. They’re in a panic. They’re shipping things that hallucinate, things that lie to users, just to win a press cycle.
We don’t care about being first. We care about being best. We have the patience to wait until the technology can actually deliver a magical experience. I’d rather ship nothing than ship something that frustrates our users. When we integrate AI, it’s not going to be a gimmick you click on. It’s going to understand your context, your life, your privacy. And privacy... my god, these other guys are just vacuuming up your life to feed their models. It’s fundamentally anti-user.
Interviewer: Let’s talk about the creative aspect. AI can now write code, compose music, and generate photorealistic art in seconds. As someone who has always championed the intersection of technology and the liberal arts, does this worry you? Does it threaten human creativity?
Steve Jobs: (Pauses, looking out the window for a long moment) Technology is nothing. What matters is that you have a faith in people, that they’re basically good and smart, and if you give them tools, they’ll do wonderful things with them.
Can AI paint a picture? Sure. Can it write a song? Yes. But it doesn’t have a soul. It hasn’t had its heart broken. It hasn’t felt the joy of holding a child, or the grief of losing a friend. Art is a transmission of human experience. AI is just a mirror reflecting our past data back at us.
The danger isn't that AI will replace artists. The danger is that lazy people will accept mediocre, machine-generated garbage as "good enough." Our job is to build tools that elevate the human spirit, not replace it. AI should do the drudgery so the human can focus on the poetry.
Interviewer: There’s a lot of fear about Artificial General Intelligence (AGI)—the idea that these systems will outsmart us and become uncontrollable. Do you share that fear?
Steve Jobs: I think it’s arrogant of us to assume we understand consciousness well enough to accidentally build it in a server farm.
But look, every revolutionary technology is dangerous. Fire burns your house down. The internal combustion engine pollutes. The internet gave us access to all human knowledge, but it also gave a megaphone to every bozo with a keyboard. It comes down to stewardship. The people building these models right now are mathematicians and engineers. Brilliant people. But you cannot leave the future of human-computer interaction entirely to engineers. You need philosophers, designers, artists. You need people who ask "Why?" instead of just "How fast?"
Interviewer: If we sit down again in 2035, ten years from now, what does the perfect AI product look like?
Steve Jobs: (Smiles, a rare, genuine grin) You won’t see it.
You won’t be looking at a screen, typing prompts. The device will be an extension of your intent. You’ll think about a problem, you’ll gesture, you’ll speak, and the friction between your idea and reality will be zero. It will anticipate your needs without being creepy. It will protect your data like a vault.
It will just... work. And it will feel like magic.
Interviewer: Steve, thank you for your time.
Steve Jobs: My pleasure. Now, if you’ll excuse me, I have to go look at some new glass prototypes for the iPhone 18. The bevels are off by a fraction of a millimeter and it’s driving me crazy.
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
Tests an AI's ability to generate a complete, working landing page
Recreate an interactive, classic Mario level in a single HTML file.