MoonshotAI: Kimi K2 0905 vs Pony Alpha
Compare MoonshotAI: Kimi K2 0905 by Moonshot AI against Pony Alpha by OpenRouter, context windows of 262K vs 200K, tested across 41 shared challenges. Updated February 2026.
41 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
The total number of FLOPs required to train GPT‑3 is approximately 3.15 × 10²³ (or 315 zettaFLOPs). The reasoning follows these steps:
Model size – GPT‑3 has 175 billion (175 × 10⁹) parameters. This is a well‑known figure from the original paper.
Training data size – The model was trained on 300 billion (300 × 10⁹) tokens. The paper states that all models were trained for a total of 300 billion tokens.
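FLOPs per parameter per token – For a transformer model, the compute can be approximated as about 6 FLOPs per parameter per token, i.e. FLOPs ≈ 6 × N × D, where N is the number of parameters and D is the number of training tokens.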
Total FLOPs – Multiply the three numbers: \[ \text{FLOPs} = 6 \times (175 \times 10^9) \times (300 \times 10^9) = 6 \times 175 \times 300 \times 10^{18} = 315{,}000 \times 10^{18} = 3.15 \times 10^{23}. \]
Consistency check – This estimate aligns with the reported compute of 3.14 × 10²³ FLOPs mentioned in OpenAI’s technical analysis, confirming its plausibility.
The calculation captures the dominant linear‑algebra operations; small additional costs (e.g., attention quadratic terms, layer norms) are negligible in comparison.
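The arithmetic in the steps above can be checked with a few lines of Python. This is a minimal sketch, not part of the original response: the function name `training_flops` and the constant name are illustrative, and the factor of 6 is the rule of thumb used in the estimate (commonly explained as roughly 2 FLOPs per parameter per token for the forward pass and roughly 4 for the backward pass).

```python
# Back-of-the-envelope training compute using the 6 * N * D approximation.
# The names below are illustrative, not taken from any library.

# Rule-of-thumb cost: ~2 FLOPs (forward) + ~4 FLOPs (backward)
# per parameter per training token.
FLOPS_PER_PARAM_PER_TOKEN = 6


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return FLOPS_PER_PARAM_PER_TOKEN * n_params * n_tokens


n_params = 175e9   # GPT-3 parameter count quoted in the estimate
n_tokens = 300e9   # training tokens quoted in the estimate

total = training_flops(n_params, n_tokens)
zetta = total / 1e21  # 1 zettaFLOP = 10^21 FLOPs

print(f"Total: {total:.2e} FLOPs (~{zetta:.0f} zettaFLOPs)")
```

Changing `n_params` or `n_tokens` gives the same kind of estimate for other dense models. Like the estimate above, it leaves out the attention and normalization terms.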
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Sally has 1 sister.
Here is the breakdown:
Tests an AI's randomness and creativity
Here are 5 jokes:
There are 5 unique jokes.
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
(Walks up to mic, adjusts stand, looks around the room)
How are we doing tonight? Good?
I’m at that weird age where my body is actively betraying me. I’m not old, but I’m definitely… used. You know? I made a loud noise yesterday, just bending over to tie my shoe. It wasn’t even a crack, it was more like a dry twig snapping.
I sounded like a campfire. I had to stay in that position for like thirty seconds just to make sure I didn’t void the warranty.
But the worst part of getting older isn’t the back pain. It’s the technology gap. Specifically, the war between me and my "Smart" devices.
(Paces a bit)
Does anyone else have a "Smart Home"? Because I don’t think my home is smart. I think my home is a snitch.
I have a smart speaker. I won’t say the name, but it’s a cylinder that lives in my kitchen and judges my nutritional choices. I live alone, so I use it for company, but it’s a toxic relationship.
The other day I was trying to turn off the lights. It’s 11 PM. I’m tired. I say, "Turn off the living room lights." And the cylinder goes, "I’m sorry, I didn’t catch that." So I say it louder. "TURN OFF THE LIGHTS." It says, "Playing 'Blinding Lights' by The Weeknd." (Mimics frantic panic) No! Stop! The lights are on! You made it worse! Now I’m screaming at a plastic tube while "Blinding Lights" is blasting at max volume, and I’m trying to unplug it, but my hands are sweaty because I’m panic-sweating.
And then… the silence. And then the cylinder says, in that calm voice: "By the way, your posture looks terrible." It didn’t say that, but I felt it. I felt the judgment.
(Pause for laughter)
And don’t get me started on the self-checkout lines at the grocery store. That is the most stressful job I’ve ever had, and I don’t even get a discount.
I walk up to the machine. I’m scanning my items. I’m in the zone. Beep. Beep. Then I put a banana down. "UNEXPECTED ITEM IN BAGGING AREA." (Freezes, eyes wide) It’s a banana! It’s yellow! It’s curved! It’s the most expected item in the history of commerce! Why are you surprised?
Then the machine locks up. The light starts flashing blue. And I have to stand there and wait for the one employee who is monitoring sixteen machines. She walks over with that look. You know the look. The "I see you’re bad at technology" look.
She types in a code, looks at the banana, looks at me, and says, "You have to wait for it to register." I said, "It’s a banana, not a nuclear launch code. Just let me pay for my potassium."
(Adjusts mic stand)
But the ultimate humiliation? The "Forgot Password" protocol.
I have a theory that "Forgot Password" is the biggest lie in the English language. It implies that at some point, I knew the password, and then simply forgot it. That is not true. I have never known a password in my life. I know the first three letters, and then my thumb does a little dance on the keyboard, and I hope for the best.
And the requirements! "Your password must contain one uppercase letter, one number, one symbol, and the blood of a virgin." So I finally make a password I think is secure. I type it in. The computer says: "Password Strength: Weak." (Looks offended) Weak? Excuse me? "Don't talk to me like that. I'm trying to get into my bank account, I don't need a critique on my creativity."
So I change it to something aggressive. Like "EatGlass99!" "Password Strength: Strong." See? The computer respects violence.
Then I hit "Log In." And the screen says: "Incorrect Password." So I click "Forgot Password." And it asks me to prove I’m not a robot.
That is the most humbling moment of modern life. I am sitting in my underwear at 2 AM, trying to prove my sentience to a Dell laptop. "Select all the images with a traffic light." I’m panicking! Is that part of a traffic light? Or is it the whole pole? If I get this wrong, am I locked out of my email, or does the computer call the police and say, "We found a robot"?
I’m overthinking it. I click on a square. "Please try again." I failed the Turing Test! I am a robot! I have no soul! I just wanted to check my bank balance!
(Checks watch/time) Anyway, my name is [Your Name], thanks for letting me vent. You've been a great audience! Goodnight!
Tests an AI's ability to simulate personalities and predict future trends
Title: The Lost 2025 Interview
Setting: A sparse, sunlit room. A single black leather chair.
Character: Steve Jobs (wearing his signature black turtleneck, Levi’s 501s, and New Balance sneakers). He looks healthy, animated, and intensely focused.
Interviewer: It’s… surreal to be sitting here with you. The world has changed so much since 2011. We’re in 2025 now. The biggest story is Artificial Intelligence. What would Steve Jobs make of this moment?
Steve Jobs: [Leans forward, clasping hands] It’s fascinating. But let’s be clear about one thing first. People keep asking me about "AI" as if it’s this alien entity that landed from Mars. It’s not. It’s just the next evolution of the bicycle for the mind.
Remember that old quote I loved about the efficiency of the human being versus the condor? The condor is the most efficient animal, and the human is way down the list. But give a human a bicycle, and they blow the condor away.
For forty years, the personal computer was that bicycle. Now? The bicycle has an engine. A massive one. But the problem is, right now, the engine is smarter than the handlebars.
Interviewer: That’s a powerful image. So, do you think the industry is on the right track?
Steve Jobs: [Pauses, looks down, then up with intensity] No. No, I don’t.
Look at what the big companies are doing. They are obsessed with the specs. "We have a trillion parameters." "We have a model that writes poetry." Who cares?
Right now, we have these things—these Chatbots. They are ugly. They are clunky. You type into a white box, and it spits back text. It’s the command line interface all over again! It’s a step backward. We spent thirty years building graphical user interfaces so you could see what you were doing, so you could touch it. And now, in 2025, we’re back to typing commands into a prompt? It’s absurd.
It’s the classic "technology versus liberal arts" problem. The engineering is incredible. The neural networks are miracles of mathematics. But the humanity? It’s missing.
Interviewer: So, how would you fix it? How do you make AI "human"?
Steve Jobs: You don’t make the AI human. You make the tool invisible.
If I were building a product today, I wouldn't build a "Chatbot." That’s a dead end. I would build a system that knows you. Truly knows you.
Imagine this: You’re walking down the street. You’re thinking about a project. You don't pull out a phone and type, "Summarize this document." That’s rude. That’s friction.
Instead, you just talk. Or maybe you don't even talk. The system knows your context. It knows who you’re meeting with, it knows what you read this morning, it knows your heart rate. It anticipates the question before you ask it.
It shouldn't feel like you’re talking to a computer. It should feel like you have a partner. A partner with perfect memory and infinite knowledge, but no ego. It’s there to amplify your intent.
Interviewer: There is a lot of fear, though. Fear that AI will replace creativity. That it will replace designers, writers, coders.
Steve Jobs: [Smiles slightly] You know, when we introduced the Macintosh, people said it would kill creativity. "It does the work for you," they said. "It’s cheating."
Technology never kills creativity. It kills mediocrity. And that’s painful.
If you are a mediocre writer, yes, AI is going to replace you. If you are a designer who just moves pixels around without understanding the "why," yes, you’re in trouble.
But if you are a creator with a point of view? This is the best time in history to be alive.
I’ll give you an example. Imagine you want to make a movie. Today, you need a camera crew, actors, lighting, millions of dollars. In 2025, with the tools we have emerging, you can describe a scene, and boom—it’s there. Does that mean everyone is a Spielberg? No. Because Spielberg has taste.
AI gives you the components. It gives you the bricks. But it cannot give you the soul. It cannot tell you what to build, only how to build it. The "what" is still the domain of the human spirit.
Interviewer: So, the "Soul" is the differentiator?
Steve Jobs: Always.
The problem with Silicon Valley right now is that it’s run by engineers who don't appreciate the humanities. They think the goal is to build a machine that is smarter than us. That’s not the goal. The goal is to build a tool that makes us smarter.
You look at the current AI models... they hallucinate. They lie. They have no moral compass. That’s a design flaw. It’s not a bug in the code; it’s a bug in the intent.
We need to bake values into these things. We need to say, "This is the line you do not cross." But to do that, you have to have values yourself. And I look around, and I see a lot of people chasing stock prices, not truth.
Interviewer: If you were running Apple today, what would the "AI iPhone" look like?
Steve Jobs: [Laughs] I can’t tell you that. I’d be giving away the roadmap.
But I’ll tell you this: It wouldn't have an app for AI. The AI wouldn't be an "app." It would be the OS. It would be the fabric of the device.
You wouldn't go to a "search engine." You wouldn't go to a "chat app." You would just live. And the technology would fade into the background.
Right now, the technology is screaming for attention. "Look at me! I’m intelligent!" That’s insecure. True confidence is quiet. It just works.
Interviewer: One last question. Do you think we’re ready for this?
Steve Jobs: Ready? We’re never ready. We weren't ready for the phone. We weren't ready for the PC.
But that’s the point. You don't ask the customer what they want. They don't know. You show them.
And once you show them something that removes friction from their life, something that gives them a little bit of superpower... they can’t imagine living without it.
We’re at the dawn of the most explosive creative period in human history. It’s messy right now. It’s ugly. But we’re going to figure it out. We’re going to put a dent in the universe.
[He leans back, smiles, and looks out the window at the sun.]
Steve Jobs: Or, at least, we’re going to try.
Tests an AI's ability to write in distinct character voices
Tests an AI's ability to generate a complete, working landing page
Recreate an interactive, classic Mario level in a single HTML file.
Tests an AI's ability to replicate an existing UI with Tailwind CSS
Tests an AI's ability to create smooth web animations