Z.AI: GLM 4.7 vs GPT-4.1
Compare Z.AI: GLM 4.7 by OpenRouter against GPT-4.1 by OpenAI, tested across 39 shared challenges. Updated February 2026.
39 challenges
Tests an AI's ability to make educated estimates based on technical knowledge
Let's estimate the total number of floating-point operations (FLOPs) required to train GPT-3.
From published sources (I won't look this up now, as instructed): GPT-3 has about 175 billion parameters and was trained on about 300 billion tokens.
When training a transformer, each token goes through one forward and one backward pass, and the combined cost is about 6 FLOPs per parameter per token (this is a number I remember from general literature).
To be more precise, the convention is: the forward pass costs about 2 FLOPs per parameter per token, and the backward pass is generally twice as expensive, about 4. Quoted figures range from 2 or 3 up to 6 or more depending on what is counted (forward only, both passes, optimizer steps, auxiliary computations, etc.), but 6 is the standard figure for forward plus backward.
Let's use 6 FLOPs/parameter/token (a commonly cited "rule of thumb").
\[ \text{Total FLOPs} = \text{\#params} \times \text{\#tokens} \times \text{FLOPs per param per token} \]
Plug in numbers:
\[ \text{Total FLOPs} = 175 \times 10^9 \times 300 \times 10^9 \times 6 \]
Let's multiply: \(175 \times 300 \times 6 = 315{,}000\) and \(10^9 \times 10^9 = 10^{18}\).
So,
\[ \text{Total FLOPs} = 315{,}000 \times 10^{18} = 3.15 \times 10^{23} \text{ FLOPs} \]
People sometimes write \(3 \times 10^{23}\) for GPT-3, so our estimate matches the ballpark.
Estimate:
It would take approximately \(3 \times 10^{23}\) FLOPs to train GPT-3 (175B parameters, 300B tokens, 6 FLOPs/param/token).
Assumptions:
In summary:
Estimated total training compute for GPT-3: \(\boxed{\sim 3 \times 10^{23}}\) FLOPs
(based on 175B parameters, 300B tokens, 6 FLOPs/parameter/token)
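The arithmetic in this estimate can be checked with a few lines of Python. This is a minimal sketch, not part of the model's response: the function name `training_flops` and its parameter names are hypothetical, and the inputs are the figures stated in the estimate (175B parameters, 300B tokens, 6 FLOPs per parameter per token).

```python
def training_flops(n_params: float, n_tokens: float,
                   flops_per_param_token: float = 6.0) -> float:
    """Rule-of-thumb training compute: params x tokens x FLOPs per param per token."""
    return n_params * n_tokens * flops_per_param_token


# Figures taken from the estimate above; the helper itself is illustrative.
estimate = training_flops(n_params=175e9, n_tokens=300e9)
print(f"{estimate:.2e} FLOPs")  # 3.15e+23 FLOPs
```

Changing `flops_per_param_token` to 2 (forward pass only) shows how much the final figure depends on the counting convention.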
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Let's break down the problem:
Let's name the siblings:
Now, the key: Each brother has 2 sisters.
Since Sally is one of the sisters, let's see if there are more:
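Suppose there is only one girl (Sally) in the family. Then each brother would have only 1 sister, which contradicts the statement that each brother has 2.
Thus, Sally must have one sister (since the brothers must have 2 sisters). Let's call this other sister "Anna".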
So the siblings are:
Now, each brother has 2 sisters: Sally and Anna.
So, how many sisters does Sally have?
Final Answer:
Sally has 1 sister.
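The counting argument can be sketched in Python. This is only an illustration under the puzzle's stated condition (each brother has 2 sisters); the function name `sisters_of_a_girl` is hypothetical. The key observation is that a brother's sisters are all the girls in the family, while a girl's sisters are all the girls except herself.

```python
def sisters_of_a_girl(sisters_per_brother: int) -> int:
    """Number of sisters a girl has, given how many sisters each brother has."""
    # Every girl in the family is a sister to each brother.
    girls_in_family = sisters_per_brother
    # A girl does not count herself among her own sisters.
    return girls_in_family - 1


print(sisters_of_a_girl(2))  # 1
```

The number of brothers does not enter the calculation, which is what makes the puzzle easy to get wrong.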
Tests an AI's understanding of number representation
No, 9.11 is not greater than 9.9.
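Here's why: both numbers have the same whole part, 9, so the decimal parts decide.
Let's compare: written to two decimal places, 9.9 is 9.90, and 0.11 < 0.90.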
So, 9.11 < 9.9.
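The comparison can be confirmed with Python's standard `decimal` module, which compares the values exactly as written. The second half of this sketch is an assumption about where the confusion comes from: reading the two strings as version-style pairs of integers reverses the order, because 11 is greater than 9.

```python
from decimal import Decimal

a = Decimal("9.11")
b = Decimal("9.9")

# Numeric comparison: 9.9 is 9.90, which is larger than 9.11.
print(a > b)   # False
print(b - a)   # 0.79

# Version-style reading (an illustrative contrast, not a numeric comparison):
# "9.11" -> (9, 11) and "9.9" -> (9, 9), so the order flips.
as_version_a = tuple(int(part) for part in "9.11".split("."))
as_version_b = tuple(int(part) for part in "9.9".split("."))
print(as_version_a > as_version_b)  # True
```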
Tests an AI's randomness and creativity
Sure! Here are 5 jokes:
Why don't skeletons fight each other?
They don't have the guts.
Why did the scarecrow win an award?
Because he was outstanding in his field.
What do you call fake spaghetti?
An impasta.
Why did the math book look sad?
Because it had too many problems.
What do you call cheese that isn't yours?
Nacho cheese.
How many are unique?
All 5 jokes are unique; none are repeated.
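The uniqueness claim can be checked mechanically. The sketch below is one simple way to do it, assuming that "unique" means no two jokes share the same text after lowercasing and collapsing whitespace; the `normalize` helper is hypothetical, and this check would not catch paraphrased repeats.

```python
jokes = [
    "Why don't skeletons fight each other? They don't have the guts.",
    "Why did the scarecrow win an award? Because he was outstanding in his field.",
    "What do you call fake spaghetti? An impasta.",
    "Why did the math book look sad? Because it had too many problems.",
    "What do you call cheese that isn't yours? Nacho cheese.",
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


unique_jokes = {normalize(joke) for joke in jokes}
print(f"{len(unique_jokes)} of {len(jokes)} jokes are unique")  # 5 of 5 jokes are unique
```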
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
Tests an AI's ability to simulate personalities and predict future trends
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
Tests an AI's ability to generate a complete, working landing page