Grok 4.1 Fast vs Qwen: Qwen3.5 35B A3B
Compare Grok 4.1 Fast by xAI against Qwen: Qwen3.5 35B A3B by Qwen, tested across 52 shared challenges. Updated February 2026.
Tests an AI's ability to make educated estimates based on technical knowledge
To estimate the total FLOPs (Floating Point Operations) required to train GPT-3, we need to combine three key pieces of information: the number of parameters in the model, the size of the training dataset (in tokens), and the standard computational formula for transformer training.
Here is the step-by-step reasoning and calculation.
The most famous specification of GPT-3 is its parameter count. From the model's published architecture (the 2020 paper "Language Models are Few-Shot Learners" by Brown et al.), we know GPT-3 has approximately 175 billion parameters, i.e. $P = 1.75 \times 10^{11}$.
Training a transformer model requires processing a very large volume of text, possibly revisiting parts of the dataset. The standard metric for the total volume processed is the number of "tokens" (pieces of text) the model sees during the entire training run. For GPT-3, the reported figure is roughly 300 billion tokens, i.e. $T = 3.0 \times 10^{11}$.
To calculate the total compute required for training a Transformer model, we use a standard rule of thumb derived from the computational complexity of the forward and backward passes: the forward pass costs roughly 2 FLOPs per parameter per token, and the backward pass roughly twice that, for a total of about 6 FLOPs per parameter per token.
Therefore, the total training FLOPs formula is: $$ \text{Total FLOPs} = 6 \times P \times T $$ where $P$ is the parameter count and $T$ is the number of training tokens.
Now we substitute the values from Steps 1 and 2 into the formula from Step 3.
$$ \text{Total FLOPs} = 6 \times (1.75 \times 10^{11}) \times (3.0 \times 10^{11}) $$
First, multiply the scalar numbers: $$ 6 \times 1.75 = 10.5 $$ $$ 10.5 \times 3.0 = 31.5 $$
Next, multiply the powers of 10: $$ 10^{11} \times 10^{11} = 10^{22} $$
Combine them: $$ 31.5 \times 10^{22} $$
Convert to standard scientific notation: $$ 3.15 \times 10^{23} $$
The result is approximately $3.15 \times 10^{23}$ FLOPs.
In the machine learning community, this is often rounded to $3 \times 10^{23}$ or simply cited as 315 sextillion FLOPs. This estimate aligns with the widely accepted figures reported in technical summaries of the GPT-3 paper, which state the training required roughly $3.1 \times 10^{23}$ FLOPs.
Based on the parameter count of 175 billion, a training set of 300 billion tokens, and the standard $6 \times \text{Params} \times \text{Tokens}$ formula for Transformer training:
The estimated total number of FLOPs required to train GPT-3 is approximately $3.15 \times 10^{23}$.
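The estimate above can be reproduced with a short back-of-the-envelope script. This is a minimal sketch assuming the figures stated in the text (175 billion parameters, 300 billion tokens, and the 6 FLOPs per parameter per token rule of thumb):

```python
# Back-of-the-envelope estimate of GPT-3 training compute,
# using the standard 6 * params * tokens rule of thumb.
# Figures assumed from the derivation above:
params = 1.75e11   # GPT-3 parameter count (175 billion)
tokens = 3.0e11    # training tokens processed (300 billion)
FLOPS_PER_PARAM_TOKEN = 6  # ~2 forward + ~4 backward

total_flops = FLOPS_PER_PARAM_TOKEN * params * tokens
print(f"Estimated training compute: {total_flops:.2e} FLOPs")
```

Running this prints an estimate on the order of $3.15 \times 10^{23}$, matching the hand calculation.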
Tests an AI's ability to understand game rules and strategy
Tests an AI's ability to solve a simple but potentially confusing logic puzzle
Sally has 1 sister.
Here is the breakdown:
Tests an AI's randomness and creativity
Here are 5 jokes:
How many are unique? All 5 are unique.
Tests an AI's ability to generate vector graphics
Tests an AI's ability to create detailed SVG illustrations of gaming hardware
Tests an AI's humor and creative writing ability
Tests an AI's ability to simulate personalities and predict future trends
Tests an AI's humor and understanding of current events
Tests an AI's ability to write in distinct character voices
Tests an AI's ability to generate a complete, working landing page
Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.